[open-bibliography] Call for Use Cases: Library Linked Data
jodi.schneider at deri.org
Sun Oct 17 20:31:08 BST 2010
Thanks, Jim. Interesting use case. It's now on the LLD wiki at
On 17 Oct 2010, at 19:34, Jim Pitman wrote:
> Here's another response to the call for use cases.
> many thanks for your assistance
> Jim Pitman
> Director, Bibliographic Knowledge Network Project
> Professor of Statistics and Mathematics
> University of California
> 367 Evans Hall # 3860
> Berkeley, CA 94720-3860
> ph: 510-642-9970 fax: 510-642-7892
> e-mail: pitman at stat.berkeley.edu
> URL: http://www.stat.berkeley.edu/users/pitman
> === Name ===
> Community Information Service
> === Owner ===
> Jim Pitman
> === Background and Current Practice ===
> Academic organizations of varying sizes (research groups, university departments,
> universities, university consortia, subject specific communities such as scholarly societies and special interest groups)
> have a strong interest in maintaining awareness and quality of information in their domain, and in openly publishing this
> information to the broader academic community and to the general public.
> A significant component of this information is bibliographic metadata available from library resources,
> especially information about books and articles published in a particular field, or associated with a particular
> Current practice varies greatly. Many publishers and scholarly societies offer subscription-based A&I services which are paid
> for by libraries. Typical license agreements limit these services to "individual" use.
> This inhibits creative selection, remixing and republication of bibliographic metadata by interested individuals and organizations.
> Most university departments and universities are unable to extract from their university library catalogs a list of all publications
> of their own faculty. Even if they could, they are typically not allowed to publish it without renegotiating license agreements with
> bibliographic metadata suppliers.
> A typical subject-specific interest group may be able to extract subject-specific bibliographic metadata from a variety of sources.
> But again, there is a high barrier to cross before the group can obtain clear rights to republish or remix such material.
> Essentially, the group has to acquire some legal identity, capable of making licensing agreements, before it can do so legally.
> Then the group has to find a business model capable of supporting some individual whose job it is to manage such agreements.
> This organizational overhead is unnecessary in a universe of linked data.
> === Goal ===
> Make library catalog and other publisher-generated bibliographic metadata freely available to community data curators so it is easily filtered by
> author/affiliation/subject/... to allow large numbers of small to medium sized academic communities to easily extract the data of particular interest to them,
> with minimal technical and legal overhead, and to openly republish that data in ways they find worthwhile, for example by selecting, ranking or
> classifying the data, and providing simple searches and faceted displays over bibliographic collections of special interest to the community.
> How to use linked data technology to achieve this goal: provide the data with an open license which allows its reuse for such purposes,
> and support the APIs, data standards and client software to lower the barrier to participation in information curation and sharing.
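A minimal sketch of the filtering step described in the Goal, assuming the records have already been obtained as plain Python dicts. The field names (`authors`, `affiliations`, `subjects`) and all values are placeholders invented for illustration; they are not taken from any library catalog or published schema.

```python
# Hypothetical records: field names and values are placeholders,
# not a published schema or real catalog data.
RECORDS = [
    {"title": "Example Book A", "authors": ["A. Author"],
     "affiliations": ["Example University"], "subjects": ["probability"]},
    {"title": "Example Article B", "authors": ["B. Writer"],
     "affiliations": ["Example University"], "subjects": ["statistics"]},
    {"title": "Example Article C", "authors": ["A. Author", "C. Scholar"],
     "affiliations": ["Other Institute"], "subjects": ["probability", "statistics"]},
]


def filter_records(records, author=None, affiliation=None, subject=None):
    """Return the records that match every criterion that was supplied.

    A criterion left as None is ignored, so calling this with no
    criteria returns all records.
    """
    criteria = {"authors": author, "affiliations": affiliation, "subjects": subject}
    selected = []
    for rec in records:
        if all(value is None or value in rec.get(field, [])
               for field, value in criteria.items()):
            selected.append(rec)
    return selected


if __name__ == "__main__":
    # A community curator interested in one subject area.
    for rec in filter_records(RECORDS, subject="probability"):
        print(rec["title"])
```

Exact string matching is used only to keep the sketch short; a real service would presumably match on entity identifiers rather than on names.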
> === Target Audience ===
> Scholars as service providers: all those who edit, curate and arrange scholarly information for the purpose of making it openly
> accessible to a wide audience.
> Indirectly, the general public which may find subject-specific resources curated by scholars more informative
> than generic search services or Wikipedia.
> Computer programs, inasmuch as these may be used for tasks of filtering, deduplication, selection, ... to save the time of expert curators.
> === Use Case Scenario ===
> Curator of a community information service selects data from input sources to determine what books, articles, photographs, videos, ...
> were published recently which would be of interest to the community.
> Curator has input data available in such a way that they can easily control what is piped through to their information service.
> === Application of linked data for the given use case ===
> Make it easy for data providers (publishers, libraries, other aggregators) to provide linked data with suitable API and client software
> for community data curators to use.
> Curators should expect that bibliographic records come equipped with identifiers for all entities
> (editions, people, subjects, journals, publishers, ...) and that this information is easily loaded into some
> community managed CMS to allow remixing with whatever ranking/selection/faceting/... the community service may wish to provide.
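To illustrate why identifiers on every entity matter for faceting, here is a rough sketch, assuming each record refers to people and subjects by identifier. The identifier strings (`person:1`, `subject:probability`, `work:1`) and the field names are made up for this example and do not follow any particular scheme.

```python
from collections import Counter

# Hypothetical records whose entities are referenced by identifier.
# The identifier strings are invented for illustration only.
RECORDS = [
    {"id": "work:1", "people": ["person:1"],
     "subjects": ["subject:probability"]},
    {"id": "work:2", "people": ["person:1", "person:2"],
     "subjects": ["subject:probability", "subject:statistics"]},
    {"id": "work:3", "people": ["person:2"],
     "subjects": ["subject:statistics"]},
]


def facet_counts(records, field):
    """Count how many records mention each identifier in the given field."""
    counts = Counter()
    for rec in records:
        # set() so that a record listing an identifier twice counts once.
        counts.update(set(rec.get(field, [])))
    return counts


def works_for(records, field, identifier):
    """List the ids of records that reference the given identifier."""
    return [rec["id"] for rec in records if identifier in rec.get(field, [])]


if __name__ == "__main__":
    print(facet_counts(RECORDS, "subjects"))
    print(works_for(RECORDS, "people", "person:2"))
```

Because the grouping key is an identifier rather than a name string, two spellings of the same person's name would still land in the same facet, provided the data supplier has assigned them the same identifier.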
> === Existing Work (optional) ===
> Most A&I services maintain some data ingest systems for these purposes. But they are usually proprietary, and not readily available for use by smaller agents with
> interests in biblio data curation. These mostly rely on converting raw publisher data into proprietary biblio formats for internal use, and licensing
> data to libraries in degraded formats for use by supplicant scholars. These services add no value to the universe of linked data, but rather compete with it.
> Some examples of software systems for open display of community curated bibliographic collections are
> BibSonomy, BibServer, BibApp, Open Scholar. All of these systems would benefit from easy
> availability of comprehensive linked library and publisher data via API.
> An example of a typical community website which would benefit greatly from integration with linked data is the Probability Web.
> See especially the lists of Books, People, and the link to the Probability Abstract Service, all of which could be
> recreated to both import and export linked data.
> There are more advanced services in other fields, especially RePEc (laudably open, but with large amounts of data whose license status is indeterminate)
> and SSRN (free but not open to reuse). Such large community services are typically built with an architecture that is difficult to replicate.
> What is needed is a simple and easily replicable architecture for community data curation services of various sizes to develop and interoperate.
> BKNpeople and VIVO are starts in this direction at the level of identifying people and their interests. Integration of
> such systems with the ORCID initiative will be important. See also the BKN Project.
> === Related Vocabularies ===
> BIBO, CiTO, ...
> === Problems and Limitations ===
> Reasons why this scenario is or may be difficult to achieve:
> -- vested interests in A&I services
> -- lack of suitably licensed metadata
> -- commercial publishers, universities and conservative scholarly societies refusing to release their metadata with an open license
> Technical obstacles:
> Lack of convergence towards a simple widely adopted standard for exchange of bibliographic metadata suitable for the community
> information service use case.
> The necessary data fields are little more than traditional bibtex fields, plus some conventions for handling entity identifiers and links.
> BibJSON is an attempt at an adequate lightweight data exchange standard, compatible with linked data principles,
> and influenced by the success of BibTeX and RePEc's Academic Metadata Format.
> This standard is easily managed and understood by typical community data service managers, even without advanced software tools.
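As a rough picture of "traditional bibtex fields, plus some conventions for handling entity identifiers and links", the sketch below writes one record as JSON and reads it back. The key names (`type`, `author`, `id`, `link`, and so on) and the example.org URLs are assumptions made for this illustration; they are not taken from the BibJSON specification, which should be consulted for the actual conventions.

```python
import json

# One BibTeX-like record with identifiers and links attached to entities.
# Key names and URLs are hypothetical, not the BibJSON specification.
RECORD = {
    "type": "article",
    "title": "An Example Article",
    "author": [
        {"name": "A. Author", "id": "http://example.org/person/a-author"},
    ],
    "journal": {"name": "Example Journal", "id": "http://example.org/journal/ej"},
    "year": "2010",
    "link": [{"url": "http://example.org/article/1"}],
}


def to_json(record):
    """Serialize a record to readable JSON text."""
    return json.dumps(record, indent=2, sort_keys=True)


def from_json(text):
    """Parse JSON text back into a record."""
    return json.loads(text)


if __name__ == "__main__":
    text = to_json(RECORD)
    print(text)
    # The record survives a round trip unchanged.
    assert from_json(text) == RECORD
```

The point of the sketch is only that such a record is plain enough to be read and edited by hand, which is the property claimed above for a lightweight exchange format.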
> Providing and maintaining good UIs for non-technical curators to manage BibJSON or similar record structures is the biggest technical challenge,
> along with supporting the necessary CMS over which these UIs can operate.
> Needlebase shows promise of providing an adequate UI over a graphical datastore.
> This is proprietary software, but it should be configurable to import and export linked data. Such systems for managing simple editorial
> workflows over linked data are greatly needed.
> === Related Use Cases and Unanticipated Uses ===
> If simple and easily affordable editorial systems are developed for managing collections of biblio data, it is hard to anticipate
> which agents will emerge to provide the best services on various scales. Communities nest and overlap with each other. They
> compete for the attention of their members. If communities export their enhancements as linked data, this data may be consumed again by larger aggregators,
> especially Google and other big players, in ways which should greatly improve current means of search and discovery of academic information.
> === References ===
> Academic Metadata Format http://amf.openlib.org/doc/ebisu.html
> arXiv http://arxiv.org/
> BibServer http://bibserver.berkeley.edu/cgi-bin/bibs7?source=http://www.stat.berkeley.edu/users/pitman/bibserver.bib
> BibApp http://www.bibapp.org/
> BibJSON http://www.bibkn.org/bibjson/index.html
> BibTeX http://en.wikipedia.org/wiki/BibTeX
> BibSonomy http://www.bibsonomy.org/
> BIBO http://bibliontology.com/
> BKNpeople http://people.bibkn.org/
> BKN Project http://www.bibkn.org/
> CiTO, the Citation Typing Ontology, by David Shotton. http://dx.doi.org/10.1186/2041-1480-1-S1-S6
> Google Scholar http://scholar.google.com/
> Needlebase http://www.needlebase.com/
> Open Scholar http://scholar.harvard.edu/
> ORCID http://www.orcid.org/
> Probability Abstract Service http://pas.imstat.org/
> RePEc http://repec.org/
> SSRN http://www.ssrn.com/
> The Probability Web http://www.mathcs.carleton.edu/probweb/probweb.html
> VIVO http://www.vivoweb.org/