[open-bibliography] New BNB sample data available
aisaac at few.vu.nl
Mon Feb 7 17:31:34 GMT 2011
Point taken: I obviously oversimplified the problem! I was really focusing on the fact that the BL would declare statements that contradict the ones at LoC.
Again, I would find it very strange that the BL would want to "overrule" a prefLabel (for a given language tag) from LCSH. It ruins the benefits that building and using *controlled* vocabularies bring, doesn't it? Which is why using only URIs in the subjects is really better.
But well, if for some reason the BL really wanted to have an English spelling published in their data, it is possible to use another language tag (like "en-GB"), or even no language tag, since id.loc.gov is using "en" as a default for now. But if one day LoC changes its mind (considering all your arguments on mixed headings and other subtleties, for instance) and removes its language tags, then contradictions could arise...
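To make the tag question concrete, here is a minimal sketch in Python of the check a data consumer might run. It assumes the SKOS rule that a concept has at most one prefLabel per language tag, and at most one without a tag. The concept URI and the two labels are invented for illustration, and pref_label_conflicts is a hypothetical helper, not part of any existing library.

```python
from collections import defaultdict

# Hypothetical concept URI and labels, for illustration only.
CONCEPT = "http://example.org/concept/1"


def pref_label_conflicts(statements):
    """Group prefLabels by (concept, language tag) and keep the clashes.

    `statements` is an iterable of (concept, label, lang) tuples, where
    lang is a language tag string or None for an untagged literal.
    """
    seen = defaultdict(set)
    for concept, label, lang in statements:
        # Language tags compare case-insensitively; None means "no tag".
        key = (concept, lang.lower() if lang else None)
        seen[key].add(label)
    # More than one distinct label under the same key is a contradiction.
    return {key: labels for key, labels in seen.items() if len(labels) > 1}


loc_today = [(CONCEPT, "Color", "en")]        # LoC defaults to "en"
bl_same_tag = [(CONCEPT, "Colour", "en")]     # clashes with LoC
bl_other_tag = [(CONCEPT, "Colour", "en-GB")] # no clash
bl_untagged = [(CONCEPT, "Colour", None)]     # no clash for now
loc_untagged = [(CONCEPT, "Color", None)]     # if LoC drops its tags

print(pref_label_conflicts(loc_today + bl_same_tag))
print(pref_label_conflicts(loc_today + bl_other_tag))
print(pref_label_conflicts(loc_untagged + bl_untagged))
```

Only the first and last calls report a clash: a different tag or no tag sidesteps the problem today, but the untagged option stops working as soon as both sides publish untagged labels.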
> In a previous life I served on a MARBI task group considering the problem of language in authorized headings. The problem here is that the AACR2 rules allowed for the creation of 'mixed' headings, i.e., heading strings that included portions (usually separately subfielded) in different languages. The group was trying to figure out how these separate parts could be separately identified as to language, but eventually we gave up on the task--MARC just wasn't set up to do that, and in fact every accommodation for the varieties of language and script used to 'patch' MARC up in some specific circumstances creates its own particular problems down the line.
> So what we have here is yet another 'holy grail' (compatibility) which may not be achievable. Any default attribution of language will be wrong in some unknown percentage of cases. I really think that our only hope is to separately identify what has been cobbled together for use within card catalogs, and move towards a faceted approach, perhaps the one that FAST has taken.
> As for the separate choices made by BL and LC, I'm not entirely sure that agreement is required or even optimal, particularly given the differences in spelling everyone insists on retaining!
> On 2/4/11 9:35 AM, Antoine Isaac wrote:
>> Now, on having a language tag or not, I see your issue, but personally I'm ok with originally Spanish labels being considered as English ones, if there's no English translation for them.
>> Anyway, the core issue to me here is that this language tag dilemma also applies to LoC, which made the opposite choice. Ideally if you publish data on LC concepts, it should be compatible with what LC has--"compatible" in the formal but also the informal sense: whether there is an inconsistency or not, a data consumer may still be extremely puzzled as to why LC and BL can't agree on their concepts' prefLabels!