[od-discuss] OKD questions from 2010

Mike Linksvayer ml at gondwanaland.com
Mon Mar 4 06:58:21 UTC 2013

This will be fixed in 1.2, on which I'll post a list of final issues
to close on in the next week.

Current draft at


On Sun, Mar 3, 2013 at 11:00 PM, Luis Villa <luis at tieguy.org> wrote:
> The OSI board is (informally) on board with this idea. The only minor
> concern is that it was noted that the OKD references the OSD without
> defining "OSD", and also says that software is covered by "previous
> work" without actually mentioning OSI/OSD. Could we adjust those to
> give some credit to OSI/OSD and actually define "OSD"?
> Luis
> On Mon, Feb 11, 2013 at 6:59 PM, Mike Linksvayer <ml at gondwanaland.com> wrote:
>> http://blog.okfn.org/2010/08/04/update-on-open-source-initiatives-adoption-of-the-open-knowledge-definition/
>> lists some questions, with answers in
>> http://okfnpad.org/okd-questions (copied below).
>> I think Q3 is the only one pertinent to a new version of the OKD, but
>> feel free to suggest otherwise. I don't have a strong feeling as to
>> whether the suggested language is more clear or otherwise better.
>> Existing:
>>     The rights attached to the work must apply to all to whom it is
>> redistributed without the need for execution of an additional
>> *license* by those parties.
>> Suggested:
>>     The rights attached to the work must apply to all to whom it is
>> redistributed without the need for execution of an additional
>> *agreement* by those parties.
>> (one word change highlighted)
>> I do think reaching out to OSI again for feedback would make sense
>> before releasing a new OKD version, but I'll leave guidance on that to
>> Luis Villa.
>> Mike
>> ...
>> Please add your name in bold to any comments/questions below -- and
>> ensure you have a different colour from others!
>> Question 1. What happens with data that's not copyrightable?
>> Question 1a. What about data that consists of facts about the world
>> and thus even a collection of it cannot be copyrighted, but the exact
>> file format can be copyrighted?  Many sub-federal-level governments in
>> the US have to publish facts on demand but claim a copyright on the
>> formatting.
>> Jordan Hatcher: These are both great questions -- I think it helps to
>> separate out a few areas when talking about open data and the law:
>> ## 1.  the data / database distinction.
>> I prefer to use the term "contents of a database" because lots of
>> databases will have clearly copyrightable material in them, such as
>> music (ex: mp3s), images (ex: Flickr photos), or video (ex: Internet
>> Archive stuff). When people talk about "data" they start automatically
>> thinking about "facts" and then really start to question the
>> available rights (and rightly so, as it is different -- facts aren't
>> copyrightable on their own).
>> ## 2.  legal questions on the boundaries of how the law (in various
>> jurisdictions) interacts with databases and their contents
>> There's a bunch of tricky legal questions around how databases and
>> their contents, particularly when those contents contain factual
>> information, interact with the law.  The law varies between
>> jurisdictions, so there's no single global answer and, just like in
>> most areas of the law, these aren't binary yes/no questions.
>> Just like with open source, I think you have to leave many of these
>> questions to the side when trying to define a standard in the area
>> such as what "open data" means. It doesn't matter in regards to the
>> Open Source Definition what the outcome is on the "is it a license or
>> a contract?" debate lawyers get into.
>> Question 2. What about data that's not accessible as a whole, but only
>> through an API?
>> Jordan Hatcher: As a concrete example to this question, what would
>> open data mean in the context of the results ("data" or "contents")
>> returned from querying an API? I think that to be "open data" you
>> should have the rights to do what you wish (within the open
>> definition scope) with the results of that query. I don't think that
>> the database that sits behind that API should be required to be
>> distributed in order for the results to be open data.
>> However as a corollary to the above, any license restriction that
>> prevented you from being able to get all the database through running
>> multiple queries would prevent that from being open data. Technical
>> restrictions (number of API calls, for example) would be okay.
>> Incidentally, the Open Database License (ODbL) would view public API
>> access as distribution and so would trigger access to the full
>> database -- see 4.6 of the text
>> <http://www.opendatacommons.org/licenses/odbl/1.0/>
>> Question 3. We're thinking that OKD #9 should read "execution of an
>> additional agreement" rather than "additional license".
>> Jordan Hatcher: Seems to make sense. Is OSI considering a similar
>> change to the Open Source Definition?
>> Question 4. Does OKD #4 apply to works distributed in a particular
>> file format? Is a movie not open data if it's encoded in a
>> patent-encumbered codec? Does it become open data if it's re-encoded?
>> Jordan Hatcher: Though in the end we're nothing but data (GATC in
>> DNA), I think it might be stretching it a bit as a practical matter to
>> consider an MPEG4 movie file as a typical example of open data. This
>> does bring up the point about file formats, though, and it can be an
>> issue. I think that you can still have open data in a proprietary
>> format: as long as users aren't restricted from converting the
>> database into an open format (or any other format), it would be
>> open data.
>> Kirrily: here's a real use case, then. Is it open data if the govt
>> distributes it as files that need to be opened in a massively
>> expensive, non-free statistical software package?  What about geo data
>> in formats that are specific to non-free GIS platforms?  These are
>> real and common use cases.
>> Jordan: Yes, an expensive proprietary format can be a barrier, but I
>> think there is a line to be drawn on being open *data*. If you are
>> legally able to convert that data into an open format, then the data
>> is open. There are lots of things that can present a barrier in the
>> same way as a massively expensive software package to read a certain
>> format -- what about access to the internet, or access to a computer
>> at all, or skills to use a computer?  All can form a real barrier to
>> access and use/reuse of data, but to be a workable definition I think
>> it's best to focus on the legal rights around the ability to do these
>> things.
>> Question 5. What constitutes onerous attribution in OKD #5?  If you
>> get open data from somebody, and they have an attribution page, is it
>> sufficient for you to comply with the attribution requirement if you
>> point to the attribution page?
>> Jordan Hatcher: I think this is one of those great questions that is
>> best left up to users to define in practice, rather than to set down
>> in stone what is onerous and what isn't, as this will evolve over time
>> and across technologies. Asking attribution not to be onerous I
>> think is a good starting place, particularly for open data, as many
>> contributions could get quite burdensome quite quickly depending on
>> how the license was written.
