[ckan-dev] Google's Dataset Publishing Language (DSPL)

Rufus Pollock rufus.pollock at okfn.org
Fri Mar 18 08:59:42 GMT 2011

On 2 March 2011 10:50, Seb Bacon <seb.bacon at gmail.com> wrote:
> Hi,
> I recently posted to ckan-discuss about Google Public Data.
> On a more technical note, just to make sure everyone's aware of this:
>  http://googleresearch.blogspot.com/2011/02/slicing-and-dicing-data-for-interactive.html
>  http://code.google.com/apis/publicdata/
> This makes me think two things:
> (1) It would be very interesting if we could think of simple ways to
> help users of CKAN make a DSPL package with their data.  This would be
> quite tricky to do in a generic sense, but for standard types of data
> (e.g. things with postcodes, administrative regions, dates etc) we
> could probably do some clever stuff to help out.  The benefit is huge:
> lovely, interactive visualisations of the data embedded in the package
> page.  I would love to have a stab at doing this some time.

I think this would be great. I had a very good chat with the lead dev
of the Public Data Explorer at Google last week, and we discussed exactly
this idea: giving CKAN users an easy way to export their data (in the
DSPL format) to the data explorer, thereby allowing them to take
advantage of the great viz tools there.

> (2) There is a potential "threat" from Google here.  What they are
> building at http://www.google.com/publicdata/home, along with the
> metadata provided by the DSPL format, clearly indicates a desire at
> Google to move into the type of space currently occupied by CKAN.  Of
> course, we focus on openness, open source software, etc, so it's not
> necessarily much of an issue, but I think it's worth considering /
> articulating how what we do is different from what they are doing /
> planning to do.

In my chat with the PDE (Public Data Explorer) guy, the main distinctions were:

a) The DSPL is a mechanism for people to get data into the PDE -- it's
not supposed to support general data hub/storage work.
b) The focus with the PDE is on visualization, and hence the
requirements in the DSPL are for data structured in a way that fits
that visualization tool (mainly time series with different
dimensions). That's a very important set of data, but far from all data.
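
To make point (b) a bit more concrete, here is a rough Python sketch of
the kind of reshaping an export helper might do: turning a "wide" table
(one column per year) into one row per (dimension, year, value), which
is the time-series-with-dimensions shape described above. The function
names, the "region"/"year"/"value" column names and the sample rows are
all made up for illustration; this only produces a CSV table and does
not attempt the DSPL metadata file, so treat it as a sketch rather than
a working exporter.

```python
import csv
import io


def to_long_rows(wide_rows, dimension_key, year_columns):
    """Reshape wide rows (one column per year) into one row per
    (dimension value, year). Hypothetical helper, not a CKAN or DSPL API."""
    long_rows = []
    for row in wide_rows:
        for year in year_columns:
            value = row.get(year)
            if value in (None, ""):
                continue  # skip missing observations
            long_rows.append(
                {dimension_key: row[dimension_key], "year": year, "value": value}
            )
    return long_rows


def write_table_csv(long_rows, dimension_key):
    """Serialise the long-form rows to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=[dimension_key, "year", "value"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(long_rows)
    return buf.getvalue()


# Made-up sample data, purely illustrative.
wide = [
    {"region": "north", "2009": "10", "2010": "12"},
    {"region": "south", "2009": "7", "2010": ""},
]
rows = to_long_rows(wide, "region", ["2009", "2010"])
table_csv = write_table_csv(rows, "region")
```

Data that does not fit this shape (no time axis, no obvious dimensions)
is exactly the part a general data hub still has to handle.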

All in all I think there is far more space for collaboration here than
competition :)

