I recently attended the Crowd Sourcing In National Mapping workshop at the University of Nottingham (i.e. in the other building from where I work), run by AGILE, EuroSDR and others (from my perspective, special kudos to Jeremy Morley and Peter Mooney for making it as successful as it was).
This blog post, however, isn’t about the event; that can come in a later post. It’s about something I’ve been mulling over recently due to many factors: the provenance, usage and licensing of data. This isn’t to say that the software platform isn’t as important, but the development of open source software has a long history, with usage, business models and licences all being smoothed out over the past twenty years or so. The impetus of the digital economy has, in some areas, shifted focus from the platform to the data. Some people are trying to control content (SOPA and PIPA are great examples of this), but this shift has also created new models: subscription (Netflix, LoveFilm) and ad-supported (Spotify, Hulu).
So what does this mean for geodata? Arguably we have more data in the wild now than we’ve ever had before. Tools like TileMill have really good UI/UX, making ‘pretty’ visualisations a breeze even for people with no cartographic experience. So why does it feel like we could still be living in the 1990s? I think we’re at the point of no return with licences. We can go two ways: one where everything converges and is awesome, or one where it all falls down and goes wrong. In some ways I believe it’s like the development of Linux. Linux started out with a niche user base of sysadmin and hacker types, then started to filter out to the wider public and corporations; the community gained more developers, but also usability experts, designers and other non-coding contributors. While the majority of Linux users don’t push back or contribute code, the feedback mechanisms that now exist mean users at times unwittingly contribute to the wider project.
Currently there are more than 500,000 people registered with OpenStreetMap (this isn’t the only dataset out there, but it is the one closest to my heart; I’d also really like to know how many people use OSM without contributing). OSM started out with a bunch of computer hackers with GPS units making maps and having fun, and has grown into a vibrant global community. The attitude of JFDI was inbuilt; the first editing software wasn’t too usable and had a high barrier to entry. Then it got easier, a lot easier. Things became stable, and usability became key. OSM became a subject of research, with researchers like Muki Haklay and Peter Mooney among others leading the charge. Like Linux, contributors beyond the core group are coming on board, big business is seeing the potential, and researchers are engaging with improving the project through academic means.
The ‘right’ licence for geospatial information is still a big problem. OSM is currently transitioning from Creative Commons to ODbL (a process that has been ongoing since 2008), with a real danger that some data will be lost from users not accepting the licence change, for whatever reason. Google’s Map Maker and the World Bank have stirred up the ire of a few in the community, primarily over sharing data: Google requires that datasets derived from mixing Map Maker with other data are given back to Google. Clearly this gives Google a competitive advantage, as it can then lock the data away without making it available to other parties. And what if the data you would like to mix in can’t be shared at all?
This really is only the tip of the iceberg. There are some really good research papers comparing the quality of authoritative datasets and VGI (primarily OSM). In some cases the topological accuracy of OSM is ‘as good’ (considering the equipment isn’t ‘professional’ and the contributors are largely hobbyists), not to mention the other aspects of quality such as metadata and attributes. OSM has a data structure adaptable enough to meet the bar set by authoritative sources and keep going. Projects like OSM GB aim to provide the data in the format, CRS and projection used in Great Britain, and to develop new automatic quality assurance processes to enhance (add value to) the dataset. What about joining this new VGI dataset with an authoritative Ordnance Survey dataset? That would bring in the best of both worlds. Wouldn’t it just be a ‘good’ thing? But what about our licences?
Google’s previously pretty much ‘free’ service has changed, lowering the number of free requests it will serve. Consequently there have been a few important (IMHO) defections to OSM. Switching to OSM has never been easier, with http://switch2osm.org/ breaking it down into simple and clear language. The delivery mechanism (i.e. the Internet) is in place, as are the bandwidth and capacity to deal with large geospatial datasets. User-generated maps are becoming ubiquitous, with Flickr and Twitter geotagging for us, and citizens around the world are contributing as the barrier is lowered all the time. The licences we use shouldn’t inhibit people; with governments opening their datasets, the cost of using data will eventually be driven to zero, be that cost monetary or licence-based. In the ’90s the cost of software like Linux was driven to zero: sharing was free, and money was made through value-added services. Certain projects in the geospatial world, like OSM, have (IMHO) taken great steps in the right direction. The transition to ODbL, a licence that specifically covers the data and requires users who ‘improve’ it to give those improvements back, should be better for conflation, but questions remain when mixing data under different licences. In 2012, different software platforms all play along nicely, without visible overhead. Will the still-90s world of geospatial data reach the point where different datasets can do the same?
Written and submitted from the Nottingham Geospatial Building (52.953, -1.18405)