Towards the Next Generation Road Survey

Over the past few weeks, I’ve managed to escape the office and get back to the field. With an impending change, it’s been a very refreshing time to get back into the mix – especially out onto the roads of Zanzibar.

Alongside scaling out Ramani Huria and working with (awesome!) colleagues on the signing of a Memorandum of Understanding between Ardhi University, World Bank, and DfID to support the development of curriculum (with ITC Twente) and the sustainability of community mapping in Tanzania for the next five years, I’ve been working on a side project to look at how machine learning can be used to assess road quality.

To do this, the N/LAB team at University of Nottingham, Spatial Info (the spin out of my team that helped build Ramani Huria/Tanzania Open Data Initiative) and I are working with the Zanzibari Department of Roads, under the Ministry of Infrastructure, Communications and Transport, to survey all roads on Unguja Island, Zanzibar.

The Department of Roads & Uni Nottingham Team

So far, we’ve worked on getting a surveying vehicle back on the road, gone back and forth with government stakeholders, and pulled together the various road data sources (such as those from the Government and OpenStreetMap) to work out where to drive and the sequencing of the survey. All of this will support a data collection protocol that merges traditional surveying techniques with novel ones such as RoadLab Pro.

All of these data streams will then be used as a training dataset to see how machine learning can inform on road quality. But first, we’re getting the traditional survey underway. It’s going to be a long road ahead – as long as all the roads in Zanzibar!

Watch this space, the project’s Medium page, and the N/LAB’s blog on using machine learning for automated feature detection from imagery. Get in contact below in the comments as well.

Written in the Al-Minaar Hotel, Stone Town, Zanzibar (-6.16349,39.18690)

OSM: Going Back in Time

I’ve been playing around with the full planet file to look at going back in time in OSM. Mainly, this is to look at how Ramani Huria’s data has evolved over time and is all part of extracting more value from Ramani Huria’s data. This process wasn’t as straightforward as I had hoped, but I eventually got there. This isn’t to say that this is the only or best way. It’s the one that worked for me!

To do this, you’ll need a pretty hefty machine. I’ve used a Lenovo x230 (Intel i5 quad core, 2.6 GHz, 16 GB of RAM) with over 500 GB of free space, which is needed to deal with the large size of the files that you’ll be downloading. This is all running on Ubuntu 16.04.

Firstly, download the OSM Full History file. I used the uGet download manager to deal with the 10 hour download of a 60 GB+ file over a 10 Mbit/s UK broadband connection. Leaving it overnight, I had a full file downloaded and ready for use. Now to set up the machine environment.

The stack is a combination of OSMIUM and OSMconvert. On paper, the OSMIUM tool should be the only tool needed. However, for reasons that I’ll come to, it didn’t work, so I found a workaround.

OSMconvert is easily installed:

sudo apt-get install osmctools

This installs OSMconvert and other useful OSM manipulation tools. Installing OSMIUM is slightly more complicated and needs to be done by compiling from source.

Firstly, install LibOSMIUM – I found not installing the header files meant that compilation of OSMIUM proper would fail. Then use the OSMIUM docs to install OSMIUM. While there is a package included in Ubuntu for OSMIUM, it’s of a previous version which doesn’t allow the splitting of data by a timeframe. Now things should be set up and ready for pulling data out.

Dar es Salaam, being the city of interest, has the bounding box (38.9813,-7.2,39.65,-6.45). You’d replace these with the south-west and north-east corner coordinates (longitude first) of your place of interest, and use OSMconvert in the form:

$ osmconvert history_filename -b=bounding_box -o=output_filename

osmconvert history-170206.osm.pbf -b=38.9813,-7.2,39.65,-6.45 -o=clipped_dar_history-170206.pbf
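Getting the corner order wrong is an easy mistake, so here is a minimal sketch of a helper that assembles the -b= argument and refuses a box whose corners are swapped. The function name bbox_arg is hypothetical (it is not part of osmctools), and it only prints the argument; it does not run osmconvert.

```shell
#!/bin/sh
# bbox_arg is a hypothetical helper, not part of osmctools.
# Argument order: west (min lon), south (min lat), east (max lon), north (max lat),
# i.e. the south-west corner followed by the north-east corner.
bbox_arg() {
    west=$1; south=$2; east=$3; north=$4
    # awk does the floating-point comparison, which plain sh cannot do.
    if awk -v w="$west" -v s="$south" -v e="$east" -v n="$north" \
        'BEGIN { exit !((w + 0 < e + 0) && (s + 0 < n + 0)) }'; then
        printf '%s\n' "-b=$west,$south,$east,$north"
    else
        echo "bbox_arg: expected west < east and south < north" >&2
        return 1
    fi
}

# Dar es Salaam, using the coordinates from this post.
bbox_arg 38.9813 -7.2 39.65 -6.45
```

The printed string can be pasted into the osmconvert command in place of a hand-typed -b= value.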

The osmconvert command clips the full history file to that bounding box. It will take a bit of time. Now we can use OSMIUM to pull out the data from a date of our choice in the form:

$ osmium time-filter clipped_history_filename timestamp -o output_filename

osmium time-filter clipped_dar_history-170206.pbf 2011-09-06T00:00:00Z -o clipped_dar_history-170206-06092011.pbf 

This gives a nicely formatted .pbf file that can be used in QGIS (drag and drop), PostGIS or anything else, as the contrast below illustrates!

Tandale, Dar es Salaam, Tanzania – 1st August 2011
Tandale, Dar es Salaam, Tanzania – 13th February 2017
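To compare more than two dates, the time-filter step can be repeated per timestamp. What follows is a rough sketch, not the exact process used for the images above: snapshot_name and snapshot_commands are hypothetical helper names, the "dar" prefix is an assumption, and the 2014 timestamp is only an illustrative extra date. The commands are printed rather than run, so nothing is written until you choose to execute them.

```shell
#!/bin/sh
# Hypothetical helper: name the output after the snapshot date,
# e.g. prefix "dar" and 2011-09-06T00:00:00Z give dar_2011-09-06.pbf
snapshot_name() {
    prefix=$1; timestamp=$2
    # ${timestamp%%T*} strips everything from the "T" onwards, leaving the date.
    printf '%s_%s.pbf\n' "$prefix" "${timestamp%%T*}"
}

# Hypothetical helper: print one osmium time-filter command per timestamp.
# Remove the leading "echo" to run the commands instead of printing them.
snapshot_commands() {
    history_file=$1; prefix=$2; shift 2
    for ts in "$@"; do
        echo osmium time-filter "$history_file" "$ts" -o "$(snapshot_name "$prefix" "$ts")"
    done
}

# The middle timestamp is illustrative only.
snapshot_commands clipped_dar_history-170206.pbf dar \
    2011-09-06T00:00:00Z 2014-01-01T00:00:00Z 2017-02-06T00:00:00Z
```

Each printed line has the same shape as the single osmium command earlier in this post, so the resulting files can be loaded into QGIS side by side.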

Enjoy travelling back in time!

All map data © OpenStreetMap contributors.

Building Heights in Dar es Salaam

When I first went to Dar es Salaam in 2011, there were a few skyscrapers adorning the city’s skyline. Now they’re everywhere! Seen from a rooftop bar in the center of the city, it’s a mass of cranes and pristine new buildings.

Alongside this rapid growth, Ramani Huria has been collecting a lot of data but a lot of it doesn’t get rendered by the default OSM styles… so I’ve dug into the data and created a map of the different floors across the city.

This interactive map allows you to explore where the tallest buildings are in the city, but in displaying the data in this way, also allows for the densest, unplanned and informal areas of the city to become very clear.

There is still some way to go though – in Dar es Salaam there are around 750,000 buildings, with roughly 220,000 (~30%) having been surveyed by the Ramani Huria team and given an appropriate attribute. Ramani Huria has focused its efforts on the urban centres of Dar es Salaam, where most of the multi-storey buildings are to be found. But there is still a lot more to be covered towards Bagamoyo and Morogoro.

Hat tip to Harry Wood, whose advice and guidance pointed me in the right direction. There is more to say on the technical side and on other challenges around correctness of tagging, but that’s for another post. Now to look at Floor Space Indices…!

GISRUK 2013

From the 3rd to the 5th of April I attended GISRUK (Geospatial Information Research in the United Kingdom) to give a paper on Community Mapping as a Socio-Technical Work Domain. In keeping with Christoph Kinkeldey’s love of 1990s pop stars, Vanilla Ice made a second slide appearance, leveraging the fact it’s a very technical academic title. In short, I’m using Cognitive Work Analysis (CWA) to create a structural framework to assess the quality of community-mapped data (quality currently being defined by ISO 19113: Geographic Quality Principles – well worth a read…) where there is no comparative dataset.

CWA is used to assess the design space in which a system exists, not the system itself. In taking a holistic view and not enforcing constraints on the system, you can understand what components and physical objects you would need to achieve the values of the system and vice-versa. In future iterations I’m going to get past first base and look at decision trees and strategic trees to work out how to establish the quality of volunteered geographic data without a comparative dataset. The aim is to build quality analysis in from day one, as opposed to it being an afterthought.

Written and submitted from Home (52.962339,-1.173566)

 

A Manifesto for the OSM Academic Working Group

A fellow member of the OSM Foundation replied to a conversation on the mailing list: “As a guerrilla academic…”. The context was a suggestion for increased academic cooperation within OSM. To this end I proposed a new working group for the OSMF: an Academic Working Group. This would have the aim of improving the academic quality and communication of those using OSM in their research, and of facilitating collaboration.

Below is the start of the manifesto. It’s not complete, but it’s a start.

Background

Academic institutions use OSM data, be it as part of their published research or for testing hypotheses. Some of the publications are listed on the wiki: http://wiki.openstreetmap.org/wiki/Research. However, within OSM and OSMF this research is undertaken on the researchers’ own initiative. Researchers are looking at OSM through recommendation (supervision) or self interest within their own academic structures. Given the growth of OSM and the research into it, it seems likely that academic interest will widen and grow.

Goals

Support academic research in OSM, encouraging best practices and acting as a forum for researchers. The aim is to support researchers starting out with OSM but also to unify a community of existing researchers; collaborations and knowledge sharing will hopefully follow. A further goal is to identify areas of research for the community as a whole, with usability and business models as potential starting themes.

Tasks

  • Unite existing researchers, whether at existing institutions or following independent academic study.
  • Provide documentation (a la learnOSM) but focused on researchers.
  • Provide a forum for researchers to discuss their research and bridge into the community.
  • Support and provide problems to the academic corpus.
  • Communicate potential collaborations, needs, wants.
  • More TBD

Working Group vs. Community

I think this fills a gap that currently exists in the community. I don’t see potential areas for conflict. That said, do we have enough members within the OSM(F?) to create and steer the working group?

AWG vs. other WGs

There is a small amount of overlap in interest between this proposed AWG and other Working Groups. I can see potential overlap with the Communications and Strategic working groups: Communications, as this group would aim to build up the OSM academic community; Strategic, as they may wish to commission studies, or at least support them, into critical areas of OSM.

Next steps 

Again, I’ll throw this to the OSMF. Where should we go from here?

Written and submitted from the London St. Pancras to Nottingham Train.

Usability Of OSM’s ‘Toolkit’ In Community Mapping

This blog post isn’t a formal evaluation of the usability of OSM’s software or the equipment used for mapping. It is not meant to attack particular software; the software and implementation of OSM deserve many medals and an equal amount of recognition.

This post is about things I noticed while mapping in Tandale. There is no statistical analysis and I have no dependent or independent variables; it’s based mainly around anecdotes and conversations with people. Though this isn’t a formal ethnography, it could provide some useful pointers in future.

JOSM

As we had netbooks with a small-ish (11″) screen-size and a trackpad, mice were essential for mappers getting started. In the month spent in Tandale the designated editors have become JOSM gods, with the majority of students and community members having fair literacy within JOSM’s processes. However, when starting, the software was made accessible to the mappers purely through using a mouse. Most of the mappers were familiar with mice, whereas a trackpad was a piece of technology that wasn’t commonly used.

Conflicts commonly occurred within JOSM, in that groups were editing and uploading areas that they had mapped independently. This was difficult to control at first, as we had started with a blank slate; however, the boundaries of the sub-wards were relatively well known and demarcated by physical boundaries. Regardless, groups wandered into areas which weren’t theirs to map. Labour was divided so that roughly half the group were mappers undertaking the bulk of the surveying, with the others editing. When conflicts occurred, the process of resolving them was occasionally esoteric, especially if the group in question had been editing for a while.

To counter this I requested that each of the different sub-ward teams follow the mantra of save, upload and download often. Unfortunately this, on many an occasion, fell on deaf ears, which meant resolving conflicts was a laborious process. How could it be made better? JOSM’s autosave feature was also a godsend: inevitably something would crash, which would otherwise have caused people to start again.

Within the final presentation to the wider community and stakeholders, one of the points raised was incorrect spelling. There is autocomplete in JOSM, however it seems that if a spelling mistake got in first, like ‘Madrasah’ (an Islamic school, with debate on its correct spelling anyway), this would filter down, with the new mappers believing that the system is right. Fixing this would mean adding clunky bits of software onto something that was never designed for spelling correction, but should plugins be created to improve this?

OSM Tagging

Due to the informal economy within the slums, formal medical advice and dispensing is very rare. The community-at-large simply cannot afford ‘professional’ medical care. This has led to ‘dawa’ – medicine – shops dispensing everything from medical advice to prescription medication. Formally defining these structures in OSM is difficult. We could just create custom presets; it’s something done within Map Kibera and Map Mathare.

The issue here is that we are using the same ‘custom’ presets repeatedly. It surely would be better to include the commonly used attributes (common when mapping in environments such as Tandale/Mathare/Kibera) in the JOSM package itself? Is this feasible?

Satellite Image Tracing

One of the experiments that ‘failed’ was the tracing of satellite imagery. Bing were very kind in releasing their imagery to the OSM community to derive data, and our initial idea was to derive building outlines from this imagery. Initially it was perceived that tracing went well; some buildings weren’t quite perpendicular, but using JOSM’s built-in ‘q’ function fixed this. As the map approached completeness, validation errors were caught, reporting that pathways were going through buildings and vice-versa. There are three explanations for this:

  1. The GPS has recorded an inaccurate position, i.e. an inaccurate path through the environment, due to its limited accuracy. (Technology Error)
  2. When editing, the editor has generalised a GPS position or incorrectly mapped a building. (Human Error)
  3. The imagery is not rectified properly, some error exists in the processing, or the imagery is not of a ‘high’ enough quality with which to derive information. (Human and Technology Errors)

These factors are a combination of human and technical problems; in this case I believe each of the factors played a part. Some of them, especially image quality and GPS accuracy, would presumably need some sort of best practice to be implemented. Other sources of human error in the editing process are harder, more open-ended problems, especially without a comparable dataset.

Rendering

When I joined OSM I was a student in a foreign city, with no map with which to explore. A massively pro open source friend recommended the OSM project. I already had a GPS from my time working at a camping store during summer holidays, so it was a match made in heaven really. My first edit was of the D400 road from Nancy to Lunéville around 2007/8, then I set to work in the area.

The community was very small and so, presumably, was the power of the servers; it would take a few days for anything to be rendered on the OSM homepage. Now something uploaded can take anything from five minutes to an hour. The server administrators deserve more recognition for their services, so if you meet them, buy them a drink – they deserve it.

Summary

In summary, I believe that the tools we use in OSM are great; none of what I’ve written is a slight against a particular piece of software or person. I do believe, however, that we should consider certain points about widening access to the software by making it more usable. I also welcome comments below!

Written and submitted from the World Bank Offices, Washington DC (38.899, -77.04256)

Understanding Landuse In Tandale

Tandale Landuse September 2011

Towards the end of the mapping phase landuse was demarcated; the results are above. This isn’t representative of the official (city council) view of landuse. It represents landuse as it was observed on the ground. If people wish to download the shapefile it is here. As this is OpenStreetMap you can also just download data and use it freely under the terms of the OSM licence.

When presenting the project to the community, or being questioned about why a bunch of people were wandering around a slum with GPS units, the questions were always along the same lines. “Are you mapping land boundaries to prove property ownership? … No? … Why are you here then?”. This then led us to explain and pitch the project to them. However it illustrates people’s concerns; quite a lot of housing in the slum is informal, with precedents of slum improvements destroying homes in the name of progress, regardless of its merits and pitfalls. However this is a story for another time.

So why map landuse if not for property demarcation? We have access to official population data; combined with our landuse data, this lets us understand the provision of services across Tandale. Within the residential areas we have the building blocks to understand not just where the toilets are, but the potential average number of people using each toilet in that area. The same methodological approach can apply to water access points, shops and butchers; any point of interest basically.
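As a rough illustration of that calculation, here is a minimal sketch that divides an area’s population by the number of toilets mapped in it. The input layout (area,population,toilets), the helper name people_per_toilet and every figure in the sample are hypothetical, made up purely to show the arithmetic; they are not Tandale survey results.

```shell
#!/bin/sh
# Hypothetical helper: reads lines of "area,population,toilets" on stdin and
# prints the average number of people per toilet for each area.
people_per_toilet() {
    awk -F, '
        $3 > 0  { printf "%s %.1f\n", $1, $2 / $3 }   # people per toilet
        $3 == 0 { printf "%s no-toilets\n", $1 }      # avoid dividing by zero
    '
}

# Made-up sample figures, for illustration only.
printf '%s\n' 'area_a,1200,40' 'area_b,900,0' 'area_c,450,30' | people_per_toilet
```

Swapping the toilet count for water access points or shops gives the same per-area ratio for any other point of interest.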

Understanding the reality of the ground situation, a ground truth if you will, is important. Even where data is collected, its reliability and the time since collection are questionable, and that is in developed societies like the UK, where the demographic shifts over decades. Dar es Salaam is the 3rd fastest growing city in Africa and 10th fastest in the world. The majority of people contributing to this influx are moving from rural areas to the city. The economics of this mean they gravitate to slums like Tandale. Land can be reclaimed and houses built as rapidly as they fall, simply because people need a place to live.

The increasing population puts enough of a strain on the existing infrastructure that this situation will not resolve itself organically. For example, the market of Tandale acts as a staging area for the majority of fresh fruit, vegetables, grain and rice. The supply chain starts outside Dar es Salaam, in areas like Bagamoyo and Morogoro, with goods shipped to the one market. The roads are a mixture of paved and unpaved, and on occasion grind to a standstill. Using the data collected we can now start to ask questions like ‘How do we keep the supply chain going?’ and ‘How many people in residential areas have access to toilets?’. It’s not quite “open data now” but it’s close and getting closer.

Written and submitted from Broadway Cinéma, Nottingham, UK (52.9540,-1.1437)