19/3/2012

Creating dot density maps with Chicago Tribune’s new open source toolkit

 

Originally published by Christopher Groskopf on the Chicago Tribune's News Apps Blog on 12 August 2011 under a Creative Commons Attribution license. This post has been republished with minor modifications.

 

 

pic_1.png

Distribution of children less than five years old in several U.S. counties

There hasn’t been a time in the last six months when at least one of the members of the Chicago Tribune's News Applications team wasn’t working with census data.

In April, May, and June we contributed to a joint effort with an esteemed cadre of news nerds to develop census.ire.org, a site intended to make it easier for journalists to report from census data. To prepare for this recent release, we even spent a week hacking near-complete prototype maps using data that the census had already released about Kings County, New York.

We learned hard lessons about the scale and nuance of the census data in the last few months, and, along the way, we further built out our toolkit for making maps. Last week the Census Bureau released detailed (summary file) data for Illinois, and we used our new tools to produce a couple of maps we’re pretty excited about:

These maps demonstrate a map style we haven’t attempted before: dot density mapping. Dot maps let us represent multivariate data more richly than choropleth maps can. For example, they allow us to illustrate variation in race and population density simultaneously. We were inspired in this effort by Bill Rankin’s Radical Cartography project and Dennis McClendon’s map of residential patterns according to census racial categories, created for the Encyclopedia of Chicago.

Many of the tools needed to create the maps we wanted didn’t exist. Using TileMill, a fantastic application for creating maps, as our starting point, we began to build a toolkit.

 

Invar

pics_2.png

Invar automates the generation of map tiles, and the deployment of tiles to S3. It is the first and least glamorous of the tools we created, but crucially, it’s very, very fast. Fast!

The first time we ever tried to create our own tileset, it took hours to render and twice as long to deploy. Because invar parallelizes both tasks, we can now render a map in minutes and deploy it just as fast. In fact, we now deploy our maps to four separate S3 buckets so that we can take advantage of Leaflet's support for round-robining tile requests across multiple subdomains. Fast!
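To make the two ideas in this section concrete, here is a minimal sketch of how a bounding box turns into a list of tiles, how that list might be split among parallel workers, and how a tile can be mapped to one of several subdomains. This is not invar's code. The function names are hypothetical, the tile math is the standard Web Mercator "slippy map" scheme, and the subdomain rule (tile x plus y, modulo the number of subdomains) is our understanding of what Leaflet does, so treat it as an assumption.

```python
import math


def latlon_to_tile(lat, lon, zoom):
    """Return the (x, y) slippy-map tile containing a WGS84 point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge still land on a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_bbox(south, west, north, east, zoom):
    """List every (zoom, x, y) tile that touches the bounding box."""
    x_min, y_min = latlon_to_tile(north, west, zoom)  # tile y grows southward
    x_max, y_max = latlon_to_tile(south, east, zoom)
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]


def split_into_batches(tiles, workers):
    """Deal tiles out to workers so each gets a similar share."""
    return [tiles[i::workers] for i in range(workers)]


def pick_subdomain(x, y, subdomains):
    """Choose a subdomain for a tile (assumed Leaflet-style rule)."""
    return subdomains[abs(x + y) % len(subdomains)]
```

Because every tile can be rendered and uploaded independently of the others, each batch could be handed to its own process. The subdomain rule is deterministic, so a given tile is always requested from the same host and stays cacheable.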

 

Englewood

Next we needed to distribute dots across geographies. We found one implementation of dot distribution in Python, which we extended into a module for reuse.

Englewood (named after an ailing Chicago neighborhood that the Chicago Tribune writes many sad stories about) uses the Python bindings for GDAL to load data from PostGIS or a shapefile. It scatters points within each feature and then writes the points out to a database table or a new shapefile.
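The scattering step can be illustrated without GDAL. The sketch below is not Englewood's implementation: it assumes each feature is a simple polygon given as a list of (x, y) vertices with no holes, and it uses rejection sampling, meaning it picks random points inside the polygon's bounding box and keeps only those that fall inside the polygon. All function names here are hypothetical.

```python
import random


def point_in_polygon(x, y, ring):
    """Even-odd ray-casting test against a list of (x, y) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does this edge cross the horizontal line through y?
        if (yi > y) != (yj > y):
            x_cross = (xj - xi) * (y - yi) / (yj - yi) + xi
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def dots_for_population(population, people_per_dot=1):
    """Number of dots to draw for a feature (one per person by default)."""
    return population // people_per_dot


def scatter_dots(ring, count, rng=None, max_attempts_per_dot=1000):
    """Return `count` random points that lie inside the polygon."""
    rng = rng or random.Random()
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    dots = []
    attempts = 0
    while len(dots) < count:
        attempts += 1
        if attempts > count * max_attempts_per_dot:
            # Guards against degenerate (near zero-area) polygons.
            raise ValueError("polygon too small to sample from")
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, ring):
            dots.append((x, y))
    return dots


# Usage: an L-shaped "tract" with 1,234 residents, one dot per 10 people.
tract = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
dots = scatter_dots(tract, dots_for_population(1234, 10), random.Random(42))
```

Rejection sampling is simple and places dots uniformly, but it wastes samples on long, thin, or oddly shaped features whose bounding boxes are mostly empty. The dots mark where people might live within a feature, not where they actually do.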

A small snippet of Python is required to configure Englewood. The following code, which reads from a database, renders the dots for our map of children less than five years old (a demo using shapefiles can be found in the GitHub repository):

python_snippet.jpg

 

Deployment

A fast and stable process is useless if you can’t repeat it. We’ve built out a Fabric configuration which allows us to make these maps in the quickest and most efficient way possible. Among other things, it allows us to keep some configuration (such as a bounding box) in a per-map YAML file. It parses this file and passes the correct arguments to invar for rendering and deployment. Perhaps most exciting, if you’re using the new TileMill 0.4 (available for OS X or Ubuntu), it can completely automate the production of Wax interactivity grids, such as the ones we used to do the highlighting in our recent maps.

 

Styling dots

Via Crayonsman (CC BY-SA 3.0)

Dot density maps posed new styling challenges. We tried numerous approaches to coloring and sizing the dots, and ultimately settled on a few principles that worked pretty well:

  • Use a dark, sparse base layer (we used a custom-styled Google Maps layer, but would like to move to an OpenStreetMap base layer in the future).
  • Make your dots stand out brightly. Try the fluorescent colors from the palette of Crayola crayons.
  • Play with transparency: you may want to take advantage of the effect of overlapping transparent dots.
  • Make dots scale with zoom.
  • Whenever possible, use one dot per individual. It will make for a more interesting map.

Here is the style we settled on:

ishot-14.03.121.jpg                                             

 

Wrapping up

Although I’ve linked to a number of projects and code snippets in this post, you may find it useful to see a complete project. This week, with Illinois under our belt, I decided to apply the same methodology to my side-project, Hack Tyler. I produced a map of racial diversity in Smith County, Texas (see related blog post). Part of Hack Tyler’s modus operandi is developing in a completely transparent manner. As a result, you can see complete examples of both our backend and client-side mapping rigs in the following projects:

 

 
