I’m excited to announce that next month I will be joining 18F, a new team within the United States General Services Administration tasked with improving digital services for the Federal Government and the American people.
I’ve spent my career practicing and advocating for open data. At Stamen my colleagues and I urged our clients to provide us with easy-to-use web services (APIs) into their data, and when they couldn’t find the resources to do so, we built them ourselves. These services were often purpose-made to power “special projects” such as Digg Labs and Trulia Hindsight, so they weren’t necessarily intended to be open. But in at least one notable case (Digg), the services that we designed outlasted the projects for which they were created and became an integral part of the product.
Mike Migurski (then one of my partners at Stamen, now CTO of Code for America) and I brought our technical perspective on open data to Carl Malamud’s Open Government Working Group in 2007. I felt out of my element, and mostly out of shyness I contributed little except for the unofficial Open Government gang sign. But I was inspired by the passion of the other attendees, including Lawrence Lessig, Adrian Holovaty and Dan X. O’Neil from the original Everyblock, Tom Steinberg, and folks from MapLight and the Sunlight Foundation. Until then I had thought of “open data” as a primarily technological problem. This meeting—and Carl Malamud’s tireless crusade to digitize government documents that might as well have vanished from the face of the earth without his efforts—helped me to understand it as a social problem.
My first direct experience with “civic technology” was adapting Oakland Crimespotting for San Francisco in 2009. My colleagues Mike Migurski and Tom Carden had done all of the hard work building the site, API and map over the previous couple of years. All I had to do was fork the Oakland code and make it work with never-before-seen crime report data from the City’s new DataSF portal. Getting crime reports out of Oakland had been a thankless exercise in freeing data that should have been open (and eventually would be), but San Francisco just handed it out for free. About two weeks later, San Francisco Crimespotting went live, and I became its de facto maintainer.
Crimespotting became a calling card for Stamen, and the timing coincided with a particular civic interest of my own. I had been bicycling to work for a couple of years, and my curiosity about how communities could develop better transportation infrastructure led me down the urban planning rabbit hole. I read Streetsblog and The Death and Life of Great American Cities, and I became deeply interested in the often tenuous relationship between citizens and their governments. I attended the first Open Cities conference in DC and was once again inspired by Nick Grossman, Ben Berkowitz, the conference’s organizer and Next City founder Diana Lind, and many of the other passionate speakers and attendees.
But I was also skeptical of technology’s role in civic matters. Crimespotting, while an important stake in the ground for open data and interactive mapping, was problematic because it led many viewers to reach unfortunate conclusions about Oakland and San Francisco, and it reinforced negative perceptions of high-crime areas. Mike put it nicely (emphasis mine):
Just showing crime is relentlessly negative, and seems to really draw out the kind of graffiti-squad neighborhood busybodies who focus solely on little problems. A near-universal reaction from non-residents to this particular project has been relief that they don’t live in Oakland, but it’s really not that bad here. It just looks bad when all you show is crime. We’d like to map other things: city services (police, fire, emergency), tax parcels, effects of policy, other administrative information that’s hugely important.
I, too, was dissatisfied with just—cue Eric, with “just” in air quotes—plotting crime reports on a map and letting people draw their own conclusions. At Stamen we talked for years about augmenting Crimespotting with report statistics over time and other data, such as calls for service, but we simply never found the time (or the grant money) to do it.
Beauty and Utility
In my own spare time, though, I was experimenting with graphical treatments to show crime with other data, and when I spoke at Open Cities in 2010 I showed this slide:
I made the image above, Trees, Cabs and Crime in San Francisco, in 2009 from three seemingly unrelated data sources: the locations of street trees under the care of Friends of the Urban Forest in cyan; Yellow Cab taxi GPS pings from Cabspotting in yellow; and crime reports from SFPD via Crimespotting in magenta. The three colors were blended subtractively (crudely mimicking the CMYK color process) to create red, green, blue and black wherever they coincided. The phrase “beautiful and useful” was a response to a frequent critique of Stamen projects, “beautiful but useless.”
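For the curious, here is a minimal sketch of what subtractive blending means for these three colors. It is not the code behind the original image. It assumes a simple “multiply” blend on a white background, and the names (multiply_blend, to_hex) are made up for illustration.

```python
# Illustrative sketch only: assumes a "multiply" blend, which is one common
# way to approximate subtractive color mixing. Not the original image code.

# Layer colors as RGB triples in the 0.0-1.0 range.
CYAN = (0.0, 1.0, 1.0)     # street trees
YELLOW = (1.0, 1.0, 0.0)   # taxi GPS pings
MAGENTA = (1.0, 0.0, 1.0)  # crime reports
WHITE = (1.0, 1.0, 1.0)    # empty background


def multiply_blend(*colors):
    """Blend colors by multiplying them channel by channel, starting from white."""
    result = WHITE
    for color in colors:
        result = tuple(a * b for a, b in zip(result, color))
    return result


def to_hex(color):
    """Format a 0.0-1.0 RGB triple as a hex color string."""
    return "#" + "".join(f"{round(c * 255):02x}" for c in color)


if __name__ == "__main__":
    overlaps = {
        "trees + cabs": multiply_blend(CYAN, YELLOW),
        "trees + crime": multiply_blend(CYAN, MAGENTA),
        "cabs + crime": multiply_blend(YELLOW, MAGENTA),
        "all three": multiply_blend(CYAN, YELLOW, MAGENTA),
    }
    for label, color in overlaps.items():
        print(label, to_hex(color))
```

Under that assumption, trees and cabs overlap as green, trees and crime as blue, cabs and crime as red, and all three together as black, while a single layer over white keeps its own color.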
Even though the image was of dubious utility, I thought that it illustrated my point that beauty can serve a valuable role in making data relevant to society. The folks at the Institute for Urban Design seemed to know exactly what I meant when they chose Trees, Cabs and Crime to be featured in the Spontaneous Interventions exhibit at the 2012 Venice Biennale of Architecture, because they didn’t ascribe any higher purpose to it than “sussing out patterns.” I was genuinely interested in seeing whether the three data sets that I’d chosen (admittedly, arbitrarily) matched up in interesting ways and, more generally, in experimenting with a novel way to combine multiple geographic data sets without obscuring any of them. It failed on the first point but succeeded on the second, and also in another way that I hadn’t anticipated: the unexpected beauty of the image attracted attention, and it raised questions.
Just as information wants to be free, I believe that data wants to be seen. Visualization is a powerful tool for communicating and understanding complex data, and when it’s successful as a process it not only answers important questions but prompts new ones.
For instance, people often asked me why there aren’t any trees in the long swath of Golden Gate Park in the image above, and the reason is that the data came from a non-profit that plants and maintains street trees. (The Urban Forest Map now houses this database, and the Department of Public Works maintains its own list.) Similarly, when discussing Crimespotting I always took care to explain that it was not a map of crime, but of crime reports, which must pass through numerous institutional filters (in some instances, to protect the victim’s privacy) before being released to the public. Homicides, I discovered when I set out to map them on Dotspotting, were initially included in San Francisco’s report data but mysteriously disappeared some time in 2011. Visually exploring data was an integral first step in nearly every single project at Stamen.
Seeing and interacting with the data helps us perceive its boundaries and better understand its limitations, its vagaries and subtleties. And understanding those aspects of data allows us to ask the really tough questions: not just about what’s in the data, but what’s not. Questions about timeliness and provenance, and the often bureaucratic processes by which data becomes open. Purveyors of open data (most notably, government) need to raise the bar above links to spreadsheets and learn to deal also in the interactive interfaces that help translate data into useful information.
Onward and Upward
It’s been a wild nine years since I started working at Stamen, and it was a wonderful and unique environment for me to develop these thoughts and skills. I simply wouldn’t be where I am today if it weren’t for all of my thoughtful and inspiring colleagues—in particular, my brilliant business partners and friends, Eric Rodenbeck and Mike Migurski—and clients with whom I’ve had the pleasure of working. I have so much more to learn, and I can’t wait to put it all to work for the American people.