19/6/2017

Georgia lawmakers write ~4,800 things, mostly honors

 

The idea for “Georgia lawmakers write ~4,800 things, mostly honors” came to me in the state House press box.  I was drumming my fingers, watching the morning parade of worthy state residents who arrived at the Capitol with their beauty queen tiaras or tournament-scarred footballs to collect applause from elected officials.

These ceremonies take more than an hour most days. And that’s just the people who show up to be honored. Many more “honoring” resolutions get approved en masse each day. So I got to wondering how to show people that there’s a ton of ceremony amid the lawmaking.

Brainstorming on that, I saw I could show a lot more: what happens to every bill in the legislature, not just the ceremonial honors.

A little background: I’m a freelancer, one person working alone. Most of what I sell is print reporting on government and politics, not data projects. But I do think that data analysis, especially rolled out as visualizations, can help us tell complex or abstract stories. 

The data set here is small, about 4,800 lines; really it's small enough to be done in Excel.

But Python is better for this project than Excel because it makes the data analysis repeatable and less error-prone.  I’ve used the same script for several years now. It was also a good project for making the leap from a GUI to a blinking cursor: I could go back to Excel and "see" that the instructions I gave Python resulted in the numbers I expected.

So, again the question was: how can I show what the Georgia state Legislature does?

I decided to interrogate data that’s easy to get via a Python scraper: a list of all the bills and resolutions that lawmakers filed.  The scraper grabs key information about each piece of legislation: is it a resolution or a bill, to which committee was it sent, what was the final status of the bill?

So that’s fine, but how to present it?

I was starting to fiddle around with how to visualize information, and stumbled on the canonical Sankey diagram example on D3js.org.

To put this info in a Sankey diagram, I needed to know how many bills go in each node, that is, each rectangle in the chart.  So I had to sort the legislation into several buckets, then count how many landed in each bucket (or node, or rectangle on the chart).

First, is it an “honorary” piece of legislation?  Or is it “political”?

If it’s “political,” did it get further: a vote by the full House (or Senate)?

If it got through one chamber, did it get through the other?

Well, that’s a lot of yes/no questions and if-then-else statements.  Perfect for an algorithm!
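Here is a minimal sketch of that chain of yes/no questions, not the actual script. The sample rows, the column names (`honorary`, `passed_first_chamber`, `passed_second_chamber`) and the bucket labels are all assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the real data has about 4,800 of them.
# All column names and values here are assumptions for illustration.
bills = pd.DataFrame({
    "number": ["HR 1", "SR 2", "HB 3", "SB 4", "HB 5"],
    "honorary": [True, True, False, False, False],
    "passed_first_chamber": [False, False, True, True, False],
    "passed_second_chamber": [False, False, True, False, False],
})


def bucket(row):
    """Ask the yes/no questions in order and return one bucket label."""
    if row["honorary"]:
        return "honorary"
    if not row["passed_first_chamber"]:
        return "political, stalled"
    if not row["passed_second_chamber"]:
        return "political, passed one chamber"
    return "political, passed both chambers"


# Label every row, then count how many rows landed in each bucket.
bills["bucket"] = bills.apply(bucket, axis=1)
counts = bills["bucket"].value_counts()
print(counts)
```

Each bucket count would become the size of one node in the Sankey diagram.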

I decided to use Python because that was what I knew.  Then I found pandas as I was Googling for a library I could use to make multiple filters on tabular data.

So, take the first question: is a piece of legislation “honorary” or “political”?  There’s no flag for that in the data set, so I had to use several logic tests.

# A is the dataframe that holds all legislation.

# “SR” and “HR” are legislation types: Senate and House resolutions.

# Generally, “honorary” stuff is in resolutions, so grabbing them is a good first step 

B = A[(A.type == 'SR') | (A.type == 'HR')]  # choose all resolutions

So there’s my first filter:  B is a dataframe of all resolutions.

Now, because it’s lawmaking, it’s never that simple.  There are always arcane exceptions.  I have to put more logic tests on B.  I need to remove “-CA” titles, “constitutional amendments.”

CA_mask = (B.title.str.contains("-CA") == True)

B = B[~CA_mask]
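For anyone who wants to try these two filters, here is a self-contained sketch on a tiny made-up table. The sample rows are hypothetical. It also swaps in two variations that are not in the original script: `isin` in place of the two `==` tests, and `na=False` in place of `== True` so that a missing title counts as "not a constitutional amendment."

```python
import pandas as pd

# Hypothetical stand-in for A, the dataframe of all legislation.
A = pd.DataFrame({
    "type": ["HR", "SR", "HB", "SR", "HR"],
    "title": [
        "Honoring a team",
        "Sales tax-CA",
        "Budget",
        None,  # a missing title, to show how na=False behaves
        "Commending a teacher",
    ],
})

# Step 1: keep only Senate and House resolutions.
resolutions = A[A["type"].isin(["SR", "HR"])]

# Step 2: flag titles containing the literal text "-CA".
# regex=False treats "-CA" as plain text; na=False maps missing titles to False.
is_amendment = resolutions["title"].str.contains("-CA", na=False, regex=False)

# Step 3: ~ inverts the mask, keeping everything that is NOT flagged.
B = resolutions[~is_amendment]
print(B)
```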

Now I won’t get any further into the weeds of Georgia lawmaking.  There are a lot more filters in my Python script.  And in fact, just looking at my old code, I can see a lot more efficient ways to use pandas and Python, whew, and better variable names. Yikes.

But anyway, the takeaway is this:  Do you need to write complex filters for tabular data?  If so, consider pandas! 

The output of my script is a simple .csv table that counts up how many of each kind of bill there are:  total bills, honorary bills, political bills, political bills that got a hearing, and so on.
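A minimal sketch of that counting step might look like the following. The `bucket` column, its labels and the two-column layout of the output are assumptions for illustration, not the layout of the real .csv.

```python
import pandas as pd

# Hypothetical table in which every piece of legislation already has a bucket label.
bills = pd.DataFrame({
    "bucket": [
        "honorary", "honorary", "honorary",
        "political, stalled", "political, stalled",
        "political, got a hearing",
    ]
})

# Count the rows in each bucket and turn the result into a two-column table.
counts = (
    bills["bucket"]
    .value_counts()
    .rename_axis("bucket")
    .reset_index(name="count")
)

# Put a grand total on top of the per-bucket counts.
total = pd.DataFrame([{"bucket": "total", "count": len(bills)}])
summary = pd.concat([total, counts], ignore_index=True)

# With no path given, to_csv returns the CSV as a string;
# pass a filename instead to write a file.
csv_text = summary.to_csv(index=False)
print(csv_text)
```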

And I use that .csv to create a .json for my online dataviz. The code is all on my GitHub.
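As a rough illustration of the .csv-to-.json step, the sketch below builds the nodes-and-links structure that D3 Sankey examples commonly read. It assumes a hypothetical .csv with `source`, `target` and `count` columns; the real file and the real JSON may be laid out differently.

```python
import csv
import io
import json

# Hypothetical CSV: one row per flow between two nodes of the diagram.
csv_text = """source,target,count
all legislation,honorary,3
all legislation,political,3
political,passed one chamber,1
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Collect each node name once, in order of first appearance.
names = []
for row in rows:
    for key in ("source", "target"):
        if row[key] not in names:
            names.append(row[key])

# Links refer to nodes by their position in the nodes list.
position = {name: i for i, name in enumerate(names)}

sankey = {
    "nodes": [{"name": name} for name in names],
    "links": [
        {
            "source": position[row["source"]],
            "target": position[row["target"]],
            "value": int(row["count"]),
        }
        for row in rows
    ],
}

sankey_json = json.dumps(sankey, indent=2)
print(sankey_json)
```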

I wanted very fine control of the look of the final chart; and besides, there’s no reason for it to be interactive.

So I exported the unannotated, raw graphic via SVG Crowbar. And I finished it in Illustrator.

And I put in lots of footnotes about the judgments I made about what is “political,” and about the practical problems of trying to quantify the very messy process of lawmaking. 

About the author:

Maggie Lee is a freelance reporter based in Atlanta. She narrowly dodged a career in academia after a BS in International Relations at Georgia Tech and an MA in Southeast Asian Studies at the National University of Singapore. Her bylines have been in places like the Atlanta alt-weekly Creative Loafing and the Macon Telegraph, and she became an Atlanta Press Club board member in 2016.

 

Image: Steven Martin.
