Statwing: Powerful Data Analysis, Simple to Use


Originally published by Mirko Lorenz on Vision Cloud on 23 September 2012. This article is republished with permission.


There is so much data in the world, but as Hans Rosling recently said at OKFest: "We (the people) did not update." Not knowing the facts, and not checking the data points behind what we believe, can lead to wrong conclusions and mistakes. Presumably, this is where we are now, and it is why there is a growing need for data analysis tools. There is much talk about data science and Big Data, but what has been missing so far are simple-to-use tools.

That may be drumming up expectations a bit too much. But here, enter Statwing. Launched only recently, the service enables complex statistical analysis while remaining easy to use: like R or SPSS, but much, much simpler to handle. We ran a test with a dataset that was neither too big nor too small, in this case a data table of the passengers on the Titanic. The result: with Statwing it was incredibly simple to analyse statistical relationships. You do this by comparing different variables (e.g. survival vs. passenger class, or survival vs. gender). See some screenshots of results below.
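
To make the kind of comparison described above concrete, here is a minimal Python sketch of a survival breakdown by passenger class and by gender. It is not how Statwing works internally. The `survival_rate_by` helper and the handful of records are invented for illustration and are not taken from the actual Titanic data.

```python
from collections import defaultdict

# Hypothetical toy records, invented for illustration only (not real Titanic data).
passengers = [
    {"pclass": 1, "sex": "female", "survived": 1},
    {"pclass": 1, "sex": "male", "survived": 0},
    {"pclass": 2, "sex": "female", "survived": 1},
    {"pclass": 3, "sex": "male", "survived": 0},
    {"pclass": 3, "sex": "female", "survived": 0},
    {"pclass": 3, "sex": "male", "survived": 1},
]


def survival_rate_by(records, key):
    """Return {group value: share of survivors} for the given column."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in records:
        totals[row[key]] += 1
        survivors[row[key]] += row["survived"]
    return {group: survivors[group] / totals[group] for group in totals}


# Compare survival against two different variables.
by_class = survival_rate_by(passengers, "pclass")
by_sex = survival_rate_by(passengers, "sex")
```

A breakdown like this only shows the raw rates. Deciding whether a difference between groups is statistically meaningful is the part a tool such as Statwing is meant to take care of.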

Below is an interview we conducted with Greg Laughlin, one of the founders of Statwing.

How would you describe what Statwing aims to do?

Laughlin: "We're trying to make a really easy to use data analysis tool. In other words, we want to make data analysis intuitive and beautiful. At a more technical level, we encode statistical best practices into our software so that non-experts can get the same insight into their data as a statistician. For example, let's say you took a survey and you're interested in whether one gender has higher customer satisfaction than the other. To use existing software like SPSS or R, you have to know that that question calls for an Independent Samples T-Test; you then have to know how to run that test, and then how to interpret the very technical output. Statwing knows what the appropriate analysis is depending on how the data looks (e.g.: Is the data continuous or categorical? Does it have outliers?), runs the right test, and reports the results in understandable English."
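
The idea of picking a test from the shape of the data can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Statwing's actual logic: the function names `is_categorical`, `choose_test` and `welch_t` are hypothetical, the type rule is deliberately crude, and the statistic computed is Welch's variant of the independent samples t-test (which does not assume equal variances). The survey values are made up.

```python
from math import sqrt
from statistics import mean, variance


def is_categorical(values):
    """Crude assumption: a column is categorical if it holds non-numeric values."""
    return not all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


def choose_test(x, y):
    """Pick a test name from the column types (illustrative rule of thumb only)."""
    x_cat, y_cat = is_categorical(x), is_categorical(y)
    if x_cat and y_cat:
        return "chi-squared test"
    if x_cat != y_cat:
        groups = set(x if x_cat else y)
        return "independent samples t-test" if len(groups) == 2 else "ANOVA"
    return "correlation"


def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two independent samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up survey answers: gender and a 1-5 satisfaction score.
gender = ["f", "m", "f", "m", "f", "m"]
satisfaction = [4, 3, 5, 2, 4, 3]

test_name = choose_test(gender, satisfaction)
women = [s for g, s in zip(gender, satisfaction) if g == "f"]
men = [s for g, s in zip(gender, satisfaction) if g == "m"]
t_stat, dof = welch_t(women, men)
```

Turning the t statistic into a p-value needs the t distribution, which the Python standard library does not provide; a statistics package would normally handle that step, along with the outlier checks Laughlin mentions.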

Where did the idea for Statwing come from? Is there a story about the founders?

Laughlin: "I used to use SPSS to do survey data analysis at a small consultancy working to help foundations like the Gates Foundation or the Packard Foundation improve. I used SPSS to analyze survey data and I absolutely hated the software. I only needed the core functionality, and the software was clearly designed 30 years prior for a very different audience. I later moved to San Francisco and joined CrowdFlower. That's where I met my cofounder, John Le. John was an engineer and a data scientist there, and he had similar pain points as me. He used R a lot, and found it very clunky for doing the basic exploratory data analysis that analysts spend a lot of their time doing. We decided to leave CrowdFlower about a year ago, and the very first day we talked about what product we wanted to work on, we talked about making data analysis easier. Our working title for the product was briefly 'Data to the People', and that philosophy of making data analysis for normal individuals, not just statisticians, still underpins our work."

How did you get this off the ground in the first place?

Laughlin: "We relied on our savings (and my wife) at first, but then in April of this year we were accepted to Y Combinator. That program has been incredibly valuable; we've been surrounded by valuable mentors and inspiring peers. Getting accepted into that program also guaranteed us about $150,000 in funding. We're raising more money right now, but even that amount is plenty for us to live on for some time. Running our site is very inexpensive, so almost all our costs are basic things like food and rent that we as individuals need to pay for. We launched about a month ago in August 2012. We're very far from done, but even now we have users from around the world using the site and getting a lot of value out of it. Many of those users had used R or SPSS and just preferred Statwing, but many other users had never used a data analysis tool like that before."

Why do you think this offer is needed?

Laughlin: "People talk a lot about Big Data, but there's also been an explosion of Small Data out there. Now every data-driven professional has data about their customers, their product, their advertising, etc. Even individuals now track their health and activities at a very granular level. But while the amount of data in the world is exploding, the number of people with highly technical analysis skills is going up quite slowly. So either a lot of people need to learn a lot about data analysis, or we need to make data analysis easier. We think it's very possible to modernize data analysis in such a way that anyone can get a really deep understanding of their data without having to be highly trained."

Is there a data analysis or any example that you really admire?

Laughlin: "There are so many analyses I admire. I think my favourite analysis, which is really more of a favourite psychology or political science technique, is the use of 'list experiments'. I'm always a fan of using data to unearth surprising or hidden information. Unrelated note: I'm annoyed that most political pollsters don't release the full data, they only release summarization. I'd love to dig into the data and run some regressions."

Which use of Statwing by someone out there surprised you so far?

Laughlin: "My favourite use of Statwing so far was by my brother-in-law. He likes the Quantified Self movement, and he measured how many steps he takes each day, how happy he is, how much sleep he gets, etc. He'd never used a data analysis tool before but he used Statwing to find out that there was a positive correlation between how many steps he took that day and how happy he was. Unfortunately, there was also a positive correlation between how many steps he took and how much his knees hurt! We now have that analysis publicly available on our website."
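
For readers curious about what a "positive correlation" means numerically, here is a small Python sketch of the Pearson correlation coefficient. The daily numbers are invented for illustration; they are not the data from the analysis Laughlin mentions, and `pearson_r` is a hypothetical helper, not something Statwing exposes.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long number lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented daily log: step count and a self-reported happiness score (1-10).
steps = [2000, 4000, 6000, 8000, 10000]
happiness = [3, 4, 4, 6, 7]

r = pearson_r(steps, happiness)
```

A value of `r` close to 1 means the two measures tend to rise together, a value close to -1 means one falls as the other rises, and a value near 0 means there is no linear relationship. A correlation on its own does not say which of the two causes the other.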


Note: Statwing is currently in a free trial period. Users who want to use the tool regularly can choose a starter plan, which costs $19 per month. Additional plans will be rolled out in the future.