5/12/2017

The Python Graph Gallery

 

Hundreds of Python charts, displayed with their reproducible code snippets.

By Yan Holtz, Founder

Data journalism is a field closely related to data science. To write an article, data journalists have to follow the traditional steps of any data-driven project. These include exploratory and explanatory analysis, and data visualization is a key step in both.

During exploratory analysis, journalists must be able to understand their data quickly through simple graphics, moving from one chart to another to answer their questions. Once interesting results are discovered, data visualization is often used to showcase them. But for a story to be eye-catching and easy to understand, the journalist will often spend a lot of time customizing the graphic.

Python is probably the most widely used programming language for data science and offers great possibilities when it comes to representing your data graphically. However, the huge number of tools and the potential complexity of the documentation make it difficult to build the chart you want.

The Python Graph Gallery is a website that displays hundreds of graphics made with Python, always providing a reproducible code snippet. Whatever chart you want to make, you will probably find an example close to it in the gallery.

400 graphics and 40 sections

The gallery currently provides about 400 distinct charts organized in 40 sections. Each section is represented by a logo made by designer Conor Healy. The logo's color depends on the topic of the graphic: distribution, correlation, part of a whole, maps, flow, evolution, and more. This classification is inspired by the Graphic Continuum and should allow you to quickly find the chart you need.

Image: The gallery is organized in 40 chart types.

Of course, the most common plot types, such as barplots, scatterplots, boxplots and histograms, are present. But less common chart types are covered as well, such as lollipop plots, bubble plots, 2D density plots and wordclouds.

From easy to tricky

Once you enter a chart section, several examples are displayed, from the easiest to the hardest. Usually, the first example describes how the input dataset must be formatted and how to make the graphic using the default parameters. Explanations are provided, and the code is kept to a strict minimum and commented line by line, making it easy to understand how the function works. Here is an overview of the simplest density plot you can make:

Image: Overview of the 'basic density plot' page.

Progressively, examples lead you from a very basic version to highly customized charts. Each example aims to explain one particular tip, like customizing colours, flipping axes, adding several groups and so on. At the end of the section, you will find some ‘real life examples’ combining all these tips to get a nice customized figure.

Image: Lollipop plot with 2 groups (chart #184).
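A chart of this kind can be approximated with plain Matplotlib. The sketch below is an illustration only, not the code of chart #184: the data, the column names (`item`, `group1`, `group2`) and the colours are hypothetical. It draws one grey segment per item and one dot per group at each end of the segment.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: one row per item, one column per group.
rng = np.random.default_rng(seed=1)
df = pd.DataFrame({
    "item": list("ABCDEFGH"),
    "group1": rng.uniform(0, 10, size=8),
    "group2": rng.uniform(0, 10, size=8),
})

# Order the items by the value of the first group to make the chart easier to read.
ordered = df.sort_values("group1").reset_index(drop=True)
y = np.arange(len(ordered))

fig, ax = plt.subplots()
# One horizontal segment per item, joining the two group values.
ax.hlines(y=y, xmin=ordered["group1"], xmax=ordered["group2"], color="grey", alpha=0.5)
# One dot per group at each end of the segment.
ax.scatter(ordered["group1"], y, color="skyblue", label="group1", zorder=3)
ax.scatter(ordered["group2"], y, color="green", label="group2", zorder=3)

# Label each row with its item name.
ax.set_yticks(y)
ax.set_yticklabels(ordered["item"])
ax.set_xlabel("Value")
ax.legend()
fig.savefig("lollipop_two_groups.png")
```

Sorting the rows before plotting is a design choice rather than a requirement, but it usually makes the gap between the two groups easier to compare across items.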

A focus on Matplotlib and Seaborn

Many libraries exist for making charts with Python. I decided to rely mainly on Matplotlib and Seaborn, which are currently the two main tools in use. Almost every type of chart is feasible with them. When that was not the case, I used other libraries such as folium for maps or NetworkX for networks.

Note that both Matplotlib and Seaborn have a dedicated page showing tips that apply to every type of chart, like customizing axes and titles, calling different themes, and controlling colours. These pages are a quick way to find the everyday code snippets that we tend to forget.
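As an illustration of the kind of generic tips meant here, the following sketch applies a Seaborn theme and then customizes the title, axis labels, axis limits and line colour with Matplotlib. The data, the labels and the file name are made up for the example; it is not taken from the gallery pages.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Call a Seaborn theme: it changes the look of the Matplotlib charts drawn afterwards.
sns.set_style("whitegrid")

# Hypothetical data for the example.
x = np.linspace(0, 10, 50)

fig, ax = plt.subplots()
# Control the colour and width of the line.
ax.plot(x, np.sin(x), color="purple", linewidth=2)

# Customize the title and the axis labels.
ax.set_title("A custom title", loc="left", fontweight="bold")
ax.set_xlabel("x value")
ax.set_ylabel("sin(x)")

# Restrict the visible range of the x axis.
ax.set_xlim(0, 10)
fig.savefig("customized_chart.png")
```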

Code: Here.

Conclusion

The Python Graph Gallery displays hundreds of graphics and can hopefully help you quickly build the chart you need, for both exploratory and explanatory analysis. In this sense, it mainly aims to help users from a technical point of view.

However, the goal is also to increase users' knowledge of data visualization:

  • by visiting the website, you may discover new types of dataviz that could fit your data
  • each section is introduced by a short description explaining when the chart type is advised
  • a "bad chart" section is sometimes included at the bottom of a section, warning you about the common mistakes made with this type of chart

The gallery is a project developed by Yan Holtz during his nights and holidays, so please be indulgent about bugs, inaccuracies and English mistakes. Bug reports and feedback are very welcome by email or via Twitter: @R_Graph_Gallery. Last but not least, note that the Python Graph Gallery has a twin sister: the R Graph Gallery.

About the author

Yan Holtz is a passionate data analyst and bioinformatician currently working for the Queensland Brain Institute in Brisbane. He has a special interest in data visualization, which led him to build the R and the Python graph galleries. Website here.

Explore the Python Graph Gallery here.
