Learning Python Data Visualization Summary

Learning Python Data Visualization

by Chad Adams 2014 212 pages

Key Takeaways

1. Set up Python development environment for data visualization

Python 2.7 on Windows includes pip and easy_install by default.

Choose your platform. Install Python 2.7 on Windows, Mac, or Linux. For Windows, download the 32-bit MSI installer from python.org and add Python to your system PATH. Mac and Linux often come with Python pre-installed.

Install essential tools. Use pip or easy_install to add key libraries:

  • lxml: XML parsing library (use Windows installer if needed)
  • pygal: Main charting library for this book
  • BeautifulSoup: HTML parsing library (optional)

Select an IDE. Choose a Python-friendly integrated development environment:

  • Visual Studio with Python Tools (Windows)
  • Eclipse with PyDev (cross-platform)
  • PyCharm (cross-platform)

2. Master Python basics and file I/O for chart creation

Python is a very loose language—there are no braces wrapping the function and no semicolons to terminate a line of code.

Understand Python syntax. Python uses indentation for code blocks instead of braces. Functions are defined with 'def', and the main program entry point is often marked with 'if __name__ == "__main__":'.
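
A minimal sketch of these conventions, assuming a hypothetical helper named `average` (it is not from the book). The indented lines form each function body, and the guard at the bottom calls `main()` only when the file is run directly:

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    # The indented lines below belong to the function; no braces are needed.
    total = sum(values)
    return total / float(len(values))


def main():
    # Hypothetical sample data, used only for illustration.
    print(average([100, 150, 200]))


# Entry point: true only when the file is run as a script, not when imported.
if __name__ == "__main__":
    main()
```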

Work with data types and structures:

  • Strings, integers, floats
  • Lists and dictionaries
  • Date and time objects
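
The types listed above might appear together in a small record like the following. This is only a sketch; every name and value here is an illustrative assumption rather than data from the book:

```python
from datetime import date

# Hypothetical sample record; the field names are illustrative only.
title = "Monthly Sales"                        # string
units_sold = 180                               # integer
unit_price = 2.5                               # float
months = ["Jan", "Feb", "Mar"]                 # list: ordered and indexable
sales = {"Jan": 100, "Feb": 150, "Mar": 200}   # dictionary: key -> value
report_date = date(2014, 8, 1)                 # date object

revenue = units_sold * unit_price              # int * float gives a float
values = [sales[m] for m in months]            # dictionary lookups in list order
label = report_date.strftime("%Y-%m-%d")       # date formatted as a string
```

A list of values in label order, like `values` here, is the shape most chart series expect.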

File operations. Learn to read from and write to files:
python
with open('filename.txt', 'r') as file:
    content = file.read()

with open('output.txt', 'w') as file:
    file.write('Hello, World!')
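
For charting, file contents usually need to be turned into lists of labels and numbers. The following sketch writes a few comma-separated lines to a temporary file and reads them back; the file name, the sample rows, and the "label,value" layout are all assumptions made for illustration:

```python
import os
import tempfile

# Hypothetical sample data: one "label,value" pair per line.
rows = [("Jan", 100), ("Feb", 150), ("Mar", 200)]

path = os.path.join(tempfile.mkdtemp(), "sales.txt")

# Write each pair as a comma-separated line.
with open(path, "w") as handle:
    for label, value in rows:
        handle.write("%s,%d\n" % (label, value))

# Read the file back and split it into the two lists a chart needs.
labels, values = [], []
with open(path, "r") as handle:
    for line in handle:
        label, value = line.strip().split(",")
        labels.append(label)
        values.append(int(value))
```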

Generate basic graphics. Use the Python Imaging Library (PIL) to create simple images and text overlays.

3. Create basic charts with pygal: line, bar, and XY charts

Pygal (http://pygal.org/) is a Python-based SVG Charts Creator, developed by the Kozea Community (http://community.kozea.fr/).

Install and import pygal. Use pip to install pygal and import it in your Python script:
python
import pygal

Create line charts. Build simple and stacked line charts to show data trends over time:
python
line_chart = pygal.Line()
line_chart.title = 'Monthly Sales'
line_chart.x_labels = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
line_chart.add('Product A', [100, 150, 200, 180, 220, 190])
line_chart.render_to_file('line_chart.svg')

Develop bar charts. Use bar charts for comparing categories:
python
bar_chart = pygal.Bar()
bar_chart.x_labels = ['Firefox', 'Chrome', 'Safari']
bar_chart.add('Browser Usage', [34.1, 33.6, 17.9])

Construct XY charts. Create scatter plots and XY line charts for showing relationships between variables:
python
xy_chart = pygal.XY()
xy_chart.add('A', [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16)])
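
The points in the XY example above are the squares of 0 through 4. Lists like this can be generated instead of typed by hand. A small sketch, where the names `points` and `doubled` and the second series are illustrative assumptions:

```python
# Build (x, y) pairs for y = x squared.
points = [(x, x ** 2) for x in range(5)]

# A second, hypothetical series: y = 2x, for comparison on the same axes.
doubled = [(x, 2 * x) for x in range(5)]
```

Either list could be passed to `xy_chart.add` in place of a literal list of tuples.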

4. Develop advanced charts: pie, radar, box plots, and worldmaps

Worldmaps are great SVG charts. One thing to note is that worldmaps are very complex SVG images, so consider the amount of data your own charts will include, and steer clear of extremely complex datasets.

Create pie charts. Use pie charts to show parts of a whole:
python
pie_chart = pygal.Pie()
pie_chart.add('Chrome', 36.3)
pie_chart.add('Firefox', 33.9)
pie_chart.add('Safari', 16.8)

Build radar charts. Visualize multivariate data on a two-dimensional chart:
python
radar_chart = pygal.Radar()
radar_chart.x_labels = ['Speed', 'Reliability', 'Comfort', 'Safety', 'Efficiency']
radar_chart.add('Car A', [4, 3, 5, 4, 3])
radar_chart.add('Car B', [3, 4, 3, 5, 2])

Construct box plots. Show distribution of data sets:
python
box_plot = pygal.Box()
box_plot.add('Sample A', [1, 5, 7, 9, 12, 13, 15])
box_plot.add('Sample B', [2, 4, 6, 8, 11, 14, 16])
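
A box plot summarizes each series by its extremes, median, and quartiles. The sketch below computes those figures for Sample A with the standard library, assuming Python 3.8 or later for `statistics.quantiles`. Quartile conventions differ between tools, so the numbers pygal draws may not match these exactly:

```python
import statistics

sample_a = [1, 5, 7, 9, 12, 13, 15]

lowest = min(sample_a)
highest = max(sample_a)
median = statistics.median(sample_a)

# Quartile cut points; the default "exclusive" method is one of several conventions.
q1, q2, q3 = statistics.quantiles(sample_a, n=4)
```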

Generate worldmaps. Visualize global data:
python
# pygal 1.x API; in pygal 2.x the world map moved to the pygal_maps_world plugin
worldmap = pygal.Worldmap()
worldmap.title = 'Countries visited'
worldmap.add('Visits', {
    'us': 10,
    'fr': 5,
    'de': 3,
})

5. Customize pygal charts with themes and parameters

Pygal offers 14 prebuilt themes.

Apply themes. Change the visual style of your charts:
python
from pygal.style import NeonStyle
chart = pygal.Line(style=NeonStyle)

Adjust chart parameters. Customize various aspects of your charts:

  • Title and labels: chart.title = 'My Chart'
  • Size: chart.width = 800, chart.height = 600
  • Legend: chart.legend_at_bottom = True
  • Axis: chart.x_label_rotation = 45

Create custom configurations. Define reusable chart configurations:
python
class MyConfig(pygal.Config):
    width = 1000
    height = 600
    title_font_size = 24

chart = pygal.Line(MyConfig())

6. Import and parse dynamic web data for charts

HTTP is the foundation of Internet communication. It's a protocol with two distinct types of requests: a request for data or GET and a push of data called a POST.
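
The difference between the two request types can be seen without touching the network. This sketch assumes Python 3, where `urllib.request` is the successor to Python 2's `urllib2`; the URL and parameters are hypothetical:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and parameters, for illustration only.
url = "http://example.com/data"
params = {"q": "charts"}

# GET: the parameters travel in the URL's query string; there is no body.
get_request = Request(url + "?" + urlencode(params))

# POST: the parameters travel in the request body as bytes.
post_request = Request(url, data=urlencode(params).encode("ascii"))

# No network traffic happens until a request is passed to urlopen().
print(get_request.get_method())
print(post_request.get_method())
```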

Fetch web data. Use the urllib2 library to retrieve data from the web:
python
import urllib2
response = urllib2.urlopen('http://example.com/data.xml')
data = response.read()

Parse XML data. Use the ElementTree library to extract information from XML:
python
from xml.etree import ElementTree
# 'data' is the XML string already read from the response above
root = ElementTree.fromstring(data)
for item in root.findall('.//item'):
    title = item.find('title').text
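
To try the parsing step without a live feed, an inline string can stand in for the downloaded response. A minimal sketch, in which the feed content is a made-up assumption:

```python
from xml.etree import ElementTree

# Hypothetical RSS fragment standing in for a downloaded response.
rss_text = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title></item>
    <item><title>Second post</title></item>
  </channel>
</rss>"""

root = ElementTree.fromstring(rss_text)

# './/item' matches <item> elements at any depth below the root.
titles = [item.find("title").text for item in root.findall(".//item")]
```

The channel's own `<title>` is not collected, because only titles nested inside `<item>` elements are read.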

Work with JSON data. Use the json library to parse JSON responses:
python
import json
data = json.loads(response.read())
for item in data['items']:
    print(item['title'])
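
The same loop can be tried offline with an inline string. In this sketch the payload and its `views` field are hypothetical; the point is that parsed JSON becomes ordinary dictionaries and lists that can be reshaped into chart series:

```python
import json

# Hypothetical JSON payload standing in for a web response body.
payload = (
    '{"items": ['
    '{"title": "First post", "views": 120}, '
    '{"title": "Second post", "views": 75}'
    ']}'
)

data = json.loads(payload)

# Split the records into parallel lists: labels and values.
titles = [item["title"] for item in data["items"]]
views = [item["views"] for item in data["items"]]
```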

Handle dates and times. Convert string dates to Python datetime objects:
python
from datetime import datetime
date_string = "2023-06-15T10:30:00Z"
date_object = datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%SZ")
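
Once strings become datetime objects, they can be grouped and sorted for a time-based chart. The following sketch counts entries per calendar day; the timestamps are invented for illustration and use the same format as the example above:

```python
from collections import Counter
from datetime import datetime

# Hypothetical timestamps in the same format as the example above.
stamps = [
    "2023-06-15T10:30:00Z",
    "2023-06-15T18:05:00Z",
    "2023-06-16T09:00:00Z",
]

parsed = [datetime.strptime(s, "%Y-%m-%dT%H:%M:%SZ") for s in stamps]

# Count entries per calendar day, then sort the days chronologically.
per_day = Counter(d.date() for d in parsed)
days = sorted(per_day)
labels = [day.isoformat() for day in days]
counts = [per_day[day] for day in days]
```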

7. Build a complete data visualization application

Together we have been through a long process of learning Python data visualization development, handling data, and creating charts using the pygal library. Now it's time to put these skills to work.

Plan your application. Break down the project into modules:

  1. Data retrieval module
  2. Data processing module
  3. Chart generation module
  4. Main application module

Create a data retrieval module. Fetch and parse RSS feed data:
python
def get_rss_data(url):
    # Fetch and parse RSS data
    return parsed_data

Develop a data processing module. Clean and organize the retrieved data:
python
def process_data(raw_data):
    # Process and structure the data
    return processed_data

Build a chart generation module. Create charts based on the processed data:
python
def generate_chart(data):
    chart = pygal.Bar()
    # Add data to the chart
    return chart

Combine modules in the main application. Orchestrate the data flow and chart creation:
python
def main():
    raw_data = get_rss_data(RSS_URL)
    processed_data = process_data(raw_data)
    chart = generate_chart(processed_data)
    chart.render_to_file('output_chart.svg')
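
The retrieval and processing steps can be filled in and exercised without a network connection or a charting library. This is one possible sketch, not the book's implementation: the feed text, the `parse_rss` helper, and the choice to count items per category are all assumptions, and the pipeline stops at the labels and values a chart step would consume:

```python
from collections import Counter
from xml.etree import ElementTree

# Hypothetical feed text; a real application would download this instead.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <item><title>Intro to pygal</title><category>charts</category></item>
  <item><title>Parsing feeds</title><category>data</category></item>
  <item><title>Styling charts</title><category>charts</category></item>
</channel></rss>"""


def parse_rss(xml_text):
    """Data retrieval step: return one dict per <item> in the feed."""
    root = ElementTree.fromstring(xml_text)
    return [
        {"title": item.findtext("title"), "category": item.findtext("category")}
        for item in root.findall(".//item")
    ]


def process_data(raw_data):
    """Data processing step: count items per category, sorted by name."""
    counts = Counter(entry["category"] for entry in raw_data)
    labels = sorted(counts)
    return labels, [counts[label] for label in labels]


def main():
    labels, values = process_data(parse_rss(SAMPLE_RSS))
    # These two lists are what a chart generation step would consume.
    print(labels, values)
    return labels, values


if __name__ == "__main__":
    main()
```

Keeping each step as a separate function means the download, the counting rule, or the chart type can be swapped out without touching the others.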

8. Explore additional Python visualization libraries

In the Python world of graphics and data charting, one of the most popular libraries out there is matplotlib.

Learn matplotlib. A powerful 2D and 3D plotting library:

  • Installation: pip install matplotlib
  • Basic usage:

    import matplotlib.pyplot as plt
    plt.plot([1, 2, 3, 4])
    plt.ylabel('some numbers')
    plt.show()

Discover Plotly. A web-based plotting library with Python API:

  • Requires an account and API key
  • Enables interactive, shareable charts
  • Example:

    import plotly.graph_objs as go
    import plotly.plotly as py

    trace = go.Scatter(x=[1, 2, 3], y=[4, 5, 6])
    py.plot([trace], filename='basic-line')

Explore other libraries:

  • Seaborn: Statistical data visualization
  • Bokeh: Interactive visualization library
  • Altair: Declarative statistical visualization library

Remember to choose the right library based on your specific needs, such as interactivity requirements, chart types, and deployment environment.
