CSV to chart with Pandas

Any computer uses data all the time. Sometimes that's in databases, sometimes on the web, sometimes from sensor input and sometimes office data like Excel or CSV files.

You probably know that you can easily parse a CSV file with Pandas. But did you know you can quite easily create plots directly from the CSV data?

Data set

A CSV data set is simply data. It could come from an office suite like GSheets or Open Office, where you can save a file as CSV (comma separated values). As the name suggests, every value is separated by a comma.
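To make the format concrete, here is a minimal sketch of a tiny CSV document and how Pandas parses it. The titles and numbers are made up for illustration and are not taken from the real data set; `io.StringIO` simply lets `read_csv` read from a string instead of a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample: the first line holds the column names,
# every following line is one record, values separated by commas.
csv_text = """Title,Year,Rating
Example Movie A,2014,8.1
Example Movie B,2016,7.0
"""

# read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))
print(sample)
```

Pandas takes the column names from the first line and infers a type for each column, so `Year` and `Rating` come back as numbers rather than text.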

Any data set will work, but the example below uses a CSV data set stored in the file IMDB-Movie-Data.csv.

This data set is about movies.
For every movie it stores these values:

  • Rank
  • Title
  • Genre
  • Description
  • Director
  • Actors
  • Year
  • Runtime (Minutes)
  • Rating
  • Votes
  • Revenue (Millions)
  • Metascore

So that’s a lot of information. It’s a small data set of 1000 records.
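Before plotting, it can help to check what Pandas actually loaded. The sketch below uses a small hand-made DataFrame with a few of the column names listed above; the rows are hypothetical, and with the real file you would call the same attributes on the result of `pd.read_csv`.

```python
import pandas as pd

# Hypothetical rows that reuse some of the column names from the list above
movies = pd.DataFrame({
    "Title": ["Example A", "Example B", "Example C"],
    "Year": [2014, 2016, 2016],
    "Runtime (Minutes)": [121, 98, 140],
    "Rating": [8.1, 7.0, 6.4],
})

print(movies.shape)             # (number of rows, number of columns)
print(movies.columns.tolist())  # the column names
print(movies.dtypes)            # the type Pandas inferred for each column

# Column names with spaces or parentheses need bracket access
print(movies["Runtime (Minutes)"].max())
```

Bracket access works for every column name, which is why the plotting code in this article uses `movie["Rating"]` rather than attribute access.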

Pandas

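We first load the pandas module, matplotlib for plotting and numpy for number crunching. Then we load the CSV data and create the figures: one directly from pandas and one with matplotlib.

#!/usr/bin/python3
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the CSV file into a DataFrame
movie = pd.read_csv("IMDB-Movie-Data.csv")

# Print the average rating (a bare expression shows nothing when run as a script)
print(movie["Rating"].mean())

# Quick histogram straight from the pandas column
movie["Rating"].plot(kind="hist", figsize=(20, 8))

# The same data in a second figure: 20 bins with a tick at every bin edge
plt.figure(figsize=(20, 8), dpi=80)
plt.hist(movie["Rating"], 20)
plt.xticks(np.linspace(movie["Rating"].min(), movie["Rating"].max(), 21))
plt.grid(linestyle="--", alpha=0.5)
plt.show()  # displays both figures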


So that shows you the movie rating data. Mind you, the CSV file holds 1000 records and pandas handles them almost instantly.
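The same approach works for other chart types. As a hedged sketch, the code below groups ratings by year and draws a bar chart of the mean rating, then saves it to a PNG file instead of opening a window. The data is a hypothetical four-row sample and the output filename `rating_by_year.png` is an arbitrary choice; with the real file you would replace the hand-made DataFrame with the result of `pd.read_csv`.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend: render to files, no window needed

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample with two of the columns from the movie data set
movies = pd.DataFrame({
    "Year": [2014, 2014, 2016, 2016],
    "Rating": [8.0, 7.0, 6.5, 7.5],
})

# Average rating per year, as a Series indexed by year
mean_by_year = movies.groupby("Year")["Rating"].mean()

# Plot directly from the pandas object; plot() returns a matplotlib Axes
ax = mean_by_year.plot(kind="bar", figsize=(8, 4))
ax.set_xlabel("Year")
ax.set_ylabel("Mean rating")
ax.grid(axis="y", linestyle="--", alpha=0.5)

# Write the chart to disk (filename chosen for this example)
ax.get_figure().savefig("rating_by_year.png", dpi=80)
plt.close(ax.get_figure())
```

Because `plot()` hands back a regular matplotlib Axes, you can keep the one-line pandas call and still adjust labels and grid lines afterwards.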
