Introduction
- In this post you will learn
- Why we need a template system
- Basic syntax of Jinja Template (Python Jinja2)
- Generate a simple HTML report with Jinja and Google Chart (Google Chart)
Why we need a template system
While string formatting in Python is pretty straight forward e.g. print(f"Hello {name}")
, it could be quite troublesome when we are writing some long messages with many variables, or sometimes we want to use Python function like if
or for
loop in the paragraph.
A template system in Python, or any programming language, is a tool that allows you to separate the presentation logic (Template) from business logic (Code) in your applications. It provides a way to define and generate content, such as HTML, XML, or plain text, by combining static template files with dynamic data. In this post, we will demonstrate how to generate some simple HTML report in the following section.
Basic Syntax of Jinja Template
- Render your first jinja template
It looks very similar to string formatting at the first glance. We will first define a template t = Template(...)
with variable {{ var_name }}
, and then render it with the python variables t.render(var_name="sth")
.
<span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span><span>t</span> <span>=</span> <span>Template</span><span>(</span><span>'</span><span>Hello, {{ name }}!</span><span>'</span><span>)</span><span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span><span>name</span><span>=</span><span>'</span><span>John Doe</span><span>'</span><span>))</span><span># Output: </span><span>"</span><span>Hello, John Doe!</span><span>"</span><span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span> <span>t</span> <span>=</span> <span>Template</span><span>(</span><span>'</span><span>Hello, {{ name }}!</span><span>'</span><span>)</span> <span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span><span>name</span><span>=</span><span>'</span><span>John Doe</span><span>'</span><span>))</span> <span># Output: </span><span>"</span><span>Hello, John Doe!</span><span>"</span>from jinja2 import Template t = Template('Hello, {{ name }}!') print(t.render(name='John Doe')) # Output: "Hello, John Doe!"
Enter fullscreen mode Exit fullscreen mode
Basic syntax
The syntax is almost the same as python with a little bit of syntactic sugar, we can use some conditional block via {% %}
, and reference to the variable with {{ }}
. And another useful thing to note is, we can add a dash -
to the operator, so Jinja knows we dont want an additional line break, e.g. {%- -%}
(No line break) vs {% %}
(With line break).
a) Variables
Syntax: {{ foo }}
, {{ foo.bar }}
, {{ foo["bar"] }}
You can use a dot (.)
to access attributes of a variable in addition to the standard Python __getitem__
“subscript” syntax ([]
).
<span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span><span>baz</span> <span>=</span> <span>"</span><span>a</span><span>"</span><span>foo</span> <span>=</span> <span>{}</span><span>foo</span><span>[</span><span>"</span><span>bar</span><span>"</span><span>]</span> <span>=</span> <span>"</span><span>b</span><span>"</span><span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> {{ baz }} {{ foo.bar }} {{ foo[</span><span>"</span><span>bar</span><span>"</span><span>]}} </span><span>"""</span><span>)</span><span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span><span>foo</span><span>=</span><span>foo</span><span>,</span><span>baz</span><span>=</span><span>baz</span><span>,</span><span>))</span><span># Output </span><span>a</span><span>b</span><span>b</span><span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span> <span>baz</span> <span>=</span> <span>"</span><span>a</span><span>"</span> <span>foo</span> <span>=</span> <span>{}</span> <span>foo</span><span>[</span><span>"</span><span>bar</span><span>"</span><span>]</span> <span>=</span> <span>"</span><span>b</span><span>"</span> <span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> {{ baz }} {{ foo.bar }} {{ foo[</span><span>"</span><span>bar</span><span>"</span><span>]}} </span><span>"""</span><span>)</span> <span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span> <span>foo</span><span>=</span><span>foo</span><span>,</span> <span>baz</span><span>=</span><span>baz</span><span>,</span> <span>))</span> <span># Output </span><span>a</span> <span>b</span> <span>b</span>from jinja2 import Template baz = "a" foo = {} foo["bar"] = "b" t = Template(""" {{ baz }} {{ foo.bar }} {{ foo["bar"]}} """) print(t.render( foo=foo, baz=baz, )) # Output a b b
Enter fullscreen mode Exit fullscreen mode
b) For loop
Syntax: {% for i in some_list %} {{ i }} {% endfor %}
with line break and {%- for i in some_list -%} {{ i }} {% endfor %}
without line break
<span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span><span>some_students</span> <span>=</span> <span>[</span><span>"</span><span>john</span><span>"</span><span>,</span> <span>"</span><span>terry</span><span>"</span><span>,</span> <span>"</span><span>ken</span><span>"</span><span>]</span><span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> {%- for student in students -%} <li> {{ student }} </li> {% endfor %} </span><span>"""</span><span>)</span><span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span><span>students</span><span>=</span><span>some_students</span><span>))</span><span># output </span><span><</span><span>li</span><span>></span> <span>john</span> <span></</span><span>li</span><span>></span><span><</span><span>li</span><span>></span> <span>terry</span> <span></</span><span>li</span><span>></span><span><</span><span>li</span><span>></span> <span>ken</span> <span></</span><span>li</span><span>></span><span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span> <span>some_students</span> <span>=</span> <span>[</span><span>"</span><span>john</span><span>"</span><span>,</span> <span>"</span><span>terry</span><span>"</span><span>,</span> <span>"</span><span>ken</span><span>"</span><span>]</span> <span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> {%- for student in students -%} <li> {{ student }} </li> {% endfor %} </span><span>"""</span><span>)</span> <span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span><span>students</span><span>=</span><span>some_students</span><span>))</span> <span># output </span><span><</span><span>li</span><span>></span> <span>john</span> <span></</span><span>li</span><span>></span> <span><</span><span>li</span><span>></span> <span>terry</span> <span></</span><span>li</span><span>></span> <span><</span><span>li</span><span>></span> <span>ken</span> <span></</span><span>li</span><span>></span>from jinja2 import Template some_students = ["john", "terry", "ken"] t = Template(""" {%- for student in students -%} <li> {{ student }} </li> {% endfor %} """) print(t.render(students=some_students)) # output <li> john </li> <li> terry </li> <li> ken </li>
Enter fullscreen mode Exit fullscreen mode
c) If.. elif.. else
Syntax: {% if x = "a" %} {% elif x = "b" %} {% else %} {% endif %}
<span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span><span>x</span> <span>=</span> <span>40</span><span># Additional spacing is added for readability </span><span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> {% if x >= 0 and x < 30 %} x is larger than 0 {% elif x >= 30 %} x is larger than 30 {% else %} x is smaller than 30 {% endif %} </span><span>"""</span><span>)</span><span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span><span>x</span><span>=</span><span>x</span><span>))</span><span># output </span><span>x</span> <span>is</span> <span>larger</span> <span>than</span> <span>30</span><span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span> <span>x</span> <span>=</span> <span>40</span> <span># Additional spacing is added for readability </span><span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> {% if x >= 0 and x < 30 %} x is larger than 0 {% elif x >= 30 %} x is larger than 30 {% else %} x is smaller than 30 {% endif %} </span><span>"""</span><span>)</span> <span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span><span>x</span><span>=</span><span>x</span><span>))</span> <span># output </span><span>x</span> <span>is</span> <span>larger</span> <span>than</span> <span>30</span>from jinja2 import Template x = 40 # Additional spacing is added for readability t = Template(""" {% if x >= 0 and x < 30 %} x is larger than 0 {% elif x >= 30 %} x is larger than 30 {% else %} x is smaller than 30 {% endif %} """) print(t.render(x=x)) # output x is larger than 30
Enter fullscreen mode Exit fullscreen mode
d) Comments
Syntax: {# #}
(with new line) or {#- -#}
(without new line)
<span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span><span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> {#- This is a comment. And it is not displayed in the output. -#} Hello, {{ name }}! </span><span>"""</span><span>)</span><span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span><span>name</span><span>=</span><span>'</span><span>John Doe</span><span>'</span><span>))</span><span># Output </span><span>Hello</span><span>,</span> <span>John</span> <span>Doe</span><span>!</span><span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span> <span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> {#- This is a comment. And it is not displayed in the output. -#} Hello, {{ name }}! </span><span>"""</span><span>)</span> <span>print</span><span>(</span><span>t</span><span>.</span><span>render</span><span>(</span><span>name</span><span>=</span><span>'</span><span>John Doe</span><span>'</span><span>))</span> <span># Output </span><span>Hello</span><span>,</span> <span>John</span> <span>Doe</span><span>!</span>from jinja2 import Template t = Template(""" {#- This is a comment. And it is not displayed in the output. -#} Hello, {{ name }}! """) print(t.render(name='John Doe')) # Output Hello, John Doe!
Enter fullscreen mode Exit fullscreen mode
e) Read template from a file
For better file management, we usually put the files into a ./templates
folder. It helps to separate the presentation layer (template) and logical layer (code). The file structure would be something like this.
<span>.</span>├── main.py└── templates└── simple_report.html<span>.</span> ├── main.py └── templates └── simple_report.html. ├── main.py └── templates └── simple_report.html
Enter fullscreen mode Exit fullscreen mode
First we put the template file under ./templates/simple<sub>report.html
{#- <span><!-- templates/simple_report.html --></span> -#}<span><!DOCTYPE html></span><span><html></span><span><body></span><span><h1></span>Hi {{ name }}!<span></h1></span><span></body></span><span></html></span>{#- <span><!-- templates/simple_report.html --></span> -#} <span><!DOCTYPE html></span> <span><html></span> <span><body></span> <span><h1></span>Hi {{ name }}!<span></h1></span> <span></body></span> <span></html></span>{#- <!-- templates/simple_report.html --> -#} <!DOCTYPE html> <html> <body> <h1>Hi {{ name }}!</h1> </body> </html>
Enter fullscreen mode Exit fullscreen mode
Then we can read the templates with open("some_file.html", "r") as f: ... Template(f.read())
in main.py
<span># May mess up Jinja Template Hierarchy tho. In our simple case, it wont matter much </span><span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span><span>name</span> <span>=</span> <span>"</span><span>John</span><span>"</span><span>with</span> <span>open</span><span>(</span><span>"</span><span>templates/simple_report.html</span><span>"</span><span>,</span> <span>"</span><span>r</span><span>"</span><span>)</span> <span>as</span> <span>f</span><span>:</span><span>template</span> <span>=</span> <span>Template</span><span>(</span><span>f</span><span>.</span><span>read</span><span>())</span><span>rendered</span> <span>=</span> <span>template</span><span>.</span><span>render</span><span>(</span><span>name</span><span>=</span><span>name</span><span>)</span><span>print</span><span>(</span><span>rendered</span><span>)</span><span># May mess up Jinja Template Hierarchy tho. In our simple case, it wont matter much </span><span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span> <span>name</span> <span>=</span> <span>"</span><span>John</span><span>"</span> <span>with</span> <span>open</span><span>(</span><span>"</span><span>templates/simple_report.html</span><span>"</span><span>,</span> <span>"</span><span>r</span><span>"</span><span>)</span> <span>as</span> <span>f</span><span>:</span> <span>template</span> <span>=</span> <span>Template</span><span>(</span><span>f</span><span>.</span><span>read</span><span>())</span> <span>rendered</span> <span>=</span> <span>template</span><span>.</span><span>render</span><span>(</span><span>name</span><span>=</span><span>name</span><span>)</span> <span>print</span><span>(</span><span>rendered</span><span>)</span># May mess up Jinja Template Hierarchy tho. In our simple case, it wont matter much from jinja2 import Template name = "John" with open("templates/simple_report.html", "r") as f: template = Template(f.read()) rendered = template.render(name=name) print(rendered)
Enter fullscreen mode Exit fullscreen mode
Simple Reporting Templates with External Javascript Packages
After the long syntax introduction, we will jump straight to the use cases.
Sometimes when the buisness users are asking for some advance/deep-dive adhoc reports, there are not much report choices for a quick study, you can use
- Excel file
- Commercial products (e.g. Tableau, PowerBI, etc..)
- Self host a Python server (e.g. matplotlib, seaborn, etc..)
- A simple HTML file generated with Template and some Javascript Library (Below example)
Each of the option has it’s pros and cons, a static file (like HTML) provides you an other options, if other commercial tools or hosting a python server is not available.
Google Chart Library
Google chart tools are powerful, simple to use, and free. And usually these kinds of Javascript chart library provides a richer (and customizable) analytic gallery for you to use, which sometimes it is quite difficult to do it in excel, like Sankey Diagram, Interval Plots and Tree Map.
Here I will use a simple bar chart as an example. And you will see how we can create a simple HTML file with bar chart (or other graphs) easily if you know how to use Jinja Template.
This is the outline of the steps to create a standalone HTML file with Google Chart
- Copy the minimal example of Stacked bar chart (Google Charts)
- Replace the
data
part with Jinja Template variable - Try it with dummy data
- Replace it with real data. Here we will use the Daily Passenger Traffic for different Control Points from HK Government Open Data (data.gov.hk)
- Render the Template, and export as HTML file
a) Stacked Bar Chart
Let’s copy the minimal example of stack bar chart. I rename some variables name from the original example here (e.g. materialChart to chart, etc..)
The code is pretty easy to read. We import some chart library with <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
and initialize the google chart object and callback function with with google.charts.setOnLoadCallback(drawChart);
. It’s quite straight forward even if you do not have much experience with Javascript.
<span><html></span><span><head></span><span><script </span><span>type=</span><span>"text/javascript"</span> <span>src=</span><span>"https://www.gstatic.com/charts/loader.js"</span><span>></script></span><span><script </span><span>type=</span><span>"text/javascript"</span><span>></span><span>google</span><span>.</span><span>charts</span><span>.</span><span>load</span><span>(</span><span>'</span><span>current</span><span>'</span><span>,</span> <span>{</span><span>packages</span><span>:</span> <span>[</span><span>'</span><span>corechart</span><span>'</span><span>,</span> <span>'</span><span>bar</span><span>'</span><span>]});</span><span>google</span><span>.</span><span>charts</span><span>.</span><span>setOnLoadCallback</span><span>(</span><span>drawChart</span><span>);</span><span>function</span> <span>drawChart</span><span>()</span> <span>{</span><span>var</span> <span>data</span> <span>=</span> <span>google</span><span>.</span><span>visualization</span><span>.</span><span>arrayToDataTable</span><span>([</span><span>[</span><span>'</span><span>City</span><span>'</span><span>,</span> <span>'</span><span>2010 Population</span><span>'</span><span>,</span> <span>'</span><span>2000 Population</span><span>'</span><span>],</span><span>[</span><span>'</span><span>New York City, NY</span><span>'</span><span>,</span> <span>8175000</span><span>,</span> <span>8008000</span><span>],</span><span>[</span><span>'</span><span>Los Angeles, CA</span><span>'</span><span>,</span> <span>3792000</span><span>,</span> <span>3694000</span><span>],</span><span>[</span><span>'</span><span>Chicago, IL</span><span>'</span><span>,</span> <span>2695000</span><span>,</span> <span>2896000</span><span>],</span><span>[</span><span>'</span><span>Houston, TX</span><span>'</span><span>,</span> <span>2099000</span><span>,</span> <span>1953000</span><span>],</span><span>[</span><span>'</span><span>Philadelphia, PA</span><span>'</span><span>,</span> <span>1526000</span><span>,</span> <span>1517000</span><span>]</span><span>]);</span><span>var</span> <span>options</span> <span>=</span> <span>{</span><span>chart</span><span>:</span> <span>{</span><span>title</span><span>:</span> <span>'</span><span>Population of Largest U.S. Cities</span><span>'</span><span>},</span><span>hAxis</span><span>:</span> <span>{</span><span>title</span><span>:</span> <span>'</span><span>Total Population</span><span>'</span><span>,</span><span>minValue</span><span>:</span> <span>0</span><span>,</span><span>},</span><span>vAxis</span><span>:</span> <span>{</span><span>title</span><span>:</span> <span>'</span><span>City</span><span>'</span><span>},</span><span>bars</span><span>:</span> <span>'</span><span>horizontal</span><span>'</span><span>};</span><span>var</span> <span>chart</span> <span>=</span> <span>new</span> <span>google</span><span>.</span><span>charts</span><span>.</span><span>Bar</span><span>(</span><span>document</span><span>.</span><span>getElementById</span><span>(</span><span>'</span><span>chart_div</span><span>'</span><span>));</span><span>chart</span><span>.</span><span>draw</span><span>(</span><span>data</span><span>,</span> <span>options</span><span>);</span><span>}</span><span></script></span><span></head></span><span><body></span><span><div</span> <span>id=</span><span>"chart_div"</span><span>></div></span><span></body></span><span></html></span><span><html></span> <span><head></span> <span><script </span><span>type=</span><span>"text/javascript"</span> <span>src=</span><span>"https://www.gstatic.com/charts/loader.js"</span><span>></script></span> <span><script </span><span>type=</span><span>"text/javascript"</span><span>></span> <span>google</span><span>.</span><span>charts</span><span>.</span><span>load</span><span>(</span><span>'</span><span>current</span><span>'</span><span>,</span> <span>{</span><span>packages</span><span>:</span> <span>[</span><span>'</span><span>corechart</span><span>'</span><span>,</span> <span>'</span><span>bar</span><span>'</span><span>]});</span> <span>google</span><span>.</span><span>charts</span><span>.</span><span>setOnLoadCallback</span><span>(</span><span>drawChart</span><span>);</span> <span>function</span> <span>drawChart</span><span>()</span> <span>{</span> <span>var</span> <span>data</span> <span>=</span> <span>google</span><span>.</span><span>visualization</span><span>.</span><span>arrayToDataTable</span><span>([</span> <span>[</span><span>'</span><span>City</span><span>'</span><span>,</span> <span>'</span><span>2010 Population</span><span>'</span><span>,</span> <span>'</span><span>2000 Population</span><span>'</span><span>],</span> <span>[</span><span>'</span><span>New York City, NY</span><span>'</span><span>,</span> <span>8175000</span><span>,</span> <span>8008000</span><span>],</span> <span>[</span><span>'</span><span>Los Angeles, CA</span><span>'</span><span>,</span> <span>3792000</span><span>,</span> <span>3694000</span><span>],</span> <span>[</span><span>'</span><span>Chicago, IL</span><span>'</span><span>,</span> <span>2695000</span><span>,</span> <span>2896000</span><span>],</span> <span>[</span><span>'</span><span>Houston, TX</span><span>'</span><span>,</span> <span>2099000</span><span>,</span> <span>1953000</span><span>],</span> <span>[</span><span>'</span><span>Philadelphia, PA</span><span>'</span><span>,</span> <span>1526000</span><span>,</span> <span>1517000</span><span>]</span> <span>]);</span> <span>var</span> <span>options</span> <span>=</span> <span>{</span> <span>chart</span><span>:</span> <span>{</span> <span>title</span><span>:</span> <span>'</span><span>Population of Largest U.S. Cities</span><span>'</span> <span>},</span> <span>hAxis</span><span>:</span> <span>{</span> <span>title</span><span>:</span> <span>'</span><span>Total Population</span><span>'</span><span>,</span> <span>minValue</span><span>:</span> <span>0</span><span>,</span> <span>},</span> <span>vAxis</span><span>:</span> <span>{</span> <span>title</span><span>:</span> <span>'</span><span>City</span><span>'</span> <span>},</span> <span>bars</span><span>:</span> <span>'</span><span>horizontal</span><span>'</span> <span>};</span> <span>var</span> <span>chart</span> <span>=</span> <span>new</span> <span>google</span><span>.</span><span>charts</span><span>.</span><span>Bar</span><span>(</span><span>document</span><span>.</span><span>getElementById</span><span>(</span><span>'</span><span>chart_div</span><span>'</span><span>));</span> <span>chart</span><span>.</span><span>draw</span><span>(</span><span>data</span><span>,</span> <span>options</span><span>);</span> <span>}</span> <span></script></span> <span></head></span> <span><body></span> <span><div</span> <span>id=</span><span>"chart_div"</span><span>></div></span> <span></body></span> <span></html></span><html> <head> <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script> <script type="text/javascript"> google.charts.load('current', {packages: ['corechart', 'bar']}); google.charts.setOnLoadCallback(drawChart); function drawChart() { var data = google.visualization.arrayToDataTable([ ['City', '2010 Population', '2000 Population'], ['New York City, NY', 8175000, 8008000], ['Los Angeles, CA', 3792000, 3694000], ['Chicago, IL', 2695000, 2896000], ['Houston, TX', 2099000, 1953000], ['Philadelphia, PA', 1526000, 1517000] ]); var options = { chart: { title: 'Population of Largest U.S. Cities' }, hAxis: { title: 'Total Population', minValue: 0, }, vAxis: { title: 'City' }, bars: 'horizontal' }; var chart = new google.charts.Bar(document.getElementById('chart_div')); chart.draw(data, options); } </script> </head> <body> <div id="chart_div"></div> </body> </html>
Enter fullscreen mode Exit fullscreen mode
b) Replace with Jinja Template variables
Then we can replace some of the values with Template variable like {{ data }}
, {{ title }}
, {{ h_axis }}
, etc..
<span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span><span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> <html> <head> <script type=</span><span>"</span><span>text/javascript</span><span>"</span><span> src=</span><span>"</span><span>https://www.gstatic.com/charts/loader.js</span><span>"</span><span>></script> <script type=</span><span>"</span><span>text/javascript</span><span>"</span><span>> google.charts.load(</span><span>'</span><span>current</span><span>'</span><span>, {packages: [</span><span>'</span><span>corechart</span><span>'</span><span>, </span><span>'</span><span>bar</span><span>'</span><span>]}); google.charts.setOnLoadCallback(drawChart); function drawChart() { var data = google.visualization.arrayToDataTable( {{ data }} ); var options = { chart: { title: </span><span>'</span><span>{{ title }}</span><span>'</span><span> }, hAxis: { title: </span><span>'</span><span>{{ h_axis }}</span><span>'</span><span>, minValue: 0, }, vAxis: { title: </span><span>'</span><span>{{ v_axis }}</span><span>'</span><span> }, bars: </span><span>'</span><span>horizontal</span><span>'</span><span> }; var chart = new google.charts.Bar(document.getElementById(</span><span>'</span><span>chart_div</span><span>'</span><span>)); chart.draw(data, options); } </script> </head> <body> <div id=</span><span>"</span><span>chart_div</span><span>"</span><span>></div> </body> </html> </span><span>"""</span><span>)</span><span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span> <span>t</span> <span>=</span> <span>Template</span><span>(</span> <span>"""</span><span> <html> <head> <script type=</span><span>"</span><span>text/javascript</span><span>"</span><span> src=</span><span>"</span><span>https://www.gstatic.com/charts/loader.js</span><span>"</span><span>></script> <script type=</span><span>"</span><span>text/javascript</span><span>"</span><span>> google.charts.load(</span><span>'</span><span>current</span><span>'</span><span>, {packages: [</span><span>'</span><span>corechart</span><span>'</span><span>, </span><span>'</span><span>bar</span><span>'</span><span>]}); google.charts.setOnLoadCallback(drawChart); function drawChart() { var data = google.visualization.arrayToDataTable( {{ data }} ); var options = { chart: { title: </span><span>'</span><span>{{ title }}</span><span>'</span><span> }, hAxis: { title: </span><span>'</span><span>{{ h_axis }}</span><span>'</span><span>, minValue: 0, }, vAxis: { title: </span><span>'</span><span>{{ v_axis }}</span><span>'</span><span> }, bars: </span><span>'</span><span>horizontal</span><span>'</span><span> }; var chart = new google.charts.Bar(document.getElementById(</span><span>'</span><span>chart_div</span><span>'</span><span>)); chart.draw(data, options); } </script> </head> <body> <div id=</span><span>"</span><span>chart_div</span><span>"</span><span>></div> </body> </html> </span><span>"""</span> <span>)</span>from jinja2 import Template t = Template( """ <html> <head> <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script> <script type="text/javascript"> google.charts.load('current', {packages: ['corechart', 'bar']}); google.charts.setOnLoadCallback(drawChart); function drawChart() { var data = google.visualization.arrayToDataTable( {{ data }} ); var options = { chart: { title: '{{ title }}' }, hAxis: { title: '{{ h_axis }}', minValue: 0, }, vAxis: { title: '{{ v_axis }}' }, bars: 'horizontal' }; var chart = new google.charts.Bar(document.getElementById('chart_div')); chart.draw(data, options); } </script> </head> <body> <div id="chart_div"></div> </body> </html> """ )
Enter fullscreen mode Exit fullscreen mode
c) Test with Dummy Data
If we go back and study the embeded data in the json, we will see the data we need is a list of list, with 3 columns (One categorical and two numerics). So we we create the dataframe accordingly, with ["Control Point", "Arrival", "Departure"]
as our column for our traffic data.
<span># The embeded data in the original json. A list of list with each record as an element. </span><span>[</span><span>[</span><span>'</span><span>City</span><span>'</span><span>,</span> <span>'</span><span>2010 Population</span><span>'</span><span>,</span> <span>'</span><span>2000 Population</span><span>'</span><span>],</span><span>[</span><span>'</span><span>New York City, NY</span><span>'</span><span>,</span> <span>8175000</span><span>,</span> <span>8008000</span><span>],</span><span>[</span><span>'</span><span>Los Angeles, CA</span><span>'</span><span>,</span> <span>3792000</span><span>,</span> <span>3694000</span><span>],</span><span>[</span><span>'</span><span>Chicago, IL</span><span>'</span><span>,</span> <span>2695000</span><span>,</span> <span>2896000</span><span>],</span><span>[</span><span>'</span><span>Houston, TX</span><span>'</span><span>,</span> <span>2099000</span><span>,</span> <span>1953000</span><span>],</span><span>[</span><span>'</span><span>Philadelphia, PA</span><span>'</span><span>,</span> <span>1526000</span><span>,</span> <span>1517000</span><span>]</span><span>]</span><span># The embeded data in the original json. A list of list with each record as an element. </span><span>[</span> <span>[</span><span>'</span><span>City</span><span>'</span><span>,</span> <span>'</span><span>2010 Population</span><span>'</span><span>,</span> <span>'</span><span>2000 Population</span><span>'</span><span>],</span> <span>[</span><span>'</span><span>New York City, NY</span><span>'</span><span>,</span> <span>8175000</span><span>,</span> <span>8008000</span><span>],</span> <span>[</span><span>'</span><span>Los Angeles, CA</span><span>'</span><span>,</span> <span>3792000</span><span>,</span> <span>3694000</span><span>],</span> <span>[</span><span>'</span><span>Chicago, IL</span><span>'</span><span>,</span> <span>2695000</span><span>,</span> <span>2896000</span><span>],</span> <span>[</span><span>'</span><span>Houston, TX</span><span>'</span><span>,</span> <span>2099000</span><span>,</span> <span>1953000</span><span>],</span> <span>[</span><span>'</span><span>Philadelphia, PA</span><span>'</span><span>,</span> <span>1526000</span><span>,</span> <span>1517000</span><span>]</span> <span>]</span># The embeded data in the original json. A list of list with each record as an element. [ ['City', '2010 Population', '2000 Population'], ['New York City, NY', 8175000, 8008000], ['Los Angeles, CA', 3792000, 3694000], ['Chicago, IL', 2695000, 2896000], ['Houston, TX', 2099000, 1953000], ['Philadelphia, PA', 1526000, 1517000] ]
Enter fullscreen mode Exit fullscreen mode
Let’s create a dummy dataframe, and convert it to list of list object.
<span>import</span> <span>pandas</span> <span>as</span> <span>pd</span><span>df</span> <span>=</span> <span>pd</span><span>.</span><span>DataFrame</span><span>({</span><span>"</span><span>Control Point</span><span>"</span><span>:</span> <span>[</span><span>"</span><span>Airport</span><span>"</span><span>,</span> <span>"</span><span>Lok Ma Chau</span><span>"</span><span>,</span> <span>"</span><span>Lo Wu</span><span>"</span><span>],</span><span>"</span><span>Arrival</span><span>"</span><span>:</span> <span>[</span><span>154</span><span>,</span> <span>120</span><span>,</span> <span>40</span><span>],</span><span>"</span><span>Departure</span><span>"</span><span>:</span> <span>[</span><span>21</span><span>,</span> <span>40</span><span>,</span> <span>32</span><span>],</span><span>})</span><span>print</span><span>(</span><span>df</span><span>.</span><span>head</span><span>())</span><span># Convert dataframe to List of list, and insert to column header as the first element. </span><span>data</span> <span>=</span> <span>df</span><span>.</span><span>values</span><span>.</span><span>tolist</span><span>()</span><span>data</span><span>.</span><span>insert</span><span>(</span><span>0</span><span>,</span> <span>df</span><span>.</span><span>columns</span><span>.</span><span>tolist</span><span>())</span><span>print</span><span>(</span><span>data</span><span>)</span><span># df </span> <span>Control</span> <span>Point</span> <span>Arrival</span> <span>Departure</span><span>0</span> <span>Airport</span> <span>154</span> <span>21</span><span>1</span> <span>Lok</span> <span>Ma</span> <span>Chau</span> <span>120</span> <span>40</span><span>2</span> <span>Lo</span> <span>Wu</span> <span>40</span> <span>32</span><span># data </span><span>[</span><span>[</span><span>'</span><span>Control Point</span><span>'</span><span>,</span> <span>'</span><span>Arrival</span><span>'</span><span>,</span> <span>'</span><span>Departure</span><span>'</span><span>],</span><span>[</span><span>'</span><span>Airport</span><span>'</span><span>,</span> <span>154</span><span>,</span> <span>21</span><span>],</span><span>[</span><span>'</span><span>Lok Ma Chau</span><span>'</span><span>,</span> <span>120</span><span>,</span> <span>40</span><span>],</span><span>[</span><span>'</span><span>Lo Wu</span><span>'</span><span>,</span> <span>40</span><span>,</span> <span>32</span><span>]</span><span>]</span><span>import</span> <span>pandas</span> <span>as</span> <span>pd</span> <span>df</span> <span>=</span> <span>pd</span><span>.</span><span>DataFrame</span><span>({</span> <span>"</span><span>Control Point</span><span>"</span><span>:</span> <span>[</span><span>"</span><span>Airport</span><span>"</span><span>,</span> <span>"</span><span>Lok Ma Chau</span><span>"</span><span>,</span> <span>"</span><span>Lo Wu</span><span>"</span><span>],</span> <span>"</span><span>Arrival</span><span>"</span><span>:</span> <span>[</span><span>154</span><span>,</span> <span>120</span><span>,</span> <span>40</span><span>],</span> <span>"</span><span>Departure</span><span>"</span><span>:</span> <span>[</span><span>21</span><span>,</span> <span>40</span><span>,</span> <span>32</span><span>],</span> <span>})</span> <span>print</span><span>(</span><span>df</span><span>.</span><span>head</span><span>())</span> <span># Convert dataframe to List of list, and insert to column header as the first element. </span><span>data</span> <span>=</span> <span>df</span><span>.</span><span>values</span><span>.</span><span>tolist</span><span>()</span> <span>data</span><span>.</span><span>insert</span><span>(</span><span>0</span><span>,</span> <span>df</span><span>.</span><span>columns</span><span>.</span><span>tolist</span><span>())</span> <span>print</span><span>(</span><span>data</span><span>)</span> <span># df </span> <span>Control</span> <span>Point</span> <span>Arrival</span> <span>Departure</span> <span>0</span> <span>Airport</span> <span>154</span> <span>21</span> <span>1</span> <span>Lok</span> <span>Ma</span> <span>Chau</span> <span>120</span> <span>40</span> <span>2</span> <span>Lo</span> <span>Wu</span> <span>40</span> <span>32</span> <span># data </span><span>[</span> <span>[</span><span>'</span><span>Control Point</span><span>'</span><span>,</span> <span>'</span><span>Arrival</span><span>'</span><span>,</span> <span>'</span><span>Departure</span><span>'</span><span>],</span> <span>[</span><span>'</span><span>Airport</span><span>'</span><span>,</span> <span>154</span><span>,</span> <span>21</span><span>],</span> <span>[</span><span>'</span><span>Lok Ma Chau</span><span>'</span><span>,</span> <span>120</span><span>,</span> <span>40</span><span>],</span> <span>[</span><span>'</span><span>Lo Wu</span><span>'</span><span>,</span> <span>40</span><span>,</span> <span>32</span><span>]</span> <span>]</span>import pandas as pd df = pd.DataFrame({ "Control Point": ["Airport", "Lok Ma Chau", "Lo Wu"], "Arrival": [154, 120, 40], "Departure": [21, 40, 32], }) print(df.head()) # Convert dataframe to List of list, and insert to column header as the first element. data = df.values.tolist() data.insert(0, df.columns.tolist()) print(data) # df Control Point Arrival Departure 0 Airport 154 21 1 Lok Ma Chau 120 40 2 Lo Wu 40 32 # data [ ['Control Point', 'Arrival', 'Departure'], ['Airport', 154, 21], ['Lok Ma Chau', 120, 40], ['Lo Wu', 40, 32] ]
Enter fullscreen mode Exit fullscreen mode
d) Replace with real data
Here we will get the 2022 passenger traffic data from HK Government Open Data (data.gov.hk). And aggregate the data from daily records to yearly records. For the df operation, you can check out my previous post on Common Pandas Functions.
<span>import</span> <span>pandas</span> <span>as</span> <span>pd</span><span># Get the 2022 arrival/departure data </span><span>url</span> <span>=</span> <span>"</span><span>https://www.immd.gov.hk/opendata/eng/transport/immigration_clearance/statistics_on_daily_passenger_traffic.csv</span><span>"</span><span>df</span> <span>=</span> <span>pd</span><span>.</span><span>read_csv</span><span>(</span><span>url</span><span>)</span><span>df</span> <span>=</span> <span>df</span><span>[[</span><span>"</span><span>Date</span><span>"</span><span>,</span> <span>"</span><span>Control Point</span><span>"</span><span>,</span> <span>"</span><span>Arrival / Departure</span><span>"</span><span>,</span> <span>"</span><span>Total</span><span>"</span><span>]]</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span>=</span> <span>pd</span><span>.</span><span>to_datetime</span><span>(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>],</span> <span>format</span><span>=</span><span>"</span><span>%d-%m-%Y</span><span>"</span><span>)</span><span>df_subset</span> <span>=</span> <span>df</span><span>[(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span>>=</span> <span>'</span><span>2022-01-01</span><span>'</span><span>)</span><span>&</span> <span>(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span><=</span> <span>'</span><span>2022-12-31</span><span>'</span><span>)].</span><span>reset_index</span><span>(</span><span>drop</span><span>=</span><span>True</span><span>)</span> <span># 2022 data only, for simplicity </span><span>print</span><span>(</span><span>df_subset</span><span>)</span><span># Group by Control Point. We can safely ignore date, as we are only using 2022 data </span><span>df_agg</span> <span>=</span> <span>df_subset</span><span>.</span><span>groupby</span><span>([</span><span>"</span><span>Control Point</span><span>"</span><span>,</span> <span>"</span><span>Arrival / Departure</span><span>"</span><span>]).</span><span>agg</span><span>(</span><span>Total</span><span>=</span><span>(</span><span>"</span><span>Total</span><span>"</span><span>,</span> <span>"</span><span>sum</span><span>"</span><span>),).</span><span>reset_index</span><span>()</span><span>df_agg</span> <span>=</span> <span>df_agg</span><span>[</span><span>df_agg</span><span>.</span><span>Total</span> <span>></span> <span>0</span><span>].</span><span>reset_index</span><span>(</span><span>drop</span><span>=</span><span>True</span><span>)</span><span>print</span><span>(</span><span>df_agg</span><span>)</span><span># Long to wide. pivot_table is used here, just for the fill_value argument </span><span>df_wide</span> <span>=</span> <span>df_agg</span><span>.</span><span>pivot_table</span><span>(</span><span>index</span><span>=</span><span>[</span><span>"</span><span>Control Point</span><span>"</span><span>],</span><span>columns</span><span>=</span><span>"</span><span>Arrival / Departure</span><span>"</span><span>,</span><span>values</span><span>=</span><span>[</span><span>"</span><span>Total</span><span>"</span><span>],</span><span>aggfunc</span><span>=</span><span>"</span><span>sum</span><span>"</span><span>,</span><span>fill_value</span><span>=</span><span>0</span><span>,</span><span>)</span><span>df_wide_flattened</span> <span>=</span> <span>df_wide</span><span>.</span><span>copy</span><span>()</span><span>df_wide_flattened</span><span>.</span><span>columns</span> <span>=</span> <span>[</span><span>"</span><span>_</span><span>"</span><span>.</span><span>join</span><span>(</span><span>x</span><span>)</span> <span>for</span> <span>x</span> <span>in</span> <span>df_wide_flattened</span><span>.</span><span>columns</span><span>.</span><span>to_flat_index</span><span>()]</span><span>df_wide_flattened</span><span>.</span><span>reset_index</span><span>(</span><span>inplace</span><span>=</span><span>True</span><span>)</span><span>print</span><span>(</span><span>df_wide_flattened</span><span>)</span><span>data</span> <span>=</span> <span>df_wide_flattened</span><span>.</span><span>values</span><span>.</span><span>tolist</span><span>()</span><span>data</span><span>.</span><span>insert</span><span>(</span><span>0</span><span>,</span> <span>df_wide_flattened</span><span>.</span><span>columns</span><span>.</span><span>tolist</span><span>())</span> <span># Insert back the column header </span><span>print</span><span>(</span><span>data</span><span>)</span><span># df_subset (2022 Data) </span> <span>Date</span> <span>Control</span> <span>Point</span> <span>Arrival</span> <span>/</span> <span>Departure</span> <span>Total</span><span>0</span> <span>2022</span><span>-</span><span>01</span><span>-</span><span>01</span> <span>Airport</span> <span>Arrival</span> <span>628</span><span>1</span> <span>2022</span><span>-</span><span>01</span><span>-</span><span>01</span> <span>Airport</span> <span>Departure</span> <span>778</span><span>2</span> <span>2022</span><span>-</span><span>01</span><span>-</span><span>01</span> <span>Express</span> <span>Rail</span> <span>Link</span> <span>West</span> <span>Kowloon</span> <span>Arrival</span> <span>0</span><span>3</span> <span>2022</span><span>-</span><span>01</span><span>-</span><span>01</span> <span>Express</span> <span>Rail</span> <span>Link</span> <span>West</span> <span>Kowloon</span> <span>Departure</span> <span>0</span><span>...</span><span># df_agg (Aggregate to Year level) </span> <span>Control</span> <span>Point</span> <span>Arrival</span> <span>/</span> <span>Departure</span> <span>Total</span><span>0</span> <span>Airport</span> <span>Arrival</span> <span>1957891</span><span>1</span> <span>Airport</span> <span>Departure</span> <span>2175626</span><span>8</span> <span>Heung</span> <span>Yuen</span> <span>Wai</span> <span>Arrival</span> <span>301</span><span>10</span> <span>Hong</span> <span>Kong</span><span>-</span><span>Zhuhai</span><span>-</span><span>Macao</span> <span>Bridge</span> <span>Arrival</span> <span>77075</span><span>...</span><span># df_wide_flattened (Long to wide table) </span> <span>Control</span> <span>Point</span> <span>Total_Arrival</span> <span>Total_Departure</span><span>0</span> <span>Airport</span> <span>1957891</span> <span>2175626</span><span>1</span> <span>Heung</span> <span>Yuen</span> <span>Wai</span> <span>301</span> <span>0</span><span>2</span> <span>Hong</span> <span>Kong</span><span>-</span><span>Zhuhai</span><span>-</span><span>Macao</span> <span>Bridge</span> <span>77075</span> <span>114504</span><span>3</span> <span>Kai</span> <span>Tak</span> <span>Cruise</span> <span>Terminal</span> <span>8233</span> <span>3610</span><span>4</span> <span>Shenzhen</span> <span>Bay</span> <span>485248</span> <span>439669</span><span># data (Final List of List object for input of the graph) </span><span>[[</span><span>'</span><span>Control Point</span><span>'</span><span>,</span> <span>'</span><span>Total_Arrival</span><span>'</span><span>,</span> <span>'</span><span>Total_Departure</span><span>'</span><span>],</span> <span>[</span><span>'</span><span>Airport</span><span>'</span><span>,</span> <span>1957891</span><span>,</span> <span>2175626</span><span>],</span> <span>[</span><span>'</span><span>China Ferry Terminal</span><span>'</span><span>,</span> <span>0</span><span>,</span> <span>0</span><span>],</span> <span>..]</span><span>import</span> <span>pandas</span> <span>as</span> <span>pd</span> <span># Get the 2022 arrival/departure data </span><span>url</span> <span>=</span> <span>"</span><span>https://www.immd.gov.hk/opendata/eng/transport/immigration_clearance/statistics_on_daily_passenger_traffic.csv</span><span>"</span> <span>df</span> <span>=</span> <span>pd</span><span>.</span><span>read_csv</span><span>(</span><span>url</span><span>)</span> <span>df</span> <span>=</span> <span>df</span><span>[[</span><span>"</span><span>Date</span><span>"</span><span>,</span> <span>"</span><span>Control Point</span><span>"</span><span>,</span> <span>"</span><span>Arrival / Departure</span><span>"</span><span>,</span> <span>"</span><span>Total</span><span>"</span><span>]]</span> <span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span>=</span> <span>pd</span><span>.</span><span>to_datetime</span><span>(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>],</span> <span>format</span><span>=</span><span>"</span><span>%d-%m-%Y</span><span>"</span><span>)</span> <span>df_subset</span> <span>=</span> <span>df</span><span>[(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span>>=</span> <span>'</span><span>2022-01-01</span><span>'</span><span>)</span> <span>&</span> <span>(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span><=</span> <span>'</span><span>2022-12-31</span><span>'</span><span>)].</span><span>reset_index</span><span>(</span><span>drop</span><span>=</span><span>True</span><span>)</span> <span># 2022 data only, for simplicity </span><span>print</span><span>(</span><span>df_subset</span><span>)</span> <span># Group by Control Point. We can safely ignore date, as we are only using 2022 data </span><span>df_agg</span> <span>=</span> <span>df_subset</span><span>.</span><span>groupby</span><span>([</span><span>"</span><span>Control Point</span><span>"</span><span>,</span> <span>"</span><span>Arrival / Departure</span><span>"</span><span>]).</span><span>agg</span><span>(</span><span>Total</span><span>=</span><span>(</span><span>"</span><span>Total</span><span>"</span><span>,</span> <span>"</span><span>sum</span><span>"</span><span>),).</span><span>reset_index</span><span>()</span> <span>df_agg</span> <span>=</span> <span>df_agg</span><span>[</span><span>df_agg</span><span>.</span><span>Total</span> <span>></span> <span>0</span><span>].</span><span>reset_index</span><span>(</span><span>drop</span><span>=</span><span>True</span><span>)</span> <span>print</span><span>(</span><span>df_agg</span><span>)</span> <span># Long to wide. pivot_table is used here, just for the fill_value argument </span><span>df_wide</span> <span>=</span> <span>df_agg</span><span>.</span><span>pivot_table</span><span>(</span> <span>index</span><span>=</span><span>[</span><span>"</span><span>Control Point</span><span>"</span><span>],</span> <span>columns</span><span>=</span><span>"</span><span>Arrival / Departure</span><span>"</span><span>,</span> <span>values</span><span>=</span><span>[</span><span>"</span><span>Total</span><span>"</span><span>],</span> <span>aggfunc</span><span>=</span><span>"</span><span>sum</span><span>"</span><span>,</span> <span>fill_value</span><span>=</span><span>0</span><span>,</span> <span>)</span> <span>df_wide_flattened</span> <span>=</span> <span>df_wide</span><span>.</span><span>copy</span><span>()</span> <span>df_wide_flattened</span><span>.</span><span>columns</span> <span>=</span> <span>[</span><span>"</span><span>_</span><span>"</span><span>.</span><span>join</span><span>(</span><span>x</span><span>)</span> <span>for</span> <span>x</span> <span>in</span> <span>df_wide_flattened</span><span>.</span><span>columns</span><span>.</span><span>to_flat_index</span><span>()]</span> <span>df_wide_flattened</span><span>.</span><span>reset_index</span><span>(</span><span>inplace</span><span>=</span><span>True</span><span>)</span> <span>print</span><span>(</span><span>df_wide_flattened</span><span>)</span> <span>data</span> <span>=</span> <span>df_wide_flattened</span><span>.</span><span>values</span><span>.</span><span>tolist</span><span>()</span> <span>data</span><span>.</span><span>insert</span><span>(</span><span>0</span><span>,</span> <span>df_wide_flattened</span><span>.</span><span>columns</span><span>.</span><span>tolist</span><span>())</span> <span># Insert back the column header </span><span>print</span><span>(</span><span>data</span><span>)</span> <span># df_subset (2022 Data) </span> <span>Date</span> <span>Control</span> <span>Point</span> <span>Arrival</span> <span>/</span> <span>Departure</span> <span>Total</span> <span>0</span> <span>2022</span><span>-</span><span>01</span><span>-</span><span>01</span> <span>Airport</span> <span>Arrival</span> <span>628</span> <span>1</span> <span>2022</span><span>-</span><span>01</span><span>-</span><span>01</span> <span>Airport</span> <span>Departure</span> <span>778</span> <span>2</span> <span>2022</span><span>-</span><span>01</span><span>-</span><span>01</span> <span>Express</span> <span>Rail</span> <span>Link</span> <span>West</span> <span>Kowloon</span> <span>Arrival</span> <span>0</span> <span>3</span> <span>2022</span><span>-</span><span>01</span><span>-</span><span>01</span> <span>Express</span> <span>Rail</span> <span>Link</span> <span>West</span> <span>Kowloon</span> <span>Departure</span> <span>0</span> <span>...</span> <span># df_agg (Aggregate to Year level) </span> <span>Control</span> <span>Point</span> <span>Arrival</span> <span>/</span> <span>Departure</span> <span>Total</span> <span>0</span> <span>Airport</span> <span>Arrival</span> <span>1957891</span> <span>1</span> <span>Airport</span> <span>Departure</span> <span>2175626</span> <span>8</span> <span>Heung</span> <span>Yuen</span> <span>Wai</span> <span>Arrival</span> <span>301</span> <span>10</span> <span>Hong</span> <span>Kong</span><span>-</span><span>Zhuhai</span><span>-</span><span>Macao</span> <span>Bridge</span> <span>Arrival</span> <span>77075</span> <span>...</span> <span># df_wide_flattened (Long to wide table) </span> <span>Control</span> <span>Point</span> <span>Total_Arrival</span> <span>Total_Departure</span> <span>0</span> <span>Airport</span> <span>1957891</span> <span>2175626</span> <span>1</span> <span>Heung</span> <span>Yuen</span> <span>Wai</span> <span>301</span> <span>0</span> <span>2</span> <span>Hong</span> <span>Kong</span><span>-</span><span>Zhuhai</span><span>-</span><span>Macao</span> <span>Bridge</span> <span>77075</span> <span>114504</span> <span>3</span> <span>Kai</span> <span>Tak</span> <span>Cruise</span> <span>Terminal</span> <span>8233</span> <span>3610</span> <span>4</span> <span>Shenzhen</span> <span>Bay</span> <span>485248</span> <span>439669</span> <span># data (Final List of List object for input of the graph) </span><span>[[</span><span>'</span><span>Control Point</span><span>'</span><span>,</span> <span>'</span><span>Total_Arrival</span><span>'</span><span>,</span> <span>'</span><span>Total_Departure</span><span>'</span><span>],</span> <span>[</span><span>'</span><span>Airport</span><span>'</span><span>,</span> <span>1957891</span><span>,</span> <span>2175626</span><span>],</span> <span>[</span><span>'</span><span>China Ferry Terminal</span><span>'</span><span>,</span> <span>0</span><span>,</span> <span>0</span><span>],</span> <span>..]</span>import pandas as pd # Get the 2022 arrival/departure data url = "https://www.immd.gov.hk/opendata/eng/transport/immigration_clearance/statistics_on_daily_passenger_traffic.csv" df = pd.read_csv(url) df = df[["Date", "Control Point", "Arrival / Departure", "Total"]] df['Date'] = pd.to_datetime(df['Date'], format="%d-%m-%Y") df_subset = df[(df['Date'] >= '2022-01-01') & (df['Date'] <= '2022-12-31')].reset_index(drop=True) # 2022 data only, for simplicity print(df_subset) # Group by Control Point. We can safely ignore date, as we are only using 2022 data df_agg = df_subset.groupby(["Control Point", "Arrival / Departure"]).agg(Total=("Total", "sum"),).reset_index() df_agg = df_agg[df_agg.Total > 0].reset_index(drop=True) print(df_agg) # Long to wide. pivot_table is used here, just for the fill_value argument df_wide = df_agg.pivot_table( index=["Control Point"], columns="Arrival / Departure", values=["Total"], aggfunc="sum", fill_value=0, ) df_wide_flattened = df_wide.copy() df_wide_flattened.columns = ["_".join(x) for x in df_wide_flattened.columns.to_flat_index()] df_wide_flattened.reset_index(inplace=True) print(df_wide_flattened) data = df_wide_flattened.values.tolist() data.insert(0, df_wide_flattened.columns.tolist()) # Insert back the column header print(data) # df_subset (2022 Data) Date Control Point Arrival / Departure Total 0 2022-01-01 Airport Arrival 628 1 2022-01-01 Airport Departure 778 2 2022-01-01 Express Rail Link West Kowloon Arrival 0 3 2022-01-01 Express Rail Link West Kowloon Departure 0 ... # df_agg (Aggregate to Year level) Control Point Arrival / Departure Total 0 Airport Arrival 1957891 1 Airport Departure 2175626 8 Heung Yuen Wai Arrival 301 10 Hong Kong-Zhuhai-Macao Bridge Arrival 77075 ... # df_wide_flattened (Long to wide table) Control Point Total_Arrival Total_Departure 0 Airport 1957891 2175626 1 Heung Yuen Wai 301 0 2 Hong Kong-Zhuhai-Macao Bridge 77075 114504 3 Kai Tak Cruise Terminal 8233 3610 4 Shenzhen Bay 485248 439669 # data (Final List of List object for input of the graph) [['Control Point', 'Total_Arrival', 'Total_Departure'], ['Airport', 1957891, 2175626], ['China Ferry Terminal', 0, 0], ..]
Enter fullscreen mode Exit fullscreen mode
e) Render with Jinja Template, and export to html file
Finally, we can render the variables with Jinja, and export to the HTML file.
<span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span><span>import</span> <span>pandas</span> <span>as</span> <span>pd</span><span># Get the 2022 arrival/departure data, hopefully the URL will not break </span><span>url</span> <span>=</span> <span>"</span><span>https://www.immd.gov.hk/opendata/eng/transport/immigration_clearance/statistics_on_daily_passenger_traffic.csv</span><span>"</span><span>df</span> <span>=</span> <span>pd</span><span>.</span><span>read_csv</span><span>(</span><span>url</span><span>)</span><span>df</span> <span>=</span> <span>df</span><span>[[</span><span>"</span><span>Date</span><span>"</span><span>,</span> <span>"</span><span>Control Point</span><span>"</span><span>,</span> <span>"</span><span>Arrival / Departure</span><span>"</span><span>,</span> <span>"</span><span>Total</span><span>"</span><span>]]</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span>=</span> <span>pd</span><span>.</span><span>to_datetime</span><span>(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>],</span> <span>format</span><span>=</span><span>"</span><span>%d-%m-%Y</span><span>"</span><span>)</span><span>df_subset</span> <span>=</span> <span>df</span><span>[(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span>>=</span> <span>'</span><span>2022-01-01</span><span>'</span><span>)</span><span>&</span> <span>(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span><=</span> <span>'</span><span>2022-12-31</span><span>'</span><span>)].</span><span>reset_index</span><span>(</span><span>drop</span><span>=</span><span>True</span><span>)</span> <span># 2022 data only, for simplicity </span><span># Group by Control Point. We can safely ignore date, as we are only using 2022 data </span><span>df_agg</span> <span>=</span> <span>df_subset</span><span>.</span><span>groupby</span><span>([</span><span>"</span><span>Control Point</span><span>"</span><span>,</span> <span>"</span><span>Arrival / Departure</span><span>"</span><span>]).</span><span>agg</span><span>(</span><span>Total</span><span>=</span><span>(</span><span>"</span><span>Total</span><span>"</span><span>,</span> <span>"</span><span>sum</span><span>"</span><span>),).</span><span>reset_index</span><span>()</span><span>df_agg</span> <span>=</span> <span>df_agg</span><span>[</span><span>df_agg</span><span>.</span><span>Total</span> <span>></span> <span>0</span><span>].</span><span>reset_index</span><span>(</span><span>drop</span><span>=</span><span>True</span><span>)</span><span># Long to wide </span><span>df_wide</span> <span>=</span> <span>df_agg</span><span>.</span><span>pivot_table</span><span>(</span><span>index</span><span>=</span><span>[</span><span>"</span><span>Control Point</span><span>"</span><span>],</span><span>columns</span><span>=</span><span>"</span><span>Arrival / Departure</span><span>"</span><span>,</span><span>values</span><span>=</span><span>[</span><span>"</span><span>Total</span><span>"</span><span>],</span><span>aggfunc</span><span>=</span><span>"</span><span>sum</span><span>"</span><span>,</span><span>fill_value</span><span>=</span><span>0</span><span>,</span><span>)</span><span>df_wide_flattened</span> <span>=</span> <span>df_wide</span><span>.</span><span>copy</span><span>()</span><span>df_wide_flattened</span><span>.</span><span>columns</span> <span>=</span> <span>[</span><span>"</span><span>_</span><span>"</span><span>.</span><span>join</span><span>(</span><span>x</span><span>)</span> <span>for</span> <span>x</span> <span>in</span> <span>df_wide_flattened</span><span>.</span><span>columns</span><span>.</span><span>to_flat_index</span><span>()]</span><span>df_wide_flattened</span><span>.</span><span>reset_index</span><span>(</span><span>inplace</span><span>=</span><span>True</span><span>)</span><span># List of List </span><span>data</span> <span>=</span> <span>df_wide_flattened</span><span>.</span><span>values</span><span>.</span><span>tolist</span><span>()</span><span>data</span><span>.</span><span>insert</span><span>(</span><span>0</span><span>,</span> <span>df_wide_flattened</span><span>.</span><span>columns</span><span>.</span><span>tolist</span><span>())</span> <span># Insert back the column header </span><span># Create template </span><span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> <html> <head> <script type=</span><span>"</span><span>text/javascript</span><span>"</span><span> src=</span><span>"</span><span>https://www.gstatic.com/charts/loader.js</span><span>"</span><span>></script> <script type=</span><span>"</span><span>text/javascript</span><span>"</span><span>> google.charts.load(</span><span>'</span><span>current</span><span>'</span><span>, {packages: [</span><span>'</span><span>corechart</span><span>'</span><span>, </span><span>'</span><span>bar</span><span>'</span><span>]}); google.charts.setOnLoadCallback(drawChart); function drawChart() { var data = google.visualization.arrayToDataTable( {{ data }} ); var options = { chart: { title: </span><span>'</span><span>{{ title }}</span><span>'</span><span> }, hAxis: { title: </span><span>'</span><span>{{ h_axis }}</span><span>'</span><span>, minValue: 0, }, vAxis: { title: </span><span>'</span><span>{{ v_axis }}</span><span>'</span><span> }, bars: </span><span>'</span><span>horizontal</span><span>'</span><span> }; var chart = new google.charts.Bar(document.getElementById(</span><span>'</span><span>chart_div</span><span>'</span><span>)); chart.draw(data, options); } </script> </head> <body> <div id=</span><span>"</span><span>chart_div</span><span>"</span><span>></div> </body> </html> </span><span>"""</span><span>)</span><span># Render template, and save to a html file </span><span>output</span> <span>=</span> <span>t</span><span>.</span><span>render</span><span>(</span><span>data</span><span>=</span><span>data</span><span>,</span><span>title</span><span>=</span><span>"</span><span>Inbound and Outbound Passenger at different Control Points (Year 2022)</span><span>"</span><span>,</span><span>h_axis</span><span>=</span><span>"</span><span>Passengers</span><span>"</span><span>,</span><span>v_axis</span><span>=</span><span>"</span><span>Control Point</span><span>"</span><span>,</span><span>)</span><span>with</span> <span>open</span><span>(</span><span>'</span><span>output.html</span><span>'</span><span>,</span> <span>'</span><span>w</span><span>'</span><span>)</span> <span>as</span> <span>f</span><span>:</span><span>f</span><span>.</span><span>write</span><span>(</span><span>output</span><span>)</span><span>from</span> <span>jinja2</span> <span>import</span> <span>Template</span> <span>import</span> <span>pandas</span> <span>as</span> <span>pd</span> <span># Get the 2022 arrival/departure data, hopefully the URL will not break </span><span>url</span> <span>=</span> <span>"</span><span>https://www.immd.gov.hk/opendata/eng/transport/immigration_clearance/statistics_on_daily_passenger_traffic.csv</span><span>"</span> <span>df</span> <span>=</span> <span>pd</span><span>.</span><span>read_csv</span><span>(</span><span>url</span><span>)</span> <span>df</span> <span>=</span> <span>df</span><span>[[</span><span>"</span><span>Date</span><span>"</span><span>,</span> <span>"</span><span>Control Point</span><span>"</span><span>,</span> <span>"</span><span>Arrival / Departure</span><span>"</span><span>,</span> <span>"</span><span>Total</span><span>"</span><span>]]</span> <span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span>=</span> <span>pd</span><span>.</span><span>to_datetime</span><span>(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>],</span> <span>format</span><span>=</span><span>"</span><span>%d-%m-%Y</span><span>"</span><span>)</span> <span>df_subset</span> <span>=</span> <span>df</span><span>[(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span>>=</span> <span>'</span><span>2022-01-01</span><span>'</span><span>)</span> <span>&</span> <span>(</span><span>df</span><span>[</span><span>'</span><span>Date</span><span>'</span><span>]</span> <span><=</span> <span>'</span><span>2022-12-31</span><span>'</span><span>)].</span><span>reset_index</span><span>(</span><span>drop</span><span>=</span><span>True</span><span>)</span> <span># 2022 data only, for simplicity </span> <span># Group by Control Point. We can safely ignore date, as we are only using 2022 data </span><span>df_agg</span> <span>=</span> <span>df_subset</span><span>.</span><span>groupby</span><span>([</span><span>"</span><span>Control Point</span><span>"</span><span>,</span> <span>"</span><span>Arrival / Departure</span><span>"</span><span>]).</span><span>agg</span><span>(</span><span>Total</span><span>=</span><span>(</span><span>"</span><span>Total</span><span>"</span><span>,</span> <span>"</span><span>sum</span><span>"</span><span>),).</span><span>reset_index</span><span>()</span> <span>df_agg</span> <span>=</span> <span>df_agg</span><span>[</span><span>df_agg</span><span>.</span><span>Total</span> <span>></span> <span>0</span><span>].</span><span>reset_index</span><span>(</span><span>drop</span><span>=</span><span>True</span><span>)</span> <span># Long to wide </span><span>df_wide</span> <span>=</span> <span>df_agg</span><span>.</span><span>pivot_table</span><span>(</span> <span>index</span><span>=</span><span>[</span><span>"</span><span>Control Point</span><span>"</span><span>],</span> <span>columns</span><span>=</span><span>"</span><span>Arrival / Departure</span><span>"</span><span>,</span> <span>values</span><span>=</span><span>[</span><span>"</span><span>Total</span><span>"</span><span>],</span> <span>aggfunc</span><span>=</span><span>"</span><span>sum</span><span>"</span><span>,</span> <span>fill_value</span><span>=</span><span>0</span><span>,</span> <span>)</span> <span>df_wide_flattened</span> <span>=</span> <span>df_wide</span><span>.</span><span>copy</span><span>()</span> <span>df_wide_flattened</span><span>.</span><span>columns</span> <span>=</span> <span>[</span><span>"</span><span>_</span><span>"</span><span>.</span><span>join</span><span>(</span><span>x</span><span>)</span> <span>for</span> <span>x</span> <span>in</span> <span>df_wide_flattened</span><span>.</span><span>columns</span><span>.</span><span>to_flat_index</span><span>()]</span> <span>df_wide_flattened</span><span>.</span><span>reset_index</span><span>(</span><span>inplace</span><span>=</span><span>True</span><span>)</span> <span># List of List </span><span>data</span> <span>=</span> <span>df_wide_flattened</span><span>.</span><span>values</span><span>.</span><span>tolist</span><span>()</span> <span>data</span><span>.</span><span>insert</span><span>(</span><span>0</span><span>,</span> <span>df_wide_flattened</span><span>.</span><span>columns</span><span>.</span><span>tolist</span><span>())</span> <span># Insert back the column header </span> <span># Create template </span><span>t</span> <span>=</span> <span>Template</span><span>(</span><span>"""</span><span> <html> <head> <script type=</span><span>"</span><span>text/javascript</span><span>"</span><span> src=</span><span>"</span><span>https://www.gstatic.com/charts/loader.js</span><span>"</span><span>></script> <script type=</span><span>"</span><span>text/javascript</span><span>"</span><span>> google.charts.load(</span><span>'</span><span>current</span><span>'</span><span>, {packages: [</span><span>'</span><span>corechart</span><span>'</span><span>, </span><span>'</span><span>bar</span><span>'</span><span>]}); google.charts.setOnLoadCallback(drawChart); function drawChart() { var data = google.visualization.arrayToDataTable( {{ data }} ); var options = { chart: { title: </span><span>'</span><span>{{ title }}</span><span>'</span><span> }, hAxis: { title: </span><span>'</span><span>{{ h_axis }}</span><span>'</span><span>, minValue: 0, }, vAxis: { title: </span><span>'</span><span>{{ v_axis }}</span><span>'</span><span> }, bars: </span><span>'</span><span>horizontal</span><span>'</span><span> }; var chart = new google.charts.Bar(document.getElementById(</span><span>'</span><span>chart_div</span><span>'</span><span>)); chart.draw(data, options); } </script> </head> <body> <div id=</span><span>"</span><span>chart_div</span><span>"</span><span>></div> </body> </html> </span><span>"""</span><span>)</span> <span># Render template, and save to a html file </span><span>output</span> <span>=</span> <span>t</span><span>.</span><span>render</span><span>(</span> <span>data</span><span>=</span><span>data</span><span>,</span> <span>title</span><span>=</span><span>"</span><span>Inbound and Outbound Passenger at different Control Points (Year 2022)</span><span>"</span><span>,</span> <span>h_axis</span><span>=</span><span>"</span><span>Passengers</span><span>"</span><span>,</span> <span>v_axis</span><span>=</span><span>"</span><span>Control Point</span><span>"</span><span>,</span> <span>)</span> <span>with</span> <span>open</span><span>(</span><span>'</span><span>output.html</span><span>'</span><span>,</span> <span>'</span><span>w</span><span>'</span><span>)</span> <span>as</span> <span>f</span><span>:</span> <span>f</span><span>.</span><span>write</span><span>(</span><span>output</span><span>)</span>from jinja2 import Template import pandas as pd # Get the 2022 arrival/departure data, hopefully the URL will not break url = "https://www.immd.gov.hk/opendata/eng/transport/immigration_clearance/statistics_on_daily_passenger_traffic.csv" df = pd.read_csv(url) df = df[["Date", "Control Point", "Arrival / Departure", "Total"]] df['Date'] = pd.to_datetime(df['Date'], format="%d-%m-%Y") df_subset = df[(df['Date'] >= '2022-01-01') & (df['Date'] <= '2022-12-31')].reset_index(drop=True) # 2022 data only, for simplicity # Group by Control Point. We can safely ignore date, as we are only using 2022 data df_agg = df_subset.groupby(["Control Point", "Arrival / Departure"]).agg(Total=("Total", "sum"),).reset_index() df_agg = df_agg[df_agg.Total > 0].reset_index(drop=True) # Long to wide df_wide = df_agg.pivot_table( index=["Control Point"], columns="Arrival / Departure", values=["Total"], aggfunc="sum", fill_value=0, ) df_wide_flattened = df_wide.copy() df_wide_flattened.columns = ["_".join(x) for x in df_wide_flattened.columns.to_flat_index()] df_wide_flattened.reset_index(inplace=True) # List of List data = df_wide_flattened.values.tolist() data.insert(0, df_wide_flattened.columns.tolist()) # Insert back the column header # Create template t = Template(""" <html> <head> <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script> <script type="text/javascript"> google.charts.load('current', {packages: ['corechart', 'bar']}); google.charts.setOnLoadCallback(drawChart); function drawChart() { var data = google.visualization.arrayToDataTable( {{ data }} ); var options = { chart: { title: '{{ title }}' }, hAxis: { title: '{{ h_axis }}', minValue: 0, }, vAxis: { title: '{{ v_axis }}' }, bars: 'horizontal' }; var chart = new google.charts.Bar(document.getElementById('chart_div')); chart.draw(data, options); } </script> </head> <body> <div id="chart_div"></div> </body> </html> """) # Render template, and save to a html file output = t.render( data=data, title="Inbound and Outbound Passenger at different Control Points (Year 2022)", h_axis="Passengers", v_axis="Control Point", ) with open('output.html', 'w') as f: f.write(output)
Enter fullscreen mode Exit fullscreen mode
Output – output.html
Final Thoughts
There is a lot of other Javascript chart packages (e.g. Vega Chart, HighCharts, etc..), you can easily create some pretty charts with the techniques here with Jinja Template. As it is in HTML format, you can easily include some Insight session as well for some quick comments for your report.
Remarks: Just be careful for not leaking any sensitive data, as a static report is difficult to do access control or audit logging.
I also write on my own blog (https://data-gulu.com). You can find more articles about python and machine learning there.
Happy Coding!
暂无评论内容