This is the 3rd part of exploration of pandas package.

Part 2:

Part 1:

All data are taken from the official websate: weekly-deaths-by-location-age-sex.xlsx

Taken from official the site:

Let's have another look at the graph from the previous part:

New data arrived. The line for the last week is going up a little, but still well below the average.

Ok, sit back, relax, and have a sip from a flask. Why Flask? This is first what comes to mind we start taking about web data visualization with Python. Also we have a long list of libraries, let's just take one of the most popular - Plotly.

Plotly for Python is actually only a part of a family of products by Plotly. What we will use here - an ability to create data in json format by Plotly for Python.These json data can be used by Plotly for JS to plot a graph on a web page. A bit complicated. Easier done that said.

First we need to install Flask, Plotly and Plotly Express - the last one makes it easier to deal with Plotly.

pip install flask
pip install plotly
pip install plotly_express

Then we'll create a Flask application file:

import json
from flask import Flask,render_template,request
import numpy as np
import pandas as pd
import plotly
import as px

app = Flask(__name__)

def get_data():
    data_2021 = pd.read_excel ('',

    data_2021['year'] = data_2021.week.str.slice(0,2).astype(int)
    data_2021['week'] = data_2021.week.str.slice(3,5).astype(int)

    data_1519 = pd.read_excel ('',

    data_1519['cause'] = 'Non-COVID-19'

    data = data_1519.copy()
    data = data.append(data_2021)
    return data

def get_plot():
    data = get_data()

    dt1 = data.groupby(['year','week']).agg({'deaths':np.sum})

    fig = px.line(dt1,x='week', y='deaths',line_group='year',color='year')

    fig = json.dumps(fig, cls=plotly.utils.PlotlyJSONEncoder)
    return fig

def init():
    fig = get_plot()
    return render_template('mycovidash.html',fig=fig)

We basically repeat my exercises from Part 2. But instead of using pandas' plot function we use Plotly Express' line function to build a line graph and output this figure as json. What is inside this json data? The short answer - everything needed to display a graph on html page.

We also need an html template file: mycovidash.html

<!DOCTYPE html>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="stylesheet" href="">
    <section class="section">
        <div class="container">
            <h1 class="title has-text-centered">
                Weekly Deaths in Scotland

            <div id="myPlot"></div>
<script src=""></script>
    Plotly.newPlot('myPlot', {{ fig | safe }} );

Div myPlot is a place where the plot will be rendered. Plotly.newPlot function renders the plot, getting the data from the variable fig, which we set in the Python function. Notice that we need to use "| safe" filter (see Jinja documentation), otherwise our data in json format will be html-encoded and thus broken.

...And run the app:

export FLASK_ENV=development
flask run

Opening localhost:5000 show show the following page:

This graph is already interactive. You can hover over various parts of the graph, see additional controls which appear when you hover over certain places, play with these controls, e.g. zoom-in - zoom-out. It's already better that what we had before in Jupyter Notebook. (Yes, we could use the same Plotly functionality in a Jupyter Notebook, but now we wanted to build a web application using Flask)

Let's go further. Let's include checkboxes corresponding different groups in our data (year, age, sex, location, cause) and allow user include/exclude certain groups by checking/unchecking these checkboxes.

First, we need to find out what unique values these groups contain.

all_years = data['year'].unique()
all_ages = data['age'].unique()
all_sexes = data['sex'].unique()
all_causes = data['cause'].unique()
all_locations = data['location'].unique()

Looking at the data, all_ages contain slightly incorrect info: 0 value is represented twice - as string '0' and as an integer 0. We need to clean the data. As all other values for age are strings, let's just convert all integer 0s to string '0's, adding a simple line to get_data function:


Next we need to pass all these lists to our template.

def get_plot():
    data = get_data()

    all_years = data['year'].unique()
    all_ages = data['age'].unique()
    all_sexes = data['sex'].unique()
    all_causes = data['cause'].unique()
    all_locations = data['location'].unique()

    dt1 = data.groupby(['year','week']).agg({'deaths':np.sum})

    fig = px.line(dt1,x='week', y='deaths',line_group='year',color='year')

    fig = json.dumps(fig, cls=plotly.utils.PlotlyJSONEncoder)
    print (all_years,all_ages,all_sexes,all_causes,all_locations)
    return fig,all_years,all_ages,all_sexes,all_causes,all_locations

def init():
    fig,all_years,all_ages,all_sexes,all_causes,all_locations = get_plot()
    return render_template('mycovidash.html',fig=fig,all_years=all_years,all_ages=all_ages,all_sexes=all_sexes,all_causes=all_causes,all_locations=all_locations)

And in mycovidash.html, just after <div id="myPlot"></div>:

            <div class="columns">
                <div class="column is-2">
                    {% for s in all_years %}
                    <input type="checkbox" value="{{ s }}" class="years_cb" checked> 20{{ s }}<br>
                    {% endfor %}
                <div class="column is-2">
                    {% for s in all_ages %}
                    <input type="checkbox" value="{{ s }}" class="ages_cb" checked> {{ s }}<br>
                    {% endfor %}
                <div class="column is-1">
                    {% for s in all_sexes %}
                    <input type="checkbox" value="{{ s }}" class="sexes_cb" checked> {{ s }}<br>
                    {% endfor %}
                <div class="column is-3">
                    {% for s in all_locations %}
                    <input type="checkbox" value="{{ s }}" class="locations_cb" checked> {{ s }}<br>
                    {% endfor %}
                <div class="column">
                    {% for s in all_causes %}
                    <input type="checkbox" value="{{ s }}" class="causes_cb" checked> {{ s }}<br>
                    {% endfor %}
            <div class="has-text-centered">
                <button id="update" class="button is-primary">Update</button>

(It's time to notice that in this html file I used Bulma css framework instead of usual Bootstrap. I found Bulma very promising and easier to use than Bottstrap. Let's see...)

Now as all checkboxes are populated, we need somehow to react to them. Let's use it the most usual way - using jQuery and ajax call. These lines are going into out template:

<script src="" integrity="sha256-/xUj+3OJU5yExlq6GSYGSHk7tPXikynS7ogEvDej/m4=" crossorigin="anonymous"></script>
        let years = '';
            years += this.value + ','
        let ages = '';
            ages += this.value + ','
        let sexes = '';
            sexes += this.value + ','
        let locations = '';
            locations += this.value + ','
        let causes = '';
            causes += this.value + ','

            url: "/update",
            type: "GET",
            contentType: 'application/json;charset=UTF-8',
            data: {
                years: years,
                ages: ages,
                sexes: sexes,
                locations: locations,
                causes: causes
            success: function (data) {
                Plotly.react('myPlot', data );

In Python file we'll add additional route:

def update():
    fig,all_years,all_ages,all_sexes,all_causes,all_locations = get_plot([request.args.get('years',''),
    return fig

... and add processing of these new arguments to get_plot function:

def get_plot(pars=None):
    years = list(map(int,pars[0].strip(',').split(','))) if pars else None
    ages = list(pars[1].strip(',').split(',')) if pars else None
    sexes = list(pars[2].strip(',').split(',')) if pars else None
    locations = list(pars[3].strip(',').split(',')) if pars else None
    causes = list(pars[4].strip(',').split(',')) if pars else None

    if years:
        data = data[data['year'].isin(years)]
    if ages:
        data = data[data['age'].isin(ages)]
    if sexes:
        data = data[data['sex'].isin(sexes)]
    if locations:
        data = data[data['location'].isin(locations)]
    if causes:
        data = data[data['cause'].isin(causes)]

One thing to notice. We have to use DataFrame.isin() method. Initially I wrote

    data = data[data['year'] in (years)]

This failed, with an error message which is really difficult to interpret. But if we think about it, data['year'] is a Series object, which reminds a list. So actually using 'in' operator here would be something like [15,16,17,18,19,20,21] in [20,21] - what is apparently not what we want. But using .isin() method gives us exactly what we need.

Ok, it works now! You can select/deselect different groups and the plot is changing accordingly... but very slow. Of course, we need to add caching for the data. Let's use to_pickle method of DataFrame, whatever this format is:

import os
import time


def get_data():
    store_path = 'data.pkl'

        tm = os.path.getmtime(store_path) 
        if int(time.time())-int(tm) > 24 * 60 * 60:
            raise Exception("Cache expired")

        data = pd.read_pickle(store_path)
        return data
    except Exception as e:
        print (e)


    return data

It's all for today.

Here is a GitHub repository:

And here is a working version:

< Earlier

Blog Index

Later >