How To Read JSON Data Using Python Pandas

Reading json data in Python is very easy. Json data can be read from a file or it could be a json web link. Let us first try to read the json from a web link.

Let us first import the necessary packages "requests and pandas".

In [1]:
import requests
import pandas as pd

How to read json data from web link

Let us import the covid19 timeseries data from json link pomber.github.io/covid19/timeseries.json using requests

In [2]:
data = requests.get('https://pomber.github.io/covid19/timeseries.json')

Let us check the type of our data type.

In [3]:
type(data)
Out[3]:
requests.models.Response

To get the json type, we need to use data.json() method.

In [4]:
jsondata = data.json()

Let us check the length of our json data.

In [5]:
len(jsondata)
Out[5]:
178

Let us check the data type of jsondata.

In [6]:
type(jsondata)
Out[6]:
dict

Ok, it is python dictionary. Lets check the keys of our dict.

In [7]:
keys = jsondata.keys()
for key in keys:
    print(key)
    break
Afghanistan

Ok looks like the keys are the country names. Lets check the first row.

In [8]:
jsondata['Afghanistan'][0]
Out[8]:
{'date': '2020-1-22', 'confirmed': 0, 'deaths': 0, 'recovered': 0}

Ok now we know our data a little better. Let us construct a dataframe from our json data.

How to convert Json to Pandas dataframe

The easiest way is to just use pd.DataFrame.from_dict method. Let us try it and see what we get.

In [9]:
df = pd.DataFrame.from_dict(jsondata)
In [10]:
df.head(1)
Out[10]:
Afghanistan Albania Algeria Andorra Angola Antigua and Barbuda Argentina Armenia Australia Austria ... Laos Libya West Bank and Gaza Guinea-Bissau Mali Saint Kitts and Nevis Kosovo Burma MS Zaandam Botswana
0 {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... ... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'... {'date': '2020-1-22', 'confirmed': 0, 'deaths'...

1 rows × 178 columns

Ok we got the dataframe but not in the form that we wanted. We will have to unwind the nested data to build a proper dataframe.

In [11]:
columns=['country','date','confirmed','deaths','recovered']
data = []
for country in jsondata:
    for x in jsondata[country]:
        data.append([country, x['date'],x['confirmed'],x['deaths'],x['recovered']])
df = pd.DataFrame(data,columns=columns)

As we see above, we had to loop through dictionary of dictionaries. Let us see our dataframe now.

In [12]:
df.head(2)
Out[12]:
country date confirmed deaths recovered
0 Afghanistan 2020-1-22 0 0 0
1 Afghanistan 2020-1-23 0 0 0

Ok this looks good now. Now we can perform all the regular dataframe methods on our data.