How to Read and Write JSON Files using Python and Pandas

In this Python tutorial, we will learn how to read and write JSON files using Python. In the first part, we are going to use the Python package json to create and read a JSON file as well as write a JSON file. After that, we are going to use Pandas read_json method to load JSON files into a Pandas dataframe. Here, we will learn how to read from a JSON file locally, from a URL, and how to read a nested JSON file.

Finally, as a bonus, we will also learn how to manipulate data in Pandas dataframes, rename columns, and plot the data using Seaborn.

Table of Contents

What is a JSON File?

JSON, short for JavaScript Object Notation, is a compact, text based format used to exchange data. This format that is common for downloading, and storing, information from web servers via so-called Web APIs. JSON is a text-based format and  when opening up a JSON file, we will recognize the structure. That is, it is not so different from Python’s structure for a dictionary.

  • Save
Example JSON file

In the first Python parsing json example, we will use the Python module json to create a JSON file. After we have done that, we are going to load the JSON file. First. we create a dictionary with data to practice with:

import json

# Creating a Python Dictionary
data = {"Sub_ID":["1","2","3","4","5","6","7","8" ],
        "Name":["Erik", "Daniel", "Michael", "Sven",
                "Gary", "Carol","Lisa", "Elisabeth" ],
        "Salary":["723.3", "515.2", "621", "731", 
                  "844.15","558", "642.8", "732.5" ],
        "StartDate":[ "1/1/2011", "7/23/2013", "12/15/2011",
                     "6/11/2013", "3/27/2011","5/21/2012", 
                     "7/30/2013", "6/17/2014"],
        "Department":[ "IT", "Manegement", "IT", "HR", 
                      "Finance", "IT", "Manegement", "IT"],
        "Sex":[ "M", "M", "M", 
              "M", "M", "F", "F", "F"]}

print(data)Code language: Python (python)
  • Save
Python dictionary

Parsing JSON in Python

In the next Python read a JSON file example, we will read the JSON file we created above. Reading a JSON file in Python is pretty easy. We open the file using open(). After this, we read the JSON file using the load() method.

# Python Parsing JSON 
with open('data.json') as json_file:
    data = json.load(json_file)

print(data)Code language: Python (python)

Now, using Python to read a json file was pretty easy, right? Now, before learning how to save a JSON file in Python, it is worth mentioning that we can also parse Excel files. For example, see the post about reading xlsx files in Python for more information-

Saving to a JSON file

In this section, we will learn how to write a JSON file in Python. In Python, this can be done using the module json . This module enables us both to read and write content to and from a JSON file. This module converts the JSON format to Python’s internal format for Data Structures. So we can work with JSON structures just as we do in the usual way with Python’s own data structures.

Python Write JSON File Example:

In the example code below, we will learn how to make a JSON file in Python. First, we start by importing the json module. After we’ve done that, we open up a new file and use the dump method to write a json file using Python.

import json

# Parse JSON
with open('data.json', 'w') as outfile:
    json.dump(data, outfile)Code language: Python (python)

It is, of course, possible to save JSON to other formats, such as xlsx, and CSV, and we will learn how to export a Pandas dataframe to CSV, later in this blog post.

How to Use Pandas to Load a JSON File

Now, if we are going to work with the data we might want to use Pandas to load the JSON file into a Pandas dataframe. This will enable us to manipulate data, do summary statistics, and data visualization using Pandas built-in methods. Note, we will cover this briefly later in this post also.

Pandas Read Json Example:

In the next example, we will use Pandas read_json method to read the JSON file we wrote earlier (i.e., data.json). Simply, we start by importing pandas as pd:

import pandas as pd

# Read JSON as a dataframe with Pandas:
df = pd.read_json('data.json')

dfCode language: Python (python)

The output, when working with Jupyter Notebooks, will look like this:

  • Save

It’s also possible to convert a dictionary to a Pandas dataframe. Make sure to check that post out for more information.

Data Manipulation using Pandas

Now that we have loaded the JSON file into a Pandas dataframe we are going use Pandas inplace method to modify our dataframe. We start by setting the Sub_ID column as index.

df.set_index('Sub_ID', inplace=True)
dfCode language: Python (python)
  • Save

Pandas JSON to CSV Example

Now when we have loaded a JSON file into a dataframe we may want to save it in another format. For instance, we may want to save it as a CSV file and we can do that using Pandas to_csv method. It may be useful to store it in a CSV, if we prefer to browse through the data in a text editor or Excel.

We carry out the same data manipulation method in the Pandas JSON to CSV example below.

# Saving .json to .csv in Python
df.to_csv("data.csv")Code language: Python (python)
  • Save
Pandas json to CSV example file

In a more recent post, you will learn how to convert JSON to Excel (.xlsx) in Python with Pandas, among other Python packages.

Pandas to JSON Example

In this section, we will learn how to save Pandas dataframe to JSON. If we, for instance, have our data stored in a CSV file locally, but want to enable the functionality of the JSON files, we will use Pandas to_json method:

df = pd.read_csv("data.csv")

# Save dataframe to JSON format
df.to_json("data.json")Code language: Python (python)

Learn more about working with CSV files using Pandas in the  Pandas Read CSV Tutorial

How to Load JSON from an URL

We have now seen how easy it is to create a JSON file, write it to our hard drive using Python Pandas, and, finally, how to read it using Pandas. However, as previously mentioned, many times, the data stored in the JSON format are on the web.

Thus, in this section of the Python json guide, we will learn how to use Pandas read_json method to read a JSON file from a URL. Most often, it is fairly simple: we create a string variable pointing to the URL:

url = "https://api.exchangerate-api.com/v4/latest/USD"

# Parsing JSON from URL:
df = pd.read_json(url)
df.head()Code language: Python (python)
  • Save

Load JSON from a URL: Second Example

When loading some data, using Pandas read_json seems to create a dataframe with dictionaries within each cell. One way to deal with these dictionaries, nested within dictionaries, is to work with the Python module request. This module also has a method for parsing JSON files. After we have parsed the JSON file we will use the method json_normalize to convert the JSON file to a dataframe.

  • Save
Pandas Dataframe from JSON
import requests
from pandas.io.json import json_normalize

url = "https://think.cs.vt.edu/corgis/json/airlines/airlines.json"
resp = requests.get(url=url)

# Handling nested JSON:
df = json_normalize(resp.json())
df.head()Code language: Python (python)
  • Save

As seen in the image above, the column names are pretty long. This is quite impractical when we are going to create a time series plot later using Seaborn. We are now going to rename the columns so they become a bit easier to use.

In the code example below, we use Pandas rename method together with the Python module re. We are using a regular expression to remove “statistics.# of” and “statistics.” from the column names. Finally, we are also replacing dots (“.”) with underscores (“_”) using the str.replace method:

import re

df.rename(columns=lambda x: re.sub("statistics.# of","",x), 
          inplace=True)
df.rename(columns=lambda x: re.sub("statistics.","",x), 
          inplace=True)

df.columns = df.columns.str.replace("[.]", "_")
df.head()Code language: Python (python)

Renaming columns is something that is quicker, and easier, done using the Python package Pyjanitor (thanks Shantanu Oak, for pointing this out in the comment section). See how to use Pandas and Pyjanitor to rename columns, among other things, using Pyjanitor:

Time Series Plot from JSON Data using Seaborn

In the last example in this post, we will use Seaborn to create a time series plot. The data we loaded from JSON to a dataframe contains data about delayed and canceled flights. We are going to use Seaborns lineplot method to create a time series plot of the number of canceled flights from 2003 to 2016, grouped by carrier code.

%matplotlib inline

import matplotlib.pyplot as plt
import seaborn as sns

fig = plt.figure(figsize=(10, 7))
g = sns.lineplot(x="timeyear", y="flightscancelled", ci=False,
             hue="carriercode", data=df)

g.set_ylabel("Flights Cancelled",fontsize=20)
g.set_xlabel("Year",fontsize=20)


plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)Code language: Python (python)
  • Save

Note, that we changed the font size as well as the x- and y-axis’ labels using the methods set_ylabel and set_xlabel. Furthermore, we also moved the legend using the legend method from matplotlib.

For more about exploratory data analysis using Python:

Conclusion

In this post, we have learned how to write a JSON file from a Python dictionary, how to load that JSON file using Python and Pandas. Furthermore, we have also learned how to use Pandas to load a JSON file from a URL to a dataframe, how to read a nested JSON file to a dataframe.

Here is a link to a Jupyter Notebook containing all code examples in this post.

Tutorials

  • Save

5 thoughts on “How to Read and Write JSON Files using Python and Pandas”

  1. Thank you Erik, you have broken it down really well. Just started learning programming and this has been the best article explaining the nested dictionaries to a dataframe.

    1. Hi Natasha!

      Thank you for your kind comments. I am really glad the article helped you in your learning progress. Have a nice day,

      Erik

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top
Share via
Copy link