In this Pandas tutorial, we will learn how to export JSON to Excel in Python. This guide will cover four simple steps to use Python’s json module, the Python package requests, and Pandas.
Table of Contents
- Outline
- Basic Code Example to import JSON to Excel with Python
- Conversion
- Prerequisites
- 4 Steps to Convert JSON to Excel in Python
- JSON to Excel: Reading data from a URL
- Import JSON to Excel and Specifying the Sheet Name
- JSON to Excel with Pandas Removing the Index Column
- Other Options: the JSON to excel converter Package
- Conclusion
- Resources
Outline
The structure of this tutorial is as follows. The first section will look at a basic example of converting JSON to an Excel file with Pandas and Python. After we have seen and briefly learned the syntax, we will continue with a section covering some examples of when this knowledge may be useful. In the third section, we will look at the prerequisites of this Python tutorial and how to install Pandas. After we are sure we have everything needed, we will go through four steps on how to save JSON to Excel in Python. Here, we will start by reading the JSON file from the hard drive and saving it as an Excel file. Furthermore, we will also look at an example when reading JSON from a URL and saving it as a .xlsx file. Finally, we will also use the Python package JSON to Excel converter.
Basic Code Example to import JSON to Excel with Python
Here is the easiest way to convert JSON data to an Excel file using Python and Pandas:
import pandas as pd
df_json = pd.read_json(‘DATAFILE.json’)
df_json.to_excel(‘DATAFILE.xlsx’)
Code language: Python (python)
In the code chunk above, we use the pandas library to read data from a JSON file and save it to an Excel file. Here is a step-by-step explanation of the code:
- The first line of the code imports the pandas library and gives it the name “pd”. This library provides easy-to-use data structures and data analysis tools for Python.
- The second line reads the data from the JSON file ‘DATAFILE.json’ and creates a pandas DataFrame object called ‘df_json’. A DataFrame is a two-dimensional data table, similar to a spreadsheet or SQL table.
- The third line takes the DataFrame ‘df_json’ and saves it as an Excel file named ‘DATAFILE.xlsx’ using the ‘to_excel’ method of the DataFrame.
Conversion
In this section, the problem will be briefly formulated. Although we can work with data using only Python and its packages, we might collaborate with people who want to work in Excel. It is true that Microsoft Excel is packed with features to keep and organize tabular data. Furthermore, getting an overview of the data in Excel might also be easier. Now, a lot of open data resources are storing data in a variety of different file formats. But sometimes, we might find out that the data is stored in the JSON format only. In some cases, we might want to obtain data from JSON and save it to Excel and send it to our collaborators. In other cases, we just want to explore data using Excel, as mentioned before.
Despite it is possible to import JSON to Excel with Microsoft Excel itself, the optimal way is to automate the conversion by importing data from JSON to Excel worksheets programmatically. Especially, if we already know Python.
Prerequisites
In this section, you will learn what you need to have installed to convert JSON data to Excel files.
Obviously, to follow this guide, you must have Python, requests, and Pandas installed. Secondly, and not that obvious maybe, you will also need to install the package openpyxl. This package is what will be used, by Pandas, to create Excel files (e.g., .xlsx).
Now, there is one very easy way to install both Python and Pandas: installing a Python scientific distribution such as Anaconda, ActivePython, or Canopy (see here for a list of Python distributions). For example, if you install Anaconda you will get a library with useful Python packages, including Pandas.
How to Install Pandas and openpyxl
If you already have Python installed, you can use pip to install Python packages. To install Pandas and openpyxl using pip open up Windows Command Prompt or the Terminal you prefer and type the following code:
pip install pandas openpyxl
Code language: Bash (bash)
Sometimes, pip will warn us that there’s a newer version. If this is the case now, read the post about upgrading pip for more information.
Creating a virtual environment and installing your new packages in this environment is generally a good idea. Now that you have everything you need, we will continue to the next section of this tutorial. Using a tool such as pipx you can automatically install Python packages in virtual environments. Here’s a YouTube video on how to install Pandas and Anaconda:
As a final note, before converting JSON to Excel with Python: if you need, you can use pip to install specific version of Python packages.
4 Steps to Convert JSON to Excel in Python
In this section, we will go through, step-by-step, how to import JSON data and save it as an Excel file using Python. Here’s a summary of what this chapter will cover: 1) importing pandas and json, 2) reading the JSON data from a directory, 3) converting the data to a Pandas dataframe, and 4) using Pandas to_excel method to export the data to an Excel file.
1. Importing the Pandas and json Packages
First, we start by importing Pandas and json:
import json
import pandas as pd
Code language: Python (python)
Now, it may be evident, but we are importing json and then Pandas to use json to read the data from the file and, after we have done this, we will use Pandas to save it as a .xlsx file.
2. Reading the JSON file
Now, we are ready to import the data from a file using the load method:
with open('./SimData/save_to_excel.json') as json_file:
data = json.load(json_file)
Code language: Python (python)
As you can see in the code chunk above, we are first opening a file with Python (i.e., as json_file) with the with-statement. Moreover, in the with-statement, we are using the load method. This is when we are reading the JSON file.
3. Creating a Pandas Dataframe
Now, before we can save the data, we have imported, we need to create a dataframe:
df = pd.DataFrame(data)
Code language: Python (python)
In the code chunk above, we used Pandas DataFrame class to create a dataframe from the JSON data we loaded into Python in the previous section. What we do is similar to converting a Python dictionary to a Pandas dataframe. Note adding new columns to the dataframe, before saving it is, of course, also possible.
4. Saving the Imported Data as a .xlsx File
Finally, we can use the to_excel method that is available in the dataframe object:
df.to_excel('./SimData/exported_json_data.xlsx')
Code language: Python (python)
Now, if we want to convert JSON data to other formats such as CSV, SPSS, or STATA to name a few, we can use other methods available in the dataframe object. For example, if we want to save the JSON data to a CSV file we can use the to_csv method. If you are interested in learning more about reading and writing data files with Python and Pandas check out the following blog posts:
- How to Read and Write Excel Files with Pandas
- How to Read and Write Stata (.dta) Files in R with Haven
Now, it is worth mentioning that we can skip step 2 and use Pandas read_json method to import the JSON data to a dataframe. Here is a full working example of how to save JSON data with only the Pandas package:
import pandas as pd
df = pd.read_json('./SimData/save_to_excel.json')
df.to_excel('./SimData/exported_json_data.xlsx')
Code language: Python (python)
In the next section, we will look at how to read JSON data from a URL and save it as an Excel file.
JSON to Excel: Reading data from a URL
In the previous section, we learned how to automatically convert JSON data to an Excel file in Python step-by-step. Here, we will use the same steps but read the data from a URL. First, we will look at a simple example using the same code as above. However, we will change the string so that it is pointing at the URL location instead of a directory.
import pandas as pd
df = pd.read_json('http://api.open-notify.org/iss-now.json')
df.to_excel('locationOfISS.xlsx')
Code language: Python (python)
As you can see, we managed to read this JSON file with Python and save it as an Excel file using Pandas. In the next section, we will look at a more complex JSON file where the data is nested.
Nested JSON data to Excel
In this section, we will look at a bit more complex example. Sometimes the JSON data is nested, and if we use the method above, the Excel file will be messy. Luckily, we can fix this by using json_normalize from Pandas and the requests module:
import requests
from pandas.io.json import json_normalize
url = "https://think.cs.vt.edu/corgis/datasets/json/airlines/airlines.json"
resp = requests.get(url=url)
df = json_normalize(resp.json())
# Writing Excel File:
df.to_excel('Airlines.xlsx')
Code language: Python (python)
In the code chunk above, we first imported requests and then json_normalize. Second, we created a string with the URL to the JSON data we want to save as an Excel file. Now, the next thing we do is to use the get method, which sends a GET request to our URL. Briefly, this is pretty much reading in the webpage that we want, and we can, then, use the json method to get the json data. Finally, we used the json_normalize to create a dataframe that we saved as an Excel file. Again, saving the data we read from the JSON file uses the to_excel method.
Now, as you may have noticed in the image of the output (i.e., the .xlsx files opened in Excel) we have a row that is not part of our original data. This row is the index row from the Pandas dataframe and we can, of course, get rid of this. In the next section, we will look at some of the arguments of the to_excel method. For example, if we want to get rid of the index row, we will learn how to do that. If you are interested check out the tutorial on how to read and write JSON with Python and Pandas.
Import JSON to Excel and Specifying the Sheet Name
This section will specify the sheet name using the sheet_name argument. Now, the string we input will be the name of the Worksheet in the Excel file:
df.to_excel('Airlines.xlsx', sheet_name='Session1')
Code language: Python (python)
In the example code above, we named the sheet ‘Session1’. However, we still can see the index column in the Excel file. In the next section, we will have a look on how to remove this index.
JSON to Excel with Pandas Removing the Index Column
To remove the index column when converting JSON to Excel using Pandas we use the index argument:
df.to_excel('Airlines.xlsx', index=False)
Code language: Python (python)
This argument takes a bool (True/False), and the default is True. Thus, in the code example above, we set it to False, and we get this Excel file:
Of course, we could use plenty of more arguments when converting JSON to Excel with Pandas. For instance, we can set the encoding of the Excel file. Another thing we can do is to set the engine: openpyxl or xlsxwriter. Note, to use the to_excel method to save the JSON file to an Excel file, you need to have one of them installed on your computer (see the previous section). See the documentation for more information on how to use the arguments of the to_excel method.
Other Options: the JSON to excel converter Package
Now, we can use other methods and/or Python packages to convert JSON to Excel in Python. For example, we could work with the json module and choose one of xlsxwriter or openpyxl packages. It is worth noting, however, that the resulting Python script will be a bit more complex. Furthermore, there is a Python package created just to convert JSON to Excel: JSON to excel converter.
JSON to Excel with the JSON to Excel converter package
Here is a how-to quickly convert JSON to a .xlsx file:
from json_excel_converter import Converter
from json_excel_converter.xlsx import Writer
url = 'https://think.cs.vt.edu/corgis/datasets/json/broadway/broadway.json'
resp = requests.get(url=url)
conv = Converter()
conv.convert(resp.json(), Writer(file='./SimData/ExcelData.xlsx'))
Code language: Python (python)
This package is straightforward to use. We converted the airlines.json data to an Excel file in the example code above. Remember, this data is nested, and one very neat thing with the JSON to Excel converter package is that it can handle this very nicely (see the image below for the resulting Excel file).
Note, however, that you need to install the xlsxwriter package to use the JSON to excel converter. Both can be installed using pip: pip install json-excel-converter[xlsxwriter]
.
Conclusion
In this post, we have covered a lot of things related to opening JSON files and saving them as Excel files using Python. Indeed, knowing how to convert JSON data to Excel files might be useful in many situations. For instance, if we get the data and collaborate with someone who prefers using Excel. In this tutorial, we have used the Python packages Pandas, json, requests, and JSON to Excel converter to read JSON files and save them as Excel files. First, we had a quick look at the syntax, then we learned four steps to convert JSON to Excel. After that, we also learned how to read the JSON data from a URL and how to work a bit with some of the arguments of the to_excel method.
Hopefully, you learned something from this Python tutorial, and if you did find it useful, please share it on your social media accounts. Finally, if you want to learn anything, please comment below, and I might put that together in a new tutorial.
Resources
Here are some more helpful Python tutorials:
- How to Make a Violin plot in Python using Matplotlib and Seaborn
- Python Scientific Notation & How to Suppress it in Pandas & NumPy
- How to use Python to Perform a Paired Sample T-test
- Coefficient of Variation in Python with Pandas & NumPy
- How to use Square Root, log, & Box-Cox Transformation in Python
- How to Perform Mann-Whitney U Test in Python with Scipy and Pingouin