In this Pandas tutorial, we will go through three methods to add empty columns to a dataframe. The methods we are going to cover in this post are:
- Simply assigning an empty string and missing values (e.g., np.nan)
- Adding empty columns using the assign method
- Creating empty columns using the insert method
In all the examples here, we will insert empty strings, missing values, or both, since either can be considered empty. In the first section, however, we will create a dataframe from a dictionary. After that, we will go through the examples of how to create empty columns in a dataframe. Note that if you have new data, adding it as new columns to the dataframe can be done in a similar way.
Create a Pandas dataframe
In this section, we will create a Pandas dataframe from a Python dictionary. Here is how to create the example dataframe:
import pandas as pd
import numpy as np
gender = ['M', 'F', 'F', 'M']
cond = ['Silent', 'Silent',
        'Noise', 'Noise']
age = [19, 21, 20, 22]
rt = [631.2, 601.3,
      721.3, 722.4]
data = {'Gender': gender,
        'Condition': cond,
        'age': age,
        'RT': rt}
# Creating the DataFrame from the dict:
df = pd.DataFrame(data)
Here, we created four lists containing categorical and numerical data, added them to a Python dictionary, and then used the DataFrame constructor to create our dataframe. In the image below, you can see the resulting dataframe that we created from the dictionary. If needed, we can also make one of the columns the index.
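As a quick sanity check, the sketch below rebuilds the same example dataframe in a compact form and inspects its shape and column names. It is only one possible way to look at the result; the variable names shape and column_names are illustrative and not part of the original example.

```python
import pandas as pd

# Same example data as above, written inline
data = {'Gender': ['M', 'F', 'F', 'M'],
        'Condition': ['Silent', 'Silent', 'Noise', 'Noise'],
        'age': [19, 21, 20, 22],
        'RT': [631.2, 601.3, 721.3, 722.4]}
df = pd.DataFrame(data)

# (number of rows, number of columns)
shape = df.shape
# Column names, in the order the dictionary keys were given
column_names = list(df.columns)

print(shape)
print(column_names)
print(df.head())
```

Each dictionary key becomes a column name and each list becomes that column's values, so all lists need to have the same length.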
Note that most of the time we import data from other sources, such as CSV, Excel (.xlsx), or JSON. If you need to learn how to load data in Python with Pandas, see the following tutorials:
- Pandas: How to read and write Excel files in Python
- Pandas read CSV tutorial: How to import and save data
- How to read and write JSON files using Python and Pandas
In the next section, we will start learning how to add a column to the dataframe we just created. First, we will learn how to add columns using ordinary Python syntax. After that, we will use the assign method. Finally, we will use the insert method.
Add Empty Columns to a Pandas Dataframe
In this section, we will cover the three methods to create empty columns in a dataframe in Pandas. First, we will use simple assigning to add empty columns. Second, we are going to use the assign method, and finally, we are going to use the insert method.
1. Adding Empty Columns using Simple Assigning
Here is how to add empty columns using simple assigning:
df['Empty_col1'] = ''
df['Empty_col2'] = np.nan
In the code above, we added two empty columns to our dataframe called df. To explain the code: we created two new columns using brackets (i.e., '[]'), and within the brackets we used strings that became the new column names. Finally, we assigned the values ('' and np.nan), and each single value is repeated for every row. As a result, we get two new columns that are essentially empty. Note, however, that the column containing NaN is not really empty: every row holds the floating-point value NaN, which Pandas treats as a missing value. Here is the resulting dataframe containing the new, and empty, columns:
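To make the difference between the two kinds of "empty" concrete, here is a minimal sketch, assuming a cut-down version of the example dataframe with only the age and RT columns. The variable names n_missing_str, n_missing_nan, and nan_dtype are illustrative.

```python
import numpy as np
import pandas as pd

# Smaller stand-in for the example dataframe
df = pd.DataFrame({'age': [19, 21, 20, 22],
                   'RT': [631.2, 601.3, 721.3, 722.4]})
df['Empty_col1'] = ''
df['Empty_col2'] = np.nan

# Empty strings are ordinary values, so isna() does not flag them
n_missing_str = int(df['Empty_col1'].isna().sum())
# NaN counts as a missing value in every row
n_missing_nan = int(df['Empty_col2'].isna().sum())
# The NaN column gets a floating-point dtype
nan_dtype = df['Empty_col2'].dtype

print(n_missing_str, n_missing_nan, nan_dtype)
```

Which value to prefer depends on what comes next: np.nan tends to work well with methods such as isna() and fillna(), while an empty string may be more natural for a column that will later hold text.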
2. Adding Empty Columns using the assign() Method
To add empty columns we can use the assign() method:
df = df.assign(Empty_Col1='', Empty_Col2='')
In the code example above, we added two empty columns to our dataframe by passing two keyword arguments. The argument names become the new column names, and what we assign to them become the values (here, empty strings). As we have learned here, assign() returns a new object with the new columns added and leaves the original dataframe unchanged, which is why we assigned the result back to df. If there are existing columns with the same names, they will be overwritten. If you want to add columns with data, a list-like value must have the same length as the existing columns (i.e., the same number of rows).
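The behaviour described above can be sketched as follows, again assuming a cut-down dataframe with only age and RT. The names df_new, Too_Short, and length_error are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'age': [19, 21, 20, 22],
                   'RT': [631.2, 601.3, 721.3, 722.4]})

# assign() returns a new dataframe; df itself is left unchanged.
# 'Empty_Col1' is new, while 'RT' already exists and is overwritten.
df_new = df.assign(Empty_Col1='', RT=0.0)

original_columns = list(df.columns)
new_columns = list(df_new.columns)

# A list-like value must match the number of rows (4 here)
try:
    df.assign(Too_Short=[1, 2])
    length_error = False
except ValueError:
    length_error = True

print(original_columns, new_columns, length_error)
```

Because the original dataframe is not modified, forgetting to store the result of assign() is a common reason for new columns seeming to disappear.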
3. Adding Empty Columns using the insert() Method
Adding empty columns can also be done using the insert() method:
df.insert(4, 'Empty_Col1', '')
df.insert(5, 'Empty_Col2', '')
In the code example above, we added two empty columns, again. Now, when we add columns using this method, we use 3 arguments. First, by using the loc argument we "tell" the insert method where we want our new column to be located. In our case, we put them in the last position in the dataframe. Second, we used the column argument and added a string for our new column names. Third, we used the value argument to actually add something. In our case, we just added empty strings as we wanted to create empty columns. Here are the first three rows of the resulting dataframe:
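The following sketch shows how loc controls the position, using the same three arguments passed by keyword. The column name First_Col is hypothetical and only there to show a column placed at the start.

```python
import pandas as pd

df = pd.DataFrame({'Gender': ['M', 'F', 'F', 'M'],
                   'Condition': ['Silent', 'Silent', 'Noise', 'Noise'],
                   'age': [19, 21, 20, 22],
                   'RT': [631.2, 601.3, 721.3, 722.4]})

# insert() modifies df in place and returns None
result = df.insert(loc=4, column='Empty_Col1', value='')
# loc=0 places the new column first and shifts the others to the right
df.insert(loc=0, column='First_Col', value='')

column_order = list(df.columns)
print(column_order)
print(df.head(3))
```

Unlike assign(), insert() changes the dataframe directly, so its return value should not be assigned back to df.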
Now, working with the insert() method requires us to know the number of columns in the dataframe, because loc cannot be larger than that number. Also, using the code above it is not possible to insert a column with a name that already exists in the dataframe. If we don't know the number of columns, we can get it by typing len(df.columns). Another possibility is to just use the number of columns as the first argument:
df.insert(len(df.columns), 'Empty_col3', '')
df.insert(len(df.columns), 'Empty_Col4', '')
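The limits of insert() can be explored with a short sketch like the one below. It assumes a cut-down dataframe with two columns; the flags loc_rejected and name_rejected and the counter n_duplicates are illustrative names. The exact exception type for an out-of-range loc is treated as an assumption here, so the sketch catches both IndexError and ValueError.

```python
import pandas as pd

df = pd.DataFrame({'age': [19, 21, 20, 22],
                   'RT': [631.2, 601.3, 721.3, 722.4]})

# loc equal to the number of columns appends the column at the end
df.insert(len(df.columns), 'Empty_Col1', '')

# A loc larger than the number of columns is rejected
try:
    df.insert(len(df.columns) + 1, 'Empty_Col2', '')
    loc_rejected = False
except (IndexError, ValueError):
    loc_rejected = True

# Reusing an existing column name is rejected by default
try:
    df.insert(0, 'Empty_Col1', '')
    name_rejected = False
except ValueError:
    name_rejected = True

# With allow_duplicates=True the duplicate name is accepted
df.insert(0, 'Empty_Col1', '', allow_duplicates=True)
n_duplicates = int((df.columns == 'Empty_Col1').sum())

print(loc_rejected, name_rejected, n_duplicates)
```

Duplicate column names can make later selection ambiguous, since df['Empty_Col1'] then returns a dataframe with both columns, so allow_duplicates is probably best used sparingly.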
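Note that we can insert an empty column at any position from 0 up to and including the number of columns; the existing columns from that position onward are shifted to the right. The allow_duplicates argument is only needed if the new column has the same name as an existing one. Of course, we cannot use insert() to create a new column outside of the index. For example, if there are 10 columns, loc=10 adds the column last, whereas loc=11 raises an error. Here is a code example to insert an empty column at an existing index: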
df.insert(3, 'NEW COL', '', allow_duplicates=True)
Now that we have added new and empty columns, we can continue working with the dataframe. For example, we can first get the column names and then go on to rename columns in the Pandas dataframe.
Now, the most convenient way to add an empty column to a dataframe is to use the assign() method. For example, df = df.assign(ColName='') will add an empty column called 'ColName' to the dataframe called 'df'. Adding more arguments to assign() will enable you to create multiple empty columns.
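When the column names are stored in a list, one possible approach is to build the keyword arguments with a dictionary comprehension and unpack them into assign(). This is only a sketch; the names in new_names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'age': [19, 21, 20, 22],
                   'RT': [631.2, 601.3, 721.3, 722.4]})

# Hypothetical column names, used only for illustration
new_names = ['Empty_A', 'Empty_B', 'Empty_C']

# Build the keyword arguments from the list and unpack them into assign()
df = df.assign(**{name: '' for name in new_names})

columns_after = list(df.columns)
print(columns_after)
```

Dictionary unpacking also allows column names that are not valid Python identifiers (for example, names containing spaces), which cannot be written as ordinary keyword arguments.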
As we have seen, it is easy to add an empty column to a Pandas dataframe. Now that you know how, you can go on and use Pandas to_datetime() to convert, e.g., a string to a date.
Conclusion
In this post, we learned how to add empty columns to a dataframe. Specifically, we used 3 different methods. First, we added columns by simply assigning an empty string and np.nan, much like when we assign values to ordinary Python variables. Second, we used the assign() method and created empty columns in the Pandas dataframe. Finally, we looked at the insert() method and how we could use this to insert new columns at a given position in the dataframe. In conclusion, the most convenient method to add columns is the assign() method because a single call can add several columns at once.
Hope you enjoyed the Pandas tutorial, and please leave a comment below if there is something you want to be covered, on the blog, or in the blog post. Finally, share the post if you learned something new!
Resources
Here you will find some great resources:
- Coefficient of Variation in Python with Pandas & NumPy
- Python Scientific Notation & How to Suppress it in Pandas & NumPy
- Pandas Count Occurrences in Column – i.e. Unique Values
- How to Convert a NumPy Array to Pandas Dataframe: 3 Examples
- Create a Correlation Matrix in Python with NumPy and Pandas