In this brief tutorial, you will learn how to add a column to a dataframe in R. More specifically, you will learn 1) to add a column using base R (i.e., by using the $-operator and brackets), 2) to add a column using the add_column() function (i.e., from tibble), 3) to add multiple columns, and 4) to add columns from one dataframe to another.
In psychological research, there are instances when we need to merge data from different sources. For example, suppose we have collected data on participants’ personality traits using a self-report questionnaire and their performance on a cognitive task. The personality data may be stored in one dataframe, while the cognitive task data is in another. To analyze the relationship between personality and cognitive performance, we need to combine these datasets. By adding a column containing the personality data to the dataframe with the cognitive task data, we create a unified dataset that lets us investigate how individual personality differences relate to cognitive performance. More generally, adding columns to a dataframe in R gives us the flexibility to integrate various data sources and run analyses that consider multiple factors at once.
Note that when adding a column with tibble we will also use the %>% operator, which dplyr re-exports from the magrittr package. Both dplyr and tibble offer plenty of useful functions that, apart from enabling us to add columns, make it easy to remove a column by name from the R dataframe (e.g., using the select() function from dplyr).
Table of Contents
- Outline
- Prerequisites
- Example Data
- Two Methods to Add a Column to a Dataframe in R (Base).
- How to Add a Column to a dataframe in R using the add_column() Function
- Compute and Add a New Variable to a Dataframe in R with mutate()
- How to Add Multiple Columns to the Dataframe in R
- Add Columns from One Dataframe to Another Dataframe
- Conclusion
- Other R Tutorials
Outline
In this post, we will learn how to add columns to a dataframe in R, covering different methods and examples. To start, we will outline the prerequisites for following along. Next, we will explore two base R methods to add a column to a dataframe: using the $-operator and brackets ([]). After that, we will demonstrate how to add a column using the add_column() function. In example 1, we will add a new column after another column, while in example 2, we will add a column before another. Example 3 will focus on adding an empty column to the dataframe, and example 4 will showcase conditionally adding a column based on other columns. We will then learn how to compute and add a new variable to a dataframe using the mutate() function. Furthermore, we will cover how to add multiple columns to a dataframe and how to add columns from one dataframe to another. By the end of this post, you will know several techniques for adding columns to dataframes in R.
Prerequisites
To follow this tutorial, in which we will carry out a simple data manipulation task in R, you only need to install dplyr and tibble if you want to use the add_column() and mutate() functions as well as the %>% operator. However, if you want to read the example data, you must also install the readxl package.
It may be worth noting that all the mentioned packages are part of the Tidyverse. This collection of packages comes with a lot of tools for cleaning and visualizing data (e.g., to create a scatter plot in R with ggplot2). Before installing packages, check the R version and update R to the latest version if needed.
To add a new column to a dataframe in R, you can use the $-operator. For example, to add the column “NewColumn”, you can run: dataf$NewColumn <- Values. Here, Values is a vector with one value per row (or a single value). This adds the new variable to your dataset.
To add a column from one dataframe to another, you can also use the $-operator. For example, if you want to add the column named “A” from the dataframe called “dfa” to the dataframe called “dfb”, you can run the following code: dfb$A <- dfa$A. Note that this assumes the rows of the two dataframes are in the same order. Adding multiple columns from one dataframe to another can also be accomplished.
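To make the two snippets above concrete, here is a minimal sketch using two small, made-up dataframes. The names dfa and dfb follow the text; the column values and the extra column names (ID, B, ID_copy, A_copy) are invented for illustration only, and the example assumes both dataframes have the same number of rows in the same order.

```r
# Two small, hypothetical dataframes (values invented for illustration)
dfa <- data.frame(ID = 1:3, A = c(10, 20, 30))
dfb <- data.frame(ID = 1:3, B = c("x", "y", "z"))

# Add a brand-new column with the $-operator
dfb$NewColumn <- c(TRUE, FALSE, TRUE)

# Copy column "A" from dfa into dfb (rows are assumed to line up)
dfb$A <- dfa$A

# Copy several columns at once with brackets, giving them new names
dfb[c("ID_copy", "A_copy")] <- dfa[c("ID", "A")]

names(dfb)
```

If the rows of the two dataframes do not line up, copying columns this way will silently pair the wrong values, so it may be safer to match on an identifier column instead.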
In the next section, we are going to use the read_excel() function from the readxl package. After this, we will use R to add a column to the created dataframe.
Example Data
Here is how to read a .xlsx file in R:
# Import readxl
library(readxl)
# Read data from .xlsx file
dataf <- read_excel('./SimData/add_column.xlsx')
Code language: R (r)
In the code chunk above, we imported the file add_column.xlsx. This file was downloaded to the SimData folder, located in the same directory as the script. We can obtain some information about the structure of the data using the str() function, e.g., by running str(dataf).
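In case the .xlsx file is not at hand, the sketch below builds a small stand-in dataframe and inspects it with str(). The object name dataf_demo and all values are hypothetical; only the column names A, B, Cost, and Depr1 to Depr5 are taken from the examples later in this post, and the real file may well contain other columns.

```r
# Hypothetical stand-in for the example data (all values are made up)
dataf_demo <- data.frame(
  A     = c(1, 2, 3, 4),
  B     = c(1, 5, 3, 0),
  Cost  = c(9.5, 12.0, 7.25, 10.0),
  Depr1 = c(2, 3, 1, 4),
  Depr2 = c(2, 4, 1, 5),
  Depr3 = c(3, 3, 2, 4),
  Depr4 = c(1, 4, 1, 5),
  Depr5 = c(2, 3, 1, 4)
)

# Show the type and the first values of each column
str(dataf_demo)

# Number of rows and number of columns
dim(dataf_demo)
```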
Before going to the next section it may be worth pointing out that importing data from other formats is possible. For example, you can see a couple of tutorials covering how to read data from SPSS, Stata, and SAS:
- How to Read and Write Stata (.dta) Files in R with Haven
- Reading SAS Files in R
- How to Read & Write SPSS Files in R Statistical Environment
Now that we have some example data to practice with, we can move on to the next section, in which we will learn how to add a new column to a dataframe in base R.
Two Methods to Add a Column to a Dataframe in R (Base).
First, we will use the $-operator and assign a new variable to our dataset. Second, we will use brackets (“[ ]”) to do the same.
1) Add a Column Using the $-Operator
Here is how to add a new column to a dataframe using the $-operator in R:
# add column to dataframe
dataf$Added_Column <- "Value"
Code language: R (r)
Note how we used the $-operator to create the new column in the dataframe. What we added to the dataframe was a single character value (i.e., the same word), which R recycles into a character vector as long as the number of rows. To see the first six rows of the dataframe with the added column, we can run head(dataf).
If we, on the other hand, tried to assign a vector whose length is neither one nor the number of rows in the dataframe, it would fail. We would get an error similar to “Error: Assigned data `c(2, 1)` must be compatible with existing data.” For more about the dollar sign operator, check the post “How to use $ in R: 6 Examples – list & dataframe (dollar sign operator)“.
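The recycling and the length check can be tried safely with tryCatch(). This is a minimal sketch on a made-up base R data.frame called demo (a hypothetical name), so the error text differs from the tibble message quoted above. Base data.frames also recycle shorter vectors when the number of rows is an exact multiple of the vector length, whereas tibbles only recycle single values; the sketch therefore uses a vector of length 3 for 4 rows.

```r
# Hypothetical dataframe with four rows
demo <- data.frame(A = 1:4)

# A single value is recycled to every row
demo$Added_Column <- "Value"

# A vector of length 3 cannot fill 4 rows, so the assignment fails
result <- tryCatch({
  demo$Bad_Column <- c(2, 1, 3)
  "assignment succeeded"
}, error = function(e) paste("assignment failed:", conditionMessage(e)))

print(result)

# The failed assignment did not add a column
names(demo)
```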
If we would like to add a sequence of numbers, we can use the seq() function and the length.out argument:
# add column to dataframe
dataf$Seq_Col <- seq(1, 10, length.out = dim(dataf)[1])
Code language: R (r)
Notice how we also used the dim() function and selected the first element (the number of rows) to create a sequence with the same length as the number of rows; nrow(dataf) returns the same number. Of course, in a real-life example, we would probably want to specify the sequence in more detail before adding it as a new column. In the next section, we will learn how to add a new column using brackets.
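As a small sketch of the sequence example, the code below uses nrow() instead of dim()[1] and also adds a plain row counter with seq_len(). The dataframe demo and the column name Row_Id are hypothetical and only serve to make the example self-contained.

```r
# Hypothetical dataframe with five rows
demo <- data.frame(A = c(5, 3, 8, 1, 9))

# Evenly spaced values from 1 to 10, one per row
demo$Seq_Col <- seq(1, 10, length.out = nrow(demo))

# A simple row counter: 1, 2, ..., number of rows
demo$Row_Id <- seq_len(nrow(demo))

demo
```

With five rows, Seq_Col holds 1, 3.25, 5.5, 7.75, and 10.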
2) Add a Column Using Brackets (“[]”)
Here is how to append a column to a dataframe in R using brackets (“[]”):
# Adding a new column
dataf["Added_Column"] <- "Value"
Code language: R (r)
Using the brackets will give us the same result as using the $-operator. However, brackets are sometimes more convenient than $. For example, when column names contain whitespace, brackets accept the name as an ordinary string, whereas $ requires backticks around it. Also, when selecting multiple columns, you have to use brackets and not $. In the next section, we are going to create a new column by using tibble and the add_column() function.
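As a brief sketch of the bracket cases mentioned above (a column name containing whitespace and several columns at once), consider the following. The dataframe demo and its column names are made up for illustration.

```r
# Hypothetical dataframe with three rows
demo <- data.frame(A = 1:3)

# A column name containing a space works directly with brackets
demo["Reaction Time"] <- c(512, 430, 610)

# Double brackets assign a single column as a vector
demo[["Group"]] <- c("control", "treatment", "control")

# Several columns can be added in one step with a character vector of names;
# each list element is recycled to the number of rows
demo[c("X", "Y")] <- list(0, "new")

names(demo)
```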
How to Add a Column to a dataframe in R using the add_column() Function
Here is how to add a column to a dataframe in R:
# Append column using tibble:
library(dplyr)   # provides the %>% operator
library(tibble)  # provides add_column()
dataf <- dataf %>%
  add_column(Add_Column = "Value")
Code language: R (r)
In the example above, we added a new column at “the end” of the dataframe. Note that we can use dplyr to remove columns by name; this was done to produce the output for this example.
Finally, if we want to, we can add a column and keep our old dataframe unchanged by creating a copy. To do so, change the code so that the left “dataf” is something else, e.g., “dataf2”. Now that we have added a column to the dataframe, it might be time for other data manipulation tasks. For example, we may now want to remove duplicate rows from the R dataframe or transpose the dataframe.
Example 1: Add a New Column After Another Column
If we want to append a column at a specific position, we can use the .after argument:
# R add column after another column
dataf <- dataf %>%
  add_column(Column_After = "After",
             .after = "A")
Code language: R (r)
As you probably understand, doing this will add the new column after the column “A”. In the next example, we are going to append a column before a specified column.
Example 2: Add a Column Before Another Column
Here is how to add a column to the dataframe before another column:
# R add column before another column
dataf <- dataf %>%
  add_column(Column_Before = "Before",
             .before = "Cost")
Code language: R (r)
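To compare the two positioning arguments side by side, here is a minimal sketch on a made-up tibble. The object names demo and demo2 and the values are hypothetical; the column names A, B, and Cost mirror the ones used in this post. The result is stored under a new name, so the original tibble stays unchanged.

```r
library(tibble)

# Hypothetical tibble with three columns
demo <- tibble(A = 1:3, B = 4:6, Cost = c(9.5, 12, 7.25))

# Insert one column right after "A" ...
demo2 <- add_column(demo, Column_After = "After", .after = "A")

# ... and another right before "Cost"
demo2 <- add_column(demo2, Column_Before = "Before", .before = "Cost")

names(demo2)

# The original tibble still has its three columns
names(demo)
```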
In the next example, we are going to use add_column() to add an empty column to the dataframe.
Example 3: Add an Empty Column to the Dataframe
Here is what we would do if we wanted to add an empty column in R:
# Empty
dataf <- dataf %>%
  add_column(Empty_Column = NA)
Code language: R (r)
Note that we just added NA (the missing value indicator) as the empty column, which gives a logical column filled with NA. We can inspect the dataframe, with the empty column added, using head(dataf).
If we want an empty string instead, we just replace the NA with "", for example. However, this would create a character column that may not be considered empty. In the next example, we are going to add a column to a dataframe based on other columns.
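Because a plain NA is logical, it can be useful to pick a typed missing value when the column will later hold numbers or text. The following is a minimal sketch on a made-up tibble; the object name demo and the column names are hypothetical.

```r
library(tibble)

# Hypothetical tibble with three rows
demo <- tibble(A = 1:3)

demo <- add_column(demo,
                   Empty_Logical = NA,             # logical NA (the default type)
                   Empty_Number  = NA_real_,       # numeric NA
                   Empty_Text    = NA_character_)  # character NA

# Check the type of each column
sapply(demo, class)
```

All three new columns contain only missing values, but they differ in type, which matters once real values are filled in.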
Example 4: Add a Column Based on Other Columns (Conditionally)
Here is how to use R to add a column to a dataframe based on other columns:
# Append column conditionally
dataf <- dataf %>%
  add_column(C = if_else(.$A == .$B, TRUE, FALSE))
Code language: R (r)
In the code chunk above, we added something to the add_column() function: the if_else() function. We did this because we wanted to add a value in the column based on the values in other columns. Furthermore, we used .$ so that the two columns are taken from the piped dataframe and compared (using ==). If the values in these two columns are the same, we add TRUE on the specific row; otherwise, we add FALSE. We can view the new column by running head(dataf).
Note that you can also work with the mutate() function (also from dplyr) to add columns based on conditions. See this tutorial for more information about adding columns on the basis of other columns.
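As a hedged sketch of the mutate() route, the example below adds one logical column and one column with more than two outcomes, using case_when() from dplyr. The dataframe demo, its values, and the new column names Same and Direction are made up for illustration.

```r
library(dplyr)

# Hypothetical dataframe with two numeric columns
demo <- data.frame(A = c(1, 2, 3, 4), B = c(1, 5, 3, 0))

demo <- demo %>%
  mutate(
    # TRUE where the two columns hold the same value
    Same = A == B,
    # A label that depends on how the two columns compare
    Direction = case_when(
      A > B ~ "A larger",
      A < B ~ "B larger",
      TRUE  ~ "equal"
    )
  )

demo
```

Inside mutate(), columns can be referred to by name, so the .$ prefix is not needed.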
In the next section, we will have a look at how to work with the mutate() function to compute and add a new variable to the dataset.
Compute and Add a New Variable to a Dataframe in R with mutate()
Here is how to compute and add a new variable (i.e., column) to a dataframe in R:
# insert new column with mutate
dataf <- dataf %>%
  rowwise() %>%
  mutate(DepressionIndex = mean(c_across(Depr1:Depr5))) %>%
  ungroup()
head(dataf)
Code language: R (r)
Notice how we, in the example code above, calculated a new variable called “DepressionIndex”, which is the mean of the 5 columns named Depr1 to Depr5. We used the mean() function to calculate the mean and the c_across() function to collect the values of these columns. Because c_across() is designed to work row by row, we first call rowwise() so that the mean is computed separately for each row, and then ungroup() to return to a regular dataframe. Note also that head() is called on its own line: piping into head() before assigning would overwrite dataf with only its first six rows.
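The same kind of per-row mean can also be computed in base R with rowMeans(), which may be handy when dplyr is not loaded. This is a minimal sketch on made-up data: the dataframe demo, the helper vector depr_cols, and the values are hypothetical, and only the column names Depr1 to Depr5 come from the example above.

```r
# Hypothetical item scores for three participants
demo <- data.frame(
  Depr1 = c(2, 3, 1),
  Depr2 = c(2, 4, 1),
  Depr3 = c(3, 3, 2),
  Depr4 = c(1, 4, 1),
  Depr5 = c(2, 3, 1)
)

# Column names Depr1, ..., Depr5
depr_cols <- paste0("Depr", 1:5)

# Mean of the five items for each row (participant)
demo$DepressionIndex <- rowMeans(demo[depr_cols])

demo$DepressionIndex
```

For these made-up values, the three row means are 2, 3.4, and 1.2.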
Now that you have added new columns to the dataframe, you may also want to rename factor levels in R with, e.g., dplyr. In the next section, however, we will add multiple columns to a dataframe.
How to Add Multiple Columns to the Dataframe in R
Here is how you would insert multiple columns into the dataframe using the add_column() function:
# Add multiple columns
dataf <- dataf %>%
  add_column(New_Column1 = "1st Column Added",
             New_Column2 = "2nd Column Added")
Code language: R (r)
In the example code above, we passed two name-value pairs (New_Column1 and New_Column2) to the add_column() function to append two columns to the dataframe. To see the first six rows of the dataframe with the added columns, we can run head(dataf).
Note, if you want to add multiple columns, you just add an argument, as we did above, for each column you want to insert. Again, it is important that each vector’s length is either one or the same as the number of rows in the dataframe. Otherwise, we will end up with an error. A more realistic example may be that we want to take the absolute value in R (of, e.g., one column) and add it to a new column. In the next example, however, we will add columns from one dataframe to another.
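To illustrate adding several full-length vectors at once, including the absolute value of an existing column, here is a minimal sketch. The tibble demo, the vectors a and b, and the column names Diff, Group, and Abs_Diff are all hypothetical.

```r
library(tibble)

# Hypothetical tibble with one numeric column
demo <- tibble(Diff = c(-2, 0.5, -1, 3))

# Two vectors, each with one value per row of demo
a <- c("low", "high", "low", "high")
b <- abs(demo$Diff)  # absolute value of an existing column

# One argument per new column
demo <- add_column(demo, Group = a, Abs_Diff = b)

demo
```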
Add Columns from One Dataframe to Another Dataframe
In this section, you will learn how to add columns from one dataframe to another. Here is how you append, e.g., three columns from one dataframe to another:
# Read data from the .xlsx files:
dataf <- read_excel('./SimData/add_column.xlsx')
dataf2 <- read_excel('./SimData/add_column2.xlsx')
# Add the columns from the second dataframe to the first
dataf3 <- cbind(dataf, dataf2[c("Anx1", "Anx2", "Anx3")])
Code language: R (r)
In the example above, we used the cbind() function together with brackets to select which columns we wanted to add. Note that cbind() combines the dataframes by position, so the rows must be in the same order in both. dplyr has the bind_cols() function, which can be used similarly. Now that you have put together your datasets, you can create dummy variables in R with, e.g., the fastDummies package or calculate descriptive statistics.
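Since cbind() and bind_cols() rely on row order, it may be safer to match rows on an identifier when one is available. The sketch below contrasts bind_cols() with left_join() from dplyr. The dataframes cog and anx, the ID column, and all values are hypothetical; whether the example files contain such an identifier is an assumption, not something stated in this post.

```r
library(dplyr)

# Hypothetical data: cognitive scores and anxiety items for the same people,
# stored in a different row order
cog <- data.frame(ID = c(1, 2, 3), Score = c(14, 9, 11))
anx <- data.frame(ID = c(3, 1, 2), Anx1 = c(4, 2, 5), Anx2 = c(3, 1, 5))

# bind_cols() pastes columns side by side and relies on row order,
# so the rows are sorted by ID first
anx_sorted <- anx[order(anx$ID), ]
side_by_side <- bind_cols(cog, anx_sorted[c("Anx1", "Anx2")])

# left_join() matches rows on the shared ID column, whatever their order
joined <- left_join(cog, anx, by = "ID")

joined
```

Both approaches give the same columns here, but only the join keeps working if the second dataframe is reordered or is missing a participant.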
Conclusion
In this post, you have learned how to add a column to a dataframe in R. Specifically, you have learned how to use the base functions available, as well as the add_column() function from tibble. Furthermore, you have learned how to use the mutate() function from dplyr to append a column. Finally, you have also learned how to add multiple columns and how to add columns from one dataframe to another.
I hope you learned something valuable. If you did, please share the tutorial on your social media accounts, add a link to it in your projects, or leave a comment below! Finally, suggestions and corrections are welcomed, also as comments below.
Other R Tutorials
Here, you will find some additional resources that you may find useful. The first three are especially interesting if you work with datetime objects (e.g., time-series data):
- How to Extract Year from Date in R with Examples with e.g. lubridate (Tidyverse)
- Learn How to Extract Day from Datetime in R with Examples with e.g. lubridate (Tidyverse)
- How to Extract Time from Datetime in R – with Examples
If you are interested in other useful functions and/or operators, these posts might be useful:
- How to use %in% in R: 7 Example Uses of the Operator
- Modulo in R: Practical Example using the %% Operator
- How to use the Repeat and Replicate functions in R
- How to Create a Matrix in R with Examples – empty, zeros