In this blog post, we will learn how to select columns from a data.table in R. The data.table package is widely used for its speed when handling large datasets. We will look at different ways to select columns: by variable name, by a character vector of names, multiple columns at once, and by index. Each section includes practical examples with explanations. Previously, we looked at selecting columns with dplyr using select(). While dplyr is intuitive, data.table is often a faster alternative for large datasets.
Table of Contents
- Selecting Columns from data.table by Variable Name
- Selecting Columns from data.table by Variable Using a Character Vector
- Selecting Multiple Columns in data.table
- Selecting Columns by Index in data.table
- Summary

Selecting Columns from data.table by Variable Name
One of the most common ways to select columns in data.table is by using variable names. Here is how:
library(data.table)
# Create a sample data.table
dt <- data.table(id = 1:5, name = c("A", "B", "C", "D", "E"), age = c(25, 30, 35, 40, 45))
# Select a single column
dt[, name]
# Select multiple columns
dt[, .(name, age)]
In the first example, dt[, name] returns a vector of names, while dt[, .(name, age)] returns a data.table with only the selected columns. Using .(column1, column2) ensures that the result remains a data.table rather than a vector.
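To make the difference between the two return types concrete, here is a small sketch that checks the class of each result. It rebuilds the sample table from above; the variable names name_vec and name_dt are only illustrative.

```r
library(data.table)

# Same sample data as in the example above
dt <- data.table(id = 1:5, name = c("A", "B", "C", "D", "E"), age = c(25, 30, 35, 40, 45))

# A bare column name in j returns the column's values as a plain vector
name_vec <- dt[, name]
is.character(name_vec)

# Wrapping the name in .() keeps the result as a one-column data.table
name_dt <- dt[, .(name)]
is.data.table(name_dt)

# .() is shorthand for list(), so both calls select the same columns
isTRUE(all.equal(dt[, .(name, age)], dt[, list(name, age)]))
```

If a later step expects a data.table (for example, further chaining with [ ]), the .() form is usually the safer choice even for a single column.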

Selecting Columns from data.table by Variable Using a Character Vector
If we have column names stored as a character vector, we can prefix the variable with .. so that data.table looks it up in the calling scope instead of treating it as a column name:
cols <- c("name", "age")
dt[, ..cols]
The next section looks at .SDcols, a related way to select columns from a character vector.
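As a quick check, the sketch below compares the .. prefix with the with = FALSE form, which also accepts a character vector of names. The variable names by_prefix and by_with are made up for this example.

```r
library(data.table)

dt <- data.table(id = 1:5, name = c("A", "B", "C", "D", "E"), age = c(25, 30, 35, 40, 45))

# Column names held in an ordinary R variable
cols <- c("name", "age")

# The .. prefix looks up `cols` outside the data.table
by_prefix <- dt[, ..cols]

# with = FALSE makes j a vector of column names instead of an expression
by_with <- dt[, cols, with = FALSE]

# Both forms return a data.table with the same two columns
names(by_prefix)
isTRUE(all.equal(by_prefix, by_with))
```

Which form to use is mostly a matter of taste; the .. prefix is shorter, while with = FALSE makes the intent explicit.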
Selecting Multiple Columns in data.table
Sometimes, we may want to select multiple columns dynamically. Here are a few ways to do it:
Using .SD
The .SD (Subset of Data) approach is useful when we need to select multiple columns while keeping the selection flexible.
# Select multiple columns dynamically
dt[, .SD, .SDcols = c("name", "age")]
.SDcols defines which columns to include in the .SD subset. This method is particularly useful when using external inputs to define column selections.
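Since .SDcols takes an ordinary character vector, the selection can be wrapped in a small function that receives the names from elsewhere. The sketch below uses a hypothetical helper, select_cols, which is not part of data.table; it checks the requested names first and then selects them with .SD.

```r
library(data.table)

dt <- data.table(id = 1:5, name = c("A", "B", "C", "D", "E"), age = c(25, 30, 35, 40, 45))

# Hypothetical helper (not a data.table function): select the requested
# columns and fail early if any of them does not exist in the table
select_cols <- function(x, wanted) {
  missing_cols <- setdiff(wanted, names(x))
  if (length(missing_cols) > 0) {
    stop("Unknown column(s): ", paste(missing_cols, collapse = ", "))
  }
  x[, .SD, .SDcols = wanted]
}

# The names could come from a config file or user input; here they are hard-coded
requested <- c("name", "age")
selected <- select_cols(dt, requested)
names(selected)
```

Validating the names up front is a design choice for this example: it gives a clearer error message than letting the selection itself fail.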
Using Indexing
In data.table, columns can also be selected using their position (index):
# Select columns by index
dt[, c(2, 3), with = FALSE]
Here, c(2, 3) refers to the second and third columns (name and age). The with = FALSE argument tells data.table to treat the vector as column positions (or names) to select, rather than as an expression to evaluate inside the table.
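The same idea applies when the positions are stored in a variable. The following sketch, which assumes the sample table from earlier and an illustrative variable idx, shows two ways to select by stored positions and confirms that they agree.

```r
library(data.table)

dt <- data.table(id = 1:5, name = c("A", "B", "C", "D", "E"), age = c(25, 30, 35, 40, 45))

# Column positions held in a variable
idx <- c(2, 3)

# with = FALSE: idx is used as a vector of column positions
by_with <- dt[, idx, with = FALSE]

# The .. prefix is an equivalent shorthand for a variable in the calling scope
by_prefix <- dt[, ..idx]

# Both results contain the second and third columns
names(by_with)
isTRUE(all.equal(by_with, by_prefix))
```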
Selecting Columns by Index in data.table
Selecting columns by index can be useful when we do not know column names in advance. Here is how:
# Select the first and third columns
dt[, .SD, .SDcols = c(1, 3)]
This method is similar to selecting by name but works with numeric indices. It is useful when we are working with dynamically changing datasets where column positions are known, but names may vary.
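A short sketch of a few position-based variations follows, again using the sample table. It also shows how positions can be translated back into names, which may be worth doing when the column order could change; that recommendation is a suggestion rather than something data.table requires.

```r
library(data.table)

dt <- data.table(id = 1:5, name = c("A", "B", "C", "D", "E"), age = c(25, 30, 35, 40, 45))

# First and third columns by position
first_and_third <- dt[, .SD, .SDcols = c(1, 3)]
names(first_and_third)

# A contiguous range of positions
second_to_third <- dt[, .SD, .SDcols = 2:3]

# The last column, whatever it is called
last_col <- dt[, .SD, .SDcols = ncol(dt)]

# Translate positions into names to see (or log) what was actually selected
picked_names <- names(dt)[c(1, 3)]
picked_names
```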
Summary
In this blog post, we learned how to select columns in data.table using different approaches:
- Selecting by variable name: dt[, name] and dt[, .(name, age)]
- Selecting with a character vector of names: dt[, ..cols]
- Selecting multiple columns using .SDcols: dt[, .SD, .SDcols = c("name", "age")]
- Selecting columns by index: dt[, c(2, 3), with = FALSE]
Using a data.table for column selection provides flexibility when working with large datasets. Do you use data.table for data manipulation? Share your thoughts and experiences in the comments below! If you found this post helpful, consider sharing it on social media so others can learn these techniques too.