How to Get Number of Rows in R Using data.table

Knowing how to get the number of rows in R is essential when working with large datasets. In this post, we will look at different ways to count rows, including using data.table. We have previously looked at how to select columns in data.table and how to filter data. Now, we will focus on a simple task: counting rows in data.frames and data.tables.

How to Get the Number of Rows using Base R

The absolute easiest way to get the number of rows in R is by using the nrow() function:

df <- data.frame(a = 1:5, b = 6:10)
nrow(df)

In the code chunk above, we created a simple data frame and used nrow() to count the rows. This method works for both data.frame and data.table objects.
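Base R also has a couple of equivalent ways to get the same count. The sketch below reuses the small data frame from above; the vector v is made up purely for illustration.

```r
# Same small data frame as in the example above
df <- data.frame(a = 1:5, b = 6:10)

n_rows <- nrow(df)      # number of rows
n_dim  <- dim(df)[1]    # dim() returns c(rows, columns); take the first element
n_big  <- NROW(df)      # like nrow(), but also works on plain vectors

# nrow() returns NULL for a plain vector, whereas NROW() treats it as one column
v <- c(10, 20, 30)      # hypothetical vector, for illustration only
n_vec <- NROW(v)

print(c(n_rows, n_dim, n_big, n_vec))
```

For data frames, all three approaches give the same answer, so nrow() remains the most readable choice.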

How to Get the Number of Rows in data.table

The nrow() function also works on data.table objects (data.table additionally provides the special symbol .N, which holds the number of rows and is used inside dt[...]):

library(data.table)

dt <- data.table(a = 1:5, b = 6:10)
nrow(dt)

In the code chunk above, we created a simple data.table with two columns (a and b) and five rows. We then used the nrow() function to get the number of rows in the data.table.
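The data.table package also provides the special symbol .N, which holds the number of rows and is used inside the square brackets. Here is a minimal sketch, assuming the data.table package is installed; the grouping column name b_is_even is invented for this example.

```r
library(data.table)

dt <- data.table(a = 1:5, b = 6:10)

# .N holds the number of rows; here it is evaluated in the j slot of dt[...]
n_all <- dt[, .N]

# Combine with a filter in i: count only rows where a is greater than 2
n_filtered <- dt[a > 2, .N]

# Combine with by: count rows per group (here, whether b is even)
n_by_group <- dt[, .N, by = .(b_is_even = b %% 2 == 0)]

print(n_all)
print(n_filtered)
print(n_by_group)
```

The main appeal of .N over nrow() is that it combines with filtering and grouping in a single expression.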

data.table vs. Base R: Which Counts Rows Quickest?

Now we have seen that we can use nrow() on both a data.frame and a data.table. But why use data.table? Here, we can test which one counts rows faster:

library(data.table)
library(microbenchmark)
set.seed(20250326)

# Create large datasets (each object holds 2 x 2e8 doubles, roughly 3.2 GB;
# lower 2e8 if memory is limited)
df <- data.frame(a = runif(2e8), b = runif(2e8))
dt <- as.data.table(df)

# Benchmark nrow() on the data.frame vs. the data.table
microbenchmark(
    base_r = nrow(df),
    data_table = nrow(dt),
    times = 10
)

In the code chunk above, we created two datasets: one as a data.frame (df) and the other as a data.table (dt). We used microbenchmark to compare the time it takes to count rows using nrow() in Base R for the data.frame and data.table. The function nrow() is applied to both objects, and we measure the time for 10 iterations.

Results

The table below shows the benchmarking results of counting rows for both Base R and data.table:

Expression   Min (ns)  LQ (ns)  Mean (ns)  Median (ns)  UQ (ns)  Max (ns)  Evaluations  Cld
base_r           1400     1400       4810         1950     2200     31800           10  a
data_table        800      800       1930          900     1100     11000           10  a
Results from the benchmarking (values are in nanoseconds). In this run, nrow() on the data.table had lower timings than on the data.frame.

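As we can see from the table, nrow() on a data.table is slightly faster than on a data.frame in this run (a median of 900 ns versus 1950 ns), but the difference is about a microsecond and both expressions share the same letter in the Cld column, so it is minor in practice. Note that nrow() looks up the stored dimensions of the object rather than scanning the data, so this gap is unlikely to widen much as the number of rows increases. You are more likely to see a considerable difference in heavier operations such as filtering, grouping, and joining.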
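One way to check how these timings behave as the data grows is to repeat the benchmark at several sizes. The following is a minimal sketch, assuming data.table and microbenchmark are installed. The sizes are arbitrary and kept small so it runs quickly, and the extra dt_N expression (counting with .N) is added for comparison only.

```r
library(data.table)
library(microbenchmark)

# Hypothetical sizes, kept small so the sketch runs quickly
sizes <- c(1e3, 1e5, 1e6)

results <- lapply(sizes, function(n) {
  df <- data.frame(a = runif(n), b = runif(n))
  dt <- as.data.table(df)
  mb <- microbenchmark(
    base_r     = nrow(df),
    data_table = nrow(dt),
    dt_N       = dt[, .N],
    times = 10
  )
  # summary() gives one row per expression; keep the median timing in ns
  s <- summary(mb, unit = "ns")
  data.frame(rows = n, expr = as.character(s$expr), median_ns = s$median)
})

# One row per size/expression combination
timings <- do.call(rbind, results)
print(timings)
```

Timings at this scale are noisy and depend on the machine, so it is worth running the sketch a few times before drawing conclusions.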

Conclusion

In this post, we first learned how to use nrow() to count the number of rows in a data frame in R. We then learned how to perform the same task on a data.table. Finally, we ran a speed comparison between nrow() on a data.frame and nrow() on a data.table, finding that the data.table was slightly quicker in our benchmark, although the difference was minor.

If you found this post helpful, consider sharing it on social media to spread the knowledge! I would love to hear your thoughts, so feel free to comment below with any questions or insights.
