Home > Software engineering >  How to find column wise sum of 10^n values and store into another dataframe in R?
How to find column wise sum of 10^n values and store into another dataframe in R?

Time:11-24

Given a dataset like this with 200,000 rows:

  Date          Value1  Value2
1 2019-01-01     110    310
2 2019-01-02     70    200
3 2019-01-03     64    115
4 2019-01-04     50    80
5 2019-01-05     70    90
6 2019-01-06     30    49
... 

How can I create a dataframe like this for each Type

  No_of_rows  Value1                       Value2
1 10         (sum of first 10 values)   (sum of first 10 values)
1 100        (sum of first 100 values)  (sum of first 100 values)

then 1,000, 10,000 and so on

I did some finding and came up with this command as.numeric(apply(df[,1:2], 2, sum)), which apparently finds the sum of two columns, but I don't know how to do it with for specific number of rows (10^n, where n=1,...,4) and store it for each column in new dataframe.

Any guidance please?

CodePudding user response:

Here's a solution based on the tidyverse.

library(tidyverse)

# For reproducibility
set.seed(12345)

# Define some test data
df <- tibble(
        Value1=floor(rnorm(200000, 500, 100)), 
        Value2=floor(rnorm(200000, 500, 100))
      )

lapply(
  # For powers of 10
  1:4,
  # Apply this function
  function(n) {
    df %>% 
      # Take the first 10^n rows
      head(10^n) %>% 
      # Calculate column means
      summarise(
        Value1=mean(Value1),
        Value2=mean(Value2),
        .groups="drop"
      ) %>% 
      # Add the index column
      add_column(No_of_rows=10^n, .before=1)
  }
) %>% 
# Concatenate the results
bind_rows()
# A tibble: 4 × 3
  No_of_rows Value1 Value2
       <dbl>  <dbl>  <dbl>
1         10   486.   484.
2        100   524    497.
3       1000   504.   502.
4      10000   499.   500.

  • Related