Add All Values In A Data Frame Column Where Each Cell Is Another Vector

Time:12-15

I have a data frame structured like so:

MonthYear                 Total
 01/2020           c(1, 1, 1, 1, 1...)
 02/2020           c(2, 14, 6, 12, 91...)
   ...                     ...

How can I sum all the values in each vector and store the result back in the data frame? For example, if the first vector in Total sums to 100, how can I compute that sum and store it in my data frame?

So far, I have tried an aggregation, but it threw an error. My aggregation looked like this: aggregate(df$total ~ df$MonthYear, df, FUN = sum). However, when I ran it, I got a Null Variable error for the vector column, and I am not sure how to fix it.

CodePudding user response:

Assuming Total is a list of numeric vectors, we can get the sum of values by applying the sum function to each element of the list:

library(tidyverse)

tibble(
  MonthYear = c("01/2020", "02/2020"),
  Total = list(rep(1, 100), c(2,14,6,12,91))
) %>%
  mutate(Total = map_dbl(Total, sum))

# A tibble: 2 x 2
  MonthYear Total
  <chr>     <dbl>
1 01/2020     100
2 02/2020     125
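
The approach above rests on the assumption that Total is a list column of numeric vectors. A minimal base R sketch for checking that before summing could look like the following; df_check and its values are made up for illustration and are not the asker's data:

```r
# Hypothetical data frame with a list column, built for illustration only
df_check <- data.frame(MonthYear = c("01/2020", "02/2020"))
df_check$Total <- list(rep(1, 100), c(2, 14, 6, 12, 91))

# TRUE if the column is stored as a list
is_list_col <- is.list(df_check$Total)

# TRUE if every cell of the column holds a numeric vector
all_numeric <- all(vapply(df_check$Total, is.numeric, logical(1)))

is_list_col && all_numeric
```

If either check comes back FALSE, the column probably needs converting before sum can be applied to each cell.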

CodePudding user response:

Here is a simple base R solution too.

df$Total <- lapply(lapply(df$Total, unlist), sum)

Output

  MonthYear    Total
1   01/2020 20.50097
2   02/2020 352.6011
3   03/2020 112.8006

Data

df <- structure(list(
  MonthYear = c("01/2020", "02/2020", "03/2020"),
  Total = list(
    c(
      2.35268858028576, 0.749567511957139, 9.35646147234365, 1.57167825149372, 6.47057350026444
    ),
    c(
      19.7490084229503,  6.70629078173079, 49.3116769392509, 12.3336209286936, 29.0551994147245,
      40.7369665871374, 54.9186405551154, 73.569026516052, 22.1695927530527, 44.051097095944
    ),
    c(
      8.96286227740347, 14.6749551640823, 14.2950962111354, 16.624439577572, 17.731322417967,
      19.0528914798051, 11.0123344371095, 10.4467153223231
    )
  )
),
row.names = c(NA, -3L),
class = "data.frame")
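
A possible variation on the base R answer: lapply returns a list, so Total remains a list column after the assignment. If a plain numeric column is wanted, vapply with a numeric(1) template is one option. This is only a sketch; df2 and its values are made up for illustration and are not the data above:

```r
# Hypothetical example data: a list column of numeric vectors
df2 <- data.frame(MonthYear = c("01/2020", "02/2020"))
df2$Total <- list(rep(1, 100), c(2, 14, 6, 12, 91))

# vapply() applies sum() to each cell and requires each result
# to be a single numeric value, so the column becomes a numeric vector
df2$Total <- vapply(df2$Total, sum, numeric(1))

str(df2$Total)
```

vapply is used here rather than sapply because it fails with an error if any cell does not reduce to a single number, instead of silently returning a different type.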