Home > database >  Appending a column to each data frame within a list
Appending a column to each data frame within a list

Time:04-11

I have a list of dataframes and want to append a new column to each, however I keep getting various error messages. Can anybody explain why the below code doesn't work for me? I'd be happy if rowid_to)column works as the data in my actual set is alright ordered correctly, otherwise i'd like a new column with a list going from 1:length(data$data)

##dataset
data<- tibble(Location = c(rep("London",6),rep("Glasgow",6),rep("Dublin",6)),
              Day= rep(seq(1,6,1),3),
              Average = runif(18,0,20),
              Amplitude = runif(18,0,15))%>%
  nest_by(Location)

###map   rowid_to_column
attempt1<- data%>%
  map(.,rowid_to_column(.,var = "hour"))

##mutate
attempt2<-data %>%
  map(., mutate("Hours" = 1:6))

###add column
attempt3<- data%>%
  map(.$data,add_column(.data,hours = 1:6))

newcolumn<- 1:6

###lapply
attempt4<- lapply(data,cbind(data$data,newcolumn))

Many thanks, Stuart

CodePudding user response:

You were nearly there with your base R attempt, but you want to iterate over data$data, which is a list of data frames.

data$data  <- lapply(data$data, function(x) {
    hour  <- seq_len(nrow(x))
    cbind(x, hour)
})

data$data

# [[1]]
#   Day   Average Amplitude hour
# 1   1  6.070539  1.123182    1
# 2   2  3.638313  8.218556    2
# 3   3 11.220683  2.049816    3
# 4   4 12.832782 14.858611    4
# 5   5 12.485757  7.806147    5
# 6   6 19.250489  6.181270    6

Edit: Updated as realised it was iterating over columns rather than rows. This approach will work if the data frames have different numbers of rows, which the methods with the vector defined as 1:6 will not.

CodePudding user response:

a data.table approach

library(data.table)
setDT(data)
data[, data := lapply(data, function(x) cbind(x, new_col = 1:6))]

data$data
# [[1]]
#   Day   Average  Amplitude test new_col
# 1   1 11.139917  0.3690539    1       1
# 2   2  5.350847  7.0925508    2       2
# 3   3  9.602104  6.1782818    3       3
# 4   4 14.866074 13.7356913    4       4
# 5   5  1.114201  1.1007080    5       5
# 6   6  2.447236  5.9944926    6       6
# 
# [[2]]
#   Day   Average Amplitude test new_col
# 1   1 17.230213 13.966576    1       1
# .....

CodePudding user response:

A purrr approach:

data<- tibble(Location = c(rep("London",6),rep("Glasgow",6),rep("Dublin",6)),
              Day= rep(seq(1,6,1),3),
              Average = runif(18,0,20),
              Amplitude = runif(18,0,15))%>%
  group_split(Location) %>% 
  purrr::map_dfr(~.x %>% mutate(Hours = c(1:6)))

If you want to use your approach and preserve the same data structure, this is a way again using purrr (you need to ungroup, otherwise it will not work due to the rowwise grouping)

data %>% ungroup() %>%
  mutate_at("data", .f = ~map(.x, ~.x %>% mutate(Hours = c(1:6))) )
  • Related