I have a list of data frames like:
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
df1 <- data.frame(Name, Age)
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
df2 <- data.frame(Name, Age)
list <- list(df1, df2)
I want to create a consecutive ID that runs through all data frames. My desired output should look like this:
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
ID <- c(1:5)
df1 <- data.frame(Name, Age, ID)
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
ID <- c(5:9)
df2 <- data.frame(Name, Age, ID)
list <- list(df1, df2)
CodePudding user response:
(I named it list1 instead of list, so as not to confuse variables and functions :-)
I'm assuming df2 should start at nrow(df1) + 1, not at nrow(df1).
lens <- sapply(list1, nrow)
list1 <- Map(function(X, fm, len) transform(X, ID = fm + seq_len(len)),
             list1, c(0, lens[-length(lens)]), lens)
list1
# [[1]]
#    Name Age ID
# 1   Jon  23  1
# 2  Bill  41  2
# 3 Maria  32  3
# 4   Ben  58  4
# 5  Tina  26  5
#
# [[2]]
#    Name Age ID
# 1   Jon  23  6
# 2  Bill  41  7
# 3 Maria  32  8
# 4   Ben  58  9
# 5  Tina  26 10
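One thing to watch: with more than two data frames, each offset has to be the total number of rows in all earlier data frames, not just the previous one. A minimal sketch of that generalization, assuming a cumulative sum over the row counts (the three example data frames and the name offsets are made up for illustration):

```r
# Hypothetical example data: three data frames of different sizes
list1 <- list(
  data.frame(Name = c("Jon", "Bill"), Age = c(23, 41)),
  data.frame(Name = c("Maria", "Ben", "Tina"), Age = c(32, 58, 26)),
  data.frame(Name = "Ann", Age = 35)
)

# Number of rows in each data frame
lens <- vapply(list1, nrow, integer(1))

# Offset of each data frame = total rows in all earlier data frames
offsets <- cumsum(c(0L, lens[-length(lens)]))

# Add the running ID to every data frame
list1 <- Map(function(X, off, len) transform(X, ID = off + seq_len(len)),
             list1, offsets, lens)

# Collect all IDs to check that they run consecutively
all_ids <- unlist(lapply(list1, function(d) d$ID))
all_ids
```

For a list of exactly two data frames this gives the same result as the code above.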
CodePudding user response:
IIUC, this should do:
library(dplyr)
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
df1 <- data.frame(Name, Age) %>% mutate(origin = 'df1')
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
df2 <- data.frame(Name, Age) %>% mutate(origin = 'df2')
list <- bind_rows(df1, df2) %>% mutate(ID = row_number()) %>% group_split(origin)
Output:
[[1]]
# A tibble: 5 × 4
  Name    Age origin    ID
  <fct> <dbl> <chr>  <int>
1 Jon      23 df1        1
2 Bill     41 df1        2
3 Maria    32 df1        3
4 Ben      58 df1        4
5 Tina     26 df1        5

[[2]]
# A tibble: 5 × 4
  Name    Age origin    ID
  <fct> <dbl> <chr>  <int>
1 Jon      23 df2        6
2 Bill     41 df2        7
3 Maria    32 df2        8
4 Ben      58 df2        9
5 Tina     26 df2       10
You could obviously drop the origin column if you don't need it.
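If the data frames already sit in a list, a possible variant is to let bind_rows() create the origin column through its .id argument and to drop it again while splitting. This is only a sketch: the example data is made up, group_split() is marked experimental in dplyr, and the as.integer() conversion is there on the assumption that the list is unnamed, so that the pieces come back in their original order even with ten or more data frames.

```r
library(dplyr)

# Hypothetical example data
df1 <- data.frame(Name = c("Jon", "Bill"), Age = c(23, 41))
df2 <- data.frame(Name = c("Maria", "Ben", "Tina"), Age = c(32, 58, 26))
list1 <- list(df1, df2)

result <- bind_rows(list1, .id = "origin") %>%   # origin = position in the list
  mutate(origin = as.integer(origin),            # numeric, so groups sort 1, 2, ..., 10
         ID = row_number()) %>%                  # running ID over all rows
  group_split(origin, .keep = FALSE)             # split again and drop origin

result
```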
Any reason why the second ID starts at 5 and not 6 in your example?