Taking the max value in columns that contain vectors in a dataframe

Time:06-29

I have a dataframe like the one below:

ID  Name                      R1              R2             R3
A1  c("Rob","Rob")            27              29             100
A2  c("Emil","Emil","Emil")   c("1000","26")  c("70","26")   c("100","80")
A3  c("Nick","Nick","Nick")   c("123","26")   c("567","80")  c("93","80")

I also tried to generate it, but without success:

df<-tibble(ID=c("A1",            "A2",                       "A3"),
           Name=c(c("Rob","Rob") ,c("Emil","Emil","Emil"),c("Nick","Nick","Nick")),
           R1=c(c("1000","26") , c("70","26") ,c("100","80")), 
           R2= c(c("123","26") , c("567","80") ,c("93","80")))

I want to take the max value of every cell that contains multiple numeric values, so the result should look like this:

ID  Name    R1    R2   R3
A1  "Rob"   27    29   100
A2  "Emil"  1000  70   100
A3  "Nick"  123   567  93

To unify the Name column I used the code below, but I have no idea how to handle the numeric values. Any suggestions?

df$Name<-lapply(df$Name, unique) 

CodePudding user response:

We can do this as follows:

library(dplyr)
library(purrr)
library(tidyr)
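df %>% 
  # reduce every list element in R1:R3 to its maximum (one double per row)
  mutate(across(R1:R3, ~ map_dbl(.x, max))) %>%
  # expand Name to one row per element, then drop the duplicated rows
  unnest(Name) %>% 
  distinct()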

-output

# A tibble: 3 × 5
  ID    Name     R1    R2    R3
  <chr> <chr> <dbl> <dbl> <dbl>
1 A1    Rob    1000   123   100
2 A2    Emil     70   567   100
3 A3    Nick    100    93    93
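
One caveat worth sketching: the printout in the question shows the numbers quoted, e.g. c("1000","26"). If the list columns really hold character strings, max() compares them alphabetically and map_dbl() would fail on the character result. The snippet below is a minimal sketch under that assumption; df_chr and its values are hypothetical and only mimic the question's table.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical data: numbers stored as strings, as in the question's printout
df_chr <- tibble(
  ID = c("A2", "A3"),
  R1 = list(c("1000", "26"), c("123", "26"))
)

# On strings, max() is alphabetical: "26" sorts after "1000"
chr_max <- max(c("1000", "26"))

# Converting to numeric before taking the max gives the intended result
res <- df_chr %>%
  mutate(across(R1, ~ map_dbl(.x, function(v) max(as.numeric(v)))))
```

With numeric list columns, as in the data used for this answer, the extra as.numeric() is harmless but not needed.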

data

df <- structure(list(ID = c("A1", "A2", "A3"), Name = list(c("Rob", 
"Rob"), c("Emil", "Emil", "Emil"), c("Nick", "Nick", "Nick")), 
    R1 = list(c(1000, 26), c(70, 26), c(100, 80)), R2 = list(
        c(123, 26), c(567, 80), c(93, 80)), R3 = list(100, c(100, 
    80), c(93, 80))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L))
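
The same data can probably be written more readably with tibble() and list(). The attempt in the question does not work because c() flattens nested vectors into one long vector, so the columns no longer have one element per ID. The following is a sketch of the equivalent constructor, using the values from the structure() call above.

```r
library(tibble)

# c() flattens: the two name vectors become one vector of length 5
flat_len <- length(c(c("Rob", "Rob"), c("Emil", "Emil", "Emil")))

# list() keeps one vector per row, which is what a list column needs
df <- tibble(
  ID   = c("A1", "A2", "A3"),
  Name = list(c("Rob", "Rob"), c("Emil", "Emil", "Emil"), c("Nick", "Nick", "Nick")),
  R1   = list(c(1000, 26), c(70, 26), c(100, 80)),
  R2   = list(c(123, 26), c(567, 80), c(93, 80)),
  R3   = list(100, c(100, 80), c(93, 80))
)
```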

CodePudding user response:

A data.table option

> library(data.table)

> setDT(df)[, lapply(.SD, function(x) max(unlist(x))), ID]
   ID Name   R1  R2  R3
1: A1  Rob 1000 123 100
2: A2 Emil   70 567 100
3: A3 Nick  100  93  93
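
For comparison, the same per-cell maximum can be sketched in base R without extra packages. This is only an illustration under the assumption that every list column should be collapsed with max(); the names list_cols and out are made up for the example.

```r
# Build a plain data.frame with list columns (same values as the data above)
df <- data.frame(ID = c("A1", "A2", "A3"))
df$Name <- list(c("Rob", "Rob"), c("Emil", "Emil", "Emil"), c("Nick", "Nick", "Nick"))
df$R1   <- list(c(1000, 26), c(70, 26), c(100, 80))
df$R2   <- list(c(123, 26), c(567, 80), c(93, 80))
df$R3   <- list(100, c(100, 80), c(93, 80))

# Find the list columns, then replace each one by the max of every element
list_cols <- names(df)[sapply(df, is.list)]
out <- df
out[list_cols] <- lapply(df[list_cols], function(col) sapply(col, max))
```

Because max() on repeated strings such as c("Rob", "Rob") simply returns "Rob", the Name column collapses the same way as in the data.table call.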