Insert missing value based on other rows

Time:11-08

I have a dataframe similar to the one below (just longer). My goal is to copy the username from row [2] to row [8] and from row [5] to row [7]. I know it's odd to create duplicates, but there's a reason.

I've been trying to solve it with ifelse:

df$Username <- ifelse(df$Name == df$Name, df$Username, NA)

But it doesn't work. I believe the solution is rather simple, but I can't find a suitable function anywhere on Stack Overflow. Thanks in advance for any help.

# A tibble: 8 x 2
  Name           Username       
  <chr>          <chr>          
1 ZiadAboultaif  ziad_aboultaif 
2 ScottAitchison ScottAAitchison
3 DanAlbas       DanAlbas       
4 JohnAldag      jwaldag        
5 OmarAlghabra   OmarAlghabra   
6 ShafqatAli     Shafqat_Ali_1  
7 OmarAlghabra   NA
8 ScottAitchison NA

# Reproducible data (first six rows only):
df <- structure(list(Name = c("ZiadAboultaif", "ScottAitchison", "DanAlbas", 
"JohnAldag", "OmarAlghabra", "ShafqatAli"), Username = c("ziad_aboultaif", 
"ScottAAitchison", "DanAlbas", "jwaldag", "OmarAlghabra", "Shafqat_Ali_1"
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))

CodePudding user response:

I'd use dplyr:

library(dplyr)

# For each Name, keep the non-missing Username on every row of the group
df %>% group_by(Name) %>% 
   mutate(Username = max(Username, na.rm = TRUE))

Output:

  Name           Username       
  <chr>          <chr>          
1 ZiadAboultaif  ziad_aboultaif 
2 ScottAitchison ScottAAitchison
3 DanAlbas       DanAlbas       
4 JohnAldag      jwaldag        
5 OmarAlghabra   OmarAlghabra   
6 ShafqatAli     Shafqat_Ali_1  
7 OmarAlghabra   OmarAlghabra   
8 ScottAitchison ScottAAitchison

Within each Name group, max(Username, na.rm=TRUE) drops the NAs and returns the remaining username, and mutate then assigns it to every row of that group.
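
For comparison, here is a minimal base R sketch of the same idea that does not need dplyr. It assumes the 8-row data frame shown in the question (rebuilt here by hand, since the posted data only has six rows) and that each Name has at most one known username. The helper names known and missing are just illustrative.

```r
# Rebuild the 8-row example from the question (rows 7 and 8 lack a username).
df <- data.frame(
  Name = c("ZiadAboultaif", "ScottAitchison", "DanAlbas", "JohnAldag",
           "OmarAlghabra", "ShafqatAli", "OmarAlghabra", "ScottAitchison"),
  Username = c("ziad_aboultaif", "ScottAAitchison", "DanAlbas", "jwaldag",
               "OmarAlghabra", "Shafqat_Ali_1", NA, NA),
  stringsAsFactors = FALSE
)

# Lookup table: only the rows that already have a username.
known <- df[!is.na(df$Username), ]

# For every row with a missing username, take the username of the first
# known row with the same Name (match() returns NA if there is none).
missing <- is.na(df$Username)
df$Username[missing] <- known$Username[match(df$Name[missing], known$Name)]

print(df)
```

If a Name never appears with a username, match() returns NA and that row simply stays NA.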

CodePudding user response:

A possible solution:

library(tidyverse)

df <- structure(list(Name = c("ZiadAboultaif", "ScottAitchison", "DanAlbas", 
"JohnAldag", "OmarAlghabra", "ShafqatAli"), Username = c("ziad_aboultaif", 
"ScottAAitchison", "DanAlbas", "jwaldag", "OmarAlghabra", "Shafqat_Ali_1"
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))

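# Append the two rows that are missing a username
df <- rbind(df, c("OmarAlghabra", NA), c("ScottAitchison", NA))

# Within each Name, carry the last known Username down into the NA rows
df %>% 
  group_by(Name) %>% 
  fill(Username) %>% 
  ungroup()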

#> # A tibble: 8 × 2
#>   Name           Username       
#>   <chr>          <chr>          
#> 1 ZiadAboultaif  ziad_aboultaif 
#> 2 ScottAitchison ScottAAitchison
#> 3 DanAlbas       DanAlbas       
#> 4 JohnAldag      jwaldag        
#> 5 OmarAlghabra   OmarAlghabra   
#> 6 ShafqatAli     Shafqat_Ali_1  
#> 7 OmarAlghabra   OmarAlghabra   
#> 8 ScottAitchison ScottAAitchison
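
Note that fill() fills downwards by default, so the approach above relies on the known username appearing before the missing one. If that order is not guaranteed, the .direction argument of tidyr::fill can help. The following is a small sketch on made-up data (df2 and filled are hypothetical names, not part of the question) where the NA row comes first:

```r
library(dplyr)
library(tidyr)

# Hypothetical data: the missing username comes BEFORE the known one.
df2 <- tibble(
  Name     = c("OmarAlghabra", "DanAlbas", "OmarAlghabra"),
  Username = c(NA, "DanAlbas", "OmarAlghabra")
)

# "downup" fills downwards first, then upwards, within each Name group,
# so leading NAs are filled as well.
filled <- df2 %>%
  group_by(Name) %>%
  fill(Username, .direction = "downup") %>%
  ungroup()

print(filled)
```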

CodePudding user response:

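We could use add_row (from tibble, re-exported by dplyr) to append copies of rows 5 and 2 of the original six-row df after row 6:

dplyr::add_row(df, df[c(5, 2), ], .after = 6)

Output:
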
  Name           Username       
  <chr>          <chr>          
1 ZiadAboultaif  ziad_aboultaif 
2 ScottAitchison ScottAAitchison
3 DanAlbas       DanAlbas       
4 JohnAldag      jwaldag        
5 OmarAlghabra   OmarAlghabra   
6 ShafqatAli     Shafqat_Ali_1  
7 OmarAlghabra   OmarAlghabra   
8 ScottAitchison ScottAAitchison