Home > Software design >  If value exists in another column, take value from the original column in the same row, if not nothi
If value exists in another column, take value from the original column in the same row, if not nothi

Time:06-27

Example data

library(tidyverse)

df <- tibble(
  value = c(10, 50, 80, 90, 45),
  value_1 = c(90, 20, 55, 30, 15) 
)

# A tibble: 5 × 2
  value value_1
  <dbl>   <dbl>
1    10      90
2    50      20
3    80      55
4    90      30
5    45      15

90 in value_1 exists in the column value. I want to mutate another column result that takes value_1 in that identified row where value_1 is matched with value. If the data does not exist in the column value, simply result would equal to value_1

Such that the output would look like:

# A tibble: 5 × 3
  value value_1 result
  <dbl>   <dbl>  <dbl>
1    10      90     30
2    50      20     20
3    80      55     55
4    90      30     30
5    45      15     15

What I have tried so far is mutate with case_when but I do not know how to incorporate which or whatever to complete the script

df %>%  
  mutate(result = case_when(value_1 %in% value ~ 1, 
                            TRUE ~ 0))

# A tibble: 5 × 3
  value value_1 result
  <dbl>   <dbl>  <dbl>
1    10      90      1
2    50      20      0
3    80      55      0
4    90      30      0
5    45      15      0

CodePudding user response:

Using match you could do:

library(dplyr)
df |> 
  mutate(result = value_1[match(value_1, value)],
         result = coalesce(result, value_1))
#> # A tibble: 5 × 3
#>   value value_1 result
#>   <dbl>   <dbl>  <dbl>
#> 1    10      90     30
#> 2    50      20     20
#> 3    80      55     55
#> 4    90      30     30
#> 5    45      15     15

or using a left_join:

df |> 
  left_join(rename(df, result = value_1), by = c("value_1" = "value")) |> 
  mutate(result = coalesce(result, value_1))
#> # A tibble: 5 × 3
#>   value value_1 result
#>   <dbl>   <dbl>  <dbl>
#> 1    10      90     30
#> 2    50      20     20
#> 3    80      55     55
#> 4    90      30     30
#> 5    45      15     15

CodePudding user response:

You can use this function fn

fn <- function(x){
    y <- which(df$value_1[x] == df$value)
    if(length(y) > 0) return(df$value_1[y])
    else return(df$value_1[x])
}

df$result <- sapply(1:nrow(df) , fn)

df

output

# A tibble: 5 × 3
  value value_1 result
  <dbl>   <dbl>  <dbl>
1    10      90     30
2    50      20     20
3    80      55     55
4    90      30     30
5    45      15     15

CodePudding user response:

Where you have 1, use value_1

But, you expected result makes no sense for what you describe.

  • Related