Home > Blockchain >  Using dplyr - how can I create a new category for one column when another column has duplicates?
Using dplyr - how can I create a new category for one column when another column has duplicates?

Time:11-23

I have a dataframe of coordinates for different studies that have been conducted. The studies are either experiment or observation however at some locations both experiment AND observation occur. For these sites, I would like to create a new study category called both. How can I do this using dplyr?

Example Data

df1 <- data.frame(matrix(ncol = 4, nrow = 6))
colnames(df1)[1:4] <- c("value", "study", "lat","long")
df1$value <- c(1,1,2,3,4,4)
df1$study <- rep(c('experiment','observation'),3)
df1$lat <- c(37.541290,37.541290,38.936604,29.9511,51.509865,51.509865)
df1$long <- c(-77.434769,-77.434769,-119.986649,-90.0715,-0.118092,-0.118092)
df1

  value       study      lat        long
1     1  experiment 37.54129  -77.434769
2     1 observation 37.54129  -77.434769
3     2  experiment 38.93660 -119.986649
4     3 observation 29.95110  -90.071500
5     4  experiment 51.50986   -0.118092
6     4 observation 51.50986   -0.118092

Note that the value above is duplicated when study has experiment AND observation.

The ideal output would look like this

  value       study      lat        long
1     1        both 37.54129  -77.434769
2     2  experiment 38.93660 -119.986649
3     3 observation 29.95110  -90.071500
4     4        both 51.50986   -0.118092

CodePudding user response:

We can replace those 'value' cases where both experiment and observation is available to 'both' and get the distinct

library(dplyr)
df1 %>% 
  group_by(value) %>%
  mutate(study = if(all(c("experiment", "observation") %in% study)) 
      "both" else study) %>% 
  ungroup %>% 
  distinct

-output

# A tibble: 4 × 4
  value study         lat     long
  <dbl> <chr>       <dbl>    <dbl>
1     1 both         37.5  -77.4  
2     2 experiment   38.9 -120.   
3     3 observation  30.0  -90.1  
4     4 both         51.5   -0.118
  • Related