I'm still little new to R and was wondering if I there was an easy way of manipulating the below df to achieve df2.
I'm not sure if it's a loop that is supposed to be used for this, but basically I want to take a distinct on each V(X)_ID column and it's corresponding V(X)_No column. If the V(X)_ID is repeated in further columns, I would like to ignore it and move the next unique V(X)_ID and its associated V(X)_No.
V1_ID <- c('AUD','CAD','NZD','USD',NA,NA,NA)
V1_No <- c(3.43,2.42,1.58,9.9,NA,NA,NA)
V2_ID <- c('TRY','AUD','EUR','SPY','TRY','BTC',NA)
V2_No <- c(8.4,2.4,6.8,1.2,9.8,9.8,NA)
V3_ID <- c('JPY','EUR','NZD','AUD','SPY','NA',NA)
V3_No <- c(1.8,8.6,4.4,2.1,9.6,NA,NA)
V4_ID <- c('GBP','TRY','HKD','SKD','USD','NZD','CAD')
V4_No <- c(1.3,4.6,7.9,8.5,2.6,7.4,9.1)
df <- data.frame(V1_ID,V1_No,V2_ID,V2_No,V3_ID,V3_No,V4_ID,V4_No)
ID <- c('AUD','CAD','NZD','USD','TRY','EUR','SPY','BTC','JPY','GBP','SKD')
No <- c(3.43,2.42,1.58,9.9,8.4,6.8,1.2,9.8,1.8,1.3,8.5)
df2 <- data.frame(ID,No)
Your assistance is much appreciated as I have 387 of these types of columns in this type of format and approaching it from a manual standpoint is very draining, thus am looking for hopefully a more elegant solution.
Thanks
CodePudding user response:
library(tidyr)
library(dplyr)
df |>
pivot_longer(everything(), names_to = c("set", ".value"), names_sep = "_") |>
distinct(ID, .keep_all = TRUE) |>
drop_na() |>
select(-set)
# A tibble: 12 × 2
ID No
<chr> <dbl>
1 AUD 3.43
2 TRY 8.4
3 JPY 1.8
4 GBP 1.3
5 CAD 2.42
6 EUR 8.6
7 NZD 1.58
8 HKD 7.9
9 USD 9.9
10 SPY 1.2
11 SKD 8.5
12 BTC 9.8
There is a twelfth row though that appears that doesn't match with your desired output, and I don't know what logical step I'm missing.