I want to convert a dataframe (left) to another (right), but I have no idea to do this task. Briefly, the nearest value in A
was retained and merged with B
. Please take your time to help me! Thank you so much!
The original dataframe and converted dataframe (highlighted background)
Here is the original data.
df1 = data.frame(
ID = c(1,1,1,1,1,1,1,1,2,2,2,2,2),
TIME = c(-2, -1, 0, 11, 13, 24, 28, 36, -3, 0, 12, 13, 24),
A = c(30, 50, NA, 80, NA, NA, 30, NA, 20, NA, NA, 80, NA),
B = c(NA, NA, 2.0, NA, 2.5, 2.5, NA, 1.0, NA, 1.0, 1.0, NA, 2.5)
)
CodePudding user response:
library(tidyverse)
df <- df1 %>%
fill(A, .direction = c("down")) %>% # fill missing data in A
drop_na() %>%
rename(B = A, A = B) %>% # swap the names of columns A and B
select(ID, TIME, A, B) # select the order of columns
CodePudding user response:
Here is a solution. It keeps the rows with non-missing values in B
after filling the values in A
with the previous value. Then it's a matter of rearranging and renaming the columns.
library(dplyr)
library(tidyr)
df1 %>%
group_by(ID) %>%
fill(A) %>%
ungroup() %>%
na.omit() %>%
rename(B = A, A = B) %>%
relocate(B, .after = A)
## A tibble: 7 x 4
# ID TIME A B
# <dbl> <dbl> <dbl> <dbl>
#1 1 0 2 50
#2 1 13 2.5 80
#3 1 24 2.5 80
#4 1 36 1 30
#5 2 0 1 20
#6 2 12 1 20
#7 2 24 2.5 80
Edit
TarJae's comment makes the code above shorter:
df1 %>%
fill(A) %>%
select(ID, TIME, A=B, B=A) %>%
drop_na()
CodePudding user response:
Here is a base R method how we could do it:
# swap A and B
df1[ , c(3,4)] <- df1[ , c(4,3)]
# fill na with prvious value from above
df1$B <- na.omit(df1$B)[cumsum(!is.na(df1$B))]
# remove NA
df1[complete.cases(df1$A),]
ID TIME A B
3 1 0 2.0 50
5 1 13 2.5 80
6 1 24 2.5 80
8 1 36 1.0 30
10 2 0 1.0 20
11 2 12 1.0 20
13 2 24 2.5 80