I have a data frame that looks similar to this (I've cut some out for easier reference, data has 93 rows):
Rank 1 A B C D
34 (TPE) 2 4 6 12
35 (TUR) 2 2 9 13
36 (GRE) 2 1 1 4
(UGA) 2 1 1 4 <NA>
I need to have the columns line up, but some of the data in "Rank" is offset to the left one column. I have assigned the rows with this problem to a vector: off.set.rows <- c(which(is.na(df[ , 6]))) I need to have all rows in that vector shift one column to the right and replace the empty space it leaves in column 1 with the number in column 1 in the row previous to it. It should look like this:
Rank 1 A B C D
34 (TPE) 2 4 6 12
35 (TUR) 2 2 9 13
36 (GRE) 2 1 1 4
36 (UGA) 2 1 1 4
I've tried this: df[off.set.rows, 1:(ncol(df))] <- df[off.set.rows, 2:(ncol(df))], but it shifts everything in the row left one column and the (UGA) disappears, it moves the to column 5 and then repeats the value that moves into column 2 again in column 6 like this:
Rank 1 A B C D
34 (TPE) 2 4 6 12
35 (TUR) 2 2 9 13
36 (GRE) 2 1 1 4
2 1 1 4 <NA> 2
Help is much appreciated!!
CodePudding user response:
Base R solution: How it works:
- Subset
df
to only those rows that met the criteria defined in youroff.set.rows
- add a new column at the start to
x
- paste colnames from
df
tox
- bind the rows of
df
andx
together - remove the rows that meet the criteria defined in your
off.set.rows
- Use
lag()
to add the value above inRank
off.set.rows <- c(which(is.na(df[ , 6])))
x <- subset(df, rownames(df) %in% off.set.rows)
x <- cbind(new=0, x)
colnames(x) <- colnames(df)
df <- rbind(df, x[1:6])
df <- subset(df, !rownames(df) %in% off.set.rows)
df$Rank <- ifelse(df$Rank==0, lag(df$Rank), df$Rank)
Rank X1 A B C D
1 34 (TPE) 2 4 6 12
2 35 (TUR) 2 2 9 13
3 36 (GRE) 2 1 1 4
41 36 (UGA) 2 1 1 4
data:
df <- structure(list(Rank = c("34", "35", "36", "(UGA)"), X1 = c("(TPE)",
"(TUR)", "(GRE)", "2"), A = c(2L, 2L, 2L, 1L), B = c(4L, 2L,
1L, 1L), C = c(6L, 9L, 1L, 4L), D = c(12L, 13L, 4L, NA)), class = "data.frame", row.names = c(NA,
-4L))
CodePudding user response:
I think the attempt in the original post was close. The column numbers to replace should be from 2 to ncol(df)
(2-6). And those are replaced by columns 1 through ncol(df) - 1
(1-5).
After moving values for those particular rows, I might consider replacing the first column in those rows with NA
, then use fill
from tidyr
to replace them with the last non-missing value. This will also take care of situations when you may have consecutive offset rows (if that is a possibility).
library(tidyr)
off.set.rows <- c(which(is.na(df[ , ncol(df)])))
df[off.set.rows, 2:ncol(df)] <- df[off.set.rows, 1:ncol(df)-1]
df[off.set.rows, 1] <- NA
fill(df, 1, .direction = "down")
Output
Rank X1 A B C D
1 34 (TPE) 2 4 6 12
2 35 (TUR) 2 2 9 13
3 36 (GRE) 2 1 1 4
4 36 (UGA) 2 1 1 4