Home > Back-end >  Shift select columns one to the right and replace empty space with the row number - 1
Shift select columns one to the right and replace empty space with the row number - 1

Time:11-08

I have a data frame that looks similar to this (I've cut some out for easier reference, data has 93 rows):

                 Rank                      1     A      B      C    D
                 34                     (TPE)    2      4      6    12
                 35                     (TUR)    2      2      9    13
                 36                     (GRE)    2      1      1     4
              (UGA)                         2    1      1      4   <NA>

I need to have the columns line up, but some of the data in "Rank" is offset to the left one column. I have assigned the rows with this problem to a vector: off.set.rows <- c(which(is.na(df[ , 6]))) I need to have all rows in that vector shift one column to the right and replace the empty space it leaves in column 1 with the number in column 1 in the row previous to it. It should look like this:

                 Rank                      1     A      B      C    D
                 34                     (TPE)    2      4      6    12
                 35                     (TUR)    2      2      9    13
                 36                     (GRE)    2      1      1     4
                 36                     (UGA)    2      1      1     4   

I've tried this: df[off.set.rows, 1:(ncol(df))] <- df[off.set.rows, 2:(ncol(df))], but it shifts everything in the row left one column and the (UGA) disappears, it moves the to column 5 and then repeats the value that moves into column 2 again in column 6 like this:

                 Rank                      1     A      B      C    D
                 34                     (TPE)    2      4      6    12
                 35                     (TUR)    2      2      9    13
                 36                     (GRE)    2      1      1     4
                 2                         1     1      4     <NA>   2

Help is much appreciated!!

CodePudding user response:

Base R solution: How it works:

  1. Subset df to only those rows that met the criteria defined in your off.set.rows
  2. add a new column at the start to x
  3. paste colnames from df to x
  4. bind the rows of df and x together
  5. remove the rows that meet the criteria defined in your off.set.rows
  6. Use lag() to add the value above in Rank
off.set.rows <- c(which(is.na(df[ , 6])))

x <- subset(df, rownames(df) %in% off.set.rows)
x <- cbind(new=0, x)
colnames(x) <- colnames(df)
df <- rbind(df, x[1:6])
df <- subset(df, !rownames(df) %in% off.set.rows)
df$Rank <- ifelse(df$Rank==0, lag(df$Rank), df$Rank)
   Rank    X1 A B C  D
1    34 (TPE) 2 4 6 12
2    35 (TUR) 2 2 9 13
3    36 (GRE) 2 1 1  4
41   36 (UGA) 2 1 1  4

data:

df <- structure(list(Rank = c("34", "35", "36", "(UGA)"), X1 = c("(TPE)", 
"(TUR)", "(GRE)", "2"), A = c(2L, 2L, 2L, 1L), B = c(4L, 2L, 
1L, 1L), C = c(6L, 9L, 1L, 4L), D = c(12L, 13L, 4L, NA)), class = "data.frame", row.names = c(NA, 
-4L))

CodePudding user response:

I think the attempt in the original post was close. The column numbers to replace should be from 2 to ncol(df) (2-6). And those are replaced by columns 1 through ncol(df) - 1 (1-5).

After moving values for those particular rows, I might consider replacing the first column in those rows with NA, then use fill from tidyr to replace them with the last non-missing value. This will also take care of situations when you may have consecutive offset rows (if that is a possibility).

library(tidyr)

off.set.rows <- c(which(is.na(df[ , ncol(df)])))
df[off.set.rows, 2:ncol(df)] <- df[off.set.rows, 1:ncol(df)-1]
df[off.set.rows, 1] <- NA
fill(df, 1, .direction = "down")

Output

  Rank    X1 A B C  D
1   34 (TPE) 2 4 6 12
2   35 (TUR) 2 2 9 13
3   36 (GRE) 2 1 1  4
4   36 (UGA) 2 1 1  4
  •  Tags:  
  • r
  • Related