Drop specific values from first row but not the complete row

Time:10-16

I'm importing a CSV and need to do some cleaning before my script can run on it. The problem I'm having is the following:

df1

  Time    HR  Start
1 Hour    N   18:55:25
2 0:00:00 57  NA
3 0:00:01 57  NA
4 0:00:02 58  NA
...

I would like to drop "Hour" and "N", which the data frame currently treats as values, so that the Time and HR columns move up by one row:

df1

  Time    HR  Start
1 0:00:00 57  18:55:25
2 0:00:01 57  NA
3 0:00:02 58  NA
...

I've tried filter(), but it deletes the whole first row, including the Start value, which I need:

df1<-filter(df1,Time != "Hour")

with str_detect

df1 %>% 
  filter(!str_detect(Time, "Hour"))

and with grep() (a base R function) for row indexing:

df1[c(grep("Hour", df1$Time, invert = TRUE)),, drop = FALSE]

It would also work if I could copy the value of Start into the second row and then delete the first row, but that's proving difficult since it's a time value.

Answer:

You could try one of the following brute-force methods. I'm not sure what your full data look like, so neither may be ideal; it depends on how you want to shift the values (up or down):

Data

df <- data.frame(Time = c("Hour", "0:00:00", "0:00:02", "0:00:02"),
                 HR = c("N", "57", "57", "58"),
                 Start = c("18:55:25", rep(NA, 3)))

Code

#separate and remove rows
df_a <- df[-1,1:2] # remove first row for Time and HR
df_b <- df[-nrow(df),3] # remove last row of Start

df_final <- cbind(df_a, df_b) #recombine

Output:

#      Time HR     df_b
# 2 0:00:00 57 18:55:25
# 3 0:00:02 57     <NA>
# 4 0:00:02 58     <NA>
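Note that cbind() names the third column after the variable (df_b) in the output above. If you want it to keep the name Start, one option is to pass it as a named argument. This is only a minimal sketch using the same example data; the names start and df_named are my own, and resetting the row names is optional:

```r
# Same example data as above (stringsAsFactors = FALSE for older R versions)
df <- data.frame(Time = c("Hour", "0:00:00", "0:00:02", "0:00:02"),
                 HR = c("N", "57", "57", "58"),
                 Start = c("18:55:25", rep(NA, 3)),
                 stringsAsFactors = FALSE)

df_a  <- df[-1, c("Time", "HR")]     # drop the first row of Time and HR
start <- df[-nrow(df), "Start"]      # drop the last value of Start

# Naming the argument keeps the column called "Start" instead of "df_b"
df_named <- cbind(df_a, Start = start)

# Optional: renumber the rows from 1
rownames(df_named) <- NULL
```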

Alternatively, you could copy the first value of the third column (Start) into the second row and then delete the first row:

df2 <- df
df2[2,3] <- df[1,3]
df2 <- df2[-1,]

Output:

#      Time HR    Start
# 2 0:00:00 57 18:55:25
# 3 0:00:02 57     <NA>
# 4 0:00:02 58     <NA>
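If you need to do this for several files, the second approach could be wrapped in a small helper. The sketch below is an assumption about what you want, not a tested solution for your real data: the function name drop_subheader_row and its keep_col argument are made up for illustration, and it assumes the columns were read in as character. Once the "N" is gone, HR can be converted to a number.

```r
# Hypothetical helper: carry the first-row value of `keep_col` down to the
# second row, then drop the first row (the one holding "Hour" and "N").
drop_subheader_row <- function(d, keep_col = "Start") {
  stopifnot(nrow(d) >= 2, keep_col %in% names(d))
  d[2, keep_col] <- d[1, keep_col]   # copy the value we want to keep
  d <- d[-1, , drop = FALSE]         # remove the sub-header row
  rownames(d) <- NULL                # renumber the rows from 1
  d
}

# Same example data as above
df <- data.frame(Time = c("Hour", "0:00:00", "0:00:02", "0:00:02"),
                 HR = c("N", "57", "57", "58"),
                 Start = c("18:55:25", rep(NA, 3)),
                 stringsAsFactors = FALSE)

cleaned <- drop_subheader_row(df)

# With "N" removed, HR no longer needs to be text
cleaned$HR <- as.integer(cleaned$HR)
```

If Start is stored as something other than character in your real data (a time class, for example), the assignment step may need adjusting.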
