How to add new row into the beginning of data frame only if the first value of the column Z contains-CodePudding

I have a list of about thousand data frames. All data frames have column Z and the column consists mostly on NA values, but whenever there is an actual value, it has either "VALUE1" or "VALUE2" in it. For example:

weight | height | Z
---------------------------
62      100      NA
65      89       NA
59      88       randomnumbersVALUE1randomtext
66      92       NA
64      90       NA
64      87       randomnumbersVALUE2randomtext
57      84       NA

The first actual value of each data frame in the column Z should always contain a value of "VALUE1" in it, so in the example data frame above everything is as it should be. However, if the data frame would look like this:

weight | height | Z
---------------------------
62      100      NA
65      89       NA
59      88       randomnumbersVALUE2randomtext
66      92       NA
64      90       NA
64      87       randomnumbersVALUE1randomtext
57      84       NA

I would need to add a new row into the beginning of the data frame with "VALUE1" in the Z column and value 0 in the height and weight columns. How could I do this for my list of data frames (with the help of functions such as add_row and filter)..?

CodePudding user response：

If dfs is your list of dataframes, you can do this:

dfs = lapply(dfs, function(x) {
  if(grepl("VALUE2", x[!is.na(x$Z),"Z"][1])) {
    rbind(data.frame(weight=0,height=0,Z="VALUE1"),x)
  } else x
})

CodePudding user response：

library(dplyr)
dat %>%
  filter(!is.na(Z)) %>%
  slice(1) %>%
  mutate(across(weight:height, ~ 0)) %>%
  filter(!grepl("VALUE1", Z)) %>%
  mutate(Z = "VALUE1") %>%
  bind_rows(., dat)
#   weight height                             Z
# 1      0      0                        VALUE1
# 2     62    100                          <NA>
# 3     65     89                          <NA>
# 4     59     88 randomnumbersVALUE2randomtext
# 5     66     92                          <NA>
# 6     64     90                          <NA>
# 7     64     87 randomnumbersVALUE1randomtext
# 8     57     84                          <NA>

Data

dat <- structure(list(weight = c(62L, 65L, 59L, 66L, 64L, 64L, 57L), height = c(100L, 89L, 88L, 92L, 90L, 87L, 84L), Z = c(NA, NA, "randomnumbersVALUE2randomtext", NA, NA, "randomnumbersVALUE1randomtext", NA)), class = "data.frame", row.names = c(NA, -7L))