Dealing with a leading NA when using na.locf to fill string NAs

I have a list of dataframes:

test_dat <- structure(list(...5 = c("euro", "euro", NA, NA, NA, NA, 
NA, "dollar", NA)), row.names = c(NA, -9L), class = c("tbl_df", 
"tbl", "data.frame"))

test_dat2 <- structure(list(...5 = c(NA, "euro", NA, NA, NA, NA, 
NA, "dollar", NA)), row.names = c(NA, -9L), class = c("tbl_df", 
"tbl", "data.frame"))

test_dat2
# A tibble: 9 × 1
  ...5  
  <chr> 
1 NA    
2 euro  
3 NA    
4 NA    
5 NA    
6 NA    
7 NA    
8 dollar
9 NA    

l <- list(test_dat, test_dat2)

I want to fill the NAs in each data frame in the list, but some of them start with a leading NA, and I do not know in advance which ones.

for (i in seq_along(l)){
  # Fill first column
  l[[i]][1] <- zoo::na.locf(l[[i]][1])
}

Leading to:

Error:
! Assigned data `zoo::na.locf(l[[i]][1])` must be compatible with existing data.
✖ Existing data has 9 rows.
✖ Assigned data has 8 rows.
ℹ Only vectors of size 1 are recycled.
Run `rlang::last_error()` to see where the error occurred.

I assumed the following would solve it, but it did not:

for (i in seq_along(l)){
  # Fill first column
  l[[i]][1] <- zoo::na.locf(l[[i]][1], na.rm=TRUE)
}

Desired output:

test_dat2
# A tibble: 9 × 1
  ...5  
  <chr> 
1 NA    
2 euro  
3 euro
4 euro
5 euro
6 euro
7 euro
8 dollar
9 dollar

CodePudding user response：

Maybe you want something like this, using tidyr::fill(), which fills downward and leaves a leading NA in place. I assume you want to apply it to every data frame in the list.

library(tidyverse)


l |>
  map(~fill(.x, everything(), .direction = "down"))
#> [[1]]
#> # A tibble: 9 x 1
#>   ...5  
#>   <chr> 
#> 1 euro  
#> 2 euro  
#> 3 euro  
#> 4 euro  
#> 5 euro  
#> 6 euro  
#> 7 euro  
#> 8 dollar
#> 9 dollar
#> 
#> [[2]]
#> # A tibble: 9 x 1
#>   ...5  
#>   <chr> 
#> 1 <NA>  
#> 2 euro  
#> 3 euro  
#> 4 euro  
#> 5 euro  
#> 6 euro  
#> 7 euro  
#> 8 dollar
#> 9 dollar
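
The loop in the question only touches the first column, while everything() fills all of them. If the real data frames have more columns, a positional selection may be closer to what was asked. This is only a sketch: the two-column data frame and the names dfs and filled are made up for illustration, and it uses base lapply so that only tidyr is needed.

```r
library(tidyr)

# Hypothetical two-column data, not the data from the question
df <- data.frame(
  currency = c(NA, "euro", NA, "dollar", NA),
  amount   = c(1, NA, 3, NA, 5)
)
dfs <- list(df, df)

# Fill only column 1 downward; the leading NA and the other column stay as they are
filled <- lapply(dfs, function(d) fill(d, 1, .direction = "down"))

filled[[1]]
```

Selecting by position mirrors the [1] indexing in the question; selecting by name would work the same way if the column names are known.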

CodePudding user response：

The code in the question works if you pass na.rm = FALSE and either convert the tibble to a data frame or change [1] to [[1]] or [, 1].
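
The reason na.rm matters can be checked on a plain character vector. A small sketch (the vector x and the names dropped and kept are made up for illustration): with the default na.rm = TRUE, na.locf removes leading NAs, so the result is shorter than the input, which matches the "9 rows" versus "8 rows" mismatch in the error.

```r
library(zoo)

# Hypothetical vector with one leading NA
x <- c(NA, "euro", NA, NA, "dollar", NA)

# Default (na.rm = TRUE): the leading NA is removed, so the result is one element shorter
dropped <- na.locf(x)
length(dropped)

# na.rm = FALSE: the leading NA is kept and the length is unchanged
kept <- na.locf(x, na.rm = FALSE)
length(kept)
```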

Using l from the question, any of the following works. If each component of l has only one column (as in the l shown in the question), a simplification is possible.

I suggest using lapply rather than a for loop.

library(zoo)

for(i in seq_along(l)) l[[i]][1] <- na.locf(as.data.frame(l[[i]][1]), na.rm = FALSE)

for(i in seq_along(l)) l[[i]][, 1] <- na.locf(l[[i]][, 1], na.rm = FALSE)

for(i in seq_along(l)) l[[i]][[1]] <- na.locf(l[[i]][[1]], na.rm = FALSE)

# if there is only one column in each component of l
for(i in seq_along(l)) l[[i]] <- na.locf(l[[i]], na.rm = FALSE)

# if there is only one column in each component of l
lapply(l, na.locf, na.rm = FALSE)

# general case: replace only the first column of each component
lapply(l, function(x) replace(x, 1, na.locf(x[, 1], na.rm = FALSE)))
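
If adding a dependency is not wanted, the same fill can be approximated in base R. This is not what zoo does internally, just a sketch of the idea; locf_keep_leading, dfs and filled are hypothetical names, and the helper is only meant for plain atomic vectors (character, numeric, logical), not factors.

```r
# Hypothetical helper: carry the last non-NA value forward, leaving leading NAs as NA
locf_keep_leading <- function(x) {
  # seen[i] counts the non-NA values up to position i (0 before the first one)
  seen <- cumsum(!is.na(x))
  # Prepend an NA so that seen == 0 maps to NA, then look up by seen + 1
  c(NA, x[!is.na(x)])[seen + 1]
}

# Hypothetical list of single-column data frames, shaped like the one in the question
dfs <- list(
  data.frame(cur = c("euro", "euro", NA, NA, "dollar", NA)),
  data.frame(cur = c(NA, "euro", NA, NA, "dollar", NA))
)

# Fill the first column of every data frame
filled <- lapply(dfs, function(d) {
  d[[1]] <- locf_keep_leading(d[[1]])
  d
})

filled[[2]]
```

Because the helper always returns a vector of the same length as its input, assigning it back with [[1]] cannot hit the row-count mismatch from the question.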