I have a list of thousands of data frames.
Each one has the following structure:
structure(list(frame = c(222, 223, 224, 225, 226, 227, 228, 229,
230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242,
243, 244, 245, 246, 247, 248, 249, 250, 251, 252), room = c("B6",
NA, NA, NA, NA, "B6", NA, NA, "B6", NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, "B6", NA, NA, NA, NA, NA, NA, "B6"
), id = c(2, NA, NA, NA, NA, 85, NA, NA, 2, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, NA, NA, NA, NA, NA, NA,
1), id_prob = c(0.710559149006359, NA, NA, NA, NA, 0.676624962451645,
NA, NA, 0.650006199807849, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 0.668218888964693, NA, NA, NA, NA, NA, NA,
0.786722974412071), x = c(1606, NA, NA, NA, NA, 1319, NA, NA,
1636, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
1316.75, NA, NA, NA, NA, NA, NA, 656.5), y = c(-472.25, NA, NA,
NA, NA, -516.5, NA, NA, -463.5, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, -520, NA, NA, NA, NA, NA, NA, -941),
orientation = c(84.5596680381038, NA, NA, NA, NA, 51.3401926511951,
NA, NA, 71.565048727047, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, 63.4349516145757, NA, NA, NA, NA,
NA, NA, 120.963756691571), area = c(-133, NA, NA, NA, NA,
-98, NA, NA, -140, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, -130, NA, NA, NA, NA, NA, NA, -166)), row.names = c(NA,
-31L), class = c("tbl_df", "tbl", "data.frame"))
I have the following code, which uses na.locf (from the zoo package) to fill in runs of NA values, provided a run is no longer than 20 rows (maxgap = 20).
df[c('id','x','y')] <- na.locf(df[c('id','x','y')], na.rm = F, maxgap = 20)
This works completely fine on single data frames and results in the following output.
structure(list(frame = c(222, 223, 224, 225, 226, 227, 228, 229,
230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242,
243, 244, 245, 246, 247, 248, 249, 250, 251, 252), room = c("B6",
NA, NA, NA, NA, "B6", NA, NA, "B6", NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, "B6", NA, NA, NA, NA, NA, NA, "B6"
), id = c(2, 2, 2, 2, 2, 85, 85, 85, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 32, 32, 32, 32, 32, 32, 32, 1), id_prob = c(0.710559149006359,
NA, NA, NA, NA, 0.676624962451645, NA, NA, 0.650006199807849,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.668218888964693,
NA, NA, NA, NA, NA, NA, 0.786722974412071), x = c(1606, 1606,
1606, 1606, 1606, 1319, 1319, 1319, 1636, 1636, 1636, 1636, 1636,
1636, 1636, 1636, 1636, 1636, 1636, 1636, 1636, 1636, 1636, 1316.75,
1316.75, 1316.75, 1316.75, 1316.75, 1316.75, 1316.75, 656.5),
y = c(-472.25, -472.25, -472.25, -472.25, -472.25, -516.5,
-516.5, -516.5, -463.5, -463.5, -463.5, -463.5, -463.5, -463.5,
-463.5, -463.5, -463.5, -463.5, -463.5, -463.5, -463.5, -463.5,
-463.5, -520, -520, -520, -520, -520, -520, -520, -941),
orientation = c(84.5596680381038, NA, NA, NA, NA, 51.3401926511951,
NA, NA, 71.565048727047, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, 63.4349516145757, NA, NA, NA, NA,
NA, NA, 120.963756691571), area = c(-133, NA, NA, NA, NA,
-98, NA, NA, -140, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, -130, NA, NA, NA, NA, NA, NA, -166)), row.names = c(NA,
-31L), class = c("tbl_df", "tbl", "data.frame"))
However, in order to keep track of which rows are 'filled in' and which ones were already present in the raw data, I only want to apply this to specific columns. I.e. it is critical that only the NA values of the 3 specified columns get filled in. All the other columns should remain as NA.
When I try to apply this code to the list (i.e. to run it on every dataframe within the list) I run this:
test <- lapply(list, function(x) na.locf(x[c('id','x','y')],na.rm = F, maxgap = 20))
Unfortunately, this drops every column except those 3 from each data frame. The following option, on the other hand, fills in the gaps in every column:
test <- lapply(list, function(x) na.locf(x,na.rm = F, maxgap = 20))
Is there a way to apply my original code to the entire list of dataframes?
Thanks!
CodePudding user response:
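You can use the same code that you used for a single data frame. Inside the function, assign the filled values back into the three columns and then return the whole data frame: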
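library(zoo)  # provides na.locf()

test <- lapply(list, function(x) {
  # Fill only the three chosen columns; all other columns keep their NAs
  x[c('id','x','y')] <- na.locf(x[c('id','x','y')], na.rm = FALSE, maxgap = 20)
  x  # return the whole data frame, not just the filled columns
})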
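The reason the first attempt in the question lost columns is that a function returns the value of its last expression, and there that expression was the na.locf() call on just the three selected columns. A minimal sketch to check the behaviour on toy data follows; the list `dfs`, its contents, and the helper vector `cols` are made up for illustration and are not taken from the question.

```r
library(zoo)  # provides na.locf()

# Hypothetical toy data: two small data frames with gaps in every column
dfs <- list(
  a = data.frame(frame = 1:4,
                 id    = c(2, NA, NA, 85),
                 x     = c(10, NA, NA, 20),
                 y     = c(-1, NA, NA, -2),
                 area  = c(-133, NA, NA, -98)),
  b = data.frame(frame = 1:3,
                 id    = c(1, NA, 3),
                 x     = c(5, NA, 6),
                 y     = c(-5, NA, -6),
                 area  = c(-10, NA, -20))
)

# Columns that should be filled; everything else is left alone
cols <- c("id", "x", "y")

filled <- lapply(dfs, function(d) {
  d[cols] <- na.locf(d[cols], na.rm = FALSE, maxgap = 20)
  d  # last expression: the complete data frame is returned
})

names(filled$a)   # all five columns are still present
filled$a$id       # gaps in 'id' are filled forward
filled$a$area     # 'area' still contains its original NAs
```

Comparing is.na() on a filled column before and after (for example is.na(dfs$a$id) against is.na(filled$a$id)) is one way to tell which rows were filled in rather than present in the raw data.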