R interpolate rows with different start and end columns-CodePudding

I have multiple data sets that look like the following:

country	2000	2001	2002	2003	2004	2005	2006	2007
Germany	0.1	0.2	0.5	NA	0.7	0.6	0.8	NA
USA	NA	NA	NA	0.1	0.4	NA	0.4	0.1
France	0.4	0.3	0.5	0.8	NA	NA	1.0	1.3
Belgium	NA	0.1	NA	0.5	0.6	0.5	NA	NA

I want to interpolate horizontally, starting from the first existing value until the last existing value. In essence, NAs at the start and NAs at the end should stay NAs, but NAs in the middle should be interpolated. I have been trying to think of a possible solution, but nothing comes up. I appreciate your help!

CodePudding user response：

You can use zoo::na.approx:

df[-1] <- t(apply(df[-1], 1, \(x) na.approx(x, na.rm = F)))

or even without apply:

replace(df, -1, t(na.approx(t(df[-1]))))

output

  country `2000` `2001` `2002` `2003` `2004` `2005` `2006` `2007`
1 Germany    0.1    0.2    0.5    0.6  0.7    0.6      0.8   NA  
2 USA       NA     NA     NA      0.1  0.4    0.4      0.4    0.1
3 France     0.4    0.3    0.5    0.8  0.867  0.933    1      1.3
4 Belgium   NA      0.1    0.3    0.5  0.6    0.5     NA     NA

Note that a tidy format might be easier for further wrangling here:

library(dplyr)
library(tidyr)
df %>% 
  mutate(id = 1:n()) %>% 
  pivot_longer(-c(country, id)) %>% 
  group_by(id) %>% 
  mutate(value = na.approx(value, na.rm = F))