Finding the minimum for a column based on another column and keeping the result as a data frame


I have a data frame with four columns:

year<- c(2000,2000,2000,2001,2001,2001,2002,2002,2002)
k<- c(12.5,11.5,10.5,-8.5,-9.5,-10.5,13.9,14.9,15.9)
pop<- c(143,147,154,445,429,430,178,181,211)

pop_obs<- c(150,150,150,440,440,440,185,185,185)

df <- tibble::tibble(year, k, pop, pop_obs)

   year     k   pop pop_obs
   <dbl> <dbl> <dbl>   <dbl>
1  2000  12.5   143     150
2  2000  11.5   147     150
3  2000  10.5   154     150
4  2001  -8.5   445     440
5  2001  -9.5   429     440
6  2001 -10.5   430     440
7  2002  13.9   178     185
8  2002  14.9   181     185
9  2002  15.9   211     185

For each year, I want to find the k whose pop is closest to pop_obs, i.e. the row with the smallest absolute difference between pop and pop_obs. I then want to keep the result as a data frame with one row per year, holding the year and that k. My expected output would look like this:

year     k
  <dbl> <dbl>
1  2000  11.5
2  2001  -8.5
3  2002  14.9

CodePudding user response:

You could try with dplyr

library(dplyr)

df <- data.frame(year, k, pop, pop_obs)

df %>%
  mutate(diff = abs(pop_obs - pop)) %>%   # distance of pop from pop_obs
  group_by(year) %>%
  filter(diff == min(diff)) %>%           # keep the closest row(s) in each year
  select(year, k)

#> # A tibble: 3 x 2
#> # Groups:   year [3]
#>    year     k
#>   <dbl> <dbl>
#> 1  2000  11.5
#> 2  2001  -8.5
#> 3  2002  14.9

Created on 2021-12-11 by the reprex package (v2.0.1)
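
Note that filter(diff == min(diff)) keeps every row that ties for the minimum, and the result stays grouped by year. If exactly one row per year and an ungrouped data frame are wanted, a minimal sketch with slice_min() could look like the following. It assumes dplyr 1.0.0 or later, and the name closest is only illustrative:

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  year    = c(2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002, 2002),
  k       = c(12.5, 11.5, 10.5, -8.5, -9.5, -10.5, 13.9, 14.9, 15.9),
  pop     = c(143, 147, 154, 445, 429, 430, 178, 181, 211),
  pop_obs = c(150, 150, 150, 440, 440, 440, 185, 185, 185)
)

closest <- df %>%
  group_by(year) %>%
  # keep the single row per year with the smallest |pop_obs - pop|;
  # with_ties = FALSE drops extra rows when several rows tie
  slice_min(abs(pop_obs - pop), n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(year, k)

print(closest)
```

With with_ties = FALSE, the first row among tied rows is kept, so which k wins on a tie depends on row order.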

CodePudding user response:

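Try the tidyverse way

library(tidyverse)

data_you_want = df %>%
  mutate(dif = abs(pop_obs - pop)) %>%   # distance of pop from pop_obs
  group_by(year) %>%
  slice_min(dif, n = 1) %>%              # keep the closest row(s) per year
  ungroup() %>%
  select(year, k)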

CodePudding user response:

Using base R

subset(df, as.logical(ave(abs(pop_obs - pop), year, 
    FUN = function(x) x == min(x))), select = c('year', 'k'))
# A tibble: 3 × 2
   year     k
  <dbl> <dbl>
1  2000  11.5
2  2001  -8.5
3  2002  14.9
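
The ave() call above flags every row that equals its group minimum, so ties would all be returned. As an alternative sketch in base R, the data can be split by year and which.min() used to pick a single row per group. The names rows and result are only illustrative:

```r
# Same data as in the question
df <- data.frame(
  year    = c(2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002, 2002),
  k       = c(12.5, 11.5, 10.5, -8.5, -9.5, -10.5, 13.9, 14.9, 15.9),
  pop     = c(143, 147, 154, 445, 429, 430, 178, 181, 211),
  pop_obs = c(150, 150, 150, 440, 440, 440, 185, 185, 185)
)

# Absolute distance of pop from pop_obs
df$diff <- abs(df$pop_obs - df$pop)

# For each year, keep the first row with the smallest distance
rows <- lapply(split(df, df$year), function(d) {
  d[which.min(d$diff), c("year", "k")]
})

# Stack the per-year rows back into one data frame
result <- do.call(rbind, rows)
rownames(result) <- NULL

print(result)
```

which.min() returns the index of the first minimum, so each year contributes exactly one row.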