Home > Mobile >  Get Average of two rows within group in a dataframe, R
Get Average of two rows within group in a dataframe, R

Time:01-25

I am trying to insert a new row of the average value between the same month group and thus create a new item for that row. So for example the three rows:

  • 3MTH_IMPVOL_90%MNY_DF: 23.9

  • 3MTH_IMPVOL_100%MNY_DF: 18.0

  • 3MTH_IMPVOL_110%MNY_DF: 15.0

Will then have 1 row in between each and produce a new entry:

  • 3MTH_IMPVOL_90%MNY_DF: 23.9
  • 3MTH_IMPVOL_95%MNY_DF: 20.95
  • 3MTH_IMPVOL_100%MNY_DF: 18.0
  • 3MTH_IMPVOL_105%MNY_DF: 16.5
  • 3MTH_IMPVOL_110%MNY_DF: 15.0

Here is some code that I tried myself but didnt work, maybe it helps:

Issue here, want the entire dataset as is with extra rows, not deprecated values:

df %>% group_by(datestamp) %>% 
  mutate(id = rep(1:n(), length.out = n(), each = 4)) %>%
  mutate(item
  )%>% 
  summarize(item    = unique(datestamp),
            item       = paste0(item[1], '_', item[2]),
            value_mean = mean(value))

Found this, but doesn't work for me either:

do.call(rbind,
        lapply(seq(1, nrow(df), 3), function(i){
          x <- jse_vol[ i:(i   1), , drop = FALSE]
          x <- x %>% group_by(item)
          res <- rbind(x['value'], colSums(x['value'])/2)
          rownames(res)[ nrow(res) ] <- paste(rownames(x), collapse = "_")
          res
        }))

Lastly:

df%>%
  data.frame() %>%
  rownames_to_column() %>%
  mutate_if(is.factor, as.numeric) %>%
  group_by(group = gl(n()/2, 2)) %>%
  group_map(~ bind_rows(.x, tibble(rowname = paste(.x$rowname, collapse = "_"), 
                                   t1 = mean(.x$value),
  )))

Dataframe:

   datestamp  entity        short_name                       item                     value source
   <date>     <chr>         <chr>                            <chr>                    <dbl> <chr> 
 1 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 3MTH_IMPVOL_90%MNY_DF     23.9 PSEC  
 2 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 6MTH_IMPVOL_100.0%MNY_DF  18.0 PSEC  
 3 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 9MTH_IMPVOL_110%MNY_DF    16.2 PSEC  
 4 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 12MTH_IMPVOL_90%MNY_DF    21.0 PSEC  
 5 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 3MTH_IMPVOL_100%MNY_DF    18   PSEC  
 6 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 6MTH_IMPVOL_110.0%MNY_DF  15.8 PSEC  
 7 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 9MTH_IMPVOL_90%MNY_DF     21.5 PSEC  
 8 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 12MTH_IMPVOL_100%MNY_DF   18.5 PSEC  
 9 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 3MTH_IMPVOL_110%MNY_DF    15.0 PSEC  
10 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 6MTH_IMPVOL_90.0%MNY_DF   22.0 PSEC  
11 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 9MTH_IMPVOL_100%MNY_DF    18.3 PSEC  
12 2006-01-03 JALSHTR Index FTSE/JSE Africa All Shares Index 12MTH_IMPVOL_110%MNY_DF   16.7 PSEC  
13 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 3MTH_IMPVOL_90%MNY_DF     23.9 PSEC  
14 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 6MTH_IMPVOL_100.0%MNY_DF  18.0 PSEC  
15 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 9MTH_IMPVOL_110%MNY_DF    16.2 PSEC  
16 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 12MTH_IMPVOL_90%MNY_DF    21.0 PSEC  
17 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 3MTH_IMPVOL_100%MNY_DF    18   PSEC  
18 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 6MTH_IMPVOL_110.0%MNY_DF  15.8 PSEC  
19 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 9MTH_IMPVOL_90%MNY_DF     21.5 PSEC  
20 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 12MTH_IMPVOL_100%MNY_DF   18.5 PSEC  
21 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 3MTH_IMPVOL_110%MNY_DF    15.0 PSEC  
22 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 6MTH_IMPVOL_90.0%MNY_DF   22.0 PSEC  
23 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 9MTH_IMPVOL_100%MNY_DF    18.3 PSEC  
24 2006-01-04 JALSHTR Index FTSE/JSE Africa All Shares Index 12MTH_IMPVOL_110%MNY_DF   16.7 PSEC

CodePudding user response:

One solution might be something like the following,

df <- data.frame(Name = paste0("Row", 1:10),
                 Value = runif(10, 10, 100))

df2 <- matrix(nrow = 2*nrow(df) - 1,
              ncol = ncol(df))

for (i in 1:nrow(df2)){
  if (i %% 2 != 0){
    df2[i,1] = df[(i 1)/2,1]
    df2[i,2] = df[(i 1)/2,2]
  }else
  if (i %% 2 == 0){
    df2[i,1] = paste0("Average of ",df[(i 1)/2,1], " and ", df[(i 2)/2,1])
    df2[i,2] = (df[i/2,2]   df[(i 2)/2,2])/2
  }
}

CodePudding user response:

Your dataset doesn't make it easy to perform this calculation because you have two important variables (month and percent) buried within a string.

The first step should be to extract these (I've used str_extract() from stringr for 1st number and number before "%").

From there, there are more options. I pivotted the percentage column into rows then calculated the averages as new columns pivotted back to a long format:

library(tidyverse)

set.seed(1337) # I generate random numbers - this is for consistency

# Make an analogue of your dataframe
df <- crossing(
  datestamp = as.Date(c("2006-01-03", "2006-01-04")),
  mnth = c("3MTH", "6MTH", "9MTH", "12MTH"),
  impvol = "_IMPVOL_",
  pc = c("90%", "100%", "110%"),
  str = "MNY_DF"
) %>% 
  transmute(datestamp, item = paste0(mnth, impvol, pc, str),
            value = round(rnorm(24, 20, 2), 1))

new_rows <- df %>%
  mutate(month = as.numeric(str_extract(item, "[0-9] ")),
         pc = as.numeric(str_extract(item, "(\\d) (?=%)"))) %>%
  arrange(datestamp, month, pc) %>%
  select(-item) %>% 
  pivot_wider(
    id_cols = c(datestamp, month),
    names_from = pc,
    values_from = value
  ) %>%
  mutate(`95` = (`90`   `100`) / 2,
         `105` = (`100`   `110`) / 2) %>%
  pivot_longer(
    cols = c(`90`, `95`, `100`, `105`, `110`),
    names_to = "pc",
    values_to = "value"
  )
  • Related