Percent increase from first column dplyr

Time:10-10

I have the following code:

library(tidyverse)
gapminder <- readr::read_csv("https://raw.githubusercontent.com/OHI-Science/data-science-training/master/data/gapminder.csv") 
gapminder %>%
  group_by(continent, year) %>%
  summarize(cont_pop = sum(pop)) %>%
  arrange(year) %>%
  spread(year, value = cont_pop)

Instead of the absolute numbers, I would like to return the percent increase from the year 1952 (the first column). Is this possible in pure dplyr / tidyverse?

CodePudding user response:

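Yes. After summarise(), the result is still grouped by continent (only the last grouping level, year, is dropped), so first() returns each continent's 1952 total and you can compute the change relative to it. Note that the values are fractional changes (0.114 means 11.4%); multiply by 100 if you want percentages.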

library(dplyr)
library(tidyr)
library(gapminder)

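gapminder %>%
  group_by(continent, year) %>%
  # total population per continent and year; keep the continent grouping
  summarise(cont_ratio = sum(pop), .groups = "drop_last") %>%
  # relative change from the first year (1952) within each continent
  mutate(cont_ratio = (cont_ratio - first(cont_ratio)) / first(cont_ratio)) %>%
  ungroup() %>%
  # one column per year
  pivot_wider(names_from = year, values_from = cont_ratio)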

#  continent `1952` `1957` `1962` `1967` `1972` `1977` `1982` `1987` `1992` `1997` `2002` `2007`
#  <fct>      <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#1 Africa         0 0.114   0.248  0.411  0.599  0.822  1.10   1.42   1.77   2.13   2.51   2.91 
#2 Americas       0 0.121   0.255  0.393  0.534  0.675  0.826  0.978  1.14   1.31   1.46   1.60 
#3 Asia           0 0.120   0.216  0.366  0.542  0.709  0.871  1.06   1.25   1.42   1.58   1.73 
#4 Europe         0 0.0473  0.101  0.151  0.197  0.237  0.271  0.299  0.335  0.361  0.383  0.402
#5 Oceania        0 0.118   0.243  0.366  0.507  0.613  0.721  0.832  0.958  1.08   1.19   1.30 
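
If you want actual percentages rather than fractions, the same idea can be sketched as below. This is only an illustration: the `toy` data frame and its numbers are made up (not gapminder) so the arithmetic is easy to check by hand, and the column name `pct_increase` is my own choice. The explicit `arrange()` is there as a safeguard so that `first()` really refers to the earliest year within each continent.

```r
library(dplyr)
library(tidyr)

# Made-up toy data (NOT gapminder), small enough to verify by hand
toy <- tibble(
  continent = c("A", "A", "A", "B", "B", "B"),
  year      = c(1952, 1957, 1962, 1952, 1957, 1962),
  pop       = c(100, 110, 150, 200, 220, 210)
)

pct_wide <- toy %>%
  group_by(continent, year) %>%
  # total per continent and year; result stays grouped by continent
  summarise(cont_pop = sum(pop), .groups = "drop_last") %>%
  # make sure the earliest year is the first row of each group
  arrange(year, .by_group = TRUE) %>%
  # percent increase relative to the first year of each continent
  mutate(pct_increase = 100 * (cont_pop - first(cont_pop)) / first(cont_pop)) %>%
  ungroup() %>%
  select(-cont_pop) %>%
  # one column per year
  pivot_wider(names_from = year, values_from = pct_increase)

pct_wide
```

With this toy input, continent A goes 0, 10, 50 and continent B goes 0, 10, 5 across the three year columns. To apply it to the real data, replace `toy` with `gapminder`.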