Home > Software design >  Data transformation by multiple factors R
Data transformation by multiple factors R

Time:09-29

I need to Normalize some data values (e.g., subtracting from one A-value to another B-value), according to two factors. Assume a data frame like this one:

mydf <- data.frame("countries" = c(rep("USA",3), rep("China",3), rep("France",3)), "Type" = rep(c("Grass","Cow", "Bunny"), 3), "W1" = rnorm(9, 20, 10), "W2" = rnorm(9, 60, 10))

For each Country, I would like to subtract the values of both W1 and W2 of grass from Cows and Bunnies, for example: W1_Norm_Cows = W1_Cows - W1_Grass.

I tried to create a pipe using dplyr, but I got stuck.

normalized_mydf <-  mydf %>% 
  group_by(Country) %>% 
  mutate(W1_norm = W1-??) #How can I specify which values to use?
  mutate(W2_norm = W2-??) #How can I specify which values to use?

I tried to use group_by twice, but of course, the script is wrong (or not complete) because it subtracts the W1 value to itself, giving zero to all the observations.

CodePudding user response:

I suggest reshaping the dataframe in this way:

mydf %>%
  gather("key", "value", -c(countries, Type)) %>%
  spread(Type, value) %>%
  mutate(Norm_cows = Cow - Grass,
         Norm_bunny = Bunny - Grass)

  countries key     Bunny       Cow    Grass   Norm_cows Norm_bunny
1     China  W1 25.347062 20.003938 19.71476   0.2891776    5.632301
2     China  W2 69.627360 49.867221 61.53735 -11.6701307    8.090008
3    France  W1  9.150308 18.257695 26.26750  -8.0098078  -17.117195
4    France  W2 58.396270 65.252741 76.46905 -11.2163100  -18.072781
5       USA  W1 18.803997  6.361533 30.94399 -24.5824534  -12.139989
6       USA  W2 77.703504 69.438751 60.23095   9.2078025   17.472556

And pivot a bit more to get into what seems to be the desired shape:

mydf %>%
  select(-c(Bunny, Cow, Grass)) %>%
  gather('new_key', 'value', c(Norm_cows, Norm_bunny)) %>%
  mutate(key = paste0(key, "_", new_key)) %>%
  select(-new_key) %>%
  spread(key, value)

  countries W1_Norm_bunny W1_Norm_cows W2_Norm_bunny W2_Norm_cows
1     China      5.632301    0.2891776      8.090008   -11.670131
2    France    -17.117195   -8.0098078    -18.072781   -11.216310
3       USA    -12.139989  -24.5824534     17.472556     9.207803
  •  Tags:  
  • r
  • Related