I'm relatively new to R and I have a dataframe that looks like this:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | |
---|---|---|---|---|---|---|---|---|---|---|
Name | Max | Max | Max | Joey | Joey | Nancy | Nancy | Nancy | Linda | Linda |
Amount_Type | InternetBill | Groceries | WaterBill | InternetBill | Groceries | WaterBill | Groceries | InternetBill | WaterBill | Groceries |
Amount | $75 | $230.66 | $40 | $70 | $188.75 | $35 | $175.89 | $75 | $30 | $236.87 |
I need to add 3 more rows and pivot the dataframe:
The dataframe needs to be grouped by name and outputs 3 totals columns:
- Fixed_Cost which should include InternetBill and WaterBill amounts
- Variable_Cost which should include Groceries
- Total_Cost which should be fixed variable costs
So something like this:
Name | Fixed_Cost | Variable_Cost | Total_Cost |
---|---|---|---|
Max | $115 | $230.66 | $345.66 |
Joey | $70 | $188.75 | $258.75 |
Nancy | $110 | $175.89 | $285.89 |
Linda | $30 | $236.87 | $266.87 |
Any advice on how to go about doing this? Thanks!
CodePudding user response:
library(tidyverse)
setNames(data.frame(t(df1[,-1])), df1[,1]) %>%
pivot_wider(Name, names_from = Amount_Type, values_from = Amount,
values_fn = parse_number, values_fill = 0) %>%
mutate(Fixed_cost = InternetBill WaterBill, variable_cost = Groceries,
Total_Cost = Fixed_cost variable_cost, .keep ='unused')
# A tibble: 4 x 4
Name Fixed_cost variable_cost Total_Cost
<chr> <dbl> <dbl> <dbl>
1 Max 115 231. 346.
2 Joey 70 189. 259.
3 Nancy 110 176. 286.
4 Linda 30 237. 267.
CodePudding user response:
If we transpose
the data, it becomes more easier to do a group by sum
library(data.table)
data.table::transpose(setDT(df1), make.names = 1)[,
Amount := readr::parse_number(Amount)][,
.(Fixed_Cost = sum(Amount[Amount_Type %in% c("InternetBill", "WaterBill")]),
Variable_Cost = sum(Amount[!Amount_Type %in% c("InternetBill", "WaterBill")])),
by = Name][,
Total_Cost := Fixed_Cost Variable_Cost][]
-output
Name Fixed_Cost Variable_Cost Total_Cost
<char> <num> <num> <num>
1: Max 115 230.66 345.66
2: Joey 70 188.75 258.75
3: Nancy 110 175.89 285.89
4: Linda 30 236.87 266.87
data
df1 <- structure(list(`0` = c("Name", "Amount_Type", "Amount"), `1` = c("Max",
"InternetBill", "$75"), `2` = c("Max", "Groceries", "$230.66"
), `3` = c("Max", "WaterBill", "$40"), `4` = c("Joey", "InternetBill",
"$70"), `5` = c("Joey", "Groceries", "$188.75"), `6` = c("Nancy",
"WaterBill", "$35"), `7` = c("Nancy", "Groceries", "$175.89"),
`8` = c("Nancy", "InternetBill", "$75"), `9` = c("Linda",
"WaterBill", "$30"), `10` = c("Linda", "Groceries", "$236.87"
)), class = "data.frame", row.names = c(NA, -3L))