Home > Enterprise >  How to Create Two Amount Columns Based on Specific Categorical Column Values in R
How to Create Two Amount Columns Based on Specific Categorical Column Values in R

Time:07-20

I'm relatively new to R and I have a dataframe that looks like this:

1 2 3 4 5 6 7 8 9 10
Name Max Max Max Joey Joey Nancy Nancy Nancy Linda Linda
Amount_Type InternetBill Groceries WaterBill InternetBill Groceries WaterBill Groceries InternetBill WaterBill Groceries
Amount $75 $230.66 $40 $70 $188.75 $35 $175.89 $75 $30 $236.87

I need to add 3 more rows and pivot the dataframe:

The dataframe needs to be grouped by name and outputs 3 totals columns:

  1. Fixed_Cost which should include InternetBill and WaterBill amounts
  2. Variable_Cost which should include Groceries
  3. Total_Cost which should be fixed variable costs

So something like this:

Name Fixed_Cost Variable_Cost Total_Cost
Max $115 $230.66 $345.66
Joey $70 $188.75 $258.75
Nancy $110 $175.89 $285.89
Linda $30 $236.87 $266.87

Any advice on how to go about doing this? Thanks!

CodePudding user response:

library(tidyverse)   

setNames(data.frame(t(df1[,-1])), df1[,1]) %>%
  pivot_wider(Name, names_from = Amount_Type, values_from = Amount,
              values_fn = parse_number, values_fill = 0) %>%
  mutate(Fixed_cost = InternetBill   WaterBill, variable_cost = Groceries,
         Total_Cost = Fixed_cost   variable_cost, .keep ='unused')

# A tibble: 4 x 4
  Name  Fixed_cost variable_cost Total_Cost
  <chr>      <dbl>         <dbl>      <dbl>
1 Max          115          231.       346.
2 Joey          70          189.       259.
3 Nancy        110          176.       286.
4 Linda         30          237.       267.

CodePudding user response:

If we transpose the data, it becomes more easier to do a group by sum

library(data.table)
data.table::transpose(setDT(df1), make.names = 1)[, 
  Amount := readr::parse_number(Amount)][, 
 .(Fixed_Cost = sum(Amount[Amount_Type %in% c("InternetBill", "WaterBill")]), 
 Variable_Cost =  sum(Amount[!Amount_Type %in% c("InternetBill", "WaterBill")])),
       by = Name][,
   Total_Cost := Fixed_Cost   Variable_Cost][]

-output

     Name Fixed_Cost Variable_Cost Total_Cost
   <char>      <num>         <num>      <num>
1:    Max        115        230.66     345.66
2:   Joey         70        188.75     258.75
3:  Nancy        110        175.89     285.89
4:  Linda         30        236.87     266.87  

data

df1 <- structure(list(`0` = c("Name", "Amount_Type", "Amount"), `1` = c("Max", 
"InternetBill", "$75"), `2` = c("Max", "Groceries", "$230.66"
), `3` = c("Max", "WaterBill", "$40"), `4` = c("Joey", "InternetBill", 
"$70"), `5` = c("Joey", "Groceries", "$188.75"), `6` = c("Nancy", 
"WaterBill", "$35"), `7` = c("Nancy", "Groceries", "$175.89"), 
    `8` = c("Nancy", "InternetBill", "$75"), `9` = c("Linda", 
    "WaterBill", "$30"), `10` = c("Linda", "Groceries", "$236.87"
    )), class = "data.frame", row.names = c(NA, -3L))
  • Related