I am going to start with an example:
inv <- tibble::tribble(
~Date, ~Material, ~Quantity,
"2020-01-01", "nails", 10L,
"2020-01-01", "nails", 100L,
"2020-02-02", "nails", 50L,
"2020-02-02", "nails", 70L,
"2020-01-01", "hammer", 20L,
"2020-01-01", "hammer", 80L,
"2020-02-02", "hammer", 20L,
"2020-02-02", "hammer", 50L
)
This is the example data set. The original data set is way longer with 100 items and dates.
- First I subset data for one day
- Sum all item quantity in that day
- Divide each item quantity with the sum(from previous step), lets call the result weight
- multiply each items quantity with the corresponding weight and sum the results for each item.
- Divide result from step 4 with the sum of all weights.
- Repeat for all days.
The result from the example should be:
- First all items in day 2020-01-01
- 10 100 20 80 = 210
- 10/210 = 0.048, 100/210 = 0.48, 20/210 = 0.095, 80/210 = 0.38
- 10 * 0.048 100 * 0.48 20 * 0.095 80 * 0.38 = 0.48 48 1.9 30.4 = 80.78
- 80.78/(0.048 0.48 0.095 0.38) = 80.78/1.003 = 80.54
- And the same process for 2020-02-02
I though that maybe its doable with a for loop, this is how far I got and now I am stuck:
inv <- read.csv("Book1.csv")
inv$Date <- as.Date(inv$Date)
n <- unique(inv$Date)
item <- vector()
length(n)
for (i in 1:length(n)){
day <- subset(inv, subset = (Date == n[i]))
day_total <- sum(day$Quantity)
m <- unique(day$Material)
for (j in 1:length(m)){
material <- subset(day, subset = (Material == m[j]))
material_total <- sum(material$Quantity)
material_weight <- material_total/day_total
item[j] <- material_total/day_total
}
}
If you have any suggestion it will be greatly appreciated
CodePudding user response:
Here's the dplyr
version of your algorithm:
library(dplyr)
inv %>%
group_by(Date) %>%
mutate(
weight = Quantity / sum(Quantity),
) %>%
summarize(
result = sum(Quantity * weight)
)
# # A tibble: 2 × 2
# Date result
# <chr> <dbl>
# 1 2020-01-01 80.5
# 2 2020-02-02 54.2
Or we can use the built-in weighted.mean
function directly for the same result:
inv %>%
group_by(Date) %>%
summarize(
result = weighted.mean(Quantity, w = Quantity / sum(Quantity))
)
If I'm misunderstanding the goal, please edit your question to show the desired output for the sample input.