I'm looking to assess the importance of the cleaning frequency per day in the amount of feces that accumulates in the floor of a cattle barn. The problem is that I'm only able to correctly calculate the amount of feces after the first cleaning event (i.e, first reduction in feces), with subsequent calculations failing to consider that a cleaning event already happened.
For my attempt, I defined four variables and two constants in R:
Variables
- "hour" (numeric; from 1 to 24)
- "binclean" (binary; 0=no cleaning event, 1=cleaning event)
- "noclean" (numeric; variable storing the cumulative amount of feces in the barn's floor when there is no cleaning)
- "clean" (numeric; contains the amount of feces in the barn's floor after cleaning)
Constants
- "totalfeceshour" (amount of feces produced every hour by 250 cattle = 239.5 kg)
- "H" (amount of feces remaining in the environment after the floor is scraped = 0.05% of feces)
I created a matrix to store all the variables and perform calculations. What I've done is use the Reduce
function (base R function) to get the feces accumulated without any cleaning and store it in the noclean
variable. After this, I used a "for loop" with the ifelse()
function to signal the occurrence of a cleaning event. Information about feces after cleaning is then stored in the "clean" variable. So far, I've been able to obtain the correct amount of feces on the floor only after the first cleaning event, but this information is not carried on to subsequent cells. For example, by the 7th hour of the day, there are 1,676 kg of feces in the floor. If a cleaning event occurs at the 8th hour, then the total amount of feces after cleaning is (1,676 239.5) * 0.05 = 95.8. If you run the code, you will see that the amount of feces by the 8th hour goes up to 2,155.5 (when it should be 95.8 239.5 = 335.3 after a cleaning event already happened). This is my code:
#Constants
totalfeceshour <- 239.5
H <- 1-0.95
#Variables
hour<- 1:24
cleanbin <- c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 1) #3 scrapes at hours 8, 16 and
#24. Each value represent an
#hour of the day.
noclean <- c(Reduce(" ", c(totalfeceshour, rep(totalfeceshour,
23)), accumulate = TRUE)) #Feces in the environment
clean <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
#Matrix
mtrx <- matrix(data=c(hour, cleanbin, noclean, clean), ncol=4)
colnames(mtrx) <- c("hour", "cleanbin", "noclean", "clean")
#Loop
for(i in 1:length(mtrx[, 2])){
mtrx[, 4][i] <- ifelse(mtrx[, 2][i] == 0, mtrx[, 3][i], mtrx[,3]
[i]*H)
}
mtrx
If you run the code, you could see that cleaning only reduced the amount of feces in cells in which cleanbin==1
, but this information is not carried on to to subsequent cells. I've tried using nested "for" loops, "shift" function ("data.table" package), and other tools but haven't been able to solve this yet. I was able to create an Excel sheet that does the trick, but I am looking to achieve the same in R. I would deeply appreciate your assistance in coding this.
CodePudding user response:
Here's a tidyverse(-ish) solution. The most difficult part of the solution is that you can't use built-in solutions like group_map
or ifelse
as they stand because all of these options work atomically - that is, they operate on a column as whole, whereas you need to work iteratively, building up a solution row by row.
Because my workflow is different to yours, I've renamed some of your variables and dropped others.
First, create a data frame containing the basic information that defines the problem.
library(tidyverse)
d <- tibble(
Hour=1:24,
FecesProduced=239.5,
CleaningEvent=Hour %in% c(8, 16, 24),
CumulativeFeces=ifelse(Hour==1, FecesProduced, NA)
)
d
# A tibble: 24 × 4
Hour FecesProduced CleaningEvent CumulativeFeces
<int> <dbl> <lgl> <dbl>
1 1 240. FALSE 240.
2 2 240. FALSE NA
3 3 240. FALSE NA
4 4 240. FALSE NA
5 5 240. FALSE NA
6 6 240. FALSE NA
7 7 240. FALSE NA
8 8 240. TRUE NA
9 9 240. FALSE NA
10 10 240. FALSE NA
# … with 14 more rows
Now step through rows 2 to 24, populating CumulativeFeces
as required.
for(row in 2:nrow(d)) {
d <- d %>% mutate(
CumulativeFeces=ifelse(
row_number() == row,
ifelse(
CleaningEvent,
(FecesProduced lag(CumulativeFeces)) * 0.05,
FecesProduced lag(CumulativeFeces)
),
CumulativeFeces
)
)
}
Now override the default behaviour when printing tibbles to ensure the accuracy of the solution is demonstrated.
d %>% mutate(CumulativeFeces=num(CumulativeFeces, digits=2))
# A tibble: 24 × 4
Hour FecesProduced CleaningEvent CumulativeFeces
<int> <dbl> <lgl> <num:.2!>
1 1 240. FALSE 239.50
2 2 240. FALSE 479.00
3 3 240. FALSE 718.50
4 4 240. FALSE 958.00
5 5 240. FALSE 1197.50
6 6 240. FALSE 1437.00
7 7 240. FALSE 1676.50
8 8 240. TRUE 95.80
9 9 240. FALSE 335.30
10 10 240. FALSE 574.80
# … with 14 more rows