Home > Back-end >  Defining Cycles based on Thresholds
Defining Cycles based on Thresholds

Time:07-24

I have a data set containing nonuniform 30,000 rows that consist of three variables: Time, Chamber and temperature. The two chambers heat and cool at different rates and I need to assign a cycle count every time a positive slope above 75 degrees. I am trying to convert the x-axis from time, to cycle of heating and cooling. Expected output

CodePudding user response:

A cycle change can be identified by the following algorithm:

  1. Sort your data by Time
  2. Find rows for which Temp > 75 returns TRUE
  3. Within that subset of rows, find rows where Temp <= 75 returns TRUE for the previous row.

Once we have identified cycle changes, we can obtain a running tally of cycles using cumsum().

I've implemented this in the code below. Since there is ambiguity about what you want to do with ChamberID, I've assumed you want to track cycles separating for each of the subgroups defined by the unique values of ChamberID.

Create Data

# Create your original data, stored in a data.frame
df <- data.frame(Time=c(100674751, 100674851, 100674951, 100675051, 100675151, 100675251, 100675351, 100675451, 100675551, 100675651, 100675751, 100675851, 100675951, 100676051, 100676151, 100676251, 100676351, 100676451, 100676551, 100676651, 100676751, 100676851, 100676951, 100677051, 100677151, 100677251, 100677351, 100677451, 100677551, 100677651, 100677751, 100677851, 100677951, 100678051, 100678151, 100678251, 100678351, 100678451, 100678551, 100678651, 100678751, 100678851, 100678951, 100679051, 100679151, 100679251, 100679351, 100679451, 100679551, 100679651, 100679751, 100679851, 100679951, 100680051, 100680151, 100680251, 100680351, 100680451, 100680551, 100680651, 100680751, 100680851, 100680951, 100681051, 100681151, 100681251, 100681351, 100681451, 100681551, 100681651, 100681751, 100681851, 100681951, 100682051, 100682151, 100682251, 100682351, 100682451, 100682551, 100682651, 100682751, 100682851, 100682951, 100683051, 100683151, 100683251, 100683351, 100683451, 100683551, 100683651, 100683751, 100683851, 100683951, 100684051, 100684151, 100684251, 100684351, 100684451, 100684551, 100684651, 100684751, 100684851, 100684951, 100685051, 100685151, 100685251, 100685351, 100685451, 100685551, 100685651, 100685751, 100685851, 100685951, 100686051, 100686151, 100686251, 100686351, 100686451, 100686551, 100686651, 100686751, 100686851, 100686951, 100687051, 100687151, 100687251, 100687351, 100687451, 100687551, 100687651, 100687751, 100687851, 100687951, 100688051, 100688151, 100688251, 100688351, 100688451, 100688551, 100688651, 100688751, 100688851, 100688951, 100689051, 100689151, 100689251, 100689351, 100689451, 100689551, 100689651, 100689751, 100689851, 100689951, 100690051, 100690151, 100690251, 100690351, 100690451, 100690551, 100690651, 100690751, 100690851, 100690951, 100691051, 100691151, 100691251, 100691351, 100691451, 100691551, 100691651, 100691751, 100691851, 100691951, 100692051, 100692151, 100692251, 100692351, 100692451, 100692551, 100692651, 100692751, 100692851, 100692951, 100693051, 100693151, 100693251),  
Chamber_ID=c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
Temp=c(56.8, 58.2, 59.7, 59.7, 61.4, 63.2, 63.2, 65.0, 66.8, 66.8, 68.7, 70.5, 70.5, 72.3, 74.1, 74.1, 75.9, 77.6, 77.6, 79.3, 81.0, 81.0, 82.7, 84.4, 84.4, 87.9, 86.6, 86.6, 85.4, 84.1, 84.1, 82.9, 81.8, 81.8, 80.6, 79.5, 79.5, 78.3, 77.2, 77.2, 76.2, 75.1, 75.1, 74.1, 73.0, 73.0, 72.0, 71.1, 71.1, 70.1, 69.1, 69.1, 68.2, 67.2, 67.2, 66.3, 65.4, 65.4, 64.6, 63.7, 63.7, 62.8, 62.0, 62.0, 61.2, 60.4, 60.4, 59.6, 58.8, 58.8, 58.0, 57.3, 57.3, 56.8, 58.2, 58.2, 59.7, 61.4, 61.4, 63.2, 65.0, 65.0, 66.9, 68.7, 68.7, 70.5, 72.3, 72.3, 74.1, 75.9, 75.9, 77.6, 79.3, 79.3, 81.0, 82.7, 82.7, 84.4, 86.0, 86.0, 87.7, 89.3, 86.6, 85.3, 85.3, 84.1, 82.9, 82.9, 81.7, 80.6, 80.6, 79.4, 78.3, 78.3, 77.2, 76.1, 76.1, 75.1, 74.1, 74.1, 73.0, 72.0, 72.0, 71.0, 70.1, 70.1, 69.1, 68.1, 68.1, 67.2, 66.3, 66.3, 65.4, 64.5, 64.5, 63.7, 62.8, 62.8, 62.0, 61.2, 61.2, 60.4, 59.6, 59.6, 58.8, 58.0, 58.0, 57.3, 56.5, 56.5, 56.9, 56.9, 58.2, 59.7, 59.7, 61.5, 63.2, 63.2, 65.0, 66.9, 66.9, 68.7, 70.5, 70.5, 72.3, 74.1, 74.1, 75.9, 77.6, 77.6, 79.3, 81.0, 81.0, 82.7, 84.4, 84.4, 86.0, 87.7, 87.7, 89.3, 90.9, 90.9, 92.5, 94.0, 94.0, 95.6))

data.table Solution

# Convert to data.table
dt <- as.data.table(df)

# Add a 1-period lagged temperature column
dt[order(Time), Temp_lag1 := shift(Temp, 1), by = Chamber_ID]

# Add indicator for whether a cycle change occurs
dt[order(Time), 
   cycle_change := ifelse(Temp > 75 & Temp_lag1 <= 75, 1, 0),
   by = Chamber_ID]

# Add cycle tracker (add 1 so first cycle is 1 not 0)
dt[order(Time), 
   cycle_id := cumsum(cycle_change)   1, 
   by = Chamber_ID]

# Print distribution of cycles across chambers
table(chamber_id = dt$Chamber_ID, cycle_id = dt$cycle_id)
#>           cycle_id
#> chamber_id  1  2  3  4
#>          0  8 37 39  9
#>          1  8 36 39 10

dplyr Solution

result <- df %>%
  group_by(Chamber_ID) %>%
  arrange(Time) %>%
  mutate(
    Temp_lag1 = lag(Temp, 1),
    cycle_change = if_else(Temp > 75 & Temp_lag1 <= 75, 1, 0),
    cycle_id = cumsum(cycle_change)   1
  )

table(result$Chamber_ID, result$cycle_id)
#>    
#>      1  2  3  4
#>   0  8 37 39  9
#>   1  8 36 39 10

According to the above, there are 4 cycles per chamber, with the longest durations being the second and third cycles.

  • Related