I have an R data frame in the following format:
-------- --------------- -------------------- --------
| time | Stress_ratio | shear_displacement | CX |
-------- --------------- -------------------- --------
| <dbl> | <dbl> | <dbl> | <dbl> |
| 50.1 | -0.224 | 4.9 | 0 |
| 50.2 | -0.219 | 4.98 | 0.0100 |
| . | . | . | . |
| . | . | . | . |
| 249.3 | -0.217 | 4.97 | 0.0200 |
| 250.4 | -0.214 | 4.96 | 0.0300 |
| 251.1 | -0.222 | 4.91 | 0.06 |
| 252.1 | -0.222 | 4.91 | 0.06 |
| 253.3 | -0.222 | 4.91 | 0.06 |
| 254.5 | -0.222 | 4.91 | 0.06 |
| 256.8 | -0.222 | 4.91 | 0.06 |
| . | . | . | . |
| . | . | . | . |
| 500.1 | -0.22 | 4.91 | 0.6 |
| 501.4 | -0.22 | 4.91 | 0.6 |
| 503.1 | -0.22 | 4.91 | 0.6 |
-------- --------------- -------------------- --------
and I want a new column that holds repeating values based on the difference between a range of values in the column time. The range should be 250 for the column time. For example, in all the rows of new_column I should get the number 1 when df$time[1] and df$time[1]*4.98 is 250. Similarly, this number 1 should change to 2 when the next chunk of 250 starts. So the new data frame should look like this:
-------- --------------- -------------------- -------- ------------
| time | Stress_ratio | shear_displacement | CX | new_column |
-------- --------------- -------------------- -------- ------------
| <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| 50.1 | -0.224 | 4.9 | 0 | 1 |
| 50.2 | -0.219 | 4.98 | 0.0100 | 1 |
| . | . | . | . | 1 |
| . | . | . | . | 1 |
| 249.3 | -0.217 | 4.97 | 0.0200 | 1 |
| 250.4 | -0.214 | 4.96 | 0.0300 | 2 |
| 251.1 | -0.222 | 4.91 | 0.06 | 2 |
| 252.1 | -0.222 | 4.91 | 0.06 | 2 |
| 253.3 | -0.222 | 4.91 | 0.06 | 2 |
| 254.5 | -0.222 | 4.91 | 0.06 | 2 |
| 256.8 | -0.222 | 4.91 | 0.06 | 2 |
| . | . | . | . | . |
| . | . | . | . | . |
| 499.1 | -0.22 | 4.91 | 0.6 | 2 |
| 501.4 | -0.22 | 4.91 | 0.6 | 3 |
| 503.1 | -0.22 | 4.91 | 0.6 | 3 |
-------- --------------- -------------------- -------- ------------
CodePudding user response:
If I understand what you're trying to do, a base R solution could be:
df$new_column <- df$time %/% 250 + 1
The %/% operator is integer division (sort of the complement of the modulus operator) and tells you how many whole copies of 250 fit into your number; we add 1 to get the value you want.
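As a quick check, here is a minimal sketch that applies the same expression to a handful of time values copied from the question's table (the vector itself is only an illustration, not your full data):

```r
# Sample times taken from the rows shown in the question
time <- c(50.1, 50.2, 249.3, 250.4, 251.1, 256.8, 499.1, 501.4, 503.1)

# Integer division counts how many whole 250s fit into each time;
# adding 1 makes the first chunk (0 up to just under 250) number 1
new_column <- time %/% 250 + 1

data.frame(time, new_column)
```

The chunk number changes at 250.4 and again at 501.4, which matches the desired output in the question.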
The tidyverse version:
library(dplyr)
df <- df %>%
  mutate(new_column = time %/% 250 + 1)
CodePudding user response:
library(data.table)
setDT(df)[, new_column := rleid(time %/% 250)][]
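One thing worth noting about the rleid() approach: it numbers consecutive runs of equal values starting from 1, so it should agree with time %/% 250 + 1 only when the data is sorted by time, the first row falls in the first chunk, and no chunk is skipped. The sketch below illustrates the difference in base R, using rle() as a stand-in for rleid() and a made-up time vector that starts after 250 (both are assumptions for illustration only):

```r
# Hypothetical times that start in the second 250-wide chunk
time <- c(300.5, 310.2, 520.0, 530.7, 760.1)

# Chunk number from integer division: tied to the absolute time value
by_division <- time %/% 250 + 1

# Run-length style id: starts at 1 and increases by 1 at each change,
# which mimics what rleid() does on the same input
runs <- rle(time %/% 250)
by_run <- rep(seq_along(runs$lengths), runs$lengths)

data.frame(time, by_division, by_run)
```

Here by_division starts at 2 while by_run starts at 1, so pick whichever numbering you actually need.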