I am using the code below to produce following result in Python and I want equivalent for this code on R.
here N
is the column of dataframe data
. CN
column is calculated from values of column N
with a specific pattern and it gives me following result in python.
--- ----
| N | CN |
--- ----
| 0 | 0 |
| 1 | 1 |
| 1 | 1 |
| 2 | 2 |
| 2 | 2 |
| 0 | 3 |
| 0 | 3 |
| 1 | 4 |
| 1 | 4 |
| 1 | 4 |
| 2 | 5 |
| 2 | 5 |
| 3 | 6 |
| 4 | 7 |
| 0 | 8 |
| 1 | 9 |
| 2 | 10 |
--- ----
a short overview of my code is
data = pd.read_table(filename,skiprows=15,decimal=',', sep='\t',header=None,names=["Date ","Heure ","temps (s) ","X","Z"," LVDT V(mm) " ,"Force normale (N) ","FT","FN(N) ","TS"," NS(kPa) ","V (mm/min)","Vitesse normale (mm/min)","e (kPa)","k (kPa/mm) " ,"N " ,"Nb cycles normal" ,"Cycles " ,"Etat normal" ,"k imposÈ (kPa/mm)"])
data.columns = [col.strip() for col in data.columns.tolist()]
N = data[data.keys()[15]]
N = np.array(N)
data["CN"] = (data.N.shift().bfill() != data.N).astype(int).cumsum()
an example of data.head() is here
------- ------------- ------------ ----------- ---------- ---------- ------------ ------------------- ----------- ------------- ----------- ------------ ------------ -------------------------- ------------ ------------ ----- ------------------ -------- ------------- ------------------- ----
| Index | Date | Heure | temps (s) | X | Z(mm) | LVDT V(mm) | Force normale (N) | FT | FN(N) | FT (kPa) | NS(kPa) | V (mm/min) | Vitesse normale (mm/min) | e (kPa) | k (kPa/mm) | N | Nb cycles normal | Cycles | Etat normal | k imposÈ (kPa/mm) | CN |
------- ------------- ------------ ----------- ---------- ---------- ------------ ------------------- ----------- ------------- ----------- ------------ ------------ -------------------------- ------------ ------------ ----- ------------------ -------- ------------- ------------------- ----
| 184 | 01/02/2022 | 12:36:52 | 402.163 | 6.910243 | 1.204797 | 0.001101 | 299.783665 | 31.494351 | 1428.988908 | 11.188704 | 505.825016 | 0.1 | 2.0 | 512.438828 | 50.918786 | 0.0 | 0.0 | Sort | Monte | 0.0 | 0 |
| 185 | 01/02/2022 | 12:36:54 | 404.288 | 6.907822 | 1.205647 | 4.9e-05 | 296.072718 | 31.162313 | 1404.195316 | 11.028167 | 494.97955 | 0.1 | -2.0 | 500.084986 | 49.685639 | 0.0 | 0.0 | Sort | Descend | 0.0 | 0 |
| 186 | 01/02/2022 | 12:36:56 | 406.536 | 6.907906 | 1.204194 | -0.000214 | 300.231424 | 31.586401 | 1429.123486 | 11.21895 | 505.750815 | 0.1 | 2.0 | 512.370164 | 50.914002 | 0.0 | 0.0 | Sort | Monte | 0.0 | 0 |
| 187 | 01/02/2022 | 12:36:58 | 408.627 | 6.910751 | 1.204293 | -0.000608 | 300.188686 | 31.754064 | 1428.979519 | 11.244542 | 505.624564 | 0.1 | 2.0 | 512.309254 | 50.906544 | 0.0 | 0.0 | Sort | Monte | 0.0 | 0 |
| 188 | 01/02/2022 | 12:37:00 | 410.679 | 6.907805 | 1.205854 | -0.000181 | 296.358074 | 31.563389 | 1415.224427 | 11.129375 | 502.464948 | 0.1 | 2.0 | 510.702313 | 50.742104 | 0.0 | 0.0 | Sort | Monte | 0.0 | 0 |
------- ------------- ------------ ----------- ---------- ---------- ------------ ------------------- ----------- ------------- ----------- ------------ ------------ -------------------------- ------------ ------------ ----- ------------------ -------- ------------- ------------------- ----
CodePudding user response:
A one line cumsum
trick solves it.
cumsum(c(0L, diff(df1$N) != 0))
#> [1] 0 1 1 2 2 3 3 4 4 4 5 5 6 7 8 9 10
all.equal(
cumsum(c(0L, diff(df1$N) != 0)),
df1$CN
)
#> [1] TRUE
Created on 2022-02-14 by the reprex package (v2.0.1)
Data
x <- "
--- ----
| N | CN |
--- ----
| 0 | 0 |
| 1 | 1 |
| 1 | 1 |
| 2 | 2 |
| 2 | 2 |
| 0 | 3 |
| 0 | 3 |
| 1 | 4 |
| 1 | 4 |
| 1 | 4 |
| 2 | 5 |
| 2 | 5 |
| 3 | 6 |
| 4 | 7 |
| 0 | 8 |
| 1 | 9 |
| 2 | 10 |
--- ---- "
df1 <- read.table(textConnection(x), header = TRUE, sep = "|", comment.char = " ")[2:3]
Created on 2022-02-14 by the reprex package (v2.0.1)