How to verify that each element in a pandas column increases by exactly 1 from row to row

Let's say I have the following pandas DataFrame:

data

      series    time_idx    value
0        0          0      -0.000000
1        0          1       0.018844
2        0          2       0.028694
3        0          3       0.050784
4        0          4       0.067037
...      ...        ...     ...
3995     9          395     0.973978
3996     9          396     0.944002
3997     9          397     1.001089
3998     9          398     1.132001
3999     9          399     1.169244
4000 rows × 3 columns

I want to test whether, for each series (0..9), the time indexes increase by exactly 1 from row to row and, if not, where the difference occurs.

I thought about sorting the dataframe by series and by time_idx and then comparing time_idx to the index mod 400, but it's not a nice solution. Any suggestions?

Thanks

CodePudding user response：

The following is based on what I understand from your question, so see if it answers it. I use the string 'True' instead of the Boolean True because mixing the Boolean with the numeric differences in the same column would turn it into 1.0.

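import numpy as np

# True where the row belongs to the same series as the previous row
df['IncOne'] = (df.series == df.series.shift())
df['IncOne'] = (
    np.where(df.IncOne,
        # 'True' if time_idx grew by exactly 1, otherwise the actual difference
        np.where(df.time_idx.eq(df.time_idx.shift() + 1),
                 'True', df.time_idx - df.time_idx.shift()),
    ''))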

	series	time_idx	value	IncOne
0	0	0	0
1	0	1	0.018844	True
2	0	2	0.028694	True
3	0	3	0.050784	True
4	0	4	0.067037	True
5	0	6	0	2.0
6	0	7	0.018844	True
7	0	8	0.028694	True
8	0	9	0.050784	True
9	0	12	0.067037	3.0
10	0	13	1	True
11	9	395	0.973978
12	9	396	0.944002	True
13	9	397	1.00109	True
14	9	398	1.132	True
15	9	399	1.16924	True
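
If the mixed string/number column is awkward to work with later, one possible variant (only a sketch, not part of the answer above) keeps the step size numeric and puts the pass/fail flag in a separate Boolean column. The sample data and the column names step and ok are made up for illustration, and the rows are assumed to be already ordered by series and time_idx.

```python
import pandas as pd

# Hypothetical sample data: series 0 jumps from 4 to 6, series 9 has no gaps.
df = pd.DataFrame({
    "series":   [0, 0, 0, 0, 0, 0, 9, 9, 9],
    "time_idx": [0, 1, 2, 3, 4, 6, 395, 396, 397],
})

# True where the row belongs to the same series as the previous row.
same_series = df["series"].eq(df["series"].shift())

# Difference to the previous row, kept only inside a series
# (NaN on the first row of each series).
df["step"] = (df["time_idx"] - df["time_idx"].shift()).where(same_series)

# A row is fine if it starts a new series or if the step is exactly 1.
df["ok"] = df["step"].eq(1) | ~same_series

print(df)
```

Filtering with df[~df["ok"]] then shows only the offending rows together with the size of each jump.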

CodePudding user response：

Assuming that the dataframe is named df, you can try this:

df["diff"] = df.groupby(by="series")["time_idx"].diff().fillna(1) != 1

It creates a new Boolean column "diff". A True value indicates that the difference between the time_idx value in the current row and the one in the preceding row is not equal to 1. Because the difference is computed within each series, only consecutive rows of the same series can produce True. The first row of each series has no predecessor, so fillna(1) marks it False.
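
To also see where the difference is and how large it is, the same groupby/diff idea can be extended roughly as follows. This is a minimal sketch on made-up data. The names gap, bad and per_series_ok are hypothetical, and the rows are assumed to be sorted by time_idx within each series (otherwise sort with df.sort_values(["series", "time_idx"]) first).

```python
import pandas as pd

# Hypothetical sample data: series 0 skips time_idx 2, series 1 is consecutive.
df = pd.DataFrame({
    "series":   [0, 0, 0, 0, 1, 1, 1],
    "time_idx": [0, 1, 3, 4, 0, 1, 2],
})

# Row-to-row difference within each series; NaN on each series' first row.
gap = df.groupby("series")["time_idx"].diff()

# Rows where the step is not exactly 1, with the size of the step attached.
bad = df.loc[gap.notna() & gap.ne(1)].assign(gap=gap)
print(bad)

# One Boolean per series: True if every step in that series equals 1.
per_series_ok = gap.fillna(1).eq(1).groupby(df["series"]).all()
print(per_series_ok)
```

Here bad lists each offending row with its series, time_idx and gap, while per_series_ok gives a quick yes/no answer per series.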