Drop interval of rows in a dataframe based on the value of an object

Time:04-08

I am trying to drop intervals of rows in my DataFrames, from the maximal value (exclusive) to the end of the column. Here is an example of one of the columns of my df (dflist['time']):

0     0.000000
1     0.021528
2     0.042135
3     0.062925
4     0.083498
        ...   
88    1.796302
89    1.816918
90    1.837118
91    1.857405
92    1.878976
Name: time, Length: 93, dtype: float64

I have tried to use .iloc and .drop in conjunction with .index to achieve this, but without any success so far:

for nested_dict in dict_all_raw.values():
    for dflist in nested_dict.values():
        v_max = dflist['velocity'].max()
        v_max_idx = dflist['velocity'].index[dflist['velocity'] == v_max]
        dflist['time'] = dflist['time'].iloc[0:[v_max_idx]]

I have also tried several variations, like converting 'v_max_idx' to a list or an int to change the type passed to .iloc, as that seems to be the problem:

TypeError: cannot do positional indexing on RangeIndex with these indexers [[Int64Index([15], dtype='int64')]] of type list

I don't know why I am not able to do this and it is quite frustrating, as it seems to be a pretty basic operation.

Any help would therefore be greatly appreciated!

##EDIT REGARDING THE dropna() PROBLEM

I tried with .notna():

for nested_dict in dict_all_raw.values():
    for dflist in nested_dict.values():
        v_max = dflist['velocity'].max()
        v_max_idx = dflist['velocity'].index[dflist['velocity'] == v_max]
        dflist['velocity'] = dflist['velocity'].iloc[0:list(v_max_idx)[0]]
        dflist['velocity'] = dflist['velocity'][dflist['velocity'].notna()]
        dflist['time'] = dflist['time'].iloc[0:list(v_max_idx)[0]]
        dflist['time'] = dflist['time'][dflist['time'].notna()]

and tried with dropna():

for nested_dict in dict_all_raw.values():
    for dflist in nested_dict.values():
        v_max = dflist['velocity'].max()
        v_max_idx = dflist['velocity'].index[dflist['velocity'] == v_max]
        dflist['velocity'] = dflist['velocity'].iloc[0:list(v_max_idx)[0]].dropna()
        dflist['time'] = dflist['time'].iloc[0:list(v_max_idx)[0]].dropna()

There are no error messages, but it has no effect; the NaN rows are still there:

19  0.385243  1.272031
20  0.405416  1.329072
21  0.425477  1.352059
22  0.445642  1.349657
23  0.465755  1.378407
24       NaN       NaN
25       NaN       NaN
26       NaN       NaN
27       NaN       NaN
28       NaN       NaN
29       NaN       NaN
30       NaN       NaN
31       NaN       NaN
32       NaN       NaN
33       NaN       NaN
34       NaN       NaN
35       NaN       NaN
36       NaN       NaN

CodePudding user response:

In your example, indexing the index with a boolean mask returns a pandas.Int64Index, not an integer.

pandas.DataFrame.iloc is an indexer that accepts integer positions, for example a slice with int bounds such as 1:7.

In your code, neither v_max_idx (a pandas.Index object) nor [v_max_idx] (a list containing one) is a valid slice bound for iloc.
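To make the type mismatch concrete, here is a minimal sketch. The small DataFrame is made-up data standing in for one of the frames in the question, and the names df, is_index, stop and head are only illustrative:

```python
import pandas as pd

# Hypothetical data standing in for one DataFrame from the question.
df = pd.DataFrame({
    "time": [0.00, 0.02, 0.04, 0.06, 0.08],
    "velocity": [0.5, 1.1, 1.4, 1.2, 0.9],
})

v_max = df["velocity"].max()
v_max_idx = df["velocity"].index[df["velocity"] == v_max]

# Indexing an Index with a boolean mask gives back another Index,
# even when only one row matches.
is_index = isinstance(v_max_idx, pd.Index)

# iloc needs a plain integer as the slice bound, so take the first
# element of the Index and convert it to int.
stop = int(v_max_idx[0])
head = df["time"].iloc[0:stop]
```

Note that the label taken from v_max_idx only works as a position here because the frame has the default RangeIndex, as the error message in the question suggests yours does.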

You can use list(v_max_idx) to convert the pandas.Index object to a list, then use [0] to access the first value, like

dflist['time'] = dflist['time'].iloc[0:list(v_max_idx)[0]]
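
Regarding the NaN rows in the edit: assigning a shortened Series back to a column of the full-length DataFrame aligns it on the index, so the missing rows are filled with NaN again, and calling dropna() on the Series before the assignment cannot change that. One possible way around it is to truncate the whole DataFrame instead of individual columns. The following is only a sketch under assumptions: truncate_after_max and its keep_max flag are hypothetical names introduced here, the data is made up, and it assumes the first occurrence of the maximum is the one you want.

```python
import pandas as pd


def truncate_after_max(df, column="velocity", keep_max=True):
    """Return a copy of df without the rows that follow the maximum of `column`.

    Hypothetical helper: keep_max=True keeps the row holding the maximum,
    keep_max=False drops it as well.
    """
    # Series.argmax returns the integer position of the first maximum.
    pos = int(df[column].argmax())
    stop = pos + 1 if keep_max else pos
    # Slicing the whole frame keeps all columns the same length,
    # so no NaN padding is introduced.
    return df.iloc[:stop].copy()


# Made-up data standing in for one DataFrame from the question.
df = pd.DataFrame({
    "time": [0.00, 0.02, 0.04, 0.06, 0.08],
    "velocity": [0.5, 1.1, 1.4, 1.2, 0.9],
})

trimmed = truncate_after_max(df)
trimmed_exclusive = truncate_after_max(df, keep_max=False)
```

Because this returns a new DataFrame rather than modifying the old one, the result has to be stored back in the dictionary inside your loop, for example by iterating over nested_dict.items() and assigning nested_dict[key] = truncate_after_max(dflist).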