Pandas selecting ranges/features of data in a dataframe

Time:10-12

I have a time series dataframe containing features that I want to remove (by applying a linear function). I already have other dataframes that mark the start and the end of each feature.

The head of my Range_Start dataframe would be like this:

Sample
0 57
1 350
2 642
3 926
4 1211

And a Range_End dataframe:

Sample
0 97
1 390
2 682
3 966
4 1251

So in dataframe A, I would like to select rows 57 to 97, 350 to 390, and so on, and apply a linear function to the selected rows. I am having a hard time figuring out how to select these ranges of data. What is the best way to do this? Thank you very much.

CodePudding user response:

dropping the rows:

You could generate a flat list of the index labels covered by the ranges and pass it to drop:

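from itertools import chain
# df1 = Range_Start, df2 = Range_End; b+1 makes each range inclusive of its end.
# The result goes into a new name (out) so that df2 is not overwritten.
out = df.drop(list(chain(*(list(range(a,b+1)) for a,b in zip(df1.Sample, df2.Sample)))))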

keeping the rows:

from itertools import chain
df.loc[list(chain(*(list(range(a,b+1)) for a,b in zip(df1.Sample, df2.Sample))))]
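As an alternative to building a list for drop or loc, the same selection can be sketched with a boolean mask. This is only a sketch: the names range_start, range_end, labels and in_range are made up for illustration, the boundary values are the ones shown in the question, and df is a stand-in for dataframe A indexed by sample number.

```python
import numpy as np
import pandas as pd

# Range boundaries copied from the question (inclusive on both ends).
range_start = pd.DataFrame({"Sample": [57, 350, 642, 926, 1211]})
range_end = pd.DataFrame({"Sample": [97, 390, 682, 966, 1251]})

# Hypothetical stand-in for dataframe A, indexed by sample number.
df = pd.DataFrame({"col": range(50, 1255)}, index=range(50, 1255))

# Collect every index label covered by the ranges into one flat array.
labels = np.concatenate(
    [np.arange(a, b + 1) for a, b in zip(range_start["Sample"], range_end["Sample"])]
)

# Boolean mask: True for rows whose index label falls inside any range.
in_range = df.index.isin(labels)

selected = df.loc[in_range]    # rows inside the ranges
remaining = df.loc[~in_range]  # rows outside the ranges
```

One mask gives both the kept and the dropped rows, and isin simply ignores labels that are not in the index, whereas drop (with its default errors='raise') and loc with a list raise a KeyError for missing labels.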

example (dropping the rows):

import pandas as pd
# df1 and df2 hold the Range_Start and Range_End frames from the question
df = pd.DataFrame({'col': range(50, 1255)}, index=range(50, 1255))
out = df.drop(list(chain(*(list(range(a,b+1)) for a,b in zip(df1.Sample, df2.Sample)))))

output:

       col
50      50
51      51
52      52
53      53
54      54
55      55
56      56
98      98
99      99
100    100
...    ...
1209  1209
1210  1210
1252  1252
1253  1253
1254  1254

[1000 rows x 1 columns]
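If the goal is to replace the features rather than drop the rows, one possible reading of "applying a linear function" is to draw a straight line across each range. The sketch below assumes that reading: it blanks each range with NaN and lets Series.interpolate fill the gap linearly. The signal (a ramp with a +10 bump in each range) and the names baseline and cleaned are invented for the demonstration.

```python
import numpy as np
import pandas as pd

# Range boundaries copied from the question (inclusive on both ends).
range_start = pd.DataFrame({"Sample": [57, 350, 642, 926, 1211]})
range_end = pd.DataFrame({"Sample": [97, 390, 682, 966, 1251]})

# Hypothetical signal: a linear ramp with an artificial bump in each range.
idx = np.arange(50, 1255)
baseline = 0.5 * idx
df = pd.DataFrame({"col": baseline.copy()}, index=idx)
for a, b in zip(range_start["Sample"], range_end["Sample"]):
    df.loc[a:b, "col"] += 10.0  # .loc label slices include both endpoints

# Blank out each range, then fill the gaps with straight lines.
cleaned = df.copy()
for a, b in zip(range_start["Sample"], range_end["Sample"]):
    cleaned.loc[a:b, "col"] = np.nan
cleaned["col"] = cleaned["col"].interpolate(method="linear")
```

method="linear" treats the rows as equally spaced, which matches an evenly sampled series like this one. It also needs valid values on both sides of every range, so a range touching the very first row would stay NaN.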