I have a time series dataframe and there are features of it that I want to delete (by applying a linear function). I already have other dataframes that indicate the start and the end of the features.
The head of my Range_Start dataframe would be like this:
| | Sample |
|---|---|
| 0 | 57 |
| 1 | 350 |
| 2 | 642 |
| 3 | 926 |
| 4 | 1211 |
And a Range_End dataframe:
| | Sample |
|---|---|
| 0 | 97 |
| 1 | 390 |
| 2 | 682 |
| 3 | 966 |
| 4 | 1251 |
So in dataframe A, I would like to select rows 57 to 97, 350 to 390, and so on, and apply a linear function to the selected rows. I am having a hard time figuring out how to select these ranges of data. What is the best way to do this? Thank you very much.
CodePudding user response:
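dropping the rows:
You could generate a flat list of the index labels to drop and pass it to drop:
from itertools import chain
# df1 is assumed to be Range_Start and df2 Range_End; b+1 makes each end point inclusive
out = df.drop(list(chain(*(list(range(a,b+1)) for a,b in zip(df1.Sample, df2.Sample)))))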
keeping the rows:
from itertools import chain
df.loc[list(chain(*(list(range(a,b+1)) for a,b in zip(df1.Sample, df2.Sample))))]
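The two snippets above rely on df, df1 and df2 already existing. A minimal self-contained sketch follows. It assumes that the index labels of the time series are the sample numbers, and it uses the hypothetical names range_start, range_end and labels, which are not from the question.

```python
from itertools import chain

import pandas as pd

# Assumed stand-ins for Range_Start and Range_End (values from the question).
range_start = pd.DataFrame({"Sample": [57, 350, 642, 926, 1211]})
range_end = pd.DataFrame({"Sample": [97, 390, 682, 966, 1251]})

# Hypothetical time series whose index labels match the sample numbers.
df = pd.DataFrame({"col": range(50, 1255)}, index=range(50, 1255))

# Flat list of every index label inside a range, end points included.
labels = list(chain.from_iterable(
    range(a, b + 1)
    for a, b in zip(range_start["Sample"], range_end["Sample"])
))

kept = df.loc[labels]      # only the rows inside the ranges
dropped = df.drop(labels)  # every row outside the ranges
```

Building the list once and reusing it for both .loc and .drop avoids repeating the long expression. Both calls raise a KeyError if any label is missing from the index, so this sketch only applies when every sample number in the ranges is present.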
example (dropping the rows):
import pandas as pd
df = pd.DataFrame({'col': range(50, 1255)}, index=range(50, 1255))
out = df.drop(list(chain(*(list(range(a,b+1)) for a,b in zip(df1.Sample, df2.Sample)))))
output:
col
50 50
51 51
52 52
53 53
54 54
55 55
56 56
98 98
99 99
100 100
... ...
1209 1209
1210 1210
1252 1252
1253 1253
1254 1254
[1000 rows x 1 columns]
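The question also asks about applying a linear function to the selected rows rather than removing them. One possible reading, which is an assumption here, is that each range should be replaced by a straight line joining the values just outside it. A rough sketch of that idea with a boolean mask and Series.interpolate, using made-up data (a linear trend with an artificial bump in every range) and the hypothetical names signal, mask and cleaned:

```python
import numpy as np
import pandas as pd

# Range limits taken from the question.
starts = [57, 350, 642, 926, 1211]
ends = [97, 390, 682, 966, 1251]

# Made-up signal: a linear trend plus a bump inside every range.
idx = np.arange(50, 1255)
signal = pd.Series(0.01 * idx, index=idx, name="col")
for a, b in zip(starts, ends):
    signal.loc[a:b] += 5.0  # label-based slice, both ends included

# Boolean mask that is True for every row inside any range.
mask = np.zeros(len(signal), dtype=bool)
for a, b in zip(starts, ends):
    mask |= (signal.index >= a) & (signal.index <= b)

# Blank out the masked rows, then fill them with a straight line
# between the nearest unmasked neighbours on each side.
cleaned = signal.mask(mask).interpolate(method="linear")
```

Note that method="linear" treats the rows as equally spaced and ignores the index values, which is fine for the evenly spaced index assumed here. A range that touches the very start of the data would have no left neighbour and would stay NaN, so that case would need separate handling.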