I have a DataFrame with two columns: trialType and diameter. I need to find every row where trialType == 'start', and then get the diameter for the 2500 rows before and after each of those rows, inclusive. I tried the following:
idx = df.loc(df[df['trialType']=='start'])
df.iloc[idx - 2500 : idx + 2500]
My goal is to have a dataframe with only those relevant rows.
CodePudding user response:
Interesting.
What about:
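# Index labels of the rows where trialType == 'start'
idx = df.loc[lambda x: x['trialType']=='start'].index
rows = df.loc[idx]
# Values from 2500 rows earlier / later, relabelled with the 'start' index
a = df.shift(2500).loc[idx]
b = df.shift(-2500).loc[idx]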
You can then combine them however you like, for example:
pd.concat([a,rows,b])
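To see what the shift-based lookup returns, here is a small sketch on made-up data, with an offset of 2 standing in for 2500. The toy frame and the name `n` are assumptions for illustration only. Note that each lookup yields a single row per 'start' (the one exactly `n` rows away), labelled with the index of the 'start' row, and that `shift` turns integer columns into floats because it introduces NaN.

```python
import pandas as pd

# Hypothetical toy data; n = 2 stands in for the 2500 in the question.
df = pd.DataFrame({
    "trialType": ["a", "b", "start", "c", "d", "e", "start", "f", "g"],
    "diameter": [10, 11, 12, 13, 14, 15, 16, 17, 18],
})
n = 2

# Index labels of the 'start' rows (here 2 and 6).
idx = df.index[df["trialType"] == "start"]

# shift(n) moves every row down by n, so label 2 now holds row 0's values.
before = df.shift(n).loc[idx]
# shift(-n) moves every row up by n, so label 2 now holds row 4's values.
after = df.shift(-n).loc[idx]

print(before["diameter"].tolist())
print(after["diameter"].tolist())
```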
You could also do:
idx = df.loc[lambda x: x['trialType']=='start'].index
df.loc[lambda x: (x.index-2500).isin(idx)
               | x.index.isin(idx)
               | (x.index+2500).isin(idx)]
Note that you have to modify the code above if your index is not sequential (0, 1, 2, 3, etc.).
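If the goal is every row inside the window rather than only the rows at the exact offsets, one possible sketch is to build a positional boolean mask. This is not the approach from the answer above; the helper name `rows_around` and the toy data are assumptions. Because it works on positions rather than index labels, it should not depend on the index being sequential, and overlapping windows are merged so no row appears twice.

```python
import numpy as np
import pandas as pd

def rows_around(df, mask, n):
    """Return rows within n positions of any True in mask, inclusive.

    Hypothetical helper, not part of pandas.
    """
    positions = np.flatnonzero(mask.to_numpy())
    keep = np.zeros(len(df), dtype=bool)
    for p in positions:
        # Clamp the lower bound so a match near the top cannot wrap around.
        keep[max(p - n, 0): p + n + 1] = True
    return df[keep]

# Hypothetical toy data; n = 2 stands in for 2500.
df = pd.DataFrame({
    "trialType": ["x", "x", "x", "start", "x", "x",
                  "x", "x", "x", "x", "start", "x"],
    "diameter": list(range(12)),
})
result = rows_around(df, df["trialType"] == "start", 2)
print(result)
```

For the question as asked, the call would be `rows_around(df, df['trialType'] == 'start', 2500)`.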
CodePudding user response:
I would only change the first line to:
idx = df.index[df['trialType']=="start"].tolist()[0]
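This returns the first index where the condition is true.
The second line should then work as written: df.iloc[idx - 2500 : idx + 2500]. Note that the end of an iloc slice is exclusive, so use idx + 2501 if the row at idx + 2500 should be included.
You can run this code to try it: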
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'Z', 'C', 'C', 'D'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})
df
idx = df.index[df['team']=="B"].tolist()[0]
df.iloc[idx - 2 : idx + 2]
Output:
  team  points  rebounds
1    A       7         8
2    A       7        10
3    B       9         6
4    Z      12         6
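The snippet above handles only the first match, and a negative start (when the match is fewer than 2 rows from the top) would make iloc return an empty frame. A possible extension, sketched here under the assumption that you want one window per match, loops over every matching position, clamps the lower bound at 0, and makes the upper bound inclusive. The names `positions`, `windows`, `match_pos`, and `row` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'Z', 'C', 'C', 'D'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})
n = 2  # stands in for 2500

# Positional locations of every match, not just the first one.
positions = [i for i, hit in enumerate(df['team'] == 'C') if hit]

# One inclusive window per match; max(..., 0) avoids a negative slice start.
windows = {p: df.iloc[max(p - n, 0): p + n + 1] for p in positions}

# Stack the windows; the outer index level records which match each row belongs to.
result = pd.concat(windows, names=['match_pos', 'row'])
print(result)
```

Rows that fall inside more than one window appear once per window here, which keeps each trial separate at the cost of duplicates.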