I have a pandas dataframe and I am experimenting with
CodePudding user response:
If I understand your question correctly, you want to filter a dataframe (df) using another dataframe (df2). If they share the same index you can do that using .loc[...].
# assuming you have a column called "is_outlier" or something like that
# filters rows in df to observations where df2.is_outlier is True
df.loc[df2.is_outlier]
# filters rows in df to observations where df2.is_outlier is False
df.loc[~df2.is_outlier]
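Here is a small self-contained sketch of that idea. The data, the index labels, and the "power" column are made up for illustration; the only real assumption is that df and df2 share the same index and that df2 has a boolean is_outlier column.

```python
import pandas as pd

# Hypothetical data: df holds measurements, df2 holds one flag per row.
df = pd.DataFrame({"power": [1.0, 2.5, 99.0, 3.1]}, index=["a", "b", "c", "d"])
df2 = pd.DataFrame({"is_outlier": [False, False, True, False]}, index=df.index)

# Rows of df where the flag in df2 is True.
outliers = df.loc[df2.is_outlier]

# Rows of df where the flag in df2 is False (~ negates a boolean Series).
inliers = df.loc[~df2.is_outlier]
```

Because the mask is a Series, pandas matches it to df by index label, which is why the shared index matters.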
Edit
You want to filter df using your array, good.
# you can filter df using bool masking in .loc[...]
# (assuming `good` is a boolean NumPy array with one entry per row of df)
df.loc[good]
# or, for the rows where `good` is False...
df.loc[~good]
# ***NOTE: if you've altered the index in df you may have unexpected results.
# convert `good` into a `pd.Series` with the same index as `df`
s = pd.Series(good, index=df.index, name="is_outlier")
# ... join with df
df = df.join(s)
# then filter to True
df.loc[df.is_outlier]
# or False
df.loc[~df.is_outlier]
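To make the note about the index concrete, here is a rough sketch with made-up data (the shuffled index and the "power" values are assumptions, not from the question). It compares a plain NumPy mask, which is applied by position, with a boolean Series, which .loc lines up by index label.

```python
import numpy as np
import pandas as pd

# Hypothetical df whose index is no longer 0, 1, 2 in order.
df = pd.DataFrame({"power": [10, 20, 30]}, index=[2, 0, 1])
good = np.array([True, False, True])

# A NumPy mask is applied by position: keeps the 1st and 3rd rows.
by_position = df.loc[good]

# A Series with the default 0..n-1 index is matched by label instead,
# so it selects the rows labelled 0 and 2, which are different rows here.
by_label = df.loc[pd.Series(good)]

# Building the Series on df.index keeps each flag attached to its row.
s = pd.Series(good, index=df.index, name="is_outlier")
by_series = df.loc[s]
```

In this sketch by_position and by_series pick the same rows, while by_label does not, which is the kind of surprise the note is warning about.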
CodePudding user response:
Thanks to @Ian Thompson.
My code, for what it's worth:
s = pd.Series(good, index=df.index, name="is_outlier")
df = df.join(s)
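# df2 is filtered to remove BAD data
# (the column is named is_outlier but holds the `good` flags, so True marks rows to keep)
df2 = df[df['is_outlier']]
# keep only the columns needed in the output file
df2 = df2[['pid','power','dat']]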
df2.to_csv('./filteredILCdata.csv')
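For anyone who wants to try this end to end, here is a compact variant as a sketch. The sample values and the "extra" column are invented; only the pid, power and dat column names come from the code above. It selects rows and columns in a single .loc call and writes the CSV to a string instead of a file so it can be run anywhere.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; `good` is assumed to be a boolean NumPy array.
df = pd.DataFrame({
    "pid": [1, 2, 3],
    "power": [0.5, 9.9, 0.7],
    "dat": ["2021-01-01", "2021-01-02", "2021-01-03"],
    "extra": ["x", "y", "z"],
})
good = np.array([True, False, True])

# Row mask and column list in one step.
filtered = df.loc[good, ["pid", "power", "dat"]]

# With no path, to_csv returns the CSV text; index=False leaves out the index column.
csv_text = filtered.to_csv(index=False)
```

Passing a path such as './filteredILCdata.csv' as the first argument writes the same content to disk.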