Save outlier-removed data back to a new CSV file

Time:08-25

I have a pandas dataframe and I am experimenting with [image not available]

CodePudding user response:

If I understand your question correctly, you want to filter a dataframe (df) using another dataframe (df2). If they share the same index you can do that using .loc[...].

# assuming df2 has a boolean column called "is_outlier" or something like that
# keeps the rows of df where df2.is_outlier is True
df.loc[df2.is_outlier]

# keeps the rows of df where df2.is_outlier is False
df.loc[~df2.is_outlier]
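
A minimal sketch of the shared-index case. The data, the index values, and the column names power and is_outlier are made up for illustration; only the .loc[...] masking pattern comes from the answer above.

```python
import pandas as pd

# Hypothetical data: df holds measurements, df2 holds one flag per row.
df = pd.DataFrame({"power": [1.0, 250.0, 1.2, 0.9]}, index=[10, 11, 12, 13])
df2 = pd.DataFrame({"is_outlier": [False, True, False, False]}, index=df.index)

# A boolean Series used inside .loc is matched to df by index label.
outliers = df.loc[df2["is_outlier"]]   # rows flagged True
kept = df.loc[~df2["is_outlier"]]      # rows flagged False (~ negates the mask)

print(outliers)
print(kept)
```

Because the mask is a Series, pandas aligns it to df by label, which is why the two frames need to share an index.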

Edit

You want to filter df using your array `good`.

# you can filter df using a boolean mask in .loc[...]
# (assuming `good` is a boolean NumPy array with one entry per row of df)
df.loc[good]

# or, for the rows where `good` is False...
df.loc[~good]

# NOTE: if you've altered the index of df, you may get unexpected results.
# convert `good` into a `pd.Series` with the same index as `df`
s = pd.Series(good, index=df.index, name="is_outlier")

# ... join it to df as a new column
df = df.join(s)

# then filter to True
df.loc[df.is_outlier]

# or False
df.loc[~df.is_outlier]
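
A short sketch of why wrapping the array in a Series with df.index helps. The data is invented, `good` is assumed to be a boolean NumPy array where True marks rows to keep, and the index is deliberately not 0..n-1.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a non-default index.
df = pd.DataFrame({"power": [1.0, 250.0, 1.2]}, index=[5, 3, 8])
good = np.array([True, False, True])  # assumed: True marks rows to keep

# A NumPy mask is applied by position, so it only works while the
# row order of df still matches the order of `good`.
by_position = df.loc[good]

# Building the Series with df.index ties each flag to its row label,
# so the flags stay with their rows even after df is reordered.
s = pd.Series(good, index=df.index, name="is_outlier")
flagged = df.join(s).sort_index()
by_label = flagged.loc[flagged["is_outlier"]]

print(by_position)
print(by_label)
```

Once the flag lives in a column, later sorting or filtering of df cannot separate a row from its flag.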

CodePudding user response:

Thanks to @Ian Thompson

My code, for what it's worth:

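import pandas as pd

# attach `good` to df as a column, aligned on df's index
s = pd.Series(good, index=df.index, name="is_outlier")
df = df.join(s)

# df2 is filtered to remove BAD data (True in is_outlier marks rows to keep)
df2 = df[df['is_outlier']]
df2 = df2[['pid','power','dat']]
df2.to_csv('./filteredILCdata.csv')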
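
For completeness, a self-contained sketch of the same filter-and-save steps that can be run without the original data. The rows are invented, the file name filtered.csv is hypothetical, and passing index=False is an assumption (the code above writes the index as well); the column names pid, power and dat are taken from the answer.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the joined frame; True in is_outlier marks rows to keep.
df = pd.DataFrame({
    "pid": [1, 2, 3],
    "power": [1.0, 250.0, 1.2],
    "dat": ["2021-08-01", "2021-08-02", "2021-08-03"],
    "is_outlier": [True, False, True],
})

# Select the kept rows and the wanted columns in one .loc call.
cleaned = df.loc[df["is_outlier"], ["pid", "power", "dat"]]

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "filtered.csv")
    cleaned.to_csv(path, index=False)  # index=False omits the row labels
    restored = pd.read_csv(path)

print(restored)
```

Leaving out index=False keeps the original row labels as the first CSV column, which may be useful if you need to trace rows back to the unfiltered data.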