How to extract the other data in the row of an outlier shown in a box plot in Python?

This is my pandas DataFrame:

Datetime	SN NO.	Values	data1	data2	data3	data4	data5	data6
2020-09-29T14:59:13.4461479+02:00	701	24.511	3.556	3.557	3.555	3.551	3.559	3.555
2020-09-29T15:48:04.6368679+02:00	702	24.516	3.554	3.555	3.555	3.556	3.552	3.557
2020-09-29T15:51:46.2555875+02:00	703	24.517	3.553	3.556	3.551	3.553	3.558	3.554
2020-10-01T12:51:59.2687665+02:00	704	24.519	3.552	3.557	3.556	3.559	3.557	3.557
2021-02-01T19:27:09.0472459+02:00	705	24.511	3.551	3.558	3.558	3.550	3.551	3.552
.	.	.	.	.	.	.	.	.

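import matplotlib.pyplot as plt
from matplotlib.cbook import boxplot_stats

# Draw the box plot of 'Values' grouped by 'Datetime'
boxplot = df.reset_index().boxplot(column=['Values'], by="Datetime", return_type=None)

# Collect the flier (outlier) values that matplotlib computes for the whole column
outliers = [y for stat in boxplot_stats(df['Values']) for y in stat['fliers']]
print(outliers)
plt.show()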

[The box plot image is no longer available.]

As shown in the box plot, there are some outliers, and I want to extract the other data in the same row as each outlier value. For example, one outlier in the data frame is 24.519, but I also need the rest of that row, such as SN NO., data1, data2, data3, and so on. What is the best way to do this?

CodePudding user response：

To get a DataFrame with all the outlier rows (outlier_values is the list of flier values, e.g. the outliers list from the question):

df_outliers = df.loc[df['Values'].isin(outlier_values), :]

To get the rows for a single outlier value (single_value is one such value, e.g. 24.519):

df_outliers = df.loc[df['Values'].eq(single_value), :]

If several rows share the same value in Values, both forms return all of them.
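
As a minimal end-to-end sketch of the selection above: the sample frame below is hypothetical (only three of the columns, plus an injected 30.000 row so that there is a clear outlier), and outlier_values is filled from boxplot_stats the same way the question does.

```python
import pandas as pd
from matplotlib.cbook import boxplot_stats

# Hypothetical sample data; the 30.000 row is injected for illustration only.
df = pd.DataFrame({
    "SN NO.": [701, 702, 703, 704, 705, 706],
    "Values": [24.511, 24.516, 24.517, 24.519, 24.511, 30.000],
    "data1": [3.556, 3.554, 3.553, 3.552, 3.551, 3.550],
})

# boxplot_stats returns one dict per data set; 'fliers' holds the outlier values.
outlier_values = boxplot_stats(df["Values"].to_numpy())[0]["fliers"]

# Keep every row whose 'Values' entry is one of the fliers, with all columns.
df_outliers = df.loc[df["Values"].isin(outlier_values), :]
print(df_outliers)
```

Because the flier values are taken from the column itself, the exact match done by isin should be safe here; values typed in by hand may need a tolerance instead.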

To keep only some columns from the original df:

cols = ['data1', 'data2']
df_outliers = df.loc[df['Values'].isin(outlier_values), cols]
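
An alternative sketch, not part of the answer above, that avoids matching on values: build a boolean mask from the whisker ends that boxplot_stats reports (whislo and whishi) and select the columns you need, including SN NO. The sample frame is again hypothetical, with an injected 30.000 row.

```python
import pandas as pd
from matplotlib.cbook import boxplot_stats

# Hypothetical sample data; the 30.000 row is injected for illustration only.
df = pd.DataFrame({
    "SN NO.": [701, 702, 703, 704, 705, 706],
    "Values": [24.511, 24.516, 24.517, 24.519, 24.511, 30.000],
    "data1": [3.556, 3.554, 3.553, 3.552, 3.551, 3.550],
    "data2": [3.557, 3.555, 3.556, 3.557, 3.558, 3.559],
})

stats = boxplot_stats(df["Values"].to_numpy())[0]

# Points below the lower whisker or above the upper whisker are the fliers.
mask = (df["Values"] < stats["whislo"]) | (df["Values"] > stats["whishi"])

# Keep the identifying column alongside the data columns of interest.
cols = ["SN NO.", "Values", "data1", "data2"]
df_outliers = df.loc[mask, cols]
print(df_outliers)
```

Both approaches should select the same rows when the default whis=1.5 is used for the plot and for boxplot_stats.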