I have a 2D data set of (x, y). x and y are integer values.
How can I use only Pandas code to find all x values where y reaches its maximum value (the same absolute maximum occurs multiple times)?
I also want to plot (with pandas.DataFrame.plot) x vs. y and mark the maxima positions.
Example code:
import numpy as np
import pandas as pd
np.random.seed(10)
x = np.arange(100)*0.2
y = np.random.randint(0, 20, size=100)
data = np.vstack((x, y)).T
df = pd.DataFrame(data, columns=['x', 'y'])
ymax = df['y'].max()
df_ymax = df[df['y'] == ymax]
print(df_ymax)
#        x     y
# 13   2.6  19.0
# 24   4.8  19.0
# 28   5.6  19.0
# 86  17.2  19.0
# 88  17.6  19.0
df.plot(x='x', y='y', figsize=(8, 4),
        ylabel='y', legend=False, style=['b-'])
I have no idea how to mark the maxima values (df_ymax) in the same plot, e.g. using circles. How can that be solved?
The final plot should look like this (here I programmed everything with numpy and matplotlib):
CodePudding user response:
Get the Axes returned by df.plot and reuse it to plot the maxima values:
ax = df.plot(x='x', y='y', figsize=(8, 4), ylabel='y', legend=False, style=['b-'])
df_ymax.plot.scatter(x='x', y='y', color='r', ax=ax)
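If you want the hollow circles mentioned in the question and a plain list of the x values, here is a minimal self-contained sketch of the same idea. It rebuilds the example data directly from a dict (my choice, not the question's vstack approach), and the names is_max and x_at_max as well as the marker settings are only illustrative assumptions. The extra keyword arguments (markerfacecolor, markersize) are forwarded by pandas to matplotlib.

```python
import numpy as np
import pandas as pd

# Same example data as in the question, built from a dict (assumption:
# this keeps y as integers instead of converting everything to float).
np.random.seed(10)
df = pd.DataFrame({
    "x": np.arange(100) * 0.2,
    "y": np.random.randint(0, 20, size=100),
})

# Boolean mask of all rows where y equals its global maximum.
is_max = df["y"].eq(df["y"].max())
df_ymax = df[is_max]

# All x values at which the maximum occurs, as a plain Python list.
x_at_max = df_ymax["x"].tolist()
print(x_at_max)

# Draw the line first and keep the returned Axes ...
ax = df.plot(x="x", y="y", figsize=(8, 4), ylabel="y",
             legend=False, style="b-")

# ... then draw the maxima on the same Axes as hollow red circles.
df_ymax.plot(x="x", y="y", style="ro", markerfacecolor="none",
             markersize=10, legend=False, ax=ax)
```

In a plain script you would still need matplotlib.pyplot.show() or ax.figure.savefig(...) to actually see the figure.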