I have two groups of points that overlap, so I need to add jitter when I plot them as a scatterplot. I also want to connect the matching points from each group (every point has a pair).
Many existing answers suggest something like this:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
data = [['abc', 'pre', 10], ['abc', 'post', 5], ['bce', 'pre', 10], ['bce', 'post', 5], ['cef', 'pre', 8], ['cef', 'post', 5]]
df = pd.DataFrame(data, columns=['ID', 'time', 'value'])
grouped = df.groupby('ID')
for name, group in grouped:
    # one scatter + line pair per ID, all drawn on the same axes
    sns.scatterplot(x='time', y='value', data=group, color='#3C74BC')
    sns.lineplot(x='time', y='value', data=group, color='#3C74BC')
plt.show()
It works OK, but it has no jitter. If I add jitter via sns.stripplot(), the lines no longer connect the dots and start from arbitrary places.
CodePudding user response:
The approach below makes the following changes:
- Convert the time to numeric (0 for 'pre' and 1 for 'post') via (df['time'] != 'pre').astype(float).
- Add random jitter to these values: np.random.uniform(-0.1, 0.1, len(df)). Depending on how many values you have, you might change 0.1 to a larger value.
- Use sns.lineplot with a marker to avoid the need for scatterplot.
- Use hue='ID' to draw everything in one go.
- As hue doesn't look at color=, use palette= with the same number of colors as there are different hue values.
- Suppress the legend, as all hue values have the same color.
- Assign tick labels to 0 and 1.
- Set xlim so the tick labels are at equal distances from their respective borders.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
data = [['abc', 'pre', 10], ['abc', 'post', 5], ['bce', 'pre', 10], ['bce', 'post', 5], ['cef', 'pre', 8], ['cef', 'post', 5]]
df = pd.DataFrame(data, columns=['ID', 'time', 'value'])
# 0.0 for 'pre', 1.0 for 'post', plus a small random horizontal offset per row
df['time'] = (df['time'] != 'pre').astype(float) + np.random.uniform(-0.1, 0.1, len(df))
# one line per ID; the marker draws the dots, so no separate scatterplot is needed
ax = sns.lineplot(x='time', y='value', data=df, hue='ID', marker='o',
                  palette=['#3C74BC'] * len(df['ID'].unique()), legend=False)
# passing labels directly to set_xticks needs matplotlib 3.5 or newer
ax.set_xticks([0, 1], ['pre', 'post'])
ax.set_xlim(-0.2, 1.2)
plt.show()
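If the jitter should be the same on every run, and the original 'time' labels should stay in the frame, the conversion step can be wrapped in a small helper. This is only a sketch, not part of the answer above: the function name add_jittered_x, the column name 'time_x', and the seed parameter are all made up for illustration. It assumes the same 'pre'/'post' coding as the example data.

```python
import numpy as np
import pandas as pd


def add_jittered_x(df, width=0.1, seed=0):
    """Return a copy of df with a numeric, jittered x position per row.

    'time_x' is a hypothetical column name; the 'time' column is left untouched.
    """
    rng = np.random.default_rng(seed)  # seeded, so the offsets are reproducible
    out = df.copy()
    base = (out['time'] != 'pre').astype(float)  # 0.0 for 'pre', 1.0 for 'post'
    out['time_x'] = base + rng.uniform(-width, width, len(out))
    return out


data = [['abc', 'pre', 10], ['abc', 'post', 5], ['bce', 'pre', 10],
        ['bce', 'post', 5], ['cef', 'pre', 8], ['cef', 'post', 5]]
df = pd.DataFrame(data, columns=['ID', 'time', 'value'])
jittered = add_jittered_x(df, width=0.1, seed=0)
```

The result can then be passed to sns.lineplot with x='time_x' in place of x='time', leaving the rest of the call unchanged. A larger width spreads the points further apart, as noted in the list above.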