I have a univariate time series structured as follows:
data = [15, 5, 7, 9, 10, 23, 4, 6]
And a list of scores, one for each value in the series, structured as follows:
score = [0.3, 0.6, 0.1, 0.8, 0.4, 0.7, 0.3, 0.1]
I also have a threshold t = 0.5
From this, I created a dataframe with two columns: the first holds the value, and the second holds True if the value is an anomaly (its score satisfies score > t) and False if it is not (score <= t). The structure is this:
values | anomalies
15 | False
5 | True
7 | False
9 | True
10 | False
23 | True
4 | False
6 | False
What I want to do is plot the values with anomalies==True in one color and the values with anomalies==False in another. I tried plotting the normal values and then overlaying the anomalous ones, as you can see in this code fragment:
fig = plt.figure(figsize=(25,5))
ax1=plt.subplot(121)
sns.lineplot(data=df['values'], ax=ax1) # plot normal time series plot
sns.lineplot(data=df['values'][(df['anomalies'] == True )], color='red', ax=ax1)
But the result is the one in the following figure, where the red points are linked even if they should be separated:
How can I solve it?
CodePudding user response:
For the first part of your question, you can create the dataframe first:
df = pd.DataFrame({'data': data, 'score': score})
df['anomalies'] = False
then flag the rows whose score exceeds the threshold:
df.loc[df['score'] > t, 'anomalies'] = True
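An equivalent and shorter route is to assign the comparison itself, which yields real booleans for both cases in one step. This is only a minimal sketch: it assumes the column is named 'values' as in the question's table, and it treats a score exactly equal to t as not anomalous.

```python
import pandas as pd

data = [15, 5, 7, 9, 10, 23, 4, 6]
score = [0.3, 0.6, 0.1, 0.8, 0.4, 0.7, 0.3, 0.1]
t = 0.5  # threshold from the question

# Build the frame from the two lists (column names are an assumption).
df = pd.DataFrame({'values': data, 'score': score})

# The comparison returns a boolean Series: True where score > t, False elsewhere.
df['anomalies'] = df['score'] > t

print(df[['values', 'anomalies']])
```

Because the column holds booleans rather than the string 'True', it can be used directly as a mask, for example df[df['anomalies']].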
CodePudding user response:
You can use the markevery argument of the plot function, as described in "Highlighting arbitrary points in a matplotlib plot?". Then you can set markerfacecolor to your liking.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
data = [15, 5, 7, 9, 10, 23, 4, 6]
score = [0.3, 0.6, 0.1, 0.8, 0.4, 0.7, 0.3, 0.1]
df = pd.DataFrame(data,columns=['values'])
df['score'] = score
plt.figure(figsize=(8,6))
plt.plot(df.index, df['values'], '-go', markevery=np.where(df.score > 0.5, True, False), markerfacecolor='b')
plt.xlabel('Index')
plt.ylabel('Values')
plt.title('Anomalies Plot')
It will look something like this plot
You can achieve a similar result using seaborn by replacing
plt.plot(df.index, df['values'], '-go', markevery=np.where(df.score > 0.5, True, False), markerfacecolor='b')
with
sns.scatterplot(x=df.index,y=df['values'], hue=df.score>0.5)
sns.lineplot(x=df.index,y=df['values'])
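The red points in the question end up joined because a line plot connects whatever rows remain after filtering. One way around this, sketched below with plain matplotlib, is to draw the full series as a line and overlay only the flagged rows as a scatter, which never connects its points. The helper name plot_anomalies and the 'anomalies' column are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_anomalies(df, ax=None):
    """Draw the whole series as a line and the anomalies as unconnected markers."""
    if ax is None:
        _, ax = plt.subplots(figsize=(8, 6))
    # Full time series, connected in index order.
    ax.plot(df.index, df['values'], color='tab:blue', label='values')
    # Only the flagged rows; scatter draws markers without joining them.
    flagged = df[df['anomalies']]
    ax.scatter(flagged.index, flagged['values'], color='red', zorder=3,
               label='anomalies')
    ax.set_xlabel('Index')
    ax.set_ylabel('Values')
    ax.legend()
    return ax


data = [15, 5, 7, 9, 10, 23, 4, 6]
score = [0.3, 0.6, 0.1, 0.8, 0.4, 0.7, 0.3, 0.1]
df = pd.DataFrame({'values': data, 'score': score})
df['anomalies'] = df['score'] > 0.5

ax = plot_anomalies(df)
```

Call plt.show() afterwards to display the figure in a script. Passing an existing axes through ax lets the same helper draw into a subplot such as the ax1 used in the question.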