Plot only selected points of a time series

Time:10-10

I have a univariate time series structured as follows:

data = [15, 5, 7, 9, 10, 23, 4, 6]

And a list of scores, one for each value in the series, structured as follows:

score = [0.3, 0.6, 0.1, 0.8, 0.4, 0.7, 0.3, 0.1]

I also have a threshold t = 0.5

From this, I created a dataframe with two columns: the first holds the value, and the second holds True if the value is an anomaly (its score > t) and False if it is not (score < t). The structure is this:

values | anomalies
  15   |   False
  5    |   True
  7    |   False
  9    |   True
  10   |   False
  23   |   True
  4    |   False
  6    |   False

What I want to do is plot the values with anomalies == True in one color and the values with anomalies == False in another. I tried plotting the normal values and then overlaying the anomalous ones, as you can see in this fragment of code:

fig = plt.figure(figsize=(25,5)) 
ax1=plt.subplot(121)
sns.lineplot(data=df['values'], ax=ax1) # plot normal time series plot
sns.lineplot(data=df['values'][(df['anomalies'] == True )], color='red', ax=ax1)

But in the resulting plot the red points are joined to each other by a line, even though they should be drawn separately.

How can I solve it?

CodePudding user response:

You can create the dataframe first:

df = pd.DataFrame({'data': data, 'score': score})

then flag the anomalies:

df['anomalies'] = df['score'] > t

That covers the first part of your question.
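For the plotting part, one possible sketch (not necessarily what the asker had in mind) is to draw the whole series as a line and overlay only the anomalous rows with ax.scatter, which draws markers without connecting them. The column name 'data' follows this answer, and the colors and figure size are arbitrary choices:

```python
import matplotlib.pyplot as plt
import pandas as pd

data = [15, 5, 7, 9, 10, 23, 4, 6]
score = [0.3, 0.6, 0.1, 0.8, 0.4, 0.7, 0.3, 0.1]
t = 0.5

df = pd.DataFrame({'data': data, 'score': score})
df['anomalies'] = df['score'] > t

fig, ax = plt.subplots(figsize=(8, 4))
# The full series is drawn as one connected line.
ax.plot(df.index, df['data'], color='tab:blue', label='values')
# Anomalous rows are drawn as markers only, so nothing links them together.
anomalous = df[df['anomalies']]
ax.scatter(anomalous.index, anomalous['data'], color='red', zorder=3, label='anomalies')
ax.legend()
```

Call plt.show() (or fig.savefig) afterwards to display the figure. The red markers keep their original index positions, so they sit on top of the blue line instead of forming a second line.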

CodePudding user response:

You can pass the markevery argument to the plot function, as described in the question "Highlighting arbitrary points in a matplotlib plot?". Then you can set markerfacecolor to your liking.

  import pandas as pd
  import numpy as np
  import matplotlib.pyplot as plt
  import seaborn as sns
  sns.set()  # apply the default seaborn theme to matplotlib
  data = [15, 5, 7, 9, 10, 23, 4, 6]
  score = [0.3, 0.6, 0.1, 0.8, 0.4, 0.7, 0.3, 0.1]
  df = pd.DataFrame(data,columns=['values'])
  df['score'] = score
  plt.figure(figsize=(8,6))
  # '-go' draws a green line with circle markers; markevery keeps a marker
  # only where the boolean mask is True, and markerfacecolor fills it blue
  plt.plot(df.index, df['values'], '-go', markevery=np.where(df.score > 0.5, True, False), markerfacecolor='b')
  plt.xlabel('Index')
  plt.ylabel('Values')
  plt.title('Anomalies Plot')
  plt.show()

The result is a single green line through all the values, with blue-filled markers only at the points whose score is above 0.5.
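markevery also accepts a list of integer positions instead of a boolean mask, which can be easier to inspect or reuse. A minimal sketch, assuming the same data and the 0.5 threshold (the names mask and anomaly_idx are only illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

values = np.array([15, 5, 7, 9, 10, 23, 4, 6])
score = np.array([0.3, 0.6, 0.1, 0.8, 0.4, 0.7, 0.3, 0.1])

mask = score > 0.5
# Positions where the mask is True, as plain Python ints.
anomaly_idx = np.flatnonzero(mask).tolist()

fig, ax = plt.subplots(figsize=(8, 6))
# Markers are drawn only at the listed positions; the line covers every point.
(line,) = ax.plot(values, '-go', markevery=anomaly_idx, markerfacecolor='b')
```

Here anomaly_idx holds the positions of the values 5, 9 and 23, which are the rows marked True in the question's table.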

You can achieve a similar result using seaborn by replacing

plt.plot(df.index, df['values'], '-go', markevery=np.where(df.score > 0.5, True, False), markerfacecolor='b')

with

sns.scatterplot(x=df.index,y=df['values'], hue=df.score>0.5)
sns.lineplot(x=df.index,y=df['values'])