I am a manufacturing engineer, very new to Python and Matplotlib. I am trying to plot a scatter graph over time where every record has a value read from a sensor plus an upper and a lower limit for that value; the tool stops if the value is not between them.
So for a simple set of data like this:
time = [1, 2, 3, 7, 8, 9, 10]
data = [5, 6, 5, 5, 6, 7, 8]
lower_limit = [4, 4, 5, 5, 5, 5, 5]
upper_limit = [6, 6, 6, 7, 7, 7, 7]
Note: when the tool is not working, nothing is recorded, hence the gap between 3 and 7 in the time records.
The desired graph would look like this:
A few rules that I am trying to stick to:
- All three series (data, upper_limit, and lower_limit) must be drawn as scattered points, not lines, and must share the x-axis (time). (required)
- A green highlight should fill the space between the upper and lower limits, connecting only the two points that share the same time. (highly recommended) I tried matplotlib.pyplot.fill_between, but it draws a polygon between the trend lines rather than straight vertical lines between matching pairs of lower-limit and upper-limit dots. That is not accurate, and it also fills the gap between times 3 and 7, which is not desired. I also tried matplotlib.pyplot.bar for the limits alongside a scatter plot for the data, but I was not able to make the bars start at lower_limit.
- When a data value is not equal to or between its limits, its dot should appear in red rather than the original color. (highly recommended)
So, with all of that in mind and thousands of records per day, a typical graph for a 24-hour span should look like the following (notice the gaps where no records exist for a period, as well as the vertical green lines for the limits):
Thanks for your time and help!
CodePudding user response:
Something like this should work, plotting each component separately:
import pandas as pd
import matplotlib.pyplot as plt
time = [1, 2, 3, 7, 8, 9, 10]
data = [5, 6, 5, 5, 6, 7, 8]
lower_limit = [4, 4, 5, 5, 5, 5, 5]
upper_limit = [6, 6, 6, 7, 7, 7, 7]
# put data into dataframe and identify which points are out of range (not between the lower and upper limit)
df = pd.DataFrame({'time': time, 'data': data, 'll': lower_limit, 'ul': upper_limit})
df.loc[:, 'in_range'] = 0
df.loc[((df['data'] >= df['ll']) & (df['data'] <= df['ul'])), 'in_range'] = 1
# make the plot
fig, ax = plt.subplots()
# plot lower-limit and upper-limit points
plt.scatter(df['time'], df['ll'], c='green')
plt.scatter(df['time'], df['ul'], c='green')
# plot data points in range
plt.scatter(df.loc[df['in_range'] == 1, 'time'], df.loc[df['in_range'] == 1, 'data'], c='black')
# plot data points out of range (in red)
plt.scatter(df.loc[df['in_range'] == 0, 'time'], df.loc[df['in_range'] == 0, 'data'], c='red')
# plot lines between lower limit and upper limit
# x and y are both 2 x N, so each column is drawn as its own vertical segment
plt.plot((df['time'],df['time']),([i for i in df['ll']], [j for j in df['ul']]), c='lightgreen')
plt.show()
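As a possible variation on the above, the in-range flag could be computed in one step with pandas' `Series.between` (inclusive on both ends by default), and the data points drawn with a single `scatter` call that takes one color per point. This is only a sketch under the same sample data; the names `in_range` and `colors` and the `zorder` values are illustrative choices, and `ax.vlines` is swapped in for the `plt.plot` call to draw the limit segments.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'time': [1, 2, 3, 7, 8, 9, 10],
    'data': [5, 6, 5, 5, 6, 7, 8],
    'll':   [4, 4, 5, 5, 5, 5, 5],
    'ul':   [6, 6, 6, 7, 7, 7, 7],
})

# True where ll <= data <= ul (both bounds are inclusive by default)
in_range = df['data'].between(df['ll'], df['ul'])

# one color per point: black inside the limits, red outside
colors = in_range.map({True: 'black', False: 'red'}).tolist()

fig, ax = plt.subplots()
# vertical segment from each lower limit to the matching upper limit
ax.vlines(df['time'], df['ll'], df['ul'], colors='lightgreen', zorder=1)
ax.scatter(df['time'], df['ll'], c='green', zorder=2)
ax.scatter(df['time'], df['ul'], c='green', zorder=2)
# data points drawn last so they sit on top of the limit markers
ax.scatter(df['time'], df['data'], c=colors, zorder=3)
```

Call `plt.show()` afterwards to display the figure. Drawing the segments with a lower `zorder` keeps them behind the dots, so a red point is never hidden by a green line.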
CodePudding user response:
This is a version using numpy's masking and matplotlib's errorbar:
import matplotlib.pyplot as plt
import numpy as np
time = np.array( [0, 1, 2, 3, 7, 8, 9, 10] )
data = np.array([2, 5, 6, 5, 5, 6, 7, 8] )
lower = np.array([4, 4, 4, 5, 5, 5, 5, 5] )
upper = np.array([6, 6, 6, 6, 7, 7, 7, 7] )
nn = len( lower )
delta = upper - lower
### creating masks
inside = ( ( upper - data ) >= 0 ) & ( ( data - lower ) >= 0 )
outside = np.logical_not( inside )
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
### one-sided error bars: zero below each lower limit, delta above it
ax.errorbar( time, lower, yerr=( nn*[0], delta), ls='', ecolor="#00C023" )
ax.scatter( time[ inside ], data[ inside ], c='k' )
ax.scatter( time[ outside ], data[ outside ], c='r' )
plt.show()
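Regarding the bar-based attempt mentioned in the question: `Axes.bar` accepts a `bottom` argument, so the bars can start at the lower limit instead of at zero. The following is a minimal sketch of that idea using the same sample arrays; `width=0.1` is an arbitrary value chosen here to keep the bars thin and would likely need tuning for real timestamps.

```python
import matplotlib.pyplot as plt
import numpy as np

time = np.array([0, 1, 2, 3, 7, 8, 9, 10])
data = np.array([2, 5, 6, 5, 5, 6, 7, 8])
lower = np.array([4, 4, 4, 5, 5, 5, 5, 5])
upper = np.array([6, 6, 6, 6, 7, 7, 7, 7])

# True where lower <= data <= upper
inside = (data >= lower) & (data <= upper)

fig, ax = plt.subplots()
# each bar starts at the lower limit and is (upper - lower) tall
bars = ax.bar(time, upper - lower, bottom=lower, width=0.1,
              color='lightgreen', zorder=1)
# data points on top: black inside the limits, red outside
ax.scatter(time[inside], data[inside], c='k', zorder=2)
ax.scatter(time[~inside], data[~inside], c='r', zorder=2)
```

Call `plt.show()` afterwards to display the figure. Because each bar is positioned at its own time value, no bar is drawn in the gap between 3 and 7.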