I want to remove the rows of a NumPy array that contain only noise and a random constant offset.
My data looks similar to this:
import numpy as np
offset = np.array([0.2, 3.2])
signal = np.sin(np.arange(0, 2 * np.pi, 0.1))
null = np.zeros_like(signal)
data_block = np.array([signal, null]).T
padding = np.zeros((10, 2))
block = np.vstack((padding, data_block, padding)) + offset
# add noise
shape = block.shape
noise = np.random.rand(shape[0], shape[1]) * .01
everything = noise + block
In reality, there is no step but rather a smooth transition from the offset into the data: the values sit at one offset, start to move once the data block begins, and settle at another offset when it ends. The noise amplitude is much smaller than the data amplitude.
I would like to retrieve the rows containing the data block from everything, preferably based on the continuous, smooth change within the data block. How can I do that?
CodePudding user response:
This is my best effort at identifying the data_block. I would be happy if it could be improved!
import numpy as np
offset = np.array([0.2, 3.2])
signal = np.sin(np.arange(0, 2 * np.pi, 0.1))
null = np.zeros_like(signal)
data_block = np.array([signal, null]).T
padding = np.zeros((10, 2))
block = np.vstack((padding, data_block, padding)) + offset
# add noise
shape = block.shape
noise = np.random.rand(shape[0], shape[1]) * .01
everything = noise + block
from matplotlib import pyplot as plt
x = np.arange(shape[0])
plt.plot(x, everything[:, 0])
plt.plot(x, everything[:, 1])
plt.show()
diff_everything = np.diff(everything, axis=0)
x = np.arange(shape[0] - 1)
plt.plot(x, diff_everything[:, 0])
plt.plot(x, diff_everything[:, 1])
plt.show()
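# keep rows whose step to the next row exceeds the threshold
mask = np.linalg.norm(diff_everything, axis=1) > 0.01
# np.diff returns one row fewer, so pad the mask back to the original length
mask = np.append(mask, False)
data = everything[mask, :]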
shape = data.shape
x = np.arange(shape[0])
plt.plot(x, data[:, 0])
plt.plot(x, data[:, 1])
plt.show()
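One possible improvement, offered as a sketch rather than a definitive fix: the per-row mask above can drop rows near the peaks of the sine, where the step between neighbouring rows is small, and a threshold of 0.01 sits inside the range the noise differences can reach. Keeping everything between the first and the last large step avoids both problems. The helper name `find_active_rows` and the threshold of 0.03 are assumptions chosen for this synthetic data (the noise step norm here stays below about 0.0142); real data would need a threshold derived from its own noise level.

```python
import numpy as np

def find_active_rows(values, threshold):
    """Return a slice spanning the first to the last row that starts a large step.

    Hypothetical helper: a step is the Euclidean norm of the difference
    between one row and the next.
    """
    steps = np.linalg.norm(np.diff(values, axis=0), axis=1)
    active = np.flatnonzero(steps > threshold)
    if active.size == 0:
        return slice(0, 0)  # nothing exceeds the threshold
    # steps[i] compares row i with row i + 1; keep the rows that start a large step
    return slice(int(active[0]), int(active[-1]) + 1)

# same synthetic data as above, with a seeded generator for repeatability
offset = np.array([0.2, 3.2])
signal = np.sin(np.arange(0, 2 * np.pi, 0.1))
data_block = np.array([signal, np.zeros_like(signal)]).T
padding = np.zeros((10, 2))
block = np.vstack((padding, data_block, padding)) + offset
rng = np.random.default_rng(0)
everything = block + rng.random(block.shape) * 0.01

# assumed threshold: above the largest possible noise step (about 0.0142 here)
rows = find_active_rows(everything, threshold=0.03)
recovered = everything[rows]
```

Because the result is one contiguous slice, slow stretches inside the block are kept. With a genuinely smooth transition the first and last steps may be small, so the recovered range could start slightly late and end slightly early.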