I am fairly new to Python. I have loaded a .mat file into a pandas DataFrame, but there is one column, sensor, where the values are stored as a list of lists (a nested list). Each inner list contains one timestamp and the reading at that time. All I need is to get the sensor readings alone in a new list, without the timestamps. The attached photo is a simplified example of what I have.
As shown in the photo, I have a for loop that iterates over each entry in the sensor column and stores the sensor readings in a new column, sensor_data_without_timestamp. My data set is really big (630 rows, and for each row the list has approx. 30,000 entries), and this for loop alone takes about 10 minutes to run.
Is there a more efficient way to get the sensor data?
Thanks in advance :)
Here is the code too:
import pandas as pd

df = pd.DataFrame({'id': ['1A', '2A', '3A'],
                   'sensor': [[[0, 10], [2, 15], [4, 20]],
                              [[0, 7], [2, 14], [4, 18]],
                              [[0, 11], [2, 16], [4, 22]]]})

sensor_data_without_timestamp = []
for i in range(len(df)):
    l = []  # readings for row i
    for a in range(len(df['sensor'][i])):
        l.append(df['sensor'][i][a][1])  # index 1 is the reading, index 0 the timestamp
    sensor_data_without_timestamp.append(l)

df['sensor_data_without_timestamp'] = sensor_data_without_timestamp
df.head()
CodePudding user response:
You can explode the sensor column, extract the second value of each pair, and finally group the readings back into one list per row:
df['sensor_data_without_timestamp'] = \
    df['sensor'].explode().str[1].groupby(level=0).agg(list)
print(df)
# Output
   id                       sensor  sensor_data_without_timestamp
0  1A  [[0, 10], [2, 15], [4, 20]]                   [10, 15, 20]
1  2A   [[0, 7], [2, 14], [4, 18]]                    [7, 14, 18]
2  3A  [[0, 11], [2, 16], [4, 22]]                   [11, 16, 22]
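To see what each step of that chain does, here is a small sketch that splits it into intermediate variables on the sample data from the question. The names pairs, readings and per_row are only illustrative and are not part of the original one-liner.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['1A', '2A', '3A'],
    'sensor': [[[0, 10], [2, 15], [4, 20]],
               [[0, 7], [2, 14], [4, 18]],
               [[0, 11], [2, 16], [4, 22]]],
})

# Step 1: one row per [timestamp, reading] pair; the original row index repeats.
pairs = df['sensor'].explode()

# Step 2: .str[1] indexes into each pair and keeps only the reading.
readings = pairs.str[1]

# Step 3: group on the repeated index to rebuild one list per original row.
per_row = readings.groupby(level=0).agg(list)

df['sensor_data_without_timestamp'] = per_row
print(df)
```

Because explode keeps the original index on every exploded row, groupby(level=0) is enough to put the readings back with the row they came from.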
Update: if every row always has a list of exactly 3 elements:
import numpy as np

df['sensor_data_without_timestamp'] = \
    np.vstack(df['sensor'])[:, 1].reshape(-1, 3).tolist()
print(df)
# Output
   id                       sensor  sensor_data_without_timestamp
0  1A  [[0, 10], [2, 15], [4, 20]]                   [10, 15, 20]
1  2A   [[0, 7], [2, 14], [4, 18]]                    [7, 14, 18]
2  3A  [[0, 11], [2, 16], [4, 22]]                   [11, 16, 22]
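If the number of readings per row is not 3 (the question mentions about 30,000), the hardcoded reshape(-1, 3) will not fit. A possible sketch, assuming every row holds the same number of [timestamp, reading] pairs, is to build one 3-D array and slice out the readings. If the rows have different lengths, a plain nested list comprehension is a fallback. The names arr and readings are illustrative, and neither variant has been timed on the real data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': ['1A', '2A', '3A'],
    'sensor': [[[0, 10], [2, 15], [4, 20]],
               [[0, 7], [2, 14], [4, 18]],
               [[0, 11], [2, 16], [4, 22]]],
})

# Assumption: all rows have the same number of pairs, so the nested lists
# form a regular array of shape (n_rows, n_samples, 2).
arr = np.array(df['sensor'].tolist())

# Index 1 on the last axis is the reading; index 0 is the timestamp.
df['sensor_data_without_timestamp'] = arr[:, :, 1].tolist()

# Fallback for rows of different lengths: no NumPy, no fixed size.
readings = [[reading for _, reading in row] for row in df['sensor']]

print(df)
```

The array version avoids hardcoding the list length, while the list comprehension makes no assumption about row lengths at all.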