Convert dataframe with two array columns into list of arrays

Time:06-01

I have a dataframe with two columns containing arrays in each cell. Here's some code to create a small example dataframe with the same features as mine.

import numpy as np
import pandas as pd
data = {'time': [
                 np.array(['2017-06-28T22:47:51.213500000', '2017-06-28T22:48:37.570900000', '2017-06-28T22:49:46.736800000']), 
                 np.array(['2017-06-28T22:46:27.321600000', '2017-06-28T22:46:27.321600000', '2017-06-28T22:47:07.220500000', '2017-06-28T22:47:04.293000000']),
                 np.array(['2017-06-28T23:10:20.125000000', '2017-06-28T23:10:09.885000000', '2017-06-28T23:11:31.902000000'])
                 ],
        'depth': [
                  np.array([215.91168091, 222.89173789, 215.21367521]),
                  np.array([188.68945869, 208.23361823, 217.30769231, 229.87179487]),
                  np.array([169.84330484, 189.38746439, 178.91737892])
                  ]
        }

df = pd.DataFrame(data)
df

I want to plot the data as three individual shapes, one for each row, where the time values are treated as the x coordinates and the depth values are treated as the y coordinates. To do this, I want to make a list of arrays that looks something like this.

[array([['2017-06-28T22:47:51.213500000', 215.91168091],
        ['2017-06-28T22:48:37.570900000', 222.89173789],
        ['2017-06-28T22:49:46.736800000', 215.21367521]], dtype=object),
 array([['2017-06-28T22:46:27.321600000', 188.68945869],
        ['2017-06-28T22:46:27.321600000', 208.23361823],
        ['2017-06-28T22:47:07.220500000', 217.30769231],
        ['2017-06-28T22:47:04.293000000', 229.87179487]], dtype=object),
 array([['2017-06-28T23:10:20.125000000', 169.84330484],
        ['2017-06-28T23:10:09.885000000', 189.38746439],
        ['2017-06-28T23:11:31.902000000', 178.91737892]], dtype=object)]

CodePudding user response:

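Try zip inside a list comprehension: the outer zip pairs up the two columns row by row, and the inner zip pairs each time value with its depth value.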

l = [np.array(list(zip(x, y))) for x, y in zip(df.time, df.depth)]
l
[array([['2017-06-28T22:47:51.213500000', '215.91168091'],
        ['2017-06-28T22:48:37.570900000', '222.89173789'],
        ['2017-06-28T22:49:46.736800000', '215.21367521']], dtype='<U29'),
 array([['2017-06-28T22:46:27.321600000', '188.68945869'],
        ['2017-06-28T22:46:27.321600000', '208.23361823'],
        ['2017-06-28T22:47:07.220500000', '217.30769231'],
        ['2017-06-28T22:47:04.293000000', '229.87179487']], dtype='<U29'),
 array([['2017-06-28T23:10:20.125000000', '169.84330484'],
        ['2017-06-28T23:10:09.885000000', '189.38746439'],
        ['2017-06-28T23:11:31.902000000', '178.91737892']], dtype='<U29')]