pandas - unnest list and show in csv each array value in a cell on column basis

Time:11-22

I'm preparing a dataset and I need it to have a specific format, like this:

Hand Pose
No Seating Back
Yes Seating Back
No Seating Back
Yes Seating Back
No Seating Back
Yes Seating Back

However, it currently produces:

0
Hand ['No', 'No', 'No', 'No', 'No', 'No', 'No']
Pose ['Seating Back', 'Seating Back', 'Seating Back']

As you can see, each array is stored whole in a single cell, while I need it un-nested so that each value occupies its own cell.

My code:

array_all = {'Hand': alldata, 'Pose': keypoints}
df = pd.Series(array_all)

# Keypoint detection
df.to_csv('test.csv',
          mode='w',
          header=True,
          index=True)
df.transpose()

To give context to the object, here is a snippet on one of them

for data_point in results.face_landmarks.landmark:
    if 0.6 <= data_point.x < 0.8:
        face_position.append('Straight')
    else:
        face_position.append('Angled')

All the arrays are built by appending in this manner.

If I instead build it like this:

array_all = {'Hand': [alldata],
             'Seating': [keypoints],
             'Pose': [face_position]}
df = pd.DataFrame(array_all)

I still have the issue: each list ends up in a single cell, like this

Hand Pose
['No', 'No', 'No', 'No', 'No', 'No', 'No'] ['Seating Back', 'Seating Back', 'Seating Back']

CodePudding user response:

You're creating a Series instead of a DataFrame. A Series built from a dict stores each list as one element, whereas a DataFrame turns each list into a column with one value per row.

Change

df = pd.Series(array_all)

to

df = pd.DataFrame(array_all)
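
To make the difference concrete, here is a minimal sketch. The sample lists are made up for illustration and stand in for alldata and keypoints from the question; they are assumed to be of equal length.

```python
import pandas as pd

# Hypothetical sample data standing in for alldata and keypoints
alldata = ['No', 'Yes', 'No']
keypoints = ['Seating Back', 'Seating Back', 'Seating Back']
array_all = {'Hand': alldata, 'Pose': keypoints}

# A Series keeps each whole list as a single element
as_series = pd.Series(array_all)

# A DataFrame spreads each list down a column, one value per cell
as_frame = pd.DataFrame(array_all)

print(as_series.shape)
print(as_frame.shape)

# index=False drops the 0, 1, 2... row labels from the CSV;
# with no path given, to_csv returns the CSV text instead of writing a file
csv_text = as_frame.to_csv(index=False)
print(csv_text)
```

Passing a file name such as 'test.csv' as the first argument to to_csv writes the same text to disk.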

Edit: If you can't create a DataFrame because the arrays are not all of equal length, add empty strings to them until they all are as long as the longest of them, like this:

array_all = {'Hand': alldata, 'Pose': keypoints, 'Face Pose': face_position}

# Pad the arrays with '' so that they are all the same length
max_size = max([len(array) for array in array_all.values()])
for array in array_all.values():
    array.extend([''] * (max_size - len(array)))

df = pd.DataFrame(array_all)
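
Note that extend modifies the original lists in place. If that matters, one possible alternative (a swapped-in technique, not part of the original answer) is itertools.zip_longest, which pads while building the rows and leaves the inputs untouched. The sample lists below are hypothetical.

```python
from itertools import zip_longest

import pandas as pd

# Hypothetical lists of unequal length
alldata = ['No', 'Yes', 'No']
keypoints = ['Seating Back']
array_all = {'Hand': alldata, 'Pose': keypoints}

# zip_longest yields one tuple per row and fills the gaps
# left by shorter lists with fillvalue
rows = list(zip_longest(*array_all.values(), fillvalue=''))

df = pd.DataFrame(rows, columns=list(array_all.keys()))
print(df)
```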

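Or, a better solution: wrap each list in a pd.Series, so that pandas aligns them by index and fills the missing positions with NaN instead of requiring manual padding: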

array_all = {'Hand': alldata, 'Pose': keypoints, 'Face Pose': face_position}
array_all = {k: pd.Series(v) for k, v in array_all.items()}
df = pd.DataFrame(array_all)
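
As a quick check of how this behaves with lists of different lengths, here is a small sketch with made-up sample data. The shorter columns are filled with NaN, which to_csv writes as empty cells by default.

```python
import pandas as pd

# Hypothetical lists of unequal length
alldata = ['No', 'Yes', 'No', 'Yes']
keypoints = ['Seating Back', 'Seating Back']
face_position = ['Straight', 'Angled', 'Straight']

array_all = {'Hand': alldata, 'Pose': keypoints, 'Face Pose': face_position}

# Each list becomes a Series; the DataFrame aligns them on the
# integer index and leaves NaN where a shorter list has no value
df = pd.DataFrame({k: pd.Series(v) for k, v in array_all.items()})

# NaN cells come out empty in the CSV text
csv_text = df.to_csv(index=False)
print(csv_text)
```

Unlike the padding approach, this leaves the original lists unchanged.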

Last snippet partially credited to @Jeff's answer here.
