I'm preparing a dataset and I need it to have a specific format, such as this:
Hand | Pose |
---|---|
No | Seating Back |
Yes | Seating Back |
No | Seating Back |
Yes | Seating Back |
No | Seating Back |
Yes | Seating Back |
However, it currently produces this:
0 | |
---|---|
Hand | ['No', 'No', 'No', 'No', 'No', 'No', 'No'] |
Pose | ['Seating Back', 'Seating Back', 'Seating Back'] |
As you can see, each array is stored whole in a single cell, while I need the values un-nested so that each value occupies its own cell.
My code:
array_all = {'Hand': alldata, 'Pose': keypoints}
df = pd.Series(array_all)
# Keypoint detection
df.to_csv('test.csv',
          mode='w',
          header=True,
          index=True)
df.transpose()
To give context to the object, here is a snippet on one of them
for data_point in results.face_landmarks.landmark:
    if 0.6 <= data_point.x < 0.8:
        face_position.append('Straight')
    else:
        face_position.append('Angled')
All of the arrays are populated in this manner.
If I instead build it like this:
array_all = {'Hand': [alldata],
             'Seating': [keypoints],
             'Pose': [face_position]}
df = pd.DataFrame(array_all)
I still have the same issue; it returns this:
Hand | Pose |
---|---|
['No', 'No', 'No', 'No', 'No', 'No', 'No'] | ['Seating Back', 'Seating Back', 'Seating Back'] |
CodePudding user response:
You're creating a Series instead of a DataFrame. You want to create a DataFrame.
Change
df = pd.Series(array_all)
to
df = pd.DataFrame(array_all)
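To see why this one-line change matters, here is a minimal sketch comparing the two constructors. The list contents are made-up sample values standing in for `alldata` and `keypoints`, and they are assumed to be equal in length:

```python
import pandas as pd

# Hypothetical sample data standing in for alldata and keypoints
alldata = ['No', 'Yes', 'No']
keypoints = ['Seating Back', 'Seating Back', 'Seating Back']
array_all = {'Hand': alldata, 'Pose': keypoints}

# A Series built from a dict has one entry per key,
# so each cell holds an entire list.
as_series = pd.Series(array_all)

# A DataFrame built from the same dict has one column per key
# and one row per list element.
as_frame = pd.DataFrame(array_all)

print(as_series.shape)  # two entries: 'Hand' and 'Pose'
print(as_frame.shape)   # three rows, two columns
```

The Series keeps the dict keys as its index, which is why the CSV in the question shows `Hand` and `Pose` as row labels with a whole list next to each.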
Edit: If you can't create a DataFrame because the arrays are not all of equal length, add empty strings to them until they all are as long as the longest of them, like this:
array_all = {'Hand': alldata, 'Pose': keypoints, 'Face Pose': face_position}
# Pad the arrays with '' so that they are all the same length
max_size = max([len(array) for array in array_all.values()])
for array in array_all.values():
    array.extend([''] * (max_size - len(array)))
df = pd.DataFrame(array_all)
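Note that `extend` modifies the original lists in place, so `alldata` and the others stay padded afterwards. If that is a problem, a possible variation is to build padded copies instead. This is only a sketch: it uses made-up sample values and assumes the values are plain Python lists:

```python
import pandas as pd

# Hypothetical sample data of unequal length
alldata = ['No', 'Yes', 'No', 'Yes']
keypoints = ['Seating Back', 'Seating Back']
array_all = {'Hand': alldata, 'Pose': keypoints}

max_size = max(len(values) for values in array_all.values())

# Build new padded lists; the originals are left untouched
padded = {
    key: values + [''] * (max_size - len(values))
    for key, values in array_all.items()
}
df = pd.DataFrame(padded)

print(df)
print(len(keypoints))  # the original list keeps its length
```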
Or, a better solution:
array_all = {'Hand': alldata, 'Pose': keypoints, 'Face Pose': face_position}
array_all = {k: pd.Series(v) for k, v in array_all.items()}
df = pd.DataFrame(array_all)
Last snippet partially credited to @Jeff's answer here.
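As a rough illustration of how the Series-based version behaves with unequal lengths, the sketch below uses made-up sample values and writes to an in-memory buffer instead of `test.csv`. The shorter column is filled with `NaN`, which `to_csv` writes as an empty field by default:

```python
import io

import pandas as pd

# Hypothetical sample data of unequal length
alldata = ['No', 'Yes', 'No']
keypoints = ['Seating Back', 'Seating Back']

# Wrapping each list in a Series lets pandas align on the index
# and fill the missing positions with NaN.
array_all = {'Hand': pd.Series(alldata), 'Pose': pd.Series(keypoints)}
df = pd.DataFrame(array_all)

# Write to an in-memory buffer so the example leaves no file behind
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(df)
print(csv_text)
```

Passing `index=False` drops the row numbers from the output; keep `index=True` as in the question if they are wanted.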