Pandas DataFrame (long) to Series ("wide")

I have the following DataFrame:

	completeness	homogeneity	label_f1_score	label_precision	label_recall	mean_bbox_iou	mean_iou	px_accuracy	px_f1_score	px_iou	px_precision	px_recall	t_eval	v_score
mean	0.1	1	0.92	0.92	0.92	0.729377	0.784934	0.843802	0.898138	0.774729	0.998674	0.832576	1.10854	0.1
std	0.0707107	0	0.0447214	0.0447214	0.0447214	0.0574177	0.0313196	0.0341158	0.0224574	0.0299977	0.000432499	0.0327758	0.0588322	0.0707107

What I would like to obtain is a Series composed of completeness_mean, completeness_std, homogeneity_mean, homogeneity_std, ..., i.e. a label {column}_{index} for every cell.

Does Pandas have a function for this or do I have to iterate over all cells myself to build the desired result?

EDIT: I mean a Series with {column}_{index} as index and the corresponding values from the table.

(I believe this is not a duplicate of the other questions on SO about wide-to-long conversion.)

CodePudding user response：

IIUC, unstack and flatten the index:

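# unstack() yields a Series indexed by (column, index) pairs
df2 = df.unstack()
# join each pair into a single "column_index" label
df2.index = df2.index.map('_'.join)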

output:

completeness_mean       0.100000
completeness_std        0.070711
homogeneity_mean        1.000000
homogeneity_std         0.000000
label_f1_score_mean     0.920000
label_f1_score_std      0.044721
label_precision_mean    0.920000
label_precision_std     0.044721
label_recall_mean       0.920000
label_recall_std        0.044721
mean_bbox_iou_mean      0.729377
mean_bbox_iou_std       0.057418
mean_iou_mean           0.784934
mean_iou_std            0.031320
px_accuracy_mean        0.843802
px_accuracy_std         0.034116
px_f1_score_mean        0.898138
px_f1_score_std         0.022457
px_iou_mean             0.774729
px_iou_std              0.029998
px_precision_mean       0.998674
px_precision_std        0.000432
px_recall_mean          0.832576
px_recall_std           0.032776
t_eval_mean             1.108540
t_eval_std              0.058832
v_score_mean            0.100000
v_score_std             0.070711
dtype: float64
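
For anyone who wants to try the unstack approach without the original data, here is a minimal self-contained sketch. The three-column DataFrame is a cut-down stand-in for the table in the question, and the list comprehension is my substitute for `index.map('_'.join)`; I use it only because f-strings also cope with labels that are not strings.

```python
import pandas as pd

# Cut-down stand-in for the question's table (three of its columns).
df = pd.DataFrame(
    {
        "completeness": [0.1, 0.0707107],
        "homogeneity": [1.0, 0.0],
        "t_eval": [1.10854, 0.0588322],
    },
    index=["mean", "std"],
)

# With flat (non-MultiIndex) columns, unstack() returns a Series whose
# MultiIndex holds (column, index) pairs, walking the table column by column.
flat = df.unstack()

# Collapse each (column, index) pair into one "column_index" label.
# f-string formatting also works when a label is not a string.
flat.index = [f"{col}_{row}" for col, row in flat.index]

print(flat)
```

The values are untouched; only the index is replaced, so `flat["t_eval_std"]` returns the same number as `df.loc["std", "t_eval"]`.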

or with stack for a different order:

# stack() yields (index, column) pairs, so swap the levels before joining
df2 = df.stack()
df2.index = df2.swaplevel().index.map('_'.join)

output:

completeness_mean       0.100000
homogeneity_mean        1.000000
label_f1_score_mean     0.920000
label_precision_mean    0.920000
label_recall_mean       0.920000
mean_bbox_iou_mean      0.729377
mean_iou_mean           0.784934
px_accuracy_mean        0.843802
px_f1_score_mean        0.898138
px_iou_mean             0.774729
px_precision_mean       0.998674
px_recall_mean          0.832576
t_eval_mean             1.108540
v_score_mean            0.100000
completeness_std        0.070711
homogeneity_std         0.000000
label_f1_score_std      0.044721
label_precision_std     0.044721
label_recall_std        0.044721
mean_bbox_iou_std       0.057418
mean_iou_std            0.031320
px_accuracy_std         0.034116
px_f1_score_std         0.022457
px_iou_std              0.029998
px_precision_std        0.000432
px_recall_std           0.032776
t_eval_std              0.058832
v_score_std             0.070711
dtype: float64
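
To see that the two variants differ only in ordering, the following sketch builds both on a small stand-in DataFrame (two columns taken from the question) and compares them after sorting. The variable names are illustrative, and the explicit tuple unpacking replaces `swaplevel()` purely for readability.

```python
import pandas as pd

# Two-column stand-in for the question's table.
df = pd.DataFrame(
    {"completeness": [0.1, 0.0707107], "homogeneity": [1.0, 0.0]},
    index=["mean", "std"],
)

# stack() walks the table row by row; its MultiIndex is (index, column).
stacked = df.stack()
stacked.index = [f"{col}_{row}" for row, col in stacked.index]

# unstack() walks the table column by column; its MultiIndex is (column, index).
unstacked = df.unstack()
unstacked.index = [f"{col}_{row}" for col, row in unstacked.index]

# Same labels and values, just in a different order.
same_values = stacked.sort_index().equals(unstacked.sort_index())
print(same_values)
```

So the choice between stack and unstack comes down to whether you want all the means first or each column's mean and std side by side.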

CodePudding user response：

Is this what you're looking for?

# Both variants build only the {column}_{index} labels, not the values.
pd.merge(df.columns.to_frame(), df.index.to_frame(), 'cross').apply('_'.join, axis=1)
# OR
pd.Series(df.unstack().index.map('_'.join))

Output:

0        completeness_mean
1         completeness_std
2         homogeneity_mean
3          homogeneity_std
4      label_f1_score_mean
5       label_f1_score_std
6     label_precision_mean
7      label_precision_std
8        label_recall_mean
9         label_recall_std
10      mean_bbox_iou_mean
11       mean_bbox_iou_std
12           mean_iou_mean
13            mean_iou_std
14        px_accuracy_mean
15         px_accuracy_std
16        px_f1_score_mean
17         px_f1_score_std
18             px_iou_mean
19              px_iou_std
20       px_precision_mean
21        px_precision_std
22          px_recall_mean
23           px_recall_std
24             t_eval_mean
25              t_eval_std
26            v_score_mean
27             v_score_std
dtype: object
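
Note that the output above contains only the labels, while the EDIT in the question asks for the labels as index with the cell values as data. One possible way to pair them up, sketched on a two-column stand-in DataFrame (the names `labels`, `values` and `result` are mine, not from the answer):

```python
import pandas as pd

# Two-column stand-in for the question's table.
df = pd.DataFrame(
    {"completeness": [0.1, 0.0707107], "homogeneity": [1.0, 0.0]},
    index=["mean", "std"],
)

# Labels in column-major order: every index label for the first column,
# then every index label for the second column, and so on.
labels = [f"{col}_{row}" for col in df.columns for row in df.index]

# order="F" flattens the underlying array column by column,
# which matches the order of the labels built above.
values = df.to_numpy().ravel(order="F")

result = pd.Series(values, index=labels)
print(result)
```

This should give the same Series as the unstack approach; it is mainly useful if you already have the labels and just need to attach the numbers.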