How can I split a column into multiple columns?

I have data like this:

import pandas as pd

d1 = pd.DataFrame({"Mother_id": 11111, "Children_id": [12476, 19684]})
d2 = pd.DataFrame({"Mother_id": 22222, "Children_id": [24153, 29654, 25417]})

d3 = pd.concat([d1, d2], axis=0)

Desired Output:

Mother_id  child_id_1  child_id_2  child_id_3  ...  number_of_children
    11111       12476       19684         NaN  ...                   2
    22222       24153       29654       25417  ...                   3

Can anyone help? Thanks.

CodePudding user response：

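Here is a solution using pivot. It first uses groupby with cumcount to compute a helper column n holding each child's position within its Mother_id group (1, 2, 3, ...). That helper column then supplies the column labels for the pivot, and the number of children is the count of non-missing values in each row.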

(d3.assign(n=d3.groupby('Mother_id').cumcount().add(1))
   .pivot(index='Mother_id', columns='n', values='Children_id')
   .add_prefix('child_')
   .assign(n_children=lambda d: d.notna().sum(axis=1))
)

output:

           child_1  child_2  child_3  n_children
Mother_id                                       
11111      12476.0  19684.0      NaN           2
22222      24153.0  29654.0  25417.0           3
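
To make the intermediate step visible, here is a minimal sketch of the same pivot approach with the helper column pulled out into its own variable. The names rank, helper and wide are only illustrative, and the last steps are optional additions that are not part of the answer above: rename_axis drops the leftover "n" label on the columns, and the cast to the nullable Int64 dtype assumes you would rather see integer ids with <NA> than floats with NaN.

```python
import pandas as pd

d1 = pd.DataFrame({"Mother_id": 11111, "Children_id": [12476, 19684]})
d2 = pd.DataFrame({"Mother_id": 22222, "Children_id": [24153, 29654, 25417]})
d3 = pd.concat([d1, d2], ignore_index=True)

# Helper column: position of each child within its mother's group (1, 2, 3, ...)
rank = d3.groupby("Mother_id").cumcount().add(1)
helper = d3.assign(n=rank)

wide = (
    helper.pivot(index="Mother_id", columns="n", values="Children_id")
    .add_prefix("child_")
    .rename_axis(columns=None)  # optional: drop the leftover "n" columns label
    .astype("Int64")            # optional: nullable ints instead of floats
)

# Count the non-missing children per mother, then turn the index into a column
wide["n_children"] = wide.notna().sum(axis=1)
wide = wide.reset_index()
print(wide)
```

If the float columns are fine for your use case, the astype call can simply be left out.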

CodePudding user response：

concat alone is not enough. You also need a groupby that collects each mother's children into a list, and then you expand each list into its own columns:

s = pd.concat([d1,d2]).groupby('Mother_id').Children_id.agg(list).apply(pd.Series).add_prefix('child_id_')
s['number_of_child'] = s.notna().sum(1)
s = s.reset_index()
s
Out[95]: 
   Mother_id  child_id_0  child_id_1  child_id_2  number_of_child
0      11111     12476.0     19684.0         NaN                2
1      22222     24153.0     29654.0     25417.0                3
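
The columns above start at child_id_0, while the question asked for child_id_1 onward. A possible variation, sketched here under the assumption that 1-based labels and the name number_of_children are what is wanted, builds the wide frame directly from the lists with the DataFrame constructor instead of apply(pd.Series). The variable name lists is only illustrative.

```python
import pandas as pd

d1 = pd.DataFrame({"Mother_id": 11111, "Children_id": [12476, 19684]})
d2 = pd.DataFrame({"Mother_id": 22222, "Children_id": [24153, 29654, 25417]})

# One list of children per mother
lists = pd.concat([d1, d2]).groupby("Mother_id")["Children_id"].agg(list)

# Build the wide frame from the lists; shorter lists are padded with NaN
s = pd.DataFrame(lists.tolist(), index=lists.index)
s.columns = [f"child_id_{i + 1}" for i in s.columns]  # 1-based labels

# Count the non-missing children per mother, then turn the index into a column
s["number_of_children"] = s.notna().sum(axis=1)
s = s.reset_index()
print(s)
```

Columns that contain a padded NaN come out as floats, the same as in the output shown above.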