What is the easiest way to go from:
df = pd.DataFrame({'col1': [1,1,2,3], 'col2': [2,4,3,5]})
group_l = ['a', 'b']
df
   col1  col2
0     1     2
1     1     4
2     2     3
3     3     5
to
   col1  col2 group
0     1     2     a
1     1     4     a
2     2     3     a
3     3     5     a
0     1     2     b
1     1     4     b
2     2     3     b
3     3     5     b
I've thought of a few solutions but none seem great.
- Use pd.MultiIndex.from_product, then reset_index. This would work fine if the initial DataFrame only had one column.
- Add a new column group where each element is ['a', 'b'], then use pd.DataFrame.explode. Feels inefficient.
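For reference, the explode idea could look roughly like the sketch below. It reuses df and group_l from above; the name exploded and the final sort are additions for illustration, included only because explode returns the rows interleaved (0 a, 0 b, 1 a, ...) rather than grouped.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 2, 3], 'col2': [2, 4, 3, 5]})
group_l = ['a', 'b']

# Put the full list of groups in every row, then explode it
# so each row is repeated once per group label.
lists = pd.Series([group_l] * len(df), index=df.index)
exploded = df.assign(group=lists).explode('group')

# explode interleaves the groups, so use a stable sort on the group
# column to keep the original row order within each group.
exploded = exploded.sort_values('group', kind='mergesort')
print(exploded)
```

This builds a temporary column of Python lists, which is likely why it feels inefficient on larger frames.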
CodePudding user response:
You could make one copy of the DataFrame per group, set the group value on each copy, and concatenate them:
import pandas as pd
df = pd.DataFrame({'col1': [1,1,2,3], 'col2': [2,4,3,5]})
df1 = df.copy()
df2 = df.copy()
df1['group'] = 'A'
df2['group'] = 'B'
df_out = pd.concat([df1,df2])
print(df_out)
gives output
   col1  col2 group
0     1     2     A
1     1     4     A
2     2     3     A
3     3     5     A
0     1     2     B
1     1     4     B
2     2     3     B
3     3     5     B
CodePudding user response:
One approach, using pd.concat:
group_l = ['a', 'b']
res = pd.concat([df.assign(group=e) for e in group_l])
print(res)
Output
   col1  col2 group
0     1     2     a
1     1     4     a
2     2     3     a
3     3     5     a
0     1     2     b
1     1     4     b
2     2     3     b
3     3     5     b
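Note that the result keeps the original index labels, so 0 to 3 appear once per group. If a fresh 0..n-1 index is preferred, the same idea can be wrapped in a small helper, sketched below. The function name repeat_with_groups and its parameters are hypothetical, made up for this example; only pd.concat, DataFrame.assign and ignore_index are actual pandas features.

```python
import pandas as pd

def repeat_with_groups(df, groups, col='group', reset_index=False):
    """Stack one copy of df per label in groups, tagging each copy in column col.

    Hypothetical helper for illustration, not part of pandas.
    """
    # assign returns a new DataFrame each time, so df itself is not modified.
    parts = [df.assign(**{col: g}) for g in groups]
    # ignore_index=True replaces the repeated labels with 0..n-1.
    return pd.concat(parts, ignore_index=reset_index)

df = pd.DataFrame({'col1': [1, 1, 2, 3], 'col2': [2, 4, 3, 5]})
res = repeat_with_groups(df, ['a', 'b'], reset_index=True)
print(res)
```

With reset_index=False it behaves like the one-liner above and keeps the duplicated index labels.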