Suppose the next packed df, where column_name
represents the name of the columns and row_name
the name of the rows in an unpacked dataframe:
column_name row_name value
0 col1 a 1
1 col1 b 2
2 col1 c 3
3 col2 d 4
4 col2 e 5
5 col2 f 6
6 col2 g 7
7 col2 h 8
And the unpacked dataframe looks like this:
individual col1 col2
0 1 a e
1 2 b f
2 3 c g
3 4 a h
4 5 b e
5 6 c f
6 7 a g
7 8 b h
Desired output could look like this:
individual col1 col2 value_col1 value_col2
1 a e 1 5
2 b f 2 6
3 c g 3 7
4 a h 1 8
5 b e 2 5
6 c f 3 6
7 a g 1 7
8 b h 2 8
Is there any way to make a join in order to unpack value
column into two distinct columns in the unpacked dataframe?
Reproducible data:
unpacked_df = pd.DataFrame({
'individual': [1,2,3,4,5,6,7,8],
'col1':['a','b','c','a','b','c','a','b'],
'col2':['e','f','g','h','e','f','g','h']
})
packed_df = pd.DataFrame({
'column_name': ['col1','col1','col1','col2','col2','col2','col2','col2'],
'row_name': ['a','b','c','d','e','f','g','h'],
'value':[1,2,3,4,5,6,7,8]
})
CodePudding user response:
You can do a double merge
, first for col1
, then for col2
:
unpacked_df.merge(packed_df[packed_df.column_name.eq("col1")], left_on="col1", right_on="row_name").merge(packed_df[packed_df.column_name.eq("col2")], left_on="col2", right_on="row_name", suffixes=("_col1","_col2")).sort_values("individual")[["individual", "col1", "col2", "value_col1", "value_col2"]]
Result:
individual col1 col2 value_col1 value_col2
0 1 a e 1 5
6 2 b f 2 6
5 3 c g 3 7
2 4 a h 1 8
1 5 b e 2 5
7 6 c f 3 6
4 7 a g 1 7
3 8 b h 2 8
CodePudding user response:
This works:
cols = packed_df['column_name'].unique()
dct = {k: [] for k in cols}
for row in unpacked_df.iloc:
for col in cols:
dct[col].append(packed_df[(packed_df['column_name'] == col) & (packed_df['row_name'] == row[col])]['value'].iloc[0])