I have this code
df = pd.DataFrame({'an':{0: 'f', 1: 'i', 2:'-' , 3:'-' , 4:'f' , 5:'c,f,i,j' , 6:'c,d,e,j'},
'nv':{0: [-1], 1: [-1], 2: ['-'], 3:['-'] , 4:[-2] , 5:[-2,-1,-3,-1] , 6:[-2,-1,-2,-1]},
})
which yields
an nv
0 f [-1]
1 i [-1]
2 - [-]
3 - [-]
4 f [-2]
5 c,f,i,j [-2, -1, -3, -1]
6 c,d,e,j [-2, -1, -2, -1]
I would like to split column ['an'], so that each value inside ['an'] becomes column becomes a column with the value currently assigned on the bracket from ['nv'] column.
Desired output:
an nv c d e f i j
0 f [-1] -1
1 i [-1] -1
2 - [-]
3 - [-]
4 f [-2] -2
5 c,f,i,j [-2, -1, -3, -1] -2 -1 -3 -1
6 c,d,e,j [-2, -1, -2, -1] -2 -1 -2 -1
Solution on this question is similar, but it does not work for my problem
CodePudding user response:
Try:
df = pd.concat(
[
df,
(
x := pd.DataFrame(
[
{k: v for k, v in zip(a, b) if v != "-"}
for a, b in zip(df.an.str.split(","), df.nv)
]
).fillna("")
)[sorted(x)],
],
axis=1,
)
print(df)
Prints:
an nv c d e f i j
0 f [-1] -1.0
1 i [-1] -1.0
2 - [-]
3 - [-]
4 f [-2] -2.0
5 c,f,i,j [-2, -1, -3, -1] -2.0 -1.0 -3.0 -1.0
6 c,d,e,j [-2, -1, -2, -1] -2.0 -1.0 -2.0 -1.0
CodePudding user response:
Let's split
the an
column then explode
the dataframe on multiple columns, then reshape with pivot
:
df.join(
df
.query("an != '-'")
.assign(an=df['an'].str.split(','))
.explode(['an', 'nv'])
.pivot(columns='an', values='nv')
).fillna('')
an nv c d e f i j
0 f [-1] -1
1 i [-1] -1
2 - [-]
3 - [-]
4 f [-2] -2
5 c,f,i,j [-2, -1, -3, -1] -2 -1 -3 -1
6 c,d,e,j [-2, -1, -2, -1] -2 -1 -2 -1
CodePudding user response:
Using crosstab
as a way to reshape:
s = df['an'].str.split(',').explode()
df.join(pd.crosstab(s.index, s, values=df['nv'].explode(), aggfunc='first')
.fillna('').drop(columns='-'))
output:
an nv c d e f i j
0 f [-1] -1
1 i [-1] -1
2 - [-]
3 - [-]
4 f [-2] -2
5 c,f,i,j [-2, -1, -3, -1] -2 -1 -3 -1
6 c,d,e,j [-2, -1, -2, -1] -2 -1 -2 -1