I have a dataframe as follows:
Machine Time Part PowerA PowerB
1 20:30 1 0.1 0.4
1 20:30 2 0.9 0.7
1 20:31 1 0.3 0.1
1 20:31 2 0.2 0.3
2 20:30 1 0.2 0.5
2 20:31 1 0.8 0.4
I want it to be like:
Machine Time Part1_PowerA Part1_PowerB Part2_PowerA Part2_PowerB
1 20:30 0.1 0.4 0.9 0.7
1 20:31 0.3 0.1 0.2 0.3
2 20:30 0.2 0.5 -1.0 -1.0
2 20:31 0.8 0.4 -1.0 -1.0
The objective is that I create a column for each Part
and Power
and fill the values as shown. Each machine has a variable number of parts, but the maximum is 8 (which would lead to columns Part8_PowerA
and Part8_PowerB
). When a machine doesn't have a certain part, the values for the Part_Power are filled with -1.
I have looked quite a while for solutions, including this one, but I wasn't able to adapt to my situation, where I actually change the names of the columns as a combination of a row value already existent column.
Thanks!
CodePudding user response:
Let us do pivot_table
then swaplevel
s = df.pivot_table(index= ['Machine','Time'],
columns = df.Part.astype(str).radd('Part'),
values=['PowerA','PowerB'],
fill_value=-1).swaplevel(1,0, axis=1).sort_index(level=0, axis=1)
s.columns = s.columns.map('_'.join)
s.reset_index(inplace=True)
s
Out[751]:
Machine Time Part1_PowerA Part1_PowerB Part2_PowerA Part2_PowerB
0 1 20:30 0.1 0.4 0.9 0.7
1 1 20:31 0.3 0.1 0.2 0.3
2 2 20:30 0.2 0.5 -1.0 -1.0
3 2 20:31 0.8 0.4 -1.0 -1.0
CodePudding user response:
You can use pivot
in combination with string formatting for a fairly clean solution:
out = (
df.pivot(['Machine', 'Time'], 'Part')
.fillna(-1)
.sort_index(level=1, axis=1)
)
out.columns = out.columns.map('Part{0[1]}_{0[0]}'.format)
out = out.reset_index()
print(out)
Machine Time Part1_PowerA Part1_PowerB Part2_PowerA Part2_PowerB
0 1 20:30 0.1 0.4 0.9 0.7
1 1 20:31 0.3 0.1 0.2 0.3
2 2 20:30 0.2 0.5 -1.0 -1.0
3 2 20:31 0.8 0.4 -1.0 -1.0
Alternatively, if you want a method chained solution, you'll need to use .pipe
to access the current columns values from .set_axis
so you end up with some extra code bloat:
print(
df.pivot(['Machine', 'Time'], 'Part')
.fillna(-1)
.sort_index(level=1, axis=1)
.pipe(lambda d:
d.set_axis(
d.columns.map('Part{0[1]}_{0[0]}'.format),
axis='columns'
)
)
.reset_index()
)
Machine Time Part1_PowerA Part1_PowerB Part2_PowerA Part2_PowerB
0 1 20:30 0.1 0.4 0.9 0.7
1 1 20:31 0.3 0.1 0.2 0.3
2 2 20:30 0.2 0.5 -1.0 -1.0
3 2 20:31 0.8 0.4 -1.0 -1.0
CodePudding user response:
You may do:
df = pd.pivot_table(df, values=['PowerA','PowerB'], index=['Machine', 'Time'], columns='Part')
df.columns = df.columns.to_flat_index()
df.columns = [ 'Part' str(col[1]) '_' col[0] for col in df.columns]
df.reset_index(inplace=True)
df.fillna(-1,inplace=True)