Convert combinations of row column as Column headers-CodePudding

I have a dataframe as follows:

Machine    Time      Part    PowerA    PowerB
1         20:30       1        0.1      0.4
1         20:30       2        0.9      0.7
1         20:31       1        0.3      0.1
1         20:31       2        0.2      0.3
2         20:30       1        0.2      0.5
2         20:31       1        0.8      0.4

I want it to be like:

Machine    Time      Part1_PowerA   Part1_PowerB    Part2_PowerA    Part2_PowerB
1         20:30           0.1            0.4            0.9             0.7
1         20:31           0.3            0.1            0.2             0.3
2         20:30           0.2            0.5           -1.0            -1.0
2         20:31           0.8            0.4           -1.0            -1.0

The objective is that I create a column for each Part and Power and fill the values as shown. Each machine has a variable number of parts, but the maximum is 8 (which would lead to columns Part8_PowerA and Part8_PowerB). When a machine doesn't have a certain part, the values for the Part_Power are filled with -1.

I have looked quite a while for solutions, including this one, but I wasn't able to adapt to my situation, where I actually change the names of the columns as a combination of a row value already existent column.

Thanks!

CodePudding user response：

Let us do pivot_table then swaplevel

s = df.pivot_table(index= ['Machine','Time'], 
                   columns = df.Part.astype(str).radd('Part'),
                   values=['PowerA','PowerB'],
                   fill_value=-1).swaplevel(1,0, axis=1).sort_index(level=0, axis=1)

s.columns = s.columns.map('_'.join)
s.reset_index(inplace=True)
s
Out[751]: 
   Machine   Time  Part1_PowerA  Part1_PowerB  Part2_PowerA  Part2_PowerB
0        1  20:30           0.1           0.4           0.9           0.7
1        1  20:31           0.3           0.1           0.2           0.3
2        2  20:30           0.2           0.5          -1.0          -1.0
3        2  20:31           0.8           0.4          -1.0          -1.0

CodePudding user response：

You can use pivot in combination with string formatting for a fairly clean solution:

out = (
    df.pivot(['Machine', 'Time'], 'Part')
    .fillna(-1)
    .sort_index(level=1, axis=1)
)

out.columns = out.columns.map('Part{0[1]}_{0[0]}'.format)
out = out.reset_index()

print(out)
   Machine   Time  Part1_PowerA  Part1_PowerB  Part2_PowerA  Part2_PowerB
0        1  20:30           0.1           0.4           0.9           0.7
1        1  20:31           0.3           0.1           0.2           0.3
2        2  20:30           0.2           0.5          -1.0          -1.0
3        2  20:31           0.8           0.4          -1.0          -1.0

Alternatively, if you want a method chained solution, you'll need to use .pipe to access the current columns values from .set_axis so you end up with some extra code bloat:

print(
    df.pivot(['Machine', 'Time'], 'Part')
    .fillna(-1)
    .sort_index(level=1, axis=1)
    .pipe(lambda d: 
        d.set_axis(
            d.columns.map('Part{0[1]}_{0[0]}'.format),
            axis='columns'
        )
    )
    .reset_index()
)

   Machine   Time  Part1_PowerA  Part1_PowerB  Part2_PowerA  Part2_PowerB
0        1  20:30           0.1           0.4           0.9           0.7
1        1  20:31           0.3           0.1           0.2           0.3
2        2  20:30           0.2           0.5          -1.0          -1.0
3        2  20:31           0.8           0.4          -1.0          -1.0

CodePudding user response：

You may do:

df = pd.pivot_table(df, values=['PowerA','PowerB'], index=['Machine', 'Time'], columns='Part')
df.columns = df.columns.to_flat_index()
df.columns = [ 'Part'   str(col[1])   '_'    col[0] for col in df.columns]
df.reset_index(inplace=True)
df.fillna(-1,inplace=True)