from a database I get the following table into a python dataframe df
:
FunctionID | FunctionText | FunctionModule | UserGroup |
---|---|---|---|
1 | Fct1 | ModX | GroupA |
2 | Fct2 | ModX | GroupA |
2 | Fct2 | ModX | GroupB |
3 | Fct3 | ModY | GroupB |
3 | Fct3 | ModY | GroupC |
. | ... | ... | ... |
3000 | Fct3000 | ModZ | GroupF |
My goal is to get a pivot-like table that looks like this one:
FunctionID | FunctionText | FunctionModule | GroupA | GroupB | GroupC | ... | GroupF |
---|---|---|---|---|---|---|---|
1 | Fct1 | ModX | X | ... | |||
2 | Fct2 | ModX | X | X | ... | ||
3 | Fct3 | ModY | X | X | ... | ||
. | ... | ... | ... | ... | ... | ... | ... |
3000 | Fct3000 | ModZ | ... | X |
So, the first 3 columns shall stay as is whereas the entries of UserGroup column shall be the added columns of the pivot table.
I tried some approaches like
df2 = df.pivot_table(index=df.iloc[:3],columns='UserGroup',values='UserGroup')
but without success. Maybe I have to use pivot()
or stack()
or unstack()
? But none of those led me to the desired target table.
CodePudding user response:
IIUC, you could add an extra column and pivot_table
with aggfunc='first'
:
(df
.assign(value='x') # or value=df['UserGroup'] if you want the names as value
.pivot_table(index=list(df.columns[:3]),
columns='UserGroup',
values='value',
aggfunc='first',
fill_value='',
)
)
output:
UserGroup GroupA GroupB GroupC
FunctionID FunctionText FunctionModule
1 Fct1 ModX x
2 Fct2 ModX x x
3 Fct3 ModY x x
NB. note that index=df.iloc[:3]
will also select 'UserGroup', I used index=list(df.columns[:3])
here