Home > Enterprise >  Using Pandas convert column values to column header
Using Pandas convert column values to column header

Time:10-11

Data Set

result = [['Nikolas', 'Webber', 63, 'Male', 'Options', 'Option - 1'],
['Sanaa', 'Kirkpat', 53, 'Male', 'Options', 'Option - 3'],
['Laaibah', 'Andrson', 52, 'Male', 'Options', 'Option - 1'],
['Beck', 'Villanu', 61, 'Male', 'Options', 'Option - 2'],
['Blair', 'Hook', 67, 'Male', 'Options', 'Option - 1'],
['Kavan', 'Hodges', 47, 'Male', 'Options', 'Option - 3'],
['Kajus', 'Oco', 54, 'Male', 'Options', 'Option - 1']]

pandas DataFrame created

df = pandas.DataFrame(result, columns=['First Name', 'Last Name', 'Age', 'Gender', 'field_label', 'field_value'])

Values in column field_label has to be transposed as column header replacing column header name - field_value to Options

data = df.pivot_table(index=['First Name', 'Last Name', 'Age', 'Gender'], columns='field_label', values='field_value', fill_value='', aggfunc='first')
data = data.reset_index().rename_axis(None, axis=1)

Tried serveral other ways to achieve this using pivot_table and aggfunc='first' but some of the records are getting omitted when result set is too large to create dataframe.

Input
First Name  Last Name   Age Gender  field_label field_value
Nikolas     Webber      63  Male    Options     Option - 1
Sanaa       Kirkpat     53  Male    Options     Option - 3
Laaibah     Andrson     52  Male    Options     Option - 1
Beck        Villanu     61  Male    Options     Option - 2
Blair       Hook        67  Male    Options     Option - 1
Kavan       Hodges      47  Male    Options     Option - 3
Kajus       Oco         54  Male    Options     Option - 1

Any pointer to get below Output using pandas pivot_table or any other way would be really appreciated.

Output
First Name  Last Name   Age Gender  Options
Nikolas     Webber      63  Male    Option - 1
Sanaa       Kirkpat     53  Male    Option - 3
Laaibah     Andrson     52  Male    Option - 1
Beck        Villanu     61  Male    Option - 2
Blair       Hook        67  Male    Option - 1
Kavan       Hodges      47  Male    Option - 3
Kajus       Oco         54  Male    Option - 1

CodePudding user response:

I think code below will help you:

# create your processing function
def handle_function(row: pd.Series) -> pd.Series:
    """
    What you want to do here...
    """

    # get label and value
    label = row["field_label"]
    value = row["field_value"]
    
    # create column for label if it exists no create
    row[label] = value
   
    # return new row
    return row


# apply on your dataframe
df.apply(handle_function, axis=1)

# remove columns for label and values
df.drop(["field_label", "field_value", axis=1, inplace=True]
  • Related