Convert DataFrame column names from camel case to snake case

Time:12-01

I want to change the column labels of a Pandas DataFrame from

['evaluationId', 'createdAt', 'scheduleEndDate', 'sharedTo', ...]

to

['EVALUATION_ID', 'CREATED_AT', 'SCHEDULE_END_DATE', 'SHARED_TO', ...]

I have a lot of columns with the pattern "aaaBb", and I want to rename them to the pattern "AAA_BB".

Can anyone help me?

Cheers

I tried something like

from unidecode import unidecode  # third-party package
new_columns = [unidecode(x).upper() for x in df.columns]

But I have no idea how to turn this into a full solution.

CodePudding user response:

You can use a regex with str.replace to detect the lowercase-UPPERCASE shifts and insert a _, then str.upper:

df.columns = (df.columns
                .str.replace('(?<=[a-z])(?=[A-Z])', '_', regex=True)
                .str.upper()
             )

Before:

  evaluationId createdAt scheduleEndDate sharedTo
0          NaN       NaN             NaN      NaN

After:

  EVALUATION_ID CREATED_AT SCHEDULE_END_DATE SHARED_TO
0           NaN        NaN               NaN       NaN
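To see what the regex does outside of pandas, here is a minimal sketch using only the standard re module. The helper name to_upper_snake is hypothetical and not part of the answer above; it applies the same pattern to plain strings.

```python
import re

# Zero-width match at every lowercase-to-uppercase boundary (same pattern as above).
_BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")


def to_upper_snake(name: str) -> str:
    """Hypothetical helper: insert '_' at each boundary, then upper-case."""
    return _BOUNDARY.sub("_", name).upper()


names = ["evaluationId", "createdAt", "scheduleEndDate", "sharedTo"]
converted = [to_upper_snake(n) for n in names]
print(converted)
```

Because the pattern only looks for a lowercase letter followed by an uppercase one, a label such as "HTTPResponse" or "address2Line" gets no underscore at all. If your labels contain acronyms or digits, the pattern would probably need to be extended.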

CodePudding user response:

To change the labels of the columns in a Pandas DataFrame, you can use the DataFrame.rename() method. This method takes a dictionary as its argument, where the keys are the old column labels and the values are the new labels.

For example, to change the column labels in the DataFrame you provided, you can use the following code:

import pandas as pd

# Create a sample DataFrame with one label per column
df = pd.DataFrame(columns=['evaluationId', 'createdAt', 'scheduleEndDate', 'sharedTo'])

# Use the rename() method to map each old label to its new label
df = df.rename(columns={
    'evaluationId': 'EVALUATION_ID',
    'createdAt': 'CREATED_AT',
    'scheduleEndDate': 'SCHEDULE_END_DATE',
    'sharedTo': 'SHARED_TO',
    # ... and so on for the remaining columns
})

Note that each key in the dictionary passed to the rename() method must match an existing column label exactly (here, a string); labels without a matching key are left unchanged.

If you have many columns whose labels contain the literal text "aaaBb" and you want to change it to "AAA_BB", you can use Python's str.replace() method to do this in one go. This method takes two arguments: the substring you want to find, and the substring you want to replace it with. It matches literal text only, so for arbitrary camel-case labels you need a regex, as in the previous answer.

For example, you could use the following code to replace all instances of the substring "aaaBb" with "AAA_BB" in the column labels:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame(columns=['aaaBb1', 'aaaBb2', 'aaaBb3'])

# Pass a function to rename(); it receives each column label as a plain str,
# so call str.replace() on it directly (a label has no .str accessor)
df = df.rename(columns=lambda x: x.replace('aaaBb', 'AAA_BB'))

This code passes a lambda function to rename(), which calls it once for each column label. Each call uses the label's str.replace() method, so all instances of "aaaBb" in the column labels are replaced with "AAA_BB" in one go.
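The callable form of rename() also works for the general camel-case problem from the question, not just a fixed substring. The following is a minimal sketch; the helper name to_upper_snake is hypothetical, and the code assumes every column label is a string.

```python
import re

import pandas as pd


def to_upper_snake(label: str) -> str:
    # Hypothetical helper: insert '_' at each lowercase-to-uppercase
    # boundary, then upper-case. Assumes the label is a str.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", "_", label).upper()


df = pd.DataFrame(columns=["evaluationId", "createdAt", "scheduleEndDate", "sharedTo"])

# rename() calls the function once per column label and returns a new DataFrame
renamed = df.rename(columns=to_upper_snake)
print(list(renamed.columns))
```

rename() leaves the original DataFrame untouched, so assign the result back to df if you want to keep the new labels.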
