Home > database >  Python Trim Multiple Column Names of a dataframe
Python Trim Multiple Column Names of a dataframe

Time:02-26

I have data frame of 20 columns. All of them have a common text and a serial number. I want to trim the text part and make the name shorter. Below is an example:

xdf = pd.DataFrame({'Column1':[10,20],'Column2':[80,90]})

   Column1  Column2
0       10       80
1       20       90

Expected output:

        C1       C2
0       10       80
1       20       90

Solution1:

oldcols = ['Column1','Column2']
newcols = ['C1','C2']
xdf.rename(columns=dict(zip(oldcols,newcols)),inplace=True)

        C1       C2
0       10       80
1       20       90

Solution2:

for i in range(len(oldcols)):
    xdf.rename(columns={'%s'%(xdf[i]):'%s'%(xdf[i].replace('Column','C'))},inplace=True)

raise KeyError(key) from err

Solution1 works fine but I have to prepare an old and new column names list. Instead, I want to iterate through each column name and replace the column text. However, solution2 is not working.

CodePudding user response:

You could use str.findall on the columns to split into text and number; then use a list comprehension to take only the first letter and join it with the numbers for each column name:

xdf.columns = [x[0] y for li in xdf.columns.str.findall(r'([A-Za-z] )(\d )') for x,y in li]

Output:

   C1  C2
0  10  80
1  20  90
  • Related