Home > OS >  Pandas: Apply transformations to all character columns
Pandas: Apply transformations to all character columns

Time:07-20

I am working in Python to try and apply a few transformations to all character/string columns in a pandas dataframe. The transformations are:

  • Make everything uppercase
  • Trim the white space

I come from an R background and this can be achieved via something like


mydf <- mydf %>% 
  dplyr::mutate_if(is.character, toupper)
  dplyr::mutate_if(is.character, trimws)

For Python I am at a loss. I have tried the below where it first identifies all the character columns and then attempts to trim the whitespace and make all the character columns upper case (Species in this case)

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

# Create a sample dataset
iris = load_iris()

df= pd.DataFrame(data= np.c_[iris['data'], iris['target']],
                 columns= iris['feature_names']   ['target'])

df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)

# Make character columns upper case and then trim the white space
string_dtypes = df.convert_dtypes().select_dtypes("string")
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.upper())
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.strip())

df

I appreciate this might be a very basic question and thank you in advance to anyone who takes the time to help

CodePudding user response:

You should be able to do this in one line with method chaining:

df.astype(str).apply(lambda x: x.str.upper().str.strip())

Output:

    sepal length (cm)   sepal width (cm)    petal length (cm)   petal width (cm)    target  species
0   5.1 3.5 1.4 0.2 0.0 SETOSA
1   4.9 3.0 1.4 0.2 0.0 SETOSA
2   4.7 3.2 1.3 0.2 0.0 SETOSA
3   4.6 3.1 1.5 0.2 0.0 SETOSA
4   5.0 3.6 1.4 0.2 0.0 SETOSA
... ... ... ... ... ... ...
145 6.7 3.0 5.2 2.3 2.0 VIRGINICA
146 6.3 2.5 5.0 1.9 2.0 VIRGINICA
147 6.5 3.0 5.2 2.0 2.0 VIRGINICA
148 6.2 3.4 5.4 2.3 2.0 VIRGINICA
149 5.9 3.0 5.1 1.8 2.0 VIRGINICA
  • Related