How can I sort a Pandas dataframe with text and numeric columns while ignoring the case?
import pandas as pd

df = pd.DataFrame({
    'A': list('aabbCC'),
    'B': [2, 1, 2, 1, 10, 1]
})
Based on the answer to "Sort dataframe by multiple columns while ignoring case", I tried:
df.sort_values(by=[ 'A', 'B'], inplace=True, key=lambda x: x.str.lower())
This raises an error:
builtins.AttributeError: Can only use .str accessor with string values!
How do I have to modify the key function?
CodePudding user response:
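The key function is applied to each column in by separately, so x.str.lower() fails when it reaches the numeric column B. Use an if-else expression that lowercases only the non-numeric columns: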
import numpy as np

f = lambda x: x if np.issubdtype(x.dtype, np.number) else x.str.lower()
df.sort_values(by=['A', 'B'], inplace=True, key=f)
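For reference, here is a minimal self-contained sketch of the same idea on the question's data. It swaps np.issubdtype for pandas.api.types.is_numeric_dtype, on the assumption that the pandas helper is the safer check when a column uses a pandas extension dtype. The function name casefold_text is made up for this example.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({
    'A': list('aabbCC'),
    'B': [2, 1, 2, 1, 10, 1],
})

def casefold_text(col):
    # sort_values calls the key once per column listed in `by`,
    # so numeric columns must be returned unchanged.
    # `casefold_text` is a hypothetical name used only in this sketch.
    return col if is_numeric_dtype(col) else col.str.lower()

# Returns a new frame; the original casing of column A is preserved.
result = df.sort_values(by=['A', 'B'], key=casefold_text)
print(result)
```

The key only affects the comparison, so the values in A keep their original case in the sorted result.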
Or sort a copy in which column A is lowercased, then use its index to reorder the original:
df = df.loc[df.assign(A=df['A'].str.lower()).sort_values(by=[ 'A', 'B']).index]
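The df.loc[...] lookup above relies on the index labels being unique. A possible variant that avoids this is to sort on a temporary helper column and drop it afterwards. This is only a sketch, and the column name _A_lower is a made-up placeholder.

```python
import pandas as pd

df = pd.DataFrame({
    'A': list('aabbCC'),
    'B': [2, 1, 2, 1, 10, 1],
})

result = (
    df.assign(_A_lower=df['A'].str.lower())   # hypothetical helper column
      .sort_values(by=['_A_lower', 'B'])      # sort on the lowercased text, then on B
      .drop(columns='_A_lower')               # remove the helper again
)
print(result)
```

Because the rows are never looked up by label, this also works when the index contains duplicates.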