I've been searching around for a while now, but I can't seem to find the answer to this small problem.
I have this code, which defines a function that lowercases the values:
import pandas as pd

df = {'name':['AL', 'EL', 'NAILA', 'DORI', 'KAKAEKA', 'GENTA', 'RUBY'],
      'living':['lagoa','sangiang','penjaringan','warakas','jonggol','cikarang', 'cikarang'],
      'food':['PIZZA','MEATBALL','CHICKEN','CAKE','CAKE','ONION','NOODLE'],
      'sub':['KOTA','KAB','WILAYAH','KAB','DAERAH','KOTA','WILAYAH'],
      'job':['Chef','Teacher','Police','Doctor','Students','Programmer','Lecturer'],
      'side_job':['Designer','Nurse','Designer','Programmer','Programmer','Teacher','Mentor'],
      'status':['Single','Single','Married','Single','Single','Divorced','Married'],
      'age':[20,25,20,18,25,40,37]
     }
df = pd.DataFrame(df)

def content_consistent(df):
    # Lowercase every object (string) column
    cols = df.select_dtypes(object).columns
    df[cols] = df[cols].apply(lambda x: x.str.lower())
    return df

df = content_consistent(df)
The result shows all values in lowercase, but I want some columns, such as 'sub' and 'status', to keep their original case.
I am expecting the output below, using simple code without looping:
      name       living      food      sub         job    side_job    status  age
0       al        lagoa     pizza     KOTA        chef    designer    Single   20
1       el     sangiang  meatball      KAB     teacher       nurse    Single   25
2    naila  penjaringan   chicken  WILAYAH      police    designer   Married   20
3     dori      warakas      cake      KAB      doctor  programmer    Single   18
4  kakaeka      jonggol      cake   DAERAH    students  programmer    Single   25
5    genta     cikarang     onion     KOTA  programmer     teacher  Divorced   40
6     ruby     cikarang    noodle  WILAYAH    lecturer      mentor   Married   37
CodePudding user response:
Use Index.difference to exclude some of the non-numeric columns, given as a list:
def content_consistent(df):
    cols = df.select_dtypes(object).columns.difference(['sub', 'status'])
    df[cols] = df[cols].apply(lambda x: x.str.lower())
    return df
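A minimal, self-contained sketch of this approach on a smaller sample frame than the one in the question. The `exclude` parameter and the `sample` frame are hypothetical additions for illustration, and selecting both "object" and "string" dtypes is an assumption meant to cover pandas versions that infer a dedicated string dtype. Note that Index.difference returns the remaining labels in sorted order, which does not matter here because they are only used for selection and assignment.

```python
import pandas as pd

def content_consistent(df, exclude=("sub", "status")):
    # `exclude` is a hypothetical parameter: columns that keep their original case.
    # Assumption: also select "string" so newer pandas string dtypes are covered.
    text_cols = df.select_dtypes(include=["object", "string"]).columns
    cols = text_cols.difference(list(exclude))
    # Lowercase only the remaining text columns
    df[cols] = df[cols].apply(lambda s: s.str.lower())
    return df

# Small hypothetical sample with the same kind of columns as the question
sample = pd.DataFrame({
    "name": ["AL", "EL"],
    "sub": ["KOTA", "KAB"],
    "job": ["Chef", "Teacher"],
    "status": ["Single", "Married"],
    "age": [20, 25],
})
result = content_consistent(sample)
print(result)
```

Passing the excluded names as an argument keeps the function reusable if other columns need to keep their case later.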
CodePudding user response:
You can exclude those columns with a list comprehension, as shown below:
df = pd.DataFrame(df)

def content_consistent(df):
    cols = df.select_dtypes(object).columns
    cols = [x for x in cols if x not in ['sub', 'status']]
    df[cols] = df[cols].apply(lambda x: x.str.lower())
    return df

df = content_consistent(df)
CodePudding user response:
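Select all columns except sub, status and age, make them lowercase, and then update the df in place. Note that a character class such as '[^subage]' matches single characters rather than whole names, so it would still select status; a negative lookahead excludes the exact names instead:
df.update(df.filter(regex=r'^(?!(?:sub|status|age)$)', axis=1).apply(lambda x: x.str.lower()))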
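To see which columns a regex passed to DataFrame.filter actually selects, it can help to list the matching names before lowercasing anything. The sketch below assumes the column names from the question and uses an empty frame (`empty`, a name made up here). It compares a negated character class such as `[^subage]` with a negative lookahead on whole names. filter(regex=...) keeps a label when `re.search` finds a match, so the character class keeps any name that contains at least one character outside the set.

```python
import pandas as pd

cols = ["name", "living", "food", "sub", "job", "side_job", "status", "age"]
empty = pd.DataFrame(columns=cols)  # only the labels matter for this check

# Negated character class: keeps any name with a character outside {s, u, b, a, g, e},
# so 'status' is kept because it contains 't'
by_char_class = list(empty.filter(regex="[^subage]", axis=1).columns)

# Negative lookahead: drops exactly the names 'sub', 'status' and 'age'
by_lookahead = list(empty.filter(regex=r"^(?!(?:sub|status|age)$)", axis=1).columns)

print(by_char_class)  # ['name', 'living', 'food', 'job', 'side_job', 'status']
print(by_lookahead)   # ['name', 'living', 'food', 'job', 'side_job']
```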