I have a dataset which contains 6 columns, TIME1 to TIME6, amongst others. For each of these I need to apply the code below (shown here for 2 columns). LISTED is a prepared list of the possible elements to be seen in these columns.
Is there a way to do this without writing the same 2 lines 6 times?
df['PART1'] = df['TIME1'].astype('category')
df['PART1'].cat.set_categories(LISTED, inplace=True)
df['PART2'] = df['TIME2'].astype('category')
df['PART2'].cat.set_categories(LISTED, inplace=True)
For astype (the first line of code), I tried the following:
for col in ['TIME1', 'TIME2', 'TIME3', 'TIME4', 'TIME5', 'TIME6']:
    df_col = df[col].astype('category')
I think this works (not sure how to check without the whole code working). But how could I do something similar for the second line of code with the set_categories etc?
In short, I'm looking for something shorter and more elegant than copying and modifying the same 2 lines 6 times. I am new to Python, so any help is greatly appreciated.
Using Python 2.7 and pandas 0.24.2.
Answer:
Yes, it is possible! We can change the dtype of multiple columns to categorical in one go by creating a CategoricalDtype from LISTED:
# Build the suffixes '1'..'6'; 'PART' + i and 'TIME' + i then give the column labels
i = pd.RangeIndex(1, 7).astype(str)
df['PART' + i] = df['TIME' + i].astype(pd.CategoricalDtype(LISTED))
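If the one-line assignment feels opaque, a plain loop should give the same result and also answers the "how do I check it" part of the question. This is only a minimal sketch: the small df and the values in LISTED are made-up stand-ins for the real data, and it uses two columns instead of six to stay short.

```python
import pandas as pd

# Hypothetical sample data; replace with the real LISTED and df.
LISTED = ['morning', 'noon', 'evening']
df = pd.DataFrame({
    'TIME1': ['morning', 'noon'],
    'TIME2': ['evening', 'morning'],
})

# One shared dtype carrying the full category list.
cat_type = pd.CategoricalDtype(LISTED)

# One pass per column: PARTn is TIMEn cast to the shared categorical dtype.
for n in range(1, 3):  # use range(1, 7) for TIME1..TIME6
    df['PART{}'.format(n)] = df['TIME{}'.format(n)].astype(cat_type)

# Check the result: dtypes of the PART columns and their categories.
print(df.dtypes)
print(list(df['PART1'].cat.categories))
```

Casting straight to a CategoricalDtype built from LISTED replaces the separate astype('category') and set_categories steps, so each column needs only one line.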