Say I have a dataframe 'df' that contains a list of files and their contents:
File Field Folder
Users.csv Age UserFolder
Users.csv Name UserFolder
Cars.csv Color CarFolder
Cars.csv Model CarFolder
How can I reorder this df if I have ordered lists of how the 'Field' column should be ordered?
users_col_order = ['Name', 'Age']
cars_col_order = ['Model', 'Color']
So that the resulting df is reordered like this (I am not trying to sort 'Field' in reverse alphabetical order; that is just a coincidence of this example):
File Field Folder
Users.csv Name UserFolder
Users.csv Age UserFolder
Cars.csv Model CarFolder
Cars.csv Color CarFolder
CodePudding user response:
First, put your new orders in a dictionary:
mapping = {
'Users': ['Name', 'Age'],
'Cars': ['Model', 'Color'],
}
Then create a temporary column that holds those values, positioned according to the File values, set Field as the index, and select rows with the new column:
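original_cols = df.columns
for k, v in mapping.items():
    # Write the desired field order into 'tmp' for the rows of this file
    df.loc[df['File'] == k + '.csv', 'tmp'] = v
# Look rows up by Field label in the order given by 'tmp', then restore the column order
df = df.set_index('Field').loc[df['tmp']].reset_index().drop('tmp', axis=1)[original_cols]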
Output:
>>> df
File Field Folder
0 Users.csv Name UserFolder
1 Users.csv Age UserFolder
2 Cars.csv Model CarFolder
3 Cars.csv Color CarFolder
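The label lookup above relies on every Field value appearing only once in df. If the same field name could show up in more than one file, a per-file rank is one possible variant. This is only a sketch, not the code from the answer: the helper columns `_file_pos` and `_field_pos` and the `rank` dictionary are names made up for illustration, and it assumes every (File, Field) pair in df is listed in `mapping`.

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    "File": ["Users.csv", "Users.csv", "Cars.csv", "Cars.csv"],
    "Field": ["Age", "Name", "Color", "Model"],
    "Folder": ["UserFolder", "UserFolder", "CarFolder", "CarFolder"],
})

mapping = {
    "Users": ["Name", "Age"],
    "Cars": ["Model", "Color"],
}

# (file name, field) -> position of that field within its file's list
rank = {
    (f"{stem}.csv", field): pos
    for stem, fields in mapping.items()
    for pos, field in enumerate(fields)
}

# Keep files in the order in which they first appear in df
file_pos = {name: i for i, name in enumerate(df["File"].unique())}

result = (
    df.assign(
        _file_pos=df["File"].map(file_pos),
        # Raises KeyError if a (File, Field) pair is missing from mapping
        _field_pos=[rank[(f, c)] for f, c in zip(df["File"], df["Field"])],
    )
    .sort_values(["_file_pos", "_field_pos"])
    .drop(columns=["_file_pos", "_field_pos"])
    .reset_index(drop=True)
)
print(result)
```

Because the rank is keyed on both the file and the field, two files can share a field name and still be ordered independently.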
CodePudding user response:
Use pd.Categorical with ordered=True:
categories = users_col_order + cars_col_order
df['Field'] = pd.Categorical(values=df['Field'],
                             categories=categories,
                             ordered=True)
df.sort_values(by='Field')
File Field Folder
Users.csv Name UserFolder
Users.csv Age UserFolder
Cars.csv Model CarFolder
Cars.csv Color CarFolder
If you want to, you can always create a new column, Field_categorical, to preserve the original values in Field.
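A minimal sketch of that helper-column idea, assuming the example frame and the two order lists from the question (the name `sorted_df` is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "File": ["Users.csv", "Users.csv", "Cars.csv", "Cars.csv"],
    "Field": ["Age", "Name", "Color", "Model"],
    "Folder": ["UserFolder", "UserFolder", "CarFolder", "CarFolder"],
})

users_col_order = ["Name", "Age"]
cars_col_order = ["Model", "Color"]
categories = users_col_order + cars_col_order

# Leave 'Field' as plain strings and keep the ordered categorical in a helper column
df["Field_categorical"] = pd.Categorical(
    df["Field"], categories=categories, ordered=True
)

# sort_values returns a new frame, so keep the result
sorted_df = df.sort_values("Field_categorical").reset_index(drop=True)
print(sorted_df)
```

Two things to watch for: the combined category list must not contain duplicates, so this breaks with a ValueError if two files share a field name, and any Field value missing from the list becomes NaN and is sorted to the end.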
CodePudding user response:
# Clothing sizes are a good example of a custom sort order,
# because XL belongs at the opposite end from XS, not next to it as in alphabetical order.
import pandas as pd
# Create the DataFrame:
df = pd.DataFrame({
'cloth_id': [1001, 1002, 1003, 1004, 1005],
'size': ['S', 'XL', 'M', 'XS', 'L'],
})
# Import CategoricalDtype
from pandas.api.types import CategoricalDtype
# Define your own category order
cat_size_order = CategoricalDtype(
['XS', 'S', 'M', 'L', 'XL'],
ordered=True
)
# After that, call astype(cat_size_order) to cast the size data to the custom category type.
# Running df['size'] shows that the size column has been cast to a category type with the order [XS < S < M < L < XL].
df['size'] = df['size'].astype(cat_size_order)
#Apply it :)
df.sort_values('size')
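To check the result, here is the same example as a self-contained sketch. Note that sort_values returns a new DataFrame rather than sorting in place, so the result is kept in a variable here (`sorted_df`, `ascending_sizes` and `descending_ids` are names chosen for this illustration only).

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

df = pd.DataFrame({
    "cloth_id": [1001, 1002, 1003, 1004, 1005],
    "size": ["S", "XL", "M", "XS", "L"],
})

cat_size_order = CategoricalDtype(["XS", "S", "M", "L", "XL"], ordered=True)
df["size"] = df["size"].astype(cat_size_order)

# Smallest to largest, following the category order instead of the alphabet
sorted_df = df.sort_values("size")
ascending_sizes = sorted_df["size"].tolist()

# The same custom order also drives descending sorts
descending_ids = df.sort_values("size", ascending=False)["cloth_id"].tolist()

print(ascending_sizes)
print(descending_ids)
```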