Can we get the custom dtype from a pandas column, or at least the order of the encoded values?
import numpy as np
import pandas as pd
df = pd.DataFrame({"b": [1, np.nan, 3, 4, np.nan], "a": ["a", "a", "a", "b", "b"]})
ordered = pd.CategoricalDtype(["a", "b"], ordered=True)
df["a"].astype(ordered)
df.dtypes
# b float64
# a object
# dtype: object
CodePudding user response:
astype returns a new Series instead of modifying the column in place, so you must assign the output:
df['a'] = df['a'].astype(ordered)
print(df.dtypes)
Output:
b float64
a category
dtype: object
Alternatively, use pandas.Categorical with the dtype parameter:
df['a'] = pd.Categorical(df['a'], dtype=ordered)
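Once the column has been assigned, the custom dtype and the order of the encoded values can be read back from it. A minimal sketch, assuming the same df and ordered objects as in the question; the variable names col_dtype, categories, is_ordered and codes are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"b": [1, np.nan, 3, 4, np.nan], "a": ["a", "a", "a", "b", "b"]})
ordered = pd.CategoricalDtype(["a", "b"], ordered=True)
df["a"] = df["a"].astype(ordered)

# The CategoricalDtype is stored on the column itself
col_dtype = df["a"].dtype

# Categories in their encoding order: position i corresponds to code i
categories = list(col_dtype.categories)

# Whether the categories carry an order (True for this dtype)
is_ordered = col_dtype.ordered

# Integer codes for each row (missing values would be encoded as -1)
codes = df["a"].cat.codes.tolist()

print(col_dtype)
print(categories, is_ordered, codes)
```

The same information is also available through the accessor, as df["a"].cat.categories and df["a"].cat.ordered.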