ValueError on inverse transform using OrdinalEncoder with dictionary

Time:11-24

I can encode the target column as ordered integers using the category_encoders OrdinalEncoder with a custom mapping. However, calling inverse_transform on the encoded column fails with the error shown below.

import pandas as pd
import category_encoders as ce
from sklearn.preprocessing import OrdinalEncoder

lst = [ 'BRANCHING/ELONGATION', 'EARLY', 'EARLY', 'EARLY', 'EARLY', 'MID', 'MID',  'ADVANCED/TILLERING',
        'FLOWERING', 'FLOWERING', 'FLOWERING', 'SEEDLING/EMERGED']
  
filtered_df = pd.DataFrame(lst, columns =['growth_state'])

filtered_df['growth_state'].value_counts()

EARLY                   4
FLOWERING               3
MID                     2
ADVANCED/TILLERING      1
SEEDLING/EMERGED        1
BRANCHING/ELONGATION    1
Name: growth_state, dtype: int64

dictionary = [{'col': 'growth_state',
               'mapping':{'SEEDLING/EMERGED':0, 'EARLY':1, 'MID':2,
                          'ADVANCED/TILLERING':3, 'BRANCHING/ELONGATION':4, 'FLOWERING':5 }}]

# instantiating encoder
encoder = ce.OrdinalEncoder(cols = 'growth_state', mapping= dictionary)

filtered_df['growth_state'] = encoder.fit_transform(filtered_df['growth_state'])
filtered_df

    growth_state
0   4
1   1
2   1
3   1
4   1
5   2
6   2
7   3
8   5
9   5
10  5
11  0

But when I perform inverse_transform:

newCol = encoder.inverse_transform(filtered_df['growth_state'])
AttributeError                            Traceback (most recent call last)
<ipython-input-26-b6505b4be1e1> in <module>
----> 1 newCol = encoder.inverse_transform(filtered_df['growth_state'])

d:\users\tiwariam\appdata\local\programs\python\python36\lib\site-packages\category_encoders\ordinal.py in inverse_transform(self, X_in)
    266         for switch in self.mapping:
    267             column_mapping = switch.get('mapping')
--> 268             inverse = pd.Series(data=column_mapping.index, index=column_mapping.values)
    269             X[switch.get('col')] = X[switch.get('col')].map(inverse).astype(switch.get('data_type'))
    270 

AttributeError: 'dict' object has no attribute 'index'

Note: this is the target column of a classification problem, so I could have used a label encoder. I chose an ordinal encoding with an explicit mapping instead because the categories are naturally ordered.

CodePudding user response:

The error comes from this line in the inverse_transform source code:

inverse = pd.Series(data=column_mapping.index, index=column_mapping.values)
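To see why a plain dict trips this line, here is a minimal sketch that runs the same expression outside the encoder. It assumes only pandas; build_inverse and the shortened three-label mapping are hypothetical names used for illustration and are not part of category_encoders.

```python
import pandas as pd

# Shortened, illustrative mapping (not the full one from the question).
labels = {'SEEDLING/EMERGED': 0, 'EARLY': 1, 'MID': 2}

def build_inverse(column_mapping):
    # Same expression as the failing line in the traceback.
    return pd.Series(data=column_mapping.index, index=column_mapping.values)

# A dict has no .index attribute, so the expression fails immediately.
try:
    build_inverse(labels)
    dict_error = None
except AttributeError as exc:
    dict_error = str(exc)

# A Series has both .index (the labels) and .values (the codes).
inverse = build_inverse(pd.Series(labels))
decoded = pd.Series([1, 2, 0]).map(inverse).tolist()

print(dict_error)
print(decoded)
```

With a Series, the labels sit in .index and the integer codes in .values, so swapping them gives a code-to-label lookup that map can use.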

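Although the category_encoders documentation says the mapping should be provided as a dictionary, inverse_transform reads column_mapping.index and column_mapping.values, so (at least in the version shown in the traceback) it expects a pd.Series. Wrapping the dictionary in a pd.Series avoids the error: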

import pandas as pd
from category_encoders import OrdinalEncoder

df = pd.DataFrame({
    'growth_state': ['BRANCHING/ELONGATION', 'EARLY', 'EARLY', 'EARLY', 'EARLY', 'MID', 'MID', 'ADVANCED/TILLERING', 'FLOWERING', 'FLOWERING', 'FLOWERING', 'SEEDLING/EMERGED']
})

mapping = [{
    'col': 'growth_state',
    'mapping': pd.Series(data={'SEEDLING/EMERGED': 0, 'EARLY': 1, 'MID': 2, 'ADVANCED/TILLERING': 3, 'BRANCHING/ELONGATION': 4, 'FLOWERING': 5}),
    'data_type': object
}]

enc = OrdinalEncoder(cols=['growth_state'], mapping=mapping)

df_transformed = enc.fit_transform(df)
df_transformed.head()
#    growth_state
# 0             4
# 1             1
# 2             1
# 3             1
# 4             1

df_inverse = enc.inverse_transform(df_transformed)
df_inverse.head()
#            growth_state
# 0  BRANCHING/ELONGATION
# 1                 EARLY
# 2                 EARLY
# 3                 EARLY
# 4                 EARLY
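If only the original labels are needed, the mapping can also be inverted by hand, which does not depend on how a given category_encoders version stores its mapping. This is a rough sketch rather than a replacement for the encoder: invert_mapping is a hypothetical helper, and the encoded list simply repeats the codes from the question's output.

```python
# Mapping from the question: label -> ordinal code.
mapping = {'SEEDLING/EMERGED': 0, 'EARLY': 1, 'MID': 2,
           'ADVANCED/TILLERING': 3, 'BRANCHING/ELONGATION': 4, 'FLOWERING': 5}

def invert_mapping(mapping):
    """Return a code -> label dict, refusing mappings with duplicate codes."""
    inverse = {code: label for label, code in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError("codes must be unique for the mapping to be invertible")
    return inverse

# Encoded column as shown in the question's output.
encoded = [4, 1, 1, 1, 1, 2, 2, 3, 5, 5, 5, 0]

inverse = invert_mapping(mapping)
decoded = [inverse[code] for code in encoded]
print(decoded)
```

For a DataFrame column, the same inverse dict can be passed to Series.map, for example df_transformed['growth_state'].map(inverse).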