
How To Prevent Labelencoder From Sorting Label Values?

Scikit LabelEncoder is showing some puzzling behavior in my Jupyter Notebook, as in:

from sklearn.preprocessing import LabelEncoder
le2 = LabelEncoder()
le2.fit(['zero', 'one'])
pr

Solution 1:

The thing is that LabelEncoder.fit() always stores the labels in sorted order. That is because it uses np.unique, which returns the unique values sorted, as the linked scikit-learn source code for fit() shows.
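To see the sorting for yourself, here is a small check using the same two labels as in the question. It is only an illustrative sketch; the variable names are my own.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

labels = ['zero', 'one']

# np.unique returns the distinct values in sorted order,
# so 'one' comes before 'zero' alphabetically.
sorted_unique = np.unique(labels)
print(sorted_unique)   # ['one' 'zero']

# The stock encoder therefore assigns 0 to 'one' and 1 to 'zero'.
le = LabelEncoder().fit(labels)
print(le.classes_)     # ['one' 'zero']
codes = le.transform(['zero', 'one'])
print(codes)           # [1 0]
```

So the order in which the labels are passed to fit() has no effect on the resulting codes.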

I guess the only way to do what you want is to create your own fit method and override the original one from LabelEncoder.

You just need to reuse the existing code from the linked source. Here's an example:

import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.utils import column_or_1d

class MyLabelEncoder(LabelEncoder):

    def fit(self, y):
        # Validate the input as a 1-D array, as the original fit() does
        y = column_or_1d(y, warn=True)
        # Series.unique() keeps first-appearance order instead of sorting
        self.classes_ = pd.Series(y).unique()
        return self

le2 = MyLabelEncoder()
le2.fit(['zero', 'one'])
print(le2.inverse_transform([0, 0, 0, 1, 1, 1]))

gives you:

['zero' 'zero' 'zero' 'one' 'one' 'one']
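If you only need the integer codes and not the rest of the LabelEncoder API, the same first-seen ordering can be had without subclassing. This is a different technique from the one above, a plain dictionary mapping in pure Python, and only a minimal sketch: the helper names fit_ordered, encode and decode are hypothetical and not part of scikit-learn.

```python
def fit_ordered(labels):
    """Return the distinct labels in order of first appearance."""
    # dict keeps insertion order (Python 3.7+), so this dedupes without sorting
    return list(dict.fromkeys(labels))

def encode(classes, labels):
    """Map each label to its position in classes."""
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels]

def decode(classes, codes):
    """Map integer codes back to their labels."""
    return [classes[code] for code in codes]

classes = fit_ordered(['zero', 'one', 'zero'])
print(classes)                                  # ['zero', 'one']
print(encode(classes, ['one', 'zero', 'one']))  # [1, 0, 1]
print(decode(classes, [0, 0, 0, 1, 1, 1]))
```

Note that encode raises a KeyError for a label that was not seen in fit_ordered, so it is stricter than it may look.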
