How To Prevent LabelEncoder From Sorting Label Values?

January 30, 2024

Scikit-learn's LabelEncoder is showing some puzzling behavior in my Jupyter Notebook, as in:

from sklearn.preprocessing import LabelEncoder
le2 = LabelEncoder()
le2.fit(['zero', 'one'])
pr

Solution 1:

The thing is that LabelEncoder.fit() always stores the classes in sorted order. That is because it uses np.unique internally, as the LabelEncoder source code shows.
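To see where the sorting comes from, here is a small sketch using NumPy alone. It assumes only that np.unique behaves as documented (sorted unique values, plus first-occurrence indices when return_index=True); the variable names are just for illustration.

```python
import numpy as np

labels = np.array(["zero", "one", "zero", "one"])

# np.unique returns the distinct values in sorted order,
# so "one" comes before "zero" alphabetically.
sorted_classes = np.unique(labels)
print(sorted_classes)  # ['one' 'zero']

# return_index gives the position of each value's first occurrence;
# sorting those positions recovers first-appearance order.
_, first_idx = np.unique(labels, return_index=True)
appearance_order = labels[np.sort(first_idx)]
print(appearance_order)  # ['zero' 'one']
```

So with the stock encoder, 'one' ends up as class 0 and 'zero' as class 1, regardless of the order in which they were passed to fit().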

I guess the only way to do what you want is to subclass LabelEncoder and override its fit method with your own.

You just need to reuse the existing code from the linked source. Here's an example:

import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.utils import column_or_1d


class MyLabelEncoder(LabelEncoder):

    def fit(self, y):
        # Validate that y is 1-D, as the original fit does
        y = column_or_1d(y, warn=True)
        # pd.Series.unique() keeps first-appearance order instead of sorting
        self.classes_ = pd.Series(y).unique()
        return self


le2 = MyLabelEncoder()
le2.fit(['zero', 'one'])
print(le2.inverse_transform([0, 0, 0, 1, 1, 1]))

gives you:

['zero' 'zero' 'zero' 'one' 'one' 'one']
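If overriding a scikit-learn method feels too dependent on library internals, the same first-appearance mapping can be sketched in plain Python. This is not part of scikit-learn: the class name OrderedLabelEncoder and its methods are hypothetical, chosen only to mirror the fit / transform / inverse_transform naming.

```python
class OrderedLabelEncoder:
    """Hypothetical minimal encoder that numbers labels in first-seen order."""

    def fit(self, labels):
        # dict.fromkeys drops duplicates and keeps insertion order (Python 3.7+).
        self.classes_ = list(dict.fromkeys(labels))
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, labels):
        # Raises KeyError for a label that was not seen during fit.
        return [self._index[label] for label in labels]

    def inverse_transform(self, codes):
        return [self.classes_[code] for code in codes]


enc = OrderedLabelEncoder().fit(["zero", "one", "zero"])
print(enc.transform(["one", "zero"]))     # [1, 0]
print(enc.inverse_transform([0, 0, 1]))   # ['zero', 'zero', 'one']
```

The trade-off is that this sketch works on plain lists and does not plug into scikit-learn pipelines, so it only fits cases where you need the integer mapping itself.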
