Why Do xgboost.cv and sklearn.cross_val_score Give Different Results?
I'm trying to make a classifier on a data set. I first used XGBoost:
import xgboost as xgb
import pandas as pd
import numpy as np
train = pd.read_csv('train_users_processed_onehot
Solution 1:
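This question is a bit old, but I ran into the same problem today and figured out why the results given by xgboost.cv and sklearn.model_selection.cross_val_score are quite different.
By default, cross_val_score uses KFold or StratifiedKFold with shuffle=False, so the folds are taken in the order the rows appear rather than drawn randomly from the data. xgboost.cv, by contrast, shuffles the rows before splitting by default (shuffle=True), so the two functions end up evaluating on different folds.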
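To make the fold behaviour concrete, here is a small sketch using KFold on ten toy samples. The array X and the helper shuffled_folds are made up for illustration; only KFold and its shuffle and random_state arguments come from scikit-learn.

```python
import numpy as np
from sklearn.model_selection import KFold

# Ten toy samples kept in their original order (illustrative data only).
X = np.arange(10).reshape(-1, 1)

# Default behaviour: shuffle=False, so every test fold is a contiguous block.
unshuffled = [test.tolist() for _, test in KFold(n_splits=5).split(X)]

def shuffled_folds(seed):
    """Hypothetical helper: test folds drawn at random, repeatable via seed."""
    cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    return [test.tolist() for _, test in cv.split(X)]

print(unshuffled)             # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
print(shuffled_folds(23333))  # same folds every time for the same seed
```

If the rows are sorted in any meaningful way (by date, by class, by user), the contiguous folds can score quite differently from shuffled ones.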
So if you shuffle the folds explicitly, you should get matching results:
cross_val_score(estimator, X=train_features, y=train_labels, scoring="neg_log_loss",
                cv=StratifiedKFold(shuffle=True, random_state=23333))
Use the same value for random_state in StratifiedKFold and for seed in xgboost.cv to get exactly reproducible results.