FIX, ENH: Patching for LogisticRegressionCV#2871
FIX, ENH: Patching for LogisticRegressionCV#2871david-cortes-intel merged 24 commits intouxlfoundation:mainfrom
Conversation
|
@david-cortes-intel just casually checking in. this PR if it gets moved out of draft will require updates to scikit learn bench, and depending on the results may require sklearnex.preview module patching. cc @ethanglaser |
Codecov Report✅ All modified and coverable lines are covered by tests.
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 29 files with indirect coverage changes 🚀 New features to boost your workflow:
|
|
/intelci: run |
|
/azp run Nightly |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
CI failures are not from the class introduced here. |
|
@david-cortes-intel I think the common tests like https://github.com/uxlfoundation/scikit-learn-intelex/blob/main/sklearnex/tests/test_monkeypatch.py, https://github.com/uxlfoundation/scikit-learn-intelex/blob/main/sklearnex/tests/test_run_to_run_stability.py, etc. should be also updated with LogisticRegressionCV estimator. Or can this be done as a follow up task? I see that this PR fixes a lot of issues in the CI, maybe it worth to merge it faster. |
All of those tests use a common list of estimators to run: They are actually running the new estimator in this PR, since it is being added to that list. |
|
@david-cortes-intel but those tests have some LogReg-related code as well. Should it be extended to LogRegCV? |
No, those are just random examples. As far as those tests are concerned, it doesn't matter which estimators are used, and not everything is tested in that file. The more comprehensive tests in https://github.com/uxlfoundation/scikit-learn-intelex/blob/main/sklearnex/tests/test_patching.py do include LogisticRegressionCV. |
Vika-F
left a comment
There was a problem hiding this comment.
Thank you for sorting this out! The amount of testing seems also reasonable. LGTM.
cd6e847
into
uxlfoundation:main
|
Looks like there's something wrong with the CV version here. It ends up running much slower than stock sklearn. Will move it to preview as suggested. |
Description
From #2863
It appears that the logistic regression patching logic is in many ways designed to cover also LogisticRegressionCV, and there are tests checking that it runs with a patched replacement of an internal function, but patching is not applied explicitly to it - rather, it appears to be left patched in some cases as an incorrect cleanup of
LogisticRegression.fit, resulting in some CI jobs executing some tests with this class patched but not others.This PR implements explicit patching for LogisticRegressionCV. Since the implementation is based on replacing some internal functions from sklearn in the same way as for LogisticRegression, it implements it through daal4py even though it is deprecated.
Note that it requires #2863 to be merged before passing the tests.
Along the way, it also fixes a few outstanding issues with changes from sklearn1.8 that were not correctly adopted in logistic regression, which shares the same internal function as LogisticRegressionCV.
Checklist:
Completeness and readability
Testing