Skip to content

[MODIN]: Add time results verifier for getting started notebook#871

Merged
praveenkk123 merged 7 commits into
oneapi-src:masterfrom
dchigarev:fix_modin_notebook
Mar 1, 2022
Merged

[MODIN]: Add time results verifier for getting started notebook#871
praveenkk123 merged 7 commits into
oneapi-src:masterfrom
dchigarev:fix_modin_notebook

Conversation

@dchigarev
Copy link
Copy Markdown
Contributor

Existing Sample Changes

Description

[NOTE: This PR is a copy of the accidentally closed #843 (can't be reopened)]

The notebook is supposed to show the performance improvement after switching from stock Pandas to Modin, it compares the execution times of Pandas and Modin in various scenarios. The previous version of the notebook didn't imply that the Modin could be slower than Pandas anyhow and so confused users if such happened.

The changes in this PR introduce a time verifier, that compares Modin and Pandas execution times and prints a note if Modin appears to be slower with the possible reasons for it (poor CPU performance, old Modin version, silly environment variables).

Fixes Jira MODIN6-135.

External Dependencies

The changes do not add any new dependencies.

Type of change

  • New feature (non-breaking change which adds functionality)
  • Implement fixes for Jiras

How Has This Been Tested?

  • Command Line

To test the changes locally:

  1. Combine steps from the sample.json into a single bash script.
script
set -e # Terminate the script on first error
source $(conda info --base)/etc/profile.d/conda.sh # Bypassing conda's disability to activate environments inside a bash script: https://github.com/conda/conda/issues/7980
conda create -y -n intel-aikit-modin intel-aikit-modin -c intel
conda activate intel-aikit-modin
pip install -r requirements.txt # Installing notebook's dependencies
pip install runipy # Installing 'runipy' for extended abilities to execute the notebook
runipy IntelModin_GettingStarted.ipynb # Test 'Modin is faster than Pandas' case
MODIN_CPUS=1 runipy IntelModin_GettingStarted.ipynb # Test 'Modin is slower than Pandas' case
  1. Run the script from the sample's directory and ensure that the last notebook's cell printed"[CODE_SAMPLE_COMPLETED_SUCCESFULLY]"

Would also like to get a review from Modin folks: @Garra1980 @YarShev @vnlitvinov

Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
" try:\n",
" from modin.utils import get_current_execution\n",
"\n",
" execution = get_current_execution()\n",
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"pip install runipy # Installing 'runipy' for extended abilities to execute the notebook",
"runipy IntelModin_GettingStarted.ipynb"
"runipy IntelModin_GettingStarted.ipynb # Test 'Modin is faster than Pandas' case",
"MODIN_CPUS=1 runipy IntelModin_GettingStarted.ipynb # Test 'Modin is slower than Pandas' case"
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allowing for Modin to utilize only a single core to syntactically slow it down

Comment thread AI-and-Analytics/Getting-Started-Samples/IntelModin_GettingStarted/sample.json Outdated
" execution = (\n",
" \"Can't deduce the current execution, your Modin version is too old!\"\n",
" )\n",
" print(f\"\\tExecution: {execution}\")\n",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We print above Current execution is:. Should we duplicate this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed "Current execution is:" to the "Current configuration ..." above as it seems more suitable and doesn't cause such duplication

dchigarev and others added 2 commits February 21, 2022 17:28
Co-authored-by: Yaroslav Igoshev <Poolliver868@mail.ru>
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
Copy link
Copy Markdown
Contributor

@YarShev YarShev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Copy Markdown
Contributor

@praveenkk123 praveenkk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve for CI

@praveenkk123
Copy link
Copy Markdown
Contributor

Can we merge this PR..thanks

@praveenkk123 praveenkk123 merged commit ba2b369 into oneapi-src:master Mar 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants