Enable cpu/xpu support for the benchmarking suite #905

Merged
loadams merged 2 commits into deepspeedai:master from intel-ai-tce:benchmark_cpu on Aug 14, 2024

@louie-tsai
Contributor

@louie-tsai louie-tsai commented May 22, 2024

Enable Intel CPU and Intel XPU support for the benchmark suite.
Many customers use DeepSpeed on CPU and XPU for LLM models, and this benchmark suite helps them debug communication issues in their environments.

A screenshot of a two-node run of all_reduce.py on CPU:
image

A screenshot of a two-card run of run_all.py on XPU:
image

@louie-tsai
Contributor Author

@louie-tsai please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree [company="{Intel}"]

@louie-tsai
Contributor Author

@louie-tsai the command you issued was incorrect. Please try again.

Examples are:

@microsoft-github-policy-service agree

and

@microsoft-github-policy-service agree company="your company"

@microsoft-github-policy-service agree [company="{Intel}"]

@louie-tsai
Contributor Author

@microsoft-github-policy-service agree [company="{Intel}"]

@louie-tsai
Contributor Author

@louie-tsai the command you issued was incorrect. Please try again.

Examples are:

@microsoft-github-policy-service agree

and

@microsoft-github-policy-service agree company="your company"

@microsoft-github-policy-service agree company="Intel"

@louie-tsai
Contributor Author

@microsoft-github-policy-service agree company="Intel"

@microsoft-github-policy-service agree company=Intel

@tjruwase
Contributor

@louie-tsai, thanks so much. This is an amazing PR. We will review and merge shortly.

@tjruwase
Contributor

@louie-tsai, can you confirm whether this PR is ready for review? I noticed that the output (e.g., Gbps) is incorrect/missing.
image

Comment thread benchmarks/communication/utils.py Outdated
Comment thread benchmarks/communication/all_gather.py Outdated
@louie-tsai
Contributor Author

louie-tsai commented Jul 26, 2024

@louie-tsai, can you confirm whether this PR is ready for review? I noticed that the output (e.g., Gbps) is incorrect/missing. image

The output issue is related to the duration calculation from events.
If I use time.time to measure the duration instead of XPU events, the output looks good.
image
I will escalate the XPU event issue and ask for a fix.
In the meantime, I am removing XPU support from the README.
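
For readers following the thread, here is a minimal sketch of the wall-clock approach described in the comment above: timing an operation with time.time rather than device events, then converting the duration to Gbps. It is an illustration only, not the code from this PR. The names `timed_call`, `throughput_gbps`, and the `sync` callback are hypothetical, and the throughput formula is a plain bits-per-second conversion rather than the suite's own bandwidth calculation.

```python
import time


def timed_call(fn, sync=None, trials=5, warmups=1):
    """Return the mean wall-clock duration of fn() in seconds.

    `sync` is a hypothetical hook: for asynchronous devices, pass a callable
    that blocks until queued work finishes, so the host clock covers it.
    """
    for _ in range(warmups):
        fn()  # warm-up iterations are excluded from the measurement
    if sync is not None:
        sync()  # drain pending work before starting the clock
    start = time.time()
    for _ in range(trials):
        fn()
    if sync is not None:
        sync()  # wait for the timed work to complete
    return (time.time() - start) / trials


def throughput_gbps(size_bytes, duration_s):
    """Convert a message size in bytes and a duration in seconds to Gbps."""
    if duration_s <= 0:
        # A zero or negative duration is one way output can end up
        # missing or nonsensical, so reject it explicitly.
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_s / 1e9


if __name__ == "__main__":
    payload = bytearray(1 << 20)  # 1 MiB stand-in for a communication buffer
    duration = timed_call(lambda: bytes(payload), trials=10)
    if duration > 0:
        print(f"{throughput_gbps(len(payload), duration):.2f} Gbps")
```

With a real accelerator, the `sync` argument would be whatever synchronization call the backend provides; without it, host-side timing may only measure how long it takes to enqueue the work.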

@louie-tsai louie-tsai requested a review from costin-eseanu July 26, 2024 07:44
@loadams loadams merged commit b04fedd into deepspeedai:master Aug 14, 2024
hwchen2017 pushed a commit that referenced this pull request Jun 8, 2025
* enable cpu/xpu support for the benchmarking suite

* fixes according to review feedback
4 participants