`pSTL offload` Sample

The pSTL_offload sample demonstrates the offloading of C++ standard parallel algorithms to a SYCL device.

| Area | Description |
|:--- |:--- |
| What you will learn | Offloading of C++ standard algorithms to GPU devices. |
| Time to complete | 15 minutes |
| Category | Concepts and Functionality |

Note: This sample is based on the cppParallelSTL GitHub repository.

Purpose

This sample shows how to offload C++ standard parallel algorithm code (calls that use the `par_unseq` policy) to a GPU or CPU without any code changes by using the `-fsycl-pstl-offload` compiler option with the Intel® oneAPI DPC++/C++ Compiler. PSTL offload is an experimental feature of oneDPL.
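
As an illustration of the kind of code this applies to, the following minimal sketch calls `std::reduce` through a small wrapper that accepts any standard execution policy. It is not taken from the samples, and `sum_with` is a hypothetical helper name introduced here. Only a call made with `std::execution::par_unseq` is the kind of call the offload option targets; built as plain C++17 without the option, every policy runs on the host and returns the same sum.

```cpp
#include <cstdint>
#include <execution>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: sums `values` using the given standard execution policy.
// The source code is identical whether or not PSTL offload is enabled at build time.
template <typename Policy>
std::int64_t sum_with(Policy&& policy, const std::vector<std::int64_t>& values) {
  return std::reduce(std::forward<Policy>(policy),
                     values.begin(), values.end(), std::int64_t{0});
}

// Hypothetical helper: the par_unseq call is the offload candidate.
inline std::int64_t sum_offload_candidate(const std::vector<std::int64_t>& values) {
  return sum_with(std::execution::par_unseq, values);
}
```

For example, for a vector holding the integers 1 through 1000, `sum_with(std::execution::seq, v)`, `sum_with(std::execution::par, v)`, and `sum_offload_candidate(v)` all return 500500.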

This folder contains three samples in the following subfolders:

| Folder Name | Description |
|:--- |:--- |
| `FileWordCount` | Counting Words in Files Example |
| `WordCount` | Counting Words Generated Example |
| `ParSTLTests` | Examples of Various STL Algorithms with Execution Policies |

Note: For more information, refer to Get Started with Parallel STL.

Prerequisites

| Optimized for | Description |
|:--- |:--- |
| OS | Ubuntu* 22.04 |
| Hardware | Intel® Data Center GPU Max, Intel® Xeon CPU |
| Software | Intel® oneAPI Base Toolkit version 2024.2, Intel® Threading Building Blocks (Intel® TBB) |

Key Implementation Details

This folder includes three samples: `FileWordCount`, `WordCount`, and `ParSTLTests`.

`FileWordCount` and `WordCount` count the number of words in files and in generated text, respectively, using the standard C++17 parallel algorithm `transform_reduce`. `FileWordCount` also demonstrates the `transform`, `copy`, `copy_if`, and `for_each` standard C++17 parallel algorithms.

`ParSTLTests` demonstrates various STL algorithms with different execution policies (`seq`, `par`, `par_unseq`). It applies these algorithms to large datasets and prints the results of each execution. The algorithms used are `reduce`, `accumulate`, `find`, `copy_if`, `inclusive_scan`, `min_element`, `max_element`, `minmax_element`, `is_partitioned`, `lexicographical_compare`, `binary_search`, `lower_bound`, and `upper_bound`. They perform tasks such as summing elements, finding values, copying based on conditions, scanning, and searching within large datasets.

These computations can be offloaded to a GPU device with the `-fsycl-pstl-offload` compiler option. The option offloads only those C++ standard parallel algorithms that are called with the `std::execution::par_unseq` policy to a SYCL device. The offloaded algorithms are implemented via the oneAPI Data Parallel C++ Library (oneDPL). This option is an experimental feature. If no argument is given to the option, the compiler offloads to the default SYCL device. Standard header inclusion is explicitly required for PSTL offload to work.

The performance of memory allocations may be improved by using the `SYCL_PI_LEVEL_ZERO_USM_ALLOCATOR` environment variable.
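
To make the word-counting idea concrete, here is a minimal sketch of counting words with `std::transform_reduce` and the `par_unseq` policy. It is an illustration in standard C++17, not the source of the `WordCount` or `FileWordCount` samples. The names `is_separator` and `count_words` are hypothetical, and the choice of separator characters is an assumption. The sketch compares each character with its right-hand neighbor and counts the positions where a separator is followed by a non-separator.

```cpp
#include <cstddef>
#include <execution>
#include <functional>
#include <numeric>
#include <string>

// Assumption for this sketch: space, tab, newline, and carriage return separate words.
constexpr bool is_separator(char c) {
  return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Counts words as the number of word beginnings: one if the text starts with
// a non-separator, plus every position where a separator precedes a non-separator.
std::size_t count_words(const std::string& text) {
  if (text.empty()) return 0;
  const std::size_t first = is_separator(text.front()) ? 0 : 1;
  return first + std::transform_reduce(
      std::execution::par_unseq,
      text.begin(), text.end() - 1,  // left character of each adjacent pair
      text.begin() + 1,              // right character of each adjacent pair
      std::size_t{0},                // initial value of the sum
      std::plus<std::size_t>{},      // reduction: add up the word beginnings
      [](char left, char right) -> std::size_t {
        return (is_separator(left) && !is_separator(right)) ? 1 : 0;
      });
}
```

With this sketch, `count_words("the quick brown fox")` returns 4 and `count_words("")` returns 0. A lambda and a plain character comparison are used rather than a function pointer or a locale-dependent classifier, on the assumption that simpler callables are easier to run on a device.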

Set Environment Variables

When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the setvars script every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development.

Build and Run the `pSTL offload` Samples

Note: If you have not already done so, set up your CLI environment by sourcing the setvars script at the root of your oneAPI installation.

Linux*:

For system-wide installations: . /opt/intel/oneapi/setvars.sh

For private installations: . ~/intel/oneapi/setvars.sh

For non-POSIX shells, like csh, use the following command: bash -c 'source <install-dir>/setvars.sh ; exec csh'

Windows*:

C:\Program Files (x86)\Intel\oneAPI\setvars.bat

For Windows PowerShell*, use the following command: cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'

For more information on configuring environment variables, see Use the setvars Script with Linux* or macOS*.

On Linux*

Change to the sample directory.
Build the program.
```
$ mkdir build
$ cd build
$ cmake -D GPU=1 ..    # or, to target the CPU: cmake -D CPU=1 ..
$ make
```
Note: Enable the GPU flag during the build to execute on GPUs; it supports Intel® Data Center GPU Max 1550 or 1100.
Enable the CPU flag during the build to execute on the CPU.

This command sequence will build the WordCount and FileWordCount samples.

Run the program.

Run pSTL_offload-WordCount on GPU.

$ export ONEAPI_DEVICE_SELECTOR=level_zero:gpu
$ make run_wc
$ unset ONEAPI_DEVICE_SELECTOR

Run pSTL_offload-WordCount on CPU.

$ export ONEAPI_DEVICE_SELECTOR=*:cpu
$ make run_wc
$ unset ONEAPI_DEVICE_SELECTOR

Run pSTL_offload-FileWordCount on GPU.

$ export ONEAPI_DEVICE_SELECTOR=level_zero:gpu
$ make run_fwc0               # for SEQ Policy
$ make run_fwc1               # for PAR Policy
$ unset ONEAPI_DEVICE_SELECTOR

Run pSTL_offload-FileWordCount on CPU.

$ export ONEAPI_DEVICE_SELECTOR=*:cpu
$ make run_fwc0              # for SEQ Policy
$ make run_fwc1              # for PAR Policy
$ unset ONEAPI_DEVICE_SELECTOR

Run pSTL_offload-ParSTLTest on GPU.

$ export ONEAPI_DEVICE_SELECTOR=level_zero:gpu
$ ./ParSTLTest
$ unset ONEAPI_DEVICE_SELECTOR

Run pSTL_offload-ParSTLTest on CPU.

$ export ONEAPI_DEVICE_SELECTOR=*:cpu
$ ./ParSTLTest
$ unset ONEAPI_DEVICE_SELECTOR

Troubleshooting

If an error occurs, you can get more details by running make with the VERBOSE=1 argument:

$ make VERBOSE=1

If you receive an error message, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the Diagnostics Utility for Intel® oneAPI Toolkits User Guide for more information on using the utility.

License

Code samples are licensed under the MIT license. See License.txt for details.

Third party program licenses are at third-party-programs.txt.
