oneCCL Getting Started Samples

The oneCCL sample codes are implemented using C++, C, and SYCL*-compliant extensions for CPU and GPU. The allreduce collective operation samples show how to compile Intel® oneAPI Collective Communications Library (oneCCL) code with various oneCCL configurations in the Intel oneAPI environment.

| Optimized for | Description |
|:---|:---|
| OS | Linux Ubuntu 18.04 |
| Hardware | Kaby Lake with GEN9 or newer |
| Software | Intel® oneAPI Collective Communications Library (oneCCL), Intel® oneAPI DPC++/C++ Compiler, v, GNU Compiler |
| What you will learn | Basic oneCCL programming model for both Intel CPU and GPU |
| Time to complete | 15 minutes |

List of Samples

| C++ API | Collective Operation |
|:---|:---|
| sycl_allreduce_test.cpp | Allreduce |
| cpu_allreduce_test.cpp/cpu_allreduce_bf16_test.c | Allreduce |

Notice: Please use Intel® DevCloud for oneAPI as the environment for the Jupyter notebook samples.

Users can refer to Intel® DevCloud Getting Started for instructions on using Intel® DevCloud.

Users can open JupyterLab from Intel® DevCloud via "One-click Login in" and download the samples with git clone or the oneapi-cli tool.

Once the Jupyter notebook samples are downloaded in JupyterLab, users can follow the steps without any further installation.

You can also use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations, and browse and download samples.

To learn more about the extensions and how to configure the oneAPI environment, see Using Visual Studio Code with Intel® oneAPI Toolkits.

After learning how to use the extensions for Intel oneAPI Toolkits, return to this readme for instructions on how to build and run a sample.

Purpose

The samples implement the allreduce collective operation with oneCCL APIs. Users will learn how to compile the code with various oneCCL configurations in the Intel oneAPI environment.

Prerequisites

CPU

The samples below require the following components, which are part of the Intel® oneAPI DL Framework Developer Toolkit (DLFD Kit):

  • Intel® oneAPI Collective Communications Library (oneCCL)

Refer to the oneAPI page for toolkit installation.

GPU and CPU

The samples below require the following components, which are part of the Intel® oneAPI Base Toolkit (Base Kit).

  • Intel® oneAPI Collective Communications Library (oneCCL)
  • Intel® oneAPI DPC++/C++ Compiler
  • Intel® oneAPI DPC++ Library (oneDPL)

The samples also require an OpenCL driver. Please refer to System Requirements for OpenCL driver installation.

Refer to the oneAPI page for toolkit installation.

Using Visual Studio Code* (Optional)

The basic steps to build and run a sample using VS Code include:

  • Download a sample using the extension Code Sample Browser for Intel oneAPI Toolkits.
  • Configure the oneAPI environment with the extension Environment Configurator for Intel oneAPI Toolkits.
  • Open a Terminal in VS Code (Terminal>New Terminal).
  • Run the sample in the VS Code terminal using the instructions below.
  • (Linux only) Debug your GPU application with GDB for Intel® oneAPI toolkits using the Generate Launch Configurations extension.

To learn more about the extensions, see the Using Visual Studio Code with Intel® oneAPI Toolkits User Guide.

Building the samples for CPU and GPU

Note: If you have not already done so, set up your CLI environment by sourcing the setvars script located in the root of your oneAPI installation.

Linux*:

  • For system wide installations: . /opt/intel/oneapi/setvars.sh
  • For private installations: . ~/intel/oneapi/setvars.sh
  • For non-POSIX shells, like csh, use the following command: $ bash -c 'source <install-dir>/setvars.sh ; exec csh'

For more information on configuring environment variables, see Use the setvars Script with Linux* or MacOS*.
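
The choice between the system-wide and private locations listed above can be scripted. The following is a minimal sketch that assumes only the two default paths named in this README; the function name `find_setvars` is hypothetical and is not part of oneAPI.

```shell
# Hypothetical helper (not part of oneAPI): print the path of the first
# setvars.sh found under the given installation roots. With no arguments
# it checks the system-wide and private default locations from this README.
find_setvars() {
    if [ "$#" -eq 0 ]; then
        set -- /opt/intel/oneapi "$HOME/intel/oneapi"
    fi
    for root in "$@"; do
        if [ -f "$root/setvars.sh" ]; then
            printf '%s\n' "$root/setvars.sh"
            return 0
        fi
    done
    # No candidate root contained setvars.sh.
    return 1
}
```

One way to use it is `setvars=$(find_setvars) && . "$setvars"`, which sources the script in the current shell only when a copy was found.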

On a Linux* System

CPU only:

  • Build the samples with GCC for CPU only.
    Replace ${ONEAPI_ROOT} with your installation path, for example /opt/intel/oneapi.
    You do not need to replace {DPCPP_CMPLR_ROOT}.
    source ${ONEAPI_ROOT}/setvars.sh --ccl-configuration=cpu
    
    cd oneapi-toolkit/oneCCL/oneCCL_Getting_Started
    mkdir build
    cd build
    cmake .. -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++
    make cpu_allreduce_test
    

NOTE: The source file "cpu_allreduce_test.cpp" is copied from ${INTEL_ONEAPI_INSTALL_FOLDER}/ccl/latest/examples/cpu to the build/src/cpu folder. To rebuild it, run "make cpu_allreduce_test" in the build folder.

GPU and CPU:

  • Build the samples with SYCL for GPU and CPU.
    Replace ${ONEAPI_ROOT} with your installation path, for example /opt/intel/oneapi.
    You do not need to replace {DPCPP_CMPLR_ROOT}.
    source ${ONEAPI_ROOT}/setvars.sh --ccl-configuration=cpu_gpu_dpcpp
    
    cd oneapi-toolkit/oneCCL/oneCCL_Getting_Started
    mkdir build
    cd build
    cmake ..  -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DCOMPUTE_BACKEND=icpx
    make sycl_allreduce_test
    

NOTE: The source file "sycl_allreduce_test.cpp" is copied from ${INTEL_ONEAPI_INSTALL_FOLDER}/ccl/latest/examples/sycl to the build/src/sycl folder. To rebuild it, run "make sycl_allreduce_test" in the build folder.
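
Both build configurations above rely on cmake and the selected compilers being on PATH once setvars.sh has been sourced. As a rough pre-build check, the sketch below reports which of a list of commands cannot be found; `missing_tools` is a hypothetical helper name, not something shipped with oneAPI.

```shell
# Hypothetical helper: print the name of every listed command that is
# not on PATH, one per line. Prints nothing when all are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For the CPU-only build this could be called as `missing_tools cmake gcc g++`, and for the GPU and CPU build as `missing_tools cmake icx icpx`. Empty output suggests the tools named in the cmake commands above are reachable.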

Include Files

The include folder is located at ${CCL_ROOT}/include on your development system.
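
To confirm where the include folder is on a given machine, a small check along the following lines may help. It is a sketch that assumes CCL_ROOT has been set by sourcing setvars.sh; the function name `ccl_include_dir` is hypothetical.

```shell
# Hypothetical helper (not part of oneCCL): print the oneCCL include
# folder, or fail if CCL_ROOT is unset or the folder does not exist.
ccl_include_dir() {
    if [ -z "${CCL_ROOT:-}" ]; then
        echo "CCL_ROOT is not set; source setvars.sh first" >&2
        return 1
    fi
    if [ ! -d "$CCL_ROOT/include" ]; then
        echo "no include folder under $CCL_ROOT" >&2
        return 1
    fi
    printf '%s\n' "$CCL_ROOT/include"
}
```

The printed path can then be passed to a compiler by hand, for example as `-I"$(ccl_include_dir)"`, although the CMake build described above normally takes care of this.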

Running the Sample

Linux*

CPU only:

  • Run the program.
    The steps below use cpu_allreduce_test as an example; the same steps apply to all other sample binaries.
    Replace ${NUMBER_OF_PROCESSES} with the number of processes to launch.

    mpirun -n ${NUMBER_OF_PROCESSES} ./out/cpu/cpu_allreduce_test
    

    Example:

    mpirun -n 2 ./out/cpu/cpu_allreduce_test
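
Because ${NUMBER_OF_PROCESSES} has to be replaced with an integer, a wrapper can reject bad values before mpirun is started. The following is a minimal sketch; `cpu_run_cmd` is a hypothetical helper that only prints the command line shown above and does not launch anything itself.

```shell
# Hypothetical helper: print the mpirun command line for the CPU sample
# after checking that the process count is a positive integer.
cpu_run_cmd() {
    case "${1:-}" in
        # Reject empty input, anything with a non-digit, and values
        # that start with 0 (including 0 itself).
        ''|*[!0-9]*|0*)
            echo "process count must be a positive integer: '${1:-}'" >&2
            return 1
            ;;
    esac
    printf 'mpirun -n %s ./out/cpu/cpu_allreduce_test\n' "$1"
}
```

From the build folder, `eval "$(cpu_run_cmd 2)"` would then run the same command as the example above.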
    

GPU and CPU:

  • Run the program.
    The steps below use sycl_allreduce_test as an example; the same steps apply to all other sample binaries.
    Replace ${NUMBER_OF_PROCESSES} with the number of processes to launch.

    mpirun -n ${NUMBER_OF_PROCESSES} ./out/sycl/sycl_allreduce_test gpu|cpu|host|default
    

    Example (run on GPU):

    mpirun -n 2 ./out/sycl/sycl_allreduce_test gpu
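
The SYCL sample takes one of gpu, cpu, host, or default as its device argument. As an illustration, the sketch below checks that argument against those four values before printing the command line. `sycl_run_cmd` is a hypothetical helper, and falling back to `default` when no device is given is a choice made for this sketch, not documented sample behavior.

```shell
# Hypothetical helper: print the mpirun command line for the SYCL sample
# after checking the device type against the values listed above.
sycl_run_cmd() {
    procs=$1
    # Assumption for this sketch: use "default" when no device is given.
    device=${2:-default}
    case "$device" in
        gpu|cpu|host|default) ;;
        *)
            echo "unknown device type: '$device'" >&2
            return 1
            ;;
    esac
    printf 'mpirun -n %s ./out/sycl/sycl_allreduce_test %s\n' "$procs" "$device"
}
```

For instance, `sycl_run_cmd 2 gpu` prints the GPU example command shown above, which can be run from the build folder.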
    

Example of Output

Linux

  • Run the program on CPU or GPU by following the Running the Sample section.

  • CPU Results

    Provided device type: cpu
    Running on Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
    Example passes
    

    Note that the name of the running device may vary depending on your environment.

  • GPU Results

    Provided device type: gpu
    Running on Intel(R) Gen9 HD Graphics NEO
    Example passes
    

    Note that the name of the running device may vary depending on your environment.

  • Enable oneCCL Verbose log

    oneCCL supports several log levels; refer to the oneCCL documentation for the full list.

    To see more runtime information from oneCCL, enable the verbose log with the command below.

    export CCL_LOG_LEVEL=info
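
When a run is scripted, the "Example passes" line shown in the results above can serve as a simple success check. The following is a minimal sketch; `sample_passed` is a hypothetical helper that reads the sample output on standard input.

```shell
# Hypothetical helper: read sample output on stdin and succeed only if
# it contains the "Example passes" line shown in the results above.
# grep reads all of its input here, so an upstream tee keeps the full log.
sample_passed() {
    grep 'Example passes' >/dev/null
}
```

A possible use, assuming the verbose log is wanted for a single run only, is `CCL_LOG_LEVEL=info mpirun -n 2 ./out/cpu/cpu_allreduce_test 2>&1 | tee run.log | sample_passed`. This sets the variable for that one command instead of exporting it, keeps the log in run.log, and returns a nonzero status if the pass line is missing.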
    

Troubleshooting

If an error occurs, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits. Learn more.

License

Code samples are licensed under the MIT license. See License.txt for details.

Third-party program licenses can be found in third-party-programs.txt.