taskflow/docs/BenchmarkTaskflow.html at fix_multidll_memory_leak · ModuleWorks/taskflow

150 lines (149 loc) · 16.2 KB
<!DOCTYPE html>
<html lang="en">
  <meta charset="UTF-8" />
  <title>Building and Installing &raquo; Benchmark Taskflow | Taskflow QuickStart</title>
  <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Source+Sans+Pro:400,400i,600,600i%7CSource+Code+Pro:400,400i,600" />
  <link rel="stylesheet" href="m-dark+documentation.compiled.css" />
  <link rel="icon" href="favicon.ico" type="image/x-icon" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <meta name="theme-color" content="#22272e" />
<header><nav id="navigation">
  <div class="m-container">
    <div class="m-row">
      <span id="m-navbar-brand" class="m-col-t-8 m-col-m-none m-left-m">
        <a href="https://taskflow.github.io"><img src="taskflow_logo.png" alt="" />Taskflow</a> <span class="m-breadcrumb">|</span> <a href="index.html" class="m-thin">QuickStart</a>
      </span>
      <div class="m-col-t-4 m-hide-m m-text-right m-nopadr">
        <a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
          <path id="m-doc-search-icon-path" d="m6 0c-3.31 0-6 2.69-6 6 0 3.31 2.69 6 6 6 1.49 0 2.85-0.541 3.89-1.44-0.0164 0.338 0.147 0.759 0.5 1.15l3.22 3.79c0.552 0.614 1.45 0.665 2 0.115 0.55-0.55 0.499-1.45-0.115-2l-3.79-3.22c-0.392-0.353-0.812-0.515-1.15-0.5 0.895-1.05 1.44-2.41 1.44-3.89 0-3.31-2.69-6-6-6zm0 1.56a4.44 4.44 0 0 1 4.44 4.44 4.44 4.44 0 0 1-4.44 4.44 4.44 4.44 0 0 1-4.44-4.44 4.44 4.44 0 0 1 4.44-4.44z"/>
        </svg></a>
        <a id="m-navbar-show" href="#navigation" title="Show navigation"></a>
        <a id="m-navbar-hide" href="#" title="Hide navigation"></a>
      </div>
      <div id="m-navbar-collapse" class="m-col-t-12 m-show-m m-col-m-none m-right-m">
        <div class="m-row">
          <ol class="m-col-t-6 m-col-m-none">
            <li><a href="pages.html">Handbook</a></li>
            <li><a href="namespaces.html">Namespaces</a></li>
          </ol>
          <ol class="m-col-t-6 m-col-m-none" start="3">
            <li><a href="annotated.html">Classes</a></li>
            <li><a href="files.html">Files</a></li>
            <li class="m-show-m"><a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
              <use href="#m-doc-search-icon-path" />
            </svg></a></li>
          </ol>
        </div>
      </div>
</nav></header>
<main><article>
  <div class="m-container m-container-inflatable">
    <div class="m-row">
      <div class="m-col-l-10 m-push-l-1">
        <h1>
          <span class="m-breadcrumb"><a href="install.html">Building and Installing</a> &raquo;</span>
          Benchmark Taskflow
        </h1>
        <nav class="m-block m-default">
          <h3>Contents</h3>
          <ul>
            <li><a href="#CompileAndRunBenchmarks">Compile and Run Benchmarks</a></li>
            <li>
              <a href="#ConfigureRunOptions">Configure Run Options</a>
              <ul>
                <li><a href="#SpecifyTheRunModel">Specify the Run Model</a></li>
                <li><a href="#SpecifyTheNumberOfThreads">Specify the Number of Threads</a></li>
                <li><a href="#SpecifyTheNumberOfRounds">Specify the Number of Rounds</a></li>
              </ul>
            </li>
          </ul>
        </nav>
<section id="CompileAndRunBenchmarks"><h2><a href="#CompileAndRunBenchmarks">Compile and Run Benchmarks</a></h2><p>To build the benchmark code, enable the CMake option <code>TF_BUILD_BENCHMARKS</code> to <code>ON</code> as follows:</p><pre class="m-code"><span class="c1"># under /taskflow/build</span>
~$<span class="w"> </span>cmake<span class="w"> </span>../<span class="w"> </span>-DTF_BUILD_BENCHMARKS<span class="o">=</span>ON
~$<span class="w"> </span>make</pre><p>After you successfully build the benchmark code, you can find all benchmark instances in the <code>benchmarks/</code> folder. You can run the executable of each instance in the corresponding folder.</p><pre class="m-code">~$<span class="w"> </span><span class="nb">cd</span><span class="w"> </span>benchmarks<span class="w"> </span><span class="p">&amp;</span><span class="w"> </span>ls
bench_black_scholes<span class="w"> </span>bench_binary_tree<span class="w"> </span>bench_graph_traversal<span class="w"> </span>...
~$<span class="w"> </span>./bench_graph_traversal
<span class="p">|</span>V<span class="p">|</span>+<span class="p">|</span>E<span class="p">|</span><span class="w">     </span>Runtime
<span class="w">      </span><span class="m">2</span><span class="w">       </span><span class="m">0</span>.197
<span class="w">    </span><span class="m">842</span><span class="w">       </span><span class="m">0</span>.198
<span class="w">   </span><span class="m">3284</span><span class="w">       </span><span class="m">0</span>.488
<span class="w">   </span><span class="m">7288</span><span class="w">       </span><span class="m">0</span>.774
<span class="w">    </span>...<span class="w">         </span>...
<span class="w">    </span>...<span class="w">         </span>...
<span class="w"> </span><span class="m">619802</span><span class="w">      </span><span class="m">75</span>.135
<span class="w"> </span><span class="m">664771</span><span class="w">      </span><span class="m">77</span>.436
<span class="w"> </span><span class="m">711200</span><span class="w">      </span><span class="m">83</span>.957</pre><p>You can display the help message by giving the option <code>--help</code>.</p><pre class="m-code">~$<span class="w"> </span>./bench_graph_traversal<span class="w"> </span>--help
Graph<span class="w"> </span>Traversal
Usage:<span class="w"> </span>./bench_graph_traversal<span class="w"> </span><span class="o">[</span>OPTIONS<span class="o">]</span>
<span class="w">  </span>-h,--help<span class="w">                   </span>Print<span class="w"> </span>this<span class="w"> </span><span class="nb">help</span><span class="w"> </span>message<span class="w"> </span>and<span class="w"> </span><span class="nb">exit</span>
<span class="w">  </span>-t,--num_threads<span class="w"> </span>UINT<span class="w">       </span>number<span class="w"> </span>of<span class="w"> </span>threads<span class="w"> </span><span class="o">(</span><span class="nv">default</span><span class="o">=</span><span class="m">1</span><span class="o">)</span>
<span class="w">  </span>-r,--num_rounds<span class="w"> </span>UINT<span class="w">        </span>number<span class="w"> </span>of<span class="w"> </span>rounds<span class="w"> </span><span class="o">(</span><span class="nv">default</span><span class="o">=</span><span class="m">1</span><span class="o">)</span>
<span class="w">  </span>-m,--model<span class="w"> </span>TEXT<span class="w">             </span>model<span class="w"> </span>name<span class="w"> </span>tbb<span class="p">|</span>omp<span class="p">|</span>tf<span class="w"> </span><span class="o">(</span><span class="nv">default</span><span class="o">=</span>tf<span class="o">)</span></pre><p>We currently implement the following instances that are commonly used by the parallel computing community to evaluate the system performance.</p><table class="m-table"><thead><tr><th>Instance</th><th>Description</th></tr></thead><tbody><tr><td>bench_binary_tree</td><td>traverses a complete binary tree</td></tr><tr><td>bench_black_scholes</td><td>computes option pricing with Black-Shcoles Models</td></tr><tr><td>bench_graph_traversal</td><td>traverses a randomly generated direct acyclic graph</td></tr><tr><td>bench_linear_chain</td><td>traverses a linear chain of tasks</td></tr><tr><td>bench_mandelbrot</td><td>exploits imbalanced workloads in a Mandelbrot set</td></tr><tr><td>bench_matrix_multiplication</td><td>multiplies two 2D matrices</td></tr><tr><td>bench_mnist</td><td>trains a neural network-based image classifier on the MNIST dataset</td></tr><tr><td>bench_parallel_sort</td><td>sorts a range of items</td></tr><tr><td>bench_reduce_sum</td><td>sums a range of items using reduction</td></tr><tr><td>bench_wavefront</td><td>propagates computations in a 2D grid</td></tr><tr><td>bench_linear_pipeline</td><td>performs pipeline parallelism on a linear chain of pipes</td></tr><tr><td>bench_graph_pipeline</td><td>performs pipeline parallelism on a graph of pipes</td></tr><tr><td>bench_deferred_pipeline</td><td>performs pipeline parallelism with dependencies from future pipes</td></tr><tr><td>bench_data_pipeline</td><td>performs pipeline parallelisms on a cache-friendly data wrapper</td></tr><tr><td>bench_thread_pool</td><td>uses our executor as a simple thread pool</td></tr><tr><td>bench_for_each</td><td>performs parallel-iteration algorithms</td></tr><tr><td>bench_scan</td><td>performs parallel-scan algorithms</td></tr><tr><td>bench_async_task</td><td>creates asynchronous tasks</td></tr><tr><td>bench_fibonacci</td><td>finds Fibonacci numbers using recursive asynchronous tasking</td></tr><tr><td>bench_nqueens</td><td>parallelizes n-queen search using recursive asynchronous tasking</td></tr><tr><td>bench_integrate</td><td>parallelizes integration using recursive asynchronous tasking</td></tr><tr><td>bench_primes</td><td>finds a range of prime numbers using parallel-reduction algorithms</td></tr><tr><td>bench_skynet</td><td>traverses a 10-ray tree using recursive asynchronous tasking</td></tr></tbody></table></section><section id="ConfigureRunOptions"><h2><a href="#ConfigureRunOptions">Configure Run Options</a></h2><p>We implement consistent options for each benchmark instance. Common options are:</p><table class="m-table"><thead><tr><th>option</th><th>value</th><th>function</th></tr></thead><tbody><tr><td><code>-h</code></td><td>none</td><td>displays the help message</td></tr><tr><td><code>-t</code></td><td>integer</td><td>configures the number of threads to run</td></tr><tr><td><code>-r</code></td><td>integer</td><td>configures the number of rounds to run</td></tr><tr><td><code>-m</code></td><td>string</td><td>configures the baseline models to run, tbb, omp, or tf</td></tr></tbody></table><p>You can configure the benchmarking environment by giving different options.</p><section id="SpecifyTheRunModel"><h3><a href="#SpecifyTheRunModel">Specify the Run Model</a></h3><p>In addition to a Taskflow-based implementation for each benchmark instance, we have implemented two baseline models using the state-of-the-art parallel programming libraries, <a href="https://www.openmp.org/">OpenMP</a> and <a href="https://github.com/oneapi-src/oneTBB">Intel TBB</a>, to measure and evaluate the performance of Taskflow. You can select different implementations by passing the option <code>-m</code>.</p><pre class="m-code">~$<span class="w"> </span>./bench_graph_traversal<span class="w"> </span>-m<span class="w"> </span>tf<span class="w">   </span><span class="c1"># run the Taskflow implementation (default)</span>
~$<span class="w"> </span>./bench_graph_traversal<span class="w"> </span>-m<span class="w"> </span>tbb<span class="w">  </span><span class="c1"># run the TBB implementation</span>
~$<span class="w"> </span>./bench_graph_traversal<span class="w"> </span>-m<span class="w"> </span>omp<span class="w">  </span><span class="c1"># run the OpenMP implementation</span></pre></section><section id="SpecifyTheNumberOfThreads"><h3><a href="#SpecifyTheNumberOfThreads">Specify the Number of Threads</a></h3><p>You can configure the number of threads to run a benchmark instance by passing the option <code>-t</code>. The default value is one.</p><pre class="m-code"><span class="c1"># run the Taskflow implementation using 4 threads</span>
~$<span class="w"> </span>./bench_graph_traversal<span class="w"> </span>-m<span class="w"> </span>tf<span class="w"> </span>-t<span class="w"> </span><span class="m">4</span></pre><p>Depending on your environment, you may need to use <code>taskset</code> to set the CPU affinity of the running process. This allows the OS scheduler to keep process on the same CPU(s) as long as practical for performance reason.</p><pre class="m-code"><span class="c1"># affine the process to 4 CPUs, CPU 0, CPU 1, CPU 2, and CPU 3</span>
~$<span class="w"> </span>taskset<span class="w"> </span>-c<span class="w"> </span><span class="m">0</span>-3<span class="w"> </span>bench_graph_traversal<span class="w"> </span>-t<span class="w"> </span><span class="m">4</span><span class="w">  </span></pre></section><section id="SpecifyTheNumberOfRounds"><h3><a href="#SpecifyTheNumberOfRounds">Specify the Number of Rounds</a></h3><p>Each benchmark instance evaluates the runtime of the implementation at different problem sizes. Each problem size corresponds to one iteration. You can configure the number of rounds per iteration to average the runtime.</p><pre class="m-code"><span class="c1"># measure the %Taskflow runtime by averaging the results over 10 runs</span>
~$<span class="w"> </span>./bench_graph_traversal<span class="w"> </span>-r<span class="w"> </span><span class="m">10</span><span class="w"> </span>-m<span class="w"> </span>tf
<span class="p">|</span>V<span class="p">|</span>+<span class="p">|</span>E<span class="p">|</span><span class="w">     </span>Runtime
<span class="w">      </span><span class="m">2</span><span class="w">       </span><span class="m">0</span>.109<span class="w">   </span><span class="c1"># the runtime value 0.109 is an average of 10 runs</span>
<span class="w">    </span><span class="m">842</span><span class="w">       </span><span class="m">0</span>.298
<span class="w">    </span>...<span class="w">         </span>...
<span class="w"> </span><span class="m">619802</span><span class="w">      </span><span class="m">73</span>.135
<span class="w"> </span><span class="m">664771</span><span class="w">      </span><span class="m">74</span>.436</pre></section></section>
      </div>
</article></main>
<div class="m-doc-search" id="search">
  <a href="#!" onclick="return hideSearch()"></a>
  <div class="m-container">
    <div class="m-row">
      <div class="m-col-m-8 m-push-m-2">
        <div class="m-doc-search-header m-text m-small">
          <div><span class="m-label m-default">Tab</span> / <span class="m-label m-default">T</span> to search, <span class="m-label m-default">Esc</span> to close</div>
          <div id="search-symbolcount">&hellip;</div>
        </div>
        <div class="m-doc-search-content">
          <form>
            <input type="search" name="q" id="search-input" placeholder="Loading &hellip;" disabled="disabled" autofocus="autofocus" autocomplete="off" spellcheck="false" />
          </form>
          <noscript class="m-text m-danger m-text-center">Unlike everything else in the docs, the search functionality <em>requires</em> JavaScript.</noscript>
          <div id="search-help" class="m-text m-dim m-text-center">
            <p class="m-noindent">Search for symbols, directories, files, pages or
            modules. You can omit any prefix from the symbol or file path; adding a
            <code>:</code> or <code>/</code> suffix lists all members of given symbol or
            directory.</p>
            <p class="m-noindent">Use <span class="m-label m-dim">&darr;</span>
            / <span class="m-label m-dim">&uarr;</span> to navigate through the list,
            <span class="m-label m-dim">Enter</span> to go.
            <span class="m-label m-dim">Tab</span> autocompletes common prefix, you can
            copy a link to the result using <span class="m-label m-dim">⌘</span>
            <span class="m-label m-dim">L</span> while <span class="m-label m-dim">⌘</span>
            <span class="m-label m-dim">M</span> produces a Markdown link.</p>
          </div>
          <div id="search-notfound" class="m-text m-warning m-text-center">Sorry, nothing was found.</div>
          <ul id="search-results"></ul>
        </div>
      </div>
<script src="search-v2.js"></script>
<script src="searchdata-v2.js" async="async"></script>
<footer><nav>
  <div class="m-container">
    <div class="m-row">
      <div class="m-col-l-10 m-push-l-1">
        <p>Taskflow handbook is part of the <a href="https://taskflow.github.io">Taskflow project</a>, copyright © <a href="https://tsung-wei-huang.github.io/">Dr. Tsung-Wei Huang</a>, 2018&ndash;2025.<br />Generated by <a href="https://doxygen.org/">Doxygen</a> 1.12.0 and <a href="https://mcss.mosra.cz/">m.css</a>.</p>
      </div>
</nav></footer>
Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FilesExpand file tree

BenchmarkTaskflow.html

Latest commit

History

BenchmarkTaskflow.html

File metadata and controls