From 418101a5b44d830532026fea52a2c0a95446898f Mon Sep 17 00:00:00 2001 From: hagabb Date: Fri, 15 Sep 2023 08:58:12 -0700 Subject: [PATCH 1/3] Submitting Fortran edge detection sample Signed-off-by: u172874 --- .../simple-binary-images/.Makefile.swp | Bin 0 -> 12288 bytes .../simple-binary-images/.README.md.swp | Bin 0 -> 20480 bytes .../simple-binary-images/License.txt | 7 + .../simple-binary-images/Makefile | 52 +++++ .../simple-binary-images/README.md | 121 ++++++++++++ .../img_seg_do_concurrent.F90 | 168 ++++++++++++++++ .../img_seg_omp_target.F90 | 181 ++++++++++++++++++ .../simple-binary-images/sample.json | 24 +++ 8 files changed, 553 insertions(+) create mode 100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.Makefile.swp create mode 100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.README.md.swp create mode 100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/License.txt create mode 100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/Makefile create mode 100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/README.md create mode 100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/img_seg_do_concurrent.F90 create mode 100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/img_seg_omp_target.F90 create mode 100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/sample.json diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.Makefile.swp b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.Makefile.swp new file mode 100644 index 0000000000000000000000000000000000000000..1e4753603010de72cd6f23697f7d8268d9ad3029 GIT binary patch literal 12288 zcmeI2&ubGw6vw9`)%qJn&!g=@E8C>;M>QBQm{tSEgfitFxVySTfvPBz+H^dC?J z6~Uw4Jc{5&58~0YdeQ$tQ3SpFog|y4KWr)$1p5{~$?nX1Z{GXP>_Es|NnXqs_=%Au z!?laCca4o=a_fydm-`v3kB+5Ij2+cID_)*-+`ve!XrnuCwgV<)AhM;rQ8WF}6SYLn zRlfE4)86fkYZZf|C6ZUBYh|Nr>+ 
z|LdKMJpvEGb#M%%!1o=DJqHbN4r~Jb;OBP6et=KlBe)Apa0GnW#@J`@0z3c#I1Glt zKJb1k&K-cq;6AtqDqtMsz+Uie3u9lwTks0J1ouG$%z+XZ28X}_@B#Ha0gr*M?HQnp z6d(mi0aAbzAO%Q)f2qJo>0I%`OqyGsFavIS9#{3C5`WuP^HKUCV*`frxl!e=?^J5S zsaRpXa}R?h&rVoh;$zE-B?C*_BF43je|LLmswM3@Rx`si@^GtjeCQN6Bp*7>hYl7C zrAo0>o;rm4v!&_QJ&Uu{nZY~V38@<*Q#m3UnK+3fXul1&AOsR#Wr&X3>Royk5zDin zkalBTDG{u?mN=5000=!_Q?F>9vbrN+3RjE}PY2In;mA+~erU9`uVreXtBxOC+-}&? z&?}@>6^rLq2e{S>fEm-R$__&1DMN3%P8$I61>d5{ssu8v!3p>E(hC zmydG8W&;bYphgd?-&p8(1q*6k1Z~$n{^Do7kXKzmEH#{Zq!(LaI6bX@V6ET6WdzcN z+17kr-2?pTi_6+OO1in1bZ;rPsG4;zYIR`GVQg@4cuf}q@Ur9zf(d0 literal 0 HcmV?d00001 diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.README.md.swp b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.README.md.swp new file mode 100644 index 0000000000000000000000000000000000000000..2f595b360f78b32fe3ff2249128f5ab67e0ae69a GIT binary patch literal 20480 zcmeHPYm6jS6}~)FWJQb$f|2Cn3YlrAs%M7x7#3&e#fEt~-LnY0I#XSBr@Nf$y1n(7 z-qscW@Qh-j{K14o<(G+Z#l%2DqEUVtqa^+@f%piCKNOW14a6VfckaE_)!oxGi?FMh ztxkX2-BtIVd+OYC&pqedvr`*CdSr>;YL7F#-pAONzj(1&KhS>kl0%H86Sqv>ddrP; zXWv&K>waiW?%RfL-e(AwWhkask65wOYy~1N&-#&YlhsJpBd6K)!+LpMMoHv^nr&E|22z3kehwC?rrwppZZzfkFa>1PTe9UlK@K7qKtlkuNlleWm%lVBqs&^Lx~M zeq`YOMdtUE`MiDLzWG^v6cQ*TP)MMVKp}xb0)+$$2^112Bv448kU$}ULIVE-2{<@6 zw=ZSvNr2A(v-tn_moxSS;0oZW_cFEs{N^&o+Q8MoRlwih!`QEZCxH{d?ZE2?7<&<;9lUF`9J5J;~H54?A@mL7H9t)l{ z1imM@6Z482`K^SbJ3qGQhv`X9G#z)P@|lrFlC`HtM6S`VX zGD0qAjpIg7+e+$5 zYFn1AWZ&kbERmE6ms`9e(+K)SWin8?IB|kN%{3mNPv>eK9;2y?q8YBVZPW0u$@Lua zK)OyMJWxT`QfXNxqEwpVN7NX+;p_+=VoWE9gyVI%he1J-R5;;jOEWUFcmyl*AU~Y? zfk#6*p~u(KFr@Fq_L6fFaW7AuShx+!>w2;lR@b@HY6U*#ovf64w1wHjxrp;e zOpK5*BiU4zN(&+s5p;$xI%()OAfRx|ucx>~GgT__?zGv&2lPPksHF@&Q@@T@v|us) zsOW;U^AY-HtpIvlDX<^pXP)h${o;9a>a$=j^BUA1=Aqa?J>tMu? 
z9L*)A*dS}5Y*LR@$GD5N2t5h3o?yG`(UUP{G4@(;B|fs!B7Jg#Q8)k%Kk|6XiINUC zyBJ^eUDyT@D{gDF-cvkVLquYVAA=!?JWF-Rw!PR+PA2(Mtz$u_HMNbZWyU9M?lNRxc-r5|U&jLNgCvM`f$tk7_GQYdwsHF`X83bncImPeP8@1fpQid*@QLP%7 zA%|m|LGOy(O4<(m;-1$jZ9K1!MEw(8U6{d?!o`Zv=T0Z`_U$vXx~}yPV-Slgl^*Km z>nRur_@yPjUQ5FyE%6D|FyBaSX(-w>%#>wWySk~XL=m>9>3<$>5!qw?I)zPC`%ExP zjHwMkKZ98Ya3;_T=FYE0v{vAYxK4{ae$s3ss%#tWS9G;vl_6Dvvzguvv45aF`tM%e zW4zt)vH4;Um^rwAB2fJ1+o<8H=D6D;^lS*73m84@F|HGfG1`UAFfv=+p}@YPp4lw& zgeIS+Z^#&ie!l8$UJgv=4BEx&pDdg9qnviW3D9)WuK_F8^WYd_3-Jb*wa=kuox^_%Jw%1Di|}TgzCFWc z6;bcTa3nH^>>f-hmo4(Y$ah08wZYJ?=!#f7X=V(}Z|?x4f@)Mfmg#7-gS?`do}(?S zTTGvM=oJ4~5EtK#n3>}L?EL;k#Qon#%zqkq9H;^n;8Ng+i2bX;KN0VL7gzz7fvbQI z0T%!dBHq6ixCVFuvHp*M#{dC%z&bDv902}+SpTQM6F?hS11#VY;3dTUKL)-5JPv#X zr~`+AYk=2~1Naqi8h8ZofIEN>1OGry;CbM2;4WYRmYSnbft+O_({h? z@D}AMKAk*Z$mn2O6Hx?@M!JY@kUB{snNmiwL1`u0FtCr{chkfxbY-@y4KZKW^z$G! z&lF_o7;Gr;*{eNt;lVJrxw*-BRU%Ktm5($+IVByIl1H>f+c~yev#*U#aXxO3_rK;o zj^9W-k&6*CeuM(69tUm$`?uWY{5E@hqOaqeH4ewG+YcFuGrifc7FEvm+w|Y75pJX1 z85#|t)TQ5dZ-7!xxp$Ov*{GMTLCFmj!MSXtT$H90Z^H|LhMe&|w9y3T`fd83?M7Yt zeWP@F)x*6V8=XO(L*+Es=M6SGtg{Kqccr;pPMb9ZCjCCDa45c^We#)DeNFbZR(Dn0 zWE|k$mJ^IGM81c(0&cX*BKO>#m}DzUs~gqnwS~Fr#>)Ks;>z^w?Mu_kv(wd;wR^Z- z&ay$2JE^x)ZiReel3SAFA0H}YdCICO77$$_GNAz3^W#?Fbd)hgw1fZn3X`YWz-k7_ zWg@;pQl2a;lD9ZJQNf}5nS1B-Q}r3+!mis>WV? 
zC5cO%REhO&mQ3-4Z~6>jlSznq>3D?AAm`d7mb|teY zHlsU~yWT6pW&EbywB1&^f!LKJHs6yrf~YLKrNkYLgA4=XHx4HkixYd1>qfjfIz!78 zA*&DJbowaYkWyP@6NvUrWlDngphNb(N1LB2u?tWP;A z9b_(=2-zAqB*0{=)1+?t9bzQ9I%hvCc4h*ZB259^70T|YGzZexyqUrciCe<;xAYpN zOd}FuI08e0UN7 z6GV`@z|~zx#dILtWij`v4kJ?ow@ICJ@df1AE0}Fc(JavUE$^NSXsrEg+f$`I9QHGMMzb+@I|4J`g>t#yHLot;>aER3uU!H z*2+jsBGxPKReLnbm@4=(rB*7SMG+nqF?k&4z!X1X+Ln%>en(%X19d^gz^c*y^-E`D ztB1;{O5mef$?*e}d?5-nH@g*(*s|&Gs|{pguvznI*A*z!P<3!Cy87ckKEQyG5Rm#v z3dLpo?Y2gUQpCVmH;h3eTb{J201N383zsY>AJ#Id5`C;e34$HO!sdiuKNJLtE2>5r zb`wD@(r3NdF*@!a(*AekmsE!X<(iZkAaMWsR*GZ-Jf95Igvf; mBvideO`>X*sy*na2mMV_2Llcks^G*(4VEQ@P)kBR82dL$gd$h~ literal 0 HcmV?d00001 diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/License.txt b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/License.txt new file mode 100644 index 0000000000..6e9524bd74 --- /dev/null +++ b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/License.txt @@ -0,0 +1,7 @@ +Copyright 2020 Intel Corporation + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/Makefile b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/Makefile new file mode 100644 index 0000000000..e63dd983f7 --- /dev/null +++ b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/Makefile @@ -0,0 +1,52 @@ +##============================================================= +## Copyright © 2020 Intel Corporation +## +## SPDX-License-Identifier: MIT +## ============================================================= +## +##************************************************************** +## To compile and run the do concurrent examples: make run_dc +## To compile and run the for-loop examples: make run_omp +## To compile and run all examples: make run_all +##************************************************************** + +default: run_all + +run_all: run_dc run_omp + +run_dc: img_seg_do_conc_cpu_seq img_seg_do_conc_cpu_par img_seg_do_conc_gpu + ./img_seg_do_conc_cpu_seq -n 12 -o 2 -i 1 -d + ./img_seg_do_conc_cpu_par -n 12 -o 2 -i 1 -d + OMP_TARGET_OFFLOAD=MANDATORY ./img_seg_do_conc_gpu -n 12 -o 2 -i 1 -d + +run_omp: img_seg_cpu img_seg_omp_cpu img_seg_omp_gpu + ./img_seg_cpu -n 12 -o 2 -i 1 -d + ./img_seg_omp_cpu -n 12 -o 2 -i 1 -d + OMP_TARGET_OFFLOAD=MANDATORY ./img_seg_omp_gpu -n 12 -o 2 -i 1 -d + +OMP_OPTS = -qopenmp +GPU_OPTS = -fopenmp-targets=spir64 -fopenmp-target-do-concurrent + +img_seg_do_conc_cpu_seq: img_seg_do_concurrent.F90 + ifx $< -o $@ + +img_seg_do_conc_cpu_par: img_seg_do_concurrent.F90 + ifx $< -o $@ $(OMP_OPTS) + +img_seg_do_conc_gpu: img_seg_do_concurrent.F90 + ifx $< -o $@ $(OMP_OPTS) $(GPU_OPTS) + +img_seg_cpu: img_seg_omp_target.F90 + ifx $< -o $@ + +img_seg_omp_cpu: 
img_seg_omp_target.F90 + ifx $< -o $@ $(OMP_OPTS) + +img_seg_omp_gpu: img_seg_omp_target.F90 + ifx $< -o $@ -DOMP_TARGET $(OMP_OPTS) $(GPU_OPTS) + +clean: + -rm -f img_seg_do_conc_cpu_seq img_seg_do_conc_cpu_par img_seg_do_conc_gpu + -rm -f img_seg_cpu img_seg_omp_cpu img_seg_omp_gpu + +.PHONY: clean all run_all run_dc run_omp diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/README.md b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/README.md new file mode 100644 index 0000000000..83cee57403 --- /dev/null +++ b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/README.md @@ -0,0 +1,121 @@ +# Simple Edge Detection Sample +Segmentation is a common operation in image processing to find the boundaries of objects in an image. +This sample implements a simple edge detection algorithm to find object boundaries in a binary image. +However, this sample is more about offloading Fortran code to a GPU than it is about edge detection. +The algorithm is implemented in two different but functionally equivalent ways. First, it is implemented +using ordinary nested for-loops that are parallelized using OpenMP directives. Second, it is implemented +using a single DO CONCURRENT loop, which is parallelized using the OpenMP backend. In either case, the +Intel® OpenMP runtime library is capable of offloading the edge detection loops to a GPU. + +| Optimized for | Description +|:--- |:--- +| OS | Linux* Ubuntu* 18.04 or newer +| Hardware | Intel® CPUs and GPUs +| Software | Intel® Fortran Compiler +| What you will learn | How to offload Fortran loops to a GPU +| Time to complete | 15 minutes + +## Purpose +This sample demonstrates two Fortran implementations of edge detection: + + 1. img_seg_omp_target.F90 implements edge detection on binary images using ordinary for-loops and OpenMP target directives + 2. 
img_seg_do_concurrent.F90 implements edge detection on binary images using only a DO CONCURRENT loop + +The implementations are functionally equivalent. In both cases, the OpenMP runtime library is used to parallelize the +edge detection loops, regardless of whether they are run on the CPU or offloaded to a GPU. + +## Key Implementation Details +[Using Fortran DO CONCURRENT for Accelerator Offload](https://www.intel.com/content/www/us/en/developer/articles/technical/using-fortran-do-current-for-accelerator-offload.html) provides more detailed descriptions of each example code, and discusses the relative merits of each approach. + +## License +Code samples are licensed under the MIT license. See [License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details. + +Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt) + +## Using Visual Studio Code* (Optional) + +You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations, +and browse and download samples. + +The basic steps to build and run a sample using VS Code include: + - Download a sample using the extension **Code Sample Browser for Intel oneAPI Toolkits**. + - Configure the oneAPI environment with the extension **Environment Configurator for Intel oneAPI Toolkits**. + - Open a Terminal in VS Code (**Terminal>New Terminal**). + - Run the sample in the VS Code terminal using the instructions below. + - (Linux only) Debug your GPU application with GDB for Intel® oneAPI toolkits using the **Generate Launch Configurations** extension. + +To learn more about the extensions, see +[Using Visual Studio Code with Intel® oneAPI Toolkits](https://www.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html). 
+ +After learning how to use the extensions for Intel oneAPI Toolkits, return to this readme for instructions on how to build and run a sample. + +## Building and Running this sample + +> **Note**: If you have not already done so, set up your CLI +> environment by sourcing the `setvars` script located in +> the root of your oneAPI installation. +> +> Linux Sudo: . /opt/intel/oneapi/setvars.sh +> +> Linux User: . ~/intel/oneapi/setvars.sh +> +>For more information on environment variables, see Use the setvars Script for [Linux or macOS](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html). + +### Running Samples on the DevCloud +When running a sample in the Intel DevCloud, remember that you must specify the compute node (CPU, GPU, FPGA) as well whether to run in batch or interactive mode. For more information see the Intel® oneAPI Base Toolkit Get Started Guide (https://devcloud.intel.com/oneapi/get-started/base-toolkit/). + +### On a Linux System +Run `make` to build and run the sample. Six programs are generated: + + 1. img_seg_cpu runs the for-loop implementation sequentially on the CPU + 2. img_seg_omp_cpu runs the for-loops in parallel on the CPU using OpenMP directives + 3. img_seg_omp_gpu offloads the for-loops to the GPU using OpenMP target directives + 4. img_seg_do_conc_cpu_seq runs the DO CONCURRENT implementation sequentially on the CPU + 5. img_seg_do_conc_cpu_par runs the DO CONCURRENT loop in parallel on the CPU + 6. img_seg_do_conc_gpu offloads the DO CONCURRENT loop to the GPU using the OpenMP backend + +You can remove all generated files with `make clean`. + +### Example of Output +If everything is working correctly, each example program will perform edge detection on a small, randomly-generated binary +image. 
It will display the original image followed by the outline of the objects in the image, e.g.: +``` +OMP_TARGET_OFFLOAD=MANDATORY ./img_seg_omp_gpu -n 12 -o 2 -i 1 -d + Grid dimensions: 12 + Number of images to process: 1 + Number of objects in each image: 2 + + Binary image: + 0 0 0 0 0 0 0 0 0 0 0 0 + 0 0 0 0 0 0 0 0 0 0 0 0 + 0 0 0 0 0 0 0 0 0 0 0 0 + 0 1 1 1 1 1 0 0 0 0 0 0 + 0 1 1 1 1 1 0 0 0 0 0 0 + 0 1 1 1 1 1 0 0 0 0 0 0 + 0 1 1 1 1 1 0 0 0 0 0 0 + 0 1 1 1 1 1 0 0 0 0 0 0 + 0 0 0 0 0 0 1 1 1 0 0 0 + 0 0 0 0 0 0 1 1 1 0 0 0 + 0 0 0 0 0 0 1 1 1 0 0 0 + 0 0 0 0 0 0 0 0 0 0 0 0 + + Edge mask: + - - - - - - - - - - - - + - - - - - - - - - - - - + - - - - - - - - - - - - + - T T T T T - - - - - - + - T - - - T - - - - - - + - T - - - T - - - - - - + - T - - - T - - - - - - + - T T T T T - - - - - - + - - - - - - T T T - - - + - - - - - - T - T - - - + - - - - - - T T T - - - + - - - - - - - - - - - - + Image 1 took 9.010000000000000E-004 seconds + Total time (not including first iteration): 0.000000000000000E+000 seconds +``` + +### Troubleshooting +If an error occurs, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits. +[Learn more](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/img_seg_do_concurrent.F90 b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/img_seg_do_concurrent.F90 new file mode 100644 index 0000000000..8d12cc5f17 --- /dev/null +++ b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/img_seg_do_concurrent.F90 @@ -0,0 +1,168 @@ +!=============================================================================== +! +! Content: +! Implement edge detection on simple binary images using a standard Fortran +! DO CONCURRENT loop. The compiler will offload the loop to a GPU using the +! OpenMP runtime. +! +! Compile for CPU (sequential): +! 
ifx img_seg_do_concurrent.F90 -o img_seg_do_conc_cpu_seq +! +! Compile for CPU (parallel): +! ifx img_seg_do_concurrent.F90 -o img_seg_do_conc_cpu_par -qopenmp +! +! Compile for GPU using the OpenMP backend: +! ifx img_seg_do_concurrent.F90 -o img_seg_do_conc_gpu -qopenmp \ +! -fopenmp-targets=spir64 -fopenmp-target-do-concurrent +! +!=============================================================================== +program img_seg_do_conc_example + implicit none + + integer :: n = 8, objects = 3, images = 1 + logical :: display = .false. + integer :: i, j, img_i, allocstat, stat + + integer, allocatable :: image(:,:) + logical, allocatable :: edge_mask(:,:) + + character (len = 132) :: allocmsg + character (len = 32) :: arg1, arg2 + + integer (kind=8) :: start_time, end_time, clock_precision + real (kind=8) :: cycle_time, total_time = 0.0d0 + + call process_command_line() + call system_clock(count_rate = clock_precision) + + ! Allocate image and edge mask + allocate (image(n, n), source = 0, stat = allocstat, errmsg = allocmsg) + if (allocstat > 0) stop trim(allocmsg) + + allocate (edge_mask(n, n), source = .false., stat = allocstat, errmsg = allocmsg) + if (allocstat > 0) stop trim(allocmsg) + + ! Process images + do img_i = 1, images + call initialize_image() + if (display) call display_image() + + call system_clock(start_time) ! Start timer + + ! Outline the objects in the binary image + do concurrent (j = 1:n, i = 1:n, image(i, j) /= 0) + if (i == 1 .or. i == n .or. & + j == 1 .or. j == n) then + edge_mask(i, j) = .true. + else + if (any(image(i-1:i+1, j-1:j+1) == 0)) edge_mask(i, j) = .true. + endif + enddo + + call system_clock(end_time) ! Stop timer + cycle_time = dble(end_time - start_time) / dble(clock_precision) + + if (display) call display_edge_mask() + + print *, 'Image', img_i, 'took', cycle_time, 'seconds' + if (img_i /= 1) total_time = total_time + cycle_time + + edge_mask = .false. ! 
Reset edge mask + enddo + print *, 'Total time (not including first iteration):', total_time, 'seconds' + + deallocate(image, edge_mask) + +contains + subroutine initialize_image() + integer x, x_min, x_max, y, y_min, y_max, d + real :: rn(3) + + image = 0 + + ! Create random regions of interest in the image + call random_seed() + do i = 1, objects + call random_number(rn) + d = 1 + floor(2 * rn(1)) + + x_min = d + 1 + x_max = n - d + x = x_min + (x_max - x_min) * rn(2) + + y_min = d + 1 + y_max = n - d + y = y_min + (y_max - y_min) * rn(3) + + image(x-d:x+d, y-d:y+d) = 1 + enddo + end subroutine initialize_image + + subroutine display_image() + print * + print *, 'Binary image:' + do j = 1, n + do i = 1, n + write(6, advance='no', fmt="(i3)") image(i, j) + enddo + print * + enddo + end subroutine display_image + + subroutine display_edge_mask() + print * + print *, 'Edge mask:' + do j = 1, n + do i = 1, n + if (edge_mask(i, j)) then + write(6, advance='no', fmt="(l3)") edge_mask(i, j) + else + write(6, advance='no', fmt="(a3)") '-' + endif + enddo + print * + enddo + end subroutine display_edge_mask + + subroutine process_command_line() + j = 1 + do while (j <= command_argument_count()) + call get_command_argument(j, arg1) + select case (arg1) + case ('-n') + call get_command_argument(j+1, arg2) + read(arg2, *, iostat=stat) n + j = j + 2 + case ('-o') + call get_command_argument(j+1, arg2) + read(arg2, *, iostat=stat) objects + j = j + 2 + case ('-i') + call get_command_argument(j+1, arg2) + read(arg2, *, iostat=stat) images + j = j + 2 + case ('-d') + display = .true. 
+ j = j + 1 + case ('-h') + call print_help() + stop + case default + print *, 'Unrecognized command-line option: ', arg1 + call print_help() + stop + end select + enddo + print *, 'Grid dimensions:', n + print *, 'Number of images to process:', images + print *, 'Number of objects in each image:', objects + end subroutine process_command_line + + subroutine print_help() + print '(a,/)', 'Command-line options:' + print '(a)', ' -n # image dimensions (integer)' + print '(a)', ' -o # number of objects in image (integer), objects may overlap' + print '(a)', ' -i # number of images to process (integer)' + print '(a)', ' -d display image and object edge mask' + end subroutine print_help +end program img_seg_do_conc_example diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/img_seg_omp_target.F90 b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/img_seg_omp_target.F90 new file mode 100644 index 0000000000..73d14f831e --- /dev/null +++ b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/img_seg_omp_target.F90 @@ -0,0 +1,181 @@ +!=============================================================================== +! +! Content: +! Implement edge detection on simple binary images and offload the +! computation to a GPU using OpenMP target directives. +! +! Compile for CPU without OpenMP: +! ifx img_seg_omp_target.F90 -o img_seg_cpu +! +! Compile for CPU with OpenMP: +! ifx -qopenmp img_seg_omp_target.F90 -o img_seg_omp_cpu +! +! Compile for GPU using the OpenMP backend: +! ifx img_seg_omp_target.F90 -o img_seg_omp_gpu \ +! -DOMP_TARGET -qopenmp -fopenmp-targets=spir64 +! +!=============================================================================== +program img_seg_omp_target + implicit none + + integer :: n = 8, objects = 3, images = 1 + logical :: display = .false. 
+ integer :: i, j, img_i, allocstat, stat + + integer, allocatable :: image(:,:) + logical, allocatable :: edge_mask(:,:) + + character (len = 132) :: allocmsg + character (len = 32) :: arg1, arg2 + + integer (kind=8) :: start_time, end_time, clock_precision + real (kind=8) :: cycle_time, total_time = 0.0d0 + + call process_command_line() + call system_clock(count_rate = clock_precision) + + ! Allocate image and edge mask + allocate (image(n, n), source = 0, stat = allocstat, errmsg = allocmsg) + if (allocstat > 0) stop trim(allocmsg) + + allocate (edge_mask(n, n), source = .false., stat = allocstat, errmsg = allocmsg) + if (allocstat > 0) stop trim(allocmsg) + + ! Process images + do img_i = 1, images + call initialize_image() + if (display) call display_image() + + call system_clock(start_time) ! Start timer + + ! Outline the objects in the binary image +#if defined (OMP_TARGET) + !$omp target data map(to:image) map(from:edge_mask) + !$omp target +#endif + !$omp parallel do + do j = 1, n + do i = 1, n + edge_mask(i, j) = .false. + if (image(i, j) /= 0) then + if (i == 1 .or. i == n .or. & + j == 1 .or. j == n) then + edge_mask(i, j) = .true. + else + if (any(image(i-1:i+1, j-1:j+1) == 0)) edge_mask(i, j) = .true. + endif + endif + enddo + enddo +#if defined (OMP_TARGET) + !$omp end target + !$omp end target data +#endif + + call system_clock(end_time) ! Stop timer + cycle_time = dble(end_time - start_time) / dble(clock_precision) + + if (display) call display_edge_mask() + + print *, 'Image', img_i, 'took', cycle_time, 'seconds' + if (img_i /= 1) total_time = total_time + cycle_time + + edge_mask = .false. ! Reset edge mask + enddo + print *, 'Total time (not including first iteration):', total_time, 'seconds' + + deallocate(image, edge_mask) + +contains + subroutine initialize_image() + integer x, x_min, x_max, y, y_min, y_max, d + real :: rn(3) + + image = 0 + + ! 
Create random regions of interest in the image + call random_seed() + do i = 1, objects + call random_number(rn) + d = 1 + floor(2 * rn(1)) + + x_min = d + 1 + x_max = n - d + x = x_min + (x_max - x_min) * rn(2) + + y_min = d + 1 + y_max = n - d + y = y_min + (y_max - y_min) * rn(3) + + image(x-d:x+d, y-d:y+d) = 1 + enddo + end subroutine initialize_image + + subroutine display_image() + print * + print *, 'Binary image:' + do j = 1, n + do i = 1, n + write(6, advance='no', fmt="(i3)") image(i, j) + enddo + print * + enddo + end subroutine display_image + + subroutine display_edge_mask() + print * + print *, 'Edge mask:' + do j = 1, n + do i = 1, n + if (edge_mask(i, j)) then + write(6, advance='no', fmt="(l3)") edge_mask(i, j) + else + write(6, advance='no', fmt="(a3)") '-' + endif + enddo + print * + enddo + end subroutine display_edge_mask + + subroutine process_command_line() + j = 1 + do while (j <= command_argument_count()) + call get_command_argument(j, arg1) + select case (arg1) + case ('-n') + call get_command_argument(j+1, arg2) + read(arg2, *, iostat=stat) n + j = j + 2 + case ('-o') + call get_command_argument(j+1, arg2) + read(arg2, *, iostat=stat) objects + j = j + 2 + case ('-i') + call get_command_argument(j+1, arg2) + read(arg2, *, iostat=stat) images + j = j + 2 + case ('-d') + display = .true. 
+ j = j + 1 + case ('-h') + call print_help() + stop + case default + print *, 'Unrecognized command-line option: ', arg1 + call print_help() + stop + end select + enddo + print *, 'Grid dimensions:', n + print *, 'Number of images to process:', images + print *, 'Number of objects in each image:', objects + end subroutine process_command_line + + subroutine print_help() + print '(a,/)', 'Command-line options:' + print '(a)', ' -n # image dimensions (integer)' + print '(a)', ' -o # number of objects in image (integer), objects may overlap' + print '(a)', ' -i # number of images to process (integer)' + print '(a)', ' -d display image and object edge mask' + end subroutine print_help +end program img_seg_omp_target diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/sample.json b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/sample.json new file mode 100644 index 0000000000..797dc619af --- /dev/null +++ b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/sample.json @@ -0,0 +1,24 @@ +{ + "guid": "3E284A06-8D0E-41F1-B237-756D28F3FC57", + "name": "Edge Detection in Simple Binary Images", + "categories": ["Toolkit/oneAPI Direct Programming/Fortran/OpenMP"], + "description": "Offload Fortran loops to a GPU", + "toolchain": [ "ifx" ], + "languages": [ { "fortran": {} } ], + "targetDevice": [ "CPU", "GPU" ], + "os": [ "linux", "windows" ], + "builder": [ "make" ], + "ciTests": { + "linux": [ + { + "id": "edge_detection", + "steps": [ + "make clean", + "make" + ] + } + ], + "windows": [] + }, + "expertise": "Concepts and Functionality" +} From 7c78eefc8bce8c34048f70ad442d4cd11a4403a0 Mon Sep 17 00:00:00 2001 From: hagabb Date: Fri, 15 Sep 2023 09:00:21 -0700 Subject: [PATCH 2/3] Submitting Fortran edge detection sample Signed-off-by: hagabb --- .../simple-binary-images/.Makefile.swp | Bin 12288 -> 0 bytes .../simple-binary-images/.README.md.swp | Bin 20480 -> 0 bytes 2 files changed, 0 insertions(+), 0 deletions(-) delete mode 
100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.Makefile.swp delete mode 100644 DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.README.md.swp diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.Makefile.swp b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.Makefile.swp deleted file mode 100644 index 1e4753603010de72cd6f23697f7d8268d9ad3029..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12288 zcmeI2&ubGw6vw9`)%qJn&!g=@E8C>;M>QBQm{tSEgfitFxVySTfvPBz+H^dC?J z6~Uw4Jc{5&58~0YdeQ$tQ3SpFog|y4KWr)$1p5{~$?nX1Z{GXP>_Es|NnXqs_=%Au z!?laCca4o=a_fydm-`v3kB+5Ij2+cID_)*-+`ve!XrnuCwgV<)AhM;rQ8WF}6SYLn zRlfE4)86fkYZZf|C6ZUBYh|Nr>+ z|LdKMJpvEGb#M%%!1o=DJqHbN4r~Jb;OBP6et=KlBe)Apa0GnW#@J`@0z3c#I1Glt zKJb1k&K-cq;6AtqDqtMsz+Uie3u9lwTks0J1ouG$%z+XZ28X}_@B#Ha0gr*M?HQnp z6d(mi0aAbzAO%Q)f2qJo>0I%`OqyGsFavIS9#{3C5`WuP^HKUCV*`frxl!e=?^J5S zsaRpXa}R?h&rVoh;$zE-B?C*_BF43je|LLmswM3@Rx`si@^GtjeCQN6Bp*7>hYl7C zrAo0>o;rm4v!&_QJ&Uu{nZY~V38@<*Q#m3UnK+3fXul1&AOsR#Wr&X3>Royk5zDin zkalBTDG{u?mN=5000=!_Q?F>9vbrN+3RjE}PY2In;mA+~erU9`uVreXtBxOC+-}&? 
z&?}@>6^rLq2e{S>fEm-R$__&1DMN3%P8$I61>d5{ssu8v!3p>E(hC zmydG8W&;bYphgd?-&p8(1q*6k1Z~$n{^Do7kXKzmEH#{Zq!(LaI6bX@V6ET6WdzcN z+17kr-2?pTi_6+OO1in1bZ;rPsG4;zYIR`GVQg@4cuf}q@Ur9zf(d0 diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.README.md.swp b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/.README.md.swp deleted file mode 100644 index 2f595b360f78b32fe3ff2249128f5ab67e0ae69a..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 20480 zcmeHPYm6jS6}~)FWJQb$f|2Cn3YlrAs%M7x7#3&e#fEt~-LnY0I#XSBr@Nf$y1n(7 z-qscW@Qh-j{K14o<(G+Z#l%2DqEUVtqa^+@f%piCKNOW14a6VfckaE_)!oxGi?FMh ztxkX2-BtIVd+OYC&pqedvr`*CdSr>;YL7F#-pAONzj(1&KhS>kl0%H86Sqv>ddrP; zXWv&K>waiW?%RfL-e(AwWhkask65wOYy~1N&-#&YlhsJpBd6K)!+LpMMoHv^nr&E|22z3kehwC?rrwppZZzfkFa>1PTe9UlK@K7qKtlkuNlleWm%lVBqs&^Lx~M zeq`YOMdtUE`MiDLzWG^v6cQ*TP)MMVKp}xb0)+$$2^112Bv448kU$}ULIVE-2{<@6 zw=ZSvNr2A(v-tn_moxSS;0oZW_cFEs{N^&o+Q8MoRlwih!`QEZCxH{d?ZE2?7<&<;9lUF`9J5J;~H54?A@mL7H9t)l{ z1imM@6Z482`K^SbJ3qGQhv`X9G#z)P@|lrFlC`HtM6S`VX zGD0qAjpIg7+e+$5 zYFn1AWZ&kbERmE6ms`9e(+K)SWin8?IB|kN%{3mNPv>eK9;2y?q8YBVZPW0u$@Lua zK)OyMJWxT`QfXNxqEwpVN7NX+;p_+=VoWE9gyVI%he1J-R5;;jOEWUFcmyl*AU~Y? zfk#6*p~u(KFr@Fq_L6fFaW7AuShx+!>w2;lR@b@HY6U*#ovf64w1wHjxrp;e zOpK5*BiU4zN(&+s5p;$xI%()OAfRx|ucx>~GgT__?zGv&2lPPksHF@&Q@@T@v|us) zsOW;U^AY-HtpIvlDX<^pXP)h${o;9a>a$=j^BUA1=Aqa?J>tMu? 
z9L*)A*dS}5Y*LR@$GD5N2t5h3o?yG`(UUP{G4@(;B|fs!B7Jg#Q8)k%Kk|6XiINUC zyBJ^eUDyT@D{gDF-cvkVLquYVAA=!?JWF-Rw!PR+PA2(Mtz$u_HMNbZWyU9M?lNRxc-r5|U&jLNgCvM`f$tk7_GQYdwsHF`X83bncImPeP8@1fpQid*@QLP%7 zA%|m|LGOy(O4<(m;-1$jZ9K1!MEw(8U6{d?!o`Zv=T0Z`_U$vXx~}yPV-Slgl^*Km z>nRur_@yPjUQ5FyE%6D|FyBaSX(-w>%#>wWySk~XL=m>9>3<$>5!qw?I)zPC`%ExP zjHwMkKZ98Ya3;_T=FYE0v{vAYxK4{ae$s3ss%#tWS9G;vl_6Dvvzguvv45aF`tM%e zW4zt)vH4;Um^rwAB2fJ1+o<8H=D6D;^lS*73m84@F|HGfG1`UAFfv=+p}@YPp4lw& zgeIS+Z^#&ie!l8$UJgv=4BEx&pDdg9qnviW3D9)WuK_F8^WYd_3-Jb*wa=kuox^_%Jw%1Di|}TgzCFWc z6;bcTa3nH^>>f-hmo4(Y$ah08wZYJ?=!#f7X=V(}Z|?x4f@)Mfmg#7-gS?`do}(?S zTTGvM=oJ4~5EtK#n3>}L?EL;k#Qon#%zqkq9H;^n;8Ng+i2bX;KN0VL7gzz7fvbQI z0T%!dBHq6ixCVFuvHp*M#{dC%z&bDv902}+SpTQM6F?hS11#VY;3dTUKL)-5JPv#X zr~`+AYk=2~1Naqi8h8ZofIEN>1OGry;CbM2;4WYRmYSnbft+O_({h? z@D}AMKAk*Z$mn2O6Hx?@M!JY@kUB{snNmiwL1`u0FtCr{chkfxbY-@y4KZKW^z$G! z&lF_o7;Gr;*{eNt;lVJrxw*-BRU%Ktm5($+IVByIl1H>f+c~yev#*U#aXxO3_rK;o zj^9W-k&6*CeuM(69tUm$`?uWY{5E@hqOaqeH4ewG+YcFuGrifc7FEvm+w|Y75pJX1 z85#|t)TQ5dZ-7!xxp$Ov*{GMTLCFmj!MSXtT$H90Z^H|LhMe&|w9y3T`fd83?M7Yt zeWP@F)x*6V8=XO(L*+Es=M6SGtg{Kqccr;pPMb9ZCjCCDa45c^We#)DeNFbZR(Dn0 zWE|k$mJ^IGM81c(0&cX*BKO>#m}DzUs~gqnwS~Fr#>)Ks;>z^w?Mu_kv(wd;wR^Z- z&ay$2JE^x)ZiReel3SAFA0H}YdCICO77$$_GNAz3^W#?Fbd)hgw1fZn3X`YWz-k7_ zWg@;pQl2a;lD9ZJQNf}5nS1B-Q}r3+!mis>WV? 
zC5cO%REhO&mQ3-4Z~6>jlSznq>3D?AAm`d7mb|teY zHlsU~yWT6pW&EbywB1&^f!LKJHs6yrf~YLKrNkYLgA4=XHx4HkixYd1>qfjfIz!78 zA*&DJbowaYkWyP@6NvUrWlDngphNb(N1LB2u?tWP;A z9b_(=2-zAqB*0{=)1+?t9bzQ9I%hvCc4h*ZB259^70T|YGzZexyqUrciCe<;xAYpN zOd}FuI08e0UN7 z6GV`@z|~zx#dILtWij`v4kJ?ow@ICJ@df1AE0}Fc(JavUE$^NSXsrEg+f$`I9QHGMMzb+@I|4J`g>t#yHLot;>aER3uU!H z*2+jsBGxPKReLnbm@4=(rB*7SMG+nqF?k&4z!X1X+Ln%>en(%X19d^gz^c*y^-E`D ztB1;{O5mef$?*e}d?5-nH@g*(*s|&Gs|{pguvznI*A*z!P<3!Cy87ckKEQyG5Rm#v z3dLpo?Y2gUQpCVmH;h3eTb{J201N383zsY>AJ#Id5`C;e34$HO!sdiuKNJLtE2>5r zb`wD@(r3NdF*@!a(*AekmsE!X<(iZkAaMWsR*GZ-Jf95Igvf; mBvideO`>X*sy*na2mMV_2Llcks^G*(4VEQ@P)kBR82dL$gd$h~ From e46225b827964ad2b76e350ebf61c11924117ef3 Mon Sep 17 00:00:00 2001 From: hagabb Date: Mon, 18 Sep 2023 06:43:52 -0700 Subject: [PATCH 3/3] made requested changes to the README Signed-off-by: hagabb --- .../EdgeDetection/simple-binary-images/README.md | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/README.md b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/README.md index 83cee57403..83f953a956 100644 --- a/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/README.md +++ b/DirectProgramming/Fortran/EdgeDetection/simple-binary-images/README.md @@ -27,11 +27,6 @@ edge detection loops, regardless of whether they are run on the CPU or offloaded ## Key Implementation Details [Using Fortran DO CONCURRENT for Accelerator Offload](https://www.intel.com/content/www/us/en/developer/articles/technical/using-fortran-do-current-for-accelerator-offload.html) provides more detailed descriptions of each example code, and discusses the relative merits of each approach. -## License -Code samples are licensed under the MIT license. See [License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details. 
- -Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt) - ## Using Visual Studio Code* (Optional) You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations, @@ -61,9 +56,6 @@ After learning how to use the extensions for Intel oneAPI Toolkits, return to th > >For more information on environment variables, see Use the setvars Script for [Linux or macOS](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html). -### Running Samples on the DevCloud -When running a sample in the Intel DevCloud, remember that you must specify the compute node (CPU, GPU, FPGA) as well whether to run in batch or interactive mode. For more information see the Intel® oneAPI Base Toolkit Get Started Guide (https://devcloud.intel.com/oneapi/get-started/base-toolkit/). - ### On a Linux System Run `make` to build and run the sample. Six programs are generated: @@ -119,3 +111,8 @@ OMP_TARGET_OFFLOAD=MANDATORY ./img_seg_omp_gpu -n 12 -o 2 -i 1 -d ### Troubleshooting If an error occurs, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits. [Learn more](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) + +## License +Code samples are licensed under the MIT license. See [License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details. + +Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt)
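
For anyone reading the series without building it, the edge rule shared by both Fortran sources can be tried in isolation. The following is a minimal sketch and not part of the patch: it assumes a fixed 6x6 image holding a single 4x4 block (the program name `edge_rule_sketch` and the test image are made up for illustration) and applies the same DO CONCURRENT loop body as `img_seg_do_concurrent.F90`.

```fortran
! Minimal sketch (not part of the patch): the edge rule from the sample
! applied to a small fixed image so the result can be checked by eye.
program edge_rule_sketch
   implicit none
   integer, parameter :: n = 6
   integer :: image(n, n)
   logical :: edge_mask(n, n)
   integer :: i, j

   ! Hypothetical test image: one 4x4 block of ones in the interior
   image = 0
   image(2:5, 2:5) = 1
   edge_mask = .false.

   ! Same rule as the sample: a nonzero pixel is an edge if it lies on the
   ! image border or has at least one zero pixel in its 3x3 neighborhood
   do concurrent (j = 1:n, i = 1:n, image(i, j) /= 0)
      if (i == 1 .or. i == n .or. j == 1 .or. j == n) then
         edge_mask(i, j) = .true.
      else
         if (any(image(i-1:i+1, j-1:j+1) == 0)) edge_mask(i, j) = .true.
      end if
   end do

   ! Print the mask row by row: T marks an edge pixel, F anything else
   do j = 1, n
      print '(6l2)', (edge_mask(i, j), i = 1, n)
   end do
end program edge_rule_sketch
```

Compiled with `ifx` and no further options, the loop should run sequentially on the CPU, as in the `img_seg_do_conc_cpu_seq` build. Only the outer ring of the block should be flagged; the four interior pixels have no zero neighbor and stay unset.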