OpenCL incrementing integer after each kernel execution























I have a kernel that I need to execute multiple times (using clEnqueueNDRangeKernel), and one of its arguments is an integer that needs to be incremented after each execution.



Rather than having the host assign an incremented value (using clSetKernelArg) before enqueuing each kernel execution, is there a purely "device-side" way to achieve this, e.g. having the kernel increment a global integer itself once the final work item has run? (I'm still new to OpenCL, so I might be barking up the wrong tree here.)

















      opencl
















      asked yesterday









      Andrew Stephens

























          1 Answer













          It is possible to do this on the kernel side, but I would not, as it may hurt kernel performance. It could be done this way:



          kernel void my_kernel(volatile __global int* counter,
                                __global int* other_data /* , further arguments */)
          {
              // some operations on other_data, etc.

              // Make sure that only one work item per launch increments the counter,
              // to avoid a race condition. Assumes a one-dimensional NDRange with no
              // global offset. A test on get_local_id(0) == 0 would instead pass once
              // per work-group, so the counter would grow by the number of work-groups.
              if (get_global_id(0) == 0)
                  atomic_inc(counter); // atomic, as kernels may run in parallel
          }
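The choice of guard decides how often the counter moves per launch: a test on get_local_id(0) == 0 passes once in every work-group, while a test on get_global_id(0) == 0 passes once per launch. The following is a minimal sketch in plain C that emulates a one-dimensional NDRange on the host to count how many work items would pass each guard. It does not use the OpenCL runtime, and the names `increments_per_launch`, `GUARD_LOCAL_ID_ZERO` and `GUARD_GLOBAL_ID_ZERO` are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Which guard the emulated kernel uses to pick the incrementing work item.
 * Both names are hypothetical, for illustration only. */
enum guard { GUARD_LOCAL_ID_ZERO, GUARD_GLOBAL_ID_ZERO };

/*
 * Emulate one 1-D launch on the host and return how many work items pass
 * the guard, i.e. how many times the counter would be incremented.
 * Assumes global_size is a multiple of local_size and no global offset.
 */
int increments_per_launch(size_t global_size, size_t local_size, enum guard g)
{
    assert(local_size > 0 && global_size % local_size == 0);

    int increments = 0;
    for (size_t gid = 0; gid < global_size; ++gid) {
        /* Value get_local_id(0) would return for this work item. */
        size_t lid = gid % local_size;

        if (g == GUARD_LOCAL_ID_ZERO ? lid == 0 : gid == 0)
            ++increments;
    }
    return increments;
}
```

For 64 work items in work-groups of 16, `increments_per_launch(64, 16, GUARD_LOCAL_ID_ZERO)` returns 4 (one per work-group), whereas `increments_per_launch(64, 16, GUARD_GLOBAL_ID_ZERO)` returns 1. Neither guard waits for the last work item to finish, so other work items in the same launch may still observe either the old or the new counter value.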


          To summarize: rather than adding a branch in which a single work item does the increment while the others waste cycles, I would keep using clSetKernelArg and increment the counter on the host side. Some operations are well suited to a GPU; incrementing a counter is not one of them.

























          • This does not increment the value when the last work item has run. It increments it whenever the work item with global_id in the first dimension happens to finish.
            – Jovasa
            yesterday










          • @Jovasa you are right, updated.
            – doqtor
            yesterday



















          edited yesterday

























          answered yesterday









          doqtor




















































































