Apache Storm: KafkaSpout has a lot of failed tuples due to timeout

My word-count topology uses a processing-time window (processtime-window). The input rate is 2000 tuples/s, and the count_bolt window is 3 s long with a 1 s lag. The Storm UI shows a large number of failed tuples, and the logs show that they fail because of timeouts. I have set TOPOLOGY_MAX_SPOUT_PENDING to 10000 and topology.message.timeout.secs to 60. Judging by the Capacity and Execute latency values in the screenshot, the bolt parallelism should be sufficient.

Q: How should I adjust the parameters? For example, TOPOLOGY_MAX_SPOUT_PENDING, topology.message.timeout.secs, or something else?

This is a screenshot of my Storm UI:
[image: Storm UI]










apache-kafka apache-storm

asked Nov 23 '18 at 7:34 by Cheng Jiang

1 Answer

I would start by lowering topology.max.spout.pending. Once the topology is no longer wasting work on tuples that have already timed out, it should be easier to tell where the bottleneck is.
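
A minimal sketch of that first step, assuming the two settings are supplied as topology configuration. Storm's `Config` is a map keyed by the strings `topology.max.spout.pending` and `topology.message.timeout.secs`, so a plain `HashMap` stands in for it here to keep the example runnable without Storm on the classpath. The class name and the value 1000 are hypothetical; 1000 is only a possible starting point, not a figure from the answer.

```java
import java.util.HashMap;
import java.util.Map;

final class SpoutPendingConfig {
    // Key strings as used in the question; Storm's Config class uses the same keys.
    static final String MAX_SPOUT_PENDING = "topology.max.spout.pending";
    static final String MESSAGE_TIMEOUT_SECS = "topology.message.timeout.secs";

    /** Builds topology settings with the given pending limit and message timeout. */
    static Map<String, Object> lowered(int maxSpoutPending, int timeoutSecs) {
        if (maxSpoutPending <= 0 || timeoutSecs <= 0) {
            throw new IllegalArgumentException("both values must be positive");
        }
        Map<String, Object> conf = new HashMap<>();
        conf.put(MAX_SPOUT_PENDING, maxSpoutPending);
        conf.put(MESSAGE_TIMEOUT_SECS, timeoutSecs);
        return conf;
    }

    public static void main(String[] args) {
        // 10000 -> 1000 is a hypothetical first step; 60 s is the asker's current timeout.
        Map<String, Object> conf = lowered(1000, 60);
        System.out.println(MAX_SPOUT_PENDING + " = " + conf.get(MAX_SPOUT_PENDING));
        System.out.println(MESSAGE_TIMEOUT_SECS + " = " + conf.get(MESSAGE_TIMEOUT_SECS));
    }
}
```

With Storm available, the same values would normally be set through `Config.setMaxSpoutPending` and `Config.setMessageTimeoutSecs` before the topology is submitted.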



Keep in mind that capacity and execute latency only account for the time spent inside execute for each tuple.



As I recall, the Kafka bolt doesn't ack a tuple before execute returns. Instead, it hands the tuple to the producer and acks it from a producer callback, which can run after execute has returned. As a result, the actual time spent processing tuples in a Kafka bolt isn't reflected in the capacity or execute latency. The process latency shows the actual time between a tuple arriving at execute and being acked, and in your screenshot it is pretty high.
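
The difference between the two latencies can be reproduced without Storm. In the sketch below, a single-threaded scheduler stands in for the Kafka producer and a fixed delay stands in for the time until the producer callback fires; both are assumptions for illustration, not the Kafka bolt's actual implementation, and `AsyncAckLatencyDemo` is a hypothetical name. The simulated execute returns right after handing the tuple off, while the ack is recorded only when the callback runs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class AsyncAckLatencyDemo {
    /** Returns {executeNanos, processNanos} for one tuple acked from a delayed callback. */
    static long[] measure(long callbackDelayMillis) {
        ScheduledExecutorService producer = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch acked = new CountDownLatch(1);
        AtomicLong ackTime = new AtomicLong();
        try {
            long start = System.nanoTime();
            // Simulated execute: hand the tuple to the stand-in producer and return.
            producer.schedule(() -> {
                ackTime.set(System.nanoTime()); // the ack happens in the callback
                acked.countDown();
            }, callbackDelayMillis, TimeUnit.MILLISECONDS);
            long executeEnd = System.nanoTime();

            acked.await(); // wait until the callback has acked the tuple
            return new long[] {executeEnd - start, ackTime.get() - start};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for ack", e);
        } finally {
            producer.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] r = measure(50);
        System.out.printf("execute latency: %.3f ms%n", r[0] / 1e6);
        System.out.printf("process latency: %.3f ms%n", r[1] / 1e6);
    }
}
```

The second number is at least the callback delay, and nothing in the first number hints at it, which is why a bolt that acks asynchronously can look idle by capacity while its tuples still take a long time to complete.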



Your count_bolt process latency is also high, so check whether that bolt is also buffering tuples before acking them.
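
A rough way to sanity-check the numbers from the question is Little's law: in steady state, the number of tuples in flight equals throughput multiplied by the time each tuple spends in flight. The sketch below applies it to the question's values (2000 tuples/s, max spout pending 10000, 60 s timeout). It assumes a single spout task and steady state, neither of which the question confirms, and the class and method names are hypothetical.

```java
final class PendingArithmetic {
    /** Average seconds a tuple stays pending when `pending` tuples are in flight. */
    static double secondsInFlight(long pending, double tuplesPerSec) {
        return pending / tuplesPerSec;
    }

    /** Ack rate at which tuples in a full pending buffer wait `timeoutSecs` on average. */
    static double throughputAtTimeout(long pending, double timeoutSecs) {
        return pending / timeoutSecs;
    }

    public static void main(String[] args) {
        // If acks kept up with the 2000 tuples/s input, 10000 pending tuples
        // would correspond to 5 s in flight per tuple.
        System.out.println(secondsInFlight(10_000, 2_000.0));
        // Tuples wait the full 60 s only when acks arrive at roughly 167 tuples/s.
        System.out.println(throughputAtTimeout(10_000, 60.0));
    }
}
```

Under those assumptions, timeouts at 60 s suggest that tuples are being acked far more slowly than they arrive, which points to a downstream bottleneck and not to a timeout that is too short.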






• Thanks. My count_bolt uses processtime-window, so each tuple is buffered for 3 seconds. So should I adjust TOPOLOGY_MAX_SPOUT_PENDING upward, from small to large? Do I need to increase topology.message.timeout.secs?
  – Cheng Jiang, Nov 24 '18 at 7:51

• I don't know what processtime-window is. No, I would lower max spout pending. I doubt increasing the timeout would help you.
  – Stig Rohde Døssing, Nov 24 '18 at 8:18











answered Nov 23 '18 at 13:18 by Stig Rohde Døssing