Find the value and length of the longest run of repeated values in each row of a data.table











library(data.table)

dt <- fread("V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
116 116 116 102 96 96 106 116 116 144
114 114 114 114 114 114 121 111 98 108
88 78 78 77 72 96 96 95 95 95
118 77 77 86 139 127 127 103 93 84
154 154 154 121 121 114 111 111 111 111
175 175 125 125 125 125 164 125 125 141
174 174 125 118 117 116 139 116 102 104
95 95 175 175 176 176 139 123 140 141
140 106 174 162 162 169 140 112 112 112
178 178 178 178 116 95 178 178 178 178")


For each row, I'm trying to find the longest run of consecutive identical values and report both its value and its length, like this:



 V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 value length
116 116 116 102  96  96 106 116 116 144   116      3
114 114 114 114 114 114 121 111  98 108   114      6
 88  78  78  77  72  96  96  95  95  95    95      3
118  77  77  86 139 127 127 103  93  84    77      2
154 154 154 121 121 114 111 111 111 111   111      4
175 175 125 125 125 125 164 125 125 141   125      4
174 174 125 118 117 116 139 116 102 104   174      2
 95  95 175 175 176 176 139 123 140 141    95      2 *
140 106 174 162 162 169 140 112 112 112   112      3
178 178 178 178 116  95 178 178 178 178   178      4


If several runs share the longest length (95, 175 and 176 in the row marked *), choose the lowest value.

I think rle() is one way to do this, but I don't understand how to use it here.










      r data.table














      asked Nov 22 at 2:10 by zell kim; edited Nov 22 at 2:28 by Ronak Shah
























          2 Answers
                You can melt the table into long format and then run rle() on the values of each original row. Among the runs that reach the maximum length, take the smallest value:



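                # Add a row id, reshape to long format, then run rle() once per original row
                rmax <- melt(dt[, rn := .I], id.vars = "rn")[, {
                    r <- rle(value)        # runs of consecutive identical values
                    m <- max(r$lengths)    # length of the longest run
                    # lowest value among the runs that reach the maximum length
                    .(val = min(r$values[r$lengths == m]), len = m)
                }, by = .(rn)]

                # Join the results onto dt; dt[rmax] keeps the original columns first
                dt[rmax, on = .(rn)]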


                output:



                     V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 rn val len
                 1: 116 116 116 102  96  96 106 116 116 144  1 116   3
                 2: 114 114 114 114 114 114 121 111  98 108  2 114   6
                 3:  88  78  78  77  72  96  96  95  95  95  3  95   3
                 4: 118  77  77  86 139 127 127 103  93  84  4  77   2
                 5: 154 154 154 121 121 114 111 111 111 111  5 111   4
                 6: 175 175 125 125 125 125 164 125 125 141  6 125   4
                 7: 174 174 125 118 117 116 139 116 102 104  7 174   2
                 8:  95  95 175 175 176 176 139 123 140 141  8  95   2
                 9: 140 106 174 162 162 169 140 112 112 112  9 112   3
                10: 178 178 178 178 116  95 178 178 178 178 10 178   4
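
To make the rle() step more concrete, here is a minimal sketch on a single row of the question's data (row 8, where three runs tie for the longest length). The variable names `candidates` and `val` are only illustrative and do not come from the answer above:

```r
# Row 8 of the example data: the runs of 95, 175 and 176 all have length 2
x <- c(95, 95, 175, 175, 176, 176, 139, 123, 140, 141)

r <- rle(x)
r$values    # one entry per run: 95 175 176 139 123 140 141
r$lengths   # length of each run: 2 2 2 1 1 1 1

m <- max(r$lengths)                      # longest run length
candidates <- r$values[r$lengths == m]   # every value whose run has that length
val <- min(candidates)                   # tie broken by taking the lowest value
```

rle() returns the run values and run lengths as two parallel vectors, which is why indexing `r$values` by `r$lengths == m` picks out exactly the tied candidates.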
















                answered Nov 22 at 2:21 by chinsoon12
























                    This might not be the most efficient solution, since it doesn't take advantage of data.table syntax, but here is one method using apply():



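                    library(data.table)
                    # rleid() numbers each run of consecutive identical values in the row;
                    # table() counts the size of each run and max() keeps the largest
                    dt$length <- apply(dt, 1, function(x) max(table(rleid(x))))
                    dt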

                    #      V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 length
                    #  1: 116 116 116 102  96  96 106 116 116 144      3
                    #  2: 114 114 114 114 114 114 121 111  98 108      6
                    #  3:  88  78  78  77  72  96  96  95  95  95      3
                    #  4: 118  77  77  86 139 127 127 103  93  84      2
                    #  5: 154 154 154 121 121 114 111 111 111 111      4
                    #  6: 175 175 125 125 125 125 164 125 125 141      4
                    #  7: 174 174 125 118 117 116 139 116 102 104      2
                    #  8:  95  95 175 175 176 176 139 123 140 141      2
                    #  9: 140 106 174 162 162 169 140 112 112 112      3
                    # 10: 178 178 178 178 116  95 178 178 178 178      4


                    For every row, this calculates the length of the longest run of consecutive identical values. It returns only the length, not the value that is repeated.
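
If the repeated value is needed as well, the same apply() idea could be extended roughly as follows. This is only a sketch: `longest_run` is a hypothetical helper name (not part of data.table or base R), and it uses a plain matrix holding three rows copied from the question so that it runs on its own:

```r
# Hypothetical helper: value and length of the longest run in a vector,
# with ties broken by the lowest value
longest_run <- function(x) {
  r <- rle(x)
  m <- max(r$lengths)
  c(value = min(r$values[r$lengths == m]), length = m)
}

# Rows 1, 4 and 8 of the question's data
mat <- rbind(
  c(116, 116, 116, 102,  96,  96, 106, 116, 116, 144),
  c(118,  77,  77,  86, 139, 127, 127, 103,  93,  84),
  c( 95,  95, 175, 175, 176, 176, 139, 123, 140, 141)
)

# apply() over rows returns one column per input row, so transpose the result
res <- t(apply(mat, 1, longest_run))
res
```

Each row of `res` then holds the value and length for the corresponding input row, and those two columns can be attached to the original table.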















                    answered Nov 22 at 2:25 by Ronak Shah; edited Nov 22 at 2:27








                    • 1




                      I can't see how you can really avoid doing nrow * rle calls without getting substantially less clean.
                      – thelatemail
                      Nov 22 at 2:26
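
For comparison, one way to avoid calling rle() once per row is to compute all run ids in a single rleid() pass over the long-format table. The sketch below is an assumption about how that could look, not something proposed in the thread, and as the comment suggests it is less compact; the small three-row table and the names `long`, `runs` and `res` are illustrative:

```r
library(data.table)

dt <- fread("V1 V2 V3 V4 V5 V6
95 95 175 175 176 176
178 178 178 116 178 178
88 78 78 95 95 95")

# Long format, ordered by row id and then by original column position
long <- melt(dt[, rn := .I], id.vars = "rn")
setorder(long, rn, variable)

# rleid(rn, value) starts a new run whenever the row or the value changes,
# so runs never spill across row boundaries
runs <- long[, .(len = .N), by = .(rn, run = rleid(rn, value), value)]

# Within each row put the longest run first, breaking ties by lowest value,
# then keep the first run per row
setorder(runs, rn, -len, value)
res <- unique(runs, by = "rn")[, .(rn, val = value, len)]
res
```

Whether this is actually faster than the per-row rle() version would need to be measured on data of realistic size.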













