find longest length and value in repetitive sequence in data.table
library(data.table)
dt <- fread("V1  V2  V3  V4  V5  V6  V7  V8  V9 V10
            116 116 116 102  96  96 106 116 116 144
            114 114 114 114 114 114 121 111  98 108
             88  78  78  77  72  96  96  95  95  95
            118  77  77  86 139 127 127 103  93  84
            154 154 154 121 121 114 111 111 111 111
            175 175 125 125 125 125 164 125 125 141
            174 174 125 118 117 116 139 116 102 104
             95  95 175 175 176 176 139 123 140 141
            140 106 174 162 162 169 140 112 112 112
            178 178 178 178 116  95 178 178 178 178")
What I'm trying to do is find, for each row, the value and the length of the longest run of consecutive repeated values, like this:
         V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 value length
        116 116 116 102  96  96 106 116 116 144   116      3
        114 114 114 114 114 114 121 111  98 108   114      6
         88  78  78  77  72  96  96  95  95  95    95      3
        118  77  77  86 139 127 127 103  93  84    77      2
        154 154 154 121 121 114 111 111 111 111   111      4
        175 175 125 125 125 125 164 125 125 141   125      4
        174 174 125 118 117 116 139 116 102 104   174      2
     *   95  95 175 175 176 176 139 123 140 141    95      2
        140 106 174 162 162 169 140 112 112 112   112      3
        178 178 178 178 116  95 178 178 178 178   178      4
If several runs tie for the longest length (as with 95, 175, and 176 in the row marked *), choose the lowest value.
I think rle is one way to do this, but I don't understand how to use it.
r data.table
edited Nov 22 at 2:28 by Ronak Shah
asked Nov 22 at 2:10 by zell kim
2 Answers
You can melt the data into long format, run rle on each row's values, and then take the smallest value among the runs that have the longest length:
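library(data.table)

# Add a row id, reshape to long format, then run rle() once per row.
rmax <- melt(dt[, rn := .I], id.vars = "rn")[,
    {
        r <- rle(value)        # runs of consecutive equal values in this row
        m <- max(r$lengths)    # length of the longest run
        # on ties, keep the smallest value among the longest runs
        .(val = min(r$values[r$lengths == m]), len = m)
    },
    by = .(rn)]
# Join the per-row results onto dt so the original columns come first.
dt[rmax, on = .(rn)]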
output:
     V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 rn val len
 1: 116 116 116 102  96  96 106 116 116 144  1 116   3
 2: 114 114 114 114 114 114 121 111  98 108  2 114   6
 3:  88  78  78  77  72  96  96  95  95  95  3  95   3
 4: 118  77  77  86 139 127 127 103  93  84  4  77   2
 5: 154 154 154 121 121 114 111 111 111 111  5 111   4
 6: 175 175 125 125 125 125 164 125 125 141  6 125   4
 7: 174 174 125 118 117 116 139 116 102 104  7 174   2
 8:  95  95 175 175 176 176 139 123 140 141  8  95   2
 9: 140 106 174 162 162 169 140 112 112 112  9 112   3
10: 178 178 178 178 116  95 178 178 178 178 10 178   4
answered Nov 22 at 2:21 by chinsoon12
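To see what rle contributes to the answer above, here is a minimal sketch on a single row, using row 8 of the question's data (the row where three runs tie). The variable names `x`, `r`, `m`, and `longest_value` are only illustrative.

```r
# Row 8 of the question's data: 95, 175, and 176 each repeat twice.
x <- c(95, 95, 175, 175, 176, 176, 139, 123, 140, 141)

r <- rle(x)
r$values    # 95 175 176 139 123 140 141
r$lengths   # 2 2 2 1 1 1 1

m <- max(r$lengths)                             # length of the longest run
longest_value <- min(r$values[r$lengths == m])  # smallest value among tied runs
c(value = longest_value, length = m)
```

Because rle reports one entry per run rather than per distinct value, a value that reappears later in the row (such as 116 in row 1) is counted as a separate run.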
This may not be the most efficient solution, since it doesn't take advantage of data.table syntax, but here is one method using apply:
library(data.table)
dt$length <- apply(dt, 1, function(x) max(table(rleid(x))))
dt
#     V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 length
# 1: 116 116 116 102  96  96 106 116 116 144      3
# 2: 114 114 114 114 114 114 121 111  98 108      6
# 3:  88  78  78  77  72  96  96  95  95  95      3
# 4: 118  77  77  86 139 127 127 103  93  84      2
# 5: 154 154 154 121 121 114 111 111 111 111      4
# 6: 175 175 125 125 125 125 164 125 125 141      4
# 7: 174 174 125 118 117 116 139 116 102 104      2
# 8:  95  95 175 175 176 176 139 123 140 141      2
# 9: 140 106 174 162 162 169 140 112 112 112      3
#10: 178 178 178 178 116  95 178 178 178 178      4
For every row, this computes the length of the longest run of consecutive equal values. Note that it returns only the length, not the value.
answered Nov 22 at 2:25 by Ronak Shah, edited Nov 22 at 2:27
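If the value is needed as well as the length, the same apply pattern can be combined with rle. The following is only a sketch under assumptions: `longest_run` is a hypothetical helper name, the data are the first six columns of rows 1 and 8 from the question, and a plain data.frame is used so the example runs without extra packages. apply converts its input to a matrix, so the same call should work on a data.table.

```r
# Hypothetical helper: value and length of the longest run in one vector,
# taking the smallest value when several runs tie for the longest length.
longest_run <- function(x) {
  r <- rle(x)
  m <- max(r$lengths)
  c(value = min(r$values[r$lengths == m]), length = m)
}

# First six columns of rows 1 and 8 from the question's data.
df <- data.frame(V1 = c(116, 95),  V2 = c(116, 95),  V3 = c(116, 175),
                 V4 = c(102, 175), V5 = c(96, 176),  V6 = c(96, 176))

# apply() returns one column per input row, so transpose before binding.
res <- t(apply(df, 1, longest_run))
out <- cbind(df, res)
out
```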
 
 
I can't see how you can really avoid doing nrow * rle calls without getting substantially less clean. – thelatemail, Nov 22 at 2:26
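The trade-off in the comment above can be illustrated with a rough base R sketch that finds all runs in one pass instead of calling rle once per row. Everything here is an assumption for illustration: the small matrix `m` holds the first six columns of rows 1 and 8 from the question, and the names `v`, `rn`, `start`, `run`, `runs`, and `res` are made up. In data.table, rleid(rn, v) could presumably replace the cumsum step.

```r
# First six columns of rows 1 and 8 from the question's data.
m <- rbind(c(116, 116, 116, 102,  96,  96),
           c( 95,  95, 175, 175, 176, 176))

v  <- as.vector(t(m))                         # values in row-major order
rn <- rep(seq_len(nrow(m)), each = ncol(m))   # row number of each value

# A new run starts whenever the row or the value changes.
start <- c(TRUE, rn[-1] != rn[-length(rn)] | v[-1] != v[-length(v)])
run   <- cumsum(start)                        # run id for every value

# One record per run: its row, its value, and its length.
runs <- data.frame(rn = rn[start], value = v[start], len = tabulate(run))

# Within each row, sort the longest run first and break ties by lowest value,
# then keep the first record per row.
runs <- runs[order(runs$rn, -runs$len, runs$value), ]
res  <- runs[!duplicated(runs$rn), ]
res
```

This avoids the per-row calls but needs noticeably more bookkeeping than the per-row rle versions, which is the point the comment makes.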