Find the longest run of repeated values and its length in each row of a data.table
library(data.table)
dt <- fread("V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
116 116 116 102 96 96 106 116 116 144
114 114 114 114 114 114 121 111 98 108
88 78 78 77 72 96 96 95 95 95
118 77 77 86 139 127 127 103 93 84
154 154 154 121 121 114 111 111 111 111
175 175 125 125 125 125 164 125 125 141
174 174 125 118 117 116 139 116 102 104
95 95 175 175 176 176 139 123 140 141
140 106 174 162 162 169 140 112 112 112
178 178 178 178 116 95 178 178 178 178")
What I'm trying to do is find, for each row, the longest run of consecutive repeated values and report its value and length, like this:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 value length
116 116 116 102 96 96 106 116 116 144 116 3
114 114 114 114 114 114 121 111 98 108 114 6
88 78 78 77 72 96 96 95 95 95 95 3
118 77 77 86 139 127 127 103 93 84 77 2
154 154 154 121 121 114 111 111 111 111 111 4
175 175 125 125 125 125 164 125 125 141 125 4
174 174 125 118 117 116 139 116 102 104 174 2
95 95 175 175 176 176 139 123 140 141 95 2 *
140 106 174 162 162 169 140 112 112 112 112 3
178 178 178 178 116 95 178 178 178 178 178 4
If several runs tie for the longest length (as with 95, 175 and 176 in the row marked *), choose the lowest value.
I think rle is one way to do it, but I don't understand how to use it.
r data.table
edited Nov 22 at 2:28 by Ronak Shah
asked Nov 22 at 2:10 by zell kim
2 Answers
You can convert the table to long format before running rle on each row. Then take the smallest value among the runs with the longest length:
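# Add a row id, reshape to long format, then run rle once per original row
rmax <- melt(dt[, rn := .I], id.vars = "rn")[,
  {
    r <- rle(value)      # runs of identical consecutive values within the row
    m <- max(r$lengths)  # length of the longest run
    # on ties, keep the smallest value among the longest runs
    .(val = min(r$values[r$lengths == m]), len = m)
  },
  by = .(rn)]
# Join the per-row results back onto the original table
dt[rmax, on = .(rn)]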
output:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 rn val len
1: 116 116 116 102 96 96 106 116 116 144 1 116 3
2: 114 114 114 114 114 114 121 111 98 108 2 114 6
3: 88 78 78 77 72 96 96 95 95 95 3 95 3
4: 118 77 77 86 139 127 127 103 93 84 4 77 2
5: 154 154 154 121 121 114 111 111 111 111 5 111 4
6: 175 175 125 125 125 125 164 125 125 141 6 125 4
7: 174 174 125 118 117 116 139 116 102 104 7 174 2
8: 95 95 175 175 176 176 139 123 140 141 8 95 2
9: 140 106 174 162 162 169 140 112 112 112 9 112 3
10: 178 178 178 178 116 95 178 178 178 178 10 178 4
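Since the question mentions not understanding rle, here is a minimal sketch of what it returns for a single row. It is an illustration in base R, not part of the answer's code; the vector x is simply row 8 of the example data typed in by hand, and val and m are names chosen here.

```r
# Row 8 of the example data, where three runs tie for the longest length
x <- c(95, 95, 175, 175, 176, 176, 139, 123, 140, 141)

r <- rle(x)          # r$values: value of each run; r$lengths: length of each run
m <- max(r$lengths)  # length of the longest run
val <- min(r$values[r$lengths == m])  # lowest value among runs of that length

r$values   # 95 175 176 139 123 140 141
r$lengths  # 2 2 2 1 1 1 1
c(val = val, len = m)  # val = 95, len = 2
```

The grouped call above does the same thing once per rn, which is why the data is reshaped to long format first.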
answered Nov 22 at 2:21 by chinsoon12
This might not be the most efficient solution, as it doesn't take advantage of data.table syntax, but here is one method using apply:
library(data.table)
dt$length <- apply(dt, 1, function(x) max(table(rleid(x))))
dt
# V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 length
# 1: 116 116 116 102 96 96 106 116 116 144 3
# 2: 114 114 114 114 114 114 121 111 98 108 6
# 3: 88 78 78 77 72 96 96 95 95 95 3
# 4: 118 77 77 86 139 127 127 103 93 84 2
# 5: 154 154 154 121 121 114 111 111 111 111 4
# 6: 175 175 125 125 125 125 164 125 125 141 4
# 7: 174 174 125 118 117 116 139 116 102 104 2
# 8: 95 95 175 175 176 176 139 123 140 141 2
# 9: 140 106 174 162 162 169 140 112 112 112 3
#10: 178 178 178 178 116 95 178 178 178 178 4
For every row, we calculate the length of the longest run of consecutive identical values. This returns only the length, not the corresponding value.
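As a possible extension of the apply approach, the sketch below returns both the value and the length, breaking ties by the lowest value as the question asks. It is an assumption-laden illustration rather than the answerer's method: longest_run is a hypothetical helper name, and mat holds only rows 4, 8 and 10 of the example data in a plain matrix so the snippet runs in base R.

```r
# Hypothetical helper: lowest value among the longest runs, plus the run length
longest_run <- function(x) {
  r <- rle(x)
  m <- max(r$lengths)
  c(value = min(r$values[r$lengths == m]), length = m)
}

# Rows 4, 8 and 10 of the example data
mat <- rbind(
  c(118, 77, 77, 86, 139, 127, 127, 103, 93, 84),
  c(95, 95, 175, 175, 176, 176, 139, 123, 140, 141),
  c(178, 178, 178, 178, 116, 95, 178, 178, 178, 178)
)

# apply() yields one column per input row here, so transpose the result
res <- t(apply(mat, 1, longest_run))
res
```

With the real table, the same helper could presumably be applied to as.matrix(dt) restricted to the V1 to V10 columns, and the two resulting columns attached afterwards.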
answered Nov 22 at 2:25, edited Nov 22 at 2:27 by Ronak Shah
I can't see how you can really avoid doing nrow * rle calls without getting substantially less clean. – thelatemail, Nov 22 at 2:26