subset using percentile for gridded data
I have gridded data with 24249 observations and 963 variables of daily maximum temperature (K). I am looking for a way in R to select all days with maximum temperatures higher than the 90th percentile.
> dim(DailyT)
[1] 24249 963
> DailyT[1:4,1:7]
x y 1988-05-01 1988-05-02 1988-05-03 1988-05-04 1988-05-05
1 34.000 33 291.7603 291.8044 291.6158 292.9659 293.7032
2 34.125 33 291.7240 291.7951 291.5439 292.9451 293.7017
3 34.250 33 291.6884 291.7866 291.4721 292.9250 293.7001
4 34.375 33 291.6521 291.7781 291.4010 292.9049 293.6986
I tried this, but it did not work:
df<- DailyT[DailyT[,3:963] <= quantile(DailyT[,3:963],.9, na.rm = T, type = 6) ]
asked Nov 22 at 8:55
Ali
Maybe you find this helpful.
– A. Suliman
Nov 22 at 9:07
1 Answer
First, add an id column so the rows can be identified later. Then calculate the 90% quantile of all temperature values. Finally, subset the data to the rows in which any cell reaches or exceeds q.
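DailyT <- cbind(id = rownames(DailyT), DailyT) # id column to identify rows later
# 90% quantile over all temperature cells (columns 1:3 are id, x and y)
q <- quantile(as.matrix(DailyT[, -(1:3)]), .9, na.rm = TRUE, type = 6) # 293.7015 for the sample data below
# keep the rows in which any temperature reaches or exceeds q
DailyT.q <- DailyT[which(sapply(seq_len(nrow(DailyT)), function(x) any(DailyT[x, -(1:3)] >= q))), ]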
Yields
> DailyT.q
id x y X1988.05.01 X1988.05.02 X1988.05.03 X1988.05.04 X1988.05.05
1 1 34.000 33 291.7603 291.8044 291.6158 292.9659 293.7032
2 2 34.125 33 291.7240 291.7951 291.5439 292.9451 293.7017
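With 24249 rows, looping over the rows one at a time may be slow. A possible vectorised variant of the same idea is sketched below. It is only a sketch: it rebuilds the sample data without the id column, assumes that columns 1:2 are x and y, and the names temps and hot_rows are made up for illustration.

```r
# Sample data from the "Data" section of this answer (4 grid cells, 5 days).
DailyT <- data.frame(
  x = c(34, 34.125, 34.25, 34.375),
  y = c(33L, 33L, 33L, 33L),
  X1988.05.01 = c(291.7603, 291.724, 291.6884, 291.6521),
  X1988.05.02 = c(291.8044, 291.7951, 291.7866, 291.7781),
  X1988.05.03 = c(291.6158, 291.5439, 291.4721, 291.401),
  X1988.05.04 = c(292.9659, 292.9451, 292.925, 292.9049),
  X1988.05.05 = c(293.7032, 293.7017, 293.7001, 293.6986)
)

# Temperatures only, as a numeric matrix (assumes columns 1:2 are x and y).
temps <- as.matrix(DailyT[, -(1:2)])

# One global threshold over every cell and every day.
q <- quantile(temps, 0.9, na.rm = TRUE, type = 6)

# TRUE for each grid cell with at least one day at or above the threshold.
hot_rows <- rowSums(temps >= q, na.rm = TRUE) > 0

DailyT.q <- DailyT[hot_rows, ]
```

On the sample data this keeps rows 1 and 2, the same rows as the output above. The comparison is done once on the whole matrix, so no per-row function call is needed.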
Edit:
To get the quantile row-wise (one value per grid cell), use apply():
q90 <- apply(DailyT[, 4:8], MARGIN = 1, quantile, .9, na.rm = TRUE, type = 6)
> data.frame(DailyT, q90=q90)
id x y X1988.05.01 X1988.05.02 X1988.05.03 X1988.05.04 X1988.05.05 q90
1 1 34.000 33 291.7603 291.8044 291.6158 292.9659 293.7032 293.7032
2 2 34.125 33 291.7240 291.7951 291.5439 292.9451 293.7017 293.7017
3 3 34.250 33 291.6884 291.7866 291.4721 292.9250 293.7001 293.7001
4 4 34.375 33 291.6521 291.7781 291.4010 292.9049 293.6986 293.6986
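Once each row has its own threshold, the days above it can be listed per grid cell. The following is a minimal sketch rather than a definitive solution: days_above_row_quantile is a hypothetical helper name, the data are rebuilt without the id column, and columns 1:2 are assumed to be x and y.

```r
# Sample data from the "Data" section of this answer (4 grid cells, 5 days).
DailyT <- data.frame(
  x = c(34, 34.125, 34.25, 34.375),
  y = c(33L, 33L, 33L, 33L),
  X1988.05.01 = c(291.7603, 291.724, 291.6884, 291.6521),
  X1988.05.02 = c(291.8044, 291.7951, 291.7866, 291.7781),
  X1988.05.03 = c(291.6158, 291.5439, 291.4721, 291.401),
  X1988.05.04 = c(292.9659, 292.9451, 292.925, 292.9049),
  X1988.05.05 = c(293.7032, 293.7017, 293.7001, 293.6986)
)

# Hypothetical helper: list every (cell, day) pair whose temperature is
# strictly above that cell's own quantile. Assumes columns 1:2 are x and y.
days_above_row_quantile <- function(df, probs = 0.9) {
  temps <- as.matrix(df[, -(1:2)])
  # One threshold per row (grid cell).
  thr <- unname(apply(temps, 1, quantile, probs = probs, na.rm = TRUE, type = 6))
  # thr is recycled down the columns, so temps[i, j] is compared with thr[i].
  hits <- which(temps > thr, arr.ind = TRUE)
  data.frame(
    x = df$x[hits[, "row"]],
    y = df$y[hits[, "row"]],
    day = colnames(temps)[hits[, "col"]],
    temp = temps[hits],
    threshold = thr[hits[, "row"]]
  )
}

# probs = 0.5 is used here only because the sample has five days per cell.
above <- days_above_row_quantile(DailyT, probs = 0.5)
```

With only five days per cell, the type 6 estimate of the 90% quantile coincides with the row maximum (as the q90 column above shows), so no day can be strictly higher than it. The sketch therefore uses probs = 0.5 on the sample; on the full data with 961 days per cell the default probs = 0.9 should be the relevant choice.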
Data
> dput(DailyT)
structure(list(x = c(34, 34.125, 34.25, 34.375), y = c(33L, 33L,
33L, 33L), X1988.05.01 = c(291.7603, 291.724, 291.6884, 291.6521
), X1988.05.02 = c(291.8044, 291.7951, 291.7866, 291.7781), X1988.05.03 = c(291.6158,
291.5439, 291.4721, 291.401), X1988.05.04 = c(292.9659, 292.9451,
292.925, 292.9049), X1988.05.05 = c(293.7032, 293.7017, 293.7001,
293.6986)), class = "data.frame", row.names = c(NA, -4L))
edited Nov 24 at 11:24
answered Nov 22 at 9:21
jay.sf
Thanks, I need to calculate the 90% quantile of each row not of all data.
– Ali
Nov 24 at 11:09
Aha, please see my edit.
– jay.sf
Nov 24 at 11:24
Worked.... Many thanks
– Ali
Nov 24 at 13:32
Very good! - Please mark the question as answered when you're satisfied with the given answer and win +2 reputation. This stops people spending time on answering a question that has already been answered.
– jay.sf
Nov 24 at 13:58