A function to fill in a column with NA of the same type
I have a data frame with many columns of different types. I would like to replace each column with NA of the corresponding class.
For example:
library(dplyr)  # provides data_frame(); tibble() is its current replacement
df = data_frame(x = c(1,2,3), y = c("a", "b", "c"))
df[, 1:2] <- NA
yields a data frame with two logical columns, rather than numeric and character.
I know I can tell R:
df[,1] = as.numeric(NA)
df[,2] = as.character(NA)
But how do I do this collectively in a loop for all columns with all possible types of NA?
r dplyr na
asked 1 hour ago by Omry Atia (edited 1 hour ago by zx8754)
Good question +1, but why does this matter?
– Tim Biegeleisen
1 hour ago
It's a very weird problem, I later need to join the data frame with another frame of the original type...
– Omry Atia
1 hour ago
But why? Please give us more context; this seems like a pointless (but fun) step.
– zx8754
41 mins ago
5 Answers
Accepted answer (5 votes)
You can use this "trick":
df[1:nrow(df), 1] <- NA
df[1:nrow(df), 2] <- NA
The row index [1:nrow(df), ] tells R to replace the values inside each column rather than the column itself, so the logical NA is coerced to the column's original type before it replaces the existing values.
Also, if you have many columns to replace and the data frame has many rows, I suggest storing the row indexes and reusing them:
rowIdxs <- 1:nrow(df)
df[rowIdxs, 1] <- NA
df[rowIdxs, 2] <- NA
df[rowIdxs, 3] <- NA
...
As cleverly suggested by @RonakShah, you can also use:
df[TRUE, 1] <- NA
df[TRUE, 2] <- NA
...
As pointed out by @Cath, both methods still work when you select more than one column, e.g.:
df[TRUE, 1:3] <- NA
# or
df[1:nrow(df), 1:3] <- NA
answered 1 hour ago by digEmAll (edited 38 mins ago)
This doesn't seem to work... df is still logical :(
– Omry Atia
1 hour ago
@OmryAtia : edited. it should work now ;)
– digEmAll
1 hour ago
Awesome... so simple :)
– Omry Atia
1 hour ago
Why not directly df[TRUE, 1:2] <- NA?
– Cath
44 mins ago
@Cath: sure, added in the answer, thanks !
– digEmAll
38 mins ago
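A minimal sketch of why the row index matters in the accepted answer. It assumes a base R data.frame (with stringsAsFactors = FALSE) in place of the tibble from the question so that it runs without extra packages; the names whole and kept are illustrative only.

```r
# Base R stand-in for the question's data (assumption: no tibble needed here).
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"), stringsAsFactors = FALSE)

# No row index: each selected column is replaced outright by a logical NA vector.
whole <- df
whole[, 1:2] <- NA
sapply(whole, class)   # both columns become "logical"

# With a row index: NA is assigned into the existing columns and coerced.
kept <- df
kept[TRUE, 1:2] <- NA
sapply(kept, class)    # x stays "numeric", y stays "character"
```

Comparing the two results side by side is a quick way to confirm the behaviour on your own data before relying on it in a join.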
Answer (6 votes)
Another solution that applies to all columns is to select the non-NA cells and replace them with NA, i.e.
df[!is.na(df)] <- NA
which gives:
# A tibble: 3 x 2
      x y
  <dbl> <chr>
1    NA <NA>
2    NA <NA>
3    NA <NA>
answered 1 hour ago by Sotos
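To see what the mask in this answer does, here is a small sketch, again assuming a base R data.frame rather than a tibble. is.na(df) returns a logical matrix with one cell per value, so !is.na(df) marks the cells that currently hold a value. Assigning through that matrix changes values cell by cell and leaves the column types alone. The extra integer column z and the pre-existing NA are added only for illustration.

```r
# Illustrative data: one value is already missing, and z is an integer column.
df <- data.frame(x = c(1, NA, 3), y = c("a", "b", "c"), z = 4:6,
                 stringsAsFactors = FALSE)

mask <- !is.na(df)   # logical matrix, TRUE where a value is present
dim(mask)            # 3 rows by 3 columns

df[mask] <- NA       # element-wise replacement; columns keep their types
sapply(df, class)    # numeric, character, integer
```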
Answer (3 votes)
Using dplyr::na_if:
library(dplyr)
df %>%
  mutate(x = na_if(x, x),
         y = na_if(y, y))
# # A tibble: 3 x 2
#       x y
#   <dbl> <chr>
# 1    NA NA
# 2    NA NA
# 3    NA NA
If we want to set only a subset of columns to NA:
# data frame with an extra column that stays unchanged
df = data_frame(x = c(1,2,3), y = c("a", "b", "c"), z = c(4:6))
df %>%
  mutate_at(vars(x, y), funs(na_if(.,.)))
# # A tibble: 3 x 3
#       x y         z
#   <dbl> <chr> <int>
# 1    NA NA        4
# 2    NA NA        5
# 3    NA NA        6
answered 1 hour ago by zx8754 (edited 1 hour ago)
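funs() and data_frame() have since been deprecated, and mutate_at() has been superseded by across(). A sketch of the same na_if idea in the newer style, assuming dplyr 1.0.0 or later is installed:

```r
library(dplyr)

df <- tibble(x = c(1, 2, 3), y = c("a", "b", "c"), z = 4:6)

# na_if(col, col) compares each value with itself, so every value becomes NA
# of the column's own type; z is not selected and stays unchanged.
out <- df %>%
  mutate(across(c(x, y), ~ na_if(.x, .x)))

sapply(out, class)
```

Replacing c(x, y) with everything() would apply the same replacement to every column.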
Answer (2 votes)
Another way to change all columns at once while keeping the variables' classes:
# df[] <- keeps df a data frame and recycles each NA to the full column length;
# a plain df <- lapply(...) would return a list of length-1 vectors instead
df[] <- lapply(df, function(x) {type <- class(x); x <- NA; class(x) <- type; x})
df
# A tibble: 3 x 2
#       x y
#   <dbl> <chr>
# 1    NA <NA>
# 2    NA <NA>
# 3    NA <NA>
As @digEmAll noted in the comments, there is a similar but shorter way:
df[] <- lapply(df, function(x) as(NA, class(x)))
answered 54 mins ago by Cath (edited 49 mins ago)
Also lapply(df, function(x) as(NA, class(x))) should work
– digEmAll
52 mins ago
@digEmAll indeed and much shorter thanks!
– Cath
51 mins ago
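as(NA, class(x)) relies on class(x) being a single class name that as() can coerce to. The following variant sketch is not taken from the thread: it sidesteps that assumption by indexing each column with NA, which returns missing values of the column's own type and class. na_like is a hypothetical helper name, and the Date column is added only for illustration.

```r
# Hypothetical helper: blank every column while keeping its class.
# Indexing a vector with NA_integer_ yields NA values of that vector's type.
na_like <- function(df) {
  df[] <- lapply(df, function(col) col[rep(NA_integer_, length(col))])
  df
}

df <- data.frame(
  x = c(1, 2, 3),
  y = c("a", "b", "c"),
  d = as.Date(c("2018-12-01", "2018-12-02", "2018-12-03")),
  stringsAsFactors = FALSE
)

out <- na_like(df)
sapply(out, class)   # numeric, character, Date
```

Because the result keeps the original column names and classes, it can be joined or row-bound with a frame of the original types, which is the use case mentioned in the question's comments.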
Answer (0 votes)
Using bind_cols() from dplyr you can also do:
df <- data_frame(x = c(1,2,3), y = c("a", "b", "c"))
classes <- sapply(df, class)
df[,1:2] <- NA
# colnames(df), not colnames(x): x only exists inside the anonymous function
bind_cols(lapply(colnames(df), function(x){eval(parse(text=paste0("as.", classes[names(classes[x])], "(", df[,x],")")))}))
     V1 V2
  <dbl> <chr>
1    NA NA
2    NA NA
3    NA NA
Please note that this changes the column names.
answered 1 hour ago by alex_555
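The same store-the-classes idea can be sketched without eval(parse()) and without losing the column names. This is an illustration, not the answer's code. It assumes a base R data.frame and that every column has a single class with a matching as.<class>() function (as.numeric, as.character, and so on).

```r
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"), stringsAsFactors = FALSE)

classes <- sapply(df, class)   # remember each column's class
df[, 1:2] <- NA                # columns are now logical

# Look up as.numeric(), as.character(), ... by name and apply it per column.
# Assigning into df[] keeps the data frame and its column names.
df[] <- Map(function(col, cls) match.fun(paste0("as.", cls))(col), df, classes)

sapply(df, class)   # numeric, character
names(df)           # "x" "y"
```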