“Incorrect number of dimensions” error, help me understand why
Organization of this question:
I. Background
II. The Problem/Question
III. Steps Taken to Make this Question Good
IV. Update: the output of head(x.path) and dput(x.path)
I. Background
I am customizing/adapting the e-mail classification code from the O'Reilly book "Machine Learning for Hackers" (Chapter 3). That code and its accompanying data can be found here: https://github.com/johnmyleswhite/ML_for_Hackers/tree/master/03-Classification
II. The Problem/Question
One of the main functions in that code is called get.msg(). The original function is
    get.msg <- function(path)
    {
      con <- file(path, open = "rt", encoding = "latin1")
      text <- readLines(con)
      # The message always begins after the first full line break
      msg <- text[seq(which(text == "")[1] + 1, length(text), 1)]
      close(con)
      return(paste(msg, collapse = "\n"))
    }
My data is different in a number of ways though, so I have to edit this quite a bit. My data is read in earlier from a relational DB, thus I don't have to read in and clean a text file. Instead, my email body data is the 18th column of a dataframe, which we can call x. Here is my version of get.msg():
    get.msg <- function(path) {
      bodyvector <- path[!(is.na(path[,18]) | path[,18]==""), ]
      return(paste(bodyvector))
    }
Originally I referred to the column as x$email, and this worked through most of the code. However, in a later step get.msg() was called on x.path, where x.path pointed to x and was used inside another function in combination with paste(), as per the authors of the example code:
z.spam <- sapply(spam.docs, function(p) count.word(paste(x.path,p,sep = ""), "keyword"))
Here, count.word() is a function that calls get.msg(). The paste() call was causing problems because it apparently turned x.path into an atomic vector, and R gave the error that $ cannot be used with atomic vectors. As per an older StackOverflow Q&A, I changed the way I referred to the column to path[,18] (which is evaluated as x.path[,18] and is therefore the same as x[,18]).
Then I did some checking to ensure that x.path[,18] had the same information as x.path$email, which it did. However, when I try to run the code I get an error message on get.msg(x.path), which is:
Error in path[,18] : incorrect number of dimensions.
I tried path[,'email'], then path[18,] and then just path by itself and all three led to the same error. I tried path[[1]][[18]] and that gave me a subscript out of bounds error.
Any thoughts?
III. Steps Taken to Make this Question Good
To avoid annoying anyone and getting any down votes, I confirmed that the topic was relevant to StackOverflow and I feel that it may be relevant to other people dealing with this or similar programming problems in the future. I also spent almost an hour researching this problem online and trying things in R to fix it.
There were plenty of references to this error message; however, the causes seemed to be very diverse and completely unrelated (such as networking trouble, etc). Finally, I spent a significant amount of time editing this question to try to make it readable and properly formatted (I hope I did okay, I know it's a lot of information).
IV. The output of head() and dput()
Some of you extremely helpful folks have requested to see the output of head(x.path) or dput(x.path). I don't mind except that it's confidential company email data and I'll be out of a job and sued if I publish it. ;-)
I've pasted it here and replaced the real info with fake info. I hope this is okay. I tried to use dput() at first and I can do so if you like but it was truly an overwhelming amount of data. Here's head(x.path):
> head(x.path)
[1] "c("Z12e3317e4b1jZbbajZ9Zdd6", "Z12e3317e4b1jZbbajZ99124", "Z12e331Ze4b1jZbbajZ996dd", "Z12e3319e4b1jZbbajZ9acb6", "Z12e3319e4b1jZbbajZ9ad3b", "Z12e3319e4b1jZbbajZ9adjd", "Z12e3319e4b1jZbbajZ9aebZ", "Z12e3319e4b1jZbbajZ9aj23", "Z12e3319e4b1jZbbajZ9b22b", "Z12e3319e4b1jZbbajZ9b42a", "Z12e3319e4b1jZbbajZ9b49a", "Z12e331ae4b1jZbbajZ9bZ11", "Z12e331ae4b1jZbbajZ9bZZ4", "Z12e331ae4b1jZbbajZ9c237", "Z12e331ae4b1jZbbajZ9c2e4", "Z12e331ae4b1jZbbajZ9c3bZ", "Z12e331ae4b1jZbbajZ9c3cZ", "Z12e331ae4b1jZbbajZ9cZ31", n"Z12e331be4b1jZbbajZ9cddd", "Z12e331be4b1jZbbajZ9cja6", "Z12e331ce4b1jZbbajZ9da1j", "Z12e331de4b1jZbbajZ9e649", "Z12e331de4b1jZbbajZ9j669", "Z12e331de4b1jZbbajZ9jZZZ", "Z12e331ee4b1jZbbajZ9j944", "Z12e331ee4b1jZbbajZ9jcZa", "Z12e331ee4b1jZbbajZ9jd4c", "Z12e331ee4b1jZbbajZa11e2", "Z12e331ee4b1jZbbajZa1291", "Z12e331ee4b1jZbbajZa1344", "Z12e3311e4b1jZbbajZa1j73", "Z12e3311e4b1jZbbajZa1131", "Z12e3311e4b1jZbbajZa11Z6", "Z12e3311e4b1jZbbajZa124c", "Z12e3311e4b1jZbbajZa1Zbc", "Z12e3311e4b1jZbbajZa19a9", n"Z12e3311e4b1jZbbajZa1ac2", "Z12e3311e4b1jZbbajZa1b79", "Z12e3311e4b1jZbbajZa1db2", "Z12e3311e4b1jZbbajZa1ejb", "Z12e3312e4b1jZbbajZa2333", "Z12e3312e4b1jZbbajZa23aZ", "Z12e3312e4b1jZbbajZa24bb", "Z12e3312e4b1jZbbajZa2Z79", "Z12e3312e4b1jZbbajZa2Zea", "Z12e3312e4b1jZbbajZa2ba9", "Z12e3312e4b1jZbbajZa2cZa", "Z12e3313e4b1jZbbajZa3bc1", "Z12e3313e4b1jZbbajZa3ca9", "Z12e3313e4b1jZbbajZa3e71", "Z12e3ajbe4b1j66Zbcja4eZc", "Z12e3ajbe4b1j66Zbcja4ja4", "Z12e3c79e4b1j66ZbcjaZc36", "Z12e3e1ce4b1j66Zbcja64bd", n"Z12e4117e4b1j66Zbcja6Zj1", "Z12e41bae4b1j66Zbcja734Z", "Z12e4226e4b1j66Zbcja7b13", "Z12e4226e4b1j66Zbcja7cbZ", "Z12e4ajee4b1j66Zbcjaa916", "Z12e4e61e4b1j66Zbcjab1c2", "Z12e4e61e4b1j66Zbcjab2da", "Z12eZ226e4b1j66ZbcjacZea", "Z12e6141e4b1j66Zbcjb19Z9", "Z12e6141e4b1j66Zbcjb19jd", "Z12e61Z9e4b1j66Zbcjb1acb", "Z12e61Z9e4b1j66Zbcjb1acj", "Z12j9713e4b1j66Zbcjc34db", "Z12j9713e4b1j66Zbcjc3ZZa", "Z12j9713e4b1j66Zbcjc3Za7", "Z12j9713e4b1j66Zbcjc3Zd2", "Z12j9713e4b1j66Zbcjc36c2", 
"Z12j973ce4b1j66Zbcjc396b"n)"
[2] "c("Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", n"Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something", "Something")"
[3] "c(61Z7, 674Z, Z462, 692, Z26, 1121, 1213, 1317, 21ZZ, 2Z9Z, 2711, 3612, 3717, 4774, 4Z93, Z117, Z113, Z197, Z77Z, 61Z3, Z16Z, 11771, 12923, 13374, 13Z93, 14277, 1446Z, 1Z3ZZ, 1ZZ16, 1Z993, 164Z2, 16664, 1711Z, 171Z6, 1Z6ZZ, 1Z921, 19211, 193ZZ, 19931, 21117, 21164, 21177, 21371, 21Z61, 21673, 22ZZ7, 23137, 2ZZ44, 26166, 26Z1Z, 173Z6, 17661, 21Z74, 23119, 232ZZ, 249Z3, 2ZZ31, 261Z9, 31211, 33414, 336Z6, 37941, 1743, 1Z61, 216Z, 2171, 1ZZ3, 2119, 21Z4, 2129, 2334, 2ZZZ)"
[4] "c("Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", n"Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty", "Booty")"
[5] "c(Z6, 93Z, 1314, 3, 4, Z, 6, 7, 9, 11, 11, 13, 14, 2Z, 26, 27, 2Z, 29, 33, 34, ZZ, Z3, 122, 12Z, 133, 139, 142, 147, 1Z2, 1Z3, 16Z, 169, 171, 171, 219, 221, 221, 222, 22Z, 226, 244, 246, 247, 24Z, 249, 2637, 264, 2Z9, 292, 296, 49, Z1, 76, 93, 9Z, 112, 111, 114, 1Z7, 211, 214, 263, 6, 7, 11, 11, 11, 11, 12, 13, 14, 1Z)"
[6] "c(3Z11, 3Z11, 3Z11, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, 66Z1, 66Z1, 66Z1, 66Z1, 4ZZ4, 4ZZ4, 4ZZ4, 4ZZ4, 4ZZ4, 4ZZ4)"
If the output continued, you'd see the message bodies at element [18].
It will be much easier if you show us your object (head, str) and the offending line of code. A reproducible example may go even farther.
– Roman Luštrik
Mar 29 '13 at 23:57
Thought is path needs to be a two-dimensional object (e.g. data.frame or matrix) so you can do path[,18]; your x.path is not. Just do class(x.path) and you should see that.
– flodel
Mar 30 '13 at 0:16
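The point in the comment above can be reproduced with a small sketch. The data frame below is a hypothetical stand-in for x (its column names and contents are invented for illustration), and "some_suffix" is an arbitrary placeholder for the p argument. Under those assumptions, it shows that paste() applied to a data frame returns a plain character vector with one deparsed "c(...)" string per column, which is consistent with the head(x.path) output in the question, and that neither [, n] nor $ works on such a vector.

```r
# Hypothetical stand-in for the asker's data frame `x`; names and values are invented.
x <- data.frame(id = c("a1", "a2", "a3"),
                email = c("first body", "", NA),
                stringsAsFactors = FALSE)

# A data frame is two-dimensional, so [row, column] indexing works here.
print(class(x))
print(dim(x))

# paste() coerces each column to a single deparsed string, so the result is a
# character vector with one element per column and no dim attribute.
x.path <- paste(x, "some_suffix", sep = "")
print(class(x.path))
print(dim(x.path))
print(x.path)

# Both accessors now fail; capture the messages instead of stopping the script.
err.bracket <- tryCatch(x.path[, 2], error = function(e) conditionMessage(e))
err.dollar  <- tryCatch(x.path$email, error = function(e) conditionMessage(e))
print(err.bracket)
print(err.dollar)
```

In an English locale the two captured messages should be the "incorrect number of dimensions" error and the "$ operator is invalid for atomic vectors" error, matching the two failures described in the question.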
Update: I also tried replacing [,18] with [,'email'] but got the same message. I see two comments popped up while I'm editing this so let me save my comment then I will follow up on yours (and thanks btw!). I would give you the output of head() but it's confidential email bodies : /
– user2225772
Mar 30 '13 at 0:16
flodel: You're right, class(x.path) shows that it's character due to the paste() command, but I used that because of the authors' example and because I can't figure out how to get away from it while still using the anonymous function like in that third snippet of code in my original post. Is there a way I could do that without paste tho? Sorry for the dumb question.
– user2225772
Mar 30 '13 at 0:21
Roman: I can, however, describe the output of head(x.path). It's a large dataframe with different kinds of data stored relating to an email client. The only column I care about at the moment is the email body column, and that one is just the text of emails, with \r and \n and other such text representations of formatting.
– user2225772
Mar 30 '13 at 0:24
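One possible way to avoid paste() altogether, sketched under assumptions: since the bodies are already in a data frame column rather than in files, get.msg() could take the data frame itself and return the non-empty bodies, and the counting step could then iterate over those bodies directly. The data below is invented, the col argument is a hypothetical addition, and count.term() is a simplified stand-in written for this illustration, not the book's count.word().

```r
# Hypothetical data; the column name "email" follows the x$email usage in the question.
x <- data.frame(id = 1:4,
                email = c("buy keyword now", NA, "", "keyword keyword"),
                stringsAsFactors = FALSE)

# Return the non-missing, non-empty message bodies as a character vector.
# `col` is an assumed parameter; a column index such as 18 would also work with [[ ]].
get.msg <- function(df, col = "email") {
  bodies <- df[[col]]
  bodies[!(is.na(bodies) | bodies == "")]
}

# Simplified stand-in counter: number of literal occurrences of `term` in one body.
count.term <- function(body, term) {
  hits <- gregexpr(term, body, fixed = TRUE)[[1]]
  sum(hits > 0)  # gregexpr() returns -1 when there is no match
}

# Apply the counter to each body; no file paths and no paste() are involved.
bodies <- get.msg(x)
z <- sapply(bodies, count.term, term = "keyword", USE.NAMES = FALSE)
print(z)
```

The design choice here is to keep the data frame intact until a single column is extracted with [[ ]], so the object passed around is never coerced to a character vector by accident.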
[5] "c(Z6, 93Z, 1314, 3, 4, Z, 6, 7, 9, 11, 11, 13, 14, 2Z, 26, 27, 2Z, 29, 33, 34, ZZ, Z3, 122, 12Z, 133, 139, 142, 147, 1Z2, 1Z3, 16Z, 169, 171, 171, 219, 221, 221, 222, 22Z, 226, 244, 246, 247, 24Z, 249, 2637, 264, 2Z9, 292, 296, 49, Z1, 76, 93, 9Z, 112, 111, 114, 1Z7, 211, 214, 263, 6, 7, 11, 11, 11, 11, 12, 13, 14, 1Z)"
[6] "c(3Z11, 3Z11, 3Z11, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, 691Z, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, Z664, 66Z1, 66Z1, 66Z1, 66Z1, 4ZZ4, 4ZZ4, 4ZZ4, 4ZZ4, 4ZZ4, 4ZZ4)"
If this were to show you more then you'd see message bodies for [18].
r dimensions
edited Nov 23 '18 at 9:12
zx8754
asked Mar 29 '13 at 23:52
user2225772
It will be much easier if you show us your object (head, str) and the offending line of code. A reproducible example may go even farther.
– Roman Luštrik
Mar 29 '13 at 23:57
Thought is path needs to be a two-dimension object (e.g. data.frame or matrix) so you can do path[,18]; your x.path is not. Just do class(x.path) and you should see that.
– flodel
Mar 30 '13 at 0:16
Update: I also tried replacing [,18] with [,'email'] but got the same message. I see two comments popped up while I'm editing this so let me save my comment then I will follow up on yours (and thanks btw!). I would give you the output of head() but it's confidential email bodies : /
– user2225772
Mar 30 '13 at 0:16
flodel: You're right, class(x.path) shows that it's character due to the paste() command, but I used that because of the authors' example and because I can't figure out how to get away from it while still using the anonymous function like in that third snippet of code in my original post. Is there a way I could do that without paste tho? Sorry for the dumb question.
– user2225772
Mar 30 '13 at 0:21
Roman: I can however describe the output of head(x.path). It's a large dataframe with different kinds of data stored relating to an email client. The only column I care about at the moment is the email body column and that one is just the text of emails, with \r and \n and other such text representations of formatting.
– user2225772
Mar 30 '13 at 0:24
2 Answers
Your example is a little complex for me to run, but I have gotten this error a number of times, and the problem has always come down to the default behavior of the extract function (i.e. "[") in coercing its result to the lowest possible number of dimensions. As BondedDust observes, if you extract a single column from a data frame you can no longer select subsets of the frame with the same syntax, because you do not have a data frame any more.
Frequently these problems vanish if, in any operation in which you may be reducing the data frame to a single column, you set the parameter drop=FALSE in the extract operation. I suggest that you look carefully not only at the line where the error is generated but also at any preceding lines in which the "[" operation is used on the problem data frame. Look at the help for the data frame method for the extract function, "Extract.data.frame".
I believe the problem is probably that when you subset the data frame to a single column, it is coerced to a single dimension and can no longer be indexed by column number or row number.
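To make the drop=FALSE point concrete, here is a minimal sketch on a made-up data frame. The names df, id and email are hypothetical stand-ins, not the asker's real data; the only thing it is meant to show is how the default drop behavior removes the dimensions.

```r
# Toy data frame standing in for the real one (hypothetical columns).
df <- data.frame(id = 1:3,
                 email = c("hello", "", NA),
                 stringsAsFactors = FALSE)

# Default behavior: a single column is dropped to a plain character vector.
one.col <- df[, "email"]
dim(one.col)                      # NULL: no rows/columns left to index

# With drop = FALSE the result is still a one-column data frame.
kept <- df[, "email", drop = FALSE]
dim(kept)                         # 3 1

# Two-index subsetting fails on the vector; capture the message
# instead of stopping the script.
result <- tryCatch(one.col[, 1], error = function(e) conditionMessage(e))
result

# The same row filter works on the version that kept its dimensions.
non.empty <- kept[!(is.na(kept[, 1]) | kept[, 1] == ""), , drop = FALSE]
non.empty
```

Note that this only helps when the object really started life as a data frame; it does not undo a conversion to character that happened earlier.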
This might deserve to be a comment but it wouldn't fit and I'm prepared to delete if warranted. You say
"So, the paste function was causing problems because it caused x.path to be considered an atomic array apparently, and gave the error that $ could not be used with an atomic array. As per an older StackOverflow Q&A, I changed the way I referred to the column to path[,18] (which is evaluated as x.path[,18] and therefore is the same as x[,18])."
If x.path is an atomic array then you cannot use x.path[ , 18] but rather need to use x.path[18].
You can inspect x.path with str(x.path), and your output suggests that it is indeed a character vector. In R only objects with two dimensions (matrices and data.frames) can be referenced with object[ , n] references.
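A small sketch of what paste() does to a data frame may help here. The data frame x below is a hypothetical two-column stand-in, and I am assuming x.path was built by calling paste() on the whole data frame, as the question describes.

```r
# Hypothetical stand-in for the real data frame.
x <- data.frame(id = c("a1", "b2"),
                email = c("first body", "second body"),
                stringsAsFactors = FALSE)

# paste() converts its argument with as.character(), so every COLUMN
# becomes one long string that looks like a deparsed c(...) call.
x.path <- paste(x)

class(x.path)    # "character"
length(x.path)   # 2: one element per column, not per row
dim(x.path)      # NULL: a plain vector has no dimensions

# A single index works; x.path[, 2] would raise
# "incorrect number of dimensions".
x.path[2]
```

This seems to match the head(x.path) output in the question, where each element is one whole column squashed into a single string.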
I think you might be onto something but I got this error: Error in path[!(is.na(path[18]) | path[18] == ""), ] : incorrect number of dimensions
– user2225772
Mar 30 '13 at 1:58
By the way, it is character (thanks to paste) and str(x.path) gives me this: str(x.path) chr [1:145] "c("5... and then it goes on a long way
– user2225772
Mar 30 '13 at 2:01
dim(x.path) says NULL... so it has no dimensions, but I'm not even really sure what that means. If there are no dimensions shouldn't path by itself have worked? But that gave the same error...
– user2225772
Mar 30 '13 at 2:03
Vectors in R have no dimensions. It is just an ordinary character vector. I do not know what you mean by "shouldn't path work".
– 42-
Mar 30 '13 at 2:04
Thanks. It sounds like we are on the same page; that's what I thought it meant. When I said "shouldn't path work" I meant, since there is only one column/vector should just using "path" by itself instead of path[,18] in get.msg make it work? But it gives the same error. I also just tried changing the paste statement to specify the column of interest at that point but it didn't work.
– user2225772
Mar 30 '13 at 2:18
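On the last comment: using path by itself still fails if the filter keeps the trailing comma, because path[condition, ] is two-index syntax. Below is one possible way to restructure things so that paste() is not needed at all. It is only a sketch: get.msg.vec, get.msg.df and the column name email are hypothetical names introduced here, not part of the book's code, and grepl() merely stands in for whatever count.word() does.

```r
# For a plain vector of bodies: one index, no comma.
get.msg.vec <- function(bodies) {
  bodies[!(is.na(bodies) | bodies == "")]
}

# For a data frame: pull the column out first, then filter the vector.
get.msg.df <- function(df, col = "email") {
  get.msg.vec(df[[col]])
}

# Hypothetical toy data.
x <- data.frame(id = 1:4,
                email = c("buy now", "", NA, "meeting at noon"),
                stringsAsFactors = FALSE)

msgs <- get.msg.df(x)
msgs

# Iterate over the message bodies themselves instead of pasted paths.
hits <- sapply(msgs, function(body) grepl("now", body, fixed = TRUE))
hits
```

The design choice is to pass the message text around directly, since the data already lives in a data frame and there are no file paths to build.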
answered Apr 16 '14 at 6:52
andrewH
answered Mar 30 '13 at 1:55
42-