How to read a tweet that contains a curly apostrophe (’)
I am reading tweets in the following format:
545253503963516928|Wed Dec 17 16:25:40 +0000 2014|Massachusetts Pharmacy Owners Arrested in Meningitis Deaths http://xxxxxxxxx
545235402156937217|Wed Dec 17 15:13:44 +0000 2014|For First Time, Treatment Helps Patients With Worst Kind of Stroke, Study Says http://xxxxxxxxx
The code I'm using is
msn <- read.table(file = ".../msnhealthnews.txt",
                  sep = "|",
                  header = FALSE,
                  quote = "",
                  fill = TRUE,
                  stringsAsFactors = FALSE,
                  numerals = "no.loss",
                  encoding = "UTF-8")
There is a tweet that has a curly apostrophe:
You’re Already Losing Your Mind: http://on-msn.com/w0LiSx
This tweet is being read as follows:
"You\u0092re Already Losing Your Mind: http://on-msn.com/w0LiSx"
How can I ensure that the tweet is read correctly? I thought that setting encoding = "UTF-8" would take care of this.
r
Please define "correctly". You seem to want magical things to happen. Magic, sadly, isn't real.
– hrbrmstr
Nov 22 at 2:50
Please use encoding = utf-8
– Hunaidkhan
Nov 22 at 4:47
try running through the other encodings? Once I found that UTF-10 worked for me
– RAB
Nov 22 at 7:26
@hrbrmstr: by correctly, I mean that You're should be read as such and not as You\u0092re. I want to be able to determine the total number of characters in each tweet using nchar(msn$V3). However, this throws an error when it encounters You\u0092re.
– Anonymouse
Nov 22 at 18:37
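One possible explanation, which nobody in the thread confirms: the file may be saved as Windows-1252 rather than UTF-8. In that code page the curly apostrophe is the single byte 0x92, and reading that byte without conversion yields the control character U+0092 seen in the output. In read.table, encoding = "UTF-8" only marks the resulting strings as UTF-8, while fileEncoding declares the encoding of the file so that it is converted on input. The sketch below assumes Windows-1252 and a UTF-8 session; the temporary file and its single sample line are made up for illustration.

```r
# Hypothetical sample file: one tweet line written as Windows-1252 bytes,
# where the curly apostrophe is the single byte 0x92.
tmp <- tempfile(fileext = ".txt")
bytes <- c(charToRaw("1|Wed Dec 17 16:25:40 +0000 2014|You"),
           as.raw(0x92),
           charToRaw("re Already Losing Your Mind\n"))
writeBin(bytes, tmp)

# fileEncoding names the encoding the file is stored in (an assumption here),
# so the bytes are re-encoded while reading. The other arguments are the ones
# used in the question.
msn <- read.table(file = tmp,
                  sep = "|",
                  header = FALSE,
                  quote = "",
                  fill = TRUE,
                  stringsAsFactors = FALSE,
                  numerals = "no.loss",
                  fileEncoding = "windows-1252")

print(msn$V3)         # the tweet text, with the apostrophe converted
print(nchar(msn$V3))  # character count of the tweet text
```

If the real file turns out to be in some other encoding, the fileEncoding value would need to change accordingly; inspecting the raw bytes around the apostrophe, for example with readBin(), is one way to check before committing to a value.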
asked Nov 22 at 1:39
Anonymouse
457