How to batch OCR multiple image files to multiple text files using Tesseract
I am currently using Tesseract to OCR some JPEG files to text files (on Ubuntu 16.04). Typically this is ~500 files in one directory.
I know I can do this by making a text file that lists all the file names (savedlist.txt), and then running:
tesseract savedlist.txt output.txt
However, output.txt is a single file containing all the OCR results.
What I need is to save the OCR results to individual text files, each with the same base name as the original image file. For example:
input file: image456.jpeg
output file: image456.txt
So I am looking for a command line script that can do this processing.
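For illustration, one possible shape of such a script is a plain shell loop that calls tesseract once per image. This is only a sketch under some assumptions: the images end in .jpeg or .jpg, the text files should land next to the images, and `ocr_dir` is a made-up helper name rather than anything provided by Tesseract. It relies on the fact that tesseract takes an output base name and appends .txt to it itself.

```shell
#!/usr/bin/env bash
# Sketch: OCR every JPEG in a directory into a same-named .txt file.
# ocr_dir is a hypothetical helper name, not part of Tesseract.
ocr_dir() {
    local dir="${1:-.}" img
    for img in "$dir"/*.jpeg "$dir"/*.jpg; do
        # Skip glob patterns that matched nothing.
        [ -e "$img" ] || continue
        # tesseract appends ".txt" to the output base on its own,
        # so pass the image path with its extension stripped.
        tesseract "$img" "${img%.*}"
    done
}
```

After sourcing the file, `ocr_dir /path/to/images` would be expected to turn image456.jpeg into image456.txt in the same directory. Language or page-segmentation options, if needed, could be appended to the tesseract call.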
ubuntu-16.04 tesseract
This question was answered on Software Recs SE - Software to batch OCR multiple image files to multiple text files using Tesseract?.
– user3169
Nov 25 at 5:04
edited Nov 22 at 18:18
asked Nov 22 at 18:06
user3169