Getting PDF page length
In my articles, which are formatted as PDF files, one or more pages may be blank, and I want to detect those pages and remove them from the PDF file. If I can identify the pages that are smaller than 60 KB, I think I can detect the empty ones, because pages that small are probably empty.
This is what I tried:
// Verbatim string (@"..."), so the backslashes in the path are not treated as escape sequences.
var reader = new PdfReader(@"D:\_test\file.pdf");
/*
 * reader.FileLength gives me the size of the whole PDF file,
 * but I don't know how to get the size of each individual page.
 */
for (var i = 1; i <= reader.NumberOfPages; i++)
{
    /*
     * MessageBox.Show(???);
     */
}
c# itext page-size
How about splitting the PDF into multiple PDFs, one for each page, and then measuring their respective sizes?
– Uwe Keim
Nov 23 at 8:25
@uweKeim, I don't want to split the PDF file page by page. Think about how little that would help me if I had to split a whole storybook page by page. Splitting the document into single pages, removing the blank ones, and then reassembling the rest doesn't sound like a professional approach to me.
– Colin Henricks
Nov 23 at 14:59
edited Nov 23 at 15:04
asked Nov 22 at 17:46
Colin Henricks
1 Answer
I would do this in 2 steps:
- first go over the document using IEventListener to detect which pages are empty
- once you've determined which pages are empty, simply create a new document by copying the non-empty pages from the source document into the new document
step 1:
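import java.io.File;
import java.util.ArrayList;
import java.util.List;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor;

List<Integer> emptyPages = new ArrayList<>();
PdfDocument pdfDocument = new PdfDocument(new PdfReader(new File(SRC)));
// Page numbers are 1-based, so the bound must be inclusive to reach the last page.
for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++) {
    IsEmptyEventListener l = new IsEmptyEventListener();
    new PdfCanvasProcessor(l).processPageContent(pdfDocument.getPage(i));
    if (l.isEmptyPage()) {
        emptyPages.add(i);
    }
}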
Then you need a proper implementation of IsEmptyEventListener, which may be tricky and will depend on your specific document(s). This is a demo:
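import java.util.Set;
import com.itextpdf.kernel.pdf.canvas.parser.EventType;
import com.itextpdf.kernel.pdf.canvas.parser.data.IEventData;
import com.itextpdf.kernel.pdf.canvas.parser.listener.IEventListener;

class IsEmptyEventListener implements IEventListener {
    private int eventCount = 0;
    public void eventOccurred(IEventData data, EventType type) {
        // perhaps count only text rendering events?
        eventCount++;
    }
    // IEventListener also requires this method; returning null means "all event types".
    public Set<EventType> getSupportedEvents() { return null; }
    // Demo heuristic: fewer than 32 events counts as empty; tune this for your documents.
    public boolean isEmptyPage() { return eventCount < 32; }
}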
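The comment in the demo suggests counting only text rendering events. Here is a library-free sketch of that idea, just to show the counting logic in isolation. The `Kind` enum and the `TextOnlyCounter` class are hypothetical stand-ins that I made up for illustration; they are not iText types, and the threshold is an assumption you would have to tune.

```java
final class TextOnlyCounter {
    /** Hypothetical stand-in for the kinds of events a content parser might report. */
    enum Kind { TEXT, IMAGE, PATH, CLIP }

    private final int threshold;   // assumed cut-off, not a value taken from iText
    private int textEvents = 0;

    TextOnlyCounter(int threshold) {
        this.threshold = threshold;
    }

    /** Counts the event only if it is a text event; everything else is ignored. */
    void eventOccurred(Kind kind) {
        if (kind == Kind.TEXT) {
            textEvents++;
        }
    }

    /** A page is treated as empty when it produced fewer text events than the threshold. */
    boolean isEmptyPage() {
        return textEvents < threshold;
    }
}
```

With this rule, decoration such as borders or clipping paths no longer makes a page look non-empty, but a page that contains only an image would be reported as empty too, so whether it fits depends on your documents.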
step 2:
Based on this example: https://developers.itextpdf.com/examples/stamping-content-existing-pdfs/clone-reordering-pages
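// Copies every page of src that is not listed in blankPages into dst.
void copyNonBlankPages(List<Integer> blankPages, PdfDocument src, PdfDocument dst) {
    int N = src.getNumberOfPages();
    List<Integer> toCopy = new ArrayList<>();
    // Page numbers are 1-based; with i < N the last page would never be copied.
    for (int i = 1; i <= N; i++) {
        if (!blankPages.contains(i)) {
            toCopy.add(i);
        }
    }
    src.copyPagesTo(toCopy, dst);
}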
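Because page numbers are 1-based, the selection loop has to run up to and including the page count. The selection step can be checked on its own, without any PDF library. This is a minimal sketch; the class name `PageSelection` and the method name `pagesToCopy` are mine and not part of iText.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PageSelection {
    /** Returns the 1-based numbers of all pages in [1, pageCount] that are not marked blank. */
    static List<Integer> pagesToCopy(int pageCount, List<Integer> blankPages) {
        Set<Integer> blank = new HashSet<>(blankPages); // set lookup instead of List.contains
        List<Integer> toCopy = new ArrayList<>();
        for (int i = 1; i <= pageCount; i++) { // inclusive bound, so the last page is considered
            if (!blank.contains(i)) {
                toCopy.add(i);
            }
        }
        return toCopy;
    }

    public static void main(String[] args) {
        // Five pages, pages 2 and 4 are blank.
        System.out.println(pagesToCopy(5, Arrays.asList(2, 4))); // prints [1, 3, 5]
    }
}
```

The resulting list is what you would pass as the first argument to copyPagesTo.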
answered Nov 27 at 12:36
Joris Schellekens