Describe the bug
Reading a 5 GB PDF fails with an InvalidPdfException. The cause is an integer overflow: the file pointer is a long, but RandomAccessFileOrArray casts it to int.
To Reproduce
Code to reproduce the issue. I am unable to attach the 5 GB file (it is full of images), but I can provide it if necessary.
- Sample Code
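import org.openpdf.text.pdf.PdfReader;
import org.openpdf.text.pdf.RandomAccessFileOrArray;

// forceRead=false, plainRandomAccess=false
RandomAccessFileOrArray f = new RandomAccessFileOrArray("5GB.pdf", false, false);
try (PdfReader reader = new PdfReader(f)) {
    // Never reached: the constructor throws InvalidPdfException
}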
- Error Encountered
Exception in thread "main" org.openpdf.text.exceptions.InvalidPdfException: Rebuild failed: Position out of bounds; Original message: PDF startxref not found.
at org.openpdf.text.pdf.PdfReader.readPdfPartial(PdfReader.java:1382)
at org.openpdf.text.pdf.PdfReader.<init>(PdfReader.java:285)
Expected behavior
Able to read the 5GB PDF file.
Screenshots
- Problematic code in RandomAccessFileOrArray: rf.getFilePointer() (and trf.getFilePointer()) returns a long, but the result is cast to int.
https://github.com/LibrePDF/OpenPDF/blob/master/openpdf-core/src/main/java/org/openpdf/text/pdf/RandomAccessFileOrArray.java#L350
public int getFilePointer() throws IOException {
insureOpen();
int n = isBack ? 1 : 0;
if (arrayIn == null) {
return (int) (plainRandomAccess ? trf.getFilePointer() : rf.getFilePointer()) - n - startOffset;
} else {
return arrayInPtr - n - startOffset;
}
}
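To illustrate the effect of that cast, here is a minimal standalone sketch using only the JDK. The class and method names (FilePointerOverflowDemo, narrow, narrowChecked) are hypothetical and not part of OpenPDF. It shows that narrowing a long offset beyond 2 GiB keeps only the low 32 bits, so the reported position is either negative or points at the wrong place in the file.

```java
public class FilePointerOverflowDemo {

    // Same kind of narrowing cast as in getFilePointer(): only the low 32 bits survive.
    static int narrow(long filePointer) {
        return (int) filePointer;
    }

    // Hypothetical stricter variant: throws ArithmeticException instead of wrapping silently.
    static int narrowChecked(long filePointer) {
        return Math.toIntExact(filePointer);
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024; // 3221225472
        long fiveGiB = 5L * 1024 * 1024 * 1024;  // 5368709120

        System.out.println(narrow(threeGiB)); // -1073741824 (negative position)
        System.out.println(narrow(fiveGiB));  // 1073741824 (wraps to 1 GiB)

        try {
            narrowChecked(fiveGiB);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

A wrapped position like this would plausibly explain the "Position out of bounds" message, although the exact code path inside PdfReader has not been traced here.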
- Integer Overflow (screenshot not included)
- Stacktrace (screenshot not included)
System
(please complete the following information)
- OS: Windows
- Used font: Default
- OpenPDF version: 3.0.3
Additional context
Attempt via RandomAccessFileOrArray with InputStream
new PdfReader(new FileInputStream("5GB.pdf"));
Error (Expected due to byte array limit)
Exception in thread "main" java.lang.OutOfMemoryError: Required array length 2147483639 + 9 is too large
at java.base/jdk.internal.util.ArraysSupport.hugeLength(ArraysSupport.java:914)
at java.base/jdk.internal.util.ArraysSupport.newLength(ArraysSupport.java:907)
at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:100)
at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:132)
at org.openpdf.text.pdf.RandomAccessFileOrArray.InputStreamToArray(RandomAccessFileOrArray.java:172)
at org.openpdf.text.pdf.RandomAccessFileOrArray.<init>(RandomAccessFileOrArray.java:150)
at org.openpdf.text.pdf.PdfReader.<init>(PdfReader.java:257)
Attempt via RandomAccessFileOrArray with forceRead = true, plainRandomAccess = true
Error (Expected due to byte array limit)
Exception in thread "main" java.lang.OutOfMemoryError: Required array length 2147483639 + 9 is too large
at java.base/jdk.internal.util.ArraysSupport.hugeLength(ArraysSupport.java:914)
at java.base/jdk.internal.util.ArraysSupport.newLength(ArraysSupport.java:907)
at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:100)
at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:132)
at org.openpdf.text.pdf.RandomAccessFileOrArray.InputStreamToArray(RandomAccessFileOrArray.java:172)
at org.openpdf.text.pdf.RandomAccessFileOrArray.<init>(RandomAccessFileOrArray.java:131)
Attempt via RandomAccessFileOrArray with forceRead = false, plainRandomAccess = true
Error (took 59 minutes; same result as with plainRandomAccess = false)
Exception in thread "main" org.openpdf.text.exceptions.InvalidPdfException: Rebuild failed: Position out of bounds; Original message: PDF startxref not found.
at org.openpdf.text.pdf.PdfReader.readPdfPartial(PdfReader.java:1382)
at org.openpdf.text.pdf.PdfReader.<init>(PdfReader.java:285)
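For comparison, the JDK's own java.io.RandomAccessFile works with long offsets throughout (length(), seek(long), getFilePointer()), so positions above 2 GiB are representable as long as they are never narrowed to int. The sketch below is an assumption-laden illustration, not OpenPDF's implementation: the class TailScanSketch and the method findStartxref are hypothetical names. It locates the startxref keyword near the end of a file while keeping the absolute offset as a long.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class TailScanSketch {

    // Returns the absolute offset of the last "startxref" keyword found in the
    // final tailSize bytes of the file, or -1 if it is not present there.
    static long findStartxref(String path, int tailSize) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long length = raf.length();                     // long, safe above 2 GiB
            long tailStart = Math.max(0L, length - tailSize);
            // The tail is at most tailSize bytes, so this cast cannot overflow.
            byte[] tail = new byte[(int) (length - tailStart)];
            raf.seek(tailStart);                            // seek takes a long
            raf.readFully(tail);
            // ISO-8859-1 maps each byte to one char, so string indexes equal byte offsets.
            int idx = new String(tail, StandardCharsets.ISO_8859_1).lastIndexOf("startxref");
            return idx < 0 ? -1L : tailStart + idx;
        }
    }
}
```

Only the small tail buffer is indexed with int; every file position stays a long, which is the property the current getFilePointer() signature loses.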