For your example files, the question "Are the PDFs equal?" is easy to answer.
Compare the files' creation / modification timestamps, their sizes, or both. Python's built-in os module is the best tool for this.
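A minimal sketch of that file-level check, assuming only the standard library. The helper name `compare_file_stats` is hypothetical, and it uses the modification time only, because the creation time is not reported consistently across platforms.

```python
import os


def compare_file_stats(path_a, path_b):
    """Compare two files by size and modification time (not by content)."""
    stat_a = os.stat(path_a)
    stat_b = os.stat(path_b)
    return {
        "size_a": stat_a.st_size,    # size in bytes
        "size_b": stat_b.st_size,
        "same_size": stat_a.st_size == stat_b.st_size,
        "mtime_a": stat_a.st_mtime,  # seconds since the epoch
        "mtime_b": stat_b.st_mtime,
        "same_mtime": stat_a.st_mtime == stat_b.st_mtime,
    }
```

A size mismatch is enough to say the files differ. Matching size and timestamp only suggests they are the same file, since neither value looks at the content.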

On the PDF level, you can look at some core indicators:

>>> import fitz
>>> doc1 = fitz.open("sbi statment_out2.pdf")
>>> doc2 = fitz.open("sbi statment_out2_Sejda_edited.pdf")
>>> from pprint import pprint
>>> pprint(doc1.metadata)
{'author': '',
 'creationDate': "D:20200911140637+05'30'",
 'creator': '',
 'encryption': None,
 'format': 'PDF 1.4',
 'keywords': '',
 'modDate': "D:20200911140637+05'30'",
 'producer': 'iText 2.0.4 (by lowagie.com)',
 'subject': '',
 'ti…
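Once both metadata dictionaries are available, the differing entries can be listed side by side. The following is only a sketch: `diff_metadata` is a hypothetical helper, not part of PyMuPDF, and it works on any two plain dictionaries.

```python
def diff_metadata(meta1, meta2):
    """Return {key: (value1, value2)} for every key whose values differ.

    A key missing from one dictionary is reported with None on that side.
    """
    keys = sorted(set(meta1) | set(meta2))
    return {
        key: (meta1.get(key), meta2.get(key))
        for key in keys
        if meta1.get(key) != meta2.get(key)
    }


# Assumed usage with the documents opened above:
# from pprint import pprint
# pprint(diff_metadata(doc1.metadata, doc2.metadata))
```

An empty result means the metadata match; it does not prove that the page contents are identical.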

Answer selected by JorjMcKie