Sparknlp-1158 Adding Parameters Options to PDF Reader #14562

Open: danilojsl wants to merge 4 commits into release/601-release-candidate from feature/SPARKNLP-1158-Adding-parameters-options-to-PDF-Reader

danilojsl (Contributor)

Description

  • Adds a splitPage parameter so that the correct number of pages is identified
  • Adds an onlyPageNum parameter to return only the document's page count
  • Adds a textStripper parameter that controls output layout and formatting
  • Adds a sort parameter to enable or disable sorting of lines
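
To make the intent of the four parameters concrete, here is a toy model in plain Python. It is only a sketch of how the options could interact and is not the Spark NLP API: `PdfReadOptions`, `render_page`, `read_pdf`, the `"plain"`/`"layout"` stripper names, the default values, and the `(y, x, text)` line representation are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# A "line" is (y, x, text): a toy stand-in for positioned text on a PDF page.
Line = Tuple[int, int, str]
Page = List[Line]


@dataclass
class PdfReadOptions:
    """Hypothetical option holder; field names only mirror the PR's parameters."""
    split_page: bool = True       # one output entry per page vs. one per document
    only_page_num: bool = False   # return just the page count
    text_stripper: str = "plain"  # assumed values: "plain" or "layout"
    sort: bool = False            # order lines by position instead of stream order


def render_page(page: Page, opts: PdfReadOptions) -> str:
    # With sort enabled, order lines top-to-bottom, then left-to-right.
    lines = sorted(page, key=lambda l: (l[0], l[1])) if opts.sort else page
    if opts.text_stripper == "layout":
        # Assumed behaviour: keep horizontal offsets as leading spaces.
        return "\n".join(" " * x + text for _, x, text in lines)
    return "\n".join(text for _, _, text in lines)


def read_pdf(pages: List[Page], opts: PdfReadOptions) -> Union[int, List[str]]:
    if opts.only_page_num:
        return len(pages)
    rendered = [render_page(p, opts) for p in pages]
    return rendered if opts.split_page else ["\n".join(rendered)]


# Two pages; the first stores its lines out of reading order ("scattered text").
doc = [[(2, 0, "world"), (1, 4, "hello")], [(1, 0, "page two")]]
page_count = read_pdf(doc, PdfReadOptions(only_page_num=True))
sorted_pages = read_pdf(doc, PdfReadOptions(sort=True))
whole_doc = read_pdf(doc, PdfReadOptions(split_page=False))
```

In this model, `sort` matters only for pages whose text is stored out of reading order, which is the case the "scattered text" unit test in the commit list appears to target.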

Motivation and Context

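Enhances the PDF reader with additional options: page splitting, page-count-only output, a configurable text stripper, and optional line sorting.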

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Code improvements with no or little impact
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING page.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

danilojsl and others added 3 commits April 29, 2025 21:16
* [SPARKNLP-4098] Adding split page feature
* [SPARKNLP-1098] Adding onlyPageNum parameter
* [SPARKNLP-4098] Adding split page feature
* [SPARKNLP-1098] Adding onlyPageNum parameter
* [SPARKNLP-1098] Adding textStripper parameter
* [SPARKNLP-1098] Adding a unit tests for PDF document with scattered text
danilojsl self-assigned this Apr 30, 2025
danilojsl added the enhancement and DON'T MERGE (Do not merge this PR) labels Apr 30, 2025
danilojsl force-pushed the feature/SPARKNLP-1158-Adding-parameters-options-to-PDF-Reader branch from 45e3083 to 03a76ee May 1, 2025 02:30
danilojsl force-pushed the feature/SPARKNLP-1158-Adding-parameters-options-to-PDF-Reader branch from 03a76ee to 5317971 May 1, 2025 11:35
danilojsl changed the base branch from master to release/601-release-candidate May 5, 2025 20:51
danilojsl requested a review from DevinTDHa May 5, 2025 20:51
Labels: DON'T MERGE (Do not merge this PR), enhancement