Add DHL eCommerce 14-digit format #104
|
Walkthrough
A new DHL E-Commerce tracking number pattern named "DHL E-Commerce (14)" was added to
|
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- couriers/dhl.json (1 hunks)
Additional comments not posted (3)
couriers/dhl.json (3)
51-54: LGTM!
The changes to the regex field enhance the flexibility of the tracking number validation by accommodating a new format. The new regex pattern is correctly implemented.
The code changes are approved.
62-64: LGTM!
The addition of new valid tracking numbers helps in validating the new regex pattern. The tracking numbers are correctly formatted.
The code changes are approved.
Line range hint 1-65: LGTM!
The overall structure and consistency of the file are maintained. The changes are well-integrated within the existing configuration.
The code changes are approved.
couriers/dhl.json
Outdated
| "regex": "\\s*((GM)|(LX)|(RX)|(UV)|(CN)|(SG)|(TH)|(IN)|(HK)|(MY))\\s*(?<SerialNumber>([0-9]\\s*){10,39})", | ||
| "regex": [ | ||
| "(\\s*((GM)|(LX)|(RX)|(UV)|(CN)|(SG)|(TH)|(IN)|(HK)|(MY))\\s*(?<SerialNumber>([0-9]\\s*){10,39}))", | ||
| "|^(?<SerialNumber>([0-9]){14})$" |
@jkeen this pattern matches USPS 20-digit numbers (and possibly other formats), and apparently there is no validation mechanism (mod7/luhn/etc). So it seems we need to be clever.
As you can see here, I tried wrapping it in line start/end anchors (^/$). However, this makes it unusable for searching blocks of text, and it fails the should_detect_number_variants test. It seems the line start/end syntax must be removed. Any ideas on how to modify this pattern so it does NOT match USPS 20-digit numbers?
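A minimal Ruby sketch of the anchoring problem described above. The helper name `anchored_serial` and the sample strings are made up for illustration and are not part of the tracking_number gem; only the pattern itself comes from the diff.

```ruby
# Anchored form tried in the PR. Note that Ruby's ^ and $ anchor lines,
# not the whole string.
ANCHORED_14 = /^(?<SerialNumber>([0-9]){14})$/

# Hypothetical helper, not part of the tracking_number gem:
# returns the captured serial number, or nil when nothing matches.
def anchored_serial(text)
  match = ANCHORED_14.match(text)
  match && match[:SerialNumber]
end

# A bare number on its own line.
p anchored_serial("12345678901234")
# The same digits embedded in a sentence.
p anchored_serial("Your parcel 12345678901234 ships today")
```

The anchors only allow a match when the 14 digits fill an entire line, which is why the pattern stops working once the number is embedded in surrounding text.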
|
@dan-jensen has this ever made it over the finish line? |
|
@fthobe no, I wasn't sure how to proceed. My resolution was to stop using DHL eCommerce. |
Did you migrate to regular DHL or to a different carrier entirely? |
|
@fthobe we try to avoid DHL entirely now. Their technology is such a mess it's almost unusable. For example, their API frequently returns an error saying our account does not allow shipments to be purchased in a specific country, even though our DHL account rep says it should work. We tried to work with them for 18 months and finally gave up. |
|
They have a tendency to overpromise and underdeliver, like all carriers. Have you had a better experience with any of them? |
|
@fthobe haven't encountered any that are great, but FedEx and UPS are usable. I'm hopeful Amazon opens up to the public at some point, I think they would be great. |
this makes it easier to avoid some complications with adding it onto that existing subtype, despite both of them being e-commerce numbers
|
@dan-jensen I moved the definition to a separate subtype and added some |
Actionable comments posted: 2
🧹 Nitpick comments (1)
couriers/dhl.json (1)
71-71: Missing validation stub comment
If the 14-digit format has no official checksum, add a short comment to validation explaining that none exists to avoid future confusion, e.g.:
- "validation": {},
+ "validation": { /* no published checksum for DHL-eComm 14-digit */ },
Purely descriptive but improves maintainability.
| { | ||
| "name": "DHL E-Commerce (14)", | ||
| "id": "dhl_ecommerce_14", | ||
| "regex": "\\s*\\b(?<SerialNumber>(?:[0-9]\\s*){14})\\b", |
❓ Verification inconclusive
Regex still risks false-positives; tighten with look-arounds.
Using \b is not sufficient for “pure-digit” formats: a 14-digit slice at the start or end of a longer digit string is still matched because word-boundaries don’t split digits. This will re-introduce the USPS-20-digit collision mentioned in the discussion.
- "regex": "\\s*\\b(?<SerialNumber>(?:[0-9]\\s*){14})\\b",
+ "regex": "(?<!\\d)\\s*(?<SerialNumber>(?:\\d\\s*){14})(?!\\d)",
Look-behind / look-ahead guarantee we don’t match when the 14 digits are immediately preceded or followed by another digit, while keeping the pattern usable inside free-text “search” contexts.
Please verify that the regexp engine used by the gem supports look-behind. The (?<!\d) used here is fixed-width (a single character), which Ruby's regexp engine supports.
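One way to check these claims before committing is to run both candidate patterns side by side. The sketch below is only an illustration: the two patterns are copied from this thread, while the helper `captured_serial` and the sample inputs are invented and are not taken from the gem's test suite.

```ruby
# Patterns copied from the review thread; sample inputs are invented.
WORD_BOUNDARY = /\s*\b(?<SerialNumber>(?:[0-9]\s*){14})\b/
LOOK_AROUND   = /(?<!\d)\s*(?<SerialNumber>(?:\d\s*){14})(?!\d)/

SAMPLES = [
  "12345678901234",                  # bare 14 digits
  "12345678901234567890",            # contiguous 20 digits
  "Parcel 12345678901234 shipped.",  # 14 digits inside a sentence
  "AB12345678901234",                # letters directly before the digits
]

# Hypothetical helper: returns the captured serial with whitespace removed,
# or nil when the pattern does not match.
def captured_serial(pattern, text)
  match = pattern.match(text)
  match && match[:SerialNumber].gsub(/\s+/, "")
end

# Print what each pattern captures for every sample.
SAMPLES.each do |text|
  puts format("%-34s \\b: %-18s look-around: %s",
              text.inspect,
              captured_serial(WORD_BOUNDARY, text).inspect,
              captured_serial(LOOK_AROUND, text).inspect)
end
```

Keep in mind that \b treats letters and underscores as word characters too, whereas the look-arounds only check for adjacent digits, so the two patterns can disagree on inputs like the last sample. Numbers written with spaces between digit groups would be worth adding as further cases, since both patterns allow whitespace between digits.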
|
@jkeen looks really nice, I like how you split out the 14 separately. coderabbit has a good suggestion above, but otherwise I think this is good to go. |
Co-authored-by: Jeff Keen <[email protected]>
This is intended to add support for DHL eCommerce tracking numbers that are 14 digits, which exist in Europe if nowhere else. This resolves jkeen/tracking_number#83
WARNING: Do not merge, there is a problem to resolve.