- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1.4k
Implement error collection for deserialization (#1196) #5364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 3.x
Are you sure you want to change the base?
Implement error collection for deserialization (#1196) #5364
Conversation
| * | ||
| * @param maxProblems Maximum number of problems to collect (must be > 0) | ||
| * @return A new ObjectReader configured for error collection | ||
| * @since 3.1 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the throws should be included in the javadoc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added now. Thanks for the feedback.
| _assertNotNull("p", p); | ||
|  | ||
| // CRITICAL: Allocate a FRESH bucket for THIS call (thread-safety) | ||
| List<tools.jackson.databind.exc.CollectedProblem> bucket = new ArrayList<>(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to use fully qualified class name here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed now. Thanks.
|  | ||
| // Create per-call attributes with the fresh bucket | ||
| ContextAttributes perCallAttrs = _config.getAttributes() | ||
| .withPerCallAttribute(tools.jackson.databind.deser.CollectingProblemHandler.class, bucket); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, avoid using fully qualified class names - use imports so the code can be kept concise
| * @return A new ObjectReader configured for error collection | ||
| * @since 3.1 | ||
| */ | ||
| public ObjectReader collectErrors() { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like the names of the new APIs. I don't especially like 'collect' but I think 'Error' is plain wrong.
'Error' has a very explicit meaning in Java - examples include OutOfMemoryError. Exception is more correct than Error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like problemCollectingReader() etc would make more sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to problemCollectingReader and readValueCollectingProblems method names
| -1 on this PR itself due to the sheer amount of line changes. Ai code assistants should really get trained on splitting tasks to safer(or at least feeling so) bits of progression | 
| * @param segment The raw segment (property name or array index) | ||
| * @return Escaped segment safe for JSON Pointer | ||
| */ | ||
| private String escapeJsonPointerSegment(String segment) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not duplicate code here instead of using src/main/java/com/fasterxml/jackson/core/JsonPointer.java (from jackson-core)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pointer 😉 . Changed now.
| } catch (tools.jackson.databind.exc.DeferredBindingException e) { | ||
| throw e; // Already properly formatted | ||
|  | ||
| } catch (DatabindException e) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea to only catch DatabindExceptions and not other JacksonExceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is intentional. DatabindException represents deserialization/binding errors that we want to collect (missing properties, type mismatches, etc.).
Other Jackson exceptions (like JsonParseException, StreamReadException) might pertain more to malformed JSON structure that would fail fast and not be collected.
| * Unique private key object for the maximum problem limit attribute. | ||
| * Using a dedicated object prevents collisions with user attributes. | ||
| */ | ||
| private static final class MaxProblemsKey { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we store configuration as context attribute, instead of within ProblemHandler itself, as config setting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.... or maybe as typed collector state added to context, I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cowtowncoder Can you clarify if there is any change expected for this?
From my understanding, the configuration is currently part of the context as a typed collector state.
| /** | ||
| * Enables error collection mode with a custom problem limit. | ||
| * | ||
| * <p><b>Thread-safety</b>: The returned reader is immutable and thread-safe. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to explain that this is implemented as a problem handler, replacing any handler that might have been formerly configured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in the Javadoc for problemCollectingReader method.
- Add CollectedProblem: immutable value object for error details - Add DeferredBindingException: aggregate exception for multiple errors - Add CollectingProblemHandler: stateless handler collecting recoverable errors - Add ObjectReader.collectErrors() and readValueCollecting() methods - Add comprehensive test suite (27 tests) covering all scenarios Features: - Opt-in per-call error collection (no global config) - Thread-safe with per-call bucket isolation - RFC 6901 compliant JSON Pointer paths - DoS protection with configurable limit (default 100) - Primitive vs reference type default value policy - Suppressed exception support for hard failures Tests verify: - Per-call bucket isolation (concurrent + successive) - JSON Pointer escaping (tilde, slash, combined) - Limit reached behavior - Unknown property handling - Default value policy - Message formatting - Edge cases 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Address review feedback from specs/issue-1196-collecting-deserialization-errors-claude-review-gpt-5.md: API Surface & Delegation: - Fix ObjectReader.readValueCollecting() to use public API `this.with(perCallAttrs).readValue(p)` instead of protected `_new(...)` factory - Maintains consistency with Jackson's builder pattern and public surface Limit Resolution: - Read max-problem cap from per-call reader config instead of base _config - Properly honors per-call attribute overrides - Affects both normal completion and hard failure paths Javadoc Enhancements: - Add comprehensive class-level Javadoc to DeferredBindingException with usage examples - Enhance CollectedProblem Javadoc explaining all fields, truncation, and immutability - Expand CollectingProblemHandler Javadoc detailing design, recoverable errors, and DoS protection - Improve ObjectReader.readValueCollecting() Javadoc noting behavior without collectErrors() and parser filtering differences Testing: - All 27 CollectingErrorsTest tests pass - Full suite: 4,662 tests pass, 0 failures, 0 errors - No regressions introduced 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Add helper method overloads for all input types (File, InputStream, Reader) to eliminate inline catchThrowableOfType duplication - Extract buildInvalidOrderJson() helper to avoid duplicated JSON building logic across 4 tests - Use StandardCharsets.UTF_8 instead of checked exception "UTF-8" string - Improve executor shutdown safety with shutdownNow() fallback and proper InterruptedException handling - Enhance concurrent test to verify exact unique error values, catching any bucket-sharing regressions - Use Files.deleteIfExists() for robust temp file cleanup - Streamline hard failure test to focus on suppressed exception mechanics - Replace hand-rolled try/catch blocks with expectDeferredBinding() helper in 4 additional tests All 31 tests pass. Code is more maintainable with consistent patterns and reduced duplication. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Add release notes entry in VERSION for 3.1.0 - Add tutorial section in README.md matching existing tone - Explain problem: fix-retry cycle vs collecting all errors at once - Show basic usage with code examples - Document DoS protection with 100 error default limit - Clarify best-effort nature and what can/cannot be collected - Document JSON Pointer paths (RFC 6901) for error locations - Explain thread-safe per-call error buckets - List practical use cases: API validation, batch processing, tooling Addresses 9-year-old feature request from Oliver Drotbohm covering key concerns from discussion: DoS protection, thread safety, recoverable errors, and error reporting. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
…ception handling - Replace deprecated fail() calls with AssertJ assertions - Fix deprecated catchThrowableOfType() parameter order (Class, lambda) - Remove unnecessary throws Exception declarations from 27 test methods - Keep throws Exception only for tests using try-with-resources All 31 tests pass successfully. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Restore src/test/java/module-info.java that was accidentally removed in commit b7fbb39. This file is required for the test module system configuration. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
…umentation This commit addresses all technical feedback from reviewers @pjfanning, @cowtowncoder, and @JooHyukKim on PR FasterXML#5364. API Changes: - Rename collectErrors() → problemCollectingReader() - Rename readValueCollecting() → readValueCollectingProblems() - Rationale: "Error" conflicts with Java's Error class (OutOfMemoryError). "Problem" aligns with existing DeserializationProblemHandler terminology. Code Quality: - Add proper imports for CollectingProblemHandler, CollectedProblem, DeferredBindingException - Replace 6 fully qualified class names with short names - Add missing @throws javadoc tags Remove Code Duplication: - Refactor buildJsonPointer() to use jackson-core's JsonPointer API (appendProperty/appendIndex methods) - Delete custom escapeJsonPointerSegment() method (~8 lines) - Leverage tested jackson-core RFC 6901 escaping implementation Enhanced Documentation: - Add architecture rationale to CollectingProblemHandler explaining why context attributes are used (thread-safety, call isolation, immutability, config vs state separation) - Add exception handling strategy to readValueCollectingProblems() explaining why only DatabindException is caught (not all JacksonExceptions - streaming errors should fail fast) - Add handler replacement warning to problemCollectingReader() - Update DeferredBindingException javadoc with new API names Files changed: 5 - README.md (tutorial examples) - ObjectReader.java (main API) - CollectingProblemHandler.java (implementation) - DeferredBindingException.java (exception javadoc) - CollectingErrorsTest.java (test updates) All tests pass. No behavioral changes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
200557a    to
    f0c4342      
    Compare
  
    | Thanks for the feedback. 
 If it's some consolation, the actual logic change in ObjectReader and CollectingProblemHandler is only like 1/3 of the total changes. Others are from domain classes, documentation and tests. As this is not a small change (atleast from my POV) this seemed reasonable for me. 
 Reg Coding Assistants and splitting tasks, this is indeed true. Once their context window starts filling up their answers are not that reliable. Let me explain my process how I overcome this (something I also do in my job). 
 If you see any code blocks that could be improved/simplified or if you have any feedback to my development process (mentioned above), I would be happy to learn and improve on it in future. | 
PR: Implement error collection for deserialization (#1196)
AI Disclosure
This is an AI-generated solution. The requirements were specified by a human developer based on a 9-year-old feature request from @odrotbohm, and the implementation was fully built by Claude Code (an AI coding agent).
Overview
Adds opt-in error collection capability to Jackson deserialization, allowing applications to collect multiple deserialization errors instead of failing fast on the first error. This addresses issue #1196 requested by Oliver Drotbohm in 2016.
Target branch:
3.x(for Jackson 3.1.0)Motivation
Currently, Jackson fails fast on the first deserialization error, requiring a fix-retry cycle to discover all problems in a payload. This is inefficient for:
Key Features
1. Opt-in Per-Call Design
collectErrors()onObjectReader2. DoS Protection
3. Best-Effort Collection
4. JSON Pointer Paths (RFC 6901)
/items/0/price~→~0,/→~15. Thread Safety
ObjectReaderacross threadsAPI Usage
Basic Usage
Custom Error Limit
Multiple Input Types
Implementation Details
New Classes
1.
CollectedProblemImmutable value object representing a single deserialization error:
path: JSON Pointer to problematic field (e.g.,/items/0/price)message: Human-readable error messagerawValue: The invalid input value that caused the errortargetType: Expected Java type2.
DeferredBindingExceptionAggregate exception thrown when errors are collected:
problems: List of all collected errorspartialResult: The partially deserialized object (with default values for failed fields)limitReached: Flag indicating if error limit was hitDatabindExceptionfor compatibility3.
CollectingProblemHandlerStateless problem handler that collects errors during deserialization:
DeserializationProblemHandlerinterfaceDeserializationContext4.
ObjectReaderExtensionsNew methods:
collectErrors(): Enable collection with default limit (100)collectErrors(int maxProblems): Enable collection with custom limitreadValueCollecting(...): Read with error collection (5 overloads for different input types)Design Decisions
readValueCollecting()call gets a fresh error list stored inDeserializationContextattributesTesting
Comprehensive test suite with 31 tests covering:
Core Functionality
Edge Cases
Backward Compatibility
DatabindExceptionPerformance Considerations
Use Cases
API Validation
Batch Processing
Limitations
Best-effort: Not all errors can be collected
Default values: Failed fields get defaults
Error limit: Default 100 errors
Future Enhancements
Potential future improvements (not in this PR):
Generated by: Claude Code (Anthropic)
Specified by: @sri-adarsh-kumar
Original feature request: @odrotbohm (2016)