Layer 2: Structural Repair
Overview
Layer 2 is the second layer in the JSON Remedy pipeline, responsible for Structural Repair. It fixes missing, extra, and mismatched delimiters using a sophisticated state machine approach to handle complex nested structure issues.
Module: JsonRemedy.Layer2.StructuralRepair
Priority: 2 (runs after Layer 1 Content Cleaning)
Behavior: Implements JsonRemedy.LayerBehaviour
Purpose & Design Philosophy
Layer 2 addresses structural integrity issues in JSON data that commonly occur when:
- JSON is truncated or incomplete (missing closing delimiters)
- Extra delimiters are accidentally added during editing
- Delimiters are mismatched (opening with
{
but closing with]
) - Complex nested structures have inconsistent nesting
The layer uses a character-by-character state machine instead of regex parsing for precise control over delimiter matching and context tracking, following the project’s philosophy of maintainable, efficient implementations.
Core Functionality
1. Missing Delimiter Detection & Repair
Automatically adds missing closing delimiters for incomplete JSON structures:
// Input (truncated JSON)
{"name": "Alice", "profile": {"city": "NYC"
// Output (Layer 2 repaired)
{"name": "Alice", "profile": {"city": "NYC"}}
Features:
- Tracks nested context using a context stack
- Maintains LIFO order for proper nesting closure
- Handles arbitrarily deep nesting levels
- Preserves string content and escape sequences
- Records repair positions for debugging
Implementation: Uses close_unclosed_contexts/1
to add missing delimiters based on the current context stack.
2. Extra Delimiter Removal
Removes redundant or duplicate delimiters that break JSON structure:
// Input (extra delimiters)
[["item1", "item2"]] // Double array wrapping
{{"key": "value"}} // Double object wrapping
// Output (Layer 2 cleaned)
["item1", "item2"] // Single array
{"key": "value"} // Single object
Features:
- Detects consecutive identical opening delimiters
- Uses look-ahead analysis to determine redundancy
- Applies heuristics for distinguishing valid from invalid nesting
- Handles both bracket
[[]]
and brace{{}}
duplications
Implementation: Uses extra_opening_delimiter?/2
with appears_redundant?/2
heuristics.
3. Delimiter Mismatch Correction
Fixes mismatched opening and closing delimiters:
// Input (mismatched delimiters)
{"items": [1, 2, 3} // Object opened, array closed with brace
["users": {"name": "Alice"]] // Array opened, object closed with bracket
// Output (Layer 2 corrected)
{"items": [1, 2, 3]} // Consistent object-array nesting
["users": {"name": "Alice"}] // Consistent array-object nesting
Features:
- Preserves the opening delimiter’s intent (objects stay objects, arrays stay arrays)
- Fixes the closing delimiter to match the opening context
- Maintains semantic meaning of the original structure
- Tracks mismatches for detailed repair reporting
Implementation: Context stack tracking in handle_closing_brace/2
and handle_closing_bracket/2
.
State Machine Architecture
Parser States
@type parser_state :: :root | :object | :array
:root
- Top-level parsing (outside any container):object
- Currently inside an object{}
:array
- Currently inside an array[]
Context Tracking
@type context_frame :: %{type: delimiter_type(), position: non_neg_integer()}
@type delimiter_type :: :brace | :bracket
The state machine maintains:
- Context Stack: LIFO stack of open delimiters with positions
- String State: Tracking when inside quoted strings to avoid processing delimiters within strings
- Escape State: Handling escape sequences within strings
- Position Tracking: Character-level position for precise repair reporting
State Transitions
:root + "{" → :object (push brace context)
:root + "[" → :array (push bracket context)
:object + "}" → previous state (pop context)
:array + "]" → previous state (pop context)
API Specification
Main Processing Function
@spec process(input :: String.t(), context :: repair_context()) :: layer_result()
Returns:
{:ok, processed_input, updated_context}
- Layer completed successfully{:continue, input, context}
- Layer doesn’t apply, pass to next layer{:error, reason}
- Layer failed, stop pipeline
Core Repair Functions
@spec parse_string(input :: String.t(), state :: map()) :: map()
@spec process_character(state :: map(), char :: String.t()) :: map()
@spec handle_delimiter(state :: state_map(), char :: String.t()) :: state_map()
@spec close_unclosed_contexts(state :: state_map()) :: state_map()
Each function maintains the state machine and accumulates repair actions for comprehensive tracking.
Layer Behavior Callbacks
@spec supports?(input :: String.t()) :: boolean()
@spec priority() :: 2
@spec name() :: String.t()
@spec validate_options(options :: keyword()) :: :ok | {:error, String.t()}
Configuration Options
Layer 2 accepts the following configuration options:
:max_nesting_depth
- Maximum allowed nesting depth (positive integer):timeout_ms
- Processing timeout in milliseconds (positive integer):strict_mode
- Enable strict structural validation (boolean)
Example:
options = [
max_nesting_depth: 50,
timeout_ms: 5000,
strict_mode: false
]
Option Validation:
max_nesting_depth
: Must be positive integertimeout_ms
: Must be positive integerstrict_mode
: Must be boolean- Unknown options result in validation errors
Processing Pipeline
Layer 2 processes input through the following stages:
State Machine Initialization
- Initialize parser state to
:root
- Create empty context stack
- Set up character position tracking
- Initialize parser state to
Character-by-Character Parsing
- Process each character sequentially
- Track string boundaries and escape sequences
- Handle structural delimiters outside strings
Context Management
- Push opening delimiters onto context stack
- Pop matching closing delimiters from stack
- Detect and handle mismatches and extras
Unclosed Context Resolution
- Add missing closing delimiters for remaining stack entries
- Maintain proper LIFO closure order
- Generate repair actions for each addition
Repair Action Tracking
Layer 2 generates detailed repair actions for transparency:
%{
layer: :structural_repair,
action: "added missing closing brace",
position: 42,
original: nil,
replacement: "}"
}
Action Types:
"added missing closing brace"
/"added missing closing bracket"
"removed extra closing brace"
/"removed extra closing bracket"
"removed extra opening brace"
/"removed extra opening bracket"
"fixed array-object mismatch: changed } to ]"
"fixed object-array mismatch: changed ] to }"
Error Handling
Layer 2 includes comprehensive error handling:
- Input Validation: Ensures input is a binary string
- State Machine Protection: Catches and reports processing errors
- Option Validation: Validates all configuration options with detailed error messages
- Graceful Degradation: Returns error tuples instead of raising exceptions
Performance Characteristics
Time Complexity
- Linear: O(n) where n is input string length
- Single-pass: Character-by-character processing
- Memory Efficient: Context stack size bounded by nesting depth
Space Complexity
- Context Stack: O(d) where d is maximum nesting depth
- Result Building: O(n) for character accumulation
- Repair Tracking: O(r) where r is number of repairs needed
Optimization Features
- Early Detection:
supports?/1
provides fast pre-screening - Minimal Copying: In-place character processing where possible
- Efficient String Handling: Uses grapheme-based iteration
Integration with JSON Remedy Pipeline
Layer Interaction
- Input: Receives output from Layer 1 (Content Cleaning)
- Output: Provides structurally-corrected JSON to Layer 3
- Context Preservation: Maintains repair history across layers
Pipeline Position
Layer 1 → Layer 2 → Layer 3 → ... → Layer N
Content Structural Next Layer
Cleaning Repair
Repair Context Flow
# Input context from Layer 1
%{repairs: [layer1_repairs], options: [...]}
# Output context to Layer 3
%{
repairs: [layer1_repairs] ++ [layer2_repairs],
options: [...],
metadata: %{layer2_processed: true}
}
Common Use Cases
1. Truncated API Responses
// Incomplete response from network timeout
{"users": [{"name": "Alice", "email": "[email protected]"
// Layer 2 repair
{"users": [{"name": "Alice", "email": "[email protected]"}]}
2. Copy-Paste Errors
// Extra brackets from editor selection
[["item1", "item2", "item3"]]
// Layer 2 cleanup
["item1", "item2", "item3"]
3. Template Generation Issues
// Wrong delimiter from template substitution
{"config": ["setting1", "setting2"}
// Layer 2 correction
{"config": ["setting1", "setting2"]}
4. Deep Nesting Repairs
// Complex incomplete structure
{"a": {"b": {"c": {"d": [1, 2, {"e": "value"
// Layer 2 completion
{"a": {"b": {"c": {"d": [1, 2, {"e": "value"}]}}}}
Testing & Validation
Layer 2 includes comprehensive test coverage:
- 329 lines of unit tests covering all repair scenarios
- Edge case handling: Empty inputs, string boundaries, escape sequences
- Complex nesting: Deep hierarchical structures with mixed containers
- Performance testing: Large input handling and timeout behavior
- Property-based testing: Randomized structural corruption and repair
Test Categories
- Missing Delimiter Tests: Incomplete objects and arrays
- Extra Delimiter Tests: Redundant wrapping and duplication
- Mismatch Tests: Wrong closing delimiter types
- Complex Nesting Tests: Multi-level structural issues
- String Handling Tests: Delimiters within string literals
- Edge Case Tests: Empty structures, single characters
Debugging & Diagnostics
Layer Support Detection
StructuralRepair.supports?('{"incomplete": "json"')
# => true (missing closing brace detected)
Repair Action Inspection
{:ok, result, context} = StructuralRepair.process(input, %{repairs: [], options: []})
IO.inspect(context.repairs)
# => [%{layer: :structural_repair, action: "added missing closing brace", ...}]
Validation Diagnostics
StructuralRepair.validate_options([max_nesting_depth: -1])
# => {:error, "Option max_nesting_depth must be a positive integer, got: -1"}
Best Practices
- Run After Content Cleaning: Layer 2 expects cleaned input from Layer 1
- Configure Timeouts: Set appropriate
:timeout_ms
for large inputs - Monitor Repair Actions: Review repair logs for data quality insights
- Validate Options: Always validate configuration before processing
- Handle Errors Gracefully: Check return tuples and handle
:error
cases
Limitations & Considerations
- Heuristic-Based: Extra delimiter detection uses simple heuristics
- Context Preservation: Always prefers to maintain original structure intent
- Single-Pass Processing: No backtracking for complex ambiguous cases
- Performance Bounds: Processing time grows linearly with input size
- String Delimiter Handling: Delimiters within strings are preserved correctly
Layer 2 provides robust structural repair capabilities that handle the majority of real-world JSON corruption scenarios while maintaining high performance and detailed repair tracking.