LAYER 2

Documentation for LAYER_2 from the Json remedy repository.

Layer 2: Structural Repair

Overview

Layer 2 is the second layer in the JSON Remedy pipeline, responsible for Structural Repair. It fixes missing, extra, and mismatched delimiters using a sophisticated state machine approach to handle complex nested structure issues.

Module: JsonRemedy.Layer2.StructuralRepair
Priority: 2 (runs after Layer 1 Content Cleaning)
Behavior: Implements JsonRemedy.LayerBehaviour

Purpose & Design Philosophy

Layer 2 addresses structural integrity issues in JSON data that commonly occur when:

JSON is truncated or incomplete (missing closing delimiters)
Extra delimiters are accidentally added during editing
Delimiters are mismatched (opening with { but closing with ])
Complex nested structures have inconsistent nesting

The layer uses a character-by-character state machine instead of regex parsing for precise control over delimiter matching and context tracking, following the project’s philosophy of maintainable, efficient implementations.

Core Functionality

1. Missing Delimiter Detection & Repair

Automatically adds missing closing delimiters for incomplete JSON structures:

// Input (truncated JSON)
{"name": "Alice", "profile": {"city": "NYC"

// Output (Layer 2 repaired)
{"name": "Alice", "profile": {"city": "NYC"}}

Features:

Tracks nested context using a context stack
Maintains LIFO order for proper nesting closure
Handles arbitrarily deep nesting levels
Preserves string content and escape sequences
Records repair positions for debugging

Implementation: Uses close_unclosed_contexts/1 to add missing delimiters based on the current context stack.

2. Extra Delimiter Removal

Removes redundant or duplicate delimiters that break JSON structure:

// Input (extra delimiters)
[["item1", "item2"]]  // Double array wrapping
{{"key": "value"}}    // Double object wrapping

// Output (Layer 2 cleaned)
["item1", "item2"]    // Single array
{"key": "value"}      // Single object

Features:

Detects consecutive identical opening delimiters
Uses look-ahead analysis to determine redundancy
Applies heuristics for distinguishing valid from invalid nesting
Handles both bracket [[]] and brace {{}} duplications

Implementation: Uses extra_opening_delimiter?/2 with appears_redundant?/2 heuristics.

3. Delimiter Mismatch Correction

Fixes mismatched opening and closing delimiters:

// Input (mismatched delimiters)
{"items": [1, 2, 3}   // Object opened, array closed with brace
["users": {"name": "Alice"]]  // Array opened, object closed with bracket

// Output (Layer 2 corrected)
{"items": [1, 2, 3]}  // Consistent object-array nesting
["users": {"name": "Alice"}]  // Consistent array-object nesting

Features:

Preserves the opening delimiter’s intent (objects stay objects, arrays stay arrays)
Fixes the closing delimiter to match the opening context
Maintains semantic meaning of the original structure
Tracks mismatches for detailed repair reporting

Implementation: Context stack tracking in handle_closing_brace/2 and handle_closing_bracket/2.

State Machine Architecture

Parser States

@type parser_state :: :root | :object | :array

:root - Top-level parsing (outside any container)
:object - Currently inside an object {}
:array - Currently inside an array []

Context Tracking

@type context_frame :: %{type: delimiter_type(), position: non_neg_integer()}
@type delimiter_type :: :brace | :bracket

The state machine maintains:

Context Stack: LIFO stack of open delimiters with positions
String State: Tracking when inside quoted strings to avoid processing delimiters within strings
Escape State: Handling escape sequences within strings
Position Tracking: Character-level position for precise repair reporting

State Transitions

:root + "{" → :object (push brace context)
:root + "[" → :array (push bracket context)
:object + "}" → previous state (pop context)
:array + "]" → previous state (pop context)

API Specification

Main Processing Function

@spec process(input :: String.t(), context :: repair_context()) :: layer_result()

Returns:

{:ok, processed_input, updated_context} - Layer completed successfully
{:continue, input, context} - Layer doesn’t apply, pass to next layer
{:error, reason} - Layer failed, stop pipeline

Core Repair Functions

@spec parse_string(input :: String.t(), state :: map()) :: map()
@spec process_character(state :: map(), char :: String.t()) :: map()
@spec handle_delimiter(state :: state_map(), char :: String.t()) :: state_map()
@spec close_unclosed_contexts(state :: state_map()) :: state_map()

Each function maintains the state machine and accumulates repair actions for comprehensive tracking.

Layer Behavior Callbacks

@spec supports?(input :: String.t()) :: boolean()
@spec priority() :: 2
@spec name() :: String.t()
@spec validate_options(options :: keyword()) :: :ok | {:error, String.t()}

Configuration Options

Layer 2 accepts the following configuration options:

:max_nesting_depth - Maximum allowed nesting depth (positive integer)
:timeout_ms - Processing timeout in milliseconds (positive integer)
:strict_mode - Enable strict structural validation (boolean)

Example:

options = [
  max_nesting_depth: 50,
  timeout_ms: 5000,
  strict_mode: false
]

Option Validation:

max_nesting_depth: Must be positive integer
timeout_ms: Must be positive integer
strict_mode: Must be boolean
Unknown options result in validation errors

Processing Pipeline

Layer 2 processes input through the following stages:

State Machine Initialization
- Initialize parser state to :root
- Create empty context stack
- Set up character position tracking
Character-by-Character Parsing
- Process each character sequentially
- Track string boundaries and escape sequences
- Handle structural delimiters outside strings
Context Management
- Push opening delimiters onto context stack
- Pop matching closing delimiters from stack
- Detect and handle mismatches and extras
Unclosed Context Resolution
- Add missing closing delimiters for remaining stack entries
- Maintain proper LIFO closure order
- Generate repair actions for each addition

Repair Action Tracking

Layer 2 generates detailed repair actions for transparency:

%{
  layer: :structural_repair,
  action: "added missing closing brace",
  position: 42,
  original: nil,
  replacement: "}"
}

Action Types:

"added missing closing brace" / "added missing closing bracket"
"removed extra closing brace" / "removed extra closing bracket"
"removed extra opening brace" / "removed extra opening bracket"
"fixed array-object mismatch: changed } to ]"
"fixed object-array mismatch: changed ] to }"

Error Handling

Layer 2 includes comprehensive error handling:

Input Validation: Ensures input is a binary string
State Machine Protection: Catches and reports processing errors
Option Validation: Validates all configuration options with detailed error messages
Graceful Degradation: Returns error tuples instead of raising exceptions

Performance Characteristics

Time Complexity

Linear: O(n) where n is input string length
Single-pass: Character-by-character processing
Memory Efficient: Context stack size bounded by nesting depth

Space Complexity

Context Stack: O(d) where d is maximum nesting depth
Result Building: O(n) for character accumulation
Repair Tracking: O(r) where r is number of repairs needed

Optimization Features

Early Detection: supports?/1 provides fast pre-screening
Minimal Copying: In-place character processing where possible
Efficient String Handling: Uses grapheme-based iteration

Integration with JSON Remedy Pipeline

Layer Interaction

Input: Receives output from Layer 1 (Content Cleaning)
Output: Provides structurally-corrected JSON to Layer 3
Context Preservation: Maintains repair history across layers

Pipeline Position

Layer 1 → Layer 2 → Layer 3 → ... → Layer N
Content   Structural  Next Layer
Cleaning   Repair

Repair Context Flow

# Input context from Layer 1
%{repairs: [layer1_repairs], options: [...]}

# Output context to Layer 3  
%{
  repairs: [layer1_repairs] ++ [layer2_repairs],
  options: [...],
  metadata: %{layer2_processed: true}
}

Common Use Cases

1. Truncated API Responses

// Incomplete response from network timeout
{"users": [{"name": "Alice", "email": "[email protected]"

// Layer 2 repair
{"users": [{"name": "Alice", "email": "[email protected]"}]}

2. Copy-Paste Errors

// Extra brackets from editor selection
[["item1", "item2", "item3"]]

// Layer 2 cleanup
["item1", "item2", "item3"]

3. Template Generation Issues

// Wrong delimiter from template substitution  
{"config": ["setting1", "setting2"}

// Layer 2 correction
{"config": ["setting1", "setting2"]}

4. Deep Nesting Repairs

// Complex incomplete structure
{"a": {"b": {"c": {"d": [1, 2, {"e": "value"

// Layer 2 completion
{"a": {"b": {"c": {"d": [1, 2, {"e": "value"}]}}}}

Testing & Validation

Layer 2 includes comprehensive test coverage:

329 lines of unit tests covering all repair scenarios
Edge case handling: Empty inputs, string boundaries, escape sequences
Complex nesting: Deep hierarchical structures with mixed containers
Performance testing: Large input handling and timeout behavior
Property-based testing: Randomized structural corruption and repair

Test Categories

Missing Delimiter Tests: Incomplete objects and arrays
Extra Delimiter Tests: Redundant wrapping and duplication
Mismatch Tests: Wrong closing delimiter types
Complex Nesting Tests: Multi-level structural issues
String Handling Tests: Delimiters within string literals
Edge Case Tests: Empty structures, single characters

Debugging & Diagnostics

Layer Support Detection

StructuralRepair.supports?('{"incomplete": "json"')
# => true (missing closing brace detected)

Repair Action Inspection

{:ok, result, context} = StructuralRepair.process(input, %{repairs: [], options: []})
IO.inspect(context.repairs)
# => [%{layer: :structural_repair, action: "added missing closing brace", ...}]

Validation Diagnostics

StructuralRepair.validate_options([max_nesting_depth: -1])
# => {:error, "Option max_nesting_depth must be a positive integer, got: -1"}

Best Practices

Run After Content Cleaning: Layer 2 expects cleaned input from Layer 1
Configure Timeouts: Set appropriate :timeout_ms for large inputs
Monitor Repair Actions: Review repair logs for data quality insights
Validate Options: Always validate configuration before processing
Handle Errors Gracefully: Check return tuples and handle :error cases

Limitations & Considerations

Heuristic-Based: Extra delimiter detection uses simple heuristics
Context Preservation: Always prefers to maintain original structure intent
Single-Pass Processing: No backtracking for complex ambiguous cases
Performance Bounds: Processing time grows linearly with input size
String Delimiter Handling: Delimiters within strings are preserved correctly

Layer 2 provides robust structural repair capabilities that handle the majority of real-world JSON corruption scenarios while maintaining high performance and detailed repair tracking.