Layer 3 Quadratic Performance Optimization Plan
Executive Summary
Layer 3 (Syntax Normalization) scales quadratically, O(n²), with input size and is the dominant bottleneck in JsonRemedy processing. Performance analysis reveals:
- 10 objects (4.3KB): Layer 3 = 133.9ms
- 25 objects (10.8KB): Layer 3 = 835.9ms
- 50 objects (21.6KB): Layer 3 = 3354.6ms
- 100 objects (43.3KB): Layer 3 = 12877.4ms
Layer 3 accounts for >95% of total processing time with clear quadratic scaling. This plan provides a systematic approach to optimize Layer 3 from O(n²) to O(n) performance.
1. Performance Analysis and Root Cause Identification
1.1 Profiling Target Functions
Priority 1 - Critical Bottlenecks (String Concatenation):
- quote_unquoted_keys_char_by_char/7 - character-by-character processing with result <> char
- replace_all_literals_single_pass/8 - literal replacement with string building
- add_missing_commas_recursive/9 - comma insertion with accumulator concatenation
- remove_trailing_commas_recursive/7 - comma removal with string building
Priority 2 - Character Access Inefficiencies:
- parse_characters/2 - uses String.at(content, position) in loops
- process_character/3 - character-by-character processing with position tracking
- consume_identifier/2 and consume_whitespace/2 - sequential character access
Priority 3 - Context Tracking:
- inside_string?/2 - expensive string slicing for context determination
- maybe_quote_next_key/4 - look-ahead parsing with repeated character access
1.2 Specific Performance Issues
String Concatenation Patterns
# CURRENT: O(n²) - Creates new string on every character
result <> char # in quote_unquoted_keys_char_by_char/7
acc <> char_str # in add_missing_commas_recursive/9
state.result <> char # in process_character/3
Character Access Patterns
# CURRENT: O(n) per access - String.at/2 scans from beginning
char = String.at(input, pos) # Called in tight loops
String.length(input) # Called repeatedly for bounds checking
Context Determination
# CURRENT: O(n) per position check
before = String.slice(input, 0, position)
quote_count = count_unescaped_quotes(before) # Scans entire prefix
2. Optimization Strategy
2.1 String Building Optimization
Replace String Concatenation with IO Lists
Convert all string building operations to use IO lists, which provide O(1) append operations:
# BEFORE: O(n²) - each append copies the accumulated result
defp quote_unquoted_keys_char_by_char(input, result, pos, ...) do
  char = String.at(input, pos)
  quote_unquoted_keys_char_by_char(input, result <> char, pos + 1, ...)
end

# AFTER: O(1) per append via iodata (String.at/2 is still O(n) per access; Section 2.2 removes it)
defp quote_unquoted_keys_iolist(input, result_iolist, pos, ...) do
  char = String.at(input, pos)
  quote_unquoted_keys_iolist(input, [result_iolist, char], pos + 1, ...)
end
Implementation Steps:
- Convert the result parameter from String.t() to iodata()
- Use [acc, char] instead of acc <> char
- Call IO.iodata_to_binary/1 only on the final result
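The win comes from the shape of the data: iodata appends nest lists instead of copying bytes, and the bytes are copied exactly once, inside IO.iodata_to_binary/1. A standalone illustration (not JsonRemedy code; note the BEAM can sometimes optimize binary appends, but that optimization does not apply to Layer 3's accumulators, which are matched and passed between functions):

# Binary concatenation: may copy the accumulated binary on each step
slow = Enum.reduce(1..10_000, "", fn _, acc -> acc <> "x" end)

# Iodata: each append is O(1) list nesting; one O(n) flattening pass at the end
fast =
  1..10_000
  |> Enum.reduce([], fn _, acc -> [acc, "x"] end)
  |> IO.iodata_to_binary()

slow == fast
# => true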
2.2 Binary Pattern Matching Optimization
Replace String.at/2 with Binary Pattern Matching
Convert character-by-character processing to use efficient binary pattern matching:
# BEFORE: O(n) per character access
defp parse_characters(content, state) do
  if state.position >= String.length(content) do
    state
  else
    char = String.at(content, state.position)
    new_state = process_character(char, content, state)
    parse_characters(content, %{new_state | position: state.position + 1})
  end
end

# AFTER: O(1) per character access
defp parse_characters_binary(<<char::utf8, rest::binary>>, state) do
  new_state = process_character_binary(char, state)
  parse_characters_binary(rest, new_state)
end

defp parse_characters_binary(<<>>, state), do: state
Implementation Steps:
- Replace position-based indexing with binary destructuring
- Process one character at a time with pattern matching
- Pass the remaining binary to recursive calls
- Handle UTF-8 characters correctly with the ::utf8 pattern
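One detail worth calling out: ::utf8 binds the whole Unicode codepoint as an integer and consumes one to four bytes, so multi-byte characters are never split. A quick check in iex:

<<first::utf8, rest::binary>> = "héllo"
first == ?h # => true
<<second::utf8, _::binary>> = rest
second == 0xE9 # => true; "é" (U+00E9) is matched as one codepoint, not two bytes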
2.3 Single-Pass Processing
Combine Multiple Operations into Single Pass
Instead of multiple sequential passes through the input, perform all operations in a single traversal:
# BEFORE: m sequential passes, each O(n), each allocating a full intermediate string
{quoted_content, quote_repairs} = quote_unquoted_keys(input)
{normalized_content, norm_repairs} = normalize_quotes(quoted_content)
{literal_content, lit_repairs} = normalize_literals(normalized_content)
{comma_content, comma_repairs} = fix_commas(literal_content)
# AFTER: Single pass (O(n))
{final_content, all_repairs} = normalize_syntax_single_pass(input)
Implementation Steps:
- Create unified state machine that tracks all normalization needs
- Handle quote normalization, literal replacement, and comma fixes in one pass
- Maintain context for string boundaries and escape sequences
- Collect all repairs in single traversal
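To make the shape concrete before the full design in Phase 4, here is a minimal, self-contained single-pass sketch that handles just one concern (Python boolean literals) while tracking string context and building iodata; all names are illustrative and escape handling is omitted for brevity:

defmodule SinglePassSketch do
  # One traversal: literal replacement + string-context tracking + iodata building
  def normalize(input), do: walk(input, %{in_string: false, out: [], repairs: []})

  defp walk(<<"True", rest::binary>>, %{in_string: false} = s),
    do: walk(rest, %{s | out: [s.out, "true"], repairs: ["True -> true" | s.repairs]})

  defp walk(<<"False", rest::binary>>, %{in_string: false} = s),
    do: walk(rest, %{s | out: [s.out, "false"], repairs: ["False -> false" | s.repairs]})

  defp walk(<<?", rest::binary>>, s),
    do: walk(rest, %{s | in_string: !s.in_string, out: [s.out, ?"]})

  defp walk(<<char::utf8, rest::binary>>, s),
    do: walk(rest, %{s | out: [s.out, <<char::utf8>>]})

  defp walk(<<>>, s), do: {IO.iodata_to_binary(s.out), Enum.reverse(s.repairs)}
end

# SinglePassSketch.normalize(~s({"ok": True}))
# => {~s({"ok": true}), ["True -> true"]}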
2.4 Context Optimization
Efficient String Context Tracking
Replace expensive substring operations with stateful context tracking:
# BEFORE: O(n) context check
def inside_string?(input, position) do
  before = String.slice(input, 0, position) # O(n) operation
  quote_count = count_unescaped_quotes(before) # O(n) operation
  rem(quote_count, 2) != 0
end

# AFTER: O(1) context check
defp track_string_context(char, state) do
  cond do
    state.escape_next -> %{state | escape_next: false}
    state.in_string && char == "\\" -> %{state | escape_next: true}
    state.in_string && char == state.quote_char -> %{state | in_string: false, quote_char: nil}
    !state.in_string && char in ["\"", "'"] -> %{state | in_string: true, quote_char: char}
    true -> state
  end
end
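Because the tracker takes the character first and the state second, it folds directly over the input's graphemes (illustrative input and initial state; in the optimized layer the same update runs inline in the binary-matching loop):

input = ~s({"key": 'val'})
initial = %{in_string: false, escape_next: false, quote_char: nil}

final_state =
  input
  |> String.graphemes()
  |> Enum.reduce(initial, &track_string_context/2)

final_state.in_string # => false (all strings closed)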
3. Implementation Plan
3.1 Phase 1: Profiling and Baseline (Week 1)
Day 1-2: Detailed Function Profiling
- Instrument individual Layer 3 functions with timing
- Identify exact bottleneck functions and call frequency
- Create performance regression test suite
- Document current memory usage patterns
Day 3-4: Create Optimization Test Framework
- Build micro-benchmarks for each target function
- Set up automated performance monitoring
- Create test cases with varying input sizes (10, 100, 1K, 10K objects)
- Establish baseline measurements
Day 5: Memory Profiling
- Profile memory usage during string concatenation operations
- Identify memory fragmentation patterns
- Measure garbage collection pressure
- Document memory growth characteristics
Deliverables:
- Detailed performance profile of all Layer 3 functions
- Benchmark suite with baseline measurements
- Memory usage analysis report
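For the Day 1-2 per-function instrumentation, a minimal wrapper over :timer.tc/1 is usually sufficient (hypothetical helper, not an existing JsonRemedy API):

# Times a zero-arity function, prints elapsed milliseconds, returns the result
defp timed(label, fun) do
  {micros, result} = :timer.tc(fun)
  IO.puts("#{label}: #{Float.round(micros / 1000, 1)}ms")
  result
end

# Usage inside the layer, e.g.:
# timed("quote_unquoted_keys", fn -> quote_unquoted_keys(input) end)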
3.2 Phase 2: String Building Optimization (Week 2)
Day 1-3: IO List Conversion
Target Functions:
# 1. quote_unquoted_keys_char_by_char/7
defp quote_unquoted_keys_iolist(input, result_iolist, pos, in_string, escape_next, quote_char, repairs)
# 2. replace_all_literals_single_pass/8
defp replace_literals_iolist(input, result_iolist, pos, in_string, escape_next, quote_char, repairs, replacements)
# 3. add_missing_commas_recursive/9
defp add_commas_iolist(content, acc_iolist, in_string, escape_next, quote, pos, repairs, prev_token, in_object)
# 4. remove_trailing_commas_recursive/7
defp remove_commas_iolist(content, acc_iolist, in_string, escape_next, quote, pos, repairs)
Implementation Strategy:
- Convert the result parameter from String.t() to iodata()
- Replace result <> char with [result, char]
- Replace result <> string with [result, string]
- Add an IO.iodata_to_binary/1 call at function exit points
- Update all recursive calls to pass iodata
Day 4-5: Testing and Validation
- Run benchmark suite to measure improvement
- Verify functional correctness with existing test suite
- Test with UTF-8 content to ensure binary handling is correct
- Profile memory usage improvement
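For the benchmark runs, a Benchee comparison puts throughput and memory of the two implementations side by side (assumes the benchee dev dependency; function names and initial arguments are illustrative):

input = create_malformed_json(100)

Benchee.run(
  %{
    "concat (original)" => fn -> quote_unquoted_keys(input) end,
    "iolist (optimized)" => fn -> quote_unquoted_keys_iolist(input, [], 0, false, false, nil, []) end
  },
  memory_time: 2
)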
Expected Results:
- 10-100x performance improvement for string building operations
- Significant reduction in memory allocation and GC pressure
- Linear scaling behavior for large inputs
3.3 Phase 3: Binary Pattern Matching (Week 3)
Day 1-3: Character Access Optimization
Target Functions:
# 1. Main character processing loop
defp parse_characters_binary(<<char::utf8, rest::binary>>, state)
defp parse_characters_binary(<<>>, state)
# 2. Quote processing
defp process_character_binary(char, state) when is_integer(char)
# 3. Utility functions
defp consume_identifier_binary(<<char::utf8, rest::binary>>, acc, char_count)
defp consume_whitespace_binary(<<char::utf8, rest::binary>>, acc) when char in @whitespace_chars
Implementation Strategy:
- Replace String.at(input, pos) with <<char::utf8, rest::binary>> destructuring
- Remove position tracking in favor of binary consumption
- Use pattern matching for character classification
- Handle the empty binary case explicitly
- Ensure UTF-8 safety with the ::utf8 pattern
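As a concrete instance of the utility conversions listed above, consume_whitespace_binary/2 might look like this (a sketch; @whitespace_chars is assumed to be a compile-time list of codepoints so it can appear in a guard):

@whitespace_chars [?\s, ?\t, ?\n, ?\r]

# Consumes leading whitespace; returns the remaining binary and a count
defp consume_whitespace_binary(<<char::utf8, rest::binary>>, count)
     when char in @whitespace_chars do
  consume_whitespace_binary(rest, count + 1)
end

defp consume_whitespace_binary(rest, count), do: {rest, count}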
Day 4-5: Performance Testing
- Benchmark character access performance improvement
- Test with various UTF-8 content (emoji, Asian characters, accented text)
- Validate that binary pattern matching maintains UTF-8 correctness
- Profile overall processing speed improvement
Expected Results:
- 5-20x improvement in character access operations
- Elimination of O(n) position-based indexing
- Better memory efficiency due to reduced string creation
3.4 Phase 4: Single-Pass Processing (Week 4)
Day 1-3: Unified State Machine
Design the unified state machine:
defmodule Layer3State do
  @type t :: %{
    result_iolist: iodata(),
    in_string: boolean(),
    escape_next: boolean(),
    quote_char: String.t() | nil,
    expecting: :key | :value | :colon | :comma_or_end,
    context_stack: [:object | :array],
    repairs: [repair_action()],
    # Operation-specific flags
    quote_normalization: boolean(),
    literal_replacement: boolean(),
    comma_fixing: boolean()
  }
end

defp normalize_syntax_single_pass(<<char::utf8, rest::binary>>, state) do
  new_state =
    state
    |> handle_string_context(char)
    |> handle_quote_normalization(char)
    |> handle_literal_replacement(char, rest)
    |> handle_comma_fixes(char)
    |> append_character(char)

  normalize_syntax_single_pass(rest, new_state)
end

# Termination clause: end of input
defp normalize_syntax_single_pass(<<>>, state), do: state
Implementation Strategy:
- Combine quote normalization, literal replacement, and comma fixing
- Maintain single parsing state with all context information
- Process each character through all transformation pipelines
- Use look-ahead for multi-character patterns (True, False, None, NULL)
- Collect all repairs in single list
Day 4-5: Integration and Testing
- Integrate single-pass processor with existing API
- Run comprehensive test suite to ensure all functionality preserved
- Benchmark end-to-end performance improvement
- Test edge cases and complex nested structures
Expected Results:
- Elimination of multiple parsing passes
- 2-5x overall performance improvement
- Reduced function call overhead
- Single traversal of input data
3.5 Phase 5: Context Optimization (Week 5)
Day 1-2: Stateful Context Tracking
Replace expensive context queries:
# BEFORE: O(n) context check
def inside_string?(input, position) do
  before = String.slice(input, 0, position)
  quote_count = count_unescaped_quotes(before)
  rem(quote_count, 2) != 0
end

# AFTER: O(1) context access
defp update_context(char, state) do
  case {state.in_string, state.escape_next, char} do
    {_, true, _} -> %{state | escape_next: false}
    {true, false, "\\"} -> %{state | escape_next: true}
    {true, false, quote} when quote == state.quote_char ->
      %{state | in_string: false, quote_char: nil}
    {false, false, quote} when quote in ["\"", "'"] ->
      %{state | in_string: true, quote_char: quote}
    _ -> state
  end
end
Day 3-4: Look-ahead Optimization
Optimize multi-character pattern matching:
defp match_literal_pattern(<<"True", rest::binary>>, state) when not state.in_string do
{rest, add_literal_repair(state, "True", "true"), "true"}
end
defp match_literal_pattern(<<"False", rest::binary>>, state) when not state.in_string do
{rest, add_literal_repair(state, "False", "false"), "false"}
end
defp match_literal_pattern(<<"None", rest::binary>>, state) when not state.in_string do
{rest, add_literal_repair(state, "None", "null"), "null"}
end
Day 5: Final Integration
- Integrate context optimization with single-pass processor
- Run full benchmark suite
- Validate linear scaling behavior
- Document performance improvements
4. Performance Validation
4.1 Success Criteria
Performance Targets:
- Layer 3 processing time should scale linearly O(n) with input size
- 10x improvement in processing speed for large inputs (>50KB)
- Memory usage should remain proportional to input size (not quadratic)
- No regression in functional correctness
Benchmark Targets:
- 100 objects (43.3KB): Target < 500ms (vs current 12877ms)
- 200 objects (86.6KB): Target < 1000ms (vs extrapolated 51s)
- 500 objects (216KB): Target < 2500ms (vs extrapolated 5+ minutes)
4.2 Testing Strategy
Performance Tests:
defmodule Layer3PerformanceTest do
  test "linear scaling with input size" do
    sizes = [10, 25, 50, 100, 200, 500]

    times =
      Enum.map(sizes, fn size ->
        input = create_malformed_json(size)

        {time, _result} =
          :timer.tc(fn ->
            SyntaxNormalization.process(input, %{repairs: [], options: []})
          end)

        {size, time}
      end)

    # Verify linear scaling (R² > 0.95 for linear fit)
    assert_linear_scaling(times)
  end
end
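assert_linear_scaling/1 is left undefined above; one way to implement it is a least-squares fit whose R² (for a simple linear fit, the squared Pearson correlation) must exceed 0.95 (illustrative sketch):

# R² of a simple linear fit equals the squared Pearson correlation
defp assert_linear_scaling(points) do
  {xs, ys} = Enum.unzip(points)
  n = length(points)
  mean_x = Enum.sum(xs) / n
  mean_y = Enum.sum(ys) / n

  cov = Enum.zip(xs, ys) |> Enum.map(fn {x, y} -> (x - mean_x) * (y - mean_y) end) |> Enum.sum()
  var_x = Enum.map(xs, fn x -> (x - mean_x) * (x - mean_x) end) |> Enum.sum()
  var_y = Enum.map(ys, fn y -> (y - mean_y) * (y - mean_y) end) |> Enum.sum()

  r_squared = cov * cov / (var_x * var_y)
  assert r_squared > 0.95, "expected linear scaling, got R² = #{r_squared}"
end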
Memory Tests:
test "memory usage remains linear" do
sizes = [10, 50, 100, 200]
memory_usage = Enum.map(sizes, fn size ->
input = create_malformed_json(size)
:erlang.garbage_collect()
memory_before = :erlang.process_info(self(), :memory) |> elem(1)
SyntaxNormalization.process(input, %{repairs: [], options: []})
memory_after = :erlang.process_info(self(), :memory) |> elem(1)
{size, memory_after - memory_before}
end)
# Memory should grow linearly with input size
assert_linear_memory_growth(memory_usage)
end
4.3 Regression Testing
Functional Correctness:
- All existing Layer 3 tests must pass without modification
- UTF-8 handling must remain correct
- Edge cases (empty input, malformed JSON) must work
- Repair action generation must be preserved
Integration Testing:
- End-to-end JsonRemedy processing must produce identical results
- Layer ordering and pipeline behavior must remain unchanged
- Error handling and validation must work correctly
5. Implementation Details
5.1 File Structure
New Files:
lib/json_remedy/layer3/
├── syntax_normalization.ex # Updated main module
├── optimized/
│ ├── iolist_builder.ex # IO list utilities
│ ├── binary_parser.ex # Binary pattern matching
│ ├── single_pass_processor.ex # Unified processor
│ └── context_tracker.ex # State management
└── performance/
├── benchmarks.ex # Performance test suite
└── profiler.ex # Instrumentation utilities
Modified Files:
lib/json_remedy/layer3/syntax_normalization.ex # Main API preserved, implementation optimized
test/unit/layer3_syntax_normalization_test.exs # Add performance tests
test/critical/performance_stress_layer_3_test.exs # Update performance targets
5.2 Backwards Compatibility
API Preservation:
- Public API functions maintain exact same signatures
- Return value formats remain unchanged
- Error handling behavior preserved
- Repair action structure unchanged
Configuration Compatibility:
- All existing options continue to work
- Default behavior remains identical
- Error messages and logging preserved
5.3 Migration Strategy
Gradual Rollout:
- Phase 1: Implement optimizations behind feature flag
- Phase 2: A/B test optimized vs original implementation
- Phase 3: Enable optimizations by default
- Phase 4: Remove original implementation after validation
Feature Flag:
@optimized_processing Application.compile_env(:json_remedy, :layer3_optimized, true)

def process(input, context) do
  if @optimized_processing do
    process_optimized(input, context)
  else
    process_original(input, context)
  end
end
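Because the flag is read with Application.compile_env/3, it is fixed at build time via config (flag name as defined above):

# config/config.exs
import Config

# Opt out of the optimized path (default is true)
config :json_remedy, layer3_optimized: false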
6. Risk Mitigation
6.1 Technical Risks
Risk: Binary Pattern Matching UTF-8 Issues
- Mitigation: Comprehensive UTF-8 test suite with emoji, Asian characters, combining characters
- Fallback: Maintain String.at/2 implementation as backup
Risk: IO List Memory Usage
- Mitigation: Monitor memory usage during development, implement streaming for very large inputs
- Fallback: Hybrid approach using IO lists for small segments
Risk: Functional Regression
- Mitigation: Comprehensive test coverage, property-based testing, fuzzing with random inputs
- Fallback: Feature flag allows instant rollback to original implementation
6.2 Performance Risks
Risk: Optimization Doesn't Achieve Linear Scaling
- Mitigation: Incremental optimization with measurement at each step
- Contingency: Focus on most impactful optimizations first (string concatenation)
Risk: Memory Usage Increases
- Mitigation: Memory profiling throughout development
- Contingency: Implement memory-conscious alternatives for large inputs
6.3 Timeline Risks
Risk: Implementation Takes Longer Than Planned
- Mitigation: Break into smaller, independently valuable pieces
- Contingency: Prioritize string concatenation fixes which provide biggest impact
7. Success Metrics
7.1 Performance Metrics
Primary Metrics:
- Processing time scales linearly with input size (R² > 0.95)
- 10x improvement for 100-object inputs
- Memory usage remains proportional to input size
Secondary Metrics:
- Reduced garbage collection pressure
- Lower CPU utilization per character processed
- Improved throughput (objects/second)
7.2 Quality Metrics
Correctness:
- 100% test suite pass rate
- No functional regressions
- Identical repair action generation
Reliability:
- No new error conditions introduced
- Graceful handling of edge cases
- Consistent behavior across input sizes
7.3 Monitoring Plan
Development Monitoring:
- Continuous benchmarking in CI/CD pipeline
- Memory usage tracking for large inputs
- Performance regression alerts
Production Monitoring:
- Processing time percentiles (P50, P95, P99)
- Memory usage patterns
- Error rate monitoring
- Throughput measurement
8. Conclusion
This optimization plan addresses the quadratic performance bottleneck in Layer 3 through systematic improvements to string building, character access, and processing architecture. The phased approach ensures minimal risk while delivering significant performance improvements.
Expected Outcomes:
- Performance: 10-100x improvement for large inputs
- Scalability: Linear O(n) scaling replaces quadratic O(n²)
- Memory: Reduced allocation and garbage collection pressure
- Maintainability: Cleaner, more efficient code architecture
Timeline: 5 weeks for complete implementation with continuous validation and testing throughout.
The optimization preserves all existing functionality while dramatically improving performance, making JsonRemedy suitable for processing large JSON documents efficiently.