
LAYER3 QUAD

Documentation for LAYER3_QUAD from the JsonRemedy repository.

Layer 3 Quadratic Performance Optimization Plan

Executive Summary

Layer 3 (Syntax Normalization) exhibits clear quadratic O(n²) performance characteristics that become the dominant bottleneck in JsonRemedy processing. Performance analysis reveals:

  • 10 objects (4.3KB): Layer 3 = 133.9ms
  • 25 objects (10.8KB): Layer 3 = 835.9ms
  • 50 objects (21.6KB): Layer 3 = 3354.6ms
  • 100 objects (43.3KB): Layer 3 = 12877.4ms

Layer 3 accounts for >95% of total processing time with clear quadratic scaling. This plan provides a systematic approach to optimize Layer 3 from O(n²) to O(n) performance.

1. Performance Analysis and Root Cause Identification

1.1 Profiling Target Functions

Priority 1 - Critical Bottlenecks (String Concatenation):

  • quote_unquoted_keys_char_by_char/7 - Character-by-character processing with result <> char
  • replace_all_literals_single_pass/8 - Literal replacement with string building
  • add_missing_commas_recursive/9 - Comma insertion with accumulator concatenation
  • remove_trailing_commas_recursive/7 - Comma removal with string building

Priority 2 - Character Access Inefficiencies:

  • parse_characters/2 - Uses String.at(content, position) in loops
  • process_character/3 - Character-by-character with position tracking
  • consume_identifier/2 and consume_whitespace/2 - Sequential character access

Priority 3 - Context Tracking:

  • inside_string?/2 - Expensive string slicing for context determination
  • maybe_quote_next_key/4 - Look-ahead parsing with repeated character access

1.2 Specific Performance Issues

String Concatenation Patterns

# CURRENT: O(n²) - Creates new string on every character
result <> char                    # in quote_unquoted_keys_char_by_char/7
acc <> char_str                   # in add_missing_commas_recursive/9
state.result <> char              # in process_character/3

Character Access Patterns

# CURRENT: O(n) per access - String.at/2 scans from beginning
char = String.at(input, pos)       # Called in tight loops
String.length(input)               # Called repeatedly for bounds checking

Context Determination

# CURRENT: O(n) per position check
before = String.slice(input, 0, position)
quote_count = count_unescaped_quotes(before)  # Scans entire prefix

2. Optimization Strategy

2.1 String Building Optimization

Replace String Concatenation with IO Lists

Convert all string building operations to use IO lists, which provide O(1) append operations:

# BEFORE: O(n²)
defp quote_unquoted_keys_char_by_char(input, result, pos, ...) do
  char = String.at(input, pos)
  quote_unquoted_keys_char_by_char(input, result <> char, pos + 1, ...)
end

# AFTER: O(n) string building (String.at/2 here is still O(n) per call;
# character access is optimized separately in section 2.2)
defp quote_unquoted_keys_iolist(input, result_iolist, pos, ...) do
  char = String.at(input, pos)
  quote_unquoted_keys_iolist(input, [result_iolist, char], pos + 1, ...)
end

Implementation Steps:

  1. Convert result parameter from String.t() to iodata()
  2. Use [acc, char] instead of acc <> char
  3. Call IO.iodata_to_binary/1 only at final result
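As a sketch of steps 1–3, a hypothetical wrapper (names are illustrative, not the module's actual API) threads an empty iolist through the existing recursion, assumes the recursive function returns {result_iolist, repairs}, and flattens the iolist exactly once at the end:

# Hypothetical wrapper: accumulate into an iolist, flatten once at the very end.
defp quote_unquoted_keys(input) do
  {result_iolist, repairs} =
    quote_unquoted_keys_iolist(input, [], 0, false, false, nil, [])

  {IO.iodata_to_binary(result_iolist), repairs}
end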

2.2 Binary Pattern Matching Optimization

Replace String.at/2 with Binary Pattern Matching

Convert character-by-character processing to use efficient binary pattern matching:

# BEFORE: O(n) per character access
defp parse_characters(content, state) do
  if state.position >= String.length(content) do
    state
  else
    char = String.at(content, state.position)
    new_state = process_character(char, content, state)
    parse_characters(content, %{new_state | position: state.position + 1})
  end
end

# AFTER: O(1) per character access
defp parse_characters_binary(<<char::utf8, rest::binary>>, state) do
  new_state = process_character_binary(char, state)
  parse_characters_binary(rest, new_state)
end

defp parse_characters_binary(<<>>, state), do: state

Implementation Steps:

  1. Replace position-based indexing with binary destructuring
  2. Process one character at a time with pattern matching
  3. Pass remaining binary to recursive calls
  4. Handle UTF-8 characters correctly with ::utf8 pattern
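A minimal sketch of the same pattern for consume_whitespace_binary/2, assuming @whitespace_chars is a list of codepoints; the accumulator stays an iolist and the unconsumed rest of the binary is handed back to the caller:

@whitespace_chars [?\s, ?\t, ?\n, ?\r]

# Consume leading whitespace one codepoint at a time; O(1) per character.
defp consume_whitespace_binary(<<char::utf8, rest::binary>>, acc)
     when char in @whitespace_chars do
  consume_whitespace_binary(rest, [acc, <<char::utf8>>])
end

# First non-whitespace character (or end of input): stop and return the rest.
defp consume_whitespace_binary(rest, acc), do: {acc, rest}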

2.3 Single-Pass Processing

Combine Multiple Operations into Single Pass

Instead of multiple sequential passes through the input, perform all operations in a single traversal:

# BEFORE: Multiple passes (O(n) × passes = O(mn))
{quoted_content, quote_repairs} = quote_unquoted_keys(input)
{normalized_content, norm_repairs} = normalize_quotes(quoted_content)
{literal_content, lit_repairs} = normalize_literals(normalized_content)
{comma_content, comma_repairs} = fix_commas(literal_content)

# AFTER: Single pass (O(n))
{final_content, all_repairs} = normalize_syntax_single_pass(input)

Implementation Steps:

  1. Create unified state machine that tracks all normalization needs
  2. Handle quote normalization, literal replacement, and comma fixes in one pass
  3. Maintain context for string boundaries and escape sequences
  4. Collect all repairs in single traversal
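For step 4, a hypothetical helper can prepend each repair to the shared state as transformations fire, so the single traversal produces the complete repair log (the map shape shown is illustrative, not the library's actual repair_action structure):

# Hypothetical helper: record a repair without a second pass over the input.
# Entries are prepended in O(1) and reversed once when processing finishes.
defp add_repair(state, action, position) do
  %{state | repairs: [%{action: action, position: position} | state.repairs]}
end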

2.4 Context Optimization

Efficient String Context Tracking

Replace expensive substring operations with stateful context tracking:

# BEFORE: O(n) context check
def inside_string?(input, position) do
  before = String.slice(input, 0, position)  # O(n) operation
  quote_count = count_unescaped_quotes(before)  # O(n) operation
  rem(quote_count, 2) != 0
end

# AFTER: O(1) context check
defp track_string_context(char, state) do
  cond do
    state.escape_next -> %{state | escape_next: false}
    state.in_string && char == "\\" -> %{state | escape_next: true}
    state.in_string && char == state.quote_char -> %{state | in_string: false, quote_char: nil}
    !state.in_string && char in ["\"", "'"] -> %{state | in_string: true, quote_char: char}
    true -> state
  end
end

3. Implementation Plan

3.1 Phase 1: Profiling and Baseline (Week 1)

Day 1-2: Detailed Function Profiling

  • Instrument individual Layer 3 functions with timing
  • Identify exact bottleneck functions and call frequency
  • Create performance regression test suite
  • Document current memory usage patterns
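One lightweight way to instrument individual functions is to wrap suspect call sites with :timer.tc/1, as in this sketch (the label and wrapped call are placeholders):

# Sketch: time a single call and log milliseconds spent in it.
defp timed(label, fun) do
  {micros, result} = :timer.tc(fun)
  IO.puts("#{label}: #{micros / 1000} ms")
  result
end

# Hypothetical call site:
# timed("quote_unquoted_keys", fn -> quote_unquoted_keys(input) end)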

Day 3-4: Create Optimization Test Framework

  • Build micro-benchmarks for each target function
  • Set up automated performance monitoring
  • Create test cases with varying input sizes (10, 100, 1K, 10K objects)
  • Establish baseline measurements
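A micro-benchmark sketch, assuming Benchee is added as a dev/test dependency and that create_malformed_json/1 is a test helper producing inputs of the given object count:

inputs =
  for size <- [10, 100, 1_000, 10_000], into: %{} do
    {"#{size} objects", create_malformed_json(size)}
  end

Benchee.run(
  %{
    "layer3 process" => fn input ->
      SyntaxNormalization.process(input, %{repairs: [], options: []})
    end
  },
  inputs: inputs,
  time: 5,
  memory_time: 2
)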

Day 5: Memory Profiling

  • Profile memory usage during string concatenation operations
  • Identify memory fragmentation patterns
  • Measure garbage collection pressure
  • Document memory growth characteristics
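Garbage-collection pressure can be approximated from emulator statistics, as in this sketch (the input variable is a placeholder for a generated test document):

# Sketch: compare GC counts and reclaimed words before and after one run.
{gcs_before, words_before, _} = :erlang.statistics(:garbage_collection)

SyntaxNormalization.process(input, %{repairs: [], options: []})

{gcs_after, words_after, _} = :erlang.statistics(:garbage_collection)
IO.puts("GCs: #{gcs_after - gcs_before}, words reclaimed: #{words_after - words_before}")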

Deliverables:

  • Detailed performance profile of all Layer 3 functions
  • Benchmark suite with baseline measurements
  • Memory usage analysis report

3.2 Phase 2: String Building Optimization (Week 2)

Day 1-3: IO List Conversion

Target Functions:

# 1. quote_unquoted_keys_char_by_char/7
defp quote_unquoted_keys_iolist(input, result_iolist, pos, in_string, escape_next, quote_char, repairs)

# 2. replace_all_literals_single_pass/8  
defp replace_literals_iolist(input, result_iolist, pos, in_string, escape_next, quote_char, repairs, replacements)

# 3. add_missing_commas_recursive/9
defp add_commas_iolist(content, acc_iolist, in_string, escape_next, quote, pos, repairs, prev_token, in_object)

# 4. remove_trailing_commas_recursive/7
defp remove_commas_iolist(content, acc_iolist, in_string, escape_next, quote, pos, repairs)

Implementation Strategy:

  1. Convert result parameter from String.t() to iodata()
  2. Replace result <> char with [result, char]
  3. Replace result <> string with [result, string]
  4. Add IO.iodata_to_binary/1 call at function exit points
  5. Update all recursive calls to pass iodata

Day 4-5: Testing and Validation

  • Run benchmark suite to measure improvement
  • Verify functional correctness with existing test suite
  • Test with UTF-8 content to ensure binary handling is correct
  • Profile memory usage improvement

Expected Results:

  • 10-100x performance improvement for string building operations
  • Significant reduction in memory allocation and GC pressure
  • Linear scaling behavior for large inputs

3.3 Phase 3: Binary Pattern Matching (Week 3)

Day 1-3: Character Access Optimization

Target Functions:

# 1. Main character processing loop
defp parse_characters_binary(<<char::utf8, rest::binary>>, state)
defp parse_characters_binary(<<>>, state)

# 2. Quote processing
defp process_character_binary(char, state) when is_integer(char)

# 3. Utility functions
defp consume_identifier_binary(<<char::utf8, rest::binary>>, acc, char_count)
defp consume_whitespace_binary(<<char::utf8, rest::binary>>, acc) when char in @whitespace_chars

Implementation Strategy:

  1. Replace String.at(input, pos) with <<char::utf8, rest::binary>>
  2. Remove position tracking in favor of binary consumption
  3. Use pattern matching for character classification
  4. Handle empty binary case explicitly
  5. Ensure UTF-8 safety with ::utf8 pattern
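For step 3, classification can be expressed as guards over the decoded codepoint rather than string comparisons; the function name and categories below are illustrative only:

# Sketch: classify a codepoint with plain guards (no string allocation).
defp classify_char(?"), do: :double_quote
defp classify_char(?'), do: :single_quote
defp classify_char(?\\), do: :backslash
defp classify_char(char) when char in [?\s, ?\t, ?\n, ?\r], do: :whitespace
defp classify_char(char) when char in ?0..?9, do: :digit
defp classify_char(_char), do: :other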

Day 4-5: Performance Testing

  • Benchmark character access performance improvement
  • Test with various UTF-8 content (emoji, Asian characters, accented text)
  • Validate that binary pattern matching maintains UTF-8 correctness
  • Profile overall processing speed improvement

Expected Results:

  • 5-20x improvement in character access operations
  • Elimination of O(n) position-based indexing
  • Better memory efficiency due to reduced string creation

3.4 Phase 4: Single-Pass Processing (Week 4)

Day 1-3: Unified State Machine

Design unified state machine:

defmodule Layer3State do
  @type t :: %{
    result_iolist: iodata(),
    in_string: boolean(),
    escape_next: boolean(),
    quote_char: String.t() | nil,
    expecting: :key | :value | :colon | :comma_or_end,
    context_stack: [:object | :array],
    repairs: [repair_action()],
    # Operation-specific flags
    quote_normalization: boolean(),
    literal_replacement: boolean(),
    comma_fixing: boolean()
  }
end

defp normalize_syntax_single_pass(<<char::utf8, rest::binary>>, state) do
  new_state =
    state
    |> handle_string_context(char)
    |> handle_quote_normalization(char)
    |> handle_literal_replacement(char, rest)
    |> handle_comma_fixes(char)
    |> append_character(char)

  normalize_syntax_single_pass(rest, new_state)
end

# Terminating clause: the entire input has been consumed.
defp normalize_syntax_single_pass(<<>>, state), do: state

Implementation Strategy:

  1. Combine quote normalization, literal replacement, and comma fixing
  2. Maintain single parsing state with all context information
  3. Process each character through all transformation pipelines
  4. Use look-ahead for multi-character patterns (True, False, None, NULL)
  5. Collect all repairs in single list

Day 4-5: Integration and Testing

  • Integrate single-pass processor with existing API
  • Run comprehensive test suite to ensure all functionality preserved
  • Benchmark end-to-end performance improvement
  • Test edge cases and complex nested structures

Expected Results:

  • Elimination of multiple parsing passes
  • 2-5x overall performance improvement
  • Reduced function call overhead
  • Single traversal of input data

3.5 Phase 5: Context Optimization (Week 5)

Day 1-2: Stateful Context Tracking

Replace expensive context queries:

# BEFORE: O(n) context check
def inside_string?(input, position) do
  before = String.slice(input, 0, position)
  quote_count = count_unescaped_quotes(before)
  rem(quote_count, 2) != 0
end

# AFTER: O(1) context access
defp update_context(char, state) do
  case {state.in_string, state.escape_next, char} do
    {_, true, _} -> %{state | escape_next: false}
    {true, false, "\\"} -> %{state | escape_next: true}
    {true, false, quote} when quote == state.quote_char -> 
      %{state | in_string: false, quote_char: nil}
    {false, false, quote} when quote in ["\"", "'"] -> 
      %{state | in_string: true, quote_char: quote}
    _ -> state
  end
end

Day 3-4: Look-ahead Optimization

Optimize multi-character pattern matching:

defp match_literal_pattern(<<"True", rest::binary>>, state) when not state.in_string do
  {rest, add_literal_repair(state, "True", "true"), "true"}
end

defp match_literal_pattern(<<"False", rest::binary>>, state) when not state.in_string do
  {rest, add_literal_repair(state, "False", "false"), "false"}
end

defp match_literal_pattern(<<"None", rest::binary>>, state) when not state.in_string do
  {rest, add_literal_repair(state, "None", "null"), "null"}
end

# Fallback clause: no literal at this position, input and state pass through unchanged.
defp match_literal_pattern(rest, state), do: {rest, state, nil}

Day 5: Final Integration

  • Integrate context optimization with single-pass processor
  • Run full benchmark suite
  • Validate linear scaling behavior
  • Document performance improvements

4. Performance Validation

4.1 Success Criteria

Performance Targets:

  • Layer 3 processing time should scale linearly O(n) with input size
  • 10x improvement in processing speed for large inputs (>50KB)
  • Memory usage should remain proportional to input size (not quadratic)
  • No regression in functional correctness

Benchmark Targets:

  • 100 objects (43.3KB): Target < 500ms (vs current 12877ms)
  • 200 objects (86.6KB): Target < 1000ms (vs extrapolated 51s)
  • 500 objects (216KB): Target < 2500ms (vs extrapolated 5+ minutes)

4.2 Testing Strategy

Performance Tests:

defmodule Layer3PerformanceTest do
  use ExUnit.Case

  test "linear scaling with input size" do
    sizes = [10, 25, 50, 100, 200, 500]
    
    times = Enum.map(sizes, fn size ->
      input = create_malformed_json(size)
      {time, _result} = :timer.tc(fn ->
        SyntaxNormalization.process(input, %{repairs: [], options: []})
      end)
      {size, time}
    end)
    
    # Verify linear scaling (R² > 0.95 for linear fit)
    assert_linear_scaling(times)
  end
end
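The assert_linear_scaling/1 helper referenced above is not spelled out in this plan; one possible sketch fits a least-squares line to the (size, time) pairs and checks the coefficient of determination:

# Hypothetical helper: least-squares fit of time against size, then R² check.
defp assert_linear_scaling(points) do
  {xs, ys} = Enum.unzip(points)
  n = length(points)
  mx = Enum.sum(xs) / n
  my = Enum.sum(ys) / n

  sxy = xs |> Enum.zip(ys) |> Enum.map(fn {x, y} -> (x - mx) * (y - my) end) |> Enum.sum()
  sxx = xs |> Enum.map(&((&1 - mx) * (&1 - mx))) |> Enum.sum()
  slope = sxy / sxx
  intercept = my - slope * mx

  ss_res =
    xs
    |> Enum.zip(ys)
    |> Enum.map(fn {x, y} -> y - (slope * x + intercept) end)
    |> Enum.map(&(&1 * &1))
    |> Enum.sum()

  ss_tot = ys |> Enum.map(&((&1 - my) * (&1 - my))) |> Enum.sum()
  r_squared = 1.0 - ss_res / ss_tot

  assert r_squared > 0.95, "expected linear scaling, got R² = #{r_squared}"
end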

Memory Tests:

test "memory usage remains linear" do
  sizes = [10, 50, 100, 200]
  
  memory_usage = Enum.map(sizes, fn size ->
    input = create_malformed_json(size)
    :erlang.garbage_collect()
    memory_before = :erlang.process_info(self(), :memory) |> elem(1)
    
    SyntaxNormalization.process(input, %{repairs: [], options: []})
    
    memory_after = :erlang.process_info(self(), :memory) |> elem(1)
    {size, memory_after - memory_before}
  end)
  
  # Memory should grow linearly with input size
  assert_linear_memory_growth(memory_usage)
end

4.3 Regression Testing

Functional Correctness:

  • All existing Layer 3 tests must pass without modification
  • UTF-8 handling must remain correct
  • Edge cases (empty input, malformed JSON) must work
  • Repair action generation must be preserved

Integration Testing:

  • End-to-end JsonRemedy processing must produce identical results
  • Layer ordering and pipeline behavior must remain unchanged
  • Error handling and validation must work correctly

5. Implementation Details

5.1 File Structure

New Files:

lib/json_remedy/layer3/
├── syntax_normalization.ex           # Updated main module
├── optimized/
│   ├── iolist_builder.ex            # IO list utilities
│   ├── binary_parser.ex             # Binary pattern matching
│   ├── single_pass_processor.ex     # Unified processor
│   └── context_tracker.ex           # State management
└── performance/
    ├── benchmarks.ex                # Performance test suite
    └── profiler.ex                  # Instrumentation utilities

Modified Files:

lib/json_remedy/layer3/syntax_normalization.ex  # Main API preserved, implementation optimized
test/unit/layer3_syntax_normalization_test.exs  # Add performance tests
test/critical/performance_stress_layer_3_test.exs  # Update performance targets

5.2 Backwards Compatibility

API Preservation:

  • Public API functions maintain exact same signatures
  • Return value formats remain unchanged
  • Error handling behavior preserved
  • Repair action structure unchanged

Configuration Compatibility:

  • All existing options continue to work
  • Default behavior remains identical
  • Error messages and logging preserved

5.3 Migration Strategy

Gradual Rollout:

  1. Phase 1: Implement optimizations behind feature flag
  2. Phase 2: A/B test optimized vs original implementation
  3. Phase 3: Enable optimizations by default
  4. Phase 4: Remove original implementation after validation

Feature Flag:

@optimized_processing Application.compile_env(:json_remedy, :layer3_optimized, true)

def process(input, context) do
  if @optimized_processing do
    process_optimized(input, context)
  else
    process_original(input, context)
  end
end
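Because the flag is read with Application.compile_env/3, it is toggled at compile time from the host application's configuration, for example:

# config/config.exs — fall back to the original implementation
config :json_remedy, layer3_optimized: false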

6. Risk Mitigation

6.1 Technical Risks

Risk: Binary Pattern Matching UTF-8 Issues

  • Mitigation: Comprehensive UTF-8 test suite with emoji, Asian characters, combining characters
  • Fallback: Maintain String.at/2 implementation as backup

Risk: IO List Memory Usage

  • Mitigation: Monitor memory usage during development, implement streaming for very large inputs
  • Fallback: Hybrid approach using IO lists for small segments

Risk: Functional Regression

  • Mitigation: Comprehensive test coverage, property-based testing, fuzzing with random inputs
  • Fallback: Feature flag allows instant rollback to original implementation
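The property-based testing mentioned above could be sketched with StreamData, assuming it is available as a test dependency and that both code paths from the feature-flag sketch are reachable from tests:

use ExUnitProperties

property "optimized and original implementations agree" do
  check all input <- StreamData.string(:printable, max_length: 500) do
    context = %{repairs: [], options: []}

    assert SyntaxNormalization.process_optimized(input, context) ==
             SyntaxNormalization.process_original(input, context)
  end
end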

6.2 Performance Risks

Risk: Optimization Doesn’t Achieve Linear Scaling

  • Mitigation: Incremental optimization with measurement at each step
  • Contingency: Focus on most impactful optimizations first (string concatenation)

Risk: Memory Usage Increases

  • Mitigation: Memory profiling throughout development
  • Contingency: Implement memory-conscious alternatives for large inputs

6.3 Timeline Risks

Risk: Implementation Takes Longer Than Planned

  • Mitigation: Break into smaller, independently valuable pieces
  • Contingency: Prioritize string concatenation fixes which provide biggest impact

7. Success Metrics

7.1 Performance Metrics

Primary Metrics:

  • Processing time scales linearly with input size (R² > 0.95)
  • 10x improvement for 100-object inputs
  • Memory usage remains proportional to input size

Secondary Metrics:

  • Reduced garbage collection pressure
  • Lower CPU utilization per character processed
  • Improved throughput (objects/second)

7.2 Quality Metrics

Correctness:

  • 100% test suite pass rate
  • No functional regressions
  • Identical repair action generation

Reliability:

  • No new error conditions introduced
  • Graceful handling of edge cases
  • Consistent behavior across input sizes

7.3 Monitoring Plan

Development Monitoring:

  • Continuous benchmarking in CI/CD pipeline
  • Memory usage tracking for large inputs
  • Performance regression alerts

Production Monitoring:

  • Processing time percentiles (P50, P95, P99)
  • Memory usage patterns
  • Error rate monitoring
  • Throughput measurement

8. Conclusion

This optimization plan addresses the quadratic performance bottleneck in Layer 3 through systematic improvements to string building, character access, and processing architecture. The phased approach ensures minimal risk while delivering significant performance improvements.

Expected Outcomes:

  • Performance: 10-100x improvement for large inputs
  • Scalability: Linear O(n) scaling replaces quadratic O(n²)
  • Memory: Reduced allocation and garbage collection pressure
  • Maintainability: Cleaner, more efficient code architecture

Timeline: 5 weeks for complete implementation with continuous validation and testing throughout.

The optimization preserves all existing functionality while dramatically improving performance, making JsonRemedy suitable for processing large JSON documents efficiently.