Layer 3 Quadratic Performance Optimization Plan
Executive Summary
Layer 3 (Syntax Normalization) scales quadratically, O(n²), with input size and is the dominant bottleneck in JsonRemedy processing. Performance analysis reveals:
- 10 objects (4.3KB): Layer 3 = 133.9ms
- 25 objects (10.8KB): Layer 3 = 835.9ms
- 50 objects (21.6KB): Layer 3 = 3354.6ms
- 100 objects (43.3KB): Layer 3 = 12877.4ms
Layer 3 accounts for >95% of total processing time with clear quadratic scaling. This plan provides a systematic approach to optimize Layer 3 from O(n²) to O(n) performance.
1. Performance Analysis and Root Cause Identification
1.1 Profiling Target Functions
Priority 1 - Critical Bottlenecks (String Concatenation):
- quote_unquoted_keys_char_by_char/7 - character-by-character processing with result <> char
- replace_all_literals_single_pass/8 - literal replacement with string building
- add_missing_commas_recursive/9 - comma insertion with accumulator concatenation
- remove_trailing_commas_recursive/7 - comma removal with string building
Priority 2 - Character Access Inefficiencies:
- parse_characters/2 - uses String.at(content, position) in loops
- process_character/3 - character-by-character processing with position tracking
- consume_identifier/2 and consume_whitespace/2 - sequential character access
Priority 3 - Context Tracking:
- inside_string?/2 - expensive string slicing for context determination
- maybe_quote_next_key/4 - look-ahead parsing with repeated character access
1.2 Specific Performance Issues
String Concatenation Patterns
# CURRENT: O(n²) - Creates new string on every character
result <> char # in quote_unquoted_keys_char_by_char/7
acc <> char_str # in add_missing_commas_recursive/9
state.result <> char # in process_character/3
Character Access Patterns
# CURRENT: O(n) per access - String.at/2 scans from beginning
char = String.at(input, pos) # Called in tight loops
String.length(input) # Called repeatedly for bounds checking
Context Determination
# CURRENT: O(n) per position check
before = String.slice(input, 0, position)
quote_count = count_unescaped_quotes(before) # Scans entire prefix
2. Optimization Strategy
2.1 String Building Optimization
Replace String Concatenation with IO Lists
Convert all string building operations to use IO lists, which provide O(1) append operations:
# BEFORE: O(n²) - each append copies the accumulated result
defp quote_unquoted_keys_char_by_char(input, result, pos, ...) do
  char = String.at(input, pos)
  quote_unquoted_keys_char_by_char(input, result <> char, pos + 1, ...)
end

# AFTER: O(1) per append via iodata (String.at/2 is still O(n) per access; Section 2.2 removes it)
defp quote_unquoted_keys_iolist(input, result_iolist, pos, ...) do
  char = String.at(input, pos)
  quote_unquoted_keys_iolist(input, [result_iolist, char], pos + 1, ...)
end
Implementation Steps:
- Convert the result parameter from String.t() to iodata()
- Use [acc, char] instead of acc <> char
- Call IO.iodata_to_binary/1 only on the final result
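The win comes from the shape of the data: iodata appends nest lists instead of copying bytes, and the bytes are copied exactly once, inside IO.iodata_to_binary/1. A standalone illustration (not JsonRemedy code; note the BEAM can sometimes optimize binary appends, but that optimization does not apply to Layer 3's accumulators, which are matched and passed between functions):

# Binary concatenation: may copy the accumulated binary on each step
slow = Enum.reduce(1..10_000, "", fn _, acc -> acc <> "x" end)

# Iodata: each append is O(1) list nesting; one O(n) flattening pass at the end
fast =
  1..10_000
  |> Enum.reduce([], fn _, acc -> [acc, "x"] end)
  |> IO.iodata_to_binary()

slow == fast
# => true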
2.2 Binary Pattern Matching Optimization
Replace String.at/2 with Binary Pattern Matching
Convert character-by-character processing to use efficient binary pattern matching:
# BEFORE: O(n) per character access
defp parse_characters(content, state) do
  if state.position >= String.length(content) do
    state
  else
    char = String.at(content, state.position)
    new_state = process_character(char, content, state)
    parse_characters(content, %{new_state | position: state.position + 1})
  end
end

# AFTER: O(1) per character access
defp parse_characters_binary(<<char::utf8, rest::binary>>, state) do
  new_state = process_character_binary(char, state)
  parse_characters_binary(rest, new_state)
end

defp parse_characters_binary(<<>>, state), do: state
Implementation Steps:
- Replace position-based indexing with binary destructuring
- Process one character at a time with pattern matching
- Pass the remaining binary to recursive calls
- Handle UTF-8 characters correctly with the ::utf8 pattern
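One detail worth calling out: ::utf8 binds the whole Unicode codepoint as an integer and consumes one to four bytes, so multi-byte characters are never split. A quick check in iex:

<<first::utf8, rest::binary>> = "héllo"
first == ?h # => true
<<second::utf8, _::binary>> = rest
second == 0xE9 # => true; "é" (U+00E9) is matched as one codepoint, not two bytes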
2.3 Single-Pass Processing
Combine Multiple Operations into Single Pass
Instead of multiple sequential passes through the input, perform all operations in a single traversal:
# BEFORE: m sequential passes, each O(n), each allocating a full intermediate string
{quoted_content, quote_repairs} = quote_unquoted_keys(input)
{normalized_content, norm_repairs} = normalize_quotes(quoted_content)
{literal_content, lit_repairs} = normalize_literals(normalized_content)
{comma_content, comma_repairs} = fix_commas(literal_content)
# AFTER: Single pass (O(n))
{final_content, all_repairs} = normalize_syntax_single_pass(input)
Implementation Steps:
- Create unified state machine that tracks all normalization needs
- Handle quote normalization, literal replacement, and comma fixes in one pass
- Maintain context for string boundaries and escape sequences
- Collect all repairs in single traversal
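To make the shape concrete before the full design in Phase 4, here is a minimal, self-contained single-pass sketch that handles just one concern (Python boolean literals) while tracking string context and building iodata; all names are illustrative and escape handling is omitted for brevity:

defmodule SinglePassSketch do
  # One traversal: literal replacement + string-context tracking + iodata building
  def normalize(input), do: walk(input, %{in_string: false, out: [], repairs: []})

  defp walk(<<"True", rest::binary>>, %{in_string: false} = s),
    do: walk(rest, %{s | out: [s.out, "true"], repairs: ["True -> true" | s.repairs]})

  defp walk(<<"False", rest::binary>>, %{in_string: false} = s),
    do: walk(rest, %{s | out: [s.out, "false"], repairs: ["False -> false" | s.repairs]})

  defp walk(<<?", rest::binary>>, s),
    do: walk(rest, %{s | in_string: !s.in_string, out: [s.out, ?"]})

  defp walk(<<char::utf8, rest::binary>>, s),
    do: walk(rest, %{s | out: [s.out, <<char::utf8>>]})

  defp walk(<<>>, s), do: {IO.iodata_to_binary(s.out), Enum.reverse(s.repairs)}
end

# SinglePassSketch.normalize(~s({"ok": True}))
# => {~s({"ok": true}), ["True -> true"]}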
2.4 Context Optimization
Efficient String Context Tracking
Replace expensive substring operations with stateful context tracking:
# BEFORE: O(n) context check
def inside_string?(input, position) do
  before = String.slice(input, 0, position) # O(n) operation
  quote_count = count_unescaped_quotes(before) # O(n) operation
  rem(quote_count, 2) != 0
end

# AFTER: O(1) context check
defp track_string_context(char, state) do
  cond do
    state.escape_next -> %{state | escape_next: false}
    state.in_string && char == "\\" -> %{state | escape_next: true}
    state.in_string && char == state.quote_char -> %{state | in_string: false, quote_char: nil}
    !state.in_string && char in ["\"", "'"] -> %{state | in_string: true, quote_char: char}
    true -> state
  end
end
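Because the tracker takes the character first and the state second, it folds directly over the input's graphemes (illustrative input and initial state; in the optimized layer the same update runs inline in the binary-matching loop):

input = ~s({"key": 'val'})
initial = %{in_string: false, escape_next: false, quote_char: nil}

final_state =
  input
  |> String.graphemes()
  |> Enum.reduce(initial, &track_string_context/2)

final_state.in_string # => false (all strings closed)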
3. Implementation Plan
3.1 Phase 1: Profiling and Baseline (Week 1)
Day 1-2: Detailed Function Profiling
- Instrument individual Layer 3 functions with timing
- Identify exact bottleneck functions and call frequency
- Create performance regression test suite
- Document current memory usage patterns
Day 3-4: Create Optimization Test Framework
- Build micro-benchmarks for each target function
- Set up automated performance monitoring
- Create test cases with varying input sizes (10, 100, 1K, 10K objects)
- Establish baseline measurements
Day 5: Memory Profiling
- Profile memory usage during string concatenation operations
- Identify memory fragmentation patterns
- Measure garbage collection pressure
- Document memory growth characteristics
Deliverables:
- Detailed performance profile of all Layer 3 functions
- Benchmark suite with baseline measurements
- Memory usage analysis report
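For the Day 1-2 per-function instrumentation, a minimal wrapper over :timer.tc/1 is usually sufficient (hypothetical helper, not an existing JsonRemedy API):

# Times a zero-arity function, prints elapsed milliseconds, returns the result
defp timed(label, fun) do
  {micros, result} = :timer.tc(fun)
  IO.puts("#{label}: #{Float.round(micros / 1000, 1)}ms")
  result
end

# Usage inside the layer, e.g.:
# timed("quote_unquoted_keys", fn -> quote_unquoted_keys(input) end)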
3.2 Phase 2: String Building Optimization (Week 2)
Day 1-3: IO List Conversion
Target Functions:
# 1. quote_unquoted_keys_char_by_char/7
defp quote_unquoted_keys_iolist(input, result_iolist, pos, in_string, escape_next, quote_char, repairs)
# 2. replace_all_literals_single_pass/8
defp replace_literals_iolist(input, result_iolist, pos, in_string, escape_next, quote_char, repairs, replacements)
# 3. add_missing_commas_recursive/9
defp add_commas_iolist(content, acc_iolist, in_string, escape_next, quote, pos, repairs, prev_token, in_object)
# 4. remove_trailing_commas_recursive/7
defp remove_commas_iolist(content, acc_iolist, in_string, escape_next, quote, pos, repairs)
Implementation Strategy:
- Convert the result parameter from String.t() to iodata()
- Replace result <> char with [result, char]
- Replace result <> string with [result, string]
- Add an IO.iodata_to_binary/1 call at function exit points
- Update all recursive calls to pass iodata
Day 4-5: Testing and Validation
- Run benchmark suite to measure improvement
- Verify functional correctness with existing test suite
- Test with UTF-8 content to ensure binary handling is correct
- Profile memory usage improvement
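For the benchmark runs, a Benchee comparison puts throughput and memory of the two implementations side by side (assumes the benchee dev dependency; function names and initial arguments are illustrative):

input = create_malformed_json(100)

Benchee.run(
  %{
    "concat (original)" => fn -> quote_unquoted_keys(input) end,
    "iolist (optimized)" => fn -> quote_unquoted_keys_iolist(input, [], 0, false, false, nil, []) end
  },
  memory_time: 2
)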
Expected Results:
- 10-100x performance improvement for string building operations
- Significant reduction in memory allocation and GC pressure
- Linear scaling behavior for large inputs
3.3 Phase 3: Binary Pattern Matching (Week 3)
Day 1-3: Character Access Optimization
Target Functions:
# 1. Main character processing loop
defp parse_characters_binary(<<char::utf8, rest::binary>>, state)
defp parse_characters_binary(<<>>, state)
# 2. Quote processing
defp process_character_binary(char, state) when is_integer(char)
# 3. Utility functions
defp consume_identifier_binary(<<char::utf8, rest::binary>>, acc, char_count)
defp consume_whitespace_binary(<<char::utf8, rest::binary>>, acc) when char in @whitespace_chars
Implementation Strategy:
- Replace String.at(input, pos) with <<char::utf8, rest::binary>> destructuring
- Remove position tracking in favor of binary consumption
- Use pattern matching for character classification
- Handle the empty binary case explicitly
- Ensure UTF-8 safety with the ::utf8 pattern
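As a concrete instance of the utility conversions listed above, consume_whitespace_binary/2 might look like this (a sketch; @whitespace_chars is assumed to be a compile-time list of codepoints so it can appear in a guard):

@whitespace_chars [?\s, ?\t, ?\n, ?\r]

# Consumes leading whitespace; returns the remaining binary and a count
defp consume_whitespace_binary(<<char::utf8, rest::binary>>, count)
     when char in @whitespace_chars do
  consume_whitespace_binary(rest, count + 1)
end

defp consume_whitespace_binary(rest, count), do: {rest, count}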
Day 4-5: Performance Testing
- Benchmark character access performance improvement
- Test with various UTF-8 content (emoji, Asian characters, accented text)
- Validate that binary pattern matching maintains UTF-8 correctness
- Profile overall processing speed improvement
Expected Results:
- 5-20x improvement in character access operations
- Elimination of O(n) position-based indexing
- Better memory efficiency due to reduced string creation
3.4 Phase 4: Single-Pass Processing (Week 4)
Day 1-3: Unified State Machine
Design the unified state machine:
defmodule Layer3State do
  @type t :: %{
    result_iolist: iodata(),
    in_string: boolean(),
    escape_next: boolean(),
    quote_char: String.t() | nil,
    expecting: :key | :value | :colon | :comma_or_end,
    context_stack: [:object | :array],
    repairs: [repair_action()],
    # Operation-specific flags
    quote_normalization: boolean(),
    literal_replacement: boolean(),
    comma_fixing: boolean()
  }
end

defp normalize_syntax_single_pass(<<char::utf8, rest::binary>>, state) do
  new_state =
    state
    |> handle_string_context(char)
    |> handle_quote_normalization(char)
    |> handle_literal_replacement(char, rest)
    |> handle_comma_fixes(char)
    |> append_character(char)

  normalize_syntax_single_pass(rest, new_state)
end

# Termination clause: end of input
defp normalize_syntax_single_pass(<<>>, state), do: state
Implementation Strategy:
- Combine quote normalization, literal replacement, and comma fixing
- Maintain single parsing state with all context information
- Process each character through all transformation pipelines
- Use look-ahead for multi-character patterns (True, False, None, NULL)
- Collect all repairs in single list
Day 4-5: Integration and Testing
- Integrate single-pass processor with existing API
- Run comprehensive test suite to ensure all functionality preserved
- Benchmark end-to-end performance improvement
- Test edge cases and complex nested structures
Expected Results:
- Elimination of multiple parsing passes
- 2-5x overall performance improvement
- Reduced function call overhead
- Single traversal of input data
3.5 Phase 5: Context Optimization (Week 5)
Day 1-2: Stateful Context Tracking
Replace expensive context queries:
# BEFORE: O(n) context check
def inside_string?(input, position) do
  before = String.slice(input, 0, position)
  quote_count = count_unescaped_quotes(before)
  rem(quote_count, 2) != 0
end

# AFTER: O(1) context access
defp update_context(char, state) do
  case {state.in_string, state.escape_next, char} do
    {_, true, _} -> %{state | escape_next: false}
    {true, false, "\\"} -> %{state | escape_next: true}
    {true, false, quote} when quote == state.quote_char ->
      %{state | in_string: false, quote_char: nil}
    {false, false, quote} when quote in ["\"", "'"] ->
      %{state | in_string: true, quote_char: quote}
    _ -> state
  end
end
Day 3-4: Look-ahead Optimization
Optimize multi-character pattern matching:
defp match_literal_pattern(<<"True", rest::binary>>, state) when not state.in_string do
{rest, add_literal_repair(state, "True", "true"), "true"}
end
defp match_literal_pattern(<<"False", rest::binary>>, state) when not state.in_string do
{rest, add_literal_repair(state, "False", "false"), "false"}
end
defp match_literal_pattern(<<"None", rest::binary>>, state) when not state.in_string do
{rest, add_literal_repair(state, "None", "null"), "null"}
end
Day 5: Final Integration
- Integrate context optimization with single-pass processor
- Run full benchmark suite
- Validate linear scaling behavior
- Document performance improvements
4. Performance Validation
4.1 Success Criteria
Performance Targets:
- Layer 3 processing time should scale linearly O(n) with input size
- 10x improvement in processing speed for large inputs (>50KB)
- Memory usage should remain proportional to input size (not quadratic)
- No regression in functional correctness
Benchmark Targets:
- 100 objects (43.3KB): Target < 500ms (vs current 12877ms)
- 200 objects (86.6KB): Target < 1000ms (vs extrapolated 51s)
- 500 objects (216KB): Target < 2500ms (vs extrapolated 5+ minutes)
4.2 Testing Strategy
Performance Tests:
defmodule Layer3PerformanceTest do
  test "linear scaling with input size" do
    sizes = [10, 25, 50, 100, 200, 500]

    times =
      Enum.map(sizes, fn size ->
        input = create_malformed_json(size)

        {time, _result} =
          :timer.tc(fn ->
            SyntaxNormalization.process(input, %{repairs: [], options: []})
          end)

        {size, time}
      end)

    # Verify linear scaling (R² > 0.95 for linear fit)
    assert_linear_scaling(times)
  end
end
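assert_linear_scaling/1 is left undefined above; one way to implement it is a least-squares fit whose R² (for a simple linear fit, the squared Pearson correlation) must exceed 0.95 (illustrative sketch):

# R² of a simple linear fit equals the squared Pearson correlation
defp assert_linear_scaling(points) do
  {xs, ys} = Enum.unzip(points)
  n = length(points)
  mean_x = Enum.sum(xs) / n
  mean_y = Enum.sum(ys) / n

  cov = Enum.zip(xs, ys) |> Enum.map(fn {x, y} -> (x - mean_x) * (y - mean_y) end) |> Enum.sum()
  var_x = Enum.map(xs, fn x -> (x - mean_x) * (x - mean_x) end) |> Enum.sum()
  var_y = Enum.map(ys, fn y -> (y - mean_y) * (y - mean_y) end) |> Enum.sum()

  r_squared = cov * cov / (var_x * var_y)
  assert r_squared > 0.95, "expected linear scaling, got R² = #{r_squared}"
end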
Memory Tests:
test "memory usage remains linear" do
sizes = [10, 50, 100, 200]
memory_usage = Enum.map(sizes, fn size ->
input = create_malformed_json(size)
:erlang.garbage_collect()
memory_before = :erlang.process_info(self(), :memory) |> elem(1)
SyntaxNormalization.process(input, %{repairs: [], options: []})
memory_after = :erlang.process_info(self(), :memory) |> elem(1)
{size, memory_after - memory_before}
end)
# Memory should grow linearly with input size
assert_linear_memory_growth(memory_usage)
end
4.3 Regression Testing
Functional Correctness:
- All existing Layer 3 tests must pass without modification
- UTF-8 handling must remain correct
- Edge cases (empty input, malformed JSON) must work
- Repair action generation must be preserved
Integration Testing:
- End-to-end JsonRemedy processing must produce identical results
- Layer ordering and pipeline behavior must remain unchanged
- Error handling and validation must work correctly
5. Implementation Details
5.1 File Structure
New Files:
lib/json_remedy/layer3/
├── syntax_normalization.ex # Updated main module
├── optimized/
│ ├── iolist_builder.ex # IO list utilities
│ ├── binary_parser.ex # Binary pattern matching
│ ├── single_pass_processor.ex # Unified processor
│ └── context_tracker.ex # State management
└── performance/
├── benchmarks.ex # Performance test suite
└── profiler.ex # Instrumentation utilities
Modified Files:
lib/json_remedy/layer3/syntax_normalization.ex # Main API preserved, implementation optimized
test/unit/layer3_syntax_normalization_test.exs # Add performance tests
test/critical/performance_stress_layer_3_test.exs # Update performance targets
5.2 Backwards Compatibility
API Preservation:
- Public API functions maintain exact same signatures
- Return value formats remain unchanged
- Error handling behavior preserved
- Repair action structure unchanged
Configuration Compatibility:
- All existing options continue to work
- Default behavior remains identical
- Error messages and logging preserved
5.3 Migration Strategy
Gradual Rollout:
- Phase 1: Implement optimizations behind feature flag
- Phase 2: A/B test optimized vs original implementation
- Phase 3: Enable optimizations by default
- Phase 4: Remove original implementation after validation
Feature Flag:
@optimized_processing Application.compile_env(:json_remedy, :layer3_optimized, true)

def process(input, context) do
  if @optimized_processing do
    process_optimized(input, context)
  else
    process_original(input, context)
  end
end
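Because the flag is read with Application.compile_env/3, it is fixed at build time via config (flag name as defined above):

# config/config.exs
import Config

# Opt out of the optimized path (default is true)
config :json_remedy, layer3_optimized: false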
6. Risk Mitigation
6.1 Technical Risks
Risk: Binary Pattern Matching UTF-8 Issues
- Mitigation: Comprehensive UTF-8 test suite with emoji, Asian characters, combining characters
- Fallback: Maintain String.at/2 implementation as backup
Risk: IO List Memory Usage
- Mitigation: Monitor memory usage during development, implement streaming for very large inputs
- Fallback: Hybrid approach using IO lists for small segments
Risk: Functional Regression
- Mitigation: Comprehensive test coverage, property-based testing, fuzzing with random inputs
- Fallback: Feature flag allows instant rollback to original implementation
6.2 Performance Risks
Risk: Optimization Doesn't Achieve Linear Scaling
- Mitigation: Incremental optimization with measurement at each step
- Contingency: Focus on most impactful optimizations first (string concatenation)
Risk: Memory Usage Increases
- Mitigation: Memory profiling throughout development
- Contingency: Implement memory-conscious alternatives for large inputs
6.3 Timeline Risks
Risk: Implementation Takes Longer Than Planned
- Mitigation: Break into smaller, independently valuable pieces
- Contingency: Prioritize string concatenation fixes which provide biggest impact
7. Success Metrics
7.1 Performance Metrics
Primary Metrics:
- Processing time scales linearly with input size (R² > 0.95)
- 10x improvement for 100-object inputs
- Memory usage remains proportional to input size
Secondary Metrics:
- Reduced garbage collection pressure
- Lower CPU utilization per character processed
- Improved throughput (objects/second)
7.2 Quality Metrics
Correctness:
- 100% test suite pass rate
- No functional regressions
- Identical repair action generation
Reliability:
- No new error conditions introduced
- Graceful handling of edge cases
- Consistent behavior across input sizes
7.3 Monitoring Plan
Development Monitoring:
- Continuous benchmarking in CI/CD pipeline
- Memory usage tracking for large inputs
- Performance regression alerts
Production Monitoring:
- Processing time percentiles (P50, P95, P99)
- Memory usage patterns
- Error rate monitoring
- Throughput measurement
8. Conclusion
This optimization plan addresses the quadratic performance bottleneck in Layer 3 through systematic improvements to string building, character access, and processing architecture. The phased approach ensures minimal risk while delivering significant performance improvements.
Expected Outcomes:
- Performance: 10-100x improvement for large inputs
- Scalability: Linear O(n) scaling replaces quadratic O(n²)
- Memory: Reduced allocation and garbage collection pressure
- Maintainability: Cleaner, more efficient code architecture
Timeline: 5 weeks for complete implementation with continuous validation and testing throughout.
The optimization preserves all existing functionality while dramatically improving performance, making JsonRemedy suitable for processing large JSON documents efficiently.