← Back to Desktop

1

Documentation for 1 from the Json remedy repository.

JSON Repair Core Functions Analysis

Core Parser Class and Entry Points

JSONParser Class

  • Primary entry point for all parsing logic
  • Manages parsing state (index, context, logging)
  • Contains all core repair algorithms

Main Parsing Functions

parse()JSONReturnType

  • Main orchestrator function
  • Handles multiple JSON objects in sequence
  • Manages array wrapping for multiple objects
  • Core logic for determining when to parse vs. skip content

parse_json()JSONReturnType

  • Central dispatcher for JSON element types
  • Determines what type of JSON element to parse based on current character
  • Routes to appropriate specialized parser (object, array, string, number, boolean/null)
  • Handles edge cases and fallback logic

Container Parsing Functions

parse_object()dict[str, JSONReturnType]

  • Object parsing with extensive repair logic
  • Handles missing quotes in keys and values
  • Manages missing colons after keys
  • Fixes missing commas between members
  • Handles duplicate keys and array merging
  • Manages missing opening/closing braces
  • Context-aware parsing for object keys vs values

parse_array()list[JSONReturnType]

  • Array parsing with repair capabilities
  • Handles missing commas between elements
  • Fixes missing opening/closing brackets
  • Special logic for detecting objects within arrays (missing braces)
  • Manages empty arrays and trailing elements

Primitive Parsing Functions

parse_string()str | bool | None

  • Most complex repair function - handles majority of edge cases
  • Missing quote detection and repair:
    • Detects missing opening quotes
    • Detects missing closing quotes
    • Handles different quote types (", ‘, “, “)
  • Context-aware termination:
    • Uses context to determine string boundaries
    • Different termination rules for object keys vs values vs array elements
  • Escape sequence handling:
    • Processes Unicode escapes (\u, \x)
    • Handles standard escapes (\n, \t, \r, \b)
  • Special cases:
    • Doubled quotes handling
    • Unmatched delimiters
    • Streaming stability mode
    • HTML/Markdown content within strings
  • Boolean/null detection: Delegates to parse_boolean_or_null() when appropriate

parse_number()float | int | str | JSONReturnType

  • Number parsing with format flexibility
  • Handles various number formats and edge cases
  • Falls back to string parsing when number format is invalid
  • Context-aware parsing (different behavior in arrays)

parse_boolean_or_null()bool | str | None

  • Boolean and null literal parsing
  • Case-insensitive matching for true/false/null
  • Partial matching with rollback on failure

Utility and Helper Functions

parse_comment()str

  • Comment removal logic
  • Handles line comments (# and //)
  • Handles block comments (/* */)
  • Returns empty string to effectively remove comments

get_char_at(count: int = 0)str | Literal[False]

  • Safe character access with bounds checking
  • Returns False when at end of input
  • Critical for all parsing operations

skip_whitespaces_at(idx: int = 0, move_main_index=True)int

  • Whitespace skipping utility
  • Used throughout parsing to handle formatting variations
  • Can optionally move main parsing index

skip_to_character(character: str, idx: int = 0)int

  • Character search utility
  • Used for finding delimiters, terminators, etc.
  • Essential for lookahead operations in repair logic

Context Management

JsonContext Class

  • Tracks parsing context (object key, object value, array)
  • Critical for context-aware string parsing
  • Determines how to handle ambiguous cases

ContextValues Enum

  • Defines the different parsing contexts:
    • OBJECT_KEY
    • OBJECT_VALUE
    • ARRAY

Key Repair Strategies

1. Quote Repair

  • Automatic quote addition for unquoted strings
  • Quote type normalization
  • Missing quote detection using context and lookahead

2. Delimiter Repair

  • Missing commas between array elements and object members
  • Missing colons after object keys
  • Missing opening/closing brackets and braces

3. Context-Aware Parsing

  • Different parsing rules based on current context
  • Smart termination of strings based on surrounding structure
  • Fallback mechanisms when primary parsing fails

4. Streaming Stability

  • Special mode for incomplete/streaming JSON
  • Preserves partial content rather than aggressive cleanup
  • Handles incomplete escape sequences

5. Type Coercion and Fallbacks

  • Numbers that fall back to strings
  • Strings that contain boolean/null values
  • Object detection within arrays

Critical Edge Case Handling

  • Empty strings and null values
  • Malformed escape sequences
  • Mixed quote types
  • Missing structural elements (braces, brackets)
  • Embedded objects without proper delimiters
  • Comments and non-JSON content
  • Unicode and special character handling
  • Streaming/incomplete JSON

All these functions work together to provide comprehensive JSON repair capabilities, with the parse_string() function being the most complex as it handles the majority of real-world JSON formatting issues.