Part of CNS 8.0 / Grounded Dialectical Orthesis

19 — Runtime Oracle Boundary Policy

19 — Runtime Oracle Boundary Policy

Policy

CNS 8.0 can use oracles during training and evaluation. Runtime analysis cannot use hidden labels, answer keys, or LLM judgments.

Allowed offline oracle use

  • labels in SciFact, FEVER, and synthetic tasks;
  • expert annotations;
  • calibration labels;
  • human review labels;
  • synthetic hidden context labels for evaluation;
  • gold rationales for training extraction.

Forbidden runtime oracle use

  • answer keys;
  • gold labels in prompts;
  • synthetic generation parameters;
  • LLM judge as final truth oracle;
  • access to withheld test rationales;
  • hidden evaluator calls inside runtime;
  • prompting that asks a model to choose the correct label using unseen gold data.

Required metadata

Every run manifest records:

  • dataset split hash;
  • label availability;
  • prompt templates;
  • model IDs;
  • proof rule version;
  • calibration model version;
  • oracle-use declaration;
  • leakage scan result.

Leakage checks

  • scan prompts for label fields;
  • verify runtime input schema excludes gold labels;
  • run random-label controls;
  • run shuffled-evidence controls;
  • isolate synthetic generator seeds;
  • withhold latent context variables during inference.

Output language

CNS keeps likely claims separate from strict claims.

Allowed:

Strict: follows from proof trace.
Likely: posterior-supported but not strict.
Hypothesis: generated for testing.
Unresolved: evidence/access insufficient.
Rejected: failed gate.

Human review

Human experts may review outputs. Their judgments are post-runtime annotations unless explicitly used in a retraining/calibration step.

Step 20 of 39 in CNS 8.0 / Grounded Dialectical Orthesis