Part of CNS 8.0 / Grounded Dialectical Orthesis

23 — Data and Run Manifest Specification

Why manifests matter

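Every CNS 8.0 experiment records a dataset manifest and a run manifest. Together they document oracle separation (which label fields were available offline and which, if any, at runtime) and capture the hashes and versions needed to reproduce the run.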

Dataset manifest

{
  "dataset_id": "scifact_dev_v1",
  "source": "local_or_remote",
  "split": "dev",
  "hash": "...",
  "label_fields_available_offline": ["label", "rationale"],
  "label_fields_available_runtime": [],
  "created_at": "2026-05-15"
}
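
As an illustration, a dataset manifest with the fields above could be assembled as follows. This is a minimal sketch, not the project's tooling: the function name `build_dataset_manifest` is hypothetical, and the use of SHA-256 over the raw split bytes for `hash` is an assumption, since the specification does not name a hash algorithm.

```python
import hashlib
import json


def build_dataset_manifest(dataset_id, source, split, data, offline_label_fields, created_at):
    """Return a dataset manifest dict; `data` is the raw bytes of the split file."""
    return {
        "dataset_id": dataset_id,
        "source": source,
        "split": split,
        # Assumption: the split hash is SHA-256 over the raw file bytes.
        "hash": hashlib.sha256(data).hexdigest(),
        "label_fields_available_offline": list(offline_label_fields),
        # Left empty so that no label field is declared visible at runtime.
        "label_fields_available_runtime": [],
        "created_at": created_at,
    }


# Placeholder bytes standing in for the contents of a real split file.
split_bytes = b'{"claim": "example claim"}\n'
manifest = build_dataset_manifest(
    "scifact_dev_v1", "local_or_remote", "dev",
    split_bytes, ["label", "rationale"], "2026-05-15",
)
print(json.dumps(manifest, indent=2))
```

Hashing the bytes rather than the parsed records means any change to the file, including whitespace, yields a new manifest hash.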

Run manifest

{
  "run_id": "cns8_run_001",
  "config_hash": "...",
  "dataset_manifest": "dataset_manifest.json",
  "oracle_policy": {
    "training_oracles": true,
    "runtime_oracles": false,
    "leakage_scan": "passed"
  },
  "models": {
    "proposer": "...",
    "entailment": "...",
    "synthesizer": "..."
  },
  "rule_bank_version": "rules_v0",
  "schemas": {
    "sno": "sno8.schema.json",
    "proof": "proof_trace.schema.json"
  },
  "metrics": {},
  "artifacts": {}
}
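
A run manifest of this shape can be checked mechanically before a run is accepted. The sketch below is illustrative only: `check_run_manifest` is a hypothetical helper, and treating `runtime_oracles: false` and `leakage_scan: "passed"` as hard requirements is an assumption drawn from the example values above.

```python
REQUIRED_KEYS = {
    "run_id", "config_hash", "dataset_manifest", "oracle_policy", "models",
    "rule_bank_version", "schemas", "metrics", "artifacts",
}


def check_run_manifest(manifest):
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(manifest))]
    policy = manifest.get("oracle_policy", {})
    # Assumption: oracles may be used in training but never at runtime.
    if policy.get("runtime_oracles") is not False:
        problems.append("runtime_oracles must be false")
    if policy.get("leakage_scan") != "passed":
        problems.append("leakage_scan did not pass")
    return problems


# Elided values ("...") are kept as placeholders, exactly as in the example above.
example = {
    "run_id": "cns8_run_001",
    "config_hash": "...",
    "dataset_manifest": "dataset_manifest.json",
    "oracle_policy": {"training_oracles": True, "runtime_oracles": False, "leakage_scan": "passed"},
    "models": {"proposer": "...", "entailment": "...", "synthesizer": "..."},
    "rule_bank_version": "rules_v0",
    "schemas": {"sno": "sno8.schema.json", "proof": "proof_trace.schema.json"},
    "metrics": {},
    "artifacts": {},
}
bad = {"run_id": "cns8_run_002", "oracle_policy": {"runtime_oracles": True}}

print(check_run_manifest(example))
print(check_run_manifest(bad))
```

Returning a list of problems rather than raising on the first one lets a single pass report every defect in a manifest.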

Artifact map

runs/{run_id}/
  evidence_atoms.jsonl
  proposed_snos.jsonl
  critic_reports.jsonl
  selected_pairs.jsonl
  proof_closure.jsonl
  residual_tensors/
  latent_predicates.jsonl
  synthesized_snos.jsonl
  orthesis_reports.jsonl
  final_report.md
  run_manifest.json
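
The layout above lends itself to a simple completeness check. The following sketch assumes that every listed entry is mandatory for a finished run, which the specification does not state explicitly. The helper name `missing_artifacts` is hypothetical.

```python
import tempfile
from pathlib import Path

EXPECTED_FILES = [
    "evidence_atoms.jsonl", "proposed_snos.jsonl", "critic_reports.jsonl",
    "selected_pairs.jsonl", "proof_closure.jsonl", "latent_predicates.jsonl",
    "synthesized_snos.jsonl", "orthesis_reports.jsonl", "final_report.md",
    "run_manifest.json",
]
EXPECTED_DIRS = ["residual_tensors"]


def missing_artifacts(run_dir):
    """Return the names of expected files and directories absent from run_dir."""
    run_dir = Path(run_dir)
    missing = [name for name in EXPECTED_FILES if not (run_dir / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (run_dir / name).is_dir()]
    return missing


# Demonstration on a throwaway directory holding only two of the artifacts.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "runs" / "cns8_run_001"
    (run_dir / "residual_tensors").mkdir(parents=True)
    (run_dir / "run_manifest.json").write_text("{}")
    still_missing = missing_artifacts(run_dir)

print(still_missing)
```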

Required hashes

  • evidence atom hashes;
  • prompt template hashes;
  • config hash;
  • rule bank hash;
  • schema hash;
  • dataset split hash;
  • proof trace checksum.
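
One way to produce such hashes reproducibly is to hash a canonical serialization, so that key order and whitespace do not change the result. The sketch below is an assumption-laden illustration: the specification names no algorithm, so SHA-256 and the canonical JSON form are choices made here, and `canonical_hash`, `jsonl_record_hashes`, and the `seed` config key are hypothetical.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes (e.g. a prompt template or schema file)."""
    return hashlib.sha256(data).hexdigest()


def canonical_hash(obj) -> str:
    """Hash a JSON-serializable object independently of key order and spacing."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return sha256_hex(text.encode("utf-8"))


def jsonl_record_hashes(jsonl_text: str) -> list:
    """One canonical hash per non-empty JSONL line, e.g. one per evidence atom."""
    return [canonical_hash(json.loads(line)) for line in jsonl_text.splitlines() if line.strip()]


# Two configs that differ only in key order hash identically.
config_a = {"seed": 0, "rule_bank_version": "rules_v0"}
config_b = {"rule_bank_version": "rules_v0", "seed": 0}
same_config = canonical_hash(config_a) == canonical_hash(config_b)

# Two evidence atoms with the same content but different key order.
atoms = '{"id": 1, "text": "a"}\n{"text": "a", "id": 1}\n'
atom_hashes = jsonl_record_hashes(atoms)

print(same_config, atom_hashes[0] == atom_hashes[1])
```

Files whose exact bytes matter, such as prompt templates, are better hashed with `sha256_hex` directly, since canonicalization would hide formatting changes.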

Oracle leakage fields

Runtime input schemas must exclude every field that would leak oracle information into inference:

  • label;
  • gold rationale;
  • correct answer;
  • hidden context;
  • generator seed;
  • ground-truth world ID.
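
A leakage scan over runtime inputs can be sketched as a recursive search for these field names. The snake_case keys below are hypothetical spellings of the listed fields, chosen for illustration. Real schemas may name them differently, and `find_leaks` is not an existing CNS 8.0 function.

```python
# Hypothetical key names corresponding to the excluded fields listed above.
FORBIDDEN_RUNTIME_FIELDS = {
    "label", "gold_rationale", "correct_answer",
    "hidden_context", "generator_seed", "ground_truth_world_id",
}


def find_leaks(record, path=""):
    """Return the paths of all forbidden keys found anywhere inside a runtime input."""
    leaks = []
    if isinstance(record, dict):
        for key, value in record.items():
            here = f"{path}.{key}" if path else key
            if key in FORBIDDEN_RUNTIME_FIELDS:
                leaks.append(here)
            # Keep descending: forbidden keys may be nested inside allowed ones.
            leaks.extend(find_leaks(value, here))
    elif isinstance(record, list):
        for index, item in enumerate(record):
            leaks.extend(find_leaks(item, f"{path}[{index}]"))
    return leaks


clean_input = {"claim": "x", "evidence": [{"text": "t"}]}
leaky_input = {
    "claim": "x",
    "evidence": [{"text": "t", "label": "SUPPORTS"}],
    "generator_seed": 7,
}

print(find_leaks(clean_input))  # prints []
print(find_leaks(leaky_input))  # prints ['evidence[0].label', 'generator_seed']
```

A scan of this kind matches key names only. It does not detect label values copied into an otherwise permitted field.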