BNNR

Artifacts and Run Output

What you will find here

The files BNNR actually writes to disk, and which the report, dashboard, and export flows read.

When to use this page

Use this for experiment tracking, debugging missing outputs, and integration with external tooling.

Output directories

Configured by BNNRConfig:

  • checkpoint_dir
  • report_dir

Typical run structure:

report_dir/
  run_YYYYMMDD_HHMMSS/
    report.json
    events.jsonl
    run.log
    artifacts/
      xai/
      samples/
      candidate_previews/

Checkpoints are saved in checkpoint_dir.
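A minimal sketch for locating the most recent run and listing which of the files shown above are absent. It relies only on the run_YYYYMMDD_HHMMSS naming from the tree above; the helper names latest_run_dir and missing_files are hypothetical and not part of BNNR.

```python
from pathlib import Path

# File names taken from the typical run structure above; a given run may omit some.
EXPECTED_FILES = ("report.json", "events.jsonl", "run.log")


def latest_run_dir(report_dir):
    """Return the newest run_* directory under report_dir, or None if there is none."""
    runs = sorted(p for p in Path(report_dir).glob("run_*") if p.is_dir())
    # run_YYYYMMDD_HHMMSS names sort chronologically when compared as plain strings.
    return runs[-1] if runs else None


def missing_files(run_dir):
    """List the expected files that are absent from run_dir (hypothetical helper)."""
    return [name for name in EXPECTED_FILES if not (Path(run_dir) / name).is_file()]
```

For example, missing_files(latest_run_dir("reports")) returns an empty list when all three files exist in the newest run; "reports" stands in for whatever report_dir is configured.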

Analyze output (bnnr analyze)

When you run bnnr analyze --model ... --data ... --output DIR, the following are written under DIR:

  • analysis_report.json — full analysis payload (metrics, per_class_accuracy, confusion, XAI, data_quality, failure_patterns, recommendations, extended fields; see keys below).
  • report.html — self-contained HTML report: XAI/confusion overlay PNGs are embedded as base64 when written by bnnr analyze or report.to_html(..., artifact_root=output_dir). Safe to share without sibling files.
  • artifacts/confusion_pairs/ (optional) — saliency overlays for top confused class pairs.
  • artifacts/class_examples/ (optional) — best/worst per-class XAI overlays.
  • artifacts/xai_examples/ (optional) — XAI overlays for high-confidence wrong predictions.
  • artifacts/data_quality/ (optional) — thumbnails from data quality analysis (embedded in HTML; paths also in JSON).

Structure of analysis_report.json (top-level keys, v0.2):

  • schema_version — report schema version string (aligned with package version since 0.4.8; legacy artifacts may use 0.2.1).
  • metrics — e.g. accuracy, f1_macro, loss.
  • per_class_accuracy — per-class counts and accuracy.
  • confusion — confusion matrix (format depends on task).
  • xai_insights, xai_diagnoses, xai_quality_summary — present when XAI was run.
  • data_quality — result of data quality analysis (warnings, duplicate_groups, summary).
  • failure_patterns — e.g. confused pairs, low XAI quality classes.
  • confusion_pair_xai, best_worst_examples — optional XAI overlay metadata (classification; written when XAI is enabled).
  • calibration_summary, analysis_scope — optional sections depending on task and flags.
  • recommendations — list of text recommendations.
  • executive_summary — health status/score, key findings, top actions, critical classes.
  • findings — structured findings with evidence and interpretation.
  • recommendations_structured — structured recommendations linked to findings.
  • class_diagnostics, true_distribution, pred_distribution, distribution_summary.
  • failure_patterns_extended — extended failure taxonomy.
  • xai_quality_per_class, xai_examples_per_class — per-class XAI scores and overlay examples.
  • data_quality_summary — flattened dataset health summary for direct consumption by tools.
  • cv_results — optional cross-validation metrics (classification, when cv_folds > 1).

Schema orientation (classification vs multilabel):

  • Classification runs with rich XAI typically populate xai_insights, xai_diagnoses, and xai_quality_per_class.
  • Multilabel runs use a lightweight high-loss saliency path and primarily expose xai_examples_per_class (for example multilabel_high_loss) plus xai_quality_summary.multilabel_probe_count and xai_quality_summary.multilabel_xai_note.
  • Do not assume every XAI key is non-empty for every task; interpret XAI fields together with analysis_scope.task.
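One way to follow this advice when consuming analysis_report.json is to read the XAI keys defensively and report them next to the task. The sketch below uses only key names listed on this page and assumes analysis_scope is a JSON object with a task entry, as the analysis_scope.task notation suggests. The function names are hypothetical.

```python
import json
from pathlib import Path

# XAI-related top-level keys named on this page.
XAI_KEYS = (
    "xai_insights",
    "xai_diagnoses",
    "xai_quality_summary",
    "xai_quality_per_class",
    "xai_examples_per_class",
)


def load_analysis_report(output_dir):
    """Read analysis_report.json from the directory passed to --output."""
    path = Path(output_dir) / "analysis_report.json"
    return json.loads(path.read_text(encoding="utf-8"))


def summarize_xai(report):
    """Return the recorded task (or None) and the XAI keys that are present and non-empty."""
    scope = report.get("analysis_scope") or {}
    # Assumption: analysis_scope is an object; anything else yields task=None.
    task = scope.get("task") if isinstance(scope, dict) else None
    populated = [key for key in XAI_KEYS if report.get(key)]
    return {"task": task, "populated_xai_keys": populated}
```

Calling summarize_xai(load_analysis_report(DIR)) before reading individual XAI fields makes it explicit which ones the run actually filled in.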

Analyze does not write to events.jsonl; it is standalone.

report.json

Generated by Reporter (src/bnnr/reporting.py).

Common top-level keys:

  • config
  • best_path
  • best_metrics
  • selected_augmentations
  • total_time
  • checkpoints
  • iteration_summaries
  • analysis

events.jsonl

Written when event_log_enabled=true (default). Each line is a JSON object with schema_version ("2.1", from bnnr.events.EVENT_SCHEMA_VERSION). Used by replay and export in src/bnnr/events.py and src/bnnr/dashboard/backend.py.

Common event types emitted by current code:

  • run_started
  • dataset_profile
  • probe_set_initialized
  • pipeline_phase
  • epoch_end
  • branch_created
  • branch_evaluated
  • branch_selected
  • sample_snapshot
  • sample_prediction_snapshot
  • xai_snapshot
  • pipeline_complete

Dashboard export artifacts

python3 -m bnnr dashboard export --run-dir <run_dir> --out <out_dir> writes:

  • index.html
  • data/events.jsonl
  • data/state.json
  • optional data/report.json
  • copied artifacts/
  • manifest.json
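A quick sanity check of an export directory can be sketched from the list above. It treats every entry not marked optional as expected, which is an assumption about the exporter rather than documented behavior. check_export is a hypothetical helper.

```python
from pathlib import Path

# Paths from the list above, relative to <out_dir>.
EXPECTED = ("index.html", "data/events.jsonl", "data/state.json", "manifest.json")
OPTIONAL = ("data/report.json",)


def check_export(out_dir):
    """Report which listed export files are missing or present under out_dir."""
    out = Path(out_dir)
    return {
        "missing": [p for p in EXPECTED if not (out / p).is_file()],
        "optional_present": [p for p in OPTIONAL if (out / p).is_file()],
        "has_artifacts": (out / "artifacts").is_dir(),
    }
```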

Operational checks

If replay/export appears empty, verify:

  • target run directory exists,
  • events.jsonl exists and is non-empty,
  • run was produced with event logging enabled (CLI keeps this enabled for train command).
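The checks above can be scripted as follows. This is a sketch that only inspects the filesystem; it cannot tell why events.jsonl is missing, so the hint about event logging is a suggestion to investigate, not a diagnosis. diagnose_run is a hypothetical helper.

```python
from pathlib import Path


def diagnose_run(run_dir):
    """Return a list of problems that would explain an empty replay or export."""
    run = Path(run_dir)
    if not run.is_dir():
        return [f"run directory not found: {run}"]
    events = run / "events.jsonl"
    if not events.is_file():
        return ["events.jsonl is missing (was event logging enabled for this run?)"]
    if events.stat().st_size == 0:
        return ["events.jsonl is empty"]
    return []  # no problem detected by these checks
```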

For end-user dashboard operations (live/replay/mobile/QR), see dashboard.md.