---
version: "1.0.0"
name: debug
description: Investigation-first debugging — gather evidence, form confirmed root-cause hypothesis, write regression test, apply minimal fix via fix mode handoff.
argument-hint: <symptom or failing test> [--no-challenge]
effort: medium
allowed-tools: Read, Write, Edit, Bash, Grep, Glob, Agent, TaskCreate, TaskUpdate, AskUserQuestion
disable-model-invocation: true
---
<objective>
Investigation-first debugging. Gather evidence, trace data flow, form confirmed root-cause hypothesis, write regression test, hand off to fix mode.
NOT for: production incidents without local reproduction (use /foundry:investigate for triage); .claude/ config issues (use /foundry:audit).
</objective>
<workflow>
<!-- Agent Resolution: canonical table at plugins/develop/skills/_shared/agent-resolution.md -->
## Agent Resolution

```bash
# Locate develop plugin shared dir — installed first, local workspace fallback
_DEV_SHARED=$(ls -td ~/.claude/plugins/cache/borda-ai-rig/develop/*/skills/_shared 2>/dev/null | head -1)
[ -z "$_DEV_SHARED" ] && _DEV_SHARED="plugins/develop/skills/_shared"
```
Read $_DEV_SHARED/agent-resolution.md. Contains: foundry check + fallback table. If foundry not installed: use table to substitute each foundry:X with general-purpose. Agents this skill uses: foundry:sw-engineer, foundry:challenger.
Task hygiene: Before creating tasks, call TaskList. For each found task:
- status `completed` if work clearly done
- status `deleted` if orphaned / no longer relevant
- keep `in_progress` only if genuinely continuing
Task tracking: immediately after Step 1 (scope known), TaskCreate all steps before any other work. Mark each step in_progress when starting, completed when done.
## Anti-Rationalizations

| Temptation | Reality |
|---|---|
| "I already know the root cause from the traceback" | Tracebacks show where, not why. Assumptions without code-path verification produce fixes for the wrong bug. |
| "The fix is obvious — Step 2 pattern analysis is overkill" | Obvious causes are often symptoms. Pattern comparison reveals ordering, timing, or environment differences invisible in the traceback. |
| "I'll just apply the fix here instead of handing off to /develop:fix" | Debug is investigation-only. Mixing investigation and implementation conflates history and skips the regression test gate. |
| "Low confidence is fine — I'll just try the fix and see" | A fix without a confirmed hypothesis is a guess. Guesses produce fixes that pass tests but don't resolve the underlying problem. |
## Project Detection
Read $_DEV_SHARED/runner-detection.md — sets $TEST_CMD (full suite) and $PYTEST_CMD (pytest flags). Run at skill start.
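For orientation only — `runner-detection.md` is authoritative — a minimal sketch of the kind of check it performs. The marker files and fallback command here are assumptions, not the real detection logic:

```shell
# Hypothetical sketch — runner-detection.md is the source of truth.
detect_test_cmd() {
  # $1 = project root; echoes a full-suite test command
  if [ -f "$1/pyproject.toml" ] || [ -f "$1/pytest.ini" ]; then
    echo "pytest"
  elif [ -f "$1/package.json" ]; then
    echo "npm test"
  else
    echo "make test"   # assumed fallback
  fi
}

mkdir -p /tmp/demo_proj && touch /tmp/demo_proj/pyproject.toml
TEST_CMD=$(detect_test_cmd /tmp/demo_proj)
echo "$TEST_CMD"   # → pytest
```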
Checkpoint: debug is investigation-only — no code changes. .plans/active/debug_<slug>.md (written in Step 4) serves as implicit session state. No .developments/ checkpoint needed.
## Debug Mode
Argument type detection: if `$ARGUMENTS` is a positive integer (or prefixed with `#`, e.g. `#123`), treat as GitHub issue number and fetch with `gh issue view`. If text (contains spaces, letters, or special chars), treat as symptom description.
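The classification can be sketched as a pure-shell check (the `$ARGUMENTS` value below is illustrative):

```shell
ARGUMENTS="#123"                      # illustrative input
ARG="${ARGUMENTS#\#}"                 # strip optional leading '#'
case "$ARG" in
  ''|*[!0-9]*) MODE="symptom" ;;      # empty or any non-digit → free-text symptom
  *)           MODE="issue"   ;;      # all digits → GitHub issue number
esac
echo "$MODE"   # → issue
```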
### Flag parsing
Set `CHALLENGE_ENABLED=true`. If --no-challenge present in $ARGUMENTS, set CHALLENGE_ENABLED=false.
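A minimal sketch of the flag check (the symptom text is illustrative):

```shell
ARGUMENTS="login fails after redirect --no-challenge"   # illustrative
CHALLENGE_ENABLED=true
# Pad with spaces so the flag matches as a whole word anywhere in the string
case " $ARGUMENTS " in
  *" --no-challenge "*) CHALLENGE_ENABLED=false ;;
esac
echo "$CHALLENGE_ENABLED"   # → false
```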
### Step 1: Understand the symptom
Collect all signals before forming any hypothesis.
Issue-number mode first — if $ARGUMENTS is an issue number, fetch the issue body and extract the test path BEFORE invoking pytest:
```bash
# Strip leading '#' so both '123' and '#123' work
ARGUMENTS="${ARGUMENTS#\#}"

# Fetch the full issue body first
ISSUE_BODY=$(gh issue view "$ARGUMENTS" --comments 2>&1)  # timeout: 6000
echo "$ISSUE_BODY"

# Extract a test path (e.g., tests/foo.py or test_foo.py) from the issue body
TEST_PATH=$(echo "$ISSUE_BODY" | grep -oE '(tests?/[^[:space:]]+\.py|test_[^[:space:]]+\.py)' | head -1)
if [ -z "$TEST_PATH" ]; then
  echo "→ No test file found in issue; running full test suite"
elif [ ! -f "$TEST_PATH" ]; then
  echo "⚠ test path from issue not found on disk: $TEST_PATH — running full suite"
  TEST_PATH=""
fi
```
Then run pytest with the extracted path (empty $TEST_PATH → full suite):
```bash
# Read the full traceback — never just the last line
$PYTEST_CMD --tb=long ${TEST_PATH} -v 2>&1 | tail -60

# What changed recently near the failing code?
git log --oneline -20
COMMIT_COUNT=$(git rev-list --count HEAD 2>/dev/null || echo 1)
LOOKBACK=$(( COMMIT_COUNT < 5 ? COMMIT_COUNT : 5 ))
[ "$LOOKBACK" -gt 1 ] && git diff HEAD~${LOOKBACK}..HEAD -- <suspect_file>
```
Symptom-text mode — if $ARGUMENTS is free-text, skip the issue fetch + extraction; locate the failing test path from the symptom directly, then run:
```bash
$PYTEST_CMD --tb=long <test_path> -v 2>&1 | tail -60
```
Use Grep (pattern: failing symbol, class, or error keyword) to trace call path from entry point to failure site. Path hint: use src/ if that directory exists, otherwise search from project root (.).
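As a sketch of that trace, assuming a hypothetical failing symbol `parse_config`:

```shell
# Pick the search root: prefer src/ when it exists, else project root
SEARCH_ROOT="."
[ -d "src" ] && SEARCH_ROOT="src"
# Find the definition and every call site of the failing symbol
grep -rn "parse_config" "$SEARCH_ROOT" --include='*.py' || echo "no matches under $SEARCH_ROOT"
```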
Spawn foundry:sw-engineer agent to map execution path and produce:
- Entry point to failure: which modules does the call cross?
- What state mutated along the way?
- What invariant violated at failure point?
- Any recent commit touching this path (from git log output)
Scope gate: if the root cause spans 3+ modules, flag a complexity smell. Use AskUserQuestion to present the scope concern before proceeding, with options: "Narrow scope (Recommended)" / "Proceed anyway".
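The module count behind the gate can be sketched like this (the suspect-file list is hypothetical):

```shell
SUSPECTS="src/auth/login.py src/auth/session.py src/db/pool.py src/api/routes.py"
# Count distinct top-level modules (the path component directly under src/)
MODULE_COUNT=$(for f in $SUSPECTS; do echo "$f" | cut -d/ -f2; done | sort -u | wc -l)
if [ "$MODULE_COUNT" -ge 3 ]; then
  echo "complexity smell: root cause spans $MODULE_COUNT modules"
fi
```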
Present agent's analysis summary before proceeding.
### Step 2: Pattern analysis
Find nearest similar working code path, compare exhaustively:
- Locate 2-3 code paths handling similar input or similar work successfully
- List every difference between the working path and the broken one — not just the obvious one
- Check across axes:
- Same input, different environment (versions, config, data shape)?
- Same logic, different call order or timing?
- Conditionals taking different branches on different inputs?
- None/empty guards present in working path but absent in broken one?
This step catches non-obvious causes — ordering dependencies, environment-specific state, type coercion silently changing behaviour.
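A worked micro-example of the comparison — two hypothetical call-site snippets, where the broken path lacks a guard the working path has:

```shell
# Hypothetical stand-ins for the real working and broken code paths
cat > /tmp/working_path.txt <<'EOF'
cfg = load_config(env)
validate(cfg)
save_record(cfg, retries=3)
EOF
cat > /tmp/broken_path.txt <<'EOF'
cfg = load_config(env)
save_record(cfg, retries=3)
EOF
# diff exits 1 when the files differ — the '<' line is the missing guard
diff /tmp/working_path.txt /tmp/broken_path.txt || true
```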
### Challenger gate
Skip if `CHALLENGE_ENABLED=false`.
Spawn foundry:challenger with the pattern analysis from Step 2 (differences between working/broken paths, candidate causes):
"Review the pattern analysis and candidate root causes identified. Challenge across all 5 dimensions: Assumptions, Missing Cases, Security Risks, Architectural Concerns, Complexity Creep. Apply mandatory refutation step."
Parse result:
- Blockers found → STOP. Present findings. Incorporate challenger's surviving challenges into the hypothesis list before the Step 3 gate.
- Concerns only → add as alternative hypotheses in Step 3; continue.
- No findings / all refuted → proceed.
### Step 3: Hypothesis and gate
State root cause hypothesis explicitly before writing any code:
```
Root cause: <one sentence — what is wrong and why>
Evidence for: [signals that support this]
Evidence against: [anything that contradicts or remains unexplained]
Confidence: high / medium / low
```
Gate: present the hypothesis to the user and wait for confirmation or challenge before proceeding to Step 4. A wrong hypothesis produces a fix that passes tests but doesn't resolve the underlying problem.
Autonomous-mode fallback (when running as a subagent with no direct user interaction):
- Confidence high: proceed automatically to Step 4; note "auto-confirmed (subagent mode)" in Final Report
- Confidence medium: return hypothesis + evidence to parent agent as structured JSON; let parent decide:
  `{"hypothesis":"<root cause>","evidence":["<s1>","<s2>"],"confidence":"medium","action_required":"confirm_before_fix"}`
- Confidence low: run targeted probe (minimal script, added assertion) to gather more signal before returning to parent
If confidence low: propose targeted probe (minimal script, added log statement, single assertion) to gather missing signal — run it before committing to fix.
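A minimal probe might look like this — the suspect function and the hypothesis are hypothetical stand-ins:

```shell
# Hypothesis (illustrative): empty input silently yields None instead of raising
PROBE_OUT=$(python3 - <<'EOF'
def parse_amount(s):                 # stand-in for the real suspect function
    return float(s) if s else None
print(parse_amount("") is None)      # does the hypothesis hold?
EOF
)
echo "$PROBE_OUT"   # → True
```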
### Step 4: Hand off to fix
Root cause confirmed. Transition to fix mode with diagnosis as input — fix's Step 1 pre-answered.
Emit handoff block:
```
Root cause: <confirmed hypothesis from Step 3>
Suspect file(s): <files identified in Steps 1-2>
Evidence: <key signals that confirmed the hypothesis>
```
Write diagnosis to file before handing off — enables /develop:fix to skip Step 1 analysis via --diagnosis <path>:
```bash
SLUG=$(echo "<symptom first 4 words>" | tr ' ' '-' | tr '[:upper:]' '[:lower:]' | tr -cd '[:alnum:]-')
[ -z "$SLUG" ] && SLUG="unnamed-$(date +%s)"
DIAG_FILE=".plans/active/debug_${SLUG}.md"
mkdir -p .plans/active
```
Write $DIAG_FILE with this structure:
```markdown
# Debug Diagnosis: <symptom>

## Root Cause
<one sentence — confirmed hypothesis>

## Suspect Files
- path/to/file.py — <reason>

## Evidence
- <signal 1 that confirmed hypothesis>
- <signal 2>

## Confidence
<high|medium|low>
```
Hand off: -> /develop:fix --diagnosis $DIAG_FILE. Root cause already known — fix's Step 1 analysis is complete.
## Final Report
After root cause confirmed and handoff to /develop:fix complete, emit terminal summary:
```
Root Cause: <one sentence>
File(s): <suspect files>
Evidence: <key signals>
→ Handed off to /develop:fix --diagnosis $DIAG_FILE

## Confidence
**Score**: 0.N — [high ≥0.9 | moderate 0.8–0.9 | low <0.8 ⚠]
**Gaps**:
- [e.g., unverified alternative hypotheses, hypothesis only — not confirmed via test reproduction]
**Refinements**: N passes.
```
Follow-up gate (NEVER SKIP) — Call AskUserQuestion tool — do NOT write options as plain text first. Map options directly into the tool call arguments:
- question: "Proceed with fix?"
- (a) label: `/develop:fix --diagnosis $DIAG_FILE` — description: proceed with fix using confirmed diagnosis
- (b) label: `skip` — description: no action
## Team Assignments
When to use team mode: root cause unclear after Step 2, OR failure spans 3+ modules.
- Teammates 1-3 (foundry:sw-engineer x 2-3, model=opus): each investigates a distinct root-cause hypothesis independently
Coordination:
- Lead broadcasts current evidence: `{symptom: <description>, traceback: <key lines>}`
- Each teammate claims one hypothesis, investigates independently — no overlap
- Lead facilitates cross-challenge between competing analyses
- Convergence deadline: after cross-challenge, if teammates still disagree on root cause, lead selects the hypothesis with the most direct evidence (observable in code or logs). If truly tied, use `AskUserQuestion` to present the top 2 competing hypotheses and ask the user to guide.
- Lead synthesises consensus root cause, then executes Steps 3-4 (hypothesis gate, hand off to fix) alone
Spawn prompt template:
```
You are a foundry:sw-engineer teammate debugging: [symptom].
Read ${HOME}/.claude/TEAM_PROTOCOL.md — use AgentSpeak v2 for inter-agent messages.
Your hypothesis: [hypothesis N]. Investigate ONLY this root cause.
Report findings to @lead using deltaT# or epsilonT# codes.
Compact Instructions: preserve file paths, errors, line numbers. Discard verbose tool output.
Task tracking: do NOT call TaskCreate or TaskUpdate — the lead owns all task state. Signal completion in your final delta message: "Status: complete | blocked — <reason>".
Write your full analysis to .plans/active/debug-hypothesis-[N]-[timestamp].md using the Write tool. Return ONLY compact JSON: {"status":"done","file":"<path>","findings":N,"confidence":0.N}.
```
</workflow>