Skip to content

feat(v3.3.0): wiki knowledge base + auto-research loop + multi-LLM council#53

Merged
rohitg00 merged 5 commits into
mainfrom
feat/wiki-knowledge-base
May 9, 2026
Merged

feat(v3.3.0): wiki knowledge base + auto-research loop + multi-LLM council#53
rohitg00 merged 5 commits into
mainfrom
feat/wiki-knowledge-base

Conversation

@rohitg00
Copy link
Copy Markdown
Owner

@rohitg00 rohitg00 commented May 8, 2026

Summary

Persistent knowledge plane layered on top of pro-workflow's self-correction memory. Five new skills and a research-loop driver that turns any wiki into an auto-grown knowledge base.

What's new

Skills (5):

  • wiki-builder — 9 flavors (research, paper, domain, product, person, organization, project, codebase, incident), SQLite FTS5 shadow index, --scope global|project
  • wiki-query — BM25 retrieval with snippets, related and show subcommands
  • wiki-research-loop — budget-capped BFS driver, pluggable source fetchers (web / arXiv / GitHub + custom), convergence detection, kill-switch via ~/.pro-workflow/STOP
  • llm-council — provider-agnostic 3-phase deliberation (Anthropic / OpenAI / OpenRouter / Fireworks / custom OpenAI-compatible)
  • survey-generator — provider-agnostic literature survey artifact, output target = wiki markdown page (not standalone HTML)

Scripts:

  • scripts/embed-wiki.js — optional embeddings via OpenAI / Voyage. Hybrid BM25 + vector + RRF retrieval. Degrades to BM25-only when no embedding key set
  • scripts/research-tick.js — cron-ready single-iteration runner
  • scripts/learn-capture.js parses Wiki: <slug> for wiki-scoped learning rules
  • scripts/prompt-submit.js auto-injects top-3 wiki hits when prompt matches the index
  • scripts/session-start.js lists registered wikis on session boot
  • scripts/file-changed.js enqueues verify-seeds when files inside a wiki tree change

Schema additions:

  • wikis, wiki_pages (+ wiki_pages_fts FTS5), wiki_sources, wiki_claims, wiki_seeds, wiki_embeddings, learnings_wiki
  • Build now copies schema.sql into dist/ (replaces the silent inline-fallback hazard)

Commands:

  • /wiki unified entry: init, list, info, page, reindex, ask, related, show, seed, research, seeds, cancel, status, embed, hybrid, council, survey
  • /doctor extended with wiki KB and council-provider sections

Storage

  • Global: ~/.pro-workflow/wikis/<slug>/
  • Project: <project>/.claude/wikis/<slug>/ via --scope project

Both indexed in ~/.pro-workflow/data.db.

Headline counts

29 skills · 8 agents · 22 commands · 31 hook scripts across 24 events

Test plan

  • init_wiki.sh scaffolds correct folder layout per flavor
  • wiki-cli.js init/list/info/page/reindex round-trip via SQLite
  • wiki-query BM25 search, scoped + cross-wiki + related
  • wiki-research-loop end-to-end with stub fetcher: seed enqueue, page compile, follow-up enqueue, novelty calc, kill-switch, cancel
  • learn-capture parses Wiki: <slug> and links learning via learnings_wiki
  • session-start lists wikis; prompt-submit injects top-3 hits when ≥3 words match
  • council.js providers reports env-var availability across 5 providers
  • embed-wiki.js exits cleanly without provider env (no crash on missing key)
  • npm run build clean (tsc + schema copy); tsc --noEmit clean
  • Live API test for council / survey / embed-wiki (deferred — requires user-supplied keys)

Backwards compatibility

  • New tables only; no existing-row migrations needed
  • Existing learnings flow unchanged unless Wiki: line present
  • Inline schema-fallback in db/index.ts removed; build now mandatory (npm run build always was, this just makes the failure mode loud)

Follow-ups (not in this PR)

  • Live integration tests for council / survey under user-supplied API keys
  • sqlite-vec native binding (currently uses JS dot-product; works for thousands of pages, slower at 100k+)
  • LLM-judged claim novelty (currently Jaccard token overlap)

Summary by CodeRabbit

  • New Features

    • Persistent research wikis with FTS5 search, wiki CLI, page management, init templates, and wiki-scoped learnings
    • Wiki querying, hybrid search (embeddings + BM25), embedding tooling, and related-page discovery
    • Opt-in auto-research loop, research tick scheduler, and wiki seed workflow
    • LLM Council multi-model consensus, wiki-persisted transcripts, and survey-generator for literature reviews
  • Documentation

    • Updated README, onboarding tour, how Pro Workflow compares, and many new skill/command docs
  • Chores

    • Bumped package to v3.3.0 and updated build to include DB schema artifacts

Encodes the four upstream-failure preventions Karpathy named: silent
assumptions, overcomplicated diffs, drive-by edits, vague success
criteria. Self-correction catches mistakes after the fact - this
catches them before.

Adds rules/pre-flight-discipline.mdc (alwaysApply: true) and a §1b
cross-reference in SKILL.md so it sits next to self-correction in
the narrative. No changes to existing rules or agents.

Adapted from forrestchang/andrej-karpathy-skills (MIT) with light
reframing to match pro-workflow's voice and structure.
…uncil

Persistent knowledge plane on top of self-correction memory.

Skills (5 new):
- wiki-builder: 9 flavors, FTS5 shadow index, --scope global|project
- wiki-query: BM25 retrieval with snippet, related, show subcommands
- wiki-research-loop: budget-capped BFS driver, pluggable source fetchers
  (web/arxiv/github + custom), convergence detection, kill-switch
- llm-council: provider-agnostic 3-phase deliberation
  (Anthropic/OpenAI/OpenRouter/Fireworks/custom OpenAI-compat)
- survey-generator: provider-agnostic literature survey, output to wiki page

Scripts:
- embed-wiki.js: optional embeddings via OpenAI/Voyage; hybrid BM25+vector+RRF
- research-tick.js: cron-driven single-iteration runner
- learn-capture: parses Wiki: <slug> for wiki-scoped learnings
- prompt-submit: auto-injects top-3 wiki hits when prompt matches index
- session-start: lists registered wikis on session boot
- file-changed: edits inside wiki tree auto-enqueue verify seeds

Schema:
- wikis, wiki_pages (+FTS5), wiki_sources, wiki_claims, wiki_seeds,
  wiki_embeddings, learnings_wiki

Commands:
- /wiki unified entry: init, list, info, page, reindex, ask, related,
  show, seed, research, seeds, cancel, status, embed, hybrid, council, survey
- /doctor extended with wiki KB + council provider sections

Build:
- schema.sql now copied into dist/ on build, eliminating silent
  inline-fallback hazard
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 8, 2026

📝 Walkthrough

Walkthrough

This PR introduces pro-workflow v3.3.0: a persistent wiki knowledge base (FTS5 full-text search plus optional embeddings), DB schema and store API extensions, CLI tooling (wiki-cli, wiki-query, embed-wiki), an automated research loop with fetchers, an LLM council workflow, survey-generation tooling, orchestration hooks (file-change/research-tick/session-start), and documentation and package updates.

Changes

Wiki Knowledge Base System

Layer / File(s) Summary
Database Schema
src/db/schema.sql
Adds wikis, wiki_pages, wiki_sources, wiki_claims, wiki_seeds, wiki_embeddings, learnings_wiki, FTS5 wiki_pages_fts, triggers, and supporting indexes.
DB Init
src/db/index.ts, package.json
Schema loading now locates schema.sql from build or source and fails with a clear message if missing; package build copies schema.sql into dist/db.
Store API / Types
src/db/store.ts
Exports new types (WikiFlavor, WikiScope, Wiki, WikiPage, WikiSearchHit, WikiSeed) and extends Store with wiki/page CRUD, searchWiki(...) (FTS5 + sanitize), seed-queue ops, and addLearning(..., wikiSlug?) plus getLearningsByWiki.
Embeddings & Search
src/search/embeddings.ts, scripts/embed-wiki.js
Adds embedding provider abstraction (OpenAI/Voyage), vector serialization/upsert, vectorSearch, and reciprocalRankFusion; CLI embed-wiki.js to batch embeddings and perform vector/BM25/hybrid search.
Wiki Builder CLI & Templates
skills/wiki-builder/scripts/init_wiki.sh, skills/wiki-builder/scripts/wiki-cli.js, skills/wiki-builder/templates/*, skills/wiki-builder/SKILL.md
Adds init_wiki.sh scaffold, wiki-cli.js for init/list/info/page/reindex, templates for index/config/sources/prompts, and documentation describing flavors and layout.
Wiki Query CLI
skills/wiki-query/scripts/query.js, skills/wiki-query/SKILL.md
Adds query.js supporting search, related, and show backed by FTS5 BM25, with JSON/formatted outputs and SessionStart integration.
Research Loop & Fetchers
skills/wiki-research-loop/scripts/research-loop.js, skills/wiki-research-loop/scripts/source-fetchers/*, skills/wiki-research-loop/SKILL.md
Adds automated BFS research loop with pluggable fetchers (web/arXiv/GitHub), claim extraction, novelty (Jaccard) checks, seed queue management, budget/depth halting, and CLI seed/status/cancel commands.
Orchestration Scripts
scripts/research-tick.js, scripts/file-changed.js, scripts/session-start.js, scripts/prompt-submit.js, scripts/learn-capture.js
Adds cron-like research-tick.js, reactive file-change enqueue for wiki edits, session-start wiki listing, prompt-submit wiki hints, and learn-capture linking to learnings_wiki.
LLM Council CLI
skills/llm-council/scripts/council.js, skills/llm-council/SKILL.md
Adds council.js implementing a three-phase multi-model workflow (fan-out responses, anonymized ranking, chairman synthesis) with optional --wiki persistence.
Survey Generator
skills/survey-generator/scripts/build-survey.js, skills/survey-generator/templates/research_bundle.template.json, skills/survey-generator/SKILL.md
Adds build-survey.js to generate versioned markdown surveys from research bundles, append deduped bibliography rows to sources.md, and register results via wiki-cli.
Docs / Commands / Packaging
README.md, commands/wiki.md, commands/doctor.md, package.json
README updated with "What's New in v3.3", /wiki command reference, /doctor checks for wiki and council providers; package.json bumped to 3.3.0 and build script copied schema into dist/db.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

A rabbit builds wikis from seeds sown wide,
With FTS5 searching and vectors inside,
Councils of models debate and refine,
Surveys compile threads into a line,
Hop—knowledge grows, indexed and bona fide. 🐰📚✨

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/wiki-knowledge-base

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

🟠 Major comments (21)
package.json-8-8 (1)

8-8: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

build script is not cross-platform due to cp.

This will fail in standard Windows environments and break release/build workflows for some contributors.

💡 Suggested fix
-    "build": "tsc && cp src/db/schema.sql dist/db/schema.sql",
+    "build": "tsc && node -e \"const fs=require('fs');fs.mkdirSync('dist/db',{recursive:true});fs.copyFileSync('src/db/schema.sql','dist/db/schema.sql')\"",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@package.json` at line 8, The package.json "build" script uses the Unix-only
"cp" command which fails on Windows; update the "build" script (the "build" npm
script entry) to perform a cross-platform copy instead—either add a dev
dependency like "cpx" or "copyfiles" and change the script to use that tool to
copy src/db/schema.sql to dist/db/schema.sql, or implement a small Node script
(e.g., copy-schema.js) and call node copy-schema.js from the "build" script;
ensure the chosen tool/script is added to devDependencies and referenced in the
"build" script so builds work on all platforms.
src/search/embeddings.ts-93-123 (1)

93-123: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Vector search currently mixes incompatible embedding dimensions.

cosine() truncates to the shorter vector, so mismatched embeddings can still get ranked, producing incorrect results.

💡 Suggested fix
-function cosine(a: Float32Array, b: Float32Array): number {
+function cosine(a: Float32Array, b: Float32Array): number {
+  if (a.length !== b.length) return 0;
   let dot = 0, na = 0, nb = 0;
-  const len = Math.min(a.length, b.length);
+  const len = a.length;
   for (let i = 0; i < len; i++) {
     dot += a[i] * b[i];
@@
 export function vectorSearch(db: Database.Database, queryVec: Float32Array, opts: { wikiSlug?: string; limit?: number } = {}): VectorHit[] {
   const limit = opts.limit ?? 10;
   const sql = opts.wikiSlug
-    ? `SELECT e.page_id, e.vector FROM wiki_embeddings e JOIN wiki_pages p ON p.id = e.page_id WHERE p.wiki_slug = ?`
-    : `SELECT e.page_id, e.vector FROM wiki_embeddings e`;
-  const rows = opts.wikiSlug ? db.prepare(sql).all(opts.wikiSlug) : db.prepare(sql).all();
+    ? `SELECT e.page_id, e.vector, e.dim FROM wiki_embeddings e JOIN wiki_pages p ON p.id = e.page_id WHERE p.wiki_slug = ? AND e.dim = ?`
+    : `SELECT e.page_id, e.vector, e.dim FROM wiki_embeddings e WHERE e.dim = ?`;
+  const rows = opts.wikiSlug
+    ? db.prepare(sql).all(opts.wikiSlug, queryVec.length)
+    : db.prepare(sql).all(queryVec.length);
   const scored: VectorHit[] = [];
-  for (const r of rows as { page_id: number; vector: Buffer }[]) {
+  for (const r of rows as { page_id: number; vector: Buffer; dim: number }[]) {
     const v = blobToF32(r.vector);
+    if (v.length !== queryVec.length) continue;
     scored.push({ page_id: r.page_id, similarity: cosine(queryVec, v) });
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/search/embeddings.ts` around lines 93 - 123, The vectorSearch
implementation can compare embeddings of different dimensions because cosine()
truncates to the shorter vector; update vectorSearch to skip (or optionally log)
any row where blobToF32(r.vector).length !== queryVec.length before calling
cosine, so only equal-dimension vectors are scored; reference
functions/identifiers: vectorSearch, cosine, blobToF32, VectorHit. Ensure the
behavior is deterministic (skip mismatched vectors and still apply limit) and
consider returning fewer results if many vectors are filtered out.
skills/wiki-research-loop/scripts/source-fetchers/web.js-18-20 (1)

18-20: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Redirect resolution can generate invalid URLs for relative Location values.

Using hostname string concatenation breaks redirects like Location: next-page. Resolve with URL base semantics instead.

💡 Suggested fix
       if (res.statusCode >= 300 && res.statusCode < 400 && res.headers.location) {
-        const loc = res.headers.location.startsWith('http') ? res.headers.location : `https://${u.hostname}${res.headers.location}`;
+        const loc = new URL(res.headers.location, u).toString();
+        res.resume();
         return httpsGet(loc, headers).then(resolve, reject);
       }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-research-loop/scripts/source-fetchers/web.js` around lines 18 -
20, The redirect handling in httpsGet currently constructs a new URL by
concatenating strings (using u.hostname) which produces invalid URLs for
relative Location headers; change the resolution to use the URL constructor with
a base (e.g., const loc = new URL(res.headers.location, u).toString()) so
relative locations like "next-page" resolve correctly, then call httpsGet(loc,
headers). Update the redirect branch where res.statusCode >= 300 &&
res.statusCode < 400 && res.headers.location to use this URL-based resolution
instead of manual string concatenation.
src/search/embeddings.ts-17-38 (1)

17-38: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add explicit timeout handling to embedding API HTTP calls.

Without a request timeout, the postJSON function at lines 17-38 can hang indefinitely when calling external embedding providers (OpenAI, Voyage), blocking embedding jobs and tying up resources.

Add req.setTimeout(15000, () => req.destroy(new Error('Embedding request timeout'))) after line 35 to enforce a 15-second timeout on all embedding API requests.

Suggested fix
     req.on('error', reject);
+    req.setTimeout(15000, () => req.destroy(new Error('Embedding request timeout')));
     req.write(data);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/search/embeddings.ts` around lines 17 - 38, The postJSON function can
hang indefinitely; after creating the https request (the req variable in
postJSON) add an explicit 15s timeout by calling req.setTimeout(15000, () =>
req.destroy(new Error('Embedding request timeout'))), keeping the existing
req.on('error', reject) so the timeout error rejects the Promise; ensure this
call is placed before req.write()/req.end() so timed-out requests are cleaned up
and propagated to callers of postJSON.
skills/wiki-research-loop/scripts/source-fetchers/github.js-3-13 (1)

3-13: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add timeout and redirect cap to outbound GitHub requests.

Current request flow can hang indefinitely—no timeout is set on the request. Additionally, the function can recurse infinitely on redirect loops since there is no cap on redirects, potentially stalling the research loop worker.

Suggested fix
-function httpsGet(url, headers = {}) {
+function httpsGet(url, headers = {}, redirects = 0) {
   return new Promise((resolve, reject) => {
+    if (redirects > 5) return reject(new Error('Too many redirects'));
     const opts = { headers: { 'User-Agent': 'pro-workflow/wiki-research-loop', Accept: 'application/vnd.github+json', ...headers } };
-    https.get(url, opts, res => {
+    const req = https.get(url, opts, res => {
       if (res.statusCode >= 300 && res.statusCode < 400 && res.headers.location) {
-        return httpsGet(res.headers.location, headers).then(resolve, reject);
+        res.resume();
+        return httpsGet(res.headers.location, headers, redirects + 1).then(resolve, reject);
       }
       let data = '';
       res.on('data', c => { data += c; });
       res.on('end', () => resolve({ status: res.statusCode, body: data }));
-    }).on('error', reject);
+    });
+    req.setTimeout(10000, () => req.destroy(new Error('GitHub fetch timeout')));
+    req.on('error', reject);
   });
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-research-loop/scripts/source-fetchers/github.js` around lines 3 -
13, The httpsGet function can hang and recurse indefinitely on redirects; update
httpsGet to accept and enforce a timeout and a max-redirects cap (e.g., add
optional params timeoutMs and maxRedirects with sane defaults like 10s and 5),
set the request timeout (use the request object's setTimeout or a timer that
rejects with a clear "timeout" Error), abort the request and reject when timed
out, and when following redirects decrement/track remaining redirects and reject
if the cap is exceeded (use the same httpsGet signature when recursing so you
pass the updated remaining-redirects and headers); ensure all timers and
listeners are cleaned up on resolve/reject.
src/db/index.ts-36-40 (1)

36-40: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Close the SQLite handle on schema load/exec failure paths.

If schema file lookup, read, or exec fails, the opened DB handle at line 26 is left open. While some callers wrap createStore() with error handling, the resource should be cleaned up in initializeDatabase() itself.

💡 Suggested fix
-  const schemaPath = candidates.find(p => fs.existsSync(p));
-  if (!schemaPath) {
-    throw new Error(`pro-workflow: schema.sql not found. Tried: ${candidates.join(', ')}. Run: npm run build`);
-  }
-  const schema = fs.readFileSync(schemaPath, 'utf8');
-  db.exec(schema);
+  try {
+    const schemaPath = candidates.find(p => fs.existsSync(p));
+    if (!schemaPath) {
+      throw new Error(`pro-workflow: schema.sql not found. Tried: ${candidates.join(', ')}. Run: npm run build`);
+    }
+    const schema = fs.readFileSync(schemaPath, 'utf8');
+    db.exec(schema);
+  } catch (err) {
+    db.close();
+    throw err;
+  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/index.ts` around lines 36 - 40, initializeDatabase() opens the SQLite
handle (db) early but currently throws on missing schemaPath, read errors, or
db.exec failures without closing db; update initializeDatabase() (the function
that calls createStore() and uses the db variable) to ensure the db is closed on
all error paths by wrapping the schema lookup/read/exec in a try/catch and
calling db.close() before rethrowing the error (or use try/finally to close on
failure), so that if schemaPath is not found, fs.readFileSync throws, or db.exec
throws, the open handle is properly closed.
src/db/store.ts-230-234 (1)

230-234: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Claim pending seeds atomically.

nextPendingSeed() reads a row and setSeedStatus() updates it later, so two workers can pick the same pending seed before either marks it active. This queue needs a single claim operation that selects and updates in one transaction.

One way to fix it
-  const nextPendingSeedStmt = db.prepare(`
-    SELECT * FROM wiki_seeds WHERE wiki_slug = ? AND status = 'pending'
-    ORDER BY depth ASC, created_at ASC LIMIT 1
-  `);
-  const setSeedStatusStmt = db.prepare(`UPDATE wiki_seeds SET status = ? WHERE id = ?`);
+  const claimPendingSeedStmt = db.prepare(`
+    UPDATE wiki_seeds
+    SET status = 'active'
+    WHERE id = (
+      SELECT id
+      FROM wiki_seeds
+      WHERE wiki_slug = ? AND status = 'pending'
+      ORDER BY depth ASC, created_at ASC
+      LIMIT 1
+    )
+    RETURNING *
+  `);

Then expose claimPendingSeed(wikiSlug) instead of the current select-then-update pair.

Also applies to: 388-393

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/store.ts` around lines 230 - 234, nextPendingSeed() and
setSeedStatus() perform separate select and update steps allowing two workers to
claim the same pending seed; replace them with a single atomic operation: add a
new claimPendingSeed(wikiSlug) that within a transaction selects the oldest
pending wiki_seeds row (ORDER BY depth ASC, created_at ASC LIMIT 1) FOR UPDATE
(or uses a single UPDATE ... WHERE id = (subselect) returning id/fields) and
immediately sets status='active' and updated_at, then returns the claimed row;
update call sites that used nextPendingSeed()/setSeedStatus() to call
claimPendingSeed(wikiSlug) instead.
src/db/store.ts-250-262 (1)

250-262: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make learning creation and wiki-linking atomic.

If linkLearningWikiStmt.run(...) fails after the insert, the learning row is already committed and the caller gets a partial write. Wrap both statements in one transaction so addLearning() is all-or-nothing.

Suggested fix
+  const addLearningTx = db.transaction((learning: Omit<Learning, 'id' | 'created_at' | 'times_applied'>, wikiSlug?: string) => {
+    const result = addLearningStmt.run({
+      project: learning.project ?? null,
+      category: learning.category,
+      rule: learning.rule,
+      mistake: learning.mistake ?? null,
+      correction: learning.correction ?? null,
+    });
+    const row = getLearningStmt.get(result.lastInsertRowid) as Learning;
+    if (wikiSlug) {
+      linkLearningWikiStmt.run(row.id, wikiSlug);
+    }
+    return row;
+  });
+
     addLearning(learning, wikiSlug) {
-      const result = addLearningStmt.run({
-        project: learning.project ?? null,
-        category: learning.category,
-        rule: learning.rule,
-        mistake: learning.mistake ?? null,
-        correction: learning.correction ?? null,
-      });
-      const row = getLearningStmt.get(result.lastInsertRowid) as Learning;
-      if (wikiSlug) {
-        linkLearningWikiStmt.run(row.id, wikiSlug);
-      }
-      return row;
+      return addLearningTx(learning, wikiSlug);
     },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/store.ts` around lines 250 - 262, addLearning currently inserts via
addLearningStmt.run and then calls linkLearningWikiStmt.run separately, which
can leave a partial write if linking fails; make the insert+optional link atomic
by wrapping the insert, get (getLearningStmt.get), and conditional link call
inside a single database transaction (use the DB/connection transaction API or
explicit BEGIN/COMMIT/ROLLBACK) so that any failure during
linkLearningWikiStmt.run rolls back the insert; update the addLearning function
to perform addLearningStmt.run, fetch the inserted row with getLearningStmt.get,
run linkLearningWikiStmt.run(row.id, wikiSlug) only inside the same transaction,
and rethrow the error after rollback.
skills/llm-council/scripts/council.js-179-190 (1)

179-190: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Wrap Promise.all() calls or use Promise.allSettled() to prevent a single model failure from aborting the entire council.

Phase 1 (line 179) and Phase 2 (line 189) both use Promise.all(...) directly on provider.call(...) without per-call rejection handling. A single failed provider request will reject the entire promise, dropping all other completed results. This leaves the run partially written (e.g., phase1_responses.json exists but phase2 never runs).

Use Promise.allSettled() or wrap each call in a rejection handler to allow the council to continue with surviving models.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/llm-council/scripts/council.js` around lines 179 - 190, The
Promise.all calls for phase1Entries and phase2Entries can abort the whole run if
one provider.call rejects; change them to use Promise.allSettled (or wrap each
provider.call in a per-call .catch to return a placeholder error object) so
failures from individual models don't reject the entire batch. Specifically
update the calls that create phase1Entries and phase2Entries (which currently
call provider.call(provider, m, ...)) to collect settled results, map fulfilled
results back into phase1 and phase2 (using phase1Entries/phase2Entries and
label_of/rankings) and handle rejected entries by storing an error placeholder
(so fs.writeFileSync of phase1_responses.json and the subsequent phase2
construction still proceed).
src/db/store.ts-405-415 (1)

405-415: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't merge punctuation-delimited terms before FTS lookup.

This normalization turns wiki-builder, gpt-4o, or store.ts into wikibuilder, gpt4o, and storets, which no longer match how FTS tokenizes the indexed text. Replace punctuation with separators instead of deleting it.

Suggested fix
 function sanitizeFtsQuery(input: string, loose = false): string {
   const trimmed = input.trim();
   if (!trimmed) return '';
-  const tokens = trimmed.split(/\s+/)
-    .map(t => t.replace(/[^A-Za-z0-9_]/g, '').toLowerCase())
+  const tokens = trimmed
+    .toLowerCase()
+    .replace(/[^a-z0-9_]+/g, ' ')
+    .trim()
+    .split(/\s+/)
     .filter(t => t.length >= 2 && !STOPWORDS.has(t));
   if (!tokens.length) return '';
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/store.ts` around lines 405 - 415, sanitizeFtsQuery currently removes
punctuation (t.replace(/[^A-Za-z0-9_]/g, '')) which merges punctuation-delimited
terms (e.g., "gpt-4o" -> "gpt4o") and breaks FTS token matching; instead, change
the normalization in sanitizeFtsQuery so punctuation is replaced with separators
(e.g., spaces) before tokenization rather than deleted: perform the
non-alphanumeric replacement on the whole input (or on each token) using a
replacement like ' ' (space), then split on whitespace, lowercase, filter by
STOPWORDS and length, and proceed with the existing loose/exact quoting logic so
hyphenated/period-delimited identifiers remain separate tokens for FTS.
skills/llm-council/scripts/council.js-141-147 (1)

141-147: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use path.posix.join() for rel_path to ensure consistent POSIX separators across platforms.

rel_path is stored as a database key and compared directly in queries like SELECT * FROM wiki_pages WHERE wiki_slug = ? AND rel_path = ?. On Windows, path.join() produces backslashes instead of forward slashes, which breaks exact-match lookups and causes cross-platform inconsistencies.

Suggested fix
-    const relPath = path.join('derived', 'council', `${sessionId}.md`);
+    const relPath = path.posix.join('derived', 'council', `${sessionId}.md`);
     const fileAbs = path.join(wiki.root_path, relPath);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/llm-council/scripts/council.js` around lines 141 - 147, The relPath is
built with path.join which yields platform-specific backslashes on Windows and
breaks exact-match DB lookups; change building of rel_path to use
path.posix.join (e.g., set relPath = path.posix.join('derived','council',
`${sessionId}.md`)) so the stored rel_path always uses forward slashes, keep
fileAbs/file system writes using path.join(path.dirname(...), ...) as needed
(e.g., fileAbs = path.join(wiki.root_path, relPath) is fine), and ensure the
value passed to store.upsertWikiPage (rel_path) is the POSIX-joined string so DB
queries remain consistent across platforms.
skills/llm-council/scripts/council.js-60-77 (1)

60-77: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add a timeout to outbound provider requests.

https.request() in postJSON (line 64) does not configure a timeout. One stalled provider call will block the entire council run indefinitely until the process is killed.

Suggested fix
     const req = https.request({
       hostname: url.hostname,
       path: url.pathname + url.search,
       method: 'POST',
+      timeout: 60000,
       headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(data), ...headers },
     }, res => {
       let chunks = '';
       res.on('data', c => { chunks += c; });
       res.on('end', () => resolve({ status: res.statusCode, body: chunks }));
     });
+    req.on('timeout', () => req.destroy(new Error('request timed out')));
     req.on('error', reject);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/llm-council/scripts/council.js` around lines 60 - 77, postJSON
currently makes an outbound https.request without any timeout, so add a
configurable timeout to prevent stalled calls: update the postJSON signature to
accept a timeoutMs (e.g. default 10000), call req.setTimeout(timeoutMs, () => {
req.destroy(new Error('Request timed out')); }) and ensure the existing
req.on('error', ...) will reject with that timeout error; also consider
aborting/destroying the request and avoid leaving pending writes. Keep the rest
of the behavior (JSON body, headers) unchanged and use the function name
postJSON to locate the change.
skills/wiki-builder/scripts/init_wiki.sh-25-26 (1)

25-26: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject invalid --scope and --flavor values explicitly.

Typos currently pass through and produce ambiguous/incomplete wiki setup instead of failing fast.

Suggested fix
+case "$scope" in
+  global|project) ;;
+  *) echo "Invalid --scope: $scope (expected global|project)" >&2; exit 1 ;;
+esac
+
 case "$flavor" in
   paper) mkdir -p "$dest/wiki/sections" ;;
   domain|research) mkdir -p "$dest/wiki"/{concepts,papers,questions} ;;
   product) mkdir -p "$dest/wiki"/{features,decisions,issues} ;;
   person|organization) mkdir -p "$dest/wiki"/{publications,timelines} ;;
   project) mkdir -p "$dest/wiki"/{decisions,runbooks,questions} ;;
   codebase) mkdir -p "$dest/wiki"/{modules,symbols,decisions} ;;
   incident) mkdir -p "$dest/wiki"/{timeline,signals,fixes} ;;
+  *) echo "Invalid --flavor: $flavor" >&2; exit 1 ;;
 esac

Also applies to: 51-59

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-builder/scripts/init_wiki.sh` around lines 25 - 26, The script's
option parsing currently accepts any string for --scope and --flavor (variables
scope and flavor) which lets typos proceed; add explicit validation after
parsing (or in the case branches handling --scope/--flavor) that checks scope
against the allowed set (e.g., "local","global", etc.) and flavor against the
allowed flavors used later, and if a value is not in the allowed list call echo
>&2 with a clear error message and exit 1; implement small helper checks
(validate_scope and validate_flavor) or inline case statements to centralize
allowed values so both the initial parsing branches (where --scope/--flavor are
set) and any later re-checks (previously noted at the other section) use the
same validation logic.
skills/survey-generator/scripts/build-survey.js-96-101 (1)

96-101: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate bibliography entry schema before generation.

The current checks allow malformed entries; e.g., missing key will crash at Line 97, and missing title will break Line 100.

Suggested fix
   const bundle = JSON.parse(fs.readFileSync(bundlePath, 'utf8'));
   if (!bundle.topic || !Array.isArray(bundle.bibliography)) die('bundle missing topic or bibliography[]');
+  const invalid = bundle.bibliography.find(
+    b => !b || typeof b.key !== 'string' || !b.key.trim() || typeof b.title !== 'string' || !b.title.trim()
+  );
+  if (invalid) die('bundle bibliography[] entries must include non-empty string key and title');

Also applies to: 156-158

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/survey-generator/scripts/build-survey.js` around lines 96 - 101, The
bibliography generation loop does not validate entries before using b.key and
b.title, causing crashes when those fields are missing; in the for-loop that
iterates over bibliography (the block creating id with slugify(b.key), checking
seenKeys, computing url, and pushing to newRows) add a guard that skips entries
missing required fields (at minimum b.key and b.title), optionally logging or
counting invalid entries, and apply the same validation to the similar block
referenced around lines 156-158 so both places check b && typeof b.key ===
'string' && typeof b.title === 'string' before using slugify(b.key) or b.title.
skills/survey-generator/SKILL.md-111-117 (1)

111-117: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align documented sources.md ID format with generator output.

Line 116 says [^src-NNN], but the generator writes table IDs in src-bib-<slug> form. This mismatch will cause manual entries to drift from tool-generated format.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/survey-generator/SKILL.md` around lines 111 - 117, Update the
documented ID format in the "Hard rules" section so it matches the generator's
output: replace the example `[^src-NNN]` with the actual ID pattern the
generator emits (e.g., `[^src-bib-<slug>]` or `src-bib-<slug>` as used in
sources.md), and clarify that every bibliography row in `sources.md` must use
the `src-bib-<slug>` slug-style IDs; ensure the rule referencing `sources.md`
and the requirement in rule 2 use the same `src-bib-<slug>` format to prevent
drift between documentation and the generator.
skills/survey-generator/scripts/build-survey.js-187-191 (1)

187-191: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not report success when wiki indexing fails.

If wiki-cli page fails, the script still exits successfully, leaving a survey file that is not queryable from the wiki index.

Suggested fix
   try {
     execFileSync('node', [wikiCli, 'page', slug, relPath, '--type', 'survey'], { stdio: 'inherit' });
   } catch (e) {
-    console.error('[survey] wiki-cli page failed:', e.message);
+    die(`wiki-cli page failed: ${e.message}`);
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/survey-generator/scripts/build-survey.js` around lines 187 - 191, The
catch block around execFileSync in build-survey.js currently only logs the
error, which leaves the process exiting with success; update the catch to
terminate the script with a non-zero exit (e.g., call process.exit(1)) or
rethrow the error so failures in execFileSync('node', [wikiCli, 'page', ...])
cause the process to fail; locate the try/catch that wraps execFileSync and
ensure the catch uses process.exit(1) or throws the caught error (use e.message
in the log as currently done).
scripts/research-tick.js-63-66 (1)

63-66: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add timeout/error handling around spawnSync for loop execution.

This synchronous child process can block indefinitely; tick should fail fast and log timeout/error explicitly.

Suggested fix
-  const r = spawnSync('node', [LOOP_SCRIPT, 'run', target.slug, '--max-pages', '1'], { encoding: 'utf8' });
+  const r = spawnSync('node', [LOOP_SCRIPT, 'run', target.slug, '--max-pages', '1'], {
+    encoding: 'utf8',
+    timeout: 10 * 60 * 1000,
+  });
+  if (r.error) appendLog(`error: ${r.error.message}`);
+  if (r.signal) appendLog(`signal: ${r.signal}`);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/research-tick.js` around lines 63 - 66, The call to spawnSync in tick
(using spawnSync, LOOP_SCRIPT) can hang indefinitely; wrap the spawnSync call in
a try/catch and pass a timeout option (e.g. timeout: 30_000) and killSignal so
the child is terminated if it exceeds the limit, then explicitly log timeout or
error via appendLog (include error.message and any stderr/stdout snippets) and
return a failure result (e.g. ran: false or include an exit flag) when a timeout
or exception occurs; ensure existing handling of r.status and r.stderr remains
for normal exits.
skills/survey-generator/scripts/build-survey.js-32-49 (1)

32-49: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add request timeout/abort for provider POST calls.

This external call can hang indefinitely right now, which can stall automation and CI jobs.

Suggested fix
 function postJSON(urlStr, body, headers) {
   return new Promise((resolve, reject) => {
@@
     const req = https.request({
@@
     }, res => {
@@
     });
+    req.setTimeout(45_000, () => req.destroy(new Error('request timeout')));
     req.on('error', reject);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/survey-generator/scripts/build-survey.js` around lines 32 - 49, The
postJSON function can hang because the HTTPS request has no timeout or abort;
update postJSON to accept a timeout (or use a default, e.g., 30s) and wire an
Abort/timeout mechanism that aborts the request and rejects the Promise when
exceeded: create a timer (or use AbortController) tied to the https.request
(referencing function postJSON and the local req variable), clear the timer on
'end' and 'error', call req.abort() (or controller.abort()) when the timeout
fires, and ensure the Promise rejects with a descriptive timeout error so
callers can handle stalled provider POST calls.
skills/wiki-research-loop/scripts/research-loop.js-59-67 (1)

59-67: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle CRLF front matter.

The parser only accepts LF line endings. A Windows-authored wiki.config.md will parse as empty here, so auto_research.enabled, private, and the per-run limits silently stop applying.

Minimal fix
   const raw = fs.readFileSync(cfgPath, 'utf8');
-  const m = raw.match(/^---\s*\n([\s\S]*?)\n---/);
+  const m = raw.match(/^---\s*\r?\n([\s\S]*?)\r?\n---/);
   if (!m) return {};
   const obj = {};
   let nested = null;
-  for (const line of m[1].split('\n')) {
+  for (const line of m[1].split(/\r?\n/)) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-research-loop/scripts/research-loop.js` around lines 59 - 67, The
front-matter regex in research-loop.js (using
raw.match(/^---\s*\n([\s\S]*?)\n---/)) fails on CRLF files so m becomes null; to
fix, normalize the file contents (raw) by replacing CRLF with LF or update the
regex to accept \r?\n, then re-run the same match and parsing logic (references:
cfgPath, raw, m, and the front-matter parsing loop) so Windows-authored
wiki.config.md files correctly populate auto_research.enabled, private, and
per-run limits.
skills/wiki-research-loop/scripts/research-loop.js-257-258 (1)

257-258: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Create derived/ before writing the run artifact.

This write assumes wiki.root_path/derived already exists. If it doesn't, the command fails after pages and seed state have already been updated, which makes the run look broken even though partial work succeeded.

Minimal fix
     fs.writeFileSync(logFile, ['# Research run ' + ts, '', ...stats.log].join('\n'));
-    fs.writeFileSync(path.join(wiki.root_path, 'derived', `run-${ts}.json`), JSON.stringify(stats, null, 2));
+    const derivedDir = path.join(wiki.root_path, 'derived');
+    fs.mkdirSync(derivedDir, { recursive: true });
+    fs.writeFileSync(path.join(derivedDir, `run-${ts}.json`), JSON.stringify(stats, null, 2));
     return stats;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-research-loop/scripts/research-loop.js` around lines 257 - 258,
Before writing the run artifact, ensure the derived directory exists: create (if
missing) the directory at path.join(wiki.root_path, 'derived') prior to the
fs.writeFileSync that writes run-${ts}.json. Locate the block that calls
fs.writeFileSync(path.join(wiki.root_path, 'derived', `run-${ts}.json`), ...) in
research-loop.js and add a check/create step (using fs.mkdirSync or equivalent)
to create wiki.root_path/derived with recursive:true so the write cannot fail
when the directory is missing.
skills/wiki-research-loop/scripts/research-loop.js-197-255 (1)

197-255: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid leaving seeds stuck in active.

When the loop halts for budget on Line 218 or convergence on Line 247, the current seed never reaches the terminal update on Line 254. That leaves queue entries permanently active, which blocks retries and makes status lie about in-flight work. Please move the per-seed status transition into a try/finally and assign an explicit final status before every early break.

Suggested shape
       const seed = store.nextPendingSeed(slug);
       if (!seed) { stats.halted = 'queue-empty'; break; }
       if (seed.depth > maxDepth) { store.setSeedStatus(seed.id, 'done'); continue; }
-      store.setSeedStatus(seed.id, 'active');
+      let finalStatus = 'done';
+      let shouldBreak = false;
+      store.setSeedStatus(seed.id, 'active');

-      const docs = [];
-      for (const [name, fetcher] of Object.entries(fetchers)) {
-        try {
-          if (!fetcher.match(seed.query)) continue;
-          const cost = fetcher.estimateCost ? fetcher.estimateCost(seed.query) : { usd: 0 };
-          if (stats.cost_usd + (cost.usd || 0) > budget) { stats.halted = 'budget'; break; }
-          const hits = await fetcher.fetch(seed.query, { limit: 3 });
-          docs.push(...hits);
-          stats.cost_usd += cost.usd || 0;
-          stats.log.push(`[${new Date().toISOString()}] seed-${seed.id} fetcher=${name} hits=${hits.length}`);
-        } catch (e) {
-          stats.log.push(`[${new Date().toISOString()}] seed-${seed.id} fetcher=${name} ERROR ${e.message}`);
-        }
-      }
-      if (stats.halted === 'budget') break;
+      try {
+        const docs = [];
+        for (const [name, fetcher] of Object.entries(fetchers)) {
+          try {
+            if (!fetcher.match(seed.query)) continue;
+            const cost = fetcher.estimateCost ? fetcher.estimateCost(seed.query) : { usd: 0 };
+            if (stats.cost_usd + (cost.usd || 0) > budget) {
+              stats.halted = 'budget';
+              finalStatus = 'pending';
+              shouldBreak = true;
+              break;
+            }
+            const hits = await fetcher.fetch(seed.query, { limit: 3 });
+            docs.push(...hits);
+            stats.cost_usd += cost.usd || 0;
+            stats.log.push(`[${new Date().toISOString()}] seed-${seed.id} fetcher=${name} hits=${hits.length}`);
+          } catch (e) {
+            stats.log.push(`[${new Date().toISOString()}] seed-${seed.id} fetcher=${name} ERROR ${e.message}`);
+          }
+        }
+        if (shouldBreak) break;
+        ...
+        if (convergeStreak >= 3) {
+          stats.halted = 'converged';
+          shouldBreak = true;
+        }
+      } catch (e) {
+        finalStatus = 'failed';
+        throw e;
+      } finally {
+        store.setSeedStatus(seed.id, finalStatus);
+      }
+      if (shouldBreak) break;
-
-      store.setSeedStatus(seed.id, 'done');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-research-loop/scripts/research-loop.js` around lines 197 - 255,
The loop can break early (stats.halted='budget' or 'converged') while a seed
remains marked 'active', so wrap the per-seed work in a try/finally and always
call store.setSeedStatus(seed.id, <terminal>) in the finally block; move
store.setSeedStatus(seed.id, 'active') as now, then in the finally decide the
final state (e.g., 'done' when completed, 'failed' when compilePage returned
falsy, or propagate a specific halted reason like
'halted-budget'/'halted-converged' if you prefer) and ensure any early breaks
(when stats.halted is set inside the fetchers loop or after novelty checks) do
not exit without running the finally. Make sure to reference seed, stats.halted,
convergeStreak, budget, compilePage, deriveFollowUps, and use
store.setSeedStatus(seed.id, ...) in the finally to guarantee terminal status.
🟡 Minor comments (14)
skills/wiki-builder/templates/sources.md-3-3 (1)

3-3: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify citation format to match id format.

Line 3 states citations use [^id], but line 8 specifies the id format as src-NNN. Meanwhile, other templates (compile-concept-page.md, compile-source-page.md, wiki.config.md) consistently document the citation format as [^src-id].

This inconsistency could cause confusion: should authors write [^src-123] or [^123]?

📝 Proposed fix
-Every claim in `wiki/` cites a row from this table by `[^id]`.
+Every claim in `wiki/` cites a row from this table by `[^src-id]` (e.g., `[^src-123]`).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-builder/templates/sources.md` at line 3, Update the citation
example to match the documented id format: replace the ambiguous `[^id]` with
the canonical `[^src-id]` (or show the exact pattern `[^src-NNN]`) so it aligns
with the id format `src-NNN` and other templates (`compile-concept-page.md`,
`compile-source-page.md`, `wiki.config.md`); ensure the sentence now reads
something like "Every claim in `wiki/` cites a row from this table by
`[^src-id]` (e.g. `[^src-123]`)."
skills/wiki-builder/templates/prompts/compile-index.md-12-16 (1)

12-16: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language identifier to fenced code block.

The fenced code block lacks a language identifier, triggering markdown linting warnings. Since this template will be copied into user wikis, adding a language identifier improves tooling compatibility.

✨ Proposed fix
-```
+```markdown
 <!-- BEGIN GENERATED INDEX -->
 ...
 <!-- END GENERATED INDEX -->
 ```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-builder/templates/prompts/compile-index.md` around lines 12 - 16,
The fenced code block in the template currently uses ``` without a language tag;
update the template so the opening fence includes a language identifier (e.g.,
"markdown") to silence lint warnings and improve tooling when the block
containing <!-- BEGIN GENERATED INDEX --> and <!-- END GENERATED INDEX --> is
copied into user wikis; modify the snippet around the markers so the opening
fence is ```markdown and keep the closing fence as ``` to maintain valid fenced
code block markup.
skills/wiki-query/SKILL.md-68-72 (1)

68-72: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove or implement --snippet-len.

The docs say users can tune snippet length with --snippet-len, but query.js never parses that flag and the store hardcodes snippet(..., 16). Right now this is a no-op promise in the public interface.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-query/SKILL.md` around lines 68 - 72, The docs reference a
tunable `--snippet-len` flag but query.js never parses it and the store call
hardcodes snippet(..., 16), so implement parsing of the `--snippet-len` CLI/args
option in the same place query.js reads other flags (or the CLI parser used) and
propagate the resulting numeric value into the store's snippet call instead of
the hardcoded 16; update any function that constructs the store/search options
(the code invoking snippet(..., 16) in query.js) to accept a snippetLen
parameter and pass the parsed value through so the public interface actually
respects `--snippet-len`.
skills/llm-council/SKILL.md-49-50 (1)

49-50: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align the --wiki docs with the current implementation.

council.js uses --wiki only to persist the transcript. It does not preload wiki content into the system prompt, and it writes derived/council/<id>.md under the wiki root rather than wiki/derived/....

Also applies to: 80-86

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/llm-council/SKILL.md` around lines 49 - 50, Update the SKILL.md docs
to match the current implementation in council.js: state that the --wiki flag
only persists the session transcript (it does not preload wiki content into the
system prompt), and correct the file path to show the transcript is written
under the provided wiki root at derived/council/<session-id>.md (not
wiki/derived/...). Modify the occurrences around the current lines (including
the block mentioned and lines ~80–86) to reflect these two points and keep the
wording consistent with council.js behavior.
README.md-17-17 (1)

17-17: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Stale "What's Inside" section headers contradict the updated hero and comparison table

The hero text and comparison table (lines 17, 121, 123) were updated for v3.3.0, but the "What's Inside" subsection headers were not:

Location Hero / table What's Inside header
Skills 29 ### 24 Skills (line 145)
Commands 22 ### 21 Commands (line 187)
Hook scripts 31 ### 29 Hook Scripts (line 213)

The detailed tables under those headers also don't list the 5 new skills (wiki-builder, wiki-query, wiki-research-loop, llm-council, survey-generator) or the new /wiki command.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 17, Update the stale "What's Inside" subsection headers
and detailed tables to match the hero/comparison values: change the headers "###
24 Skills", "### 21 Commands", and "### 29 Hook Scripts" to "### 29 Skills",
"### 22 Commands", and "### 31 Hook Scripts" respectively, and add the five new
skill entries (wiki-builder, wiki-query, wiki-research-loop, llm-council,
survey-generator) into the Skills table and add the new /wiki command into the
Commands table so the detailed lists reflect the v3.3.0 hero and comparison
table.
skills/learn-rule/SKILL.md-33-36 (1)

33-36: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language specifier to the fenced code block (markdownlint MD040)

🔧 Proposed fix
-```
+```text
 [LEARN] Editing: Cite a sources.md row before adding any wiki claim.
 Wiki: agent-memory

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @skills/learn-rule/SKILL.md around lines 33 - 36, The fenced code block
containing the lines "[LEARN] Editing: Cite a sources.md row before adding any
wiki claim." and "Wiki: agent-memory" needs a language specifier to satisfy
markdownlint MD040; update the opening fence in SKILL.md for that block (the
block surrounding the LEARN/Wiki lines) to include a language token such as
"text" (i.e., change totext) so the code fence is explicitly labeled.


</details>

</blockquote></details>
<details>
<summary>scripts/embed-wiki.js-40-46 (1)</summary><blockquote>

`40-46`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_

**`--limit` caps pages *scanned*, not pages *embedded* — causes silent no-ops on re-runs**

`pages.slice(0, limit)` truncates the candidate list before the already-embedded check, so on any re-run the script always evaluates the same first `N` pages. Once those are embedded, every subsequent call (with the same `--limit`) exits with `"all up-to-date"` even though pages beyond the slice have never been touched.

`--limit` should cap how many pages are *enqueued for embedding*, not how many are *inspected*.

<details>
<summary>🐛 Proposed fix</summary>

```diff
-    for (const p of pages.slice(0, limit)) {
+    for (const p of pages) {
       const has = store.db.prepare('SELECT 1 FROM wiki_embeddings WHERE page_id = ? AND model = ?').get(p.id, `${provider.name}:${provider.model}`);
       if (has && !args.force) continue;
       todo.push(p);
+      if (todo.length >= limit) break;
     }
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/embed-wiki.js` around lines 40 - 46, The current logic uses
pages.slice(0, limit) which limits the pages inspected rather than the pages
enqueued, causing re-runs to skip unembedded pages; change the loop to inspect
all pages (iterate over pages) and only push up to 'limit' entries into the
'todo' array while still honoring the existing DB check
(store.db.prepare('SELECT 1 FROM wiki_embeddings WHERE page_id = ? AND model =
?').get(...)) and the '--force' flag; stop adding to 'todo' once todo.length ===
parsed args.limit to ensure the limit applies to enqueued embeddings rather than
scanned pages (references: pages.slice, args.limit, todo, store.db.prepare,
wiki_embeddings, args.force).
```

</details>

</blockquote></details>
<details>
<summary>commands/doctor.md-87-99 (1)</summary><blockquote>

`87-99`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_

**`$PRO_WORKFLOW_ROOT` is inconsistent with `$CLAUDE_PLUGIN_ROOT` used in sections 1–7**

Sections 8 and 9 use `$PRO_WORKFLOW_ROOT`, while every prior section (e.g. section 2a) uses `$CLAUDE_PLUGIN_ROOT`. If `PRO_WORKFLOW_ROOT` is not exported in the user's shell, the new commands will fail silently (swallowed by `2>/dev/null`), giving a false "clean" impression in the doctor output.

Either standardise on the same variable throughout or add a comment documenting how to set `PRO_WORKFLOW_ROOT`.

<details>
<summary>🔧 Suggested fix (if the variables are equivalent)</summary>

```diff
-node $PRO_WORKFLOW_ROOT/skills/wiki-builder/scripts/wiki-cli.js list 2>/dev/null
-node $PRO_WORKFLOW_ROOT/skills/wiki-research-loop/scripts/research-loop.js status 2>/dev/null
+node "$CLAUDE_PLUGIN_ROOT/skills/wiki-builder/scripts/wiki-cli.js" list 2>/dev/null
+node "$CLAUDE_PLUGIN_ROOT/skills/wiki-research-loop/scripts/research-loop.js" status 2>/dev/null
```

```diff
-node $PRO_WORKFLOW_ROOT/skills/llm-council/scripts/council.js providers 2>/dev/null
+node "$CLAUDE_PLUGIN_ROOT/skills/llm-council/scripts/council.js" providers 2>/dev/null
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commands/doctor.md` around lines 87 - 99, Sections 8 and 9 use
$PRO_WORKFLOW_ROOT while earlier sections use $CLAUDE_PLUGIN_ROOT, causing
silent failures; update the two occurrences of $PRO_WORKFLOW_ROOT in the
commands that invoke wiki-cli.js, research-loop.js and council.js to use
$CLAUDE_PLUGIN_ROOT (or, if they are genuinely different, add a one-line note
earlier in the doc showing how to export PRO_WORKFLOW_ROOT) so the CLI lines
reference the same environment variable as sections 1–7; specifically change the
commands that call node
$PRO_WORKFLOW_ROOT/skills/wiki-builder/scripts/wiki-cli.js, node
$PRO_WORKFLOW_ROOT/skills/wiki-research-loop/scripts/research-loop.js, and node
$PRO_WORKFLOW_ROOT/skills/llm-council/scripts/council.js providers to use
$CLAUDE_PLUGIN_ROOT or add the export instruction for PRO_WORKFLOW_ROOT.
```

</details>

</blockquote></details>
<details>
<summary>commands/wiki.md-49-54 (1)</summary><blockquote>

`49-54`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_

**Add language identifiers to fenced code blocks.**

Line 49 and Line 74 use unlabeled fences, which will keep triggering MD040 and reduce rendering/tooling quality.

 

<details>
<summary>Suggested doc patch</summary>

```diff
-```
+```bash
 /wiki init agent-memory --title "Agent Memory" --flavor research
 /wiki page agent-memory wiki/concepts/episodic-memory.md --type concept
 /wiki ask "what is episodic memory" --wiki agent-memory
 /wiki related agent-memory wiki/concepts/episodic-memory.md
 ```
@@
-```
+```text
 [wiki:<slug>] <Title> — `<rel_path>`
 ```
```
</details>


Also applies to: 74-76

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @commands/wiki.md around lines 49 - 54, The fenced code blocks in
commands/wiki.md are missing language identifiers which triggers MD040; update
each unlabeled triple-backtick block shown (the CLI example block with the /wiki
commands and the small example block showing "[wiki:] <Title> —
<rel_path>") to include appropriate languages (e.g., use "bash" for the CLI
command block and "text" for the output/example block) so linters render and
highlight them correctly; locate the blocks by the snippet contents (the /wiki
init/... lines and the "[wiki:] <Title> — <rel_path>" line) and add the
language tokens after the opening ``` marker.


</details>

</blockquote></details>
<details>
<summary>skills/survey-generator/SKILL.md-98-109 (1)</summary><blockquote>

`98-109`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_

**Add fenced block language tags (MD040).**

Use `text` for the directory-tree block and `bash` for command examples.

 


Also applies to: 122-131

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @skills/survey-generator/SKILL.md around lines 98 - 109, The markdown
contains fenced code blocks without language tags (MD040); update the tree block
shown (the triple-backtick block containing "/" and the directory
structure) to use the language tag "text" and any command/example fenced blocks
nearby (e.g., the bash-style examples referenced around lines 122-131) to use
the language tag "bash" so the blocks become text and bash respectively;
locate the blocks by the directory-tree content and by command/example content
in SKILL.md and add the appropriate tags.


</details>

</blockquote></details>
<details>
<summary>skills/wiki-builder/SKILL.md-41-51 (1)</summary><blockquote>

`41-51`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_

**Add language labels to fenced blocks to satisfy MD040.**

Use `text` for directory tree and `bash` for CLI examples.

 


Also applies to: 57-63

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-builder/SKILL.md` around lines 41 - 51, The fenced code blocks in
SKILL.md that show the ASCII directory tree (the block beginning with "<slug>/"
and entries like "wiki.config.md", "raw/", "wiki/") and the CLI example blocks
need language labels to satisfy MD040; update those triple-backtick fences to
use "text" for the directory-tree block and "bash" for any command/CLI example
blocks so markdown linters recognize the content type (search for the fenced
blocks containing the directory tree and CLI examples and add the appropriate
language identifiers).
```

</details>

</blockquote></details>
<details>
<summary>commands/wiki.md-18-18 (1)</summary><blockquote>

`18-18`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_

**Clarify feature availability wording for `/wiki research`.**

Line 18 documents `/wiki research` as available, but Line 68 says the loop driver “lands next.” This reads as contradictory and can mislead operators.

 


Also applies to: 56-69

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commands/wiki.md` at line 18, Update the wording to remove the contradictory
availability claim for the `/wiki research` command: change the command table
entry that lists `/wiki research <slug> ...` from implying it is fully available
to mark it as "experimental" or "coming soon", and edit the paragraph that
references the loop driver “lands next” so both places consistently state the
same status (e.g., "experimental — loop driver pending release" or "coming next
release"). Ensure both the command table entry for `/wiki research` and the
later explanatory section that mentions the loop driver use identical phrasing
so operators are not misled.
```

</details>

</blockquote></details>
<details>
<summary>skills/wiki-research-loop/SKILL.md-12-22 (1)</summary><blockquote>

`12-22`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_

**Specify languages for fenced code blocks (MD040).**

Unlabeled fences here will keep failing markdownlint and reduce syntax-aware rendering.

 


Also applies to: 36-42, 95-97, 117-120, 124-131

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @skills/wiki-research-loop/SKILL.md around lines 12 - 22, The fenced code
blocks in SKILL.md (e.g., the sequence beginning "seed-queue (pending) →
next-seed" and the other fences at the ranges you noted) are unlabeled and
triggering MD040; add an explicit language label to each opening fence (for
example text or mermaid if it’s a diagram) so markdownlint recognizes
them—update the fences around the blocks at the shown ranges (lines ~12-22,
36-42, 95-97, 117-120, 124-131) to include the appropriate language identifier.


</details>

</blockquote></details>
<details>
<summary>skills/wiki-research-loop/scripts/research-loop.js-139-149 (1)</summary><blockquote>

`139-149`: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_

**Define the footnotes you emit.**

Each claim references `[^src-N]`, but this function never appends matching footnote definitions, so every generated page has broken citation links. Either add the definitions at the end of the document or switch these to inline source links.  
 
<details>
<summary>Minimal fix</summary>

```diff
   for (const [i, c] of claims.entries()) {
     lines.push(`- ${c.text} [^src-${i + 1}]`);
   }
   lines.push('');
   lines.push('## Open follow-ups');
   lines.push('');
   lines.push('_Auto-extracted; review and prune._');
+  lines.push('');
+  for (const [i, c] of claims.entries()) {
+    lines.push(`[^src-${i + 1}]: ${c.source}`);
+  }
 
   return { content: lines.join('\n'), claims, novelty };
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-research-loop/scripts/research-loop.js` around lines 139 - 149,
The code emits footnote references `[^src-${i + 1}]` for each claim but never
appends matching footnote definitions, leaving broken links; fix this by adding
footnote definitions to the `lines` array before the `return { content:
lines.join('\n'), claims, novelty };` — iterate `claims` and push lines like a
definition per claim using the same index pattern (e.g., `[^src-1]: <source>`),
sourcing the URL/text from the claim object (e.g., `c.source`, `c.url`, or
another appropriate field), or alternatively replace the `[^src-${i + 1}]`
references with inline links if you prefer not to append footnotes; ensure you
update the same variables (`claims`, `lines`) and keep the `[^src-${i + 1}]`
naming consistent.
```

</details>

</blockquote></details>

</blockquote></details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: defaults

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `9f034ff1-fd28-4815-89b0-b66a1a2a10a8`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 9df6af8d39d78cc7cf30663e78b9f1dd842bfcc8 and 3dfc33f9eceab9a8a634f3ab0a3f2adbc59a1aeb.

</details>

<details>
<summary>📒 Files selected for processing (41)</summary>

* `README.md`
* `commands/doctor.md`
* `commands/wiki.md`
* `package.json`
* `scripts/embed-wiki.js`
* `scripts/file-changed.js`
* `scripts/learn-capture.js`
* `scripts/prompt-submit.js`
* `scripts/research-tick.js`
* `scripts/session-start.js`
* `skills/learn-rule/SKILL.md`
* `skills/llm-council/SKILL.md`
* `skills/llm-council/scripts/council.js`
* `skills/survey-generator/SKILL.md`
* `skills/survey-generator/scripts/build-survey.js`
* `skills/survey-generator/templates/research_bundle.template.json`
* `skills/wiki-builder/SKILL.md`
* `skills/wiki-builder/agents/openai.yaml`
* `skills/wiki-builder/references/wiki-flavors.md`
* `skills/wiki-builder/scripts/init_wiki.sh`
* `skills/wiki-builder/scripts/wiki-cli.js`
* `skills/wiki-builder/templates/index.md`
* `skills/wiki-builder/templates/maintenance-log.md`
* `skills/wiki-builder/templates/prompts/compile-concept-page.md`
* `skills/wiki-builder/templates/prompts/compile-index.md`
* `skills/wiki-builder/templates/prompts/compile-source-page.md`
* `skills/wiki-builder/templates/prompts/lint-wiki.md`
* `skills/wiki-builder/templates/prompts/query-and-file.md`
* `skills/wiki-builder/templates/sources.md`
* `skills/wiki-builder/templates/wiki.config.md`
* `skills/wiki-query/SKILL.md`
* `skills/wiki-query/scripts/query.js`
* `skills/wiki-research-loop/SKILL.md`
* `skills/wiki-research-loop/scripts/research-loop.js`
* `skills/wiki-research-loop/scripts/source-fetchers/arxiv.js`
* `skills/wiki-research-loop/scripts/source-fetchers/github.js`
* `skills/wiki-research-loop/scripts/source-fetchers/web.js`
* `src/db/index.ts`
* `src/db/schema.sql`
* `src/db/store.ts`
* `src/search/embeddings.ts`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment thread skills/wiki-builder/scripts/init_wiki.sh
Comment thread skills/wiki-builder/scripts/wiki-cli.js Outdated
Critical (path traversal):
- init_wiki.sh: validate slug as ^[a-z0-9][a-z0-9-]*$ before touching disk
- wiki-cli.js: resolve and verify rel-path stays inside wiki.root_path

Major:
- package.json: cross-platform schema copy via Node (not cp)
- src/search/embeddings.ts: cosine returns 0 on dim mismatch; vectorSearch
  filters by dim in SQL; postJSON 30s timeout
- web.js / github.js / arxiv.js: redirect cap (5), 15s timeout, URL-base
  resolution for relative Location headers, drain redirect responses
- src/db/index.ts: close DB on schema load/exec failure
- src/db/store.ts:
  * claimPendingSeed (UPDATE...RETURNING) atomic select+claim
  * addLearning + linkLearningWiki wrapped in db.transaction
  * sanitizeFtsQuery replaces punctuation with spaces (preserves gpt-4o,
    wiki-builder, store.ts as separate tokens)
- council.js: postJSON 120s timeout; phase 1+2 use Promise.allSettled so
  one provider failure no longer aborts the run; rel_path uses path.posix.join
- build-survey.js: postJSON 180s timeout; bibliography validates
  non-empty key+title before generation; die() on wiki-cli failure
- init_wiki.sh: explicit allowlist for --scope and --flavor
- research-tick.js: spawnSync 10min timeout + SIGKILL + error/signal logging
- research-loop.js:
  * front matter parser accepts CRLF
  * derived/ created via mkdirSync before writing run-*.json
  * try/finally guarantees seeds never stuck in 'active'
  * uses claimPendingSeed; budget halt re-marks seed pending for retry
- survey-generator SKILL.md: ID format aligned with generator output
  (src-bib-<slug>)
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Nitpick comments (1)
skills/wiki-research-loop/scripts/source-fetchers/arxiv.js (1)

42-42: ⚡ Quick win

match: () => true fires the arXiv fetcher on every query

arXiv is an academic paper repository; matching every research query (including non-academic ones like tool usage, configuration, etc.) will return irrelevant entries and inflate the BFS budget unnecessarily. If the research-loop uses match to filter fetchers, consider narrowing this predicate (e.g., topic keywords, explicit arxiv: URL prefixes) or documenting that this is intentional.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-research-loop/scripts/source-fetchers/arxiv.js` at line 42, The
match predicate for the arXiv fetcher (match: () => true) currently matches
every query; replace it with a narrower predicate on the match function to avoid
firing on non-academic queries—e.g., check the incoming query string or metadata
for explicit signals like an "arxiv:" prefix, keywords such as "paper",
"research", "doi", "arXiv", author/title patterns, or a URL pattern, and return
true only when those conditions are met; update the match function in the arXiv
fetcher (the match export) accordingly and add a brief comment documenting the
chosen matching heuristic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@skills/survey-generator/scripts/build-survey.js`:
- Around line 98-101: The survey rows are written with IDs like id =
`src-bib-${slugify(b.key)}` (used in newRows and checked via seenKeys) but
elsewhere the generator produces citations as `[^paper-key]`, causing a
mismatch; fix by ensuring the citation identifier matches the persisted id (use
the same id variable / `src-bib-${slugify(b.key)}` format) wherever citations
are emitted (the code that generates citation strings around lines that handle
b.key in the loop and the section 123-149), or alternatively change the row ID
to match the citation format—make all citation generation and the sources.md row
creation consistently use the same slugified identifier (reference id, slugify,
seenKeys, newRows).
- Around line 157-162: The bundle validation currently checks only that
bibliography entries have key/title, but you must also validate that
bibliography keys are unique and that every reference in
bundle.sections[].papers is a non-empty string that exists in the bibliography
before calling the model; update the validation after reading bundle (where
bundle, bundle.bibliography and die are available) to build a Set of
bibliography keys, detect duplicates and call die(...) if found, then iterate
bundle.sections (or bundle.sections[].papers) and for each paper entry verify
it's a non-empty string and present in the bibliography key set, calling
die(...) on any missing or invalid references so the script fails fast rather
than passing bad keys into prompt generation.
- Around line 104-107: The current write branch for sources.md only appends data
rows when the file is missing or lacks the header; update the branch that checks
if (!existing.includes('| id | type |')) to first prepend a markdown table
header and separator (e.g., "| id | type | source |" and "| --- | --- | --- |"
or the appropriate columns) before appending newRows, using the same variables
(existing, newRows, file) so fs.writeFileSync writes a valid table header +
separator + rows (respecting existing.endsWith('\n') as in the other branch).

In `@skills/survey-generator/SKILL.md`:
- Around line 98-109: The two fenced code blocks in SKILL.md are missing fence
languages and trigger markdownlint MD040; update the directory tree fence to use
"text" (e.g., ```text) and update the command/example fence to use "bash" (e.g.,
```bash) so the linter recognizes them; also apply the same change to the other
similar fenced block later in the file.

In `@skills/wiki-research-loop/scripts/source-fetchers/arxiv.js`:
- Around line 44-51: The fetch method in arxiv.js currently calls
httpsGet(query) without try/catch so network/timeouts throw and escape; wrap the
httpsGet + status check + parseEntries calls in a try/catch inside async
fetch(query, opts = {}) and on any thrown error return [] (keeping the existing
behavior of returning [] when res.status !== 200), e.g., try { const res = await
httpsGet(url); if (res.status !== 200) return []; return parseEntries(res.body);
} catch (err) { return []; } — reference function: fetch, helper: httpsGet,
parser: parseEntries, and variable: limit/q/url.
- Line 45: The code uses falsy coercion when setting limit (const limit =
opts.limit || 3), which incorrectly treats an explicit 0 as absent; change the
assignment to use nullish coalescing so only undefined/null fall back to 3
(replace the expression in the arxiv fetcher where limit is defined, e.g., the
const limit = ... line in arxiv.js, to use opts.limit ?? 3).

In `@skills/wiki-research-loop/scripts/source-fetchers/web.js`:
- Line 62: The current assignment const limit = opts.limit || 3 incorrectly
treats 0 as falsy and falls back to 3; change the fallback to respect explicit
zero by using a nullish check (e.g., opts.limit ?? 3) or an explicit undefined
check when computing limit in the code that defines limit (reference the
opts.limit and limit variables in this module/file).
- Around line 18-22: The redirect handling in httpsGet currently always uses the
https module (https.get) even when the Location header points to an http:// URL;
update the redirect branch in httpsGet so it parses the redirect target (new
URL(...)), chooses the correct client (http vs https) based on url.protocol, and
re-invokes the same helper with that client (or call the top-level httpsGet
recursively but ensure the underlying request uses the matching module);
import/require the http module if not already present and preserve headers,
redirects count, and the original resolve/reject behavior when calling the
recursive request.
- Around line 24-26: The response body is currently accumulated with unbounded
"data += c" in the response handlers (res.on('data', ...), res.on('end', ...)),
which risks unlimited memory use and unbounded wall-clock time; modify those
handlers to enforce a maximum body size (e.g., track receivedBytes and if it
exceeds MAX_BODY_BYTES stop reading, remove listeners, destroy the
response/socket and reject the promise with a clear error) and also implement a
separate body-read deadline (start a setTimeout when the first chunk arrives and
reset/clear it appropriately; on timeout remove listeners, destroy socket, and
reject). Ensure you clear the timeout and remove event listeners on both success
(resolve) and all error/abort/overflow/timeout paths so there are no leaks;
update the same res.on('data', ...), res.on('end', ...), and error/close
handlers used in the current fetch logic accordingly.

In `@src/db/store.ts`:
- Around line 172-187: The wiki identity must include scope/project context to
avoid collisions: change the upsert, select and delete statements to use a
composite key (e.g., slug + scope + root_path or slug + scope) instead of
slug-only. Update upsertWikiStmt to use ON CONFLICT(slug, scope, root_path) DO
UPDATE (or ON CONFLICT(slug, scope) if you prefer) and ensure the wikis table
has a UNIQUE constraint on those columns; then update getWikiStmt and
deleteWikiStmt to SELECT/DELETE WHERE slug = ? AND scope = ? AND root_path = ?
(or include the same composite columns you chose), and update any
callers/prepared-statement parameter bindings (including the analogous
statements referenced around upsertWikiStmt and the other block at lines
~335-359) to pass the extra scope/root_path values.

---

Nitpick comments:
In `@skills/wiki-research-loop/scripts/source-fetchers/arxiv.js`:
- Line 42: The match predicate for the arXiv fetcher (match: () => true)
currently matches every query; replace it with a narrower predicate on the match
function to avoid firing on non-academic queries—e.g., check the incoming query
string or metadata for explicit signals like an "arxiv:" prefix, keywords such
as "paper", "research", "doi", "arXiv", author/title patterns, or a URL pattern,
and return true only when those conditions are met; update the match function in
the arXiv fetcher (the match export) accordingly and add a brief comment
documenting the chosen matching heuristic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7c268377-230a-4277-badc-95aa383dc916

📥 Commits

Reviewing files that changed from the base of the PR and between 3dfc33f and ad2e3f2.

📒 Files selected for processing (14)
  • package.json
  • scripts/research-tick.js
  • skills/llm-council/scripts/council.js
  • skills/survey-generator/SKILL.md
  • skills/survey-generator/scripts/build-survey.js
  • skills/wiki-builder/scripts/init_wiki.sh
  • skills/wiki-builder/scripts/wiki-cli.js
  • skills/wiki-research-loop/scripts/research-loop.js
  • skills/wiki-research-loop/scripts/source-fetchers/arxiv.js
  • skills/wiki-research-loop/scripts/source-fetchers/github.js
  • skills/wiki-research-loop/scripts/source-fetchers/web.js
  • src/db/index.ts
  • src/db/store.ts
  • src/search/embeddings.ts
🚧 Files skipped from review as they are similar to previous changes (8)
  • package.json
  • skills/wiki-research-loop/scripts/source-fetchers/github.js
  • scripts/research-tick.js
  • skills/llm-council/scripts/council.js
  • skills/wiki-builder/scripts/init_wiki.sh
  • skills/wiki-research-loop/scripts/research-loop.js
  • src/search/embeddings.ts
  • skills/wiki-builder/scripts/wiki-cli.js

Comment thread skills/survey-generator/scripts/build-survey.js Outdated
Comment thread skills/survey-generator/scripts/build-survey.js Outdated
Comment thread skills/survey-generator/scripts/build-survey.js
Comment thread skills/survey-generator/SKILL.md Outdated
Comment thread skills/wiki-research-loop/scripts/source-fetchers/arxiv.js
Comment thread skills/wiki-research-loop/scripts/source-fetchers/arxiv.js Outdated
Comment thread skills/wiki-research-loop/scripts/source-fetchers/web.js Outdated
Comment thread skills/wiki-research-loop/scripts/source-fetchers/web.js Outdated
Comment thread skills/wiki-research-loop/scripts/source-fetchers/web.js Outdated
Comment thread src/db/store.ts
Comment on lines +172 to +187
const upsertWikiStmt = db.prepare(`
INSERT INTO wikis (slug, title, flavor, root_path, scope, auto_research, private)
VALUES (@slug, @title, @flavor, @root_path, @scope, @auto_research, @private)
ON CONFLICT(slug) DO UPDATE SET
title = excluded.title,
flavor = excluded.flavor,
root_path = excluded.root_path,
scope = excluded.scope,
auto_research = excluded.auto_research,
private = excluded.private,
updated_at = datetime('now')
`);
const getWikiStmt = db.prepare(`SELECT * FROM wikis WHERE slug = ?`);
const listWikisStmt = db.prepare(`SELECT * FROM wikis ORDER BY updated_at DESC`);
const listWikisByScopeStmt = db.prepare(`SELECT * FROM wikis WHERE scope = ? ORDER BY updated_at DESC`);
const deleteWikiStmt = db.prepare(`DELETE FROM wikis WHERE slug = ?`);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make wiki identity include scope/project context.

ON CONFLICT(slug) plus slug-only getWiki/deleteWiki means a project wiki named foo in one repo will overwrite or delete another repo's foo, and it can also collide with a global foo. With all scopes sharing ~/.pro-workflow/data.db, this turns normal slug reuse into cross-project data loss. The key should include enough context to distinguish global vs project instances, and project instances across different roots.

Also applies to: 335-359

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/store.ts` around lines 172 - 187, The wiki identity must include
scope/project context to avoid collisions: change the upsert, select and delete
statements to use a composite key (e.g., slug + scope + root_path or slug +
scope) instead of slug-only. Update upsertWikiStmt to use ON CONFLICT(slug,
scope, root_path) DO UPDATE (or ON CONFLICT(slug, scope) if you prefer) and
ensure the wikis table has a UNIQUE constraint on those columns; then update
getWikiStmt and deleteWikiStmt to SELECT/DELETE WHERE slug = ? AND scope = ? AND
root_path = ? (or include the same composite columns you chose), and update any
callers/prepared-statement parameter bindings (including the analogous
statements referenced around upsertWikiStmt and the other block at lines
~335-359) to pass the extra scope/root_path values.

fence languages, fetcher safety, slug-collision guard, README + infographic

build-survey.js:
- Citation IDs now align with sources.md row IDs: prompt receives
  bibliography entries with citation_id field (src-bib-<slug>) and
  sections with paper_citation_ids. References reuse citation_id.
- Validate bibliography uniqueness; validate sections[].papers
  references exist in bibliography before any model call.
- sources.md gets a proper markdown table header + separator when
  the file is empty or lacks the header row.

survey SKILL.md:
- Fence languages: ```text for tree, ```bash for command examples
  (markdownlint MD040).
- Reference example shows src-bib-<slug> citation form.

source-fetchers:
- arxiv/github/web all wrap fetch in try/catch returning [] on
  error so transient failures don't abort the loop.
- All fetchers use opts.limit ?? 3 (preserves explicit 0 if ever
  passed; previous || coerced 0 to 3).
- web.js: chooses http vs https client based on url.protocol on
  every call (fixes redirects from https → http).
- web.js: bounded body (4 MiB MAX_BODY_BYTES) + 30s body-read
  deadline; both clean up listeners + destroy socket on overflow
  or timeout to prevent leaks.

src/db/store.ts + wiki-cli.js:
- upsertWiki refuses to register a slug that already exists at a
  different (scope, root_path); raises a clean error so two repos
  with the same project-scope slug don't silently overwrite each
  other's registration.
- wiki-cli.js init wraps the upsert in try/catch so the user sees
  a friendly error instead of a stack trace.

src/db/schema.sql:
- Comment documents that wiki identity is enforced at the
  application layer for FK simplicity.

README.md:
- Adds 60-second wiki tour after Install (init → page → ask →
  seed → research → embed → hybrid → council).
- Counts updated to 33 skills / 22 commands / 37 hook scripts to
  reflect upstream merges (token-efficiency etc.) plus v3.3 work.

docs/infographic.html:
- Version display v3.2 → v3.3.
- Title block + components grid + section 10 reflect the wiki
  knowledge plane and the new counts.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
skills/wiki-research-loop/scripts/source-fetchers/arxiv.js (1)

11-13: ⚡ Quick win

Consider capping response body size.

httpsGet accumulates the entire response body in memory without a size limit. While arXiv responses are typically small (bounded by max_results), adding a defensive cap (e.g., 1 MB) would prevent potential OOM if the server misbehaves or a redirect lands on a large payload.

🛡️ Proposed enhancement
      let data = '';
-     res.on('data', c => { data += c; });
+     res.on('data', c => {
+       data += c;
+       if (data.length > 1_000_000) {
+         req.destroy(new Error('Response too large'));
+       }
+     });
      res.on('end', () => resolve({ status: res.statusCode, body: data }));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/wiki-research-loop/scripts/source-fetchers/arxiv.js` around lines 11 -
13, The response body is accumulated without a size limit in the httpsGet
implementation (the local variable data and the res.on('data') handler), so add
a defensive cap (e.g., 1 MB) inside that handler: incrementally track bytes
received, and if the cap is exceeded call res.destroy() (or req.abort) and
reject the surrounding promise with a clear Error (e.g., "Response body too
large") to avoid OOM; keep the existing resolve({ status: res.statusCode, body:
data }) on the res.on('end') path unchanged and ensure any error/rejection is
correctly propagated from the httpsGet function.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@skills/wiki-research-loop/scripts/source-fetchers/arxiv.js`:
- Around line 11-13: The response body is accumulated without a size limit in
the httpsGet implementation (the local variable data and the res.on('data')
handler), so add a defensive cap (e.g., 1 MB) inside that handler: incrementally
track bytes received, and if the cap is exceeded call res.destroy() (or
req.abort) and reject the surrounding promise with a clear Error (e.g.,
"Response body too large") to avoid OOM; keep the existing resolve({ status:
res.statusCode, body: data }) on the res.on('end') path unchanged and ensure any
error/rejection is correctly propagated from the httpsGet function.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b784b60a-8229-4b19-aea1-1a905d77b61f

📥 Commits

Reviewing files that changed from the base of the PR and between ad2e3f2 and a6d4b3a.

📒 Files selected for processing (10)
  • README.md
  • docs/infographic.html
  • skills/survey-generator/SKILL.md
  • skills/survey-generator/scripts/build-survey.js
  • skills/wiki-builder/scripts/wiki-cli.js
  • skills/wiki-research-loop/scripts/source-fetchers/arxiv.js
  • skills/wiki-research-loop/scripts/source-fetchers/github.js
  • skills/wiki-research-loop/scripts/source-fetchers/web.js
  • src/db/schema.sql
  • src/db/store.ts
✅ Files skipped from review due to trivial changes (3)
  • skills/survey-generator/SKILL.md
  • README.md
  • docs/infographic.html
🚧 Files skipped from review as they are similar to previous changes (5)
  • skills/wiki-research-loop/scripts/source-fetchers/github.js
  • skills/survey-generator/scripts/build-survey.js
  • skills/wiki-builder/scripts/wiki-cli.js
  • src/db/schema.sql
  • src/db/store.ts

@rohitg00 rohitg00 merged commit de80bff into main May 9, 2026
5 checks passed
@rohitg00 rohitg00 deleted the feat/wiki-knowledge-base branch May 9, 2026 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant