I have always wanted to keep notes, to write my thoughts and ideas down somewhere I'd actually return to them. The thing that stopped me, every time, was figuring out how to organize them. I can't sit comfortably with a pile that isn't organized well, and I'd stall before I started, rearranging folders in my head instead of writing anything down.
For a while I thought about organizing notes the way I refactor code: clean hierarchy, everything in its proper place, a structure I could defend. But notes aren't code. They don't have to be filed into one correct tree. The organization can be dynamic, and more to the point, it doesn't have to be my job. I can hand that burden to an agent. I feed it the raw thoughts and ideas, and it works out the structure and keeps it current.
That's what led me to build knowledge-base-builder, an Agent Skill that turns a directory of notes into one an agent can navigate. No embeddings, no vector store. Just Markdown and the filesystem.
The idea
Every directory gets a lightweight index.md. Every note gets a one-line
summary in its frontmatter. That's the whole mechanism, and it's exactly the
structure I never wanted to build by hand.
An agent answers a question the way you'd browse a well-organized wiki. It reads the root index, scans the area summaries, opens the one or two areas that fit, scans the note summaries inside, and opens only the handful of files that actually match. It never loads the whole pile, and it never runs a similarity search. That's progressive disclosure, and the trick is simple: the structure tells the agent what to ignore.
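A minimal Python sketch of that walk, under stated assumptions. The index line format (`- [name](target): summary`) and the `choose` callback, which stands in for the agent's judgment about a summary, are hypothetical and chosen for illustration; they are not the skill's actual format or logic.

```python
import re
from pathlib import Path
from typing import Callable

# Hypothetical index line format, assumed for this sketch only:
#   - [q3-roadmap.md](q3-roadmap.md): The three shipping milestones...
ENTRY = re.compile(r"^- \[(?P<name>[^\]]+)\]\((?P<target>[^)]+)\): (?P<summary>.+)$")


def read_entries(index_file: Path) -> list[tuple[str, str]]:
    """Return (target, summary) pairs listed in one index.md."""
    entries = []
    for line in index_file.read_text(encoding="utf-8").splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append((match["target"], match["summary"]))
    return entries


def navigate(root: Path, choose: Callable[[str], bool]) -> list[Path]:
    """Walk index files top-down, keeping only entries whose summary `choose` accepts."""
    opened = []
    pending = [root]
    while pending:
        folder = pending.pop()
        for target, summary in read_entries(folder / "index.md"):
            if not choose(summary):
                continue  # rejected by its summary, so nothing beneath it is read
            path = folder / target
            if path.is_dir():
                pending.append(path)  # an area: read its own index.md next
            else:
                opened.append(path)  # a note worth opening
    return opened
```

Calling `navigate(Path("my-knowledge-base"), choose)` returns only the notes reached through accepted summaries. A rejected area is never entered, so its index and notes are never read, which is the "structure tells the agent what to ignore" part.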
The line from the skill that stuck with me is that the filesystem is the state and Markdown is the wire format. There is nothing else to keep in sync. It diffs cleanly in git, and when the agent opens the wrong file, you can read the exact summary that misled it and fix that one sentence. Try doing that with a vector you can't read.
It scales to tens of thousands of files, which covers personal and small-team use, and it is quicker to stand up than a RAG pipeline. The difference is you can see all of it.
What your folder looks like after
A flat pile of notes becomes a tree that describes itself:
my-knowledge-base/
├── index.md              ← navigation protocol + a map of everything
├── projects/
│   ├── index.md          ← lists each note in the area with its summary
│   ├── website-rewrite.md
│   └── q3-roadmap.md
└── thoughts/
    ├── index.md
    └── on-focus.md
Each note carries a little frontmatter, and the summary is the part the indexes read:
---
title: Q3 roadmap
summary: The three shipping milestones, their owners, and the deadline risks.
---
Install it
It runs on Node.js 18 or newer, and it's packaged with skillship so it works in Cursor, Claude Code, Claude Web, and Cowork.
npx skillship@latest install shivdeepak/knowledge-base-builder -a cursor -a claude-code
# or, via the underlying multi-agent installer:
npx skills add shivdeepak/knowledge-base-builder
Then point it at a folder and ask it to organize your notes.
The summaries do all the work
The work splits in two, and only one half is hard.
- Summaries need real reading. To write one honest line about a file, the agent has to actually understand it. This is the judgment, and it's where the quality of the whole base is won or lost.
- Indexes are mechanical: walk the tree and, in each folder, list the sub-areas and notes with their summaries. Repetitive, but no scripts and no database, just the agent assembling Markdown.
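To show how mechanical the index half is, here is a rough Python sketch of building one folder's listing. The skill itself has the agent assemble the Markdown without scripts, so this only illustrates the shape of the task. The function names, the `- [name](target): summary` line format, and the simplified frontmatter parsing (plain `key: value` lines, no quoting or multi-line values) are all assumptions.

```python
from pathlib import Path


def read_summary(note: Path) -> str:
    """Pull the `summary:` value out of a note's frontmatter; empty string if absent."""
    lines = note.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return ""  # no frontmatter block at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "summary":
            return value.strip()
    return ""


def render_index(folder: Path) -> str:
    """Render one folder's listing: sub-areas and notes in name order, each with its summary."""
    rows = []
    for child in sorted(folder.iterdir()):
        if child.is_dir() and (child / "index.md").exists():
            # A sub-area is described by the summary in its own index.md.
            rows.append(f"- [{child.name}/]({child.name}/): {read_summary(child / 'index.md')}")
        elif child.suffix == ".md" and child.name != "index.md":
            rows.append(f"- [{child.name}]({child.name}): {read_summary(child)}")
    return "\n".join(rows)
```

Running `render_index` on the deepest folders first and then on their parents gives a bottom-up build, since a parent's listing reads the summary already written into each child's index.md.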
A summary earns its place when someone who sees only the index can decide whether to open the file. The skill is blunt about the difference, and the examples make it obvious:
- Weak: Notes about work.
- Weak: Some thoughts on the Q3 plan. (a teaser, what thoughts?)
- Strong: Q3 roadmap: the three shipping milestones, owners, and the deadline risks.
- Strong: Decision log for the website rewrite, why we chose Astro over Next.
Lead with what the file is, then why you'd open it. Concrete, one sentence, not bait.
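Those rules can be approximated as a rough lint pass, sketched below in Python. The opener list, the word threshold, and the function name are hypothetical heuristics invented for this example. They catch only the most obvious weak summaries and are no substitute for the reading the agent does.

```python
import re

# Hypothetical heuristics for this sketch; the skill relies on the agent's judgment.
VAGUE_OPENERS = re.compile(r"^(notes?|some thoughts|thoughts|stuff|misc)\b", re.IGNORECASE)
MIN_WORDS = 6


def summary_problems(summary: str) -> list[str]:
    """Return reasons a summary looks weak; an empty list means no heuristic fired."""
    text = summary.strip()
    problems = []
    if VAGUE_OPENERS.match(text):
        problems.append("opens with a vague label instead of what the file is")
    if len(text.split()) < MIN_WORDS:
        problems.append("too short to say why you'd open the file")
    # Sentence-ending punctuation followed by more text suggests a second sentence.
    if re.search(r"[.!?](?=\s+\S)", text):
        problems.append("more than one sentence")
    return problems
```

On the examples above, both weak summaries trip the vague-opener check and both strong ones come back clean. A summary that passes can still be bait, so a clean result means only that nothing obvious is wrong.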
How it actually runs
Point it at a folder and it moves through six steps, in order:
- Assess. It looks before it touches anything. It lists the tree, opens a few files, and works out what it's dealing with: is there structure already or is it a flat dump, do any notes have summaries, are there old index files (so this is a refresh, not a first build), are there PDFs or images to account for. Then it tells you what it found before making any large edit.
- Decide the structure. If your folders already make sense, it keeps them and works additively. If it's a flat pile, it proposes a few top-level areas, grouped the way you actually think about the material, and asks before moving a single file.
- Summarize every note. The core task, the one from above.
- Build the indexes. Bottom-up, because each folder describes itself through the summary in its own index.md and the parent reads that. Every index is a page you'd happily read yourself, not machine output: no generated-region markers, no fenced-off blocks.
- Make the base self-describing. It writes a short navigation protocol into the root index.md, so the next session, even a cold agent that has never seen this skill, knows how to walk the tree.
- Hand off maintenance. It leaves you a small upkeep loop: change a note, refresh its summary, update the index.md in that folder and any parent whose area summary shifted.
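The upkeep loop in the last step comes down to knowing which index files a changed note can affect. A small Python sketch, with a hypothetical function name and the assumption that every folder between the note and the root has an index.md:

```python
from pathlib import Path


def indexes_to_refresh(root: Path, note: Path) -> list[Path]:
    """List index.md files from the note's folder up to the root, nearest first."""
    root = root.resolve()
    folder = note.resolve().parent
    if folder != root and root not in folder.parents:
        raise ValueError(f"{note} is not inside {root}")
    targets = []
    while True:
        targets.append(folder / "index.md")
        if folder == root:
            break
        folder = folder.parent
    return targets
```

The first entry is the index that always needs the refreshed summary. The rest are candidates only: a parent needs editing just when the change shifted that area's own summary, and deciding that is still a judgment call.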
The one step that can actually lose work is restructuring, so it treats your
consent as the gate. It never deletes content, it prefers moving over rewriting,
and if the folder isn't under version control it nudges you to git init first
so every change is reversible.
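The version-control check before a restructure can be as simple as looking for a `.git` entry in the folder or any of its ancestors. A Python sketch with a hypothetical helper name; how the skill actually performs this check is not specified here.

```python
from pathlib import Path


def is_under_git(folder: Path) -> bool:
    """True if `folder` or any ancestor contains a .git entry (directory or file)."""
    folder = folder.resolve()
    return any((candidate / ".git").exists() for candidate in (folder, *folder.parents))
```

If this returns False, that is the moment for the nudge to run git init before any file moves.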
Where it fits, and where it doesn't
This is for collections a person or team curates and reads with an agent that can reason about which files to open: notes, plans, journals, research, project docs, decision logs.
It is honestly not the tool for very large, high-churn corpora: millions of chunks, constant ingestion, fuzzy semantic recall across everything at once. That's what vector search and GraphRAG are for, and the skill says so plainly rather than pretending otherwise. I wanted the thing that covers the scale most of us actually have, which is a folder that grew faster than its structure did.
The whole approach is just an index and a sentence per file, the way you'd organize a shelf so you can find a book without reading every spine. It turns out that's most of what an agent needs too.