Architecture Overview
codectx is designed as a pipeline that sequentially processes a codebase to produce a deterministic output.
Pipeline Walkthrough
- Discovery Phase: The crawler scans the target directory, respecting .gitignore and .codectx.yaml exclude rules. It builds an initial flat file list.
- Parsing Phase: Using Tree-sitter, it parses structural information from source files. This extracts functions, classes, and import declarations without executing the code.
- Graph Construction: The parser identifies imports and builds a directed graph (the Dependency Graph). Circular dependencies are detected and resolved via topological sorting fallbacks.
- Ranking Engine:
  - Tier 1 (Core): Entry points, architecture docs, heavily depended-upon modules.
  - Tier 2 (Logic): Standard implementation files.
  - Tier 3 (Periphery): Utilities, tests, configuration scripts.
- Token Budgeting & Compression: If a token limit is set, codectx selectively trims Tier 3, strips comments from Tier 2, and formats the output to fit securely within the requested budget.
- Formatting: The internal models are rendered into the final Markdown structure optimized for LLM readability.
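
The graph construction and budgeting steps above can be illustrated with a minimal sketch. It is not codectx's actual implementation: the names `dependency_order`, `fit_to_budget`, `FileEntry`, and `tokens_stripped` are hypothetical, Kahn's algorithm is used here as one plausible way to order files with a fallback for cycles, and trimming Tier 3 files largest-first is an assumption. Token counts are supplied by the caller rather than computed.

```python
from collections import deque
from dataclasses import dataclass


def dependency_order(imports):
    """Order files so dependencies come before dependents (Kahn's algorithm).

    `imports` maps a file to the set of files it imports. Files that are in,
    or depend on, a cycle cannot be ordered; as a fallback they are appended
    in sorted order so the result stays deterministic.
    """
    nodes = set(imports)
    for deps in imports.values():
        nodes.update(deps)
    indegree = {n: 0 for n in nodes}      # unresolved dependencies per file
    dependents = {n: [] for n in nodes}   # reverse edges
    for path, deps in imports.items():
        for dep in deps:
            indegree[path] += 1
            dependents[dep].append(path)
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in sorted(dependents[node]):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    unresolved = sorted(nodes - set(order))
    return order + unresolved, unresolved


@dataclass
class FileEntry:
    path: str
    tier: int             # 1 = Core, 2 = Logic, 3 = Periphery
    tokens: int           # token count of the full file
    tokens_stripped: int  # hypothetical: token count with comments removed


def fit_to_budget(files, budget):
    """Return ({path: tokens_used}, total) after trimming toward `budget`."""
    cost = {f.path: f.tokens for f in files}
    total = sum(cost.values())
    # Step 1: drop Tier 3 files, largest first, until the budget is met.
    for f in sorted((f for f in files if f.tier == 3), key=lambda f: -f.tokens):
        if total <= budget:
            break
        total -= cost.pop(f.path)
    # Step 2: if still over budget, count Tier 2 files without comments.
    if total > budget:
        for f in files:
            if f.tier == 2:
                total -= cost[f.path] - f.tokens_stripped
                cost[f.path] = f.tokens_stripped
    return cost, total
```

Calling `dependency_order({"main.py": {"core.py"}, "a.py": {"b.py"}, "b.py": {"a.py"}})` places `core.py` before `main.py` and reports `a.py` and `b.py` as unresolved. Note that `fit_to_budget` can still return a total above the budget if Tier 1 and stripped Tier 2 files alone exceed it; this sketch does not handle that case.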