Architecture Overview

codectx is a pipeline that processes a codebase in sequential phases and produces deterministic output: the same input always yields the same result.

Pipeline Walkthrough

Discovery Phase: The crawler scans the target directory, respecting .gitignore and .codectx.yaml exclude rules. It builds an initial flat file list.
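
As a rough illustration of the discovery step, the sketch below walks a directory and filters paths against a list of exclude patterns. It is not codectx's actual crawler: `discover_files` and its parameters are hypothetical names, and plain `fnmatch` globs stand in for real .gitignore semantics, which are richer (negation, anchoring, and so on).

```python
import fnmatch
import os
from pathlib import Path


def discover_files(root, exclude_patterns):
    """Return a sorted flat list of relative file paths under root.

    exclude_patterns are simple glob patterns (a simplification of real
    .gitignore rules) matched against the POSIX-style relative path and
    against each individual path component.
    """
    root = Path(root)

    def excluded(rel):
        parts = rel.split("/")
        return any(
            fnmatch.fnmatch(rel, pat) or any(fnmatch.fnmatch(p, pat) for p in parts)
            for pat in exclude_patterns
        )

    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = Path(dirpath).relative_to(root).as_posix()
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if not excluded(d if rel_dir == "." else f"{rel_dir}/{d}")
        ]
        for name in filenames:
            rel = name if rel_dir == "." else f"{rel_dir}/{name}"
            if not excluded(rel):
                found.append(rel)
    # Sorting keeps the result independent of filesystem enumeration order.
    return sorted(found)
```

Sorting the result is one simple way to keep the file list stable across runs and platforms, which matters if later phases are expected to be deterministic.
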
Parsing Phase: Tree-sitter parses each source file and extracts functions, classes, and import declarations without executing the code.
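
To show what "structural information without execution" can look like, here is a minimal sketch that uses Python's standard `ast` module as a stand-in for Tree-sitter. It handles Python source only, and `extract_structure` is a hypothetical name. The real parser works through Tree-sitter grammars and may collect more detail.

```python
import ast


def extract_structure(source):
    """Collect function names, class names, and imported modules from Python source."""
    tree = ast.parse(source)  # parses only; the code is never executed
    functions, classes, imports = [], [], []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append(node.name)
        elif isinstance(node, ast.ClassDef):
            classes.append(node.name)
        elif isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have module=None.
            imports.append(node.module or ".")
    return {
        "functions": sorted(functions),
        "classes": sorted(classes),
        "imports": sorted(set(imports)),
    }
```
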
Graph Construction: Import declarations are resolved into a directed graph (the Dependency Graph). Circular dependencies are detected, and a fallback ordering is applied where a strict topological sort is not possible.
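
One common way to order a dependency graph and detect cycles at the same time is Kahn's algorithm: any module left over after the sort sits on, or depends on, a cycle. The sketch below illustrates that idea. `order_modules` is a hypothetical name, and the fallback used here (appending unresolved modules in name order) is an assumption, not necessarily what codectx does.

```python
from collections import defaultdict


def order_modules(imports):
    """Order modules so that dependencies come before their importers.

    imports maps each module to the modules it imports. Returns
    (order, cyclic), where cyclic lists the modules Kahn's algorithm
    could not place because they sit on or depend on a cycle.
    """
    nodes = set(imports)
    for deps in imports.values():
        nodes.update(deps)

    dependents = defaultdict(set)      # dependency -> modules importing it
    remaining = {n: 0 for n in nodes}  # unprocessed dependencies per module
    for mod, deps in imports.items():
        for dep in set(deps):
            dependents[dep].add(mod)
            remaining[mod] += 1

    ready = sorted(n for n in nodes if remaining[n] == 0)
    order = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        newly_ready = []
        for user in dependents[node]:
            remaining[user] -= 1
            if remaining[user] == 0:
                newly_ready.append(user)
        # Re-sorting keeps tie-breaking alphabetical, hence deterministic.
        ready = sorted(ready + newly_ready)

    cyclic = sorted(n for n in nodes if n not in order)
    # Assumed fallback: append unresolved modules in name order.
    return order + cyclic, cyclic
```

For example, with `app` importing `core` and `utils`, `core` importing `utils`, and `a` and `b` importing each other, the acyclic part is ordered `utils`, `core`, `app`, and `a` and `b` are reported as cyclic.
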
Ranking Engine: Each file is assigned to one of three tiers:
- Tier 1 (Core): Entry points, architecture docs, heavily depended-upon modules.
- Tier 2 (Logic): Standard implementation files.
- Tier 3 (Periphery): Utilities, tests, configuration scripts.
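
The tier rules above could be expressed as a small classifier along these lines. Everything specific here is an assumption made for illustration: the function name `assign_tier`, the directory names in `PERIPHERY_DIRS`, the `ARCHITECTURE.md` file name, and the threshold of three dependents for "heavily depended-upon".

```python
from pathlib import PurePosixPath

# Assumed conventions, not taken from codectx itself.
PERIPHERY_DIRS = {"tests", "test", "utils", "scripts", "config"}
DOC_NAMES = {"ARCHITECTURE.md"}


def assign_tier(path, dependents, entry_points=frozenset(), core_threshold=3):
    """Return 1, 2, or 3 for a POSIX-style relative path.

    dependents is the number of modules that import this file, as
    counted from the dependency graph.
    """
    p = PurePosixPath(path)
    # Tier 1: entry points, architecture docs, heavily depended-upon modules.
    if path in entry_points or p.name in DOC_NAMES or dependents >= core_threshold:
        return 1
    # Tier 3: utilities, tests, configuration scripts.
    if PERIPHERY_DIRS.intersection(p.parts[:-1]) or p.name.startswith("test_"):
        return 3
    # Tier 2: everything else.
    return 2
```

Checking Tier 1 first means a utility module with many dependents is promoted to Core rather than left in Periphery. That ordering is a design choice of this sketch.
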
Token Budgeting & Compression: If a token limit is set, codectx selectively trims Tier 3, strips comments from Tier 2, and formats the output to fit within the requested budget.
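
A minimal sketch of this two-step reduction follows. It rests on several assumptions: tokens are estimated at roughly four characters each rather than counted with a real tokenizer, Tier 3 files are dropped whole (largest first), and comment stripping only removes whole-line `#` comments. The names `estimate_tokens`, `strip_comments`, and `fit_to_budget` are hypothetical.

```python
def estimate_tokens(text):
    # Rough assumption: about four characters per token, rounded up.
    return (len(text) + 3) // 4


def strip_comments(text):
    # Naive: removes whole-line "#" comments only. A real tool would
    # use the parse tree to find comments in any language.
    return "\n".join(
        line for line in text.splitlines() if not line.lstrip().startswith("#")
    )


def fit_to_budget(files, budget):
    """files is a list of (path, tier, text) tuples; returns a reduced list."""
    kept = list(files)

    def total():
        return sum(estimate_tokens(text) for _, _, text in kept)

    # Step 1: drop Tier 3 files, largest first, until the budget is met.
    tier3 = sorted(
        (f for f in kept if f[1] == 3),
        key=lambda f: (-estimate_tokens(f[2]), f[0]),
    )
    for f in tier3:
        if total() <= budget:
            break
        kept.remove(f)

    # Step 2: if still over budget, strip comments from Tier 2 files.
    if total() > budget:
        kept = [
            (path, tier, strip_comments(text) if tier == 2 else text)
            for path, tier, text in kept
        ]
    return kept
```

This sketch never touches Tier 1, so the result can still exceed a very small budget. How that case is handled is not covered here.
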
Formatting: The internal models are rendered into the final Markdown structure optimized for LLM readability.
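
As an illustration of the final rendering step, the sketch below turns a list of ranked files into a single Markdown string. The layout (a top-level title, then one heading and one fenced block per file) and the name `render_markdown` are assumptions. The actual output structure of codectx may differ.

```python
def render_markdown(files):
    """files is a list of (path, tier, text) tuples; returns one Markdown string."""
    fence = "`" * 3  # built at runtime so this example nests cleanly in a fence
    lines = ["# Codebase Context", ""]
    # Sort by tier, then path, so the output does not depend on input order.
    for path, tier, text in sorted(files, key=lambda f: (f[1], f[0])):
        lines += [f"## {path} (Tier {tier})", "", fence, text.rstrip("\n"), fence, ""]
    return "\n".join(lines)
```
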