Architecture Overview
codectx is designed as a pipeline that sequentially processes a codebase to produce a deterministic output.
Pipeline Walkthrough
- Discovery Phase: The crawler scans the target directory, respecting .gitignore and .codectx.yaml exclude rules. It builds an initial flat file list.
- Parsing Phase: Tree-sitter parses structural information from the source files, extracting functions, classes, and import declarations without executing the code.
- Graph Construction: The parser identifies imports and builds a directed graph (the Dependency Graph). Circular dependencies are detected and handled through fallbacks in the topological sort.
- Ranking Engine: Files are ranked into three tiers:
  - Tier 1 (Core): Entry points and heavily depended-upon modules.
  - Tier 2 (Logic): Standard implementation files.
  - Tier 3 (Periphery): Utilities, tests, configuration scripts.
- Token Budgeting & Compression: If a token limit is set, codectx selectively compresses Tier 2 files to interfaces and signatures only, trims Tier 3 files to one-liners, and converts Tier 1 files (except true entry points) into descriptive AST-based structural summaries.
- Summarizer Extension: Before final formatting, if the --llm extra flag is enabled, Tier 3 components are passed through an AI summarization hook that states the purpose of each one.
- Formatting: The internal models are rendered incrementally into the final Markdown structure, which is optimized for LLM readability.
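As a rough illustration of the graph construction, ranking, and budgeting steps above, here is a minimal Python sketch. It is not codectx's actual implementation: the function names, the `core_threshold` value, the periphery name hints, the treatment labels, and the policy of skipping back edges to cope with cycles are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical name fragments that mark a file as periphery (Tier 3).
PERIPHERY_HINTS = ("test", "util", "config")


def build_graph(imports_by_file):
    """Map each file to the set of known files it imports (importer -> imported)."""
    known = set(imports_by_file)
    return {f: {d for d in deps if d in known and d != f}
            for f, deps in imports_by_file.items()}


def topo_order(graph):
    """Depth-first topological order, dependencies first.

    An edge that points back into the current DFS stack closes a cycle;
    as an assumed fallback it is recorded and skipped instead of failing.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    state = {f: WHITE for f in graph}
    order, back_edges = [], []

    def visit(f):
        state[f] = GRAY
        for dep in sorted(graph[f]):  # sorted for deterministic output
            if state[dep] == GRAY:
                back_edges.append((f, dep))
            elif state[dep] == WHITE:
                visit(dep)
        state[f] = BLACK
        order.append(f)

    for f in sorted(graph):
        if state[f] == WHITE:
            visit(f)
    return order, back_edges


def assign_tiers(graph, entry_points, core_threshold=3):
    """Tier 1: entry points or many dependents. Tier 3: periphery hints. Else Tier 2."""
    dependents = defaultdict(int)
    for deps in graph.values():
        for d in deps:
            dependents[d] += 1
    tiers = {}
    for f in graph:
        if f in entry_points or dependents[f] >= core_threshold:
            tiers[f] = 1
        elif any(hint in f.lower() for hint in PERIPHERY_HINTS):
            tiers[f] = 3
        else:
            tiers[f] = 2
    return tiers


def plan_treatment(tiers, entry_points, token_limit=None):
    """Pick a per-file treatment; compression only applies when a limit is set."""
    if token_limit is None:
        return {f: "full" for f in tiers}
    by_tier = {1: "ast_summary", 2: "signatures", 3: "one_liner"}
    return {f: "full" if f in entry_points else by_tier[t]
            for f, t in tiers.items()}


imports = {
    "main.py": ["core/db.py", "api/routes.py"],
    "api/routes.py": ["core/db.py", "core/models.py"],
    "core/models.py": ["core/db.py"],
    "core/db.py": ["core/models.py"],  # forms a cycle with core/models.py
    "tests/test_routes.py": ["api/routes.py"],
}
graph = build_graph(imports)
order, back_edges = topo_order(graph)
tiers = assign_tiers(graph, entry_points={"main.py"})
plan = plan_treatment(tiers, entry_points={"main.py"}, token_limit=8000)
```

Sorting the files and their dependencies before traversal is what keeps the ordering stable between runs, which matches the pipeline's goal of deterministic output. A real ranking engine would likely use richer signals than raw dependent counts and name fragments.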
Formatted Output Sections
The generated output presents its sections in a fixed sequence so that consumers can rely on a deterministic layout. These are the core sections codectx derives:
- ARCHITECTURE: Taken from manually written source instructions (ARCHITECTURE.md).
- ENTRY_POINTS: Source output that explicitly identifies the core routing operations.
- SYMBOL_INDEX: High-level symbol references and their location details.
- IMPORTANT_CALL_PATHS: Topologically ordered chains showing the dependency flow between symbols.
- CORE_MODULES: Structured output referencing the Tier 1 implementation bodies.
- SUPPORTING_MODULES: Tier 2 interface summaries spanning generic files.
- DEPENDENCY_GRAPH: ASCII-rendered graph showing how the implementations connect.
- RANKED_FILES: Sorted listing of the files according to their ranking.
- PERIPHERY: Tier 3 elements, heavily compressed or derived via heuristics or summaries.
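To show how a fixed section sequence can keep the output deterministic, here is a small Python sketch. It assumes that the order of the list above is the output order and that each section is rendered under a `## NAME` Markdown heading; the `render` function and the heading style are hypothetical and not taken from codectx itself.

```python
# Assumed output order, copied from the section list above.
SECTION_ORDER = [
    "ARCHITECTURE",
    "ENTRY_POINTS",
    "SYMBOL_INDEX",
    "IMPORTANT_CALL_PATHS",
    "CORE_MODULES",
    "SUPPORTING_MODULES",
    "DEPENDENCY_GRAPH",
    "RANKED_FILES",
    "PERIPHERY",
]


def render(sections):
    """Render section bodies as Markdown in the fixed order, skipping empty ones."""
    unknown = set(sections) - set(SECTION_ORDER)
    if unknown:
        raise ValueError(f"unknown sections: {sorted(unknown)}")
    parts = []
    for name in SECTION_ORDER:
        body = sections.get(name)
        if body:
            parts.append(f"## {name}\n\n{body.strip()}\n")
    return "\n".join(parts)
```

Because the loop iterates over `SECTION_ORDER` rather than over the input mapping, the result does not depend on the order in which earlier pipeline stages happened to produce their sections.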