Token Compression Strategy

Passing the --tokens [MAX] flag enables the codectx compression engine, which caps the output at the given token budget. The engine processes files strictly according to the tiers assigned by the ranking engine.

Tier 1 does not reproduce the full source. Instead, codectx emits an AST-driven structured summary built from the file's core definitions: its purpose, the variables and internal dependencies it uses, its types and function signatures, and notes on any unusual behavior (e.g. async-heavy code). The only files reproduced end to end, up to a maximum of 300 lines, are strict entry points (e.g. main.py).
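As a rough illustration of the structured summary described above, the sketch below uses Python's standard ast module to collect a module's docstring, imports, and function names, and to flag async-heavy code. The function name summarize_module, the dictionary keys, and the "more than half of the functions are async" threshold are assumptions made for this example, not codectx's actual output format.

```python
import ast


def summarize_module(source: str) -> dict:
    """Build a rough structured summary of one Python module (illustrative only)."""
    tree = ast.parse(source)
    imports = []
    functions = []
    async_count = 0

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"
            imports.append(node.module or ".")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append(node.name)
            if isinstance(node, ast.AsyncFunctionDef):
                async_count += 1

    notes = []
    # Assumed threshold: more than half of all functions are async.
    if functions and async_count * 2 > len(functions):
        notes.append("async-heavy")

    return {
        "purpose": ast.get_docstring(tree),   # module docstring, if any
        "imports": sorted(set(imports)),
        "functions": sorted(functions),
        "notes": notes,
    }
```

Working from the AST rather than from raw text keeps the summary independent of formatting and comments, which is presumably why an AST-driven approach is used here.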

Tier 2 emits exact function and class signatures together with their docstrings. Everything inside the function and class bodies is intentionally discarded.
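A signature-only view of this kind could be produced along the following lines. This is a minimal sketch, assuming Python 3.9+ (for ast.unparse); the name signatures_only is hypothetical, only the first docstring line is kept for brevity, and decorators and nested functions are ignored.

```python
import ast


def signatures_only(source: str) -> str:
    """Return class/function signatures and docstrings with bodies removed."""
    lines = []

    def visit(node: ast.AST, indent: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                bases = ", ".join(ast.unparse(b) for b in child.bases)
                header = f"class {child.name}({bases}):" if bases else f"class {child.name}:"
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                prefix = "async def" if isinstance(child, ast.AsyncFunctionDef) else "def"
                returns = f" -> {ast.unparse(child.returns)}" if child.returns else ""
                header = f"{prefix} {child.name}({ast.unparse(child.args)}){returns}:"
            else:
                continue  # statements, assignments, etc. are discarded

            lines.append(indent + header)
            doc = ast.get_docstring(child)
            if doc:
                # Simplification: keep only the first docstring line.
                lines.append(f'{indent}    """{doc.splitlines()[0]}"""')
            if isinstance(child, ast.ClassDef):
                visit(child, indent + "    ")  # descend into methods only

    visit(ast.parse(source), "")
    return "\n".join(lines)
```

For a class Shape with one documented method area(self, scale), the result contains the class line, its docstring, the method signature, and the method docstring, but none of the method's statements.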

Tier 3 emits only a one-line statistical description of the file, e.g. “10 classes, 15 lines”.
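A one-line description in that style might be computed as in the sketch below. The function name one_line_stats is hypothetical, and the choice to count only classes and lines simply mirrors the example string above; the real statement may include other statistics.

```python
import ast


def one_line_stats(source: str) -> str:
    """Return a short statistical one-liner for a Python source file."""
    tree = ast.parse(source)
    # Count every class definition, including nested ones.
    classes = sum(isinstance(node, ast.ClassDef) for node in ast.walk(tree))
    line_count = len(source.splitlines())
    return f"{classes} classes, {line_count} lines"
```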

The budget is consumed by Tier 1 first, in descending order of score within the tier. Whatever remains then goes to Tier 2, and finally to Tier 3.

[Tokens] -> Tier 1 Output -> Tier 2 Output -> Tier 3 Output
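The ordering above can be sketched as a greedy fill. The Entry fields, the allocate function, and the whitespace-based count_tokens stand-in are all assumptions for illustration; the tokenizer codectx actually uses is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    path: str
    tier: int      # 1, 2, or 3, as assigned by the ranking engine
    score: float   # ranking score within the tier
    text: str      # already-rendered output for this file


def count_tokens(text: str) -> int:
    # Stand-in tokenizer (assumption): whitespace-separated words.
    return len(text.split())


def allocate(entries: list[Entry], max_tokens: int) -> list[Entry]:
    """Spend the budget on Tier 1 first (highest score first), then Tier 2, then Tier 3."""
    ordered = sorted(entries, key=lambda e: (e.tier, -e.score))
    kept, used = [], 0
    for entry in ordered:
        cost = count_tokens(entry.text)
        if used + cost > max_tokens:
            break  # budget exhausted; everything after this point is left out
        kept.append(entry)
        used += cost
    return kept
```

Sorting on the pair (tier, -score) is what gives Tier 1 strict priority: no Tier 2 entry is considered until every Tier 1 entry has been placed.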

If the budget runs out partway through this flow, codectx first abandons any remaining Tier 3 entries, then truncates Tier 2 entries, and as a last resort trims the summaries that have already been generated.
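One way to picture this fallback order is the sketch below, which degrades the lowest-priority tier first. It is an assumption-laden illustration: fit_to_budget is a hypothetical name, tokens are approximated by whitespace-separated words, and truncation cuts words from the end of an entry, which may differ from how codectx actually trims its output.

```python
def fit_to_budget(entries: list[tuple[int, str]], max_tokens: int) -> list[tuple[int, str]]:
    """Shrink (tier, text) pairs, given in emission order, until they fit the budget."""

    def total(items: list[tuple[int, str]]) -> int:
        # Stand-in token count (assumption): whitespace-separated words.
        return sum(len(text.split()) for _, text in items)

    result = list(entries)
    for tier in (3, 2, 1):                      # lowest-priority tier gives way first
        for i in reversed(range(len(result))):  # last-emitted entries first
            overflow = total(result) - max_tokens
            if overflow <= 0:
                break
            entry_tier, text = result[i]
            if entry_tier != tier:
                continue
            words = text.split()
            # Tier 3 entries are dropped whole; Tier 2 and Tier 1 are cut back word by word.
            keep = 0 if tier == 3 else max(len(words) - overflow, 0)
            result[i] = (entry_tier, " ".join(words[:keep]))

    # Entries reduced to nothing are omitted from the output.
    return [(tier, text) for tier, text in result if text]
```

With entries of 4, 3, and 2 words in Tiers 1, 2, and 3, a budget of 5 drops the Tier 3 entry, cuts the Tier 2 entry down to one word, and leaves the Tier 1 entry untouched.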