mirror of
https://github.com/ForLoopCodes/contextplus.git
synced 2026-04-26 06:25:50 +03:00
[GH-ISSUE #6] semantic_code_search fails with 'input length exceeds context length' on large codebases #3
Originally created by @alecmarcus on GitHub (Mar 1, 2026).
Original GitHub issue: https://github.com/ForLoopCodes/contextplus/issues/6
Originally assigned to: @ForLoopCodes on GitHub.
Description
semantic_code_search consistently fails with the error "the input length exceeds the context length" on codebases with 100+ files, regardless of query length or top_k parameter.
Environment
bunx contextplus
nomic-embed-text (8K context, 768d)
qwen3-embedding:8b (40K context, 4096d)
/api/embed)
semantic_identifier_search works fine on the same codebase with the same model
Reproduction
semantic_code_search with any query (even a single word like "DID") and top_k: 1
the input length exceeds the context length
Analysis
The error originates from Ollama, not from the model's actual context limit. Testing qwen3-embedding:8b (40K context) directly via curl works fine for individual embeddings. The issue appears to be in semantic-search.ts: likely batching too many file contents into a single embedding call, or concatenating file content before embedding rather than embedding files individually.
semantic_identifier_search (which embeds shorter symbol signatures rather than full file contents) works correctly on the same codebase with the same model, which supports the hypothesis that it's a file-content batching issue.
Expected Behavior
semantic_code_search should embed files individually (or in smaller chunks) and work on codebases of any size, bounded only by the embedding model's per-document context window.
Workaround
Use semantic_identifier_search or external grep for code search.
@ForLoopCodes commented on GitHub (Mar 1, 2026):
fixed in contextplus@1.0.3 02580c0
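For readers hitting the same error elsewhere, the behavior requested under "Expected Behavior" (embedding files individually or in smaller chunks) could be sketched roughly as below. This is not taken from the contextplus fix referenced above. The names chunkText and batchByBudget are hypothetical, and the character budgets are an assumed stand-in for the model's token limit, since the issue does not say how tokens are counted or whether total batch size matters to Ollama.

```typescript
// Hypothetical helpers for illustration only; none of these names come from contextplus.

/**
 * Split text into chunks of at most maxChars characters, preferring line
 * boundaries. A single line longer than maxChars is hard-split.
 * Empty chunks are never emitted (leading blank lines are dropped as a
 * side effect of using "" as the "no content yet" marker).
 */
export function chunkText(text: string, maxChars: number): string[] {
  if (maxChars <= 0) throw new Error("maxChars must be positive");
  const chunks: string[] = [];
  let current = "";
  for (const line of text.split("\n")) {
    // Break an over-long line into maxChars-sized pieces.
    const pieces: string[] = [];
    for (let i = 0; i < line.length; i += maxChars) {
      pieces.push(line.slice(i, i + maxChars));
    }
    if (pieces.length === 0) pieces.push(""); // keep blank lines inside a chunk
    for (const piece of pieces) {
      const candidate = current === "" ? piece : current + "\n" + piece;
      if (candidate.length > maxChars && current !== "") {
        chunks.push(current); // current chunk is full, start a new one
        current = piece;
      } else {
        current = candidate;
      }
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

/**
 * Group inputs into batches whose combined length stays within maxBatchChars.
 * An input that alone exceeds the budget still gets a batch of its own,
 * so callers should chunk first.
 */
export function batchByBudget(inputs: string[], maxBatchChars: number): string[][] {
  const batches: string[][] = [];
  let batch: string[] = [];
  let used = 0;
  for (const input of inputs) {
    if (batch.length > 0 && used + input.length > maxBatchChars) {
      batches.push(batch);
      batch = [];
      used = 0;
    }
    batch.push(input);
    used += input.length;
  }
  if (batch.length > 0) batches.push(batch);
  return batches;
}
```

Each resulting batch would then be sent as its own embedding request, so that no single call carries the whole codebase. A character budget only approximates a token limit, so a real implementation would likely want a conservative margin or an actual tokenizer.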