2026-03-01

sqry v4.6.3

sqry v4.6.3 introduces natural language queries via sqry ask, switches the intent classifier to all-MiniLM-L6-v2, and validates performance at Linux kernel scale.

Natural Language Queries

sqry ask translates plain English into safe, validated sqry commands:

sqry ask "find authentication functions in rust"
# → sqry query "name~=/auth/ AND kind:function" --language rust

sqry ask "who calls the login function"
# → sqry graph direct-callers "login"

sqry ask "trace from main to database"
# → sqry graph trace-path "main" "database"

The translation pipeline has six stages: preprocess (Unicode normalization, homoglyph detection) → extract entities → classify intent → assemble command → validate safety → cache. Every generated command is checked against a whitelist: no shell metacharacters, no path traversal, no write operations.
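
To make the preprocess and safety-validation stages more concrete, here is a minimal Python sketch. It is not sqry's implementation: the function names, the set of allowed subcommands, and the exact metacharacter list are assumptions chosen for illustration. NFKC normalization only folds compatibility look-alikes such as fullwidth letters; real homoglyph detection across scripts needs more than this.

```python
import shlex
import unicodedata

# Hypothetical allowlist of read-only subcommands (assumption, not from sqry).
ALLOWED_SUBCOMMANDS = {"query", "graph"}
# Hypothetical set of characters treated as shell metacharacters.
SHELL_METACHARACTERS = set(";|&$`<>\n")


def normalize(text: str) -> str:
    """Preprocess: NFKC folds compatibility forms (e.g. fullwidth letters)."""
    return unicodedata.normalize("NFKC", text).strip()


def is_safe_command(command: str) -> bool:
    """Return True only if the generated command passes every check."""
    # 1. No shell metacharacters anywhere in the command.
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    # 2. The command must tokenize cleanly (balanced quotes).
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False
    # 3. Only `sqry <allowed subcommand> ...` is accepted.
    if len(tokens) < 2 or tokens[0] != "sqry":
        return False
    if tokens[1] not in ALLOWED_SUBCOMMANDS:
        return False
    # 4. No argument may contain a `..` path component.
    return not any(".." in tok.split("/") for tok in tokens[2:])
```

Rejecting by default and accepting only known-good shapes keeps the check simple to audit: anything the sketch does not recognize is refused rather than passed through.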

A 4-tier confidence system controls execution:

Tier	Confidence	Behavior
Execute	>= 85%	Shows the command, asks to run it
Confirm	65–84%	Confirmation prompt with the command
Disambiguate	< 65%	Multiple options to choose from
Reject	n/a	Input failed validation; shows suggestions

Use --auto-execute to skip confirmation for high-confidence translations, or --dry-run to see the generated command without running it.
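
The tier table can be read as a simple threshold function. The sketch below is illustrative only: the `Tier` enum and `select_tier` name are hypothetical, confidence is assumed to be a fraction between 0 and 1, and values that fall between 84% and 85% are assumed to land in the Confirm tier.

```python
from enum import Enum


class Tier(Enum):
    EXECUTE = "execute"
    CONFIRM = "confirm"
    DISAMBIGUATE = "disambiguate"
    REJECT = "reject"


def select_tier(confidence: float, passed_validation: bool = True) -> Tier:
    """Map a classifier confidence (0.0 to 1.0) to a tier from the table."""
    # Reject does not depend on confidence: the input failed validation.
    if not passed_validation:
        return Tier.REJECT
    if confidence >= 0.85:
        return Tier.EXECUTE
    if confidence >= 0.65:
        return Tier.CONFIRM
    return Tier.DISAMBIGUATE
```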

Natural language queries are also available via the sqry_ask MCP tool (for Claude, Codex, Gemini) and the sqry/ask LSP endpoint.

MiniLM-L6-v2 Classifier

The intent classifier has been switched from DistilBERT to all-MiniLM-L6-v2 (22M parameters). Key metrics:

Metric	Value
Base model	`sentence-transformers/all-MiniLM-L6-v2`
ONNX INT8 size	57 MB
Accuracy	99.75%
P50 latency	2.1 ms
P90 latency	3.0 ms
Calibrated ECE	0.0006
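
For readers unfamiliar with ECE (expected calibration error): it measures how closely predicted confidence matches observed accuracy, with 0 meaning perfectly calibrated. The sketch below shows the common binned definition. The bin count of 10 and the half-open bin edges are assumptions; these notes do not say how the 0.0006 figure was computed.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins over [0, 1]; a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin by the share of predictions it holds.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

For example, four predictions all made at 0.9 confidence with three of them correct give an accuracy of 0.75 in that bin, so the ECE is |0.75 - 0.9| = 0.15.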

The classifier is feature-gated. Without ONNX Runtime, sqry falls back to a rule-based classifier that achieves >=70% accuracy with zero external dependencies.
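
A rule-based fallback of this kind can be as small as an ordered list of keyword patterns. The following Python sketch is an illustration under stated assumptions, not sqry's actual rules: the patterns, the intent labels, and the `classify_intent` name are all hypothetical.

```python
import re

# Hypothetical keyword rules, checked in order; the first match wins.
RULES = [
    ("direct-callers", re.compile(r"\bwho calls\b|\bcallers? of\b")),
    ("trace-path", re.compile(r"\btrace\b.*\bfrom\b.*\bto\b|\bpath from\b")),
    ("query", re.compile(r"\bfind\b|\bsearch\b|\blist\b")),
]


def classify_intent(text: str):
    """Return the first matching intent label, or None if no rule applies."""
    lowered = text.lower()
    for intent, pattern in RULES:
        if pattern.search(lowered):
            return intent
    return None
```

Ordering matters in a scheme like this: more specific patterns go first so that a broad keyword such as "find" does not shadow them.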

Linux Kernel Benchmarks

sqry v4.6.x has been validated against the Linux kernel source tree:

Metric	Value
Codebase	~28M LOC, 63,074 C files
Index time	1m48s (24-core machine)
Nodes indexed	11,205,544
Edges resolved	18,292,255
Snapshot size	1.8 GB
Caller query latency	~85 ms (100 results)

Tested scenarios include syscall-to-disk call path tracing, cross-subsystem cycle detection, blast-radius analysis for kfree and copy_from_user, and dead code detection in drivers/staging/.
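
As a rough picture of what blast-radius analysis involves, the sketch below walks a reverse call graph breadth-first to collect every function that transitively calls a target. It assumes blast radius means "all transitive callers", which these notes do not define, and the function names in the toy graph are made up for illustration.

```python
from collections import deque


def blast_radius(callers, target):
    """Return every function that directly or transitively calls `target`.

    `callers` maps a function name to the functions that call it.
    """
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            # The seen-set keeps cycles in the call graph from looping forever.
            if caller not in seen and caller != target:
                seen.add(caller)
                queue.append(caller)
    return seen


# Toy reverse call graph with a cycle between two hypothetical functions.
toy_callers = {
    "kfree": ["release_buf", "drop_cache"],
    "release_buf": ["close_dev"],
    "close_dev": ["release_buf"],
}
affected = blast_radius(toy_callers, "kfree")
```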

Other Changes

Fixed CLI command names in NL pipeline and MCP skills
Retrained intent classifier with corrected command mappings
Updated training pipeline documentation for MiniLM-L6-v2
