Incremental Indexing

Overview

sqry index is incremental by default. It reads a persistent hash index from .sqry-cache/, hashes every source file in the workspace, and reparses only the files whose hash changed since the last run. Unchanged files reuse the previously computed AST nodes and edges. The result: the second invocation of sqry index . on a clean tree completes in tens of milliseconds, even on large workspaces.

The hash index lives at .sqry-cache/file_hashes.bin. It is a tiny binary file (a few hundred bytes for most projects) and is safe to commit or to delete — sqry rebuilds it on the next index run.

How it works

                                   ┌───────────────────┐
   sqry index .  ──▶  hash files ──┤ hash matches?     │── yes ──▶ reuse cached nodes/edges
                                   └────────┬──────────┘
                                            │ no
                                       reparse file
                                  commit nodes + edges
                                  update .sqry-cache/file_hashes.bin

On every run sqry:

  1. Loads .sqry-cache/file_hashes.bin if it exists.
  2. Hashes every workspace file in parallel.
  3. Reparses only files whose hash changed (or files that don’t yet have a cached hash).
  4. Commits the new nodes and edges into the existing graph snapshot.
  5. Writes the updated hash table back to .sqry-cache/file_hashes.bin.
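
The steps above can be sketched in a few lines of Python. This is an illustration only: the real format of file_hashes.bin and the hash function sqry uses are not described here, so the sketch assumes SHA-256 digests stored as JSON in a hypothetical file_hashes.json. The helper names hash_file and plan_reparse are invented for this example and are not part of sqry.

```python
import hashlib
import json
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents (assumed hash function)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_reparse(workspace: Path, cache_file: Path) -> list[Path]:
    """Return files whose hash changed or is not cached yet, then update the cache."""
    # Step 1: load the previous hash table if it exists.
    old = json.loads(cache_file.read_text()) if cache_file.exists() else {}

    # Step 2: hash every workspace file (sequentially here; sqry does this in parallel).
    cache_dir = cache_file.parent
    new = {}
    for p in sorted(workspace.rglob("*")):
        if p.is_file() and cache_dir not in p.parents:
            new[str(p.relative_to(workspace))] = hash_file(p)

    # Step 3: only files with a changed or missing hash need reparsing.
    changed = [workspace / rel for rel, digest in new.items() if old.get(rel) != digest]

    # Step 4 (committing nodes and edges to the graph) is out of scope for this sketch.

    # Step 5: write the updated hash table back.
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(new, indent=2))
    return changed
```

Calling plan_reparse twice on an unchanged tree returns every file the first time and an empty list the second time, which is the behaviour that makes a repeat index run cheap.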

The on-disk graph snapshot at .sqry/graph/snapshot.sqry is updated atomically. Concurrent reads (CLI/LSP/MCP queries) see a consistent view at all times.
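
How sqry implements the atomic update is not specified here. One common technique that gives readers the same guarantee is to write the new contents to a temporary file in the same directory and then rename it over the target. The sketch below shows that pattern under that assumption; atomic_write is a hypothetical helper, not a sqry API.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(target: Path, data: bytes) -> None:
    """Replace target so readers see the old or the new file, never a partial one."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temporary file lives next to the target so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk before the rename
        os.replace(tmp_name, target)  # swaps the new file into place in one step
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A reader that opened the old file before the rename keeps reading the old contents, while any later open sees the complete new file.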

Forcing a full rebuild

Two ways to bypass the incremental path:

sqry index --force .          # Same hash table, but reparse every file
sqry index --no-incremental . # Skip the hash table entirely (debug / forensic mode)

Flag                 Effect
--force (-f)         Reparse every file but still update the hash table. Use after a major sqry upgrade or when the snapshot version bumps.
--no-incremental     Disable the hash index entirely; sqry parses every file and does not write .sqry-cache/file_hashes.bin. Useful for debugging metadata-only evaluation paths.
--add-to-gitignore   Auto-append .sqry-index/ to .gitignore so cached state never lands in commits.

Custom cache directory

By default, the hash index lives at <workspace>/.sqry-cache/. Override the location with --cache-dir:

sqry index . --cache-dir /tmp/sqry-cache

The --cache-dir path is created if it does not exist. Relative paths are resolved against the current working directory.

Metrics export

sqry index --status prints metadata about the existing index: age, symbol count, languages, and validation health. Combine it with --json and --metrics-format for machine-readable output:

# JSON (default)
sqry index --status --json

# Prometheus / OpenMetrics text
sqry index --status --json --metrics-format prometheus

The Prometheus output is OpenMetrics-compatible and exports the following metrics:

Metric                           Type      Description
sqry_index_age_seconds           gauge     Seconds since the snapshot was last written
sqry_index_node_count            gauge     Total nodes in the snapshot
sqry_index_edge_count            gauge     Total edges in the snapshot
sqry_index_file_count            gauge     Total files indexed
sqry_index_validation_total      counter   Files inspected by the validation pass
sqry_index_validation_missing    counter   Files that disappeared between index time and now
sqry_index_validation_modified   counter   Files modified since index time

Pipe the output directly into a Prometheus push gateway, or scrape it from a CI job:

sqry index --status --json --metrics-format prometheus \
  | curl --data-binary @- "http://pushgateway:9091/metrics/job/sqry/instance/$(hostname)"
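
If a CI job needs to act on these values without a Prometheus server, the text output can be read directly. The sketch below parses unlabelled samples from the Prometheus text format into a dictionary. The sample text, its values, and the one-hour freshness budget are made up for illustration; the exact lines sqry emits (HELP/TYPE comments, labels) may differ, and parse_metrics is a hypothetical helper.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse unlabelled samples from Prometheus text exposition format into a dict."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# HELP" / "# TYPE" / "# EOF" comments
        name, value = line.split()[:2]
        samples[name] = float(value)
    return samples


# Illustrative sample only; real values come from `sqry index --status`.
SAMPLE = """\
# TYPE sqry_index_age_seconds gauge
sqry_index_age_seconds 5400
# TYPE sqry_index_file_count gauge
sqry_index_file_count 1200
# TYPE sqry_index_validation_modified counter
sqry_index_validation_modified 3
"""

metrics = parse_metrics(SAMPLE)
stale = metrics["sqry_index_age_seconds"] > 3600  # hypothetical one-hour freshness budget
print(f"index stale: {stale}, modified files: {int(metrics['sqry_index_validation_modified'])}")
# prints: index stale: True, modified files: 3
```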

Validation modes

Independently of the cache, the --validate <mode> flag controls how strict sqry is about source-file drift detected during a query:

Mode             Behaviour
warn (default)   Log a warning on drift, return results from the snapshot.
fail             Exit with code 2 if more than 20% of indexed files are missing on disk.
off              Skip validation entirely (fastest).

sqry search "test" --validate fail   # CI-friendly strict mode
sqry search "test" --validate off    # Hot-path performance mode
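
The three modes amount to a small decision rule, sketched below. The 20% threshold and exit code 2 come from the table above. Everything else is an assumption for illustration: the function name, the choice to log a warning in fail mode when the threshold is not reached, and the restriction to missing files only.

```python
import logging

MISSING_THRESHOLD_PERCENT = 20  # fail mode trips when MORE than 20% of files are missing


def validation_exit_code(mode: str, indexed: int, missing: int) -> int:
    """Return 0 to continue with snapshot results, or 2 when fail mode trips."""
    if mode not in ("warn", "fail", "off"):
        raise ValueError(f"unknown validation mode: {mode!r}")
    if mode == "off" or indexed == 0:
        return 0  # validation skipped, or nothing to validate
    if missing > 0:
        logging.warning("%d of %d indexed files are missing on disk", missing, indexed)
    # Integer arithmetic avoids floating-point surprises exactly at the 20% boundary.
    if mode == "fail" and missing * 100 > indexed * MISSING_THRESHOLD_PERCENT:
        return 2
    return 0
```

With 100 indexed files, 20 missing files still pass in fail mode (exactly 20% is not "more than 20%"), while 21 missing files yield exit code 2.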

Inside the daemon

When sqry runs as a daemon (see Daemon (sqryd)), the file-system watcher debounces events over debounce_ms (default 2000 ms) and triggers sqryd’s incremental reindex path automatically. You don’t need to call sqry index by hand — saving a file in your editor is enough. The hash-index machinery is the same in both paths.
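
To make the debounce behaviour concrete, here is a minimal sketch of a trailing-edge debouncer: events are collected, and a reindex batch is released only once no new event has arrived for debounce_ms. Whether sqryd restarts the window on every event is an assumption, and the Debouncer class is hypothetical rather than sqryd's implementation. Timestamps are passed in explicitly so the logic is easy to follow and test.

```python
from typing import Optional


class Debouncer:
    """Collect changed paths and release them after debounce_ms without new events."""

    def __init__(self, debounce_ms: int = 2000) -> None:
        self.debounce_ms = debounce_ms
        self.pending = set()  # paths changed since the last released batch
        self.last_event_ms: Optional[int] = None

    def on_event(self, path: str, now_ms: int) -> None:
        """Record a file-system event; every new event restarts the quiet period."""
        self.pending.add(path)
        self.last_event_ms = now_ms

    def poll(self, now_ms: int) -> list[str]:
        """Return the batch to reindex if the quiet period has elapsed, else []."""
        if self.last_event_ms is None or now_ms - self.last_event_ms < self.debounce_ms:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        self.last_event_ms = None
        return batch


debouncer = Debouncer()
debouncer.on_event("src/a.rs", now_ms=0)
debouncer.on_event("src/b.rs", now_ms=1500)  # restarts the 2000 ms window
early = debouncer.poll(now_ms=3000)          # only 1500 ms of quiet so far
ready = debouncer.poll(now_ms=3500)          # 2000 ms of quiet: batch is released
```

Here early is empty and ready contains both paths, so a burst of saves results in a single incremental reindex rather than one per event.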

Troubleshooting