Incremental Indexing

Overview

sqry index is incremental by default. It reads a persistent hash index from .sqry-cache/, hashes every source file in the workspace, and reparses only the files whose hash changed since the last run. Unchanged files reuse the previously computed AST nodes and edges. The result: the second invocation of sqry index . on a clean tree completes in tens of milliseconds, even on large workspaces.

The hash index lives at .sqry-cache/file_hashes.bin. It is a tiny binary file (a few hundred bytes for most projects) and is safe to commit or to delete — sqry rebuilds it on the next index run.

How it works

                                   ┌───────────────────┐
   sqry index .  ──▶  hash files ──┤ hash matches?     │── yes ──▶ reuse cached nodes/edges
                                   └────────┬──────────┘
                                            │ no
                                            ▼
                                       reparse file
                                            │
                                            ▼
                                  commit nodes + edges
                                  update .sqry-cache/file_hashes.bin

On every run sqry:

1. Loads .sqry-cache/file_hashes.bin if it exists.
2. Hashes every workspace file in parallel.
3. Reparses only files whose hash changed (or files that don’t yet have a cached hash).
4. Commits the new nodes and edges into the existing graph snapshot.
5. Writes the updated hash table back to .sqry-cache/file_hashes.bin.
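
The hash comparison in the steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not sqry's implementation: SHA-256, the dict-based hash table, and the names hash_file and plan_reindex are all hypothetical, and the sketch hashes files sequentially whereas sqry hashes them in parallel.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return a content hash for one file (SHA-256 is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_reindex(files: list[Path], cached: dict[str, str]) -> tuple[list[Path], dict[str, str]]:
    """Compare fresh hashes against the cached table.

    Returns the files that need reparsing and the updated hash table.
    """
    to_reparse: list[Path] = []
    updated: dict[str, str] = {}
    for path in files:
        digest = hash_file(path)
        updated[str(path)] = digest
        # New files have no cached hash; changed files have a different one.
        if cached.get(str(path)) != digest:
            to_reparse.append(path)
    return to_reparse, updated
```

On a clean tree plan_reindex returns an empty reparse list, which is why a second run has nothing to parse and only pays for hashing.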

The on-disk graph snapshot at .sqry/graph/snapshot.sqry is updated atomically. Concurrent reads (CLI/LSP/MCP queries) see a consistent view at all times.
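
One common way to get this kind of atomic update is to write to a temporary file and rename it over the target. The sketch below shows that general technique in Python; whether sqry uses exactly this approach is an assumption, and write_atomically is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(target: Path, payload: bytes) -> None:
    """Write payload so readers see either the old file or the new one."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The rename is atomic on POSIX filesystems.
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the partial temp file; the original target is untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

If the write fails partway (for example, because the disk is full), the temporary file is discarded and the previous snapshot stays in place.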

Forcing a full rebuild

Two ways to bypass the incremental path:

sqry index --force .          # Same hash table, but reparse every file
sqry index --no-incremental . # Skip the hash table entirely (debug / forensic mode)

Flag	Effect
`--force` (`-f`)	Reparse every file but still update the hash table. Use after a major sqry upgrade or when the snapshot version bumps.
`--no-incremental`	Disable the hash index entirely; sqry parses every file and does not write `.sqry-cache/file_hashes.bin`. Useful for debugging metadata-only evaluation paths.
`--add-to-gitignore`	Auto-append `.sqry-index/` to `.gitignore` so cached state never lands in commits.

Custom cache directory

By default, the hash index lives at <workspace>/.sqry-cache/. Override the location with --cache-dir:

sqry index . --cache-dir /tmp/sqry-cache

This is most useful in:

Read-only or sandboxed source trees: point the cache at a writable scratch directory while keeping the project read-only.
Ephemeral CI runners: write the cache to a host-mounted volume so it survives container teardown, then mount it back in on the next CI job so that run is incremental too.
Multi-checkout workflows: share one cache across two worktrees of the same repo to avoid double-indexing.

The --cache-dir path is created if it does not exist. Relative paths are resolved against the current working directory.
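
The resolution rules above can be illustrated with a short Python sketch. It only mirrors the documented behaviour (default location, creation on demand, relative paths resolved against the current working directory); resolve_cache_dir is a hypothetical helper, not part of sqry.

```python
from pathlib import Path
from typing import Optional


def resolve_cache_dir(workspace: Path, cache_dir: Optional[str] = None) -> Path:
    """Return the directory that would hold the hash index, creating it if needed."""
    if cache_dir is None:
        # Default: <workspace>/.sqry-cache/
        resolved = workspace / ".sqry-cache"
    else:
        # Joining with an absolute path returns that path unchanged,
        # so only relative values are resolved against the working directory.
        resolved = Path.cwd() / cache_dir
    resolved.mkdir(parents=True, exist_ok=True)
    return resolved
```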

Metrics export

sqry index --status prints metadata about the existing index: age, symbol count, languages, and validation health. Add --json for machine-readable output, and --metrics-format to choose the output format:

# JSON (default)
sqry index --status --json

# Prometheus / OpenMetrics text
sqry index --status --json --metrics-format prometheus

The Prometheus output is OpenMetrics-compatible and exports the following metrics:

Metric	Type	Description
`sqry_index_age_seconds`	gauge	Seconds since the snapshot was last written
`sqry_index_node_count`	gauge	Total nodes in the snapshot
`sqry_index_edge_count`	gauge	Total edges in the snapshot
`sqry_index_file_count`	gauge	Total files indexed
`sqry_index_validation_total`	counter	Files inspected by the validation pass
`sqry_index_validation_missing`	counter	Files that disappeared between index time and now
`sqry_index_validation_modified`	counter	Files modified since index time

Pipe the output directly into a Prometheus push gateway, or scrape it from a CI job:

sqry index --status --json --metrics-format prometheus \
  | curl --data-binary @- "http://pushgateway:9091/metrics/job/sqry/instance/$(hostname)"
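
If a CI job should act on the metrics itself rather than push them, the text output can be parsed with a few lines of Python. The sketch below assumes the output follows the plain Prometheus text format with one unlabelled sample per line; the exact output shape is not confirmed here, and parse_metrics and is_stale are hypothetical helpers.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse label-free Prometheus text exposition lines into a dict."""
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' / '# EOF' comment lines.
        if not line or line.startswith("#"):
            continue
        name, value = line.split()[:2]
        metrics[name] = float(value)
    return metrics


def is_stale(metrics: dict[str, float], max_age_seconds: float) -> bool:
    """Return True when the snapshot is older than the allowed age."""
    return metrics["sqry_index_age_seconds"] > max_age_seconds
```

A job could, for example, fail the build when is_stale reports that the snapshot is older than a chosen limit.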

Validation modes

Independently of the cache, the --validate <mode> flag controls how strict sqry is about source-file drift detected during a query:

Mode	Behaviour
`warn` (default)	Log a warning on drift, return results from the snapshot.
`fail`	Exit with code `2` if more than 20% of indexed files are missing on disk.
`off`	Skip validation entirely (fastest).

sqry search "test" --validate fail   # CI-friendly strict mode
sqry search "test" --validate off    # Hot-path performance mode
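
The decision the three modes describe can be written out as a small Python function. It is a sketch of the rule stated in the table only: the function name is hypothetical, and how fail mode treats modified (rather than missing) files is not specified above, so the sketch leaves it out.

```python
def validation_exit_code(mode: str, indexed: int, missing: int) -> int:
    """Return the exit code implied by the validation mode.

    'fail' exits with 2 when more than 20% of indexed files are missing;
    'warn' and 'off' never fail the command.
    """
    if mode not in {"warn", "fail", "off"}:
        raise ValueError(f"unknown validation mode: {mode}")
    # Integer form of missing / indexed > 0.20, which avoids float rounding.
    if mode == "fail" and indexed > 0 and missing * 5 > indexed:
        return 2
    return 0
```

With 100 indexed files, 20 missing files stay within the threshold and 21 exceed it.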

Inside the daemon

When sqry runs as a daemon (see Daemon (sqryd)), the file-system watcher debounces events over debounce_ms (default 2000 ms) and triggers sqryd’s incremental reindex path automatically. You don’t need to call sqry index by hand — saving a file in your editor is enough. The hash-index machinery is the same in both paths.
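
To show what debouncing means in practice, the Python sketch below groups file events into reindex batches. It assumes a trailing-edge debounce, where a batch closes once debounce_ms pass with no new event; sqryd's actual policy may differ, and debounce_batches is a hypothetical name.

```python
def debounce_batches(events: list[tuple[int, str]], debounce_ms: int = 2000) -> list[set[str]]:
    """Group (timestamp_ms, path) events into reindex batches.

    A batch closes once debounce_ms elapse without a new event, so a burst
    of saves triggers one reindex instead of one per event.
    """
    batches: list[set[str]] = []
    current: set[str] = set()
    last_ts = None
    for ts, path in sorted(events):
        if last_ts is not None and ts - last_ts >= debounce_ms:
            batches.append(current)
            current = set()
        current.add(path)
        last_ts = ts
    if current:
        batches.append(current)
    return batches
```

Three saves within one second followed by a fourth several seconds later produce two batches, so the daemon would reindex twice rather than four times.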

Troubleshooting

“Snapshot version mismatch”: A major sqry upgrade bumped the snapshot format. Run sqry index --force . once to rewrite the snapshot in the new format. The hash cache survives version bumps; the snapshot does not.
Stale results after editing files outside the editor: If you edit files via a tool sqry’s daemon watcher doesn’t see (e.g. git checkout of a different branch), run sqry index . to refresh, or call sqry daemon rebuild <path> to refresh a daemon-loaded workspace.
Hash index corrupt: delete the cache and reindex with rm -rf .sqry-cache && sqry index . (note the trailing dot, which is the workspace path). The next run rebuilds the hash index from scratch.
Cache dir on a slow filesystem (NFS, shared SMB): set --cache-dir /tmp/sqry-cache to keep the hash index on local disk.
Disk full during an index run: sqry writes the snapshot atomically — a partial write is rolled back. Free space and rerun.

Daemon (sqryd) — keep the graph warm in memory; integrates with the same hash-index machinery.
Configuration — environment variables that influence cache and indexing throughput.
Performance — measured benchmarks for cold and warm index runs.
