fix: resolve datastore sync path mismatch for remote datastores#977
fix: resolve datastore sync path mismatch for remote datastores#977
Conversation
DefaultDatastorePathResolver used datastorePath as the base for all data reads/writes, while S3CacheSyncService used cachePath for sync operations. For S3 datastores these resolved to different directories, so pushChanged() never found files to upload and pullChanged() downloaded to a location the application never read from. Also update DatastoreSyncService to return file counts instead of void, and pass real counts through createDatastoreSyncDeps instead of hardcoding zero. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update `S3CacheSyncService` to return actual file counts from `pullChanged()` and `pushChanged()` instead of `Promise<void>`. This feeds accurate numbers through to the CLI output — previously `swamp datastore sync --pull` always displayed "Pulled 0 files" and the coordinator always logged "already up to date" regardless of how many files were actually transferred. Companion to systeminit/swamp#977 which updated the `DatastoreSyncService` interface to accept `Promise<number | void>`. - **`extensions/datastores/_lib/s3_cache_sync.ts`** — Return pulled/pushed counts from `pullChanged()` and `pushChanged()` - **`extensions/datastores/_lib/interfaces.ts`** — Match updated interface (`Promise<number | void>`) - Type-checks pass (`deno check`) - No behavioral change to sync logic — only the return values are new
There was a problem hiding this comment.
CLI UX Review
Blocking
None.
Suggestions
None.
Verdict
PASS — This PR is a pure bug fix with no new flags, commands, or help text. The only UX-visible change is that swamp datastore sync now correctly reports the actual number of files pulled/pushed (e.g. Pushed 3 files) instead of always showing Pushed 0 files / Pulled 0 files. Both log-mode and JSON-mode output were already structured correctly in src/presentation/renderers/datastore_sync.ts — the renderer just wasn't receiving real counts before this fix.
There was a problem hiding this comment.
Adversarial Review
Critical / High
None found.
Medium
-
src/libswamp/datastores/sync.ts:103-105—fullSyncsequential ordering may hide errors. InfullSync, ifpullChanged()succeeds butpushChanged()throws, the pull results are lost and the caller sees only the push error. Theerrorsarray is always returned empty. This isn't new to this PR (the sequential pattern existed before), but the PR now surfaces real counts, making it more visible that a partial-success scenario silently discards pull results. Consider wrapping each call in try/catch and populating theerrorsarray, or at minimum documenting thatfullSyncis not atomic. -
src/infrastructure/persistence/default_datastore_path_resolver.ts:59—cachePathcould be an empty string.cachePathis typed asstring | undefined. The??operator only triggers onnull/undefined, not on"". If a provider ever resolvescachePathto an empty string,datastoreBasePathwould become"", and alljoin()calls would produce relative paths (e.g.,"data/foo"instead of"/absolute/data/foo"). This is defensive — I don't see a current provider that does this — but a|| datastoreConfig.datastorePathwould be more robust than??.
Low
-
src/libswamp/datastores/sync.ts:97—typeof count === "number"guard. Thetypeof count === "number" ? count : 0pattern is correct for thenumber | voidunion, but if a provider implementation were to returnNaN(whichtypeofclassifies as"number"), the caller would propagateNaNas a file count. Extremely unlikely in practice. -
Test at line 150 in
default_datastore_path_resolver_test.ts— The new "prefers cachePath" test doesn't verifyresolvePathfor a non-datastore subdir (to confirmlocalPathstill usesrepoDir/.swampand notcachePath). The existing test at line 24 covers this for filesystem configs, but not for custom configs with a divergentcachePath. Theoretical gap only — thelocalPathmethod doesn't referencedatastoreBasePath.
Verdict
PASS — The core fix is correct and well-targeted. The cachePath ?? datastorePath change properly aligns the path resolver with the sync service's expectations. The return type widening is backwards-compatible (existing callers already handle void). Tests cover the key cases.
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
-
DatastoreSyncServicereturn type:Promise<number | void>creates caller-side ambiguity requiringtypeofguards. If this interface were internal-only,Promise<number>(returning0for no-op) would be cleaner. However, since this is a contract with external extension authors who may already returnvoid, the union is a reasonable backward-compatibility choice — no change needed now, but worth revisiting if the extension API gets a major version bump. -
Pre-existing test name mismatch (line 45–57 in the test file): The test
"custom datastorePath uses datastorePath"now actually exercises a case wherecachePath === datastorePath, so it testscachePathpreference rather thandatastorePathfallback. The name was accurate before this PR but is now slightly misleading. Not blocking since the new dedicated tests cover both cases clearly.
LGTM — minimal, well-targeted fix with good test coverage. The path mismatch root cause is clearly explained and the cachePath ?? datastorePath approach is correct.
Update `S3CacheSyncService` to return actual file counts from `pullChanged()` and `pushChanged()` instead of `Promise<void>`. This feeds accurate numbers through to the CLI output — previously `swamp datastore sync --pull` always displayed "Pulled 0 files" and the coordinator always logged "already up to date" regardless of how many files were actually transferred. Companion to systeminit/swamp#977 which updated the `DatastoreSyncService` interface to accept `Promise<number | void>`. - **`extensions/datastores/_lib/s3_cache_sync.ts`** — Return pulled/pushed counts from `pullChanged()` and `pushChanged()` - **`extensions/datastores/_lib/interfaces.ts`** — Match updated interface (`Promise<number | void>`) - Type-checks pass (`deno check`) - No behavioral change to sync logic — only the return values are new
Update `S3CacheSyncService` to return actual file counts from `pullChanged()` and `pushChanged()` instead of `Promise<void>`. This feeds accurate numbers through to the CLI output — previously `swamp datastore sync --pull` always displayed "Pulled 0 files" and the coordinator always logged "already up to date" regardless of how many files were actually transferred. Companion to systeminit/swamp#977 which updated the `DatastoreSyncService` interface to accept `Promise<number | void>`. - **`extensions/datastores/_lib/s3_cache_sync.ts`** — Return pulled/pushed counts from `pullChanged()` and `pushChanged()` - **`extensions/datastores/_lib/interfaces.ts`** — Match updated interface (`Promise<number | void>`) - Type-checks pass (`deno check`) - No behavioral change to sync logic — only the return values are new
Update `S3CacheSyncService` to return actual file counts from `pullChanged()` and `pushChanged()` instead of `Promise<void>`. This feeds accurate numbers through to the CLI output — previously `swamp datastore sync --pull` always displayed "Pulled 0 files" and the coordinator always logged "already up to date" regardless of how many files were actually transferred. Companion to systeminit/swamp#977 which updated the `DatastoreSyncService` interface to accept `Promise<number | void>`. - **`extensions/datastores/_lib/s3_cache_sync.ts`** — Return pulled/pushed counts from `pullChanged()` and `pushChanged()` - **`extensions/datastores/_lib/interfaces.ts`** — Match updated interface (`Promise<number | void>`) - Type-checks pass (`deno check`) - No behavioral change to sync logic — only the return values are new
Summary
Fixes a bug where data written during workflow execution with S3 (or other
remote) datastores was never pushed to the remote, making it invisible to
other machines pulling from the same datastore.
Root Cause
For custom datastores with a local cache (e.g.
@swamp/s3-datastore), thereare two path values in
CustomDatastoreConfig:datastorePath{repoDir}/.swampDefaultDatastorePathResolver— base for all data reads/writescachePath~/.swamp/repos/unknownS3CacheSyncService— directory scanned for push, target for pullThese are different directories. The consequence:
{repoDir}/.swamp/data/...(via the path resolverusing
datastorePath)~/.swamp/repos/unknown/(via the sync service usingcachePath) — finds no files — pushes nothing to S3~/.swamp/repos/unknown/— but thepath resolver reads from
{repoDir}/.swamp/— data is invisibleThe S3 provider's
resolveCachePath()comment even says "return a sensibledefault that core will override", but the
??fallback in config resolutionnever triggers because the provider returns a string (not
undefined).The Fix
DefaultDatastorePathResolvernow usescachePath ?? datastorePathasthe base path for custom datastores. This ensures data reads/writes go to
the same directory the sync service pushes from and pulls to. When no
cachePathis configured, it falls back to the originaldatastorePathbehavior.
Additionally,
DatastoreSyncService.pullChanged()/pushChanged()nowreturn
Promise<number | void>instead ofPromise<void>, andcreateDatastoreSyncDepspasses through real file counts instead ofhardcoding zero. This fixes misleading output like
"Pulled 0 files"and"already up to date"even when files were actually transferred.Files Changed
src/infrastructure/persistence/default_datastore_path_resolver.ts—Use
cachePath ?? datastorePathfor remote datastoressrc/infrastructure/persistence/default_datastore_path_resolver_test.ts—Add tests for cachePath preference and fallback
src/domain/datastore/datastore_sync_service.ts— ReturnPromise<number | void>from sync methodssrc/libswamp/datastores/sync.ts— Pass through real counts from syncservice instead of hardcoding 0
packages/testing/datastore_types.ts— Match updated interfaceTest Plan
update_notification_service_test.ts, unrelated)cachePathis preferred overdatastorePathfor customdatastores, and that
datastorePathis used as fallback when nocachePathexistsswamp datastore sync, confirmeddata from S3 is now visible via
swamp data searchto CI and running a workflow