Skip to content

refactor(inkless:consume): replace ByteRange.coalesce with BoundingRangeAlignment strategy#532

Merged
jeqo merged 1 commit intomainfrom
jeqo/reduce-fetch-read-amplification
Mar 20, 2026
Merged

refactor(inkless:consume): replace ByteRange.coalesce with BoundingRangeAlignment strategy#532
jeqo merged 1 commit intomainfrom
jeqo/reduce-fetch-read-amplification

Conversation

@jeqo
Copy link
Copy Markdown
Contributor

@jeqo jeqo commented Mar 10, 2026

to reduce read amplification

FetchPlanner.createFetchRequests() unconditionally aligned all byte ranges to fixed blocks (e.g. 16 MiB) before the hot/cold path decision. For the cold path, the cache is bypassed entirely, so alignment padding is pure read amplification with zero benefit — a 10 KB batch could trigger a 16 MiB fetch.

Move the hot/cold determination into the planning phase so that:

  • Hot path: applies block alignment as before (cache benefit)
  • Cold path: uses coalesced exact batch ranges (minimal bytes fetched)

Add ByteRange.coalesce() to merge overlapping/adjacent ranges, and extract isLagging() helper to deduplicate the threshold check between createFetchRequests() and submitSingleRequest().

@jeqo jeqo marked this pull request as ready for review March 10, 2026 14:43
@jeqo jeqo requested a review from Copilot March 10, 2026 14:43
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces read amplification for lagging consumers by choosing the byte-range strategy during planning: hot-path requests remain block-aligned for cache efficiency, while cold-path requests bypass cache and fetch coalesced exact ranges.

Changes:

  • Update FetchPlanner to select aligned vs coalesced ranges based on lagging/hot determination during planning.
  • Add ByteRange.coalesce() to merge overlapping/adjacent ranges into a minimal set.
  • Extend unit tests to validate cold-path planning behavior and ByteRange.coalesce() semantics.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
storage/inkless/src/main/java/io/aiven/inkless/consume/FetchPlanner.java Chooses coalescing vs alignment during planning and extracts isLagging() helper used by both planning and submission.
storage/inkless/src/main/java/io/aiven/inkless/common/ByteRange.java Adds coalesce(List<ByteRange>) utility to merge overlapping/adjacent ranges.
storage/inkless/src/test/java/io/aiven/inkless/consume/FetchPlannerTest.java Adds planning-focused tests to ensure cold path uses exact/coalesced ranges and hot path stays aligned.
storage/inkless/src/test/java/io/aiven/inkless/common/ByteRangeTest.java Adds test coverage for ByteRange.coalesce() behavior across overlap/adjacency/ordering cases.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@jeqo jeqo force-pushed the jeqo/reduce-fetch-read-amplification branch from 28f05c8 to a5ecdf7 Compare March 10, 2026 16:41
@jeqo jeqo requested a review from Copilot March 10, 2026 16:42
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@jeqo jeqo force-pushed the jeqo/reduce-fetch-read-amplification branch from a5ecdf7 to 583b55a Compare March 10, 2026 16:58
@jeqo jeqo requested a review from Copilot March 10, 2026 16:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@jeqo jeqo force-pushed the jeqo/reduce-fetch-read-amplification branch from 583b55a to 2019a0b Compare March 10, 2026 17:26
* @param ranges the byte ranges to coalesce (may be null or empty)
* @return an unmodifiable set of merged, non-overlapping byte ranges in offset order
*/
public static Set<ByteRange> coalesce(List<ByteRange> ranges) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the signature of KeyAlignment, which makes me think this is inappropriate complexity to add as a static method.

// Overlapping or adjacent — extend
currentEnd = Math.max(currentEnd, nextEnd);
} else {
// Gap — emit current and start new
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are gaps, they are expected to be small, and making multiple requests to skip around that gap would be worse than just throwing around the data.
This routine shouldn't do this sort/match end-to-end algorithm, it should compute the minimum start and maximum end offset.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! I have updated the implementation to go along this approach. PTAL

@jeqo jeqo force-pushed the jeqo/reduce-fetch-read-amplification branch from 2019a0b to f716ab6 Compare March 11, 2026 17:20
@jeqo jeqo changed the title feat(inkless:consume): skip block alignment for lagging consumers refactor(inkless:consume): replace ByteRange.coalesce with BoundingRangeAlignment strategy Mar 11, 2026
@jeqo jeqo force-pushed the jeqo/reduce-fetch-read-amplification branch from f716ab6 to d709c1b Compare March 11, 2026 18:41
…ngeAlignment strategy

Reuse KeyAlignmentStrategy interface for cold-path range computation instead
of a static method on ByteRange. One bounding range per object minimizes HTTP
requests at the cost of reading small interleaved gaps.
@jeqo jeqo force-pushed the jeqo/reduce-fetch-read-amplification branch from d709c1b to 83f2f9f Compare March 20, 2026 13:35
@jeqo jeqo merged commit 069cce8 into main Mar 20, 2026
4 checks passed
@jeqo jeqo deleted the jeqo/reduce-fetch-read-amplification branch March 20, 2026 14:22
AnatolyPopov pushed a commit that referenced this pull request Mar 23, 2026
…ngeAlignment strategy (#532)

Reuse KeyAlignmentStrategy interface for cold-path range computation instead
of a static method on ByteRange. One bounding range per object minimizes HTTP
requests at the cost of reading small interleaved gaps.

(cherry picked from commit 069cce8)
jeqo added a commit that referenced this pull request Mar 23, 2026
…ngeAlignment strategy (#532)

Reuse KeyAlignmentStrategy interface for cold-path range computation instead
of a static method on ByteRange. One bounding range per object minimizes HTTP
requests at the cost of reading small interleaved gaps.
jeqo added a commit that referenced this pull request Mar 23, 2026
…ngeAlignment strategy (#532)

Reuse KeyAlignmentStrategy interface for cold-path range computation instead
of a static method on ByteRange. One bounding range per object minimizes HTTP
requests at the cost of reading small interleaved gaps.

(cherry picked from commit 069cce8)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants