Skip to content

feat(#2182): mitigate cache stampede with single-flight#4146

Draft
demming wants to merge 1 commit intogofiber:masterfrom
demming:fix/2182-middleware-mitigate-cache-stampede
Draft

feat(#2182): mitigate cache stampede with single-flight#4146
demming wants to merge 1 commit intogofiber:masterfrom
demming:fix/2182-middleware-mitigate-cache-stampede

Conversation

@demming
Copy link
Copy Markdown

@demming demming commented Mar 17, 2026

Description

This fixes the cache stampede I reported in #2182 by using singleflight.

I'll follow up with an extra option to serve stale responses for freshly expired entries, like often with TOTP during failover.

It has a documentation impat - the Config now includes SingleFlight (and maybe StaleWhileRevalidate - either this in PR or in a follow-up PR).

I'll follow up on the tasks below in due time.

This PR isn't ready yet.

Changes introduced

List the new features or adjustments introduced in this pull request. Provide details on benchmarks, documentation updates, changelog entries, and if applicable, the migration guide.

  • Benchmarks: Describe any performance benchmarks and improvements related to the changes.
  • Documentation Update: Detail the updates made to the documentation and links to the changed files.
  • Changelog/What's New: Include a summary of the additions for the upcoming release notes.
  • Migration Guide: If necessary, provide a guide or steps for users to migrate their existing code to accommodate these changes.
  • API Alignment with Express: Explain how the changes align with the Express API.
  • API Longevity: Discuss the steps taken to ensure that the new or updated APIs are consistent and not prone to breaking changes.
  • Examples: Provide examples demonstrating the new features or changes in action.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • [ x Enhancement (improvement to existing features and functionality)
  • [ x Documentation update (changes to documentation)
  • Performance improvement (non-breaking change which improves efficiency)
  • Code consistency (non-breaking change which improves code reliability and robustness)

Checklist

Before you submit your pull request, please make sure you meet these requirements:

  • Followed the inspiration of the Express.js framework for new functionalities, making them similar in usage.
  • Conducted a self-review of the code and provided comments for complex or critical parts.
  • Updated the documentation in the /docs/ directory for Fiber's documentation.
  • Added or updated unit tests to validate the effectiveness of the changes or new features.
  • Ensured that new and existing unit tests pass locally with the changes.
  • Verified that any new dependencies are essential and have been agreed upon by the maintainers/community.
  • Aimed for optimal performance with minimal allocations in the new code.
  • Provided benchmarks for the new code to analyze and improve upon.

Commit formatting

Please use emojis in commit messages for an easy way to identify the purpose or intention of a commit. Check out the emoji cheatsheet here: CONTRIBUTING.md

This fixes the cache stampede I reported in gofiber#2182 by using singleflight.

I'll follow up with an extra option to serve stale responses for freshly expired entries, like often with TOTP during failover.

I'll clean up a little here first, this is my 1st iteration, hence the duplication in the singleflight call.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 17, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 955d8005-8b00-4979-86dd-484a338fbdfd

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can approve the review once all CodeRabbit's comments are resolved.

Enable the reviews.request_changes_workflow setting to automatically approve the review once all CodeRabbit's comments are resolved.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the cache middleware by integrating a single-flight mechanism. This feature is designed to mitigate the "cache stampede" problem, where multiple concurrent requests for the same uncached resource can overwhelm the backend. By enabling the new SingleFlight option, only one request will proceed to generate the response and populate the cache, while other concurrent requests for the same key will wait and share the result, thereby improving performance and reducing server load under high concurrency. Additionally, a StaleWhileRevalidate option has been added for future expansion.

Highlights

  • Cache Stampede Mitigation: Implemented a single-flight mechanism in the cache middleware to prevent cache stampede by coalescing concurrent cache misses for the same key.
  • New Configuration Options: Introduced SingleFlight and StaleWhileRevalidate fields to the cache middleware configuration, allowing users to enable and configure these behaviors.
  • Dependency Update: Added golang.org/x/sync as a new dependency to support the single-flight functionality.
Changelog
  • go.mod
    • Added golang.org/x/sync module.
  • go.sum
    • Updated module checksums.
  • middleware/cache/cache.go
    • Implemented single-flight logic to coalesce concurrent cache misses.
    • Introduced loadResult struct for shared responses.
    • Refined comments.
  • middleware/cache/cache_test.go
    • Added Test_Cache_SingleFlight to verify cache stampede prevention.
    • Added Test_Cache_DefaultConfig_BackwardsCompatible to ensure existing behavior is maintained.
  • middleware/cache/config.go
    • Added SingleFlight and StaleWhileRevalidate fields to the Config struct.
    • Updated ConfigDefault.
Activity
  • The author indicated that the pull request is not yet ready and that further tasks will be addressed in due time.
  • The author performed a self-review of the code and included comments for complex parts.
  • Unit tests were added or updated to validate the new features and changes.
  • The author confirmed that all unit tests pass locally.
  • The author focused on optimal performance and minimal memory allocations.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a single-flight mechanism to mitigate cache stampedes, which is a valuable addition. The implementation is well-structured and includes relevant tests to verify the new behavior and ensure backward compatibility. I have a couple of suggestions to improve maintainability by refactoring some duplicated code and to keep the configuration API clean by deferring the addition of an unused option.

@demming
Copy link
Copy Markdown
Author

demming commented Mar 17, 2026

I need to tackle proper panic handling from inside singleflight - the waiters must be notified to cancel.

I'm aware of the duplication as noted in the commit message. Will clean up before marking it ready for review.

This draft is intended to gather first feedback. FYI @ReneWerner87

@Ashutosh-Singh-781
Copy link
Copy Markdown

awesome

@gaby
Copy link
Copy Markdown
Member

gaby commented Mar 29, 2026

@demming Any updates on this?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds optional single-flight request coalescing to the cache middleware to mitigate cache stampedes (issue #2182), plus new config surface and tests.

Changes:

  • Extend cache.Config with SingleFlight (and StaleWhileRevalidate) fields and defaults.
  • Implement a singleflight.Group-based miss coalescing path in the cache middleware.
  • Add tests intended to validate stampede prevention and default behavior; add golang.org/x/sync dependency.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
middleware/cache/config.go Adds new config fields and documents their intended behavior.
middleware/cache/cache.go Implements the single-flight miss coalescing logic in the middleware.
middleware/cache/cache_test.go Adds concurrency test for SingleFlight and a default-config regression test.
go.mod Adds golang.org/x/sync dependency for singleflight.
go.sum Updates checksums for the new dependency.
Comments suppressed due to low confidence (1)

middleware/cache/cache.go:136

  • With SingleFlight enabled on a cold miss, the middleware calls manager.get(key) (line 135) and then calls manager.get(key) again inside the singleflight critical section. For the in-memory store case, this can allocate an extra pooled *item that is never used. Consider restructuring to avoid the duplicate get (e.g., delay the initial get until after deciding hit/miss, or reuse the same *item when populating).
		key := cfg.KeyGenerator(c) + "_" + c.Method()

		// Get entry from pool
		e := manager.get(key)

Comment on lines +234 to +237
// Skip caching if body won't fit into cache.
if cfg.MaxBytes > 0 && bodySize > cfg.MaxBytes {
return res, nil
}
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the Next case, when bodySize > cfg.MaxBytes the SingleFlight closure returns without caching, but the outer code still sets the cache status to miss. The non-singleflight path marks this as unreachable. Please align SingleFlight behavior with the existing path for oversized responses.

Copilot uses AI. Check for mistakes.
Comment on lines +306 to +307
// Set cache status header.
c.Set(cfg.CacheHeader, cacheMiss)
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SingleFlight path always sets cfg.CacheHeader to miss after sf.Do(...), even when the request was intentionally bypassed (Next) or not cached due to MaxBytes. To match current behavior, the status header should reflect unreachable for those cases (and only be miss when the response is actually cached/populated).

Suggested change
// Set cache status header.
c.Set(cfg.CacheHeader, cacheMiss)
// Set cache status header only for waiters that received a shared result.
if shared {
c.Set(cfg.CacheHeader, cacheMiss)
}

Copilot uses AI. Check for mistakes.
Comment on lines +209 to +213
res := loadResult{
Body: utils.CopyBytes(c.Response().Body()),
Status: c.Response().StatusCode(),
Ctype: utils.CopyBytes(c.Response().Header.ContentType()),
Cencoding: utils.CopyBytes(c.Response().Header.Peek(fiber.HeaderContentEncoding)),
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

res := loadResult{ Body: utils.CopyBytes(c.Response().Body()), ... } copies the full response body before the code checks cfg.Next(c) and cfg.MaxBytes (which can cause an early return without caching). Consider moving these checks before copying to avoid unnecessary allocations for bypassed or oversized responses.

Copilot uses AI. Check for mistakes.
Comment on lines +161 to +171
// keeps existing behavior: no coalescing; existing tests pass unchanged.
func Test_Cache_DefaultConfig_BackwardsCompatible(t *testing.T) {
t.Parallel()

app := fiber.New()
app.Use(New()) // SingleFlight defaults to false

app.Get("/", func(c *fiber.Ctx) error {
return c.SendString("default")
})

Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test_Cache_DefaultConfig_BackwardsCompatible claims to ensure "no coalescing" when SingleFlight defaults to false, but it only asserts basic caching works. Either extend it to actually validate the stated behavior (e.g., concurrent misses invoke the handler multiple times when SingleFlight is false) or rename/remove it to avoid misleading coverage.

Suggested change
// keeps existing behavior: no coalescing; existing tests pass unchanged.
func Test_Cache_DefaultConfig_BackwardsCompatible(t *testing.T) {
t.Parallel()
app := fiber.New()
app.Use(New()) // SingleFlight defaults to false
app.Get("/", func(c *fiber.Ctx) error {
return c.SendString("default")
})
// keeps existing behavior: no coalescing for concurrent misses; existing tests pass unchanged.
func Test_Cache_DefaultConfig_BackwardsCompatible(t *testing.T) {
t.Parallel()
var handlerCalls int64
app := fiber.New()
app.Use(New()) // SingleFlight defaults to false
app.Get("/", func(c *fiber.Ctx) error {
atomic.AddInt64(&handlerCalls, 1)
return c.SendString("default")
})
// Cold cache: many concurrent requests for the same key should not be fully
// coalesced when SingleFlight is false; the handler should run more than once.
const concurrency = 50
var wg sync.WaitGroup
for i := 0; i < concurrency; i++ {
wg.Add(1)
go func(idx int) {
defer wg.Done()
req := httptest.NewRequest("GET", "/", nil)
resp, err := app.Test(req)
if err != nil {
t.Errorf("request %d: %v", idx, err)
return
}
// Drain body to ensure handler completes.
_, _ = io.ReadAll(resp.Body)
}(i)
}
wg.Wait()
calls := atomic.LoadInt64(&handlerCalls)
utils.AssertEqual(t, true, calls > 1, "handler should be invoked more than once when SingleFlight is false")
// Sequential requests still exhibit normal caching behavior.

Copilot uses AI. Check for mistakes.
Comment on lines +83 to +86
// StaleWhileRevalidate, when > 0, allows serving stale responses for expired
// entries while one revalidation runs (one handler run per key). 0 disables.
// Full stale-while-revalidate behavior may be added in a follow-up.
//
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StaleWhileRevalidate is added to the public Config API and documented as if it changes behavior, but it’s not used anywhere in the middleware implementation. This is a no-op knob and makes the docs misleading. Either wire it into cache.go or remove/clearly mark it as not implemented until a follow-up PR.

Suggested change
// StaleWhileRevalidate, when > 0, allows serving stale responses for expired
// entries while one revalidation runs (one handler run per key). 0 disables.
// Full stale-while-revalidate behavior may be added in a follow-up.
//
// StaleWhileRevalidate is reserved for future use and is currently not
// implemented in the cache middleware, so setting it has no effect.
// Intended semantics (not yet implemented): when > 0, allow serving stale
// responses for expired entries while one revalidation runs per key. 0 disables.

Copilot uses AI. Check for mistakes.
}

// If middleware marks request for bypass, return result without caching.
if cfg.Next != nil && cfg.Next(c) {
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the SingleFlight closure, cfg.Next(c) returning true currently just returns res and the outer code later sets the cache status header to miss. In the non-singleflight path, Next causes cfg.CacheHeader to be set to unreachable and skips caching. Please preserve the existing cache-status semantics here (and consider skipping singleflight coalescing entirely for requests where Next(c) is true).

Suggested change
if cfg.Next != nil && cfg.Next(c) {
if cfg.Next != nil && cfg.Next(c) {
if cfg.CacheHeader != "" {
c.Set(cfg.CacheHeader, "unreachable")
}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants