Skip to content

feat(helm): add configurable job history limits for heartbeat cronjob#15008

Draft
Syndlex wants to merge 3 commits intolinkerd:mainfrom
Syndlex:feature/heartbeat-job-history-limits
Draft

feat(helm): add configurable job history limits for heartbeat cronjob#15008
Syndlex wants to merge 3 commits intolinkerd:mainfrom
Syndlex:feature/heartbeat-job-history-limits

Conversation

@Syndlex
Copy link

@Syndlex Syndlex commented Mar 6, 2026

Add configurable job history limits for heartbeat cronjob

Problem

The heartbeat CronJob had successfulJobsHistoryLimit hardcoded to 0 and
failedJobsHistoryLimit was not set at all. This prevented users from
retaining job history for debugging and monitoring purposes.

Our monitoring is reporting failed jobs because of the change described here. This happens because the heartbeat is faster than our monitoring and the successful job was never recognised.

Solution

Add successfulJobsHistoryLimit and failedJobsHistoryLimit as configurable
values under the heartbeat section in values.yaml. The template now uses
these values with sensible defaults (0 for successful, 1 for failed).

Validation

  • Verified the Helm template renders correctly with default values
  • Verified custom values can be passed via --set heartbeat.successfulJobsHistoryLimit=X

@Syndlex Syndlex requested a review from a team as a code owner March 6, 2026 10:15
@Syndlex Syndlex force-pushed the feature/heartbeat-job-history-limits branch from d5d94bb to 47fd0dc Compare March 6, 2026 10:16
Problem

The heartbeat CronJob had successfulJobsHistoryLimit hardcoded to 0 and
failedJobsHistoryLimit was not set at all. This prevented users from
retaining job history for debugging and monitoring purposes.

Solution

Add successfulJobsHistoryLimit and failedJobsHistoryLimit as configurable
values under the heartbeat section in values.yaml. The template now uses
these values with sensible defaults (0 for successful, 1 for failed).

Validation

- Verified the Helm template renders correctly with default values
- Verified custom values can be passed via --set heartbeat.successfulJobsHistoryLimit=X

Signed-off-by: mfeix <marcel.feix@exxcellent.de>
@Syndlex Syndlex force-pushed the feature/heartbeat-job-history-limits branch from 47fd0dc to fd7d888 Compare March 6, 2026 10:56
@Syndlex
Copy link
Author

Syndlex commented Mar 19, 2026

Hey, these are Go Problems on a other level not on a change I have done.

@cratelyn
Copy link
Member

Hey, these are Go Problems on a other level not on a change I have done.

        	slice[35].map[spec].map[failedJobsHistoryLimit]: <does not have key> != 1
    --- FAIL: TestRender/14:_install_tracing.golden (1.04s)

it looks like this is related to your change.

you can see the documentation in TEST.md for more information about how to fix these failing tests.

@cratelyn
Copy link
Member

i am going to mark this as a draft until tests are passing.

@cratelyn cratelyn marked this pull request as draft March 19, 2026 16:45
…en files

Fix template to handle nil .Values.heartbeat by using parenthesized
access pattern. Regenerate test golden files to include the new
failedJobsHistoryLimit field.

Signed-off-by: mfeix <marcel.feix@exxcellent.de>
@Syndlex
Copy link
Author

Syndlex commented Mar 23, 2026

@cratelyn can you re run the tests I think this is a flaky one? not good with your infrastructure tests.

@cratelyn
Copy link
Member

@cratelyn can you re run the tests I think this is a flaky one? not good with your infrastructure tests.

@Syndlex can do! thanks for the ping.

@cratelyn cratelyn marked this pull request as ready for review March 23, 2026 17:19
@cratelyn cratelyn marked this pull request as draft March 23, 2026 17:19
@cratelyn
Copy link
Member

=== FAIL: pkg/charts/linkerd2 TestNewValues/HA (0.00s)
    values_test.go:354: HA Helm values
        [Heartbeat: <nil map> != map[failedJobsHistoryLimit:1 successfulJobsHistoryLimit:0]]
    --- FAIL: TestNewValues/HA (0.00s)

=== FAIL: pkg/charts/linkerd2 TestNewValues (0.02s)
    values_test.go:292: Helm values
        [Heartbeat: <nil map> != map[failedJobsHistoryLimit:1 successfulJobsHistoryLimit:0]]

DONE 1091 tests, 1 skipped, 3 failures in 238.744s
error: Recipe `go-test` failed on line 23 with exit code 1
Error: Process completed with exit code 1.

these test failures look related ☝️

@Syndlex
Copy link
Author

Syndlex commented Mar 23, 2026

@cratelyn sry i hope this is fixed now. A lot of stuff I haven't seen in test infrastructure.

@Syndlex Syndlex force-pushed the feature/heartbeat-job-history-limits branch from 9ada8b9 to 3c1a31c Compare March 23, 2026 22:26
Add the Heartbeat map with successfulJobsHistoryLimit and
failedJobsHistoryLimit to the expected Values struct in the test,
matching the new defaults in values.yaml.

Signed-off-by: mfeix <marcel.feix@exxcellent.de>
@Syndlex Syndlex force-pushed the feature/heartbeat-job-history-limits branch from 3c1a31c to 6242f0c Compare March 23, 2026 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants