
feat: add tpu flash and paged attention#402

Open
neudinger wants to merge 7 commits into zml/v2 from kevin/attention-tpu

Conversation

@neudinger

No description provided.

@neudinger neudinger force-pushed the kevin/attention-tpu branch 11 times, most recently from 12b16c1 to 30df894 Compare March 12, 2026 13:58
@neudinger neudinger force-pushed the kevin/attention-tpu branch 8 times, most recently from 0c7c8ad to 15fc4d8 Compare March 23, 2026 18:52
Kevin Barre added 7 commits March 23, 2026 18:55
- Introduced `mosaic_tpu` backend in `paged_attention.zig` and updated related structures to accommodate TPU-specific parameters and options.
- Implemented `raggedPagedAttention` function in `flashattn.zig` to handle TPU-specific attention logic.
- Enhanced `Buffer` struct in `buffer.zig` with a new method `toPrefix1dSlice` for copying prefixes of rank-1 buffers across shards.
- Updated `Platform` struct in `platform.zig` to manage TPU IR runtime initialization and deinitialization.
- Modified `env` function in `testing.zig` to return a platform instance or an error, improving error handling in tests.
- Adjusted various test cases to handle platform initialization errors gracefully.
- Cleaned up unused imports and ensured consistent error handling across the codebase.
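The `mosaic_tpu` backend and `raggedPagedAttention` above build on the paged-KV-cache technique, in which each sequence's key/value history is stored in fixed-size physical pages selected through a per-sequence page table rather than in one contiguous buffer. The sketch below illustrates that idea only; the names, shapes, and page size are hypothetical and are not ZML's actual API.

```python
import numpy as np

PAGE_SIZE = 4   # tokens per physical page (illustrative)
HEAD_DIM = 8    # key/value feature dimension (illustrative)

def gather_keys(kv_pages: np.ndarray, page_table: list, seq_len: int) -> np.ndarray:
    """Reassemble one sequence's keys from scattered physical pages.

    kv_pages:   (num_physical_pages, PAGE_SIZE, HEAD_DIM) pool shared by all sequences
    page_table: logical-block -> physical-page mapping for this sequence
    seq_len:    number of valid tokens (the last page may be only partially filled)
    """
    blocks = [kv_pages[p] for p in page_table]       # gather pages in logical order
    return np.concatenate(blocks, axis=0)[:seq_len]  # trim the padded tail of the last page

# Tiny pool where each page is identifiable by its fill value.
pool = np.stack([np.full((PAGE_SIZE, HEAD_DIM), float(i)) for i in range(6)])

# A 10-token sequence scattered over physical pages 5, 2, 0 (last page half-used).
keys = gather_keys(pool, page_table=[5, 2, 0], seq_len=10)
print(keys.shape)  # (10, 8)
```

The "ragged" part of ragged paged attention refers to batching sequences of different lengths: each sequence carries its own `seq_len` and page table, so the kernel walks a different number of pages per batch element instead of padding every sequence to the longest one.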
@neudinger neudinger force-pushed the kevin/attention-tpu branch from 15fc4d8 to 9fd7a8c Compare March 23, 2026 18:57
