Conversation
Comprehensive senior engineering review of the kash codebase covering architecture, code quality, Python conformance, and reusability. Includes 23 tracked beads across 5 phases of improvements. https://claude.ai/code/session_01Th8ndDSiErVpyHhGLuMFGH
…view Adds Part 5 (testability by subsystem with priority matrix) and Part 6 (guidelines conformance review) to the architecture spec. Detailed breakdown of easy/medium/hard testing per component, conftest.py design, and estimated ~300 new tests across 23 files in 3 phases. https://claude.ai/code/session_01Th8ndDSiErVpyHhGLuMFGH
…o spec Expands Testing Strategy section with specific rules from general-testing-rules, general-tdd-guidelines, and golden-testing-guidelines. Updates Part 6 with detailed conformance analysis against all applicable tbd guidelines (python-rules, python-modern-guidelines, python-cli-patterns, error-handling-rules, etc.). Adds structured References section. https://claude.ai/code/session_01Th8ndDSiErVpyHhGLuMFGH
…iers Analyzed kash-docs (393 imports, 150+ symbols) and kash-media (80+ symbols) dependency surfaces. Restructured Implementation Plan into 3 tiers: - Tier 1: Fully compatible (tests, modernization, reviews) - 12 beads - Tier 2: Additive changes (new APIs, old paths preserved) - 8 beads - Tier 3: Breaking changes (require downstream updates) - 4 beads All beads labeled with tier. Critical API surface documented. https://claude.ai/code/session_01Th8ndDSiErVpyHhGLuMFGH
Add detailed body descriptions to 7 beads that had empty bodies: kash-8hh2, kash-0vb5, kash-j8st, kash-hv4e, kash-law6, kash-78wq, kash-s9q0. Fix Part 6/7 ordering in the spec document. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
Create test infrastructure with fixtures for temp_workspace, mock_current_ws, mock_llm, and sample_item. Register pytest markers for slow, online, and golden test categories. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…-ermq) New test files: - tests/model/test_preconditions.py: Boolean algebra (&, |, ~, and_all, or_all) - tests/model/test_params.py: Validation, json_schema, properties - tests/model/test_items.py: ItemType properties, content_equals, serialization - tests/llm_utils/test_fuzzy_parsing.py: Edge cases for parsing functions - tests/shell/test_completion_scoring.py: Scoring, normalization, recency decay 50 new tests, all pass in <0.4s. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
The exact pin existed due to a litellm/openai compatibility issue with ResponseTextConfig imports. The import works fine with 1.99.x. Pin to 1.x range to allow patch updates while avoiding openai 2.x breaking changes. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…oring (kash-69rw) Replace 15+ magic numbers with named constants: MAX_SCORE, MIN_CUTOFF, SHORT_PREFIX_SCORE, PREFIX_BASE_SCORE, DESCRIPTION_BONUS, PATH_PHRASE_WEIGHT, PATH_RECENCY_WEIGHT, COMMENT_SCORE_WEIGHT, RECENCY_DECAY_CONSTANT, etc. Each constant has a docstring explaining its purpose. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…cks (kash-kngd) - Remove commented-out lazyasd background import code in lazy_imports.py - Remove `if __name__ == "__main__"` testing blocks from colors.py, version.py, and web_extract.py (per python-rules: no __main__ for testing) https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…kash-aik9) Per python-rules and python-modern-guidelines: add future annotations to all Python files with type hints. 184 files modified (182 in src/kash, 2 in tests). Skipped trivial __init__.py files and files without type annotations. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…(kash-gpj8) Per python-rules: ALWAYS use @OverRide decorators on overridden methods. Added @OverRide to ABC implementations (MediaService, Action, HelpDoc), framework overrides (Path, RichHandler, Protocol), and key dunder methods. Uses typing_extensions.override for Python 3.11 compatibility. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
Outbox bead files cleared by `tbd import --outbox --clear-on-success`. uv.lock updated to reflect openai version pin change (>=1.99,<2.0). https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
tbd sync to remote tbd-sync branch failed with 403. Saving outbox to feature branch so bead data is preserved. Run `tbd sync` when push access is available to restore. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…n_checks, selections, param_state, item_id_index, llm_completion, sliding_transforms (kash-ppz9) 75 new tests across 8 test files covering: - action_registry: registration, duplicate warning, lookup, not-found error - resolve_args: URL/path/StorePath resolution, selection fallback - precondition_checks: action matching, item filtering, max_results, error skipping - selections: Selection state, SelectionHistory push/pop/nav/persistence/truncation - param_state: set/get, persistence across instances, missing file handling - item_id_index: indexing, duplicate detection, unindexing, load error handling - llm_completion: CitationList, LLMCompletionResult, mocked API calls, template completion - sliding_transforms: filtered transform, paragraph windowing, validation errors https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…rts (kash-ymq4) - Add __all__ to kash/llm_utils/__init__.py (11 symbols) - Add canonical top-level imports to kash/__init__.py with __all__ (28 symbols) covering core model types, execution decorators, LLM utilities, and errors - All existing deep import paths remain valid (purely additive) https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
- actions/__init__.py: Replace module-level import_and_register() and load_kits() with ensure_actions_registered() (idempotent, lazy) - commands/__init__.py: Same pattern with ensure_commands_registered() - action_registry.py: get_all_action_classes() calls ensure_actions_registered() instead of `import kash.actions` - command_registry.py: register_all_commands() calls ensure_commands_registered() instead of `import kash.commands` `import kash.actions` no longer triggers registration as a side effect. Registration happens lazily on first call to get_all_action_classes() or get_all_commands(). import_and_register() used by downstream kits is unchanged. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…-i7dr) Remove confirmed-unused packages: tenacity, tiktoken, selectolax, pluralizer, rich-argparse. Each was verified absent from imports in both kash and downstream packages (kash-docs, kash-media). Add optional dependency groups [llm], [web], [media], [server], [shell] and [all] for users who want minimal installs. All current deps remain in the default install for backward compatibility. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…hh2) New kash.run module provides kash_init() and kash_run() to execute any kash action as a Python library call without the interactive shell or xonsh dependency. kash_run() takes an action name, optional inputs (URLs, file paths, or Item objects), optional params, and returns a structured ActionResult. Auto-initializes on first call. Uses temporary workspace if none provided. Includes 13 tests covering input building (URLs, files, Items, mixed), workspace resolution, and full pipeline with mocked actions. Also exports kash_init and kash_run from the root kash package for convenient `from kash import kash_run` usage. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
…local_file_cache (kash-jlmn) Integration tests covering: - action_exec pipeline via kash_run(): URL/file/Item inputs, multiple inputs, params, not-found errors, no-input actions (7 tests) - web_fetch: file:// URL downloads, HttpHeaders MIME type parsing (8 tests) - local_file_cache: Loadable caching, TTL expiration, NEVER expiration, is_cached, local file caching, mtime reading (9 tests) - kash.run module: input building, workspace resolution, full pipeline (13 tests) Fix RawParamValues construction in kash.run (use values= kwarg). Total: 36 new tests (210 total across project). https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
Mark completed beads in spec: all 12 Tier 1 items, 4/8 Tier 2 items, 3/4 Tier 3 evaluations. Add evaluation results for utils audit (kash-zt0y), web_content extraction (kash-law6), and llm_utils extraction (kash-78wq). Update spec status to In Progress. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
Fix 32 basedpyright errors: - kash.run: use proper raw-to-typed param conversion via parse_all() - paths_model: remove @OverRide from __truediv__ (no base method) - test_preconditions: add type ignore for dynamic Item construction - test_item_id_index: wrap url strings in Url(), narrow ItemId | None - test_llm_completion: prefix unused mock params with underscore - test_action_registry: add type ignore for mock return - test_web_fetch: use Url() NewType, fix MimeType None checks - test_local_file_cache: use named function instead of lambda for save https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
All 301 tests pass on Python 3.14.3. The pydantic incompatibility was specific to the 3.14.0 release candidate and is resolved in the final release. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
Merge unique content from docs/CODE_REVIEW_2026-01.md (dependency upgrade status, TODO/FIXME catalog, creative improvement ideas, best practices alignment) into the architecture spec as appendices. Remove the standalone file to follow spec conventions. https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
jlevy
pushed a commit
that referenced
this pull request
Feb 9, 2026
Detailed instructions for verifying kash-docs against updated kash-shell main branch (PR #6) and applying the same code quality cleanups: Python 3.14 support, ruff/codespell/basedpyright fixes, future annotations, @OverRide decorators, dependency upgrades (openai 2.x, litellm 1.81). https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Comprehensive architecture modernization of kash based on a code review and planning spec. Includes dependency upgrades from PR #5, Python 3.14 CI support, and all Tier 1 beads plus partial Tier 2/3 work.
Scope: 244 files changed, 5,266 additions, 140 deletions. 301 tests (up from ~135 on main).
Key changes
kash.run):kash_init()+kash_run()for using kash programmatically without the interactive shellkash/__init__.py):__all__with 28 canonical exports; all deep import paths preservedensure_actions_registered()/ensure_commands_registered()replace module-level side effects[llm],[web],[media],[server],[shell],[all]from __future__ import annotations(184 files),@override(34 methods), named constants, dead code removalconftest.pywith shared fixtures, 301 total testsValidation Plan
Part 1: Automated — Verified by CI (all passing)
CI runs on Python 3.11, 3.12, 3.13, 3.14 — all 4 builds green.
uv run pytestbasedpyright --statsruff check+ruff formatcodespellhelto ignore (test data)uv lock --upgradeTest coverage breakdown:
tests/test_run.pykash_init(),kash_run(), input building, workspace resolution, paramstests/exec/test_action_exec_integration.pytests/exec/test_action_registry.pytests/exec/test_resolve_args.pytests/exec/test_precondition_checks.pytests/model/test_items.pytests/model/test_params.pytests/model/test_preconditions.pytests/llm_utils/test_llm_completion.pytests/llm_utils/test_fuzzy_parsing.pytests/llm_utils/test_sliding_transforms.pytests/file_storage/test_item_id_index.pytests/shell/test_completion_scoring.pytests/web_content/test_web_fetch_integration.pytests/web_content/test_local_file_cache_integration.pytests/workspaces/test_param_state.pytests/workspaces/test_selections.pysrc/**/*.pyPart 2: Manual Verification Required — Human Review Checklist
These items cannot be fully validated by automated tests and require a human to confirm.
2a. Backward Compatibility (HIGH PRIORITY)
kash-docsorkash-mediaexist as separate packages, verify they still import from kash without errorskashinteractively and verify:helpcommand works2b. OpenAI 2.x + LiteLLM Compatibility (HIGH PRIORITY)
summarize_as_bullets) with a real API key and verify:text-embedding-3-largestill works with openai 2.x2c. Standalone Library API (MEDIUM PRIORITY)
kash_run()outside the shell with a real (non-mocked) action:2d. Optional Dependency Groups (MEDIUM PRIORITY)
pip install kash-shell(oruv pip install .) — verify core imports work without optional extraspip install kash-shell[llm]— LLM actions availablepip install kash-shell[web]— Web scraping workspip install kash-shell[all]— Everything works2e. Lazy Registration Behavior (LOW PRIORITY)
time kash --versionbefore and after this PR — should be equal or fasterensure_actions_registered()multiple times doesn't re-import or duplicate2f. Documentation Accuracy (LOW PRIORITY)
[x]marks in the spec match actual bead statusPart 3: Known Issues & Risk Areas
from __future__ import annotations+ dataclass round-triptabbed_webpage.py; other files don't do YAML round-trips with nested dataclassesFollow-Up Work (Post-Merge)
The remaining 5 open beads are all Tier 2/3 features that build on this PR's foundation. They should be tackled in a follow-up PR after this one merges.
Tier 2: Remaining Additive Features (4 beads)
These add new capabilities without breaking existing imports. They build on the lazy registration and
kash.run()foundation from this PR.1. ShellContext Protocol (kash-0vb5) — P2, blocks kash-5ew2
Goal: Decouple command execution from xonsh via a
ShellContextprotocol.Design:
Implementations needed:
XonshShellContext— wraps current xonsh environment (preserves existing behavior)StandaloneContext— for library/script use (builds onkash.run)CLIContext— for standalone CLI commandsWhy deferred: Design needs careful thought to avoid leaking xonsh internals. The protocol must be stable since downstream code will depend on it.
2. Standalone CLI for Actions (kash-qejq) — P2
Goal: Enable
kash run <action> --input ... --format jsonwithout starting the interactive shell.Design:
kash_run()from this PR as the execution backend--format text|json,--non-interactive,--no-progress,--dry-runWhy deferred: Depends on ShellContext (kash-0vb5) for clean output routing. Could be started independently with a simpler approach.
3.
--format jsonand--non-interactiveFlags (kash-5ew2) — P2, blocked by kash-0vb5Goal: Machine-readable output for agent/CI consumption.
Design:
OutputManagerpattern for dual text/JSON outputCIandNO_COLORenv varsWhy deferred: Blocked by ShellContext protocol (kash-0vb5) — needs the abstraction to route output correctly.
4. KashTestRunner (kash-j8st) — P2
Goal: First-class testing support for running kash actions in pytest.
Design:
Requirements: No console output during test runs, deterministic behavior (mock LLM), easy workspace setup/teardown, assert-friendly return types.
Why deferred: Builds on
kash.run()from this PR but needs more design work around mock injection and pipeline composition.Tier 3: Breaking Changes (1 bead)
5. Lightweight MCP Mode (kash-s9q0) — P3
Goal: Expose individual actions as MCP tools without loading the entire kash registry.
Current problem: The MCP server requires full kash initialization including all registries, workspace system, and configuration. Cannot expose a single action as an MCP tool without loading everything.
Design considerations:
@kash_actionmetadataWhy deferred: This is the only Tier 3 (potentially breaking) item remaining. It may change how action discovery works internally. Benefits from the lazy registration work in this PR (kash-y80s) as a foundation.
Prerequisite evaluations (completed in this PR):
Recommended Order
Beads
Closed (19): kash-keim, kash-ermq, kash-ppz9, kash-jlmn, kash-aik9, kash-gpj8, kash-69rw, kash-kngd, kash-hv4e, kash-ht5b, kash-gzzs, kash-91re, kash-ymq4, kash-y80s, kash-i7dr, kash-8hh2, kash-zt0y, kash-law6, kash-78wq
Open (5): kash-0vb5, kash-qejq, kash-5ew2, kash-j8st, kash-s9q0
Subsumes: PR #5 (dependency upgrades, CODE_REVIEW doc, CI uv update). PR #5 can be closed.
https://claude.ai/code/session_01Pv152sQEpeLvwDoJTdz9Fm