Tags: Nerif-AI/Nerif
feat: multi-agent framework v1.3.2–1.3.5 (#65)

* feat: nerif v1.3.0 - async agent, fallback, callbacks, prompt templates, rate limiting, ASR/TTS

New features:
- Async agent: NerifAgent.arun() with concurrent tool execution via asyncio.gather
- Model fallback chain: SimpleChatModel(fallback=["model-b", "model-c"])
- Callback/hook system: 7 event types (LLMStart/End/Error, ToolCall, Fallback, Retry, Memory)
- PromptTemplate: variable substitution, defaults, conditionals, partial application
- Rate limiting: per-model/provider RateLimitConfig with sync/async support
- Enhanced ASR: AudioModel with language/format/translate + Transcriber high-level API
- Enhanced TTS: SpeechModel with model/format/speed/file output + Synthesizer API
- Streaming enhancements: counter and retry_config passed to stream_chat/astream_chat

Tech debt fixes:
- Extract _is_transient_error shared helper (retry + fallback)
- Fix pyproject.toml keywords
- Fix README Python version to >=3.10
- Clean up asr/tts re-exports (remove SimpleChatModel/MultiModalMessage)

Tests: 81 new tests, 506 total passing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address code review findings in PromptTemplate and RateLimiter

- PromptTemplate: rewrite conditional parser to handle nested {var} braces correctly (multi-variable conditionals now work)
- PromptTemplate: variables property no longer returns '?' from conditional markers
- RateLimiter: create asyncio.Semaphore lazily in aacquire() instead of __init__ to avoid binding to wrong event loop

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use 'is not None' instead of 'or' for SpeechModel parameter merging

Prevents falsy values (e.g. speed=0.0, empty string) from silently falling back to instance defaults.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add 21 cross-feature integration tests for v1.3

Covers:
- Fallback + Retry (retry exhausted → fallback, primary success, all fail)
- Fallback + Callbacks (coexistence)
- Async Fallback (achat path)
- Async Agent + Memory (tool messages in memory)
- Async Agent + Fallback (config propagation)
- Rate Limiter + SimpleChatModel (wiring)
- PromptTemplate + SimpleChatModel (system prompt, user message)
- Memory + Callbacks (coexistence without conflict)
- ASR Transcriber end-to-end (language, format, translate)
- TTS Synthesizer end-to-end (voice, speed, file output)
- Full stack: PromptTemplate + Memory + Fallback + Callbacks combined
- Async agent full stack: Memory + Fallback + async tool
- Non-retryable error (400) skips fallback
- Empty fallback list creates no FallbackConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: extract shared constants, fix imports to relative, normalize logger name

- Create src/nerif/utils/constants.py with DEFAULT_TIMEOUT and LOGGER_NAME
- Replace all duplicate httpx.Timeout(30.0, read=120.0) definitions in utils.py, audio_model.py, and image_generation.py with import from constants
- Fix absolute imports (nerif.exceptions) to relative (..exceptions) in utils.py and format.py, consistent with the rest of the package
- Normalize logger name from "Nerif" (capital N) to "nerif" (lowercase) in utils.py and log.py via the LOGGER_NAME constant
- Use LOGGER_NAME constant in callbacks.py for consistent logger hierarchy
- Delete FIXME comment in log.py that is no longer actionable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): replace eval() with ast.literal_eval in FormatVerifierStringList

Eliminates arbitrary code execution vulnerability in FormatVerifierStringList.convert() by replacing eval() with ast.literal_eval(), which only parses Python literals. Also fixes the UnboundLocalError that would occur when eval() raised an exception (res was unbound but still referenced in the return statement). Corrects the match() and convert() return type annotations from list[int] to list[str].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: replace print() with LOGGER.debug() in core.py

All 7 debug print() calls in Nerif.__init__, logits_mode, embedding_mode, and judge are now routed through the structured LOGGER.debug() interface. log.py required no changes (FIXME comment was already absent).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate async lock race condition in RateLimiter

Extract lazy async primitive creation into _ensure_async_primitives(), called once at the top of aacquire(). Because asyncio is single-threaded within an event loop, no two coroutines can enter _ensure_async_primitives simultaneously, making the initialisation truly safe. Also fixes the import ordering in test_v131_fixes.py and adjusts the concurrency test to avoid a semaphore deadlock (max_concurrent=0 so all gather tasks complete before any arelease is needed).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: replace requests with httpx in vision_model_enhanced

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Memory save/load now persists configuration parameters

ConversationMemory.save() now writes a v1.1 format that includes a "config" block (max_messages, max_tokens, summarize, summarize_model, summary_prompt). ConversationMemory.load() reads that block and passes the values to __init__, so a round-tripped instance has the same constraints as the original. Old v1.0 files (no "config" key) fall back to constructor defaults for backward compatibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: wire CallbackManager and RateLimiter into SimpleChatModel

- Add _prepare_chat_kwargs() helper to deduplicate message-building logic shared between chat() and achat()
- Add _process_chat_result() helper to deduplicate tool-call handling and response_model parsing shared between chat() and achat()
- Rewrite chat() to fire on_llm_start/on_llm_end/on_llm_error callbacks and call rate_limiter.acquire()/release() around the API call
- Rewrite achat() with same structure using aacquire()/arelease() for the async rate-limiter path; uses local llm_start_time to avoid async race conditions
- Add import time as _time and LLMStartEvent/LLMEndEvent/LLMErrorEvent at module level; fix counter type hint to Optional[NerifTokenCounter]

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: agent no longer sends empty user message after tool results

Add _continue_after_tools and _acontinue_after_tools to SimpleChatModel so NerifAgent can resume the conversation after tool results without appending an empty user message (which Anthropic and other strict providers reject with HTTP 400).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add JSON mode as tier-1 in judgment chain with configurable strategy

Nerif.judge() and NerifMatchString.match() now accept a 'strategy' list controlling which tiers to attempt in order. The default strategy is ["json", "logits", "embedding", "force_fit"], making structured JSON output the first tier tried before logits/embedding fallbacks. Both classes expose a new json_mode() method using a dedicated prompt and response_format={"type": "json_object"}. The strategy parameter is threaded through instance() and the module-level nerif(), nerif_match_string(), and nerif_match() helpers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: bump version to 1.3.1, clean dependencies, pydantic as core dep

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update tests for agent tool continuation + json_mode robustness

* fix: update all failing tests - skip guards, mock updates, compression thresholds

- Category 1 (live API tests): add NERIF_RUN_LIVE_TESTS opt-in skip guard to nerif_model_test.py (class-level), nerif_test.py (class-level), and nerif_token_counter_test.py (4 individual methods that hit the live OpenAI API). Using NERIF_RUN_LIVE_TESTS instead of OPENAI_API_KEY presence because the environment may have an invalid key that causes 401 errors.
- Category 2 (async agent mock tests): patch _acontinue_after_tools alongside achat in TestAsyncAgentPlusMemory and TestAllFeaturesCombined tests; the agent's arun() now calls _acontinue_after_tools (not achat again) after tool execution. Also replace asyncio.get_event_loop().run_until_complete() with asyncio.run() in all three async tests to avoid RuntimeError in Python 3.12 full-suite runs.
- Category 3 (image compression): lower size_threshold_mb from 0.01 (10 KB) to 0.001 (1 KB) so solid-color PNG test images (2–4 KB) are above the threshold; pass preserve_structure=False in test_batch_with_output_directory so output files land directly in the output dir instead of a mirrored tmp subdirectory.
- Category 4 (vision URL test): replace @patch("requests.get") with @patch("httpx.get") to match the updated _download_image_from_url implementation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: enhanced logger and token counter for v1.3.2

Logger:
- Fix logger name inconsistency in agent.py (hardcoded "Nerif" → LOGGER_NAME)
- Add JsonFormatter for structured JSON log output
- Add enable_debug_logging() convenience function
- Add env var support (NERIF_LOG_LEVEL, NERIF_LOG_FILE)
- Add RotatingFileHandler support (max_bytes, backup_count)

Token Counter:
- Implement per-model success_rate(model) tracking
- Add to_dict() and to_json() export methods
- Add context manager support (with counter:)
- Add record_retry() and wire into retry_sync/retry_async
- Export RequestStartEvent, RequestEndEvent, RequestErrorEvent

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: agent-as-tool for multi-agent composition (v1.3.3, Phase 1)

Add NerifAgent.as_tool() that wraps an agent as a Tool for use by another agent. Provides both sync and async implementations. This is the minimum viable multi-agent pattern: zero new abstractions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version to 1.3.3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: orchestration patterns - Pipeline, Router, Parallel (v1.3.4, Phase 2)

- AgentPipeline: sequential chain, each agent's output feeds the next
- AgentRouter: LLM-based routing to the best sub-agent
- AgentParallel: concurrent fan-out with configurable aggregation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version to 1.3.4

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: collaboration primitives - Workspace, MessageBus, Handoff (v1.3.5, Phase 3)

- SharedWorkspace: key-value store with as_tools() for agent access
- AgentMessageBus: named agent registry with send/receive and as_tools()
- AgentHandoff: structured task delegation dataclass with to_prompt()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version to 1.3.5

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
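
The Phase 2 orchestration patterns in #65 (a sequential chain where each agent's output feeds the next, and a concurrent fan-out with aggregation) can be sketched roughly as follows. This is an illustration of the two patterns only, not Nerif's AgentPipeline or AgentParallel API: `Agent`, `run_pipeline`, `run_parallel`, `upper` and `exclaim` are hypothetical names, and each "agent" is assumed to be a plain async callable from string to string.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

# Hypothetical stand-in for an agent: any async callable from str to str.
Agent = Callable[[str], Awaitable[str]]


async def run_pipeline(agents: Sequence[Agent], task: str) -> str:
    """Sequential chain: each agent's output becomes the next agent's input."""
    result = task
    for agent in agents:
        result = await agent(result)
    return result


async def run_parallel(
    agents: Sequence[Agent],
    task: str,
    aggregate: Callable[[list[str]], str] = "\n".join,
) -> str:
    """Fan the same task out concurrently, then aggregate outputs in agent order."""
    outputs = await asyncio.gather(*(agent(task) for agent in agents))
    return aggregate(list(outputs))


# Two toy agents, used only for demonstration.
async def upper(text: str) -> str:
    return text.upper()


async def exclaim(text: str) -> str:
    return text + "!"


if __name__ == "__main__":
    print(asyncio.run(run_pipeline([upper, exclaim], "hi")))
    print(asyncio.run(run_parallel([upper, exclaim], "hi")))
```

asyncio.gather returns results in the order the awaitables were passed, so the aggregation step sees outputs in agent order regardless of which agent finishes first.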
release: v1.3.1 - JSON judgment chain, callback/rate-limiter wiring, security fixes (#62)

* refactor: extract shared constants, fix imports to relative, normalize logger name

- Create src/nerif/utils/constants.py with DEFAULT_TIMEOUT and LOGGER_NAME
- Replace all duplicate httpx.Timeout(30.0, read=120.0) definitions in utils.py, audio_model.py, and image_generation.py with import from constants
- Fix absolute imports (nerif.exceptions) to relative (..exceptions) in utils.py and format.py, consistent with the rest of the package
- Normalize logger name from "Nerif" (capital N) to "nerif" (lowercase) in utils.py and log.py via the LOGGER_NAME constant
- Use LOGGER_NAME constant in callbacks.py for consistent logger hierarchy
- Delete FIXME comment in log.py that is no longer actionable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): replace eval() with ast.literal_eval in FormatVerifierStringList

Eliminates arbitrary code execution vulnerability in FormatVerifierStringList.convert() by replacing eval() with ast.literal_eval(), which only parses Python literals. Also fixes the UnboundLocalError that would occur when eval() raised an exception (res was unbound but still referenced in the return statement). Corrects the match() and convert() return type annotations from list[int] to list[str].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: replace print() with LOGGER.debug() in core.py

All 7 debug print() calls in Nerif.__init__, logits_mode, embedding_mode, and judge are now routed through the structured LOGGER.debug() interface. log.py required no changes (FIXME comment was already absent).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate async lock race condition in RateLimiter

Extract lazy async primitive creation into _ensure_async_primitives(), called once at the top of aacquire(). Because asyncio is single-threaded within an event loop, no two coroutines can enter _ensure_async_primitives simultaneously, making the initialisation truly safe. Also fixes the import ordering in test_v131_fixes.py and adjusts the concurrency test to avoid a semaphore deadlock (max_concurrent=0 so all gather tasks complete before any arelease is needed).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: replace requests with httpx in vision_model_enhanced

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Memory save/load now persists configuration parameters

ConversationMemory.save() now writes a v1.1 format that includes a "config" block (max_messages, max_tokens, summarize, summarize_model, summary_prompt). ConversationMemory.load() reads that block and passes the values to __init__, so a round-tripped instance has the same constraints as the original. Old v1.0 files (no "config" key) fall back to constructor defaults for backward compatibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: wire CallbackManager and RateLimiter into SimpleChatModel

- Add _prepare_chat_kwargs() helper to deduplicate message-building logic shared between chat() and achat()
- Add _process_chat_result() helper to deduplicate tool-call handling and response_model parsing shared between chat() and achat()
- Rewrite chat() to fire on_llm_start/on_llm_end/on_llm_error callbacks and call rate_limiter.acquire()/release() around the API call
- Rewrite achat() with same structure using aacquire()/arelease() for the async rate-limiter path; uses local llm_start_time to avoid async race conditions
- Add import time as _time and LLMStartEvent/LLMEndEvent/LLMErrorEvent at module level; fix counter type hint to Optional[NerifTokenCounter]

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: agent no longer sends empty user message after tool results

Add _continue_after_tools and _acontinue_after_tools to SimpleChatModel so NerifAgent can resume the conversation after tool results without appending an empty user message (which Anthropic and other strict providers reject with HTTP 400).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add JSON mode as tier-1 in judgment chain with configurable strategy

Nerif.judge() and NerifMatchString.match() now accept a 'strategy' list controlling which tiers to attempt in order. The default strategy is ["json", "logits", "embedding", "force_fit"], making structured JSON output the first tier tried before logits/embedding fallbacks. Both classes expose a new json_mode() method using a dedicated prompt and response_format={"type": "json_object"}. The strategy parameter is threaded through instance() and the module-level nerif(), nerif_match_string(), and nerif_match() helpers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: bump version to 1.3.1, clean dependencies, pydantic as core dep

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update tests for agent tool continuation + json_mode robustness

* fix: update all failing tests - skip guards, mock updates, compression thresholds

- Category 1 (live API tests): add NERIF_RUN_LIVE_TESTS opt-in skip guard to nerif_model_test.py (class-level), nerif_test.py (class-level), and nerif_token_counter_test.py (4 individual methods that hit the live OpenAI API). Using NERIF_RUN_LIVE_TESTS instead of OPENAI_API_KEY presence because the environment may have an invalid key that causes 401 errors.
- Category 2 (async agent mock tests): patch _acontinue_after_tools alongside achat in TestAsyncAgentPlusMemory and TestAllFeaturesCombined tests; the agent's arun() now calls _acontinue_after_tools (not achat again) after tool execution. Also replace asyncio.get_event_loop().run_until_complete() with asyncio.run() in all three async tests to avoid RuntimeError in Python 3.12 full-suite runs.
- Category 3 (image compression): lower size_threshold_mb from 0.01 (10 KB) to 0.001 (1 KB) so solid-color PNG test images (2–4 KB) are above the threshold; pass preserve_structure=False in test_batch_with_output_directory so output files land directly in the output dir instead of a mirrored tmp subdirectory.
- Category 4 (vision URL test): replace @patch("requests.get") with @patch("httpx.get") to match the updated _download_image_from_url implementation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
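
The security fix in #62 swaps eval() for ast.literal_eval(). A minimal sketch of why that matters is shown here. `parse_string_list` is a hypothetical helper written for illustration, not the actual FormatVerifierStringList.convert() implementation; only ast.literal_eval itself is the real standard-library call.

```python
import ast


def parse_string_list(text: str) -> list[str]:
    """Parse a Python-literal list of strings without executing any code."""
    try:
        # literal_eval accepts only literals (strings, numbers, lists, ...),
        # so function calls and attribute access are rejected, not executed.
        value = ast.literal_eval(text.strip())
    except (ValueError, SyntaxError, TypeError) as exc:
        raise ValueError(f"not a valid list literal: {text!r}") from exc
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"expected a list of strings, got: {value!r}")
    return value


if __name__ == "__main__":
    print(parse_string_list('["red", "green"]'))
    try:
        parse_string_list("__import__('os').getcwd()")
    except ValueError as err:
        print("rejected:", err)
```

Raising inside the except branch also avoids the kind of unbound-variable bug the commit mentions, because no result variable is referenced after a failed parse.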
release: v1.3.0 - async agent, fallback, callbacks, prompt templates, rate limiting, ASR/TTS (#60)

* feat: nerif v1.3.0 - async agent, fallback, callbacks, prompt templates, rate limiting, ASR/TTS

New features:
- Async agent: NerifAgent.arun() with concurrent tool execution via asyncio.gather
- Model fallback chain: SimpleChatModel(fallback=["model-b", "model-c"])
- Callback/hook system: 7 event types (LLMStart/End/Error, ToolCall, Fallback, Retry, Memory)
- PromptTemplate: variable substitution, defaults, conditionals, partial application
- Rate limiting: per-model/provider RateLimitConfig with sync/async support
- Enhanced ASR: AudioModel with language/format/translate + Transcriber high-level API
- Enhanced TTS: SpeechModel with model/format/speed/file output + Synthesizer API
- Streaming enhancements: counter and retry_config passed to stream_chat/astream_chat

Tech debt fixes:
- Extract _is_transient_error shared helper (retry + fallback)
- Fix pyproject.toml keywords
- Fix README Python version to >=3.10
- Clean up asr/tts re-exports (remove SimpleChatModel/MultiModalMessage)

Tests: 81 new tests, 506 total passing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address code review findings in PromptTemplate and RateLimiter

- PromptTemplate: rewrite conditional parser to handle nested {var} braces correctly (multi-variable conditionals now work)
- PromptTemplate: variables property no longer returns '?' from conditional markers
- RateLimiter: create asyncio.Semaphore lazily in aacquire() instead of __init__ to avoid binding to wrong event loop

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use 'is not None' instead of 'or' for SpeechModel parameter merging

Prevents falsy values (e.g. speed=0.0, empty string) from silently falling back to instance defaults.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add 21 cross-feature integration tests for v1.3

Covers:
- Fallback + Retry (retry exhausted → fallback, primary success, all fail)
- Fallback + Callbacks (coexistence)
- Async Fallback (achat path)
- Async Agent + Memory (tool messages in memory)
- Async Agent + Fallback (config propagation)
- Rate Limiter + SimpleChatModel (wiring)
- PromptTemplate + SimpleChatModel (system prompt, user message)
- Memory + Callbacks (coexistence without conflict)
- ASR Transcriber end-to-end (language, format, translate)
- TTS Synthesizer end-to-end (voice, speed, file output)
- Full stack: PromptTemplate + Memory + Fallback + Callbacks combined
- Async agent full stack: Memory + Fallback + async tool
- Non-retryable error (400) skips fallback
- Empty fallback list creates no FallbackConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* release: Bump version 1.2.0 -> 1.3.0

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
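
The 'is not None' fix in #60 addresses a common Python pitfall when merging a per-call argument with an instance default. A small sketch of the difference follows; `DEFAULT_SPEED`, `merge_with_or` and `merge_with_is_not_none` are hypothetical names chosen for illustration and do not come from SpeechModel.

```python
from typing import Optional

DEFAULT_SPEED = 1.0  # hypothetical instance default


def merge_with_or(speed: Optional[float]) -> float:
    # Buggy pattern: 0.0 is falsy, so an explicit 0.0 is replaced by the default.
    return speed or DEFAULT_SPEED


def merge_with_is_not_none(speed: Optional[float]) -> float:
    # Fixed pattern: only a missing value (None) falls back to the default.
    return speed if speed is not None else DEFAULT_SPEED


if __name__ == "__main__":
    print(merge_with_or(0.0))           # explicit 0.0 is lost
    print(merge_with_is_not_none(0.0))  # explicit 0.0 is kept
```

The same reasoning applies to empty strings and other falsy but intentional values.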
release: v1.2.0 - memory, RAG, observability, exceptions, CLI, dependency cleanup (#58)

New features:
- ConversationMemory: sliding window, auto-summarization, persistence
- Lightweight RAG: VectorStoreBase, NumpyVectorStore, SimpleRAG
- Enhanced TokenCounter: latency, cost, success rate, callback hooks
- Custom exceptions: NerifError hierarchy (ProviderError, FormatError, etc.)
- CLI tools: nerif check, nerif test-model, nerif models
- py.typed marker for IDE/mypy support

Dependency cleanup:
- Core deps reduced from 4 to 1 (httpx only)
- Removed: python-dotenv (unused), prettytable (replaced with built-in)
- numpy moved to optional [rag] group
- Dev deps: removed pip-tools, ipython, isort (redundant with ruff)

Includes 114 unit tests + 23 integration tests, bilingual docs (EN/CN).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
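
The sliding-window behaviour listed for ConversationMemory in v1.2.0 can be pictured with a short sketch. It assumes the window simply keeps the most recent N messages; `SlidingWindowMemory` is a hypothetical class built on collections.deque and says nothing about how Nerif implements summarization or persistence.

```python
from collections import deque


class SlidingWindowMemory:
    """Keep only the most recent `max_messages` chat messages."""

    def __init__(self, max_messages: int) -> None:
        if max_messages < 1:
            raise ValueError("max_messages must be at least 1")
        # A bounded deque drops the oldest entry automatically when full.
        self._messages: deque[dict[str, str]] = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

    def messages(self) -> list[dict[str, str]]:
        return list(self._messages)


if __name__ == "__main__":
    memory = SlidingWindowMemory(max_messages=2)
    memory.add("user", "first")
    memory.add("assistant", "second")
    memory.add("user", "third")
    print(memory.messages())
```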
release: v1.1.0 - streaming, async, retry, pydantic, optional embedding (#56)

* refactor: remove litellm and openai dependencies, use httpx directly

Replace litellm and openai SDK with direct httpx HTTP calls for all API interactions (chat, embeddings, audio, vision). Add nerif_native Rust extension for performance-critical operations (base64, image compression).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add Anthropic and Gemini provider support coverage

* release: v1.1.0 - streaming, async, retry, pydantic, optional embedding

v1.1.0 adds five major features:
- Streaming responses: stream_chat() and astream_chat() for real-time output
- Async support: achat(), aembed(), astream_chat() with native async/await
- Retry with exponential backoff: RetryConfig with jitter, Retry-After support
- Pydantic structured output: response_model parameter for type-safe responses
- Optional embedding: nerif() works without embedding model via text fallback

Also includes:
- Feature subpackages: nerif[asr], nerif[tts], nerif[img-gen]
- 121 new tests (331 total passing)
- 5 new example files (18-22)
- Bilingual documentation updates (EN + ZH)
- Updated CLAUDE.md with release workflow

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: resolve ruff lint errors in examples and image_compress

- Fix import sorting and unused imports in examples (15, 18, 20, 21, 22)
- Move PIL import above logger in image_compress.py to fix E402
- Update CLAUDE.md with pre-commit checklist (lint includes examples/)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
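
Retry with exponential backoff and jitter, as listed for v1.1.0, generally follows the shape below. This is a generic sketch of the technique using "full jitter", not Nerif's RetryConfig: `backoff_delay` and `retry` are hypothetical names, ConnectionError stands in for whatever counts as a transient error, and Retry-After handling is left out.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter delay for a 0-based attempt: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def retry(
    func: Callable[[], T],
    max_attempts: int = 4,
    base: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on ConnectionError with exponentially growing jittered waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the last error
            sleep(backoff_delay(attempt, base))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller can substitute a list's `append` to record the computed delays.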
release: v0.12.0 - documentation updates and version bump (#55)

* docs: update documentation for v0.12 features and add Chinese translations

- Update Docusaurus config: add zh-Hans locale, fix placeholder URLs, add locale dropdown, update footer links
- Update English docs for v0.12: add MultiModalMessage, ToolDefinition, ToolCallResult, structured output, VideoModel, OllamaEmbeddingModel, FormatVerifierStringList, FormatVerifierJson, NerifFormat.json_parse()
- Rename nerif-agent.md to nerif-model.md with updated API signatures
- Create nerif-agent-framework.md documenting NerifAgent and Tool classes
- Update architecture docs with three-tier matching and agent framework
- Add explanatory text to examples 10-13 (tool calling, structured output, multi-modal, agent)
- Update intro with v0.12 feature list
- Add complete zh-Hans translations for all 27 documentation files
- Verified build passes for both en and zh-Hans locales

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* release: Bump version 0.11.0 -> 0.12.0

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
feat: add multi-modal interface, tool calling, structured output, and agent module (v0.11.0) (#53)

- Unified model routing with get_model_response() supporting Anthropic/Gemini prefixes
- MultiModalMessage builder for text + images + audio + video input
- Tool calling support in SimpleChatModel (tools, tool_choice params)
- Structured output / JSON mode (response_format param)
- FormatVerifierJson and NerifFormat.json_parse() for robust JSON extraction
- NerifAgent: ReAct-style agent with tool registration and execution loop
- VideoModel for video understanding tasks
- Extended MessageType with AUDIO_PATH/URL/BASE64 and VIDEO_PATH/URL
- Backward compatible: all existing APIs preserved
- Version bump to 0.11.0

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
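
"Robust JSON extraction" from model output, as mentioned for v0.11.0, usually means pulling a JSON value out of surrounding chatter. One way to do that with only the standard library is sketched here. `extract_json` is a hypothetical helper, not NerifFormat.json_parse(); it returns the first span that decodes, which is a simplifying assumption.

```python
import json
from typing import Any


def extract_json(text: str) -> Any:
    """Return the first JSON object or array embedded in free-form text."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            continue  # not valid JSON from this position; keep scanning
    raise ValueError("no JSON object or array found")


if __name__ == "__main__":
    reply = 'Here is the result: {"label": "cat", "scores": [0.9, 0.1]} Hope that helps.'
    print(extract_json(reply))
```

Because the scan stops at the first decodable span, a bracketed fragment earlier in the text (such as a citation like [1]) would be returned before a later object; a stricter variant could require a dict.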