Enhanced document analysis with Markdown parsing and size statistics #1
Open
- Add SectionDoc for hierarchical Markdown section parsing
- Add FlexDoc as unified interface for TextDoc, TextNode, and SectionDoc views
- Implement thread-safe lazy loading with synchronized decorator
- Add comprehensive tests for all new components
- Update documentation with examples and API reference

Features:
- Parse Markdown into hierarchical section tree structure
- Navigate sections by level, title, or path
- Cross-reference between token, div, and section views
- Smart section-based document chunking
- Thread-safe concurrent access to all views

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
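The thread-safe lazy loading mentioned in this commit can be pictured roughly as follows. This is a minimal sketch using only the standard library, not the project's actual code: the `LazyViews` class, its `sections` method, and the `build_count` counter are hypothetical names chosen for illustration, and the real `synchronized` decorator may differ.

```python
import threading
from functools import wraps


def synchronized(method):
    """Serialize calls to `method` per instance using the instance's lock."""

    @wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            return method(self, *args, **kwargs)

    return wrapper


class LazyViews:
    """Hypothetical holder that builds an expensive view once, on first use."""

    def __init__(self, text: str) -> None:
        self._text = text
        # Re-entrant, so one synchronized method may call another.
        self._lock = threading.RLock()
        self._sections: list[str] | None = None
        self.build_count = 0  # present only to make the behavior observable

    @synchronized
    def sections(self) -> list[str]:
        # Built at most once, because the whole check-and-build runs under the lock.
        if self._sections is None:
            self.build_count += 1
            self._sections = [
                line for line in self._text.splitlines() if line.startswith("#")
            ]
        return self._sections
```

Holding the lock around the whole check-and-build step trades a little contention for simplicity: concurrent callers wait for the first build instead of racing to repeat it.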

- Add explicit type annotations for lists and dicts
- Import override decorator for __repr__ methods
- Fix Callable type hint in test_thread_utils.py
- Ensure all type checking passes in basedpyright

- Remove override decorator usage (requires Python 3.12+)
- Add reportImplicitOverride = false to basedpyright config
- Ensure compatibility with Python 3.11, 3.12, and 3.13

- Use pyright: ignore comments for implicit override warnings
- Maintain Python 3.11 compatibility without override decorator
- Keep all basedpyright checks enabled as per best practices
…ff-section-iteration

- Add typing_extensions dependency for @override decorator
- Use @override decorators for all overridden methods
- Simplify docstrings by removing redundant Args/Returns sections
- Keep docstrings concise as per guidelines
- All tests and linting pass
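For readers unfamiliar with the decorator, here is a small sketch of how `override` can be imported across Python versions. The `Node` and `SectionNode` classes are hypothetical, and the final no-op fallback exists only so the sketch runs where `typing_extensions` is not installed; the project itself declares the dependency instead.

```python
try:
    from typing import override  # standard library on Python 3.12+
except ImportError:
    try:
        from typing_extensions import override  # backport for older versions
    except ImportError:

        def override(method):
            """No-op stand-in so this sketch runs without the backport."""
            return method


class Node:
    def describe(self) -> str:
        return "node"


class SectionNode(Node):
    @override  # a type checker reports an error if Node has no describe()
    def describe(self) -> str:
        return "section"
```

The decorator has no effect on runtime behavior. Its value is that a checker such as basedpyright can flag a method marked `@override` that no longer matches anything in a base class.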

- Add key properties/methods summaries to class docstrings
- Update README.md examples to use correct method names
- Add thread safety example to README
- Fix incorrect method names that don't exist in implementation
- Delete API.md as all content is now properly integrated

- Create analyze_doc.py CLI tool for section-by-section analysis
- Shows hierarchical tree view with statistics per section
- Includes paragraphs, sentences, words, and reading time
- Supports both tree and flat table output formats
- Add example to README.md with sample output
- Works with files or stdin input
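The per-section statistics listed above could be computed along these lines. This is a rough sketch under stated assumptions rather than what `analyze_doc.py` does: the `TextStats` type and `text_stats` function are hypothetical, the sentence splitter is a naive regex, and the 225 words-per-minute reading speed is an assumed constant.

```python
import re
from dataclasses import dataclass

WORDS_PER_MINUTE = 225  # assumed average reading speed, not taken from the project


@dataclass
class TextStats:
    paragraphs: int
    sentences: int
    words: int
    reading_minutes: float


def text_stats(text: str) -> TextStats:
    """Count paragraphs, sentences, and words, and estimate reading time."""
    # Paragraphs are runs of text separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = text.split()
    return TextStats(
        paragraphs=len(paragraphs),
        sentences=len(sentences),
        words=len(words),
        reading_minutes=len(words) / WORDS_PER_MINUTE,
    )
```

Applying a function like this to each section's text, then summing over child sections, would give the kind of hierarchical totals a tree view needs.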

- Remove redundant Args/Returns sections where obvious from signatures
- Keep docstrings concise as per Python coding guidelines
- Maintain clear documentation while reducing verbosity

- Simplified README.md by removing large code/output blocks
- Replaced verbose examples with concise descriptions and links
- Enhanced analyze_doc.py to use rich library for colorized output
- Added proper type checking support for optional rich dependency
- Maintained backwards compatibility when rich is not installed

- Added inline script dependencies (PEP 723) to all example scripts
- Removed sys.path.insert workarounds in favor of proper dependencies
- Added rich as a direct dependency for analyze_doc.py
- Removed conditional rich import logic - script now requires rich
- Updated README to remove pip install instructions
- Added pyright ignore comments for external dependencies

- Updated SectionDoc to track code fence positions
- Headers inside code blocks are now properly ignored
- Added tests for code block handling
- Fixed rich styling issue in analyze_doc.py
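A manual fence-tracking approach of the kind this commit describes might look like the sketch below. It is an illustration, not the project's code: `find_headers` is a hypothetical function, and it is deliberately simplified (ATX headers only, no setext headers, no stripping of closing `#` sequences, no indented code blocks).

```python
import re

# An opening or closing fence: up to 3 spaces, then 3+ backticks or tildes.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")
# An ATX header: up to 3 spaces, 1-6 '#' characters, whitespace, then the title.
HEADER = re.compile(r"^ {0,3}(#{1,6})\s+(.+?)\s*$")


def find_headers(markdown: str) -> list[tuple[int, str]]:
    """Return (level, title) pairs for headers outside fenced code blocks."""
    headers: list[tuple[int, str]] = []
    open_fence: str | None = None  # marker that opened the current block, if any
    for line in markdown.splitlines():
        fence = FENCE.match(line)
        if fence:
            marker = fence.group(1)
            if open_fence is None:
                open_fence = marker
            elif marker[0] == open_fence[0] and len(marker) >= len(open_fence):
                # Same fence character and at least as long: the block closes.
                open_fence = None
            continue
        if open_fence is None:
            header = HEADER.match(line)
            if header:
                headers.append((len(header.group(1)), header.group(2)))
    return headers
```

Hand-rolled tracking like this has to re-implement Markdown's block rules one case at a time, and delegating to a full parser avoids exactly that.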

- Replaced regex-based header detection with marko markdown parser
- Now properly uses marko's DOM traversal to find headers
- Headers inside code blocks are automatically ignored by parser
- Maintains backward compatibility with all existing tests
- No longer needs manual code fence tracking

This ensures proper markdown parsing that respects all markdown rules including code blocks, inline code, and other edge cases.

- Added test_markdown.md with various markdown features including:
  - Code blocks in multiple languages (bash, yaml, python, shell)
  - Blockquotes and nested blockquotes
  - Numbered and bulleted lists
  - Tables with # symbols in cells
  - Inline code with # symbols
- Auto-formatted all markdown files with flowmark
- Verified proper header parsing ignores all non-header # symbols

- Remove obsolete brief parameter from test_insert_size_info tests
- Update test to match current API after merge with main
Summary
- `insert_size_info.py` example for inserting document statistics after section headers
- `util/read_time.py` module

Key Changes
Markdown Parsing Enhancement
- `marko` (via `flowmark`) for proper Markdown AST parsing in `SectionDoc`

Core Features
- `SectionDoc` class:
- `FlexDoc` class:

New Examples & Utilities
- `insert_size_info.py` example:
  - `<div class="size-info">` elements after section headers
- `read_time.py` utility module:
  - `prettyfmt` for human-readable time display
- `analyze_doc.py` enhancement:
  - `rich` formatting for colorized tree and table output

Documentation & Examples
Testing
- `insert_size_info` functionality
- `read_time` API

Test Plan
- `make lint` - all checks pass
- `make test` - all 150 tests pass
- `analyze_doc.py` on various Markdown files
- `insert_size_info.py` on README and sample documents

🤖 Generated with Claude Code