Improved html tag find/replace with selectolax. #2

Merged
jlevy merged 2 commits into main from new-html-tags
Aug 6, 2025

@jlevy jlevy commented Aug 6, 2025

No description provided.

@jlevy jlevy requested a review from Copilot August 6, 2025 03:48

Copilot AI left a comment

Pull Request Overview

This PR improves HTML tag find/replace by replacing the regex-only implementation with a hybrid regex and selectolax approach. The new implementation parses HTML more robustly while still tracking exact character offsets, so edits can be applied surgically to the original text.

Key changes:

  • Replaced regex-only parsing with a hybrid regex+selectolax approach for better robustness
  • Added comprehensive HTML rewriting functions with support for various quote styles and edge cases
  • Enhanced test coverage with 800+ lines of tests covering complex scenarios
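
To illustrate what "exact offset tracking" means here, the following is a minimal sketch, not the code from this PR. It assumes the standard library's `html.parser` in place of selectolax, and the class name `TagLocator` is hypothetical. It records the character span of every start tag so that later edits can be spliced into the original string.

```python
from html.parser import HTMLParser


class TagLocator(HTMLParser):
    """Record (tag, start, end) character offsets for each start tag.

    Hypothetical helper for illustration; uses html.parser, not selectolax.
    """

    def __init__(self, html: str) -> None:
        super().__init__()
        # Absolute offset at which each line begins, used to convert the
        # (line, column) pair from getpos() into a single string offset.
        self._line_starts = [0]
        for i, ch in enumerate(html):
            if ch == "\n":
                self._line_starts.append(i + 1)
        self.spans = []
        self.feed(html)
        self.close()

    def handle_starttag(self, tag, attrs):
        line, col = self.getpos()  # 1-based line, 0-based column
        start = self._line_starts[line - 1] + col
        raw = self.get_starttag_text()  # the tag exactly as written
        self.spans.append((tag, start, start + len(raw)))


html = '<p class="a">hi</p>\n<!-- <img src="x.png"> -->\n<img src="y.png">'
spans = TagLocator(html).spans
for tag, start, end in spans:
    print(tag, start, end, html[start:end])
```

Because the parser reports comments through a separate callback, the `<img>` inside the comment is never recorded, and slicing the original string with each span returns the tag text unchanged.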

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/html/test_html_tags.py | New comprehensive test suite covering all HTML tag manipulation functions |
| src/chopdiff/html/html_tags.py | New implementation with selectolax integration and robust HTML rewriting capabilities |
| src/chopdiff/html/html_find_tags.py | Removed old regex-only implementation |
| src/chopdiff/html/__init__.py | Updated imports to use new html_tags module |
| pyproject.toml | Added selectolax dependency |

Comments suppressed due to low confidence (1)

pyproject.toml:48

  • The version 0.3.32 of selectolax does not exist. The latest available version as of January 2025 was 0.3.21. Consider using "selectolax>=0.3.21" instead.
    "selectolax>=0.3.32",


Does robust parsing and surgical replacement, preserving the original HTML exactly
except for the replaced attribute values.

Copilot AI Aug 6, 2025

The docstring should mention that the function ignores tags inside HTML comments, as this is an important behavior documented in the tests.

Suggested change
Ignores tags that appear inside HTML comments; tags within comments are not rewritten.
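
The behavior described in this docstring can be sketched as follows. This is an illustration under stated assumptions rather than the function from `html_tags.py`: it uses the standard library's `html.parser` instead of selectolax, the names `rewrite_attr` and `_StartTags` are hypothetical, and it handles only quoted attribute values. It does not escape `new_value`, and it can be fooled by attribute-like text inside another attribute's value.

```python
import re
from html.parser import HTMLParser


class _StartTags(HTMLParser):
    """Collect (tag, absolute_offset, raw_text) for start tags outside comments."""

    def __init__(self, html):
        super().__init__()
        # Offsets where each line begins, to turn getpos() into a string index.
        self._line_starts = [0] + [m.end() for m in re.finditer("\n", html)]
        self.found = []
        self.feed(html)
        self.close()

    def handle_starttag(self, tag, attrs):
        line, col = self.getpos()
        offset = self._line_starts[line - 1] + col
        self.found.append((tag, offset, self.get_starttag_text()))


def rewrite_attr(html, tag, attr, new_value):
    """Replace the value of `attr` on every `tag`, leaving all other text as is."""
    # Capture only the value between double or single quotes.
    pattern = re.compile(
        r"\s" + re.escape(attr) + r"""\s*=\s*(?:"([^"]*)"|'([^']*)')""",
        re.IGNORECASE,
    )
    edits = []
    for name, start, raw in _StartTags(html).found:
        if name != tag:
            continue
        m = pattern.search(raw)
        if m:
            group = 1 if m.group(1) is not None else 2
            edits.append((start + m.start(group), start + m.end(group)))
    # Apply edits right to left so earlier offsets remain valid.
    for s, e in reversed(edits):
        html = html[:s] + new_value + html[e:]
    return html


src = '<a  href="old.html" >x</a><!-- <a href="old.html"> --><A HREF=\'old.html\'>y</A>'
out = rewrite_attr(src, "a", "href", "new.html")
print(out)
```

Only the characters between the quotes change: the doubled space, the original quote style, the tag and attribute casing, and the commented-out link all stay exactly as they were in the input.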

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@jlevy jlevy merged commit 01c6829 into main Aug 6, 2025
3 checks passed
@jlevy jlevy deleted the new-html-tags branch August 6, 2025 03:51

2 participants