[Bug]: After enabling the tag set in RAGFlow, an error occurs when tagging chunks. #13729

@yinmao

Description

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

none

RAGFlow image version

v0.24.0

Other environment information

Docker compose Deployment

Actual behavior

If the tag set is disabled, the document parses correctly; however, when the tag set is enabled, the following error occurs:
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 1214, in handle_task
    await do_handle_task(task)
  File "/ragflow/common/connection_utils.py", line 74, in async_wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/rag/svr/task_executor.py", line 1111, in do_handle_task
    chunks = await build_chunks(task, progress_callback)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/common/connection_utils.py", line 74, in async_wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/rag/svr/task_executor.py", line 473, in build_chunks
    if settings.retriever.tag_content(tenant_id, kb_ids, d, all_tags, topn_tags=topn_tags, S=S) and len(
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/rag/nlp/search.py", line 569, in tag_content
    res = self.dataStore.search([], [], {}, [match_txt], OrderByExpr(), 0, 0, idx_nm, kb_ids, ["tag_kwd"])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/rag/utils/infinity_conn.py", line 268, in search
    kb_res, extra_result = builder.option({"total_hits_count": True}).to_df()
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/.venv/lib/python3.12/site-packages/infinity/remote_thrift/table.py", line 463, in to_df
    return self.query_builder.to_df()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/.venv/lib/python3.12/site-packages/infinity/remote_thrift/query_builder.py", line 589, in to_df
    data_dict, data_type_dict, extra_result = self.to_result()
                                              ^^^^^^^^^^^^^^^^
  File "/ragflow/.venv/lib/python3.12/site-packages/infinity/remote_thrift/query_builder.py", line 585, in to_result
    return self._table._execute_query(query)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/.venv/lib/python3.12/site-packages/infinity/remote_thrift/table.py", line 557, in _execute_query
    raise InfinityException(res.error_code, res.error_msg)
infinity.common.InfinityException: (3052, 'Trying to match: 鹤^0.04059786365792243 鹤^0.04059786365792243 巴^0.03909770431124826 蟒^0.02756801739539204 肚^0.027065242438614953 百^0.0265636021668023 力^0.026065136207498844 利^0.026065136207498844 龙^0.026065136207498844 血^0.026065136207498844 条^0.025575324656872893 剑^0.025575324656872893 心^0.025575324656872893 人^0.024365515150344876 8^0.01490820835061202 (五 OR (000858)^0.2)^0.01490820835061202 ,^0.01490820835061202 ,^0.01490820835061202 (金 OR (000402)^0.2)^0.01490820835061202 。^0.01490820835061202 ?^0.01490820835061202 雯^0.014779565265727402 壹^0.014287490112188465 喙^0.01378400869769602 斐^0.01378400869769602 喙^0.01378400869769602 蒸^0.013532621219307477 锋^0.013532621219307477 丽^0.013532621219307477 巨^0.01328180108340115 on fields: docnm@ft_docnm_rag_coarse^10,docnm@ft_docnm_rag_fine^5,important_keywords@ft_important_keywords_rag_coarse^30,important_keywords@ft_important_keywords_rag_fine^20,questions@ft_questions_rag_fine^20,content@ft_content_rag_coarse^2,content@ft_content_rag_fine failed.@src/planner/bound_select_statement_impl.cpp:276')

Expected behavior

Tagging completes without errors when the tag set is enabled.

Steps to reproduce

1. Enable Infinity.
2. Create a Tag Set Dataset and verify that the Tag Set is parsed correctly.
3. Create a Document Dataset.
4. Upload documents using the "Book" chunking method; parsing proceeds successfully.
5. Enable the Tag Set and re-parse; the process fails.

Additional information

After the error occurred, some documents could be successfully re-parsed, but the majority could not.
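
Since the failure depends on the document, the query text embedded in the error message may be worth inspecting. Below is a minimal sketch (not part of RAGFlow or the Infinity SDK) that splits the "Trying to match: ... on fields:" part of the message into term/weight pairs and flags terms made up only of punctuation, such as `,` and `。` in the message above. The helper names `parse_match_terms` and `is_punctuation` are hypothetical, and whether punctuation terms are what actually triggers error 3052 is an assumption that the traceback does not confirm.

```python
import unicodedata

START_MARKER = "Trying to match: "
END_MARKER = " on fields:"


def parse_match_terms(error_msg):
    """Return (term, weight) pairs found in an Infinity match-error message.

    Parenthesised OR-groups such as "(五 OR (000858)^0.2)^0.0149" are
    skipped; this sketch only looks at plain "term^weight" tokens.
    """
    start = error_msg.index(START_MARKER) + len(START_MARKER)
    end = error_msg.index(END_MARKER)
    terms = []
    for token in error_msg[start:end].split():
        term, sep, weight = token.rpartition("^")
        if not sep or "(" in term or ")" in term:
            continue  # no weight, or a fragment of an OR-group
        try:
            terms.append((term, float(weight)))
        except ValueError:
            continue  # the part after "^" was not a number
    return terms


def is_punctuation(term):
    """True if every character belongs to a Unicode punctuation category."""
    return bool(term) and all(
        unicodedata.category(ch).startswith("P") for ch in term
    )


# Shortened excerpt of the error message from this report.
sample = (
    "Trying to match: 鹤^0.04059786365792243 8^0.01490820835061202 "
    "(五 OR (000858)^0.2)^0.01490820835061202 ,^0.01490820835061202 "
    "。^0.01490820835061202 on fields: content@ft_content_rag_fine failed."
)

terms = parse_match_terms(sample)
suspicious = [term for term, _ in terms if is_punctuation(term)]
print(suspicious)
```

Running the same check on the error message of a document that fails and on the query of one that re-parses successfully could show whether the two differ in this respect.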


Labels

  • infinity: pull requests that involve Infinity (DB)
  • bug: something isn't working; pull requests that fix a bug
