Description
Self Checks
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report (Language Policy).
- Non-English title submissions will be closed directly (Language Policy).
- Please do not modify this template :) and fill in all the required fields.
RAGFlow workspace code commit ID
none
RAGFlow image version
v0.24.0
Other environment information
Docker Compose deployment
Actual behavior
If the tag set is disabled, the document parses correctly; however, when the tag set is enabled, the following error occurs:
Traceback (most recent call last):
File "/ragflow/rag/svr/task_executor.py", line 1214, in handle_task
await do_handle_task(task)
File "/ragflow/common/connection_utils.py", line 74, in async_wrapper
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/rag/svr/task_executor.py", line 1111, in do_handle_task
chunks = await build_chunks(task, progress_callback)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/common/connection_utils.py", line 74, in async_wrapper
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/rag/svr/task_executor.py", line 473, in build_chunks
if settings.retriever.tag_content(tenant_id, kb_ids, d, all_tags, topn_tags=topn_tags, S=S) and len(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/rag/nlp/search.py", line 569, in tag_content
res = self.dataStore.search([], [], {}, [match_txt], OrderByExpr(), 0, 0, idx_nm, kb_ids, ["tag_kwd"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/rag/utils/infinity_conn.py", line 268, in search
kb_res, extra_result = builder.option({"total_hits_count": True}).to_df()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/.venv/lib/python3.12/site-packages/infinity/remote_thrift/table.py", line 463, in to_df
return self.query_builder.to_df()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/.venv/lib/python3.12/site-packages/infinity/remote_thrift/query_builder.py", line 589, in to_df
data_dict, data_type_dict, extra_result = self.to_result()
^^^^^^^^^^^^^^^^
File "/ragflow/.venv/lib/python3.12/site-packages/infinity/remote_thrift/query_builder.py", line 585, in to_result
return self._table._execute_query(query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/.venv/lib/python3.12/site-packages/infinity/remote_thrift/table.py", line 557, in _execute_query
raise InfinityException(res.error_code, res.error_msg)
infinity.common.InfinityException: (3052, 'Trying to match: 鹤^0.04059786365792243 鹤^0.04059786365792243 巴^0.03909770431124826 蟒^0.02756801739539204 肚^0.027065242438614953 百^0.0265636021668023 力^0.026065136207498844 利^0.026065136207498844 龙^0.026065136207498844 血^0.026065136207498844 条^0.025575324656872893 剑^0.025575324656872893 心^0.025575324656872893 人^0.024365515150344876 8^0.01490820835061202 (五 OR (000858)^0.2)^0.01490820835061202 ,^0.01490820835061202 ,^0.01490820835061202 (金 OR (000402)^0.2)^0.01490820835061202 。^0.01490820835061202 ?^0.01490820835061202 雯^0.014779565265727402 壹^0.014287490112188465 喙^0.01378400869769602 斐^0.01378400869769602 喙^0.01378400869769602 蒸^0.013532621219307477 锋^0.013532621219307477 丽^0.013532621219307477 巨^0.01328180108340115 on fields: docnm@ft_docnm_rag_coarse^10,docnm@ft_docnm_rag_fine^5,important_keywords@ft_important_keywords_rag_coarse^30,important_keywords@ft_important_keywords_rag_fine^20,questions@ft_questions_rag_fine^20,content@ft_content_rag_coarse^2,content@ft_content_rag_fine failed.@src/planner/bound_select_statement_impl.cpp:276')
Expected behavior
Tagging completes successfully and the document is parsed without errors.
Steps to reproduce
Enable Infinity.
Create a Tag Set Dataset and verify that the Tag Set is parsed correctly.
Create a Document Dataset.
Upload documents using the "Book" chunking method; parsing proceeds successfully.
Enable the Tag Set and re-parse; the process fails.
Additional information
After the error occurred, some documents could be successfully re-parsed, but the majority could not.
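The match expression in the error message contains several terms that are punctuation only (for example `,`, `。` and `?`). Whether these terms are what triggers error 3052 is unconfirmed. As a way to compare failing and succeeding documents, here is a minimal diagnostic sketch that lists such terms from a copied match expression. The helper `punctuation_only_terms` is hypothetical and is not part of RAGFlow or the Infinity SDK. It only inspects bare `term^weight` tokens and skips parenthesized `OR` groups.

```python
import unicodedata


def punctuation_only_terms(match_expr: str) -> list[str]:
    """Return bare terms of a weighted match expression that are punctuation only.

    Hypothetical diagnostic helper: it assumes whitespace-separated
    `term^weight` tokens, as they appear in the error message above.
    """
    flagged = []
    for token in match_expr.split():
        term, sep, weight = token.rpartition("^")
        # Skip tokens without a weight and fragments of parenthesized OR-groups.
        if not sep or not term or "(" in term or ")" in term:
            continue
        try:
            float(weight)
        except ValueError:
            continue
        # Unicode general categories starting with "P" are punctuation.
        if all(unicodedata.category(ch).startswith("P") for ch in term):
            flagged.append(term)
    return flagged


# Shortened excerpt of the expression from the error message (weights truncated).
sample = "鹤^0.0406 8^0.0149 (五 OR (000858)^0.2)^0.0149 。^0.0149 雯^0.0148"
print(punctuation_only_terms(sample))
```

Running this against the expression from a failing document and from one that re-parses successfully may show whether the punctuation terms correlate with the failure.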