
[Bug]: error ragflow v0.23.0 #13725

@lhxxrds

Description


Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

34567

RAGFlow image version

v0.23.0

Other environment information

ragflow-gpu-1  | ]
ragflow-gpu-1  | [2026-03-20 15:32:02 +0800] [31] [INFO] 127.0.0.1:45890 GET /api/v1/datasets/e541a586242b11f1a26d7e8d48ea0d62/documents 1.1 200 2111 18646
ragflow-gpu-1  | 2026-03-20 15:32:02,236 ERROR    52 OpenAI async completion
ragflow-gpu-1  | Traceback (most recent call last):
ragflow-gpu-1  |   File "/ragflow/rag/llm/chat_model.py", line 481, in async_chat
ragflow-gpu-1  |     return await self._async_chat(history, gen_conf, **kwargs)
ragflow-gpu-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ragflow-gpu-1  |   File "/ragflow/rag/llm/chat_model.py", line 465, in _async_chat
ragflow-gpu-1  |     response = await self.async_client.chat.completions.create(model=self.model_name, messages=history, **gen_conf, **kwargs)
ragflow-gpu-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ragflow-gpu-1  |   File "/ragflow/.venv/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 2678, in create
ragflow-gpu-1  |     return await self._post(
ragflow-gpu-1  |            ^^^^^^^^^^^^^^^^^
ragflow-gpu-1  |   File "/ragflow/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1794, in post
ragflow-gpu-1  |     return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
ragflow-gpu-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ragflow-gpu-1  |   File "/ragflow/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1594, in request
ragflow-gpu-1  |     raise self._make_status_error_from_response(err.response) from None
ragflow-gpu-1  | openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 100000 tokens. However, you requested 211529 tokens in the messages, Please reduce the length of the messages.", 'type': 'BadRequestError', 'param': None, 'code': 400}
ragflow-gpu-1  | 2026-03-20 15:32:02,238 ERROR    52 async base giving up: **ERROR**: INVALID_REQUEST - Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 100000 tokens. However, you requested 211529 tokens in the messages, Please reduce the length of the messages.", 'type': 'BadRequestError', 'param': None, 'code': 400}
ragflow-gpu-1  | 2026-03-20 15:32:02,255 INFO     52 set_progress(dc4808cd242e11f19dfa7e8d48ea0d62), progress: None, progress_msg:
ragflow-gpu-1  | 2026-03-20 15:32:02,256 INFO     52
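The 400 in the log above is a context-window overflow: the request carried 211,529 tokens against the model's 100,000-token limit. A minimal sketch of the kind of guard that avoids this, assuming a naive whitespace word count as a stand-in for a real tokenizer (a real fix would use the model's own tokenizer; `trim_history` and `approx_tokens` are hypothetical names for illustration, not RAGFlow APIs):

```python
MAX_CONTEXT_TOKENS = 100_000  # limit reported in the 400 error above


def approx_tokens(message: dict) -> int:
    # Rough stand-in for a tokenizer: count whitespace-separated words.
    return len(message.get("content", "").split())


def trim_history(history: list[dict], limit: int = MAX_CONTEXT_TOKENS) -> list[dict]:
    """Drop the oldest non-system messages until the total fits the limit."""
    trimmed = list(history)
    total = sum(approx_tokens(m) for m in trimmed)
    while total > limit and len(trimmed) > 1:
        # Keep the system prompt (index 0) and drop the oldest turn after it.
        dropped = trimmed.pop(1)
        total -= approx_tokens(dropped)
    return trimmed
```

Applying a check like this before `chat.completions.create` would turn the hard 400 failure into a shorter (but still answerable) request.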

Actual behavior

Document parsing fails. The task submits a chat completion whose messages exceed the model's context window, and the worker gives up with:

openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 100000 tokens. However, you requested 211529 tokens in the messages, Please reduce the length of the messages.", 'type': 'BadRequestError', 'param': None, 'code': 400}

The full traceback is included under "Other environment information" above.

Expected behavior

Parsing should complete successfully: requests sent to the chat model should stay within its 100,000-token context limit (for example by splitting or truncating overly long content) instead of failing with a 400 BadRequestError.

Steps to reproduce


Additional information


Metadata

Assignees

No one assigned

    Labels

    🐞 bug: Something isn't working, pull request that fixes a bug.
