
[Bug]: error ragflow v0.23.0 #13725

@lhxxrds

Description


Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

34567

RAGFlow image version

v0.23.0

Other environment information

ragflow-gpu-1  | ]
ragflow-gpu-1  | [2026-03-20 15:32:02 +0800] [31] [INFO] 127.0.0.1:45890 GET /api/v1/datasets/e541a586242b11f1a26d7e8d48ea0d62/documents 1.1 200 2111 18646
ragflow-gpu-1  | 2026-03-20 15:32:02,236 ERROR    52 OpenAI async completion
ragflow-gpu-1  | Traceback (most recent call last):
ragflow-gpu-1  |   File "/ragflow/rag/llm/chat_model.py", line 481, in async_chat
ragflow-gpu-1  |     return await self._async_chat(history, gen_conf, **kwargs)
ragflow-gpu-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ragflow-gpu-1  |   File "/ragflow/rag/llm/chat_model.py", line 465, in _async_chat
ragflow-gpu-1  |     response = await self.async_client.chat.completions.create(model=self.model_name, messages=history, **gen_conf, **kwargs)
ragflow-gpu-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ragflow-gpu-1  |   File "/ragflow/.venv/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 2678, in create
ragflow-gpu-1  |     return await self._post(
ragflow-gpu-1  |            ^^^^^^^^^^^^^^^^^
ragflow-gpu-1  |   File "/ragflow/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1794, in post
ragflow-gpu-1  |     return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
ragflow-gpu-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ragflow-gpu-1  |   File "/ragflow/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1594, in request
ragflow-gpu-1  |     raise self._make_status_error_from_response(err.response) from None
ragflow-gpu-1  | openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 100000 tokens. However, you requested 211529 tokens in the messages, Please reduce the length of the messages.", 'type': 'BadRequestError', 'param': None, 'code': 400}
ragflow-gpu-1  | 2026-03-20 15:32:02,238 ERROR    52 async base giving up: **ERROR**: INVALID_REQUEST - Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 100000 tokens. However, you requested 211529 tokens in the messages, Please reduce the length of the messages.", 'type': 'BadRequestError', 'param': None, 'code': 400}
ragflow-gpu-1  | 2026-03-20 15:32:02,255 INFO     52 set_progress(dc4808cd242e11f19dfa7e8d48ea0d62), progress: None, progress_msg:
ragflow-gpu-1  | 2026-03-20 15:32:02,256 INFO     52
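The 400 in the log above is a context-window overflow: the request carried 211,529 tokens against the model's 100,000-token limit. A minimal sketch of the kind of guard that avoids this, assuming a naive whitespace word count as a stand-in for a real tokenizer (a real fix would use the model's own tokenizer; `trim_history` and `approx_tokens` are hypothetical names for illustration, not RAGFlow APIs):

```python
MAX_CONTEXT_TOKENS = 100_000  # limit reported in the 400 error above


def approx_tokens(message: dict) -> int:
    # Rough stand-in for a tokenizer: count whitespace-separated words.
    return len(message.get("content", "").split())


def trim_history(history: list[dict], limit: int = MAX_CONTEXT_TOKENS) -> list[dict]:
    """Drop the oldest non-system messages until the total fits the limit."""
    trimmed = list(history)
    total = sum(approx_tokens(m) for m in trimmed)
    while total > limit and len(trimmed) > 1:
        # Keep the system prompt (index 0) and drop the oldest turn after it.
        dropped = trimmed.pop(1)
        total -= approx_tokens(dropped)
    return trimmed
```

Applying a check like this before `chat.completions.create` would turn the hard 400 failure into a shorter (but still answerable) request.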

Actual behavior

Document parsing fails. The task submits a chat completion whose messages exceed the model's context window, and the worker gives up with:

openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 100000 tokens. However, you requested 211529 tokens in the messages, Please reduce the length of the messages.", 'type': 'BadRequestError', 'param': None, 'code': 400}

The full traceback is included under "Other environment information" above.

Expected behavior

Parsing should complete successfully: requests sent to the chat model should stay within its 100,000-token context limit (for example by splitting or truncating overly long content) instead of failing with a 400 BadRequestError.

Steps to reproduce


Additional information


Metadata

Assignees

No one assigned

    Labels

    🐞 bug: Something isn't working, pull request that fixes a bug.
