18 changes: 4 additions & 14 deletions README.md
@@ -271,21 +271,11 @@ For downstream applications, please refer to our document about [API Details](./
- [Python API](./docs/APIS.md#api-python), how to use the program in other Python programs
- [HTTP API](./docs/APIS.md#api-http), how to communicate with a server with the program installed

<h2 id="todo">TODOs</h2>
<h2 id="sponsors">Sponsors</h2>

- [ ] Parse layout with DocLayNet based models, [PaddleX](https://github.com/PaddlePaddle/PaddleX/blob/17cc27ac3842e7880ca4aad92358d3ef8555429a/paddlex/repo_apis/PaddleDetection_api/object_det/official_categories.py#L81), [PaperMage](https://github.com/allenai/papermage/blob/9cd4bb48cbedab45d0f7a455711438f1632abebe/README.md?plain=1#L102), [SAM2](https://github.com/facebookresearch/sam2)

- [ ] Fix page rotation, table of contents, format of lists

- [ ] Fix pixel formula in old papers

- [ ] Async retry except KeyboardInterrupt

- [ ] Knuth–Plass algorithm for western languages

- [ ] Support non-PDF/A files

- [ ] Plugins of [Zotero](https://github.com/zotero/zotero) and [Obsidian](https://github.com/obsidianmd/obsidian-releases)
<a href="https://share.302.ai/tqTWfD">
<img width="50%" alt="image" src="https://github.com/user-attachments/assets/9c81d851-9560-4189-991a-f8036b8e8fc1" />
</a>

<h2 id="acknowledgement">Acknowledgements</h2>

3 changes: 2 additions & 1 deletion docs/ADVANCED.md
@@ -55,11 +55,12 @@ We've provided a detailed table on the required [environment variables](https://
|----------------------|----------------|-----------------------------------------------------------------------|----------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Google (Default)** | `google` | None | N/A | None |
| **Bing** | `bing` | None | N/A | None |
| **302.AI** | `302ai` | `X302AI_API_KEY`, `X302AI_MODEL` | `[Your Key]`, `Gemma-7B` | See [302.AI](https://share.302.ai/tqTWfD) |
| **OpenAI** | `openai` | `OPENAI_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_MODEL` | `https://api.openai.com/v1`, `[Your Key]`, `gpt-4o-mini` | See [OpenAI](https://platform.openai.com/docs/overview) |
| **DeepL** | `deepl` | `DEEPL_AUTH_KEY` | `[Your Key]` | See [DeepL](https://support.deepl.com/hc/en-us/articles/360020695820-API-Key-for-DeepL-s-API) |
| **DeepLX** | `deeplx` | `DEEPLX_ENDPOINT` | `https://api.deepl.com/translate` | See [DeepLX](https://github.com/OwO-Network/DeepLX) |
| **Ollama** | `ollama` | `OLLAMA_HOST`, `OLLAMA_MODEL` | `http://127.0.0.1:11434`, `gemma2` | See [Ollama](https://github.com/ollama/ollama) |
| **Xinference** | `xinference` | `XINFERENCE_HOST`, `XINFERENCE_MODEL` | `http://127.0.0.1:9997`, `gemma-2-it` | See [Xinference](https://github.com/xorbitsai/inference) |
| **OpenAI** | `openai` | `OPENAI_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_MODEL` | `https://api.openai.com/v1`, `[Your Key]`, `gpt-4o-mini` | See [OpenAI](https://platform.openai.com/docs/overview) |
| **AzureOpenAI** | `azure-openai` | `AZURE_OPENAI_BASE_URL`, `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_MODEL` | `[Your Endpoint]`, `[Your Key]`, `gpt-4o-mini` | See [Azure OpenAI](https://learn.microsoft.com/zh-cn/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Cjavascript-keyless%2Ctypescript-keyless%2Cpython&pivots=programming-language-python) |
| **Zhipu** | `zhipu` | `ZHIPU_API_KEY`, `ZHIPU_MODEL` | `[Your Key]`, `glm-4-flash` | See [Zhipu](https://open.bigmodel.cn/dev/api/thirdparty-frame/openai-sdk) |
| **ModelScope** | `modelscope` | `MODELSCOPE_API_KEY`, `MODELSCOPE_MODEL` | `[Your Key]`, `Qwen/Qwen2.5-Coder-32B-Instruct` | See [ModelScope](https://www.modelscope.cn/docs/model-service/API-Inference/intro) |
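
Reviewer note: the new `302ai` row needs `X302AI_API_KEY` and optionally `X302AI_MODEL`. A minimal sketch of checking those variables before starting a translation might look like the following. The helper name `resolve_302ai_settings` is hypothetical and not part of this PR; only the variable names and the `Gemma-7B` default come from the table.

```python
import os

# Default model taken from the table row above.
DEFAULT_MODEL = "Gemma-7B"


def resolve_302ai_settings(environ=None):
    """Return (api_key, model) for the 302.AI service.

    Hypothetical helper: reads from the given mapping, or from the
    process environment when no mapping is passed.
    """
    environ = os.environ if environ is None else environ
    api_key = environ.get("X302AI_API_KEY")
    if not api_key:
        # The key has no default, so fail early with a clear message.
        raise ValueError("X302AI_API_KEY is not set")
    # Fall back to the documented default when no model is configured.
    model = environ.get("X302AI_MODEL") or DEFAULT_MODEL
    return api_key, model


if __name__ == "__main__":
    print(resolve_302ai_settings({"X302AI_API_KEY": "sk-example"}))
```

Passing a plain dict instead of relying on `os.environ` keeps the check easy to exercise in tests.
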
3 changes: 2 additions & 1 deletion pdf2zh/converter.py
@@ -39,6 +39,7 @@
TencentTranslator,
XinferenceTranslator,
ZhipuTranslator,
X302AITranslator,
)

log = logging.getLogger(__name__)
@@ -159,7 +160,7 @@ def __init__(
if not envs:
envs = {}
for translator in [GoogleTranslator, BingTranslator, DeepLTranslator, DeepLXTranslator, OllamaTranslator, XinferenceTranslator, AzureOpenAITranslator,
OpenAITranslator, ZhipuTranslator, ModelScopeTranslator, SiliconTranslator, GeminiTranslator, AzureTranslator, TencentTranslator, DifyTranslator, AnythingLLMTranslator, ArgosTranslator, GrokTranslator, GroqTranslator, DeepseekTranslator, OpenAIlikedTranslator, QwenMtTranslator,]:
OpenAITranslator, ZhipuTranslator, ModelScopeTranslator, SiliconTranslator, GeminiTranslator, AzureTranslator, TencentTranslator, DifyTranslator, AnythingLLMTranslator, ArgosTranslator, GrokTranslator, GroqTranslator, DeepseekTranslator, OpenAIlikedTranslator, QwenMtTranslator, X302AITranslator]:
if service_name == translator.name:
self.translator = translator(lang_in, lang_out, service_model, envs=envs, prompt=prompt, ignore_cache=ignore_cache)
if not self.translator:
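
Reviewer note: the hunk above selects a translator by comparing `service_name` against each class's `name` attribute, so a new service only becomes reachable once its class is appended to the list. The sketch below illustrates that lookup with stub classes. `StubGoogleTranslator`, `Stub302AITranslator`, and `pick_translator` are illustrative names, not the project's code, and the error raised for an unknown service is an assumption.

```python
class StubTranslator:
    """Stand-in base class; real translators take more arguments."""

    name = "base"

    def __init__(self, lang_in, lang_out, model=None):
        self.lang_in = lang_in
        self.lang_out = lang_out
        self.model = model


class StubGoogleTranslator(StubTranslator):
    name = "google"


class Stub302AITranslator(StubTranslator):
    name = "302ai"


# Every selectable service has to be listed here, mirroring the list in the diff.
TRANSLATORS = [StubGoogleTranslator, Stub302AITranslator]


def pick_translator(service_name, lang_in, lang_out, model=None):
    """Instantiate the first class whose `name` matches `service_name`."""
    for translator_cls in TRANSLATORS:
        if service_name == translator_cls.name:
            return translator_cls(lang_in, lang_out, model)
    # Assumed behaviour for this sketch: reject unknown services loudly.
    raise ValueError(f"Unsupported translation service: {service_name}")


if __name__ == "__main__":
    chosen = pick_translator("302ai", "en", "zh")
    print(type(chosen).__name__)
```

Because this PR touches the same kind of list in `converter.py`, `gui.py`, and `pdf2zh.py`, forgetting one of them would leave the service unavailable on that code path.
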
3 changes: 3 additions & 0 deletions pdf2zh/gui.py
@@ -42,6 +42,7 @@
DeepseekTranslator,
OpenAIlikedTranslator,
QwenMtTranslator,
X302AITranslator,
)
from babeldoc.docvision.doclayout import OnnxModel
from babeldoc import __version__ as babeldoc_version
@@ -73,6 +74,7 @@
"DeepSeek": DeepseekTranslator,
"OpenAI-liked": OpenAIlikedTranslator,
"Ali Qwen-Translation": QwenMtTranslator,
"302.AI": X302AITranslator,
}

# The following variables associate strings with specific languages
@@ -382,6 +384,7 @@ def babeldoc_translate_file(**kwargs):
DeepseekTranslator,
OpenAIlikedTranslator,
QwenMtTranslator,
X302AITranslator,
]:
if kwargs["service"] == translator.name:
translator = translator(
2 changes: 2 additions & 0 deletions pdf2zh/pdf2zh.py
@@ -383,6 +383,7 @@ def yadt_main(parsed_args) -> int:
DeepseekTranslator,
OpenAIlikedTranslator,
QwenMtTranslator,
X302AITranslator,
)

for translator in [
@@ -408,6 +409,7 @@ def yadt_main(parsed_args) -> int:
DeepseekTranslator,
OpenAIlikedTranslator,
QwenMtTranslator,
X302AITranslator,
]:
if service_name == translator.name:
translator = translator(
29 changes: 29 additions & 0 deletions pdf2zh/translator.py
@@ -625,6 +625,35 @@ def __init__(
self.add_cache_impact_parameters("prompt", self.prompt("", self.prompttext))


class X302AITranslator(OpenAITranslator):
    # https://doc.302.ai/
    name = "302ai"
    envs = {
        "X302AI_API_KEY": None,
        "X302AI_MODEL": "Gemma-7B",
    }
    CustomPrompt = True

    def __init__(
        self, lang_in, lang_out, model, envs=None, prompt=None, ignore_cache=False
    ):
        self.set_envs(envs)
        base_url = "https://api.302.ai/v1"
        api_key = self.envs["X302AI_API_KEY"]
        if not model:
            model = self.envs["X302AI_MODEL"]
        super().__init__(
            lang_in,
            lang_out,
            model,
            base_url=base_url,
            api_key=api_key,
            ignore_cache=ignore_cache,
        )
        self.prompttext = prompt
        self.add_cache_impact_parameters("prompt", self.prompt("", self.prompttext))

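Reviewer note: `X302AITranslator` reuses the OpenAI-compatible parent and only pins the endpoint, the API key source, and the model fallback. The sketch below mirrors that wiring without any network client so the precedence can be checked in isolation. `FakeOpenAITranslator` and `Fake302AITranslator` are stand-ins, and the `set_envs` merge order shown here (class defaults, then process environment, then the explicit dict) is an assumption about the real base class, not something this diff confirms.

```python
import os
from copy import copy


class FakeOpenAITranslator:
    """Stand-in for OpenAITranslator: stores settings, creates no client."""

    envs = {}

    def __init__(self, lang_in, lang_out, model, base_url=None, api_key=None):
        self.lang_in = lang_in
        self.lang_out = lang_out
        self.model = model
        self.base_url = base_url
        self.api_key = api_key

    def set_envs(self, envs):
        # Assumed precedence: class defaults < os.environ < explicit dict.
        merged = copy(self.envs)
        for key in merged:
            if key in os.environ:
                merged[key] = os.environ[key]
        if envs:
            merged.update(envs)
        self.envs = merged


class Fake302AITranslator(FakeOpenAITranslator):
    name = "302ai"
    envs = {"X302AI_API_KEY": None, "X302AI_MODEL": "Gemma-7B"}

    def __init__(self, lang_in, lang_out, model, envs=None):
        self.set_envs(envs)
        super().__init__(
            lang_in,
            lang_out,
            # An explicit model wins; otherwise use the configured one.
            model or self.envs["X302AI_MODEL"],
            base_url="https://api.302.ai/v1",
            api_key=self.envs["X302AI_API_KEY"],
        )


if __name__ == "__main__":
    t = Fake302AITranslator("en", "zh", None, envs={"X302AI_API_KEY": "sk-example"})
    print(t.base_url, t.model)
```
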

class GeminiTranslator(OpenAITranslator):
# https://ai.google.dev/gemini-api/docs/openai
name = "gemini"