feat: add Siliconflow FIM support (#63 )

Siliconflow now supports FIM completion in the standard format, hence this PR.
Fix: Repair the logic for obtaining the configuration file path. (#60 )
6 changed files with 510 additions and 34 deletions
--- a/15
+++ b/15
@@ -1,17 +1,22 @@
 FROM golang:alpine AS builder

 WORKDIR /app
+COPY . .

-ADD . .
+ENV GO111MODULE=on GOPROXY=https://goproxy.cn,direct
+RUN go mod download

-RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o override
+RUN CGO_ENABLED=0 go build -ldflags="-w -s" -o override

 FROM alpine:latest

-COPY --from=builder /app/override /usr/local/bin/override
+RUN apk --no-cache add ca-certificates
+
+COPY --from=builder /app/override /usr/local/bin/
+COPY config.json.example /app/config.json

 WORKDIR /app
-
-ENTRYPOINT ["/usr/local/bin/override"]
+VOLUME /app

 EXPOSE 8181
+CMD ["override"]
--- a/README.md
+++ b/README.md
@@ -6,6 +6,7 @@

 ```json
    "github.copilot.advanced": {
+        "debug.overrideCAPIUrl": "http://127.0.0.1:8181/v1",
        "debug.overrideProxyUrl": "http://127.0.0.1:8181",
        "debug.chatOverrideProxyUrl": "http://127.0.0.1:8181/v1/chat/completions",
        "authProvider": "github-enterprise"
@@ -27,46 +28,116 @@

 ```json
 {
-  "bind": "127.0.0.1:8181",
-  "proxy_url": "",
-  "timeout": 600,
-  "codex_api_base": "https://api-proxy.oaipro.com/v1",
-  "codex_api_key": "sk-xxx",
-  "codex_api_organization": "",
-  "codex_api_project": "",
-  "chat_api_base": "https://api-proxy.oaipro.com/v1",
-  "chat_api_key": "sk-xxx",
-  "chat_api_organization": "",
-  "chat_api_project": "",
-  "chat_max_tokens": 4096,
-  "chat_model_default": "gpt-4o",
-  "chat_model_map": {}
+ "bind": "127.0.0.1:8181",
+ "proxy_url": "",
+ "timeout": 600,
+ "codex_api_base": "https://api-proxy.oaipro.com/v1",
+ "codex_api_key": "sk-xxx",
+ "codex_api_organization": "",
+ "codex_api_project": "",
+ "codex_max_tokens": 500,
+ "code_instruct_model": "gpt-3.5-turbo-instruct",
+ "chat_api_base": "https://api-proxy.oaipro.com/v1",
+ "chat_api_key": "sk-xxx",
+ "chat_api_organization": "",
+ "chat_api_project": "",
+ "chat_max_tokens": 4096,
+ "chat_model_default": "gpt-4o",
+ "chat_model_map": {},
+ "chat_locale": "zh_CN",
+ "auth_token": ""
 }
+
 ```

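 Leave `organization` and `project` empty unless you actually have them and know what they are for.

 `chat_model_map` is a dictionary of model mappings. A requested model is mapped to the one you want; if no mapping exists, `chat_model_default` is used.

+`codex_max_tokens` sets the maximum number of tokens for code completion; make sure you know what you are doing before changing it. `500` is usually enough for code generation.
+
 `chat_max_tokens` sets the maximum number of tokens for chat; make sure you know what you are doing before changing it. The maximum output of `gpt-4o` is `4096`.

 Any value in `config.json` can be overridden by an environment variable named `OVERRIDE_` plus the upper-cased config key, for example `OVERRIDE_CODEX_API_KEY=sk-xxxx`.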

+### DeepSeek Coder settings
+To use DeepSeek Coder FIM for code completion, change the following settings:
+
+```json
+  "codex_api_base": "https://api.deepseek.com/beta/v1",
+  "codex_api_key": "sk-xxx",
+  "code_instruct_model": "deepseek-coder",
+```
+
+### Siliconflow settings
+To use a Siliconflow FIM model for code completion, change the following settings:
+
+```json
+  "codex_api_base": "https://api.siliconflow.cn/v1",
+  "codex_api_key": "sk-xxx,sk-xxx2,sk-xxx3...",
+  "code_instruct_model": "Qwen/Qwen2.5-Coder-7B-Instruct",
+```
+
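+At the time of writing, three Siliconflow models support FIM: `Qwen/Qwen2.5-Coder-7B-Instruct`, `deepseek-ai/DeepSeek-Coder-V2-Instruct`, and `deepseek-ai/DeepSeek-V2.5`. `Qwen/Qwen2.5-Coder-7B-Instruct` is free; the other two are paid.
+
+If you have several Siliconflow API keys, put them in the `codex_api_key` field separated by commas. This largely avoids the impact of Siliconflow's TPM rate limit on your coding speed (with paid models in particular, low-tier accounts get as little as 10k TPM).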
+
+
+
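+### Local model settings
+1. Install ollama.
+2. Run `ollama run stable-code:code` (a small model that most GPUs can handle),
+   or, if you have a more powerful GPU, `ollama run stable-code:3b-code-fp16`.
+3. In config.json, set `codex_api_base` to `http://localhost:11434/v1/chat`.
+4. Set `code_instruct_model` to your model name: `stable-code:code` or `stable-code:3b-code-fp16`.
+5. Follow the normal setup for everything else.
+6. If requests fail, check that `http://localhost:11434/v1/chat` is reachable.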
+        
 ### Important notes
 `codex_max_tokens` did not work well and has been removed. **JetBrains IDEs work perfectly**; `VSCode` needs to be patched with one of the following scripts:

 * macOS `sed -i '' -E 's/\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)/.maxPromptCompletionTokens(\1,2048)/' ~/.vscode/extensions/github.copilot-*/dist/extension.js`
 * Linux `sed -i -E 's/\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)/.maxPromptCompletionTokens(\1,2048)/' ~/.vscode/extensions/github.copilot-*/dist/extension.js`
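-* Windows: not sure how to write this one; PRs welcome.
+* Windows: use the Python script below to do the replacement.
 * Because this is a patch, **it must be re-applied after every Copilot upgrade**.
 * The underlying reason is that the client has to compute the prompt precisely from `max_tokens`, so trimming it on the server side causes problems.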

+```
+# GitHub Copilot extension replace script
+import re
+import glob
+import os
+
+file_paths = glob.glob(os.getenv("USERPROFILE") + r'\.vscode\extensions\github.copilot-*\dist\extension.js')
+if not file_paths:
+    print("no copilot extension found")
+    exit()
+
+pattern = re.compile(r'\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)')
+replacement = r'.maxPromptCompletionTokens(\1,2048)'
+
+for file_path in file_paths:
+    with open(file_path, 'r', encoding="utf-8") as file:
+        content = file.read()
+    
+    new_content = pattern.sub(replacement, content)
+    if new_content == content:
+        print("no match found in " + file_path)
+        continue
+    else:
+        print("replaced " + file_path)
+    
+    with open(file_path, 'w', encoding='utf-8') as file:
+        file.write(new_content)
+
+print("replace finish")
+```
+
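 ### Other notes
 1. In theory the Chat part can use `chat2api`, while the Codex code-generation part is not well suited to `chat2api`.
 2. Code generation uses delayed requests and handles client-side cancellation, which saves a good number of tokens.
-3. So far I have only tried `VSCode`; `JetBrains` and other IDEs are not adapted yet. If you have relevant experience, please let me know.
-4. The project is released under the `MIT` license; you may modify it, but please keep the original author information.
-5. If you have any questions, please discuss them on the forum at https://linux.do. PRs are welcome.
+3. The project is released under the `MIT` license; you may modify it, but please keep the original author information.
+4. If you have any questions, please discuss them on the forum at https://linux.do. PRs are welcome.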

 ### Star History

--- a/config.json.example
+++ b/config.json.example
@@ -6,11 +6,15 @@
  "codex_api_key": "sk-xxx",
  "codex_api_organization": "",
  "codex_api_project": "",
+  "codex_max_tokens": 500,
+  "code_instruct_model": "gpt-3.5-turbo-instruct",
  "chat_api_base": "https://api-proxy.oaipro.com/v1",
  "chat_api_key": "sk-xxx",
  "chat_api_organization": "",
  "chat_api_project": "",
  "chat_max_tokens": 4096,
  "chat_model_default": "gpt-4o",
-  "chat_model_map": {}
-}
+  "chat_model_map": {},
+  "chat_locale": "zh_CN",
+  "auth_token": ""
+}
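The README change above allows `codex_api_key` to hold several comma-separated keys. A minimal, standalone Go sketch of splitting such a value and picking one key at random is shown here. The names `splitKeys` and `pickKey` are hypothetical and do not appear in this PR; unlike the PR's helper, the sketch drops empty entries and never prints a key.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// splitKeys splits a comma-separated key list, trims surrounding
// whitespace, and drops empty entries such as those left by ",,".
func splitKeys(raw string) []string {
	var keys []string
	for _, k := range strings.Split(raw, ",") {
		if k = strings.TrimSpace(k); k != "" {
			keys = append(keys, k)
		}
	}
	return keys
}

// pickKey returns one configured key chosen at random,
// or an empty string when no key is configured.
func pickKey(raw string) string {
	keys := splitKeys(raw)
	if len(keys) == 0 {
		return ""
	}
	return keys[rand.Intn(len(keys))]
}

func main() {
	raw := "sk-xxx, sk-xxx2 ,,sk-xxx3" // placeholder keys, as in the README
	fmt.Println("configured keys:", len(splitKeys(raw)))
	fmt.Println("picked a key:", pickKey(raw) != "")
}
```

Returning an empty string for an empty list lets the caller decide whether to send the request without an `Authorization` header or to fail early.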
--- a/main.go
+++ b/main.go
@@ -5,6 +5,7 @@ import (
 	"context"
 	"encoding/json"
 	"errors"
+	"fmt"
 	"github.com/gin-gonic/gin"
 	"github.com/tidwall/gjson"
 	"github.com/tidwall/sjson"
@@ -18,9 +19,16 @@ import (
 	"strconv"
 	"strings"
 	"time"
+	"math/rand"
 )

-const InstructModel = "gpt-3.5-turbo-instruct"
+const DefaultInstructModel = "gpt-3.5-turbo-instruct"
+
+const StableCodeModelPrefix = "stable-code"
+
+const DeepSeekCoderModel = "deepseek-coder"
+
+var SiliconflowModels = []string{"deepseek-ai/DeepSeek-V2.5", "deepseek-ai/DeepSeek-Coder-V2-Instruct", "Qwen/Qwen2.5-Coder-7B-Instruct"}

 type config struct {
 	Bind                 string            `json:"bind"`
@@ -30,6 +38,8 @@ type config struct {
 	CodexApiKey          string            `json:"codex_api_key"`
 	CodexApiOrganization string            `json:"codex_api_organization"`
 	CodexApiProject      string            `json:"codex_api_project"`
+	CodexMaxTokens       int               `json:"codex_max_tokens"`
+	CodeInstructModel    string            `json:"code_instruct_model"`
 	ChatApiBase          string            `json:"chat_api_base"`
 	ChatApiKey           string            `json:"chat_api_key"`
 	ChatApiOrganization  string            `json:"chat_api_organization"`
@@ -37,10 +47,18 @@ type config struct {
 	ChatMaxTokens        int               `json:"chat_max_tokens"`
 	ChatModelDefault     string            `json:"chat_model_default"`
 	ChatModelMap         map[string]string `json:"chat_model_map"`
+	ChatLocale           string            `json:"chat_locale"`
+	AuthToken            string            `json:"auth_token"`
 }

 func readConfig() *config {
-	content, err := os.ReadFile("config.json")
+	var configPath string
+	if len(os.Args) > 1 {
+		configPath = os.Args[1]
+	} else {
+		configPath = "config.json"
+	}
+	content, err := os.ReadFile(configPath)
 	if nil != err {
 		log.Fatal(err)
 	}
@@ -87,6 +105,17 @@ func readConfig() *config {
 			}
 		}
 	}
+	if _cfg.CodeInstructModel == "" {
+		_cfg.CodeInstructModel = DefaultInstructModel
+	}
+
+	if _cfg.CodexMaxTokens == 0 {
+		_cfg.CodexMaxTokens = 500
+	}
+
+	if _cfg.ChatMaxTokens == 0 {
+		_cfg.ChatMaxTokens = 4096
+	}

 	return _cfg
 }
@@ -149,10 +178,214 @@ func NewProxyService(cfg *config) (*ProxyService, error) {
 		client: client,
 	}, nil
 }
+func AuthMiddleware(authToken string) gin.HandlerFunc {
+	return func(c *gin.Context) {
+		token := c.Param("token")
+		if token != authToken {
+			c.JSON(http.StatusUnauthorized, gin.H{"error": "Unauthorized"})
+			c.Abort()
+			return
+		}
+		c.Next()
+	}
+}

 func (s *ProxyService) InitRoutes(e *gin.Engine) {
-	e.POST("/v1/chat/completions", s.completions)
-	e.POST("/v1/engines/copilot-codex/completions", s.codeCompletions)
+	e.GET("/_ping", s.pong)
+	e.GET("/models", s.models)
+	e.GET("/v1/models", s.models)
+	authToken := s.cfg.AuthToken // replace with your dynamic value as needed
+	if authToken != "" {
+		// Authentication: every route is nested under the /:token prefix
+		v1 := e.Group("/:token/v1/", AuthMiddleware(authToken))
+		{
+			v1.POST("/chat/completions", s.completions)
+			v1.POST("/engines/copilot-codex/completions", s.codeCompletions)
+
+			v1.POST("/v1/chat/completions", s.completions)
+			v1.POST("/v1/engines/copilot-codex/completions", s.codeCompletions)
+		}
+	} else {
+		e.POST("/v1/chat/completions", s.completions)
+		e.POST("/v1/engines/copilot-codex/completions", s.codeCompletions)
+
+		e.POST("/v1/v1/chat/completions", s.completions)
+		e.POST("/v1/v1/engines/copilot-codex/completions", s.codeCompletions)
+	}
+}
+
+type Pong struct {
+	Now    int    `json:"now"`
+	Status string `json:"status"`
+	Ns1    string `json:"ns1"`
+}
+
+func (s *ProxyService) pong(c *gin.Context) {
+	c.JSON(http.StatusOK, Pong{
+		Now:    int(time.Now().Unix()),
+		Status: "ok",
+		Ns1:    "200 OK",
+	})
+}
+
+func (s *ProxyService) models(c *gin.Context) {
+	c.JSON(http.StatusOK, gin.H{
+		"data": []gin.H{
+			{
+				"capabilities": gin.H{
+					"family":    "gpt-3.5-turbo",
+					"limits":    gin.H{"max_prompt_tokens": 12288},
+					"object":    "model_capabilities",
+					"supports":  gin.H{"tool_calls": true},
+					"tokenizer": "cl100k_base",
+					"type":      "chat",
+				},
+				"id":      "gpt-3.5-turbo",
+				"name":    "GPT 3.5 Turbo",
+				"object":  "model",
+				"version": "gpt-3.5-turbo-0613",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "gpt-3.5-turbo",
+					"limits":    gin.H{"max_prompt_tokens": 12288},
+					"object":    "model_capabilities",
+					"supports":  gin.H{"tool_calls": true},
+					"tokenizer": "cl100k_base",
+					"type":      "chat",
+				},
+				"id":      "gpt-3.5-turbo-0613",
+				"name":    "GPT 3.5 Turbo",
+				"object":  "model",
+				"version": "gpt-3.5-turbo-0613",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "gpt-4",
+					"limits":    gin.H{"max_prompt_tokens": 20000},
+					"object":    "model_capabilities",
+					"supports":  gin.H{"tool_calls": true},
+					"tokenizer": "cl100k_base",
+					"type":      "chat",
+				},
+				"id":      "gpt-4",
+				"name":    "GPT 4",
+				"object":  "model",
+				"version": "gpt-4-0613",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "gpt-4",
+					"limits":    gin.H{"max_prompt_tokens": 20000},
+					"object":    "model_capabilities",
+					"supports":  gin.H{"tool_calls": true},
+					"tokenizer": "cl100k_base",
+					"type":      "chat",
+				},
+				"id":      "gpt-4-0613",
+				"name":    "GPT 4",
+				"object":  "model",
+				"version": "gpt-4-0613",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "gpt-4-turbo",
+					"limits":    gin.H{"max_prompt_tokens": 20000},
+					"object":    "model_capabilities",
+					"supports":  gin.H{"parallel_tool_calls": true, "tool_calls": true},
+					"tokenizer": "cl100k_base",
+					"type":      "chat",
+				},
+				"id":      "gpt-4-0125-preview",
+				"name":    "GPT 4 Turbo",
+				"object":  "model",
+				"version": "gpt-4-0125-preview",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "gpt-4o",
+					"limits":    gin.H{"max_prompt_tokens": 20000},
+					"object":    "model_capabilities",
+					"supports":  gin.H{"parallel_tool_calls": true, "tool_calls": true},
+					"tokenizer": "o200k_base",
+					"type":      "chat",
+				},
+				"id":      "gpt-4o",
+				"name":    "GPT 4o",
+				"object":  "model",
+				"version": "gpt-4o-2024-05-13",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "gpt-4o",
+					"limits":    gin.H{"max_prompt_tokens": 20000},
+					"object":    "model_capabilities",
+					"supports":  gin.H{"parallel_tool_calls": true, "tool_calls": true},
+					"tokenizer": "o200k_base",
+					"type":      "chat",
+				},
+				"id":      "gpt-4o-2024-05-13",
+				"name":    "GPT 4o",
+				"object":  "model",
+				"version": "gpt-4o-2024-05-13",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "gpt-4o",
+					"limits":    gin.H{"max_prompt_tokens": 20000},
+					"object":    "model_capabilities",
+					"supports":  gin.H{"parallel_tool_calls": true, "tool_calls": true},
+					"tokenizer": "o200k_base",
+					"type":      "chat",
+				},
+				"id":     "gpt-4-o-preview",
+				"name":   "GPT 4o",
+				"object": "model",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "text-embedding-ada-002",
+					"limits":    gin.H{"max_inputs": 256},
+					"object":    "model_capabilities",
+					"supports":  gin.H{},
+					"tokenizer": "cl100k_base",
+					"type":      "embeddings",
+				},
+				"id":      "text-embedding-ada-002",
+				"name":    "Embedding V2 Ada",
+				"object":  "model",
+				"version": "text-embedding-ada-002",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "text-embedding-3-small",
+					"limits":    gin.H{"max_inputs": 256},
+					"object":    "model_capabilities",
+					"supports":  gin.H{"dimensions": true},
+					"tokenizer": "cl100k_base",
+					"type":      "embeddings",
+				},
+				"id":      "text-embedding-3-small",
+				"name":    "Embedding V3 small",
+				"object":  "model",
+				"version": "text-embedding-3-small",
+			},
+			{
+				"capabilities": gin.H{
+					"family":    "text-embedding-3-small",
+					"object":    "model_capabilities",
+					"supports":  gin.H{"dimensions": true},
+					"tokenizer": "cl100k_base",
+					"type":      "embeddings",
+				},
+				"id":      "text-embedding-3-small-inference",
+				"name":    "Embedding V3 small (Inference)",
+				"object":  "model",
+				"version": "text-embedding-3-small",
+			},
+		},
+		"object": "list",
+	})
 }

 func (s *ProxyService) completions(c *gin.Context) {
@@ -171,6 +404,25 @@ func (s *ProxyService) completions(c *gin.Context) {
 		model = s.cfg.ChatModelDefault
 	}
 	body, _ = sjson.SetBytes(body, "model", model)
+
+	if !gjson.GetBytes(body, "function_call").Exists() {
+		messages := gjson.GetBytes(body, "messages").Array()
+		for i, msg := range messages {
+			toolCalls := msg.Get("tool_calls").Array()
+			if len(toolCalls) == 0 {
+				body, _ = sjson.DeleteBytes(body, fmt.Sprintf("messages.%d.tool_calls", i))
+			}
+		}
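+		lastIndex := len(messages) - 1
+		// Skip the locale hint when there are no messages; indexing an empty slice would panic.
+		if lastIndex >= 0 && !strings.Contains(messages[lastIndex].Get("content").String(), "Respond in the following locale") {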
+			locale := s.cfg.ChatLocale
+			if locale == "" {
+				locale = "zh_CN"
+			}
+			body, _ = sjson.SetBytes(body, "messages."+strconv.Itoa(lastIndex)+".content", messages[lastIndex].Get("content").String()+"Respond in the following locale: "+locale+".")
+		}
+	}
+
 	body, _ = sjson.DeleteBytes(body, "intent")
 	body, _ = sjson.DeleteBytes(body, "intent_threshold")
 	body, _ = sjson.DeleteBytes(body, "intent_content")
@@ -225,10 +477,14 @@ func (s *ProxyService) completions(c *gin.Context) {
 	_, _ = io.Copy(c.Writer, resp.Body)
 }

+// contains reports whether arr has an element exactly equal to str.
+func contains(arr []string, str string) bool {
+	for _, s := range arr {
+		if s == str {
+			return true
+		}
+	}
+	return false
+}
+
 func (s *ProxyService) codeCompletions(c *gin.Context) {
 	ctx := c.Request.Context()

-	time.Sleep(100 * time.Millisecond)
+	time.Sleep(200 * time.Millisecond)
 	if ctx.Err() != nil {
 		abortCodex(c, http.StatusRequestTimeout)
 		return
@@ -240,9 +496,7 @@ func (s *ProxyService) codeCompletions(c *gin.Context) {
 		return
 	}

-	body, _ = sjson.DeleteBytes(body, "extra")
-	body, _ = sjson.DeleteBytes(body, "nwo")
-	body, _ = sjson.SetBytes(body, "model", InstructModel)
+	body = ConstructRequestBody(body, s.cfg)

 	proxyUrl := s.cfg.CodexApiBase + "/completions"
 	req, err := http.NewRequestWithContext(ctx, http.MethodPost, proxyUrl, io.NopCloser(bytes.NewBuffer(body)))
@@ -252,7 +506,7 @@ func (s *ProxyService) codeCompletions(c *gin.Context) {
 	}

 	req.Header.Set("Content-Type", "application/json")
-	req.Header.Set("Authorization", "Bearer "+s.cfg.CodexApiKey)
+	req.Header.Set("Authorization", "Bearer "+getRandomApiKey(s.cfg.CodexApiKey))
 	if "" != s.cfg.CodexApiOrganization {
 		req.Header.Set("OpenAI-Organization", s.cfg.CodexApiOrganization)
 	}
@@ -291,6 +545,68 @@ func (s *ProxyService) codeCompletions(c *gin.Context) {
 	_, _ = io.Copy(c.Writer, resp.Body)
 }

+// getRandomApiKey picks one key at random from a comma-separated list of API keys.
+func getRandomApiKey(paramStr string) string {
+	params := strings.Split(paramStr, ",")
+	rand.Seed(time.Now().UnixNano())
+	randomIndex := rand.Intn(len(params))
+	// Log only the index; the API key itself must never be written to the logs.
+	fmt.Println("Code completion API Key index:", randomIndex)
+	return strings.TrimSpace(params[randomIndex])
+}
+
+func ConstructRequestBody(body []byte, cfg *config) []byte {
+	body, _ = sjson.DeleteBytes(body, "extra")
+	body, _ = sjson.DeleteBytes(body, "nwo")
+	body, _ = sjson.SetBytes(body, "model", cfg.CodeInstructModel)
+
+	if int(gjson.GetBytes(body, "max_tokens").Int()) > cfg.CodexMaxTokens {
+		body, _ = sjson.SetBytes(body, "max_tokens", cfg.CodexMaxTokens)
+	}
+
+	if strings.Contains(cfg.CodeInstructModel, StableCodeModelPrefix) {
+		return constructWithStableCodeModel(body)
+	} else if strings.HasPrefix(cfg.CodeInstructModel, DeepSeekCoderModel) || contains(SiliconflowModels, cfg.CodeInstructModel) {
+		if gjson.GetBytes(body, "n").Int() > 1 {
+			body, _ = sjson.SetBytes(body, "n", 1)
+		}
+	}
+
+	if strings.HasSuffix(cfg.CodexApiBase, "chat") {
+		// TODO: constructWithChatModel
+		// If the codex API base ends with "chat", build a chat-model request; there is no good prompt for this yet.
+	}
+
+	return body
+}
+
+func constructWithStableCodeModel(body []byte) []byte {
+	suffix := gjson.GetBytes(body, "suffix")
+	prompt := gjson.GetBytes(body, "prompt")
+	content := fmt.Sprintf("<fim_prefix>%s<fim_suffix>%s<fim_middle>", prompt, suffix)
+
+	// Build the chat messages array that will be added to the body
+	messages := []map[string]string{
+		{
+			"role":    "user",
+			"content": content,
+		},
+	}
+	return constructWithChatModel(body, messages)
+}
+
+func constructWithChatModel(body []byte, messages interface{}) []byte {
+
+	body, _ = sjson.SetBytes(body, "messages", messages)
+
+	// fmt.Printf("Request Body: %s\n", body)
+	// Restore the "<" and ">" characters that JSON encoding escaped as \u003c and \u003e
+	jsonStr := string(body)
+	jsonStr = strings.ReplaceAll(jsonStr, "\\u003c", "<")
+	jsonStr = strings.ReplaceAll(jsonStr, "\\u003e", ">")
+	return []byte(jsonStr)
+}
+
 func main() {
 	cfg := readConfig()

@@ -310,4 +626,5 @@ func main() {
 		log.Fatal(err)
 		return
 	}
+
 }
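`constructWithChatModel` in the diff above undoes the `\u003c` / `\u003e` escaping with string replacement after the body is built. As an alternative sketch, not the approach taken in this PR, the standard library's `encoding/json` encoder can be told not to escape HTML characters in the first place. The names `chatMessage`, `buildFIMMessage`, and `marshalNoHTMLEscape` are hypothetical, and the sketch swaps `sjson` for `encoding/json`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// chatMessage is a minimal chat message as sent to a chat-style endpoint.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// buildFIMMessage wraps prefix and suffix in the fill-in-the-middle
// markers used by constructWithStableCodeModel in the diff.
func buildFIMMessage(prefix, suffix string) chatMessage {
	return chatMessage{
		Role:    "user",
		Content: fmt.Sprintf("<fim_prefix>%s<fim_suffix>%s<fim_middle>", prefix, suffix),
	}
}

// marshalNoHTMLEscape encodes v as JSON while leaving "<", ">" and "&"
// as literal characters instead of \u003c, \u003e and \u0026.
func marshalNoHTMLEscape(v any) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return nil, err
	}
	// Encode appends a trailing newline; strip it.
	return bytes.TrimRight(buf.Bytes(), "\n"), nil
}

func main() {
	body, err := marshalNoHTMLEscape(map[string]any{
		"model":    "stable-code:code",
		"messages": []chatMessage{buildFIMMessage("func add(a, b int) int {\n\treturn ", "\n}")},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Disabling the escaping at encode time avoids rewriting the whole body, so a literal `\u003c` sequence that was already present in the user's source code is left alone.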
--- a/scripts/replace_max_tokens.vbs
+++ b/scripts/replace_max_tokens.vbs
@@ -0,0 +1,49 @@
+' VBScript to change max tokens to 2048
+
+MsgBox "It may take a few seconds to execute this script." & vbCrLf & vbCrLf & "Click 'OK' and wait for the completion message to pop up."
+
+Const ForReading = 1
+Const ForWriting = 2
+
+' Subpath of the file to be replaced
+subpath = "dist\extension.js"
+
+pattern = "\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)"
+replacement = ".maxPromptCompletionTokens($1,2048)"
+
+' Iterate over all github copilot directories
+Set objFSO = CreateObject("Scripting.FileSystemObject")
+Set objShell = CreateObject("WScript.Shell")
+Set colExtensions = objFSO.GetFolder(objShell.ExpandEnvironmentStrings("%USERPROFILE%") & "\.vscode\extensions").SubFolders
+
+For Each objExtension In colExtensions
+    extension_path = objExtension.Path & "\" & subpath
+    If objFSO.FileExists(extension_path) Then
+        backupfile = extension_path & ".bak"
+        
+        ' Delete if backup file exists
+        If objFSO.FileExists(backupfile) Then
+            objFSO.DeleteFile backupfile, True
+        End If
+        
+        ' Backup
+        objFSO.CopyFile extension_path, backupfile
+        
+        ' Do search and replace with pattern
+        Set objFile = objFSO.OpenTextFile(extension_path, ForReading)
+        strContent = objFile.ReadAll
+        objFile.Close
+        
+        Set objRegEx = New RegExp
+        objRegEx.Global = True
+        objRegEx.IgnoreCase = True
+        objRegEx.Pattern = pattern
+        strContent = objRegEx.Replace(strContent, replacement)
+        
+        Set objFile = objFSO.OpenTextFile(extension_path, ForWriting)
+        objFile.Write strContent
+        objFile.Close
+    End If
+Next
+
+MsgBox "Max tokens modification completed"
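The script above performs the same search-and-replace as the `sed` and Python variants in the README. For readers who prefer the project's own language, here is a rough Go sketch of just the substitution step, assuming the same regular expression. The names `maxTokensCall` and `patchMaxTokens` are hypothetical, and file discovery and backup are left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// maxTokensCall matches calls such as .maxPromptCompletionTokens(e,500),
// using the same pattern as the scripts in this PR.
var maxTokensCall = regexp.MustCompile(`\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)`)

// patchMaxTokens rewrites the second argument of every matching call to
// 2048 and reports whether the text changed.
func patchMaxTokens(src string) (string, bool) {
	// ${1} keeps the first argument; the braces stop Go's template syntax
	// from reading the following characters as part of the group name.
	out := maxTokensCall.ReplaceAllString(src, ".maxPromptCompletionTokens(${1},2048)")
	return out, out != src
}

func main() {
	out, changed := patchMaxTokens("a=b.maxPromptCompletionTokens(e,500);")
	fmt.Println(out, changed)
}
```

Because the function reports whether anything changed, a caller can skip writing files that contain no match, as the Python script in the README does.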
--- a/scripts/restore_max_tokens.vbs
+++ b/scripts/restore_max_tokens.vbs
@@ -0,0 +1,30 @@
+' VBScript to restore max tokens
+MsgBox "It may take a few seconds to execute this script." & vbCrLf & vbCrLf & "Click 'OK' and wait for the completion message to pop up."
+
+Const ForReading = 1
+Const ForWriting = 2
+
+' Subpath of the file to be restored
+subpath = "dist\extension.js"
+
+' Iterate over all github copilot directories
+Set objFSO = CreateObject("Scripting.FileSystemObject")
+Set objShell = CreateObject("WScript.Shell")
+Set colExtensions = objFSO.GetFolder(objShell.ExpandEnvironmentStrings("%USERPROFILE%") & "\.vscode\extensions").SubFolders
+
+For Each objExtension In colExtensions
+    extension_path = objExtension.Path & "\" & subpath
+    backupfile = extension_path & ".bak"
+    
+    If objFSO.FileExists(backupfile) Then
+        ' Delete the extension file if it exists
+        If objFSO.FileExists(extension_path) Then
+            objFSO.DeleteFile extension_path, True
+        End If
+        
+        ' Replace
+        objFSO.MoveFile backupfile, extension_path
+    End If
+Next
+
+MsgBox "Max tokens restored to default successfully"
Author	SHA1	Message	Date
zhuiyue132	8603e7429e	feat: add Siliconflow FIM support (#63 ) Siliconflow now supports FIM completion in the standard format, hence this PR.	2024-10-11 18:49:08 -07:00
aliensb	6d9ba954dd	Fix: Repair the logic for obtaining the configuration file path. (#60 ) Fixed the logic for obtaining the configuration file path to ensure that config.json is used as the default when no command line arguments are provided.	2024-09-25 00:11:30 -07:00
今夕是何年	9ef70da47b	Remove the tool_calls field from a message when it is an empty array, to prevent DeepSeek errors. (#57 ) Co-authored-by: liyuzhe <banyebushui@>	2024-09-25 00:10:50 -07:00
Huanzhang Hu	aef14559a1	change models api (#54 ) Co-authored-by: huhuanzhang <huhuanzhang@parkingwang.com>	2024-09-08 18:24:12 -07:00
Liu Bingyan	0685e8c153	Modify expose port (#46 ) Modify expose port to match the port in the docker-compose file.	2024-07-29 00:33:11 -07:00
wozulong	8fdd840460	update README Signed-off-by: wozulong <>	2024-07-26 18:41:27 +08:00
wozulong	19447a898a	update README Signed-off-by: wozulong <>	2024-07-26 15:21:06 +08:00
wozulong	6325a5e2f5	add deepseek-coder fim support Signed-off-by: wozulong <>	2024-07-26 15:20:18 +08:00
forose	e251e9e50b	Update Dockerfile (#39 ) Set up a proxy for Go	2024-06-18 21:20:16 -07:00
echo66677	f93a0ed85f	fix config.json.example (#36 ) Co-authored-by: Neo <Neo@zhile.io>	2024-06-18 21:19:51 -07:00
wozulong	ec57d48ae4	a fake api: /models Signed-off-by: wozulong <>	2024-06-06 14:55:57 +08:00
图南	d1c19fc7ef	add simple fake api for _ping (#34 )	2024-06-03 20:52:57 -07:00
xixingya	c9e7d75fec	add stable-code-3b local model support (#30 ) * add stable-code-3b local model support * add stable-code-3b local model support * add stable-code-3b local model support * add stable-code-3b local model support * fix code struct add chat model todo	2024-05-23 03:54:40 -07:00
machooo	4075c558ec	Revert "Add a Go proxy to the Dockerfile to fix slow third-party package downloads (#14 )" (#24 ) This reverts commit `ad0c436935`.	2024-05-22 20:03:45 -07:00
xpxz	55d6961c3b	Add a way to modify maxPromptCompletionTokens on Windows (#23 )	2024-05-22 20:03:23 -07:00
mibody2	7992cbe8f2	feat: jb can set the locale for chat (#22 ) Co-authored-by: mibody2 <mibody2>	2024-05-19 07:36:44 -07:00
wzdnzd	ed40f68e99	add some scripts for replacing and restoring max tokens (#21 )	2024-05-19 01:00:21 -07:00
wozulong	247a8748dc	update README Signed-off-by: wozulong <>	2024-05-18 22:52:20 +08:00