Compare commits

...

20 Commits

Author SHA1 Message Date
zhuiyue132 8603e7429e
feat: add siliconflow FIM support (#63)
siliconflow now supports FIM completion in the standard format, hence this PR
2024-10-11 18:49:08 -07:00
aliensb 6d9ba954dd
Fix: Repair the logic for obtaining the configuration file path. (#60)
Fixed the logic for obtaining the configuration file path to ensure that config.json is used as the default when no command line arguments are provided.
2024-09-25 00:11:30 -07:00
今夕是何年 9ef70da47b
Remove the tool_calls property when the tool_calls field of a message is an empty array, to prevent DeepSeek from returning an error. (#57)
Co-authored-by: liyuzhe <banyebushui@>
2024-09-25 00:10:50 -07:00
Huanzhang Hu aef14559a1
change models api (#54)
Co-authored-by: huhuanzhang <huhuanzhang@parkingwang.com>
2024-09-08 18:24:12 -07:00
Liu Bingyan 0685e8c153
Modify expose port (#46)
Modify expose port to match the port in the docker-compose file.
2024-07-29 00:33:11 -07:00
wozulong 8fdd840460 update README
Signed-off-by: wozulong <>
2024-07-26 18:41:27 +08:00
wozulong 19447a898a update README
Signed-off-by: wozulong <>
2024-07-26 15:21:06 +08:00
wozulong 6325a5e2f5 add deepseek-coder fim support
Signed-off-by: wozulong <>
2024-07-26 15:20:18 +08:00
forose e251e9e50b
Update Dockerfile (#39)
Set up a proxy for Go
2024-06-18 21:20:16 -07:00
echo66677 f93a0ed85f
fix config.json.example (#36)
Co-authored-by: Neo <Neo@zhile.io>
2024-06-18 21:19:51 -07:00
wozulong ec57d48ae4 a fake api: /models
Signed-off-by: wozulong <>
2024-06-06 14:55:57 +08:00
图南 d1c19fc7ef
add simple fake api for _ping (#34) 2024-06-03 20:52:57 -07:00
xixingya c9e7d75fec
add stable-code-3b local model support (#30)
* add stable-code-3b local model support

* fix code struct add chat model todo
2024-05-23 03:54:40 -07:00
machooo 4075c558ec
Revert "dockerfile: add a Go proxy to fix slow fetching of third-party packages (#14)" (#24)
This reverts commit ad0c436935.
2024-05-22 20:03:45 -07:00
xpxz 55d6961c3b
Add instructions for modifying maxPromptCompletionTokens on Windows (#23) 2024-05-22 20:03:23 -07:00
mibody2 7992cbe8f2
feat: jb can set the locale for chat (#22)
Co-authored-by: mibody2 <mibody2>
2024-05-19 07:36:44 -07:00
wzdnzd ed40f68e99
add some scripts for replacing and restoring max tokens (#21) 2024-05-19 01:00:21 -07:00
wozulong 247a8748dc update README
Signed-off-by: wozulong <>
2024-05-18 22:52:20 +08:00
wozulong f8e28c3719 update max_tokens
Signed-off-by: wozulong <>
2024-05-18 22:46:48 +08:00
wozulong 5341ccb23a update README
Signed-off-by: wozulong <>
2024-05-18 19:59:11 +08:00
8 changed files with 520 additions and 86 deletions


@ -1,17 +1,22 @@
FROM golang:alpine AS builder
WORKDIR /app
COPY . .
ADD . .
ENV GO111MODULE=on GOPROXY=https://goproxy.cn,direct
RUN go mod download
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o override
RUN CGO_ENABLED=0 go build -ldflags="-w -s" -o override
FROM alpine:latest
COPY --from=builder /app/override /usr/local/bin/override
RUN apk --no-cache add ca-certificates
COPY --from=builder /app/override /usr/local/bin/
COPY config.json.example /app/config.json
WORKDIR /app
ENTRYPOINT ["/usr/local/bin/override"]
VOLUME /app
EXPOSE 8181
CMD ["override"]

116
README.md

@ -6,6 +6,7 @@
```json
"github.copilot.advanced": {
"debug.overrideCAPIUrl": "http://127.0.0.1:8181/v1",
"debug.overrideProxyUrl": "http://127.0.0.1:8181",
"debug.chatOverrideProxyUrl": "http://127.0.0.1:8181/v1/chat/completions",
"authProvider": "github-enterprise"
@ -27,37 +28,116 @@
```json
{
"bind": "127.0.0.1:8181",
"proxy_url": "",
"timeout": 600,
"codex_api_base": "https://api-proxy.oaipro.com/v1",
"codex_api_key": "sk-xxx",
"codex_api_organization": "",
"codex_api_project": "",
"codex_max_tokens": 4093,
"chat_api_base": "https://api-proxy.oaipro.com/v1",
"chat_api_key": "sk-xxx",
"chat_api_organization": "",
"chat_api_project": "",
"chat_model_default": "gpt-4o",
"chat_model_map": {}
"bind": "127.0.0.1:8181",
"proxy_url": "",
"timeout": 600,
"codex_api_base": "https://api-proxy.oaipro.com/v1",
"codex_api_key": "sk-xxx",
"codex_api_organization": "",
"codex_api_project": "",
"codex_max_tokens": 500,
"code_instruct_model": "gpt-3.5-turbo-instruct",
"chat_api_base": "https://api-proxy.oaipro.com/v1",
"chat_api_key": "sk-xxx",
"chat_api_organization": "",
"chat_api_project": "",
"chat_max_tokens": 4096,
"chat_model_default": "gpt-4o",
"chat_model_map": {},
"chat_locale": "zh_CN",
"auth_token": ""
}
```
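`organization` and `project`: fill these in only if you have them and know what they are for.
`chat_model_map` is a dictionary of model mappings. It maps the requested model to the one you want; if no mapping exists, `chat_model_default` is used.
`code_max_tokens` can be set to the maximum number of tokens you want. Make sure you know what you are doing when you set it.
`codex_max_tokens` can be set to the maximum number of tokens you want. Make sure you know what you are doing when you set it. `500` is usually enough for code generation.
`chat_max_tokens` can be set to the maximum number of tokens you want. Make sure you know what you are doing when you set it. The maximum output of `gpt-4o` is `4096`.
An environment variable named `OVERRIDE_` + the upper-case config key overrides the corresponding value in `config.json`. For example: `OVERRIDE_CODEX_API_KEY=sk-xxxx`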
### DeepSeek Coder Settings
If you want to use DeepSeek Coder FIM for code completion, these are the main settings to change:
```json
"codex_api_base": "https://api.deepseek.com/beta/v1",
"codex_api_key": "sk-xxx",
"code_instruct_model": "deepseek-coder",
```
### Siliconflow Settings
If you want to use a Siliconflow FIM model for code completion, these are the main settings to change:
```json
"codex_api_base": "https://api.siliconflow.cn/v1",
"codex_api_key": "sk-xxx,sk-xxx2,sk-xxx3...",
"code_instruct_model": "Qwen/Qwen2.5-Coder-7B-Instruct",
```
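As of now, three models on Siliconflow support FIM: `Qwen/Qwen2.5-Coder-7B-Instruct`, `deepseek-ai/DeepSeek-Coder-V2-Instruct`, and `deepseek-ai/DeepSeek-V2.5`. Of these, `Qwen/Qwen2.5-Coder-7B-Instruct` is free and the other two are paid.
If you have several Siliconflow API keys, you can put them in the `codex_api_key` field separated by commas. This helps keep Siliconflow's TPM rate limit from slowing down your coding (especially with the paid models, where low user tiers get a TPM as low as 10k).
### Local LLM Settings
1. Install ollama
2. ollama run stable-code:code (this model is small and runs on most GPUs)
or, if you have a higher-end GPU, install this one: ollama run stable-code:3b-code-fp16
3. Set codex_api_base in config.json to http://localhost:11434/v1/chat
4. Set code_instruct_model to your model name: stable-code:code or stable-code:3b-code-fp16
5. Follow the normal procedure for the rest.
6. If requests do not go through, check that http://localhost:11434/v1/chat is reachable.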
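### Important Notes
`codex_max_tokens` did not work perfectly and has been removed. **JetBrains IDEs work perfectly**; `VSCode` needs to be patched with one of the following scripts:
* macOS `sed -i '' -E 's/\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)/.maxPromptCompletionTokens(\1,2048)/' ~/.vscode/extensions/github.copilot-*/dist/extension.js`
* Linux `sed -i -E 's/\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)/.maxPromptCompletionTokens(\1,2048)/' ~/.vscode/extensions/github.copilot-*/dist/extension.js`
* Windows: use the Python script below to do the replacement
* Because this is a patch, **it has to be re-applied every time Copilot is upgraded**.
* The underlying reason is that the client computes the prompt precisely from `max_tokens`, so trimming it on the backend causes problems.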
```
# GitHub Copilot extension replace script
import re
import glob
import os
file_paths = glob.glob(os.getenv("USERPROFILE") + r'\.vscode\extensions\github.copilot-*\dist\extension.js')
if not file_paths:
    print("no copilot extension found")
    exit()
pattern = re.compile(r'\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)')
replacement = r'.maxPromptCompletionTokens(\1,2048)'
for file_path in file_paths:
    # Read the bundled extension source
    with open(file_path, 'r', encoding="utf-8") as file:
        content = file.read()
    new_content = pattern.sub(replacement, content)
    if new_content == content:
        print("no match found in " + file_path)
        continue
    else:
        print("replaced " + file_path)
    # Write the patched source back in place
    with open(file_path, 'w', encoding='utf-8') as file:
        file.write(new_content)
print("replace finish")
```
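To make the effect of the patch pattern concrete, here is the same substitution as a small Go sketch. It only operates on an in-memory string; the function name `patchMaxTokens` and the sample input are made up for illustration, and it does not locate or modify the extension file.

```go
package main

import (
	"fmt"
	"regexp"
)

// maxTokensCall matches calls such as `.maxPromptCompletionTokens(e,500)`.
var maxTokensCall = regexp.MustCompile(`\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)`)

// patchMaxTokens rewrites the second argument of every match to 2048 and
// keeps the first argument (capture group 1) unchanged.
func patchMaxTokens(src string) string {
	return maxTokensCall.ReplaceAllString(src, ".maxPromptCompletionTokens(${1},2048)")
}

func main() {
	fmt.Println(patchMaxTokens("a.maxPromptCompletionTokens(e,500);"))
	// prints "a.maxPromptCompletionTokens(e,2048);"
}
```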
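### Other Notes
1. In theory, the Chat part can use `chat2api`, while the Codex code-generation part is not well suited to `chat2api`.
2. The code-generation part implements delayed generation and handles client-side Cancel, which effectively saves your tokens.
3. So far I have only tried `VSCode`; `JetBrains` and other IDEs are not yet adapted. If you have relevant experience, please let me know.
4. The project is released under the `MIT` license. You may modify it, but please keep the original author information.
5. If you have any questions, please discuss them on the forum https://linux.do. PRs are welcome.
3. The project is released under the `MIT` license. You may modify it, but please keep the original author information.
4. If you have any questions, please discuss them on the forum https://linux.do. PRs are welcome.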
### Star History


@ -6,12 +6,15 @@
"codex_api_key": "sk-xxx",
"codex_api_organization": "",
"codex_api_project": "",
"codex_max_tokens": 2048,
"codex_max_tokens": 500,
"code_instruct_model": "gpt-3.5-turbo-instruct",
"chat_api_base": "https://api-proxy.oaipro.com/v1",
"chat_api_key": "sk-xxx",
"chat_api_organization": "",
"chat_api_project": "",
"chat_max_tokens": 4096,
"chat_model_default": "gpt-4o",
"chat_model_map": {}
}
"chat_model_map": {},
"chat_locale": "zh_CN",
"auth_token": ""
}

3
go.mod

@ -6,7 +6,6 @@ toolchain go1.21.4
require (
github.com/gin-gonic/gin v1.10.0
github.com/linux-do/tiktoken-go v0.7.0
github.com/tidwall/gjson v1.17.1
github.com/tidwall/sjson v1.2.5
golang.org/x/net v0.25.0
@ -17,7 +16,6 @@ require (
github.com/bytedance/sonic/loader v0.1.1 // indirect
github.com/cloudwego/base64x v0.1.4 // indirect
github.com/cloudwego/iasm v0.2.0 // indirect
github.com/dlclark/regexp2 v1.11.0 // indirect
github.com/gabriel-vasile/mimetype v1.4.3 // indirect
github.com/gin-contrib/sse v0.1.0 // indirect
github.com/go-playground/locales v0.14.1 // indirect
@ -25,7 +23,6 @@ require (
github.com/go-playground/validator/v10 v10.20.0 // indirect
github.com/goccy/go-json v0.10.2 // indirect
github.com/google/go-cmp v0.5.9 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/klauspost/cpuid/v2 v2.2.7 // indirect
github.com/kr/pretty v0.3.0 // indirect

6
go.sum

@ -10,8 +10,6 @@ github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ3
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/dlclark/regexp2 v1.11.0 h1:G/nrcoOa7ZXlpoa/91N3X7mM3r8eIlMBBJZvsz/mxKI=
github.com/dlclark/regexp2 v1.11.0/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8=
github.com/gabriel-vasile/mimetype v1.4.3 h1:in2uUcidCuFcDKtdcBxlR0rJ1+fsokWf+uqxgUFjbI0=
github.com/gabriel-vasile/mimetype v1.4.3/go.mod h1:d8uq/6HKRL6CGdk+aubisF/M5GcPfT7nKyLpA0lbSSk=
github.com/gin-contrib/sse v0.1.0 h1:Y/yl/+YNO8GZSjAhjMsSuLt29uWRFHdHYUb5lYOV9qE=
@ -31,8 +29,6 @@ github.com/goccy/go-json v0.10.2/go.mod h1:6MelG93GURQebXPDq3khkgXZkazVtN9CRI+MG
github.com/google/go-cmp v0.5.9 h1:O2Tfq5qg4qc4AmwVlvv0oLiVAGB7enBSJ2x2DqQFi38=
github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM=
github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo=
github.com/klauspost/cpuid/v2 v2.0.9/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg=
@ -49,8 +45,6 @@ github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/leodido/go-urn v1.4.0 h1:WT9HwE9SGECu3lg4d/dIA+jxlljEa1/ffXKmRjqdmIQ=
github.com/leodido/go-urn v1.4.0/go.mod h1:bvxc+MVxLKB4z00jd1z+Dvzr47oO32F/QSNjSBOlFxI=
github.com/linux-do/tiktoken-go v0.7.0 h1:Kcm/miJ5gp77srtF8GQWnfq7W9kTaXEuHZg/g9IVEu8=
github.com/linux-do/tiktoken-go v0.7.0/go.mod h1:9Vkdtp0ngi4USmrdSx984iuIQ5IMr0hnUdz4jZZTJb8=
github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=

378
main.go

@ -5,8 +5,8 @@ import (
"context"
"encoding/json"
"errors"
"fmt"
"github.com/gin-gonic/gin"
"github.com/linux-do/tiktoken-go"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
"golang.org/x/net/http2"
@ -19,9 +19,16 @@ import (
"strconv"
"strings"
"time"
"math/rand"
)
const InstructModel = "gpt-3.5-turbo-instruct"
const DefaultInstructModel = "gpt-3.5-turbo-instruct"
const StableCodeModelPrefix = "stable-code"
const DeepSeekCoderModel = "deepseek-coder"
var SiliconflowModels = []string{"deepseek-ai/DeepSeek-V2.5", "deepseek-ai/DeepSeek-Coder-V2-Instruct", "Qwen/Qwen2.5-Coder-7B-Instruct"}
type config struct {
Bind string `json:"bind"`
@ -32,6 +39,7 @@ type config struct {
CodexApiOrganization string `json:"codex_api_organization"`
CodexApiProject string `json:"codex_api_project"`
CodexMaxTokens int `json:"codex_max_tokens"`
CodeInstructModel string `json:"code_instruct_model"`
ChatApiBase string `json:"chat_api_base"`
ChatApiKey string `json:"chat_api_key"`
ChatApiOrganization string `json:"chat_api_organization"`
@ -39,10 +47,18 @@ type config struct {
ChatMaxTokens int `json:"chat_max_tokens"`
ChatModelDefault string `json:"chat_model_default"`
ChatModelMap map[string]string `json:"chat_model_map"`
ChatLocale string `json:"chat_locale"`
AuthToken string `json:"auth_token"`
}
func readConfig() *config {
content, err := os.ReadFile("config.json")
var configPath string
if len(os.Args) > 1 {
configPath = os.Args[1]
} else {
configPath = "config.json"
}
content, err := os.ReadFile(configPath)
if nil != err {
log.Fatal(err)
}
@ -89,6 +105,17 @@ func readConfig() *config {
}
}
}
if _cfg.CodeInstructModel == "" {
_cfg.CodeInstructModel = DefaultInstructModel
}
if _cfg.CodexMaxTokens == 0 {
_cfg.CodexMaxTokens = 500
}
if _cfg.ChatMaxTokens == 0 {
_cfg.ChatMaxTokens = 4096
}
return _cfg
}
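The two behaviours added to `readConfig` (an optional config path taken from the command line, and defaults for unset fields) can be isolated as below. This is a reduced sketch: the `settings` struct and both function names are stand-ins introduced here, and only the three defaulted fields from the diff are modelled.

```go
package main

import "fmt"

// settings is a reduced stand-in for the config struct in the diff.
type settings struct {
	CodeInstructModel string
	CodexMaxTokens    int
	ChatMaxTokens     int
}

// resolveConfigPath returns the first argument after the program name,
// or "config.json" when no argument is given.
func resolveConfigPath(args []string) string {
	if len(args) > 1 {
		return args[1]
	}
	return "config.json"
}

// applyDefaults fills zero-valued fields with the defaults used in the diff.
func applyDefaults(s *settings) {
	if s.CodeInstructModel == "" {
		s.CodeInstructModel = "gpt-3.5-turbo-instruct"
	}
	if s.CodexMaxTokens == 0 {
		s.CodexMaxTokens = 500
	}
	if s.ChatMaxTokens == 0 {
		s.ChatMaxTokens = 4096
	}
}

func main() {
	fmt.Println(resolveConfigPath([]string{"override"}))
	fmt.Println(resolveConfigPath([]string{"override", "/etc/override.json"}))

	s := settings{CodexMaxTokens: 1024}
	applyDefaults(&s)
	fmt.Printf("%+v\n", s)
}
```

One consequence of defaulting on the zero value is that an explicit `0` in the file cannot be distinguished from a missing field.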
@ -136,9 +163,8 @@ func closeIO(c io.Closer) {
}
type ProxyService struct {
cfg *config
client *http.Client
tokenizer *tiktoken.Tiktoken
cfg *config
client *http.Client
}
func NewProxyService(cfg *config) (*ProxyService, error) {
@ -147,21 +173,219 @@ func NewProxyService(cfg *config) (*ProxyService, error) {
return nil, err
}
tokenizer, err := tiktoken.EncodingForModel(InstructModel)
if nil != err {
return nil, err
}
return &ProxyService{
cfg: cfg,
client: client,
tokenizer: tokenizer,
cfg: cfg,
client: client,
}, nil
}
func AuthMiddleware(authToken string) gin.HandlerFunc {
return func(c *gin.Context) {
token := c.Param("token")
if token != authToken {
c.JSON(http.StatusUnauthorized, gin.H{"error": "Unauthorized"})
c.Abort()
return
}
c.Next()
}
}
func (s *ProxyService) InitRoutes(e *gin.Engine) {
e.POST("/v1/chat/completions", s.completions)
e.POST("/v1/engines/copilot-codex/completions", s.codeCompletions)
e.GET("/_ping", s.pong)
e.GET("/models", s.models)
e.GET("/v1/models", s.models)
authToken := s.cfg.AuthToken // replace with your dynamic value as needed
if authToken != "" {
// Authentication
v1 := e.Group("/:token/v1/", AuthMiddleware(authToken))
{
v1.POST("/chat/completions", s.completions)
v1.POST("/engines/copilot-codex/completions", s.codeCompletions)
v1.POST("/v1/chat/completions", s.completions)
v1.POST("/v1/engines/copilot-codex/completions", s.codeCompletions)
}
} else {
e.POST("/v1/chat/completions", s.completions)
e.POST("/v1/engines/copilot-codex/completions", s.codeCompletions)
e.POST("/v1/v1/chat/completions", s.completions)
e.POST("/v1/v1/engines/copilot-codex/completions", s.codeCompletions)
}
}
type Pong struct {
Now int `json:"now"`
Status string `json:"status"`
Ns1 string `json:"ns1"`
}
func (s *ProxyService) pong(c *gin.Context) {
c.JSON(http.StatusOK, Pong{
Now: time.Now().Second(),
Status: "ok",
Ns1: "200 OK",
})
}
func (s *ProxyService) models(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{
"data": []gin.H{
{
"capabilities": gin.H{
"family": "gpt-3.5-turbo",
"limits": gin.H{"max_prompt_tokens": 12288},
"object": "model_capabilities",
"supports": gin.H{"tool_calls": true},
"tokenizer": "cl100k_base",
"type": "chat",
},
"id": "gpt-3.5-turbo",
"name": "GPT 3.5 Turbo",
"object": "model",
"version": "gpt-3.5-turbo-0613",
},
{
"capabilities": gin.H{
"family": "gpt-3.5-turbo",
"limits": gin.H{"max_prompt_tokens": 12288},
"object": "model_capabilities",
"supports": gin.H{"tool_calls": true},
"tokenizer": "cl100k_base",
"type": "chat",
},
"id": "gpt-3.5-turbo-0613",
"name": "GPT 3.5 Turbo",
"object": "model",
"version": "gpt-3.5-turbo-0613",
},
{
"capabilities": gin.H{
"family": "gpt-4",
"limits": gin.H{"max_prompt_tokens": 20000},
"object": "model_capabilities",
"supports": gin.H{"tool_calls": true},
"tokenizer": "cl100k_base",
"type": "chat",
},
"id": "gpt-4",
"name": "GPT 4",
"object": "model",
"version": "gpt-4-0613",
},
{
"capabilities": gin.H{
"family": "gpt-4",
"limits": gin.H{"max_prompt_tokens": 20000},
"object": "model_capabilities",
"supports": gin.H{"tool_calls": true},
"tokenizer": "cl100k_base",
"type": "chat",
},
"id": "gpt-4-0613",
"name": "GPT 4",
"object": "model",
"version": "gpt-4-0613",
},
{
"capabilities": gin.H{
"family": "gpt-4-turbo",
"limits": gin.H{"max_prompt_tokens": 20000},
"object": "model_capabilities",
"supports": gin.H{"parallel_tool_calls": true, "tool_calls": true},
"tokenizer": "cl100k_base",
"type": "chat",
},
"id": "gpt-4-0125-preview",
"name": "GPT 4 Turbo",
"object": "model",
"version": "gpt-4-0125-preview",
},
{
"capabilities": gin.H{
"family": "gpt-4o",
"limits": gin.H{"max_prompt_tokens": 20000},
"object": "model_capabilities",
"supports": gin.H{"parallel_tool_calls": true, "tool_calls": true},
"tokenizer": "o200k_base",
"type": "chat",
},
"id": "gpt-4o",
"name": "GPT 4o",
"object": "model",
"version": "gpt-4o-2024-05-13",
},
{
"capabilities": gin.H{
"family": "gpt-4o",
"limits": gin.H{"max_prompt_tokens": 20000},
"object": "model_capabilities",
"supports": gin.H{"parallel_tool_calls": true, "tool_calls": true},
"tokenizer": "o200k_base",
"type": "chat",
},
"id": "gpt-4o-2024-05-13",
"name": "GPT 4o",
"object": "model",
"version": "gpt-4o-2024-05-13",
},
{
"capabilities": gin.H{
"family": "gpt-4o",
"limits": gin.H{"max_prompt_tokens": 20000},
"object": "model_capabilities",
"supports": gin.H{"parallel_tool_calls": true, "tool_calls": true},
"tokenizer": "o200k_base",
"type": "chat",
},
"id": "gpt-4-o-preview",
"name": "GPT 4o",
"object": "model",
},
{
"capabilities": gin.H{
"family": "text-embedding-ada-002",
"limits": gin.H{"max_inputs": 256},
"object": "model_capabilities",
"supports": gin.H{},
"tokenizer": "cl100k_base",
"type": "embeddings",
},
"id": "text-embedding-ada-002",
"name": "Embedding V2 Ada",
"object": "model",
"version": "text-embedding-ada-002",
},
{
"capabilities": gin.H{
"family": "text-embedding-3-small",
"limits": gin.H{"max_inputs": 256},
"object": "model_capabilities",
"supports": gin.H{"dimensions": true},
"tokenizer": "cl100k_base",
"type": "embeddings",
},
"id": "text-embedding-3-small",
"name": "Embedding V3 small",
"object": "model",
"version": "text-embedding-3-small",
},
{
"capabilities": gin.H{
"family": "text-embedding-3-small",
"object": "model_capabilities",
"supports": gin.H{"dimensions": true},
"tokenizer": "cl100k_base",
"type": "embeddings",
},
"id": "text-embedding-3-small-inference",
"name": "Embedding V3 small (Inference)",
"object": "model",
"version": "text-embedding-3-small",
},
},
"object": "list",
})
}
func (s *ProxyService) completions(c *gin.Context) {
@ -180,6 +404,25 @@ func (s *ProxyService) completions(c *gin.Context) {
model = s.cfg.ChatModelDefault
}
body, _ = sjson.SetBytes(body, "model", model)
if !gjson.GetBytes(body, "function_call").Exists() {
messages := gjson.GetBytes(body, "messages").Array()
for i, msg := range messages {
toolCalls := msg.Get("tool_calls").Array()
if len(toolCalls) == 0 {
body, _ = sjson.DeleteBytes(body, fmt.Sprintf("messages.%d.tool_calls", i))
}
}
lastIndex := len(messages) - 1
if !strings.Contains(messages[lastIndex].Get("content").String(), "Respond in the following locale") {
locale := s.cfg.ChatLocale
if locale == "" {
locale = "zh_CN"
}
body, _ = sjson.SetBytes(body, "messages."+strconv.Itoa(lastIndex)+".content", messages[lastIndex].Get("content").String()+"Respond in the following locale: "+locale+".")
}
}
body, _ = sjson.DeleteBytes(body, "intent")
body, _ = sjson.DeleteBytes(body, "intent_threshold")
body, _ = sjson.DeleteBytes(body, "intent_content")
@ -234,18 +477,14 @@ func (s *ProxyService) completions(c *gin.Context) {
_, _ = io.Copy(c.Writer, resp.Body)
}
func (s *ProxyService) countToken(token string) int {
if "" == token {
return 0
}
return len(s.tokenizer.Encode(token, nil, nil))
func contains(arr []string, str string) bool {
return strings.Contains(strings.Join(arr, ","), str)
}
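Because `contains` joins the slice and runs a substring search, any fragment of a model name (for example `DeepSeek`) also matches. If exact matching is what is wanted, which is an assumption about intent, a stricter helper could look like this sketch; `containsExact` is a name introduced here.

```go
package main

import "fmt"

// containsExact reports whether str is equal to one of the elements of arr.
func containsExact(arr []string, str string) bool {
	for _, s := range arr {
		if s == str {
			return true
		}
	}
	return false
}

func main() {
	models := []string{"deepseek-ai/DeepSeek-V2.5", "Qwen/Qwen2.5-Coder-7B-Instruct"}
	fmt.Println(containsExact(models, "deepseek-ai/DeepSeek-V2.5")) // true
	fmt.Println(containsExact(models, "DeepSeek"))                 // false
}
```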
func (s *ProxyService) codeCompletions(c *gin.Context) {
ctx := c.Request.Context()
time.Sleep(100 * time.Millisecond)
time.Sleep(200 * time.Millisecond)
if ctx.Err() != nil {
abortCodex(c, http.StatusRequestTimeout)
return
@ -257,33 +496,7 @@ func (s *ProxyService) codeCompletions(c *gin.Context) {
return
}
prompt := gjson.GetBytes(body, "prompt").String()
suffix := gjson.GetBytes(body, "suffix").String()
inputTokens := s.countToken(prompt)
suffixTokens := s.countToken(suffix)
outputTokens := int(gjson.GetBytes(body, "max_tokens").Int())
totalTokens := inputTokens + suffixTokens + outputTokens
if totalTokens > s.cfg.CodexMaxTokens { // reduce
left, right := 0, len(prompt)
for left < right {
mid := (left + right) / 2
subPrompt := prompt[mid:]
subInputTokens := s.countToken(subPrompt)
totalTokens = subInputTokens + suffixTokens + outputTokens
if totalTokens > s.cfg.CodexMaxTokens {
left = mid + 1
} else {
right = mid
}
}
body, _ = sjson.SetBytes(body, "prompt", prompt[left:])
}
body, _ = sjson.DeleteBytes(body, "extra")
body, _ = sjson.DeleteBytes(body, "nwo")
body, _ = sjson.SetBytes(body, "model", InstructModel)
body = ConstructRequestBody(body, s.cfg)
proxyUrl := s.cfg.CodexApiBase + "/completions"
req, err := http.NewRequestWithContext(ctx, http.MethodPost, proxyUrl, io.NopCloser(bytes.NewBuffer(body)))
@ -293,7 +506,7 @@ func (s *ProxyService) codeCompletions(c *gin.Context) {
}
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Authorization", "Bearer "+s.cfg.CodexApiKey)
req.Header.Set("Authorization", "Bearer " + getRandomApiKey(s.cfg.CodexApiKey))
if "" != s.cfg.CodexApiOrganization {
req.Header.Set("OpenAI-Organization", s.cfg.CodexApiOrganization)
}
@ -332,6 +545,68 @@ func (s *ProxyService) codeCompletions(c *gin.Context) {
_, _ = io.Copy(c.Writer, resp.Body)
}
// Pick one API key at random
func getRandomApiKey(paramStr string) string {
params := strings.Split(paramStr, ",")
rand.Seed(time.Now().UnixNano())
randomIndex := rand.Intn(len(params))
fmt.Println("Code completion API Key index:", randomIndex)
fmt.Println("Code completion API Key:", strings.TrimSpace(params[randomIndex]))
return strings.TrimSpace(params[randomIndex])
}
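A possible variant of the key picker is sketched below, under a few assumptions: it does not print the key, it skips empty entries such as a trailing comma, and it relies on the automatic seeding of `math/rand` in Go 1.20 and later instead of calling `rand.Seed` on every request. `splitKeys` and `pickKey` are names made up for this sketch.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// splitKeys splits a comma-separated key list, trims whitespace around each
// entry, and drops empty entries.
func splitKeys(raw string) []string {
	var keys []string
	for _, k := range strings.Split(raw, ",") {
		if k = strings.TrimSpace(k); k != "" {
			keys = append(keys, k)
		}
	}
	return keys
}

// pickKey returns one key chosen uniformly at random, or "" when the list
// contains no usable key.
func pickKey(raw string) string {
	keys := splitKeys(raw)
	if len(keys) == 0 {
		return ""
	}
	return keys[rand.Intn(len(keys))]
}

func main() {
	fmt.Println(len(splitKeys("sk-xxx, sk-xxx2,,sk-xxx3"))) // 3
	fmt.Println(pickKey("sk-only"))                        // sk-only
}
```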
func ConstructRequestBody(body []byte, cfg *config) []byte {
body, _ = sjson.DeleteBytes(body, "extra")
body, _ = sjson.DeleteBytes(body, "nwo")
body, _ = sjson.SetBytes(body, "model", cfg.CodeInstructModel)
if int(gjson.GetBytes(body, "max_tokens").Int()) > cfg.CodexMaxTokens {
body, _ = sjson.SetBytes(body, "max_tokens", cfg.CodexMaxTokens)
}
if strings.Contains(cfg.CodeInstructModel, StableCodeModelPrefix) {
return constructWithStableCodeModel(body)
} else if strings.HasPrefix(cfg.CodeInstructModel, DeepSeekCoderModel) || contains(SiliconflowModels, cfg.CodeInstructModel) {
if gjson.GetBytes(body, "n").Int() > 1 {
body, _ = sjson.SetBytes(body, "n", 1)
}
}
if strings.HasSuffix(cfg.ChatApiBase, "chat") {
// @Todo constructWithChatModel
// If the code base ends with chat, build a chat model; there is no good prompt for this yet
}
return body
}
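The clamping done in `ConstructRequestBody` (cap `max_tokens` at the configured limit and, for the DeepSeek and Siliconflow models, cap `n` at 1) can be illustrated with `encoding/json`. This is a sketch rather than the diff's sjson-based code: `clampCompletion` and its `singleChoice` flag are hypothetical, and the model checks are reduced to that flag.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clampCompletion caps max_tokens at limit and, when singleChoice is set,
// caps n at 1. Other fields are passed through unchanged.
func clampCompletion(body []byte, limit int, singleChoice bool) ([]byte, error) {
	var req map[string]any
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, err
	}
	// JSON numbers decode to float64 when the target is map[string]any.
	if v, ok := req["max_tokens"].(float64); ok && int(v) > limit {
		req["max_tokens"] = limit
	}
	if n, ok := req["n"].(float64); ok && singleChoice && n > 1 {
		req["n"] = 1
	}
	return json.Marshal(req)
}

func main() {
	out, err := clampCompletion([]byte(`{"max_tokens":2000,"n":3,"prompt":"x"}`), 500, true)
	fmt.Println(string(out), err)
	// prints {"max_tokens":500,"n":1,"prompt":"x"} <nil>
}
```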
func constructWithStableCodeModel(body []byte) []byte {
suffix := gjson.GetBytes(body, "suffix")
prompt := gjson.GetBytes(body, "prompt")
content := fmt.Sprintf("<fim_prefix>%s<fim_suffix>%s<fim_middle>", prompt, suffix)
// Create a new JSON object and add it to the body
messages := []map[string]string{
{
"role": "user",
"content": content,
},
}
return constructWithChatModel(body, messages)
}
func constructWithChatModel(body []byte, messages interface{}) []byte {
body, _ = sjson.SetBytes(body, "messages", messages)
// fmt.Printf("Request Body: %s\n", body)
// 2. Replace the escaped characters with the original characters
jsonStr := string(body)
jsonStr = strings.ReplaceAll(jsonStr, "\\u003c", "<")
jsonStr = strings.ReplaceAll(jsonStr, "\\u003e", ">")
return []byte(jsonStr)
}
func main() {
cfg := readConfig()
@ -351,4 +626,5 @@ func main() {
log.Fatal(err)
return
}
}


@ -0,0 +1,49 @@
' VBScript to change max tokens to 2048
MsgBox "It may take a few seconds to execute this script." & vbCrLf & vbCrLf & "Click 'OK' button and wait for the prompt of 'Done.' to pop up!"
Const ForReading = 1
Const ForWriting = 2
' Subpath of the file to be replaced
subpath = "dist\extension.js"
pattern = "\.maxPromptCompletionTokens\(([a-zA-Z0-9_]+),([0-9]+)\)"
replacement = ".maxPromptCompletionTokens($1,2048)"
' Iterate over all github copilot directories
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objShell = CreateObject("WScript.Shell")
Set colExtensions = objFSO.GetFolder(objShell.ExpandEnvironmentStrings("%USERPROFILE%") & "\.vscode\extensions").SubFolders
For Each objExtension In colExtensions
extension_path = objExtension.Path & "\" & subpath
If objFSO.FileExists(extension_path) Then
backupfile = extension_path & ".bak"
' Delete if backup file exists
If objFSO.FileExists(backupfile) Then
objFSO.DeleteFile backupfile, True
End If
' Backup
objFSO.CopyFile extension_path, backupfile
' Do search and replace with pattern
Set objFile = objFSO.OpenTextFile(extension_path, ForReading)
strContent = objFile.ReadAll
objFile.Close
Set objRegEx = New RegExp
objRegEx.Global = True
objRegEx.IgnoreCase = True
objRegEx.Pattern = pattern
strContent = objRegEx.Replace(strContent, replacement)
Set objFile = objFSO.OpenTextFile(extension_path, ForWriting)
objFile.Write strContent
objFile.Close
End If
Next
MsgBox "Max tokens modification completed"


@ -0,0 +1,30 @@
' VBScript to restore max tokens
MsgBox "It may take a few seconds to execute this script." & vbCrLf & vbCrLf & "Click 'OK' button and wait for the prompt of 'Done.' to pop up!"
Const ForReading = 1
Const ForWriting = 2
' Subpath of the file to be restored
subpath = "dist\extension.js"
' Iterate over all github copilot directories
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objShell = CreateObject("WScript.Shell")
Set colExtensions = objFSO.GetFolder(objShell.ExpandEnvironmentStrings("%USERPROFILE%") & "\.vscode\extensions").SubFolders
For Each objExtension In colExtensions
extension_path = objExtension.Path & "\" & subpath
backupfile = extension_path & ".bak"
If objFSO.FileExists(backupfile) Then
' Delete the extension file if it exists
If objFSO.FileExists(extension_path) Then
objFSO.DeleteFile extension_path, True
End If
' Replace
objFSO.MoveFile backupfile, extension_path
End If
Next
MsgBox "Restore max tokens to default succeeded"