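As is well known, the Google Gemini API officially launched at midnight on December 14, together with SDKs for several languages, including Python and Node.js. These SDKs greatly improve development efficiency, since developers no longer have to call the RESTful API by hand. But the real show has only just begun...

How to call the Gemini SDK, along with the parameters and responses of each Gemini API, is covered in the official Google Gemini documentation: https://ai.google.dev/docs?hl=zh-cn, so this article does not repeat it. The problem with calling the official SDK directly is that API calls fail from networks in mainland China, and the Gemini SDK provides no parameter like the OpenAI SDK's baseURL. So another way in is needed...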

RESTful API

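The most direct approach: since Google also provides a RESTful API, just hand-write the API calls... (If you would rather keep using an SDK than hand-write the calls, see the Node.js SDK section at the end.)

Without further ado, here is the final code: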

    async sendMessageWithAPI(payload: GeminiMessageParam, appId: string): Promise<any> {
      const { model, config } = payload;
      const geminiOptions = await this.getGeminiOptions(appId);
      // Build the request URL from the proxy base URL, the API key and the model name
      const reqURL = format(GEMINI_API_URL_MESSAGE, geminiOptions.baseURL, geminiOptions.apiKey, model);
      // Convert the common front-end message format to Gemini's contents format
      const messages: InputContent[] = payload.messages.map((msg) => ({
        role: msg.role,
        parts: [
          {
            text: msg.content
          }
        ]
      }));
      let geminiRes: GenerateContentResponse;
      try {
        // httpService.post returns an Observable; firstValueFrom (rxjs) turns it into a Promise
        geminiRes = (
          await firstValueFrom(
            this.httpService.post(
              reqURL,
              {
                contents: messages,
                generationConfig: config
              },
              {
                headers: {
                  'Content-Type': 'application/json'
                }
              }
            )
          )
        ).data;
      } catch (e) {
        const errorRes: GeminiErrorResponse = e.response?.data;
        throw new InternalServerErrorException({ ... });
      }
      const resText = geminiRes.candidates?.[0].content?.parts?.[0]?.text || '';
      const tokenParam = {
        baseURL: geminiOptions.baseURL,
        apiKey: geminiOptions.apiKey,
        model
      };
      // Optional: count input and output tokens with the countTokens API
      const inputTokens = await this.countTokensWithAPI(messages, tokenParam);
      const outputTokens = await this.countTokensWithAPI(
        [
          {
            role: 'model',
            parts: [
              {
                text: resText
              }
            ]
          }
        ],
        tokenParam
      );

      // Wrap the result in the unified format the front end expects
      return {
        created: Math.floor(Date.now() / 1000),
        model: payload.model,
        choices: [
          {
            message: {
              role: BotMessageRole.ASSISTANT,
              content: resText
            }
          }
        ],
        usage: {
          prompt_tokens: inputTokens.totalTokens,
          completion_tokens: outputTokens.totalTokens,
          total_tokens: inputTokens.totalTokens + outputTokens.totalTokens
        }
      };
    }

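In detail:

  1. The getGeminiOptions method retrieves parameters such as the API key and baseURL.
  2. GEMINI_API_URL_MESSAGE has the value $0/v1beta/models/$2:generateContent?key=$1, and the format method replaces the placeholders in the string with the given arguments. (Note: reading the source code later showed that even generateContent in the URL can be parameterized, under the name task.)
  3. messages goes through a format conversion that turns the message format the front end shares across models into the format the specific model requires.
  4. The firstValueFrom method converts an Observable into a Promise.
  5. The official SDK wraps the text lookup geminiRes.candidates?.[0].content?.parts?.[0]?.text in a text() method that handles null checks and error cases. In principle the same handling is needed here to avoid exceptions.
  6. The countTokensWithAPI method calls the countTokens API to count tokens; remove it if you don't need it.
  7. Finally, the API response is wrapped to fit the unified format the front end expects.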
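The helper steps in the list above can be sketched as a few pure functions. This is only an illustration under assumptions: the article does not show the bodies of format or of the message conversion, so the implementations below are guesses at their behavior. The ChatMessage, GeminiContent and GeminiResponseLike types, the toGeminiContents and extractText names, the assistant-to-model role mapping, and the $3 placeholder for the task segment are all hypothetical.

```typescript
// Hypothetical shapes; the article's own types (GeminiMessageParam, InputContent) are not shown.
interface ChatMessage {
  role: string;
  content: string;
}

interface GeminiContent {
  role: string;
  parts: { text: string }[];
}

interface GeminiResponseLike {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Replace $0, $1, $2, ... with the argument at that index.
// Placeholders without a matching argument are left untouched.
function format(template: string, ...args: string[]): string {
  return template.replace(/\$(\d+)/g, (match: string, index: string) => {
    const value = args[Number(index)];
    return value === undefined ? match : value;
  });
}

// Convert a common chat format to Gemini's contents format.
// Assumption: the front end uses the OpenAI-style role "assistant",
// which Gemini expects as "model"; other roles pass through unchanged.
function toGeminiContents(messages: ChatMessage[]): GeminiContent[] {
  return messages.map((msg) => ({
    role: msg.role === 'assistant' ? 'model' : msg.role,
    parts: [{ text: msg.content }]
  }));
}

// Null-safe text extraction: joins every text part of the first candidate
// and returns '' when the response carries no candidates or no parts.
function extractText(res: GeminiResponseLike): string {
  const parts = res.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? '').join('');
}

// Hypothetical template variant in which the task is a placeholder as well.
const GEMINI_API_URL_TASK = '$0/v1beta/models/$2:$3?key=$1';
```

With the task as a placeholder, the same template could serve generateContent and countTokens alike, for example format(GEMINI_API_URL_TASK, baseURL, apiKey, model, 'countTokens').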

Replace baseURL with your actual proxy address as needed. For the proxy itself, the following Nginx configuration can serve as a reference:

    server {
        listen 8888;
        server_name xx.xx.xx.xx;

        proxy_http_version 1.1;
        proxy_set_header Host generativelanguage.googleapis.com;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        location / {
            proxy_pass https://generativelanguage.googleapis.com;
            proxy_ssl_server_name on;
        }
    }

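Streaming (sendMessageStream) and the other endpoints work similarly and are not covered here; if you have questions, leave a comment or send a private message.

With that, the Gemini integration is complete, and you can happily start teasing Gemini. []~( ̄▽ ̄)~*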
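For readers who want a starting point for streaming over REST, here is a minimal sketch under assumptions. It relies on the streamGenerateContent task with alt=sse, where the server sends events whose data: lines each hold one JSON response chunk. The function names buildStreamURL and parseSSELines and the StreamChunk type are hypothetical, and the parser expects complete lines, so a real client would still have to buffer partial lines across network chunks.

```typescript
// Hypothetical minimal shape of one streamed chunk.
interface StreamChunk {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Same URL layout as generateContent, with a different task and alt=sse.
function buildStreamURL(baseURL: string, apiKey: string, model: string): string {
  const key = encodeURIComponent(apiKey);
  return `${baseURL}/v1beta/models/${model}:streamGenerateContent?alt=sse&key=${key}`;
}

// Extract the text carried by complete SSE lines.
// Lines that do not start with "data:" (blank separators, comments) are ignored.
function parseSSELines(lines: string[]): string[] {
  const texts: string[] = [];
  for (const line of lines) {
    if (!line.startsWith('data:')) {
      continue;
    }
    const payload = line.slice('data:'.length).trim();
    if (payload === '') {
      continue;
    }
    const chunk = JSON.parse(payload) as StreamChunk;
    const parts = chunk.candidates?.[0]?.content?.parts ?? [];
    texts.push(parts.map((part) => part.text ?? '').join(''));
  }
  return texts;
}
```

Each returned string is one incremental piece of the reply, so the caller can forward the pieces to the front end as they arrive and join them at the end for token counting.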

Node.js SDK

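Only at the very end of development did GitHub's fork feature come to mind: why not fork a new version and add a baseURL parameter to the SDK myself? So I forked Google's official GitHub repository and added baseURL to the SDK.

New npm package name: @fuyun/generative-ai

npm package: https://www.npmjs.com/package/@fuyun/generative-ai

Repository: https://github.com/ifuyun/generative-ai-js

Calling the Node.js SDK (v0.2.0):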

    async getGeminiInstance(appId: string) {
      const geminiOptions = await this.getGeminiOptions(appId);
      return {
        geminiOptions,
        geminiInstance: new GoogleGenerativeAI(geminiOptions.apiKey)
      };
    }

Getting the model:

    const { geminiInstance, geminiOptions } = await this.getGeminiInstance(appId);
    const geminiModel = geminiInstance.getGenerativeModel(
      {
        model,
        generationConfig: config
      },
      {
        baseURL: geminiOptions.baseURL
      }
    );
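Once the model object exists, sending a prompt should follow the upstream SDK's pattern of calling generateContent and then reading response.text(). The sketch below assumes the fork keeps those upstream method names. To stay self-contained it uses a structural interface, TextModel, instead of importing the package; TextModel and askGemini are hypothetical names.

```typescript
// Structural subset of the SDK's model object, so the sketch compiles
// without the package and also accepts a stub in tests.
interface TextModel {
  generateContent(prompt: string): Promise<{ response: { text(): string } }>;
}

// Send a single prompt and return the reply text.
// text() is the SDK helper mentioned earlier that performs the null and error checks.
async function askGemini(model: TextModel, prompt: string): Promise<string> {
  const result = await model.generateContent(prompt);
  return result.response.text();
}
```

In the service above, this would be called as askGemini(geminiModel, 'Hello'), provided geminiModel satisfies the interface.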

v0.1.3 and earlier:

    async getGeminiInstance(appId: string) {
      const geminiOptions = await this.getGeminiOptions(appId);

      return new GoogleGenerativeAI(geminiOptions.apiKey, geminiOptions.baseURL);
    }

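Note: when calling the SDK, if baseURL is empty (not passed), the official URL is still used by default.

Also, the reason for forking rather than opening a PR or an issue is that there's no telling how long a PR or issue would take. Perhaps Google will add baseURL in a later SDK release; if so, I'll switch back then... ┓( ´∀` )┏

With that, Gemini is integrated successfully both ways, via the RESTful API and via the SDK. Cheers once more! []~( ̄▽ ̄)~*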
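The fallback described in the note can be pictured roughly as follows. This is an illustrative sketch, not the fork's actual code: resolveBaseURL and DEFAULT_BASE_URL are hypothetical names, and trimming trailing slashes is an extra convenience that the note does not mention.

```typescript
// The official endpoint, as used in the Nginx proxy_pass setting above.
const DEFAULT_BASE_URL = 'https://generativelanguage.googleapis.com';

// Return the caller's base URL, or the official one when none is given.
// Trailing slashes are removed so that path concatenation stays predictable.
function resolveBaseURL(baseURL?: string): string {
  const trimmed = (baseURL ?? '').trim().replace(/\/+$/, '');
  return trimmed || DEFAULT_BASE_URL;
}
```

Under this sketch, both an omitted argument and an empty string end up at the official URL, which matches the behavior the note describes.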