AWS Official Blog

Build Your Claude 3 Opus Intelligent Assistant on Amazon Bedrock

Recently, Anthropic released its latest family of large models, Claude 3. As of this writing, Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku are all available on Amazon Bedrock; see the announcements "Anthropic's Claude 3 Sonnet model is now available on Amazon Bedrock" and "Anthropic's Claude 3 Opus model now available on Amazon Bedrock" for details. As Amazon Bedrock offers more and more large models, you can put them to work in your own scenarios to extend your business capabilities.

In this post, we build a front-end example based on the open-source project ChatGPT-Next-Web and deploy it with the AWS CDK, to demonstrate how to connect your AI chat assistant to Amazon Bedrock, including preset prompts for quickly asking scenario-specific questions, multimodal input, and streaming output (typewriter effect).

The solution mainly uses the following services and components:

  • Amazon Elastic Container Service (Amazon ECS) is a fully managed container orchestration service that helps you deploy, manage, and scale containerized applications more efficiently.
  • Amazon Elastic Container Registry (Amazon ECR) is a fully managed container registry offering high-performance hosting, so you can reliably deploy application images and artifacts anywhere.
  • Amazon Bedrock is a fully managed service that offers high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API, along with a broad set of capabilities for building generative AI applications with security, privacy, and responsible AI.
  • Amazon Cognito provides customer identity and access management (CIAM) and handles user sign-in for the application.

Prerequisites:

  • You need an Amazon Web Services Global account with access to one of the following Regions: US East (N. Virginia, us-east-1), US West (Oregon, us-west-2), Asia Pacific (Tokyo, ap-northeast-1), or Asia Pacific (Singapore, ap-southeast-1), and you must be able to request access to Claude 3 in the Amazon Bedrock console; see Amazon Bedrock endpoints and quotas – AWS General Reference. If you need Claude 3 Opus, note that as of this writing it is only available in US West (Oregon, us-west-2).
  • You need the Access Key and Secret Access Key for an account in one of the Regions above, and the account must have sufficient IAM permissions to call Amazon Bedrock.
  • You need a computer or EC2 instance with Docker, Node.js (v18 or later), and the AWS SDK installed, with the Access Key and Secret Access Key above configured via aws configure, and with all of the above up and running (a quick verification sketch follows this list).
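
Before deploying, you can quickly verify that the configured credentials can reach Amazon Bedrock in the target Region. Below is a minimal sketch using the AWS SDK for JavaScript v3 (@aws-sdk/client-bedrock); the Region and provider filter are only examples.

import { BedrockClient, ListFoundationModelsCommand } from "@aws-sdk/client-bedrock";

// Lists the Anthropic foundation models visible to your credentials in the given Region.
async function checkBedrockAccess(region = "us-west-2") {
  const client = new BedrockClient({ region }); // picks up credentials from `aws configure`
  const res = await client.send(new ListFoundationModelsCommand({ byProvider: "Anthropic" }));
  console.log(res.modelSummaries?.map((m) => m.modelId));
}

checkBedrockAccess().catch(console.error);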

Demo

Architecture

Overview:

  1. The project code is implemented with Next.js (a server-side rendering framework for React.js).
  2. The Next.js UI layer handles page rendering, UI logic, requests to the server side, and preset prompts and model parameters.
  3. The Next.js API layer exposes APIs for the UI layer to call; the server layer assembles the parameters for the model request, implements the agent, and calls Amazon Bedrock through the AWS SDK for JavaScript.
  4. Amazon Cognito handles authentication and is embedded with the UI layer code; for the implementation see Build & connect backend – JavaScript – AWS Amplify Documentation. You can also modify the code in app/components/home.tsx to change the global rendering logic and plug in a different authentication scheme (a minimal sketch follows this list).
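
A minimal sketch of that integration, assuming Amplify v6-style configuration and the withAuthenticator component from @aws-amplify/ui-react; the actual wiring should follow the Amplify documentation linked above and the structure of your home.tsx.

// app/components/home.tsx (sketch) -- force a Cognito sign-in before the chat UI renders.
import { Amplify } from "aws-amplify";
import { withAuthenticator } from "@aws-amplify/ui-react";
import "@aws-amplify/ui-react/styles.css";

Amplify.configure({
  Auth: {
    Cognito: {
      userPoolId: process.env.NEXT_PUBLIC_USER_POOL_ID!,
      userPoolClientId: process.env.NEXT_PUBLIC_USER_POOL_CLIENT_ID!,
    },
  },
});

function Home() {
  return <div>{/* original page content of home.tsx */}</div>;
}

export default withAuthenticator(Home);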

Request flow:

  1. The user opens the deployed site in a browser (exposed through an NLB).
  2. Traffic first reaches Amazon ECS; if the user is not signed in, it is redirected to the Amazon Cognito sign-in page. After a successful sign-in, traffic goes back to Amazon ECS and the normal page is returned.
  3. After the user enters a question, the JS code in the UI layer sends the conversation history and parameter configuration to the server-side API, which assembles the parameters and calls Amazon Bedrock (the shape of this request body is sketched after this list).
  4. When the response arrives, the body of the Bedrock response is converted directly into a ReadableStream and returned to the UI layer, which implements the streaming response and rendering (typewriter) logic.
  5. If you choose to compress the history context in the settings, after each successful response the UI layer automatically sends the chat context to the server and asks the model to summarize it.
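
For reference, the JSON body that the UI layer sends to the server API in steps 3-4 can be described roughly as follows; the field names mirror the ones read by the server-side handler shown later in this post.

interface ClaudeChatRequestBody {
  model: string; // e.g. "anthropic-claude3-sonnet", used to pick the Bedrock modelId
  messages: { role: "system" | "user" | "assistant"; content: any }[]; // chat history
  temperature: number;
  max_tokens: number;
  top_p: number;
  top_k: number;
}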

Making requests to Bedrock

UI layer changes

The open-source project does not support Claude 3, or Claude 3 hosted on Amazon Bedrock, out of the box, so the code has to be modified manually.

First, add the Claude 3 models in app/constant.ts.

export enum ServiceProvider {
  Claude = "Claude",
  ......
}

export enum ModelProvider {
  AmazonClaude = "AmazonClaude",
  ......
}

export const SUMMARIZE_MODEL = "anthropic-claude3-sonnet";

export const AmazonPath = {
  ChatPath: "v1/chat/completions",
};

export const DEFAULT_MODELS = [
  {
    name: "Claude 3 Opus",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude3-opus",
    },
  },
  {
    name: "Claude 3 Sonnet",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude3-sonnet",
    },
  },
  {
    name: "Claude 3 Haiku",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude3-haiku",
    },
  },
  {
    name: "Claude 2.1",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude21",
    },
  },
  {
    name: "Claude 2",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude2",
    },
  },
] as const;

Create claude.ts under app/client/platforms; its logic can follow the other files in that folder (a minimal sketch is shown below).
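
The following is a minimal sketch of that file, assuming the LLMApi / ChatOptions / LLMUsage / LLMModel types exported from app/client/api.ts (the exact interface in your fork may differ); the existing files such as openai.ts show the full streaming and error handling.

// app/client/platforms/claude.ts -- sketch only; mirror openai.ts in the same folder.
import { ChatOptions, LLMApi, LLMModel, LLMUsage } from "../api";
import { AmazonPath } from "../../constant";

export class ClaudeApi extends LLMApi {
  async chat(options: ChatOptions): Promise<void> {
    const config = options.config as any; // assumed to carry temperature / top_p / top_k / max_tokens
    const body = JSON.stringify({
      model: config.model,
      messages: options.messages,
      temperature: config.temperature,
      top_p: config.top_p,
      top_k: config.top_k,
      max_tokens: config.max_tokens,
    });
    // POST to the Next.js server route created in the "Server layer" step below.
    const res = await fetch(`/api/claude/${AmazonPath.ChatPath}`, { method: "POST", body });
    // Streaming (typewriter) handling is shown in the "Streaming output" section;
    // for brevity this sketch just returns the full text.
    options.onFinish(await res.text());
  }

  async usage(): Promise<LLMUsage> {
    return { used: 0, total: 0 };
  }

  async models(): Promise<LLMModel[]> {
    return [];
  }
}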

[Optional] Because the project does not support Claude-specific parameters such as top_p and top_k by default, the settings screens need the following changes.

app/components/model-config.tsx
<ListItem
        title={Locale.Settings.TopP.Title}
        subTitle={Locale.Settings.TopP.SubTitle}
      >
        <InputRange
          value={(props.modelConfig.top_p ?? 1).toFixed(1)}
          min="0"
          max="1"
          step="0.1"
          onChange={(e) => {
            props.updateConfig(
              (config) =>
                (config.top_p = ModalConfigValidator.top_p(
                  e.currentTarget.valueAsNumber
                ))
            );
          }}
        ></InputRange>
      </ListItem>
      {/* Added */}
      <ListItem
        title={Locale.Settings.TopK.Title}
        subTitle={Locale.Settings.TopK.SubTitle}
      >
        <InputRange
          value={(props.modelConfig.top_k ?? 250).toFixed(1)}
          min="0"
          max="500"
          step="1"
          onChange={(e) => {
            props.updateConfig(
              (config) =>
                (config.top_k = ModalConfigValidator.top_k(
                  e.currentTarget.valueAsNumber
                ))
            );
          }}
        ></InputRange>
      </ListItem>
app/store/config.ts
......

modelConfig: {
    model: "Claude 3 Sonnet" as ModelType,
    temperature: 0.5,
    top_p: 0.9,
    top_k: 250,
    max_tokens: 2048,
    presence_penalty: 0,
    frequency_penalty: 0,
    sendMemory: true,
    historyMessageCount: 4,
    compressMessageLengthThreshold: 1000,
    enableInjectSystemPrompts: true,
    template: DEFAULT_INPUT_TEMPLATE,
  },
  
......

export const ModalConfigValidator = {
  ......
  temperature(x: number) {
    return limitNumber(x, 0, 2, 1);
  },
  top_p(x: number) {
    return limitNumber(x, 0, 1, 1);
  },
  // Added
  top_k(x: number) {
    return limitNumber(x, 0, 500, 1);
  },
};
app/client/api.ts
export interface LLMConfig {
  model: string;
  temperature?: number;
  top_p?: number;
  stream?: boolean;
  presence_penalty?: number;
  frequency_penalty?: number;
  top_k?: number;
  max_token?: number;
}

Server layer

Create the API: under app/api, create two levels of folders, claude/[...path], then create route.ts under app/api/claude/[...path] to receive the request information passed in from the UI layer (a minimal sketch follows).
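
A minimal sketch of that route file, which simply forwards the request to the handler created in the next step (the import path assumes the claudeServices.ts file created below):

// app/api/claude/[...path]/route.ts
import { NextRequest } from "next/server";
import { requestAmazonClaude } from "../../claudeServices";

export async function POST(req: NextRequest) {
  return requestAmazonClaude(req);
}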

Create the request-handling logic: create app/api/claudeServices.ts, referring to the AWS SDK for JavaScript v3.

import { AWSBedrockAnthropicStream, AWSBedrockStream } from "ai";
import { NextRequest, NextResponse } from "next/server";
import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} from "@aws-sdk/client-bedrock-runtime";

export const requestAmazonClaude = async (req: NextRequest) => {
  const controller = new AbortController();
  const timeoutId = setTimeout(
    () => {
      controller.abort();
    },
    10 * 60 * 1000 // 10-minute timeout
  );
  const requestBodyStr = await streamToString(req.body);
  const requestBody = JSON.parse(requestBodyStr);
  // Create the Bedrock runtime client
  const bedrockruntime = new BedrockRuntimeClient(AWS_PARAM);
  let response;
  // Send the request to Bedrock
  try {
    response = await bedrockResponse(
      bedrockruntime,
      requestBody.model.includes(CLAUDE3_KEY)
        ? requestBody.messages
        : getClaude2Prompt(requestBody.messages),
      requestBody.temperature,
      toInt(requestBody.max_tokens, 8192),
      requestBody.top_p,
      requestBody.top_k,
      requestBody.model
    );
  } finally {
    clearTimeout(timeoutId);
  }
  if (typeof response === "string") {
    return new NextResponse(response, {
      headers: {
        "Content-Type": "text/plain",
      },
    });
  }
  // Handle the Claude 3 response: https://sdk.vercel.ai/docs/guides/providers/aws-bedrock
  if (requestBody.model.includes(CLAUDE3_KEY)) {
    const stream = AWSBedrockStream(
      response,
      undefined,
      (chunk) => chunk.delta?.text
    );
    return new NextResponse(stream, {
      headers: {
        "Content-Type": "application/octet-stream",
      },
    });
  }
  // Handle the Claude 2 response
  const stream = AWSBedrockAnthropicStream(response);

  return new NextResponse(stream, {
    headers: {
      "Content-Type": "application/octet-stream",
    },
  });
};


// Logic that assembles the model-specific request parameters
const bedrockResponse = async (
  bedrockruntime: { send: (arg0: any) => any },
  bodyData: any,
  temperature: number,
  max_tokens_to_sample: number,
  top_p: number,
  top_k: number,
  model: string
) => {
  if (model.includes(CLAUDE2_KEY)) {
    const requestBodyPrompt = {
      prompt: bodyData,
      max_tokens_to_sample,
      temperature, // Controls randomness/creativity (0.1-1, 0.3-0.5 recommended); too low means little creativity and small input errors or ambiguity can skew the output
      top_p, // Nucleus sampling: sample from the smallest set of tokens whose cumulative probability exceeds top_p
      top_k, // Sample only from the top_k most likely tokens; larger values give more varied output, smaller values give more deterministic output
    };
    console.log("<----- Request Claude2 Body -----> ", requestBodyPrompt);
    const command = new InvokeModelWithResponseStreamCommand({
      body: JSON.stringify(requestBodyPrompt),
      modelId:
        model === "anthropic-claude21"
          ? "anthropic.claude-v2:1"
          : "anthropic.claude-v2",
      contentType: "application/json",
      accept: "application/json",
    });
    const response = await bedrockruntime.send(command);
    return response;
  }
  if (model.includes(CLAUDE3_KEY)) {
    const requestBody = await changeMsgToStand(bodyData, true);
    const requestBodyPrompt = {
      anthropic_version:
        (ClaudeEnum as any)[model]?.knowledge_date ||
        ClaudeEnum["anthropic-claude3-sonnet"].knowledge_date,
      messages: requestBody,
      max_tokens: max_tokens_to_sample,
      temperature,
      top_p,
      top_k,
    };
    console.log("<----- Request Claude3 Body -----> ", requestBodyPrompt);
    const command = new InvokeModelWithResponseStreamCommand({
      body: JSON.stringify(requestBodyPrompt),
      modelId:
        (ClaudeEnum as any)[model]?.model_id ||
        ClaudeEnum["anthropic-claude3-sonnet"].model_id,
      contentType: "application/json",
      accept: "application/json",
    });
    const response = await bedrockruntime.send(command);
    // Note: Claude 3's streamed response format differs from Claude 2's; handle it accordingly
    return response;
  }
};


export const ClaudeEnum = {
  "anthropic-claude3-sonnet": {
    knowledge_date: "bedrock-2023-05-31",
    model_id: "anthropic.claude-3-sonnet-20240229-v1:0",
  },
  "anthropic-claude3-haiku": {
    knowledge_date: "bedrock-2023-05-31",
    model_id: "anthropic.claude-3-haiku-20240307-v1:0",
  },
  "anthropic-claude3-opus": {
    knowledge_date: "bedrock-2023-05-31",
    model_id: "anthropic.claude-3-opus-20240229-v1:0",
  },
};

export const CLAUDE3_KEY = "anthropic-claude3";
export const CLAUDE2_KEY = "anthropic-claude2";
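
The handler above also relies on a few helpers that are not shown (streamToString, toInt, AWS_PARAM, getClaude2Prompt, changeMsgToStand). Below is a minimal, hedged sketch of the two prompt-shaping helpers, assuming the UI sends OpenAI-style messages; your real implementations may differ.

// Sketches only; names follow the handler above.
type ChatMessage = { role: "system" | "user" | "assistant"; content: any };

// Claude 2 expects a single text-completion prompt ending with "\n\nAssistant:".
export const getClaude2Prompt = (messages: ChatMessage[]) =>
  messages
    .map((m) =>
      m.role === "assistant"
        ? `\n\nAssistant: ${m.content}`
        : `\n\nHuman: ${m.content}`
    )
    .join("") + "\n\nAssistant:";

// Claude 3 expects the Anthropic Messages API shape.
export const changeMsgToStand = async (
  messages: ChatMessage[],
  _supportImage: boolean // image parts are converted as shown in the multimodal section below
) =>
  messages
    .filter((m) => m.role !== "system") // system prompts belong in a top-level `system` field
    .map((m) => ({
      role: m.role,
      content:
        typeof m.content === "string"
          ? [{ type: "text", text: m.content }]
          : m.content,
    }));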

Streaming output

Server side: when the response arrives, do not parse it directly. The logic is as follows:

  1. Call the InvokeModelWithResponseStream API to send the request to Amazon Bedrock.
  2. When you receive the response, do not parse its body (a stream) right away; otherwise the intermediate server layer still waits serially and performance suffers. If you need the whole response body at once, use the InvokeModel API instead. Note: the response body type differs across the models on Amazon Bedrock, and the Claude 2 and Claude 3 responses differ as well. For details, see the AWS SDK for JavaScript v3 documentation.
  3. The response body needs to be passed through as a stream to the browser. You can either parse the body into a stream manually (a sketch is given at the end of this section) or use a third-party library.
  4. We recommend the ai package on npm (AWS Bedrock – Vercel AI SDK), but some cases (such as the Claude 3 response) still need manual handling. For example:
    // Server
    
    import { AWSBedrockAnthropicStream, AWSBedrockStream } from "ai";
    
    ......
    
    // Claude 2/2.1
    const claude2Stream = AWSBedrockAnthropicStream(claude2Response);
    return new NextResponse(claude2Stream, {
      headers: {
        "Content-Type": "application/octet-stream",
      },
    });
    
    // Claude 3 uses a different response format
    const claude3Stream = AWSBedrockStream(
          claude3Response,
          undefined,
          (chunk) => chunk.delta?.text
        );
    return new NextResponse(claude3Stream, {
      headers: {
        "Content-Type": "application/octet-stream",
      },
    });
    
    // Other models on Bedrock are handled similarly

    UI layer (receiving the streamed response)

    ......
            fetchEventSource(chatPath, {
              ...chatPayload,
              async onopen(res) {
                clearTimeout(requestTimeoutId);
                const contentType = res.headers.get("content-type");
                console.log(
                  "[Claude] request response content type: ",
                  contentType
                );
                if (contentType?.startsWith("text/plain")) {
                  responseText = await res.clone().text();
                  return finish();
                }
                if (
                  res.body &&
                  contentType?.startsWith("application/octet-stream")
                ) {
                  try {
                    const reader = res.clone().body?.getReader();
                    const decoder = new TextDecoder("utf-8");
                    let result = (await reader?.read()) || { done: true };
                    while (!result.done) {
                      responseText += decoder.decode(result.value, {
                        stream: true,
                      });
                      continueMsg();
                      try {
                        result = (await reader?.read()) || { done: true };
                      } catch {
                        break;
                      }
                    }
                  } finally {
                  }
                  return finish();
                }
              
    ......
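
If you prefer not to depend on the ai package, the server layer can also convert the Bedrock response body into a ReadableStream by hand. The following is a rough sketch for the Claude 3 (Messages API) event stream, based on the InvokeModelWithResponseStream response shape in the AWS SDK for JavaScript v3; the returned stream can be passed to NextResponse exactly like the streams above.

import type { InvokeModelWithResponseStreamCommandOutput } from "@aws-sdk/client-bedrock-runtime";

// Each event in response.body carries chunk.bytes (a JSON document encoded as bytes);
// for Claude 3 we only forward the text of content_block_delta events.
export const claude3BodyToStream = (
  response: InvokeModelWithResponseStreamCommandOutput
) => {
  const decoder = new TextDecoder("utf-8");
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const event of response.body ?? []) {
        const bytes = event.chunk?.bytes;
        if (!bytes) continue;
        const parsed = JSON.parse(decoder.decode(bytes));
        if (parsed.type === "content_block_delta" && parsed.delta?.text) {
          controller.enqueue(encoder.encode(parsed.delta.text));
        }
      }
      controller.close();
    },
  });
};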

Amazon Bedrock limits

Amazon Bedrock allows up to 60 requests per minute per account per Region. If you need a higher request rate, contact AWS Support to help raise the limit. For details, see Quotas for Amazon Bedrock – Amazon Bedrock.

If you only want to throttle requests locally, you can do so with an Nginx reverse proxy, NLB request limits, and so on. Below we implement the limit with Next.js middleware code:

Create middleware.ts in the project root.

You can refer to the official Next.js documentation to implement API interception, request limiting, and similar features: Routing: Middleware | Next.js

import { NextRequest, NextResponse } from "next/server";

import { rateLimit } from "./app/utils/rate-limiter";

export async function middleware(req: NextRequest) {
  const path = req.nextUrl.pathname;
  // Only throttle the /api/claude routes
  if (path.includes("api/claude/")) {
    try {
      // At most 60 requests per minute: Bedrock allows 60 requests per minute by default;
      // open a support case to raise the limit if needed, see
      // https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase
      const result = await rateLimit(60, 60 * 1000);
      if (!result.success) {
        return NextResponse.json(
          { error: "Too many requests, please try again later." },
          { status: 429 }
        );
      }
    } catch (err) {
      console.error("Error rate limiting", err);
      return NextResponse.json(
        { error: "Error rate limiting" },
        { status: 500 }
      );
    }
  }

  return NextResponse.next();
}

export const config = {
  matcher: "/api/claude/:path*",
};
app/utils/rate-limiter.ts

// Simple in-memory counter; note that it is per server instance, so it is an
// approximation rather than a distributed rate limit.
let requestCount = 0;
let lastResetTime = 0;

export async function rateLimit(limit: number, duration: number) {
  const now = Date.now();
  // If more than `duration` has passed since the last reset, reset the counter
  if (now - lastResetTime > duration) {
    requestCount = 1;
    lastResetTime = now;
  } else {
    requestCount++;
  }
  if (requestCount > limit) {
    return { success: false, count: requestCount };
  }
  return { success: true, count: requestCount };
}

Multimodal support

All Claude 3 models currently support image understanding; see the official documentation: https://docs.anthropic.com/claude/docs/vision

[Code change]: In the isVisionModel function in app/utils.ts, adjust the check that decides whether a model supports images, to unlock the project's multimodal support. For example:
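
A minimal sketch of that change, assuming the existing function keeps its keyword-matching style (the actual logic in your fork may differ):

// app/utils.ts -- treat Claude 3 models as vision-capable so the UI allows image input.
export function isVisionModel(model: string) {
  return (
    model.includes("vision") || // existing behaviour for vision-capable GPT models
    model.includes("anthropic-claude3") // all Claude 3 models on Bedrock accept images
  );
}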

Note: in this project, the type/object name used for base64 images in a message is image_url, which differs from Claude 3's type name image, so you need to convert it manually. For example:
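
A sketch of that conversion, assuming the image is stored as a base64 data URL of the form data:<media_type>;base64,<data>:

// Convert one OpenAI-style image_url content part into the Claude 3 image block.
const imageUrlPartToClaude = (part: { type: "image_url"; image_url: { url: string } }) => {
  const [meta, data] = part.image_url.url.split(",");
  const media_type = meta.replace("data:", "").replace(";base64", "");
  return {
    type: "image",
    source: { type: "base64", media_type, data },
  };
};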

For other file types such as PDF, you can use a third-party library (for example react-pdf: react-pdf – npm) to convert them into images in the browser. For performance and security reasons, we also recommend doing the file parsing and conversion in the browser rather than on the Next.js server side.

When building with react-pdf, make sure your build environment has the correct canvas version (or build in an environment with a graphical interface), and that next.config.js is configured as described in its npm documentation. A sketch of the conversion follows.
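
A browser-side sketch of that conversion, assuming pdfjs as bundled by react-pdf (the worker file name depends on your pdfjs-dist version); the resulting data URL can then be sent to Claude 3 like any other image:

import { pdfjs } from "react-pdf";

// Worker setup per the react-pdf documentation; adjust the file name to your pdfjs-dist version.
pdfjs.GlobalWorkerOptions.workerSrc = new URL(
  "pdfjs-dist/build/pdf.worker.min.mjs",
  import.meta.url
).toString();

// Render the first page of an uploaded PDF to a PNG data URL in the browser.
export async function pdfFirstPageToDataUrl(file: File): Promise<string> {
  const pdf = await pdfjs.getDocument({ data: await file.arrayBuffer() }).promise;
  const page = await pdf.getPage(1);
  const viewport = page.getViewport({ scale: 1.5 });
  const canvas = document.createElement("canvas");
  canvas.width = viewport.width;
  canvas.height = viewport.height;
  await page.render({ canvasContext: canvas.getContext("2d")!, viewport }).promise;
  return canvas.toDataURL("image/png");
}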

Deployment

You can use the following CDK code, or run the application on EC2 with PM2.

export interface ApplicationLoadBalancerProps {
  readonly internetFacing: boolean;
}

export interface NetworkProps {
  readonly vpc: IVpc;
  readonly subnets?: SubnetSelection;
}

export interface AiChatECSProps {
  readonly networkProps: NetworkProps;
  readonly region: string;
  readonly accountId: string;
  readonly cognitoInfo: any;
  readonly accessKey?: string;
  readonly secretAccessKey?: string;
  readonly bucketName?: string;
  readonly bucketArn?: string;
  readonly partition: string;
}


export class EcsStack extends Construct {
  readonly securityGroup: ISecurityGroup;
  public ecsIP: string;

  constructor(scope: Construct, id: string, props: AiChatECSProps) {
    super(scope, id);
    const repositoryName = "commonchatui-docker-repo";

    // create ecr
    const repository = new Repository(this, "CommonchatuiRepository", {
      repositoryName: "commonchatui-docker-repo",
      imageTagMutability: TagMutability.MUTABLE,
      removalPolicy: RemovalPolicy.DESTROY,
    });
    // deploy front docker image
    const ui_image = new DockerImageAsset(this, "commonchatui_image", {
      directory: path.join(__dirname, "your_next_app_dic"), // change to the directory of your front-end application code
      file: "Dockerfile",
      platform: Platform.LINUX_AMD64,
    });
    const imageTag = "latest";
    const dockerImageUri = `${props.accountId}.dkr.ecr.${props.region}.amazonaws.com/${repositoryName}:${imageTag}`;

    // upload front docker image to ecr
    const ecrDeploy = new ECRDeployment(this, "commonchat_image_deploy", {
      src: new DockerImageName(ui_image.imageUri),
      dest: new DockerImageName(dockerImageUri),
    });

    new CfnOutput(this, "ECRRepositories", {
      description: "ECR Repositories",
      value: ecrDeploy.node.addr,
    }).overrideLogicalId("ECRRepositories");

    new CfnOutput(this, "ECRImageUrl", {
      description: "ECR image url",
      value: dockerImageUri,
    }).overrideLogicalId("ECRImageUrl");

    // create ecs security group
    this.securityGroup = new SecurityGroup(this, "commonchat_sg", {
      vpc: props.networkProps.vpc,
      description: "Common chart security group",
      allowAllOutbound: true,
    });
    // add ingress rules
    this.securityGroup.addIngressRule(
      Peer.anyIpv4(),
      Port.tcp(443),
      "Default ui 443 port"
    );
    this.securityGroup.addIngressRule(
      Peer.anyIpv4(),
      Port.tcp(80),
      "Default ui 80 port"
    );
    // add endpoint
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCECREP", {
      service: InterfaceVpcEndpointAwsService.ECR,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCECRDockerEP", {
      service: InterfaceVpcEndpointAwsService.ECR_DOCKER,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCLogEP", {
      service: InterfaceVpcEndpointAwsService.CLOUDWATCH_LOGS,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });

    props.networkProps.vpc.addGatewayEndpoint("CommonChatVPCS3", {
      service: GatewayVpcEndpointAwsService.S3,
    });
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCLogECS", {
      service: InterfaceVpcEndpointAwsService.ECS,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCLogECSAgent", {
      service: InterfaceVpcEndpointAwsService.ECS_AGENT,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });
    props.networkProps.vpc.addInterfaceEndpoint(
      "CommonChatVPCLogECSTelemetry",
      {
        service: InterfaceVpcEndpointAwsService.ECS_TELEMETRY,
        privateDnsEnabled: true,
        securityGroups: [this.securityGroup],
      }
    );

    // create ecs service
    const ecsService = this.createECSGroup(props, imageTag, repository);

    // create lb to ecs
    this.createNlbEndpoint(props, ecsService, [80]);
  }
  private createECSGroup(
    props: AiChatECSProps,
    imageTag: string,
    repository: IRepository
  ) {
    const ecsClusterName = "CommonchatUiCluster";
    const cluster = new Cluster(this, ecsClusterName, {
      clusterName: "commonchat-ui-front",
      vpc: props.networkProps.vpc,
      enableFargateCapacityProviders: true,
    });

    const taskDefinition = new FargateTaskDefinition(
      this,
      "commonchatui_deploy",
      {
        cpu: 2048,
        memoryLimitMiB: 4096,
        runtimePlatform: {
          operatingSystemFamily: OperatingSystemFamily.LINUX,
          cpuArchitecture: CpuArchitecture.of("X86_64"),
        },
        family: "CommonchatuiDeployTask",
        taskRole: this.getTaskRole(this, "CommonchatuiDeployTaskRole"),
        executionRole: this.getExecutionTaskRole(
          this,
          "CommonchatuiDeployExecutionTaskRole"
        ),
      }
    );
    const portMappings = [
      {
        containerPort: 80,
        hostPort: 80,
        protocol: Protocol.TCP,
        appProtocol: AppProtocol.http,
        name: "app_port",
      },
      {
        containerPort: 443,
        hostPort: 443,
        protocol: Protocol.TCP,
        appProtocol: AppProtocol.http,
        name: "app_port_443",
      },
    ];
    const envConfig: any = {
      DEFAULT_REGION: props.region,
      BUCKET_NAME: props.bucketName,
      USER_POOL_ID: props.cognitoInfo.userPoolId,
      USER_POOL_CLIENT_ID: props.cognitoInfo.userPoolClientId,
    };
    if (props.accessKey && props.secretAccessKey) {
      envConfig.ACCESS_KEY = props.accessKey;
      envConfig.SECRET_ACCESS_KEY = props.secretAccessKey;
    }
    taskDefinition.addContainer("CommonchatuiContainer", {
      containerName: "commonchatui_container",
      image: ContainerImage.fromEcrRepository(repository, imageTag),
      essential: true,
      cpu: 2048,
      memoryLimitMiB: 4096,
      portMappings: portMappings,
      environment: envConfig,
      logging: LogDriver.awsLogs({
        streamPrefix: "commonchat_ui",
      }),
    });
    // Grant the ECS task role permission to call Bedrock
    taskDefinition.addToTaskRolePolicy(
      new PolicyStatement({
        effect: Effect.ALLOW,
        actions: [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream",
        ],
        resources: ["*"],
      })
    );

    return new FargateService(this, "CommonchatuiService", {
      serviceName: "commonchat-ui-service",
      cluster: cluster,
      taskDefinition: taskDefinition,
      desiredCount: 1,
      assignPublicIp: false,
      platformVersion: FargatePlatformVersion.LATEST,
      securityGroups: [this.securityGroup],
      vpcSubnets: { subnetType: SubnetType.PRIVATE_WITH_EGRESS },
      capacityProviderStrategies: [{ capacityProvider: "FARGATE", weight: 2 }],
      propagateTags: PropagatedTagSource.TASK_DEFINITION,
      maxHealthyPercent: 100,
      minHealthyPercent: 0,
    });
  }
  private getExecutionTaskRole(self: this, roleId: string): IRole {
    // throw new Error('Method not implemented.');
    return new Role(self, roleId, {
      assumedBy: new ServicePrincipal("ecs-tasks.amazonaws.com"),
    });
  }
  private getTaskRole(self: this, roleId: string): IRole {
    return new Role(self, roleId, {
      assumedBy: new ServicePrincipal("ecs-tasks.amazonaws.com"),
      managedPolicies: [
        ManagedPolicy.fromAwsManagedPolicyName(
          "service-role/AmazonECSTaskExecutionRolePolicy"
        ),
      ],
    });
  }

  // If you use your own certificate, choose an ALB instead
  private createNlbEndpoint(
    props: AiChatECSProps,
    ecsService: FargateService,
    servicePorts: Array<number>
  ) {
    const nlb = new NetworkLoadBalancer(this, "CommonchatUiLoadBalancer", {
      loadBalancerName: "commonchat-ui-service",
      internetFacing: true,
      crossZoneEnabled: false,
      vpc: props.networkProps.vpc,
      vpcSubnets: { subnetType: SubnetType.PUBLIC } as SubnetSelection,
    });
    servicePorts.forEach((itemPort) => {
      const listener = nlb.addListener(`CommonchatUiLBListener-${itemPort}`, {
        port: itemPort,
        protocol: LBProtocol.TCP_UDP,
      });
      const targetGroup = new NetworkTargetGroup(
        this,
        `CommonchatUiLBTargetGroup-${itemPort}`,
        {
          targetGroupName: "commonchat-ui-service-target",
          vpc: props.networkProps.vpc,
          port: itemPort,
          protocol: LBProtocol.TCP_UDP,
          targetType: TargetType.IP,
          healthCheck: {
            enabled: true,
            interval: Duration.seconds(180),
            healthyThresholdCount: 2,
            unhealthyThresholdCount: 2,
            port: itemPort.toString(),
            protocol: LBProtocol.TCP,
            timeout: Duration.seconds(10),
          },
        }
      );
      listener.addTargetGroups(
        `CommonChatUiLBTargetGroups-${itemPort}`,
        targetGroup
      );

      targetGroup.addTarget(
        // targetGroups
        ecsService.loadBalancerTarget({
          containerName: "commonchattui_container",
          containerPort: itemPort,
        })
      );
    });
    new CfnOutput(this, "FrontUiUrlDefault", {
      description: "Common chart ui url by default",
      value: `http://${nlb.loadBalancerDnsName}`, // alb
    }).overrideLogicalId("FrontUiUrlDefault");
  }
}
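
For completeness, a minimal sketch of wiring the construct above into a CDK app follows; the VPC, stack names, and Cognito values here are placeholders for your environment.

import { App, Stack } from "aws-cdk-lib";
import { Vpc } from "aws-cdk-lib/aws-ec2";
import { EcsStack } from "./ecs-stack"; // the construct defined above (path is a placeholder)

const app = new App();
const stack = new Stack(app, "CommonChatUiStack", {
  env: { region: "us-west-2", account: process.env.CDK_DEFAULT_ACCOUNT },
});

// A small VPC with public and private subnets for the NLB and Fargate tasks.
const vpc = new Vpc(stack, "CommonChatVpc", { maxAzs: 2 });

new EcsStack(stack, "CommonChatEcs", {
  networkProps: { vpc },
  region: stack.region,
  accountId: stack.account,
  cognitoInfo: {
    userPoolId: "<your user pool id>",
    userPoolClientId: "<your user pool client id>",
  },
  partition: "aws",
});

app.synth();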

Summary

In this post, we showed how to use the open-source project ChatGPT-Next-Web to connect to the Amazon Bedrock models in your AWS account (Claude 3 Sonnet and Claude 3 Haiku, both text and image), build a lightweight multimodal chatbot, and implement streaming output and deployment.

If you would like the complete code, please contact your Amazon Web Services support team.

Thank you for reading.


*The specific Amazon Web Services generative AI services mentioned above are only available in AWS Regions outside of China; Amazon Web Services China introduces them only to help you understand leading industry technologies and grow your business overseas.

References

AWS Bedrock: https://aws.amazon.com/cn/generative-ai

ChatGPT-Next-Web: https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web

Bedrock Limit: https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase

AWS Bedrock – Vercel AI SDK: https://sdk.vercel.ai/docs/guides/providers/aws-bedrock

AWS Bedrock API Reference: https://docs.aws.amazon.com/bedrock/latest/APIReference/welcome.html

Authors

王宇

Rapid prototyping solutions architect at Amazon Web Services, responsible for product research and delivery in the front-end space, covering mobile, front-end, and BFF-layer prototyping and delivery. He has led the interaction design and implementation of several large business systems in finance, retail and advertising, enterprise applications, big data, and AI.

Reviewers

苏品毓

Industry solutions architect at Amazon Web Services, focusing on AI/ML and fintech. Previously worked at Samsung, OrionStar, and Xueqiu, participating in key projects including smartphone voice assistants, smart speakers, financial pre-trained models, and financial knowledge graphs, with extensive hands-on experience in natural language processing.

延诤

Architect on the Amazon Web Services CTOO team. Previously held product and R&D leadership roles at Lenovo, Accenture, 58.com, and JD.com. Since joining Amazon Web Services, responsible for architecture consulting and design optimization for enterprise customers, and dedicated to applying, landing, and promoting generative AI technology with enterprise customers in China and globally.