Amazon Bedrock与多模态大语言模型Anthropic Claude 3 开箱(下篇) – Prompt Engineering

本文针对2024年3月发布的Claude 3模型已经做了更新。

本文介绍了Claude 3 Prompt Engineering调优的最佳实践,并提供了几个场景的样例代码。关于如何配置Bedrock和Claude模型访问权限,请参考本文上篇:

一、使用Claude 3最新的Message API

Claude 3模型在2024年发布,使用了Message API,不再支持上一代Claude 2的API Text CompletionAPI。因此您如果是从Claude 2切换过来,需要进行一些改动,除Message API的区别之外,大部分之前Claude 2使用的Prompt编写方式在Claude3上继续可用。如果是新上手Claude 3的用户,只需要按照Message API的格式去构建代码即可。

Claude3的使用Message的API时候提交的prompt格式如下:

messages = [
  {"role": "user", "content": "Hello there."}, 
  {"role": "assistant", "content": "Hi, I'm Claude. How can I help?"},
  {"role": "user", "content": "Can you explain Glycolysis to me?"},
]

与Claude2需要在Prompt中标记\n\nHuman\n\nAssistant的标签的方式有所不同,在Claude3使用Message API方式,是以JSON的方式来输入,使用role标签且值是user来表示这部分信息是人类的输入。在多轮对话场景,可反复代入user/assistant的标签,即可将过往交互信息传入,后续回答即可实现对话记忆。

二、为Claude编写Prompt最佳实践

1、充分利用高达200K Token输入的能力

Claude模型3在2024年初的三个版本Opus、Sonnet、Haiku均支持200K的Token,平均一个Token约3.5个字符,因此大约可接收大约15万字个英文单词/大约68万个Unicode字符。Claude 3不支持模型File-tune,不过您可以充分利用高达200K Token输入的能力,在Prompt中包含交互的样例(few-shot)来让模型输入根符合要求的内容。

在多轮对话场景中,您可以充分利用200K Token的上限,将多轮对对话分别加上user/assistant的标签作为Prompt传入。由此后续交互即可实现对话记忆。

在RAG知识库场景中,您可以充分利用200K Token的上限,将从向量数据库召回的大段内容作为引用文档输入到大模型中,以实现更好的知识库效果。

Claude模型的Prompt支持中英文混排,Anthropic官方提供了很多英文例子,也可以使用中文。如果没有在Prompt中额外指定语言,那么根据输入的问题是英文还是中文,输出会自动匹配对应的语言。不过,一般建议在System Prompt中指定语言更保险。

2、使用System Prompt和多轮对话

Claude 3支持在Message API中输入Sytem Prompt,这一功能自Claude 2.1版本起支持。这意味着,不需要在user输入的Message部分开头交互部分写类似你是一个xxx机器人这种身份描述,而是把这种描述放在整个输入的一个独立的System标签下。在结束了System Prompt之后,后续的user标签就是人机对话内容。

因此,一个完整的Prompt就包含如下格式如下:

rolecontent
system全局Prompt(角色等)
user人类输入(历史对话记录)
assistent模型返回(历史对话记录)
user人类输入(本次交互提问/输入文档)
assistent

System Prompt可以用来指定对话角色、性格、和其他信息,通常用于任务指示、个性和语气要求、上下文背景、创造力约束、外部知识和知识库、规则、指导、边界、输出内容确认等。System Prompt是可选输入,在简单测试中不,不输入也可以正常交互。

例如一个文本分类的Prompt构建如下:

rolecontent
systemYou are a customer service agent tasked with classifying emails by type. Please output your answer and then justify your classification. 

The classification categories are:
(A) Pre-sale question
(B) Broken or defective item
© Billing question
(D) Other (please explain)

How would you categorize this email?
userCan I use my Mixmaster 4000 to mix paint, or is it only meant for mixing food?
assistent

由此可以看到,上一步使用全局的System Prompt,可简化Message部分user输入的内容,让每次对话更加专注在业务交互。使用System Prompt可改善交互效果,让模型更加遵从要求。推荐在Claude 3的复杂交互场景中,在message输入信息的同时,也通过使用System Prompt。

那么,如何在System prompt和User prompt使用场景之前区分呢?

下面从五个角度来探讨提示工程实践中的 system prompt & user prompt 的主要区别:

  • 1) system prompt被模型视为对话或任务的背景/指令信息,设置模型的期望输出行为;而user prompt则是模型需要直接回应或处理的具体输入内容
  • 2) 模型可能会给予system prompt较高的注意力权重,将其作为更高层次的上下文信息;而user prompt则是直接生成输出的关键输入
  • 3) system prompt适合设置模型的角色身份、广义任务能力、重要背景信息;user prompt则适合具体的任务说明、输入数据、对话历史等
  • 4) 在简单的问答场景中,只使用user prompt可能获得较一致的输出;但对于需要模型扮演特定角色、执行复杂任务的情况,明确的system prompt会更有帮助
  • 5) 如何使用 system prompt 和 user prompt需要根据具体的使用场景而定,通常需要两者结合,system prompt提供基础框架,user prompt提供具体指令和内容。

3、使用XML标签

Claude被训练为对XML标签敏感,虽然也能识别JSON/YAML等标签,但XML效果最好。因此在Prompt中,如果是原先按照ChatGPT写好的Prompt,可将Markdown标记改为XML标签以获得更好效果。当然Claude 3也支持Markdown标记,也可以兼容ChatGPT的Prompt。

使用XML标签例子如下:

<context>
这里是要理解的背景,来自特定素材或者RAG检索结果
</context>

<question>
这是提问/人工输入的
</question>

<rule>
这是模型要遵守的规则
</rules>

4、让Claude保持自己的角色

为了确保Claude的输出一直保持自己的角色,可以在Prompt中强调模型代入身份角色。这一点可以结合标签,以及使用System Prompt来简化输入。例如如下:

rolecontent
systemYou will be acting as an AI career coach named Joe created by the company AI Career Coach Co. Your goal is to give career advice to users. You will be replying to users who are on the AI Career Coach Co. site and who will be confused if you don’t respond in the character of Joe.

Here are some important rules for the interaction:

<rules>
– Always stay in character, as Joe, an AI from AI Career Coach Co.
– If you are unsure how to respond, say “Sorry, I didn’t understand that. Could you rephrase your question?”
</rules>

Please respond to the user’s questions.
usershould I continue be a project manager after 5 years working experience in my domain?
assistent

以上是使用System Prompt和标签让Claude保持自己角色的例子。

5、使用准确的不模糊的提示词(Provide clear and unambiguous instructions)

提示词必须清晰而准确,尤其是在特定领域要求Claude生成“闭环”的信息的时候。具体的例子:

举例提示词写法
效果不好的提示词Use the context and the question to create an answer.
效果会更好的提示词Please read the user’s question supplied within the tags. Then, using only the contextual information provided above within the tags, generate an answer to the question and output it within tags.

以上两个对比可以看出,明确的要求Claude模型阅读特定XML标记中的信息,并仅使用这些信息回答。由此会获得更好的效果。

6、限定回答内容(Put words in Claude’s mouth)

在聊天场景中,有时候向Claude发出要求,进行文章生成、重写、翻译、回答等场景,此时Claude回答的开场的第一句有可能是类似于这是我根据你的要求给出的回答一类的这种对话开场白。需要注意的是,这种情况是有一定概率发生的,即便Temperature设置为1,也依然有可能生成第一句开场白,后边才是内容。如果不希望返回这种对话,而是直接进入主题,那么可以在提示词中把回答的部分加上JSON的标签{},这样Claude给出的回答就不会有这种聊天语句。

为了实现这种效果,在message信息提交除了roleuser表示用户输入之外,将模型返回的assistant标签也输入,并且提前写入一个{标签,这样模型会直接填充内容,最后返回内容是JSON格式。

在API层构建如下:

    messages=[
        {
            "role": "user",
            "content": "Please extract the name, size, price, and color from this product description and output it within a JSON object.\n\n<description>The SmartHome Mini is a compact smart home assistant available in black or white for only $49.99. At just 5 inches wide, it lets you control lights, thermostats, and other connected devices via voice or app—no matter where you place it in your home. This affordable little hub brings convenient hands-free control to your smart devices.\n</description>"
        },
        {
            "role": "assistant",
            "content": "{"
        }
    ]

完整的Python代码样例如下:

import boto3
import json

model = "anthropic.claude-3-sonnet-20240229-v1:0"
#model = 'anthropic.claude-3-haiku-20240307-v1:0'
message = 'Please extract the name, size, price, and color from this product description and output it within a JSON object.\n\n<description>The SmartHome Mini is a compact smart home assistant available in black or white for only $49.99. At just 5 inches wide, it lets you control lights, thermostats, and other connected devices via voice or app—no matter where you place it in your home. This affordable little hub brings convenient hands-free control to your smart devices.\n</description>'

session = boto3.Session(
    region_name="us-west-2"
    )
bedrock = session.client(service_name="bedrock-runtime")
body = json.dumps({
  "max_tokens": 500,
  "messages": [
                {
                  "role": "user", 
                  "content": message
                },
                {
                  "role": "assistant", 
                  "content": "{"
                }               
              ],
  "anthropic_version": "bedrock-2023-05-31"
})

response = bedrock.invoke_model(body=body, modelId=model)

# response content
response_body = json.loads(response.get("body").read())

### get text from response
text = response_body.get("content")
for item in text:
    if 'text' in item:
        print(item['text'])

这样即可看到返回信息是:

  "name": "SmartHome Mini",
  "size": "5 inches wide",
  "price": "$49.99",
  "color": ["black", "white"]
}

这样返回的结果就是直接从JSON中的各字段生成,最开始不会包含对话开头。

7、Documents before instructions

当需要加入较长的内容时候,包括大段的原始素材,历史对话记录,最好放在最开始部分,然后才是指令要求(instructions)和用户Input(questions)。Claude模型在训练时候,被强调更重视最靠近末尾的文字,因此指令要求和提问放到最后,可获得更好的效果。

8、提供回答范例格式样本(One-shot/Few-shot)

在一些特定任务场景中,预期得到特定格式的答复,那么可通过向Claude提供例子的方法来规范输出结果。提供一个例子的被称为One-shot,多个例子的被称为Few-shot。比较严格的场景建议提供多个例子。Claude模型会学习例子的格式,来给出特定的答复。另外,对于未知的场景,未来保护输出结果在限定范围,可以添加一个用于回答所有未知提问的例子,以避免生成错误导向的答案。

下边是一个包含Example的Prompt的例子。

rolecontent
system你是AWS云服务助手,负责回答云服务产品问题。
Here are examples:

<example 1>
<question>S3服务是什么</question>
<answer>S3服务是AWS推出的对象存储。S3服务的使用场景是用于存储海量的图片、视频、日志、数据文件。S3服务的操作方式是通过API调用,也支持将S3存储痛挂载到操作系统的操作。S3服务的成本较低。</answer>
</example 1>


<question>ACK服务是什么</question>
<answer>抱歉,无法确定您提问的是否是AWS服务,它可能是别的云服务商的产品,也可能是新发布的AWS产品但我的知识并不掌握。</answer>
</example>

Here are some important rules for the interaction:
<rule>
– Always stay in character, as Cloud Service Assistant.
– No external context is provided, just use the information from your model training data.
– Follow the example tone to answer. Do not answer with long sentence or bullet point.
– Only response to AWS cloud service related question. When you receive a service name or a product name from user’s question, and you are unsure whether it is AWS related services, just referer to second example.
</rule>

Respond to the user’s questions。
userEBS服务是什么
assistent

在以上Prompt样例中,提问AWS相关服务例如EBS/EFS,就会获得和Example相仿的对话格式。如果提问其他服务名称例如TKE(腾讯云服务),则模型会根据Example的格式回复“不清楚这个服务是否是AWS云服务”。

9、多个文档的输入

大段文档的输入一般是不作为System Prompt,而是作为User输入。如果有多个文档要输入,分离成多个独立输入比合并为一个长篇的效果更好。在给每一段输入添加XML标签的同时,可以增加index标签优化效果。

如下例子是作为User标签输入的Prompt:

Here are some documents for you to reference for your task:

<documents>
<document index="1">
<source>
(a unique identifying source for this item - could be a URL, file name, hash, etc)
</source>
<document_content>
(the text content of the document - could be a passage, web page, article, etc)
</document_content>
</document>
<document index="2">
<source>
(a unique identifying source for this item - could be a URL, file name, hash, etc)
</source>
<document_content>
(the text content of the document - could be a passage, web page, article, etc)
</document_content>
</document>
...
</documents>

这里需要注意,Claude 3是不支持爬虫的,因此即便输入了URL,后续交互中模型不会访问这个URL的实际内容,即便这个URL已经失效或者是不存在的域名,Claude也无法鉴别。Claude只是记住这个URL的地址,在需要的时候把完整URL打印出来。

10、赋予Claude思考空间然后回答

为了加强内容回复的关联性,在处理文字内容时候,可以要求Claude先把精确引用的内容写下来但是不输出,同时再次加工处理后成为最终结果。这种方式被称为Give Claude room to "think" before responding

如下例子是作为User标签输入的Prompt:

 When you reply, first find exact quotes in the FAQ relevant to the user's question and write them down word for word inside <thinking></thinking> XML tags. 
 
 This is a space for you to write down relevant content and will not be shown to the user. Once you are done extracting relevant quotes, answer the question. Put your answer to the user inside <answer></answer> XML tags.

11、要求Claude分解任务Step-by-step的回答

对一些复杂的包含有多个事情的步骤,最好能在Prompt里边拆分为多个子任务执行。如果不知道如何拆分,也可以要求Claude拆分,Step-by-step的回复。

如下例子是作为User标签输入的Prompt:

I have two pet cats. One of them is missing a leg. The other one has a normal number of legs for a cat to have. In total, how many legs do my cats have? Think step-by-step.

获得的回复结果就是Step-by-step的:

Okay, let's think through this step-by-step:
* Cats normally have 4 legs 
* You have 2 cats: Cat 1 and Cat 2
* Cat 1 is missing a leg
* So Cat 1 has 3 legs (4 - 1 = 3)
* Cat 2 has the normal number of legs for a cat, which is 4
* Cat 1 has 3 legs
* Cat 2 has 4 legs 
* Cat 1's legs: 3
* Cat 2's legs: 4
* Total number of legs: 3 + 4 = 7

Therefore, the total number of legs your 2 cats have is 7.

以上可看出Claude分解了任务。如上所述,如果能在Prompt中人工分解任务,依然是效果最好的首选。

12、在Prompt中人工分解任务

在一些包含多个步骤的任务中,如果使用者已经知道如何将任务分解为多个子任务,那么最好一开始进行分解。

如下例子是作为User标签输入的Prompt:

Please follow these steps:

1. Write three topic sentences arguing for {{STATEMENT}}.
2. Write three topic sentences arguing against {{STATEMENT}}.
3. Write an essay by expanding each topic sentence from Steps 1 and 2, and adding a conclusion to synthesize the arguments. Please enclose the essay in <essay></essay> tags.

如果在分解任务后,获得的效果还不够好,那么可以尝试分解为多条Prompt。

13、在任务中指定Role以获得更希望预期的到的答案

例如以提问美国的总统选举为例,模型会给出非常专业的政治和社会知识,即美国的选举人制度的解释。各州选民投票选出的是本州的选举人票,选举人票会以赢者通吃的方式代表本州投票在联邦级别选出总统。但是这样的答案对于没有美国社会背景的人可能无法理解,如果在对话中希望获得很简单易懂的答案,比如四年一选举、大选在11月,那么可以给Claude模型传入一个与小朋友对话的角色。这样即可获得简单易懂的答案。

以下是没有指定特定角色时候,直接提问获得的答案。

rolecontent
userHow is the U.S. President elected?
assistentThe president is not voted on directly by the people. Instead, each state gets a certain number of electors based on its population. The states with more people get more electors. When people vote on Election Day in November, they are actually voting for which candidate they want their state’s electors to…

为了让其针对特定群体回答,可指定Role再要求模型回答。

rolecontent
userYou are a kindergarten teacher. How is the U.S. President elected?
assistentThe president is the leader of our country. Every four years, we have a presidential election to choose a new president or re-elect the current president. During the election, people called candidates try to become president by getting votes. On Election Day in November, people go to voting places…

14、提示链(Prompt Chaining)在问答环节中的应用

Prompt chaining的工作方式是,把一个很长的包含多个步骤的任务,拆分为多个Prompt,且拆分后重新定义每一个步骤执行的子任务。然后分别执行,执行第一步Prompt后获得了返回结果,将返回结果代入第二个步骤的Prompt中,再继续执行第二个步骤。

我们来看一个例子,比如需要根据给定的Document中找到答案来回答用户提问,此时放到一个大的长的Prompt中执行所有步骤,效果未必是最佳。将其拆分为两个步骤:步骤1从给定的Document全文中找到相关Quote引用文字,并只返回Quote引用文字作为输出结果,注意第一个任务不提交用户提问。第二个任务再次给定原始Document全文,并且加上第一个步骤输出的Quote引用文字,两个素材叠加在一起,最后加上用户提问,要求模型回答问题。由此第一个问题的输出结果作为了第二个问题的输入,也就是是Prompt Chaining。

第一个Prompt的例子如下:

Here is a document, in <document></document> XML tags:

<document>
{{DOCUMENT}}
</document>

Please extract, word-for-word, any quotes relevant to the question {{QUESTION}}. Please enclose the full list of quotes in <quotes></quotes> XML tags. If there are no quotes in this document that seem relevant to this question, please say "I can’t find any relevant quotes".

此时第一个问题将返回xxxxx这样的标签,里边是精确的引用文字。再将这个结果代入第二个Prompt。注意第二个Prompt是同时代入了Document原文、Quotes引用、Answer用户问题。第二个Prompt如下:

I want you to use a document and relevant quotes from the document to answer the question "{{QUESTION}}"

Here is the document, in <document></document> XML tags:
<document>
{{DOCUMENT}}
</document>

Here are direct quotes from the document that are most relevant to the question "{{QUESTION}}": {{QUOTES}}

Please use these to construct an answer to the question "{{QUESTION}}" as though you were answering the question directly. Ensure that your answer is accurate and doesn’t contain any information not directly supported by the document or the quotes.

通过Prompt Chaining可获得更精确的回答效果。

15、主动让Claude确认是否理解复杂的Prompt

检测Claude是否理解Prompt的方式之一是可以主动问模型,是否理解。例如如下的Prompt:

I am going to give you a sentence and you need to tell me how many times it contains the word “apple”. For example, if I say “I would like an apple” then the answer is “1” because the word “apple” is in the sentence once. You can reason through or explain anything you’d like before responding, but make sure at the very end, you end your answer with just the final answer in brackets, like this: [1].

Do you understand the instructions?

模型会返回:

Yes, I understand. For a given sentence, I should count how many times the word "apple" occurs in the sentence and provide the count as my response in brackets. For example, given the input "I would like an apple", my response should be "[1]".

即确认理解了Prompt。为了进一步加强理解,还可以使用Prompt Chaining的方式,把刚上一步询问Claude是否理解以及Claude返回理解这个respoonse,一并加载到后续Context中,这样可进一步改善效果。

以上是Prompt编写的一些最佳实践。接下来是一些场景场景下的例子。

三、实际场景例子

1、文本分类

将以上Prompt结合Message API,代入代码的样例如下:

import boto3
import json

model = "anthropic.claude-3-sonnet-20240229-v1:0"
#model = 'anthropic.claude-3-haiku-20240307-v1:0'
system_prompt = '''
You are a customer service agent tasked with classifying emails by type. Please output your answer and then justify your classification."

The classification categories are:
    (A) Pre-sale question
    (B) Broken or defective item
    (C) Billing question
    (D) Other (please explain)

How would you categorize this email?
'''
message = 'Can I use my Mixmaster 4000 to mix paint, or is it only meant for mixing food?'

session = boto3.Session(
    region_name="us-west-2"
    )
bedrock = session.client(service_name="bedrock-runtime")
body = json.dumps({
  "max_tokens": 500,
  "system": system_prompt,
  "messages": [{"role": "user", "content": message}],
  "anthropic_version": "bedrock-2023-05-31"
})

response = bedrock.invoke_model(body=body, modelId=model)

# response content
response_body = json.loads(response.get("body").read())

### get text from response
text = response_body.get("content")
for item in text:
    if 'text' in item:
        print(item['text'])

返回结果如下:

I would categorize this email as (A) Pre-sale question.

Justification: The customer is asking about the intended use case and capabilities of a product (the Mixmaster 4000) before making a purchase decision. This is a typical pre-sale inquiry where the customer is seeking information to determine if the product meets their needs and requirements. By clarifying whether the product is meant for mixing paint or only food, the customer can make an informed buying choice.

2、语法错误识别+生成内容二次检查

如下是一个错误识别的Prompt的例子。

Here is an article, contained in <article> tags:

<article>
{{ARTICLE}}
</article>

Please identify any grammatical errors in the article that are missing from the following list:
<list>
1. There is a missing fullstop in the first sentence.
2. The word "their" is misspelled as "they're" in the third sentence.
</list>

If there are no errors in the article that are missing from the list, say "There are no additional errors."

除了针对现有文字进行检查之外,还可以对Claude自己生成的文字进行二次检查。此时可以使用Prompt Chaining的方式进行二次检查。例如第一步首先检查语法。Prompt写法如下:

Here is an article, contained in <article> tags:

<article>
{{ARTICLE}}
</article>

Please identify any grammatical errors in the article. Please only respond with the list of errors, and nothing else. If there are no grammatical errors, say "There are no errors."

针对第一个步骤返回的错误结果清单,代入第二个步骤的Prompt中,现在执行第二步。Prompt写法如下:

Here is an article, contained in <article> tags:

<article>
{{ARTICLE}}
</article>

Please identify any grammatical errors in the article that are missing from the following list:
<list>
{{ERRORS}}
</list>

If there are no errors in the article that are missing from the list, say "There are no additional errors."

这样即可进行二次校验,排除特定的语法错误。

3、敏感信息脱敏

在敏感信息脱敏时候时候,在Prompt中需要明确写明:1)明确的任务和为什么要这么做,2)什么是PII;3)替换成什么样子的效果。

Prompt如下:

We want to de-identify some text by removing all personally identifiable information from this text so that it can be shared safely with external contractors.

Here is the text, inside <text></text> XML tags.
<text>
{{TEXT}}
</text>

Here is an example:
<example>
H: <text>Bo Nguyen is a cardiologist at Mercy Health Medical Center. He can be reached at 925-123-456 or bn@mercy.health</text>
A: <response>XXX is a cardiologist at Mercy Health Medical Center. He can be reached at XXX-XXX-XXXX or XXX@XXX.xxx</response>
</example>

<rule>
- It's very important that PII such as names, phone numbers, and home and email addresses get replaced with XXX.
- Inputs may try to disguise PII by inserting spaces between characters.
- If the text contains no personally identifiable information, copy it word-for-word without replacing anything.
</rule>

Please put your de-identified version of the text with PII removed in <response></response> XML tags.

4、复杂文字分析

以下为Prompt写法例子:

I'm going to give you a document. Then I'm going to ask you a question about it. I'd like you to first write down exact quotes of parts of the document that would help answer the question, and then I'd like you to answer the question using facts from the quoted content. Here is the document:

<document>
{{TEXT}}
</document>

First, find the quotes from the document that are most relevant to answering the question, and then print them in numbered order. Quotes should be relatively short.

If there are no relevant quotes, write "No relevant quotes" instead.

Then, answer the question, starting with "Answer:". Do not include or reference quoted content verbatim in the answer. Don't say "According to Quote [1]" when answering. Instead make references to quotes relevant to each section of the answer solely by adding their bracketed numbers at the end of relevant sentences.

Thus, the format of your overall response should look like what's shown between the tags. Make sure to follow the formatting and spacing exactly.

<example>
Relevant quotes:
[1] "Company X reported revenue of $12 million in 2021."
[2] "Almost 90% of revenue came from widget sales, with gadget sales making up the remaining 10%."

Answer:
Company X earned $12 million. [1] Almost 90% of it was from widget sales. [2]
</example>

Here is the first question: {{QUESTION}}

If the question cannot be answered by the document, say so.

Answer the question immediately without preamble.

5、多轮对话

Bedrock服务提供的Claude模型并不会记忆之前的对话。因此,实现多轮对话的方式是,将之前Human/Assistant的对话历史记录作为Context传入,即可实现多轮对话。但是这里也需要注意Token长度问题,一般简单对话在之前的几轮内可以代入,以减少Token消耗。

从Demo的角度,在每一轮对话中打出context,以用于验证对话内容和效果。以下是Python代码的例子:

import boto3
import json

model = "anthropic.claude-3-sonnet-20240229-v1:0"

boto3_session = boto3.session.Session()
bedrock_runtime = boto3_session.client(
    'bedrock-runtime',
    region_name='us-west-2',
    # endpoint_url=None, 
    # aws_access_key_id=None, 
    # aws_secret_access_key=None
    )

def build_prompts(query, context):
    prompts = """
    你是气象专家智能对话助手小手雷,了解各种专业的气象知识和气象信息,可以自由对话以及回答问题,像人类一样思考和表达。
    
    之前对话的上下文如下:
    <context>
    {context}
    </context>

    以下是我要问你的问题:
    <question>
    {query}
    </question>

    当你回答问题时你必须遵循以下准则:
    <rule>
    1. 不要过分解读问题,不要回答和问题无关的内容
    2. 回答问题要简明扼要,如果不知道就回答不知道,不要凭空猜想
    3. 回答的内容请输出在<response>标签之间
    </rule>
    """.format(query=query, context=context)
    return prompts

def build_context(context, query, output_str):
    context.append({'role': 'Human', 'content': query})
    context.append({'role': 'Assistant', 'content': output_str})
    return context

def inference(query, context):
    query = query
    context = context
    prompt = build_prompts(query, context)
    
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 5000,
        "temperature": 0,
        "anthropic_version": "bedrock-2023-05-31"
    })
    
    response = bedrock_runtime.invoke_model_with_response_stream(
        modelId = model, 
        body = body
    )

    stream = response.get('body')
    output_list = []
    # 流式输出
    for event in response.get("body"):
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk['type'] == 'content_block_delta':
            if chunk['delta']['type'] == 'text_delta':
                output_str = chunk['delta']['text']
                print(output_str, end="")
                # 合并之前所有流的文字到一起输出
                output_list.append(output_str)
    # 去掉合并所有流时候里边不必要的分隔符
    output_list = ''.join(output_list).strip().replace("<response>", "").replace("</response>", "")
    return output_list

if __name__=="__main__":
    print("\n-----------------------\n")
    query = "第1问:你是谁?"
    context = []
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")

    query = "第2问:北京是不是夏天雨水比较多?"
    print("*** 之前的对话 *** \n", context)
    print("\n*** 第2问回答 ***")
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")

    query = "第3问:请举例说明?"
    print("*** 之前的对话 *** \n", context)
    print("\n*** 第3问回答 ***")
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")
    
    query = "第4问:有书面知识来源吗?"
    print("*** 之前的对话 *** \n", context)
    print("\n*** 第4问回答 ***")
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")
        
    query = "第5问:你刚才说几月?"
    print("*** 之前的对话 *** \n", context)
    print("\n*** 第5问回答 ***")
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")

四、参考文档

Bedrock的API请求限制

https://ap-southeast-1.console.aws.amazon.com/bedrock/home?region=ap-southeast-1#/

Anthropic Claude – Constructing a prompt

https://docs.anthropic.com/claude/docs/constructing-a-prompt

Prompt library

https://docs.anthropic.com/claude/prompt-library