Bedrock及最接近GPT的大语言模型Claude开箱 (下篇) —— Prompt Engineering 调优技巧

本文介绍了Prompt Engineering调优的最佳实践,以及几个简单场景样例代码。之前关于如何配置Bedrock和Claude模型的,请参考本文上篇:

一、为Claude编写Prompt最佳实践

Claude模型2.1版本支持200K的Token,大约可接收大约15万字个英文单词/大约68万个Unicode字符。

The maximum prompt length that Claude can see is its context window. For all models except Claude 2.1, Claude's context window is currently ~75,000 words / ~100,000 tokens / ~340,000 Unicode characters. Claude 2.1 has double the context length, at ~150,000 words / ~200,000 tokens / ~680,000 Unicode characters

Claude模型的Prompt支持中英文混排,能用全英文编写最好,也可以使用中文。如果没有在Prompt中额外指定语言,那么根据输入的问题是英文还是中文,输出会自动匹配对应的语言。

针对已经有编写GPT的Prompt经验的开发者,或者是从零开始学习Claude的开发者,可以参考如下Claude模型Prompt最佳实践。

1、使用Human/Assistant的标签显式指定交互内容(Adopt the Human/Assistant formatting)

在Claude的Prompt构建中,需要显式的指定交互内容,人类的提问放在Human:,预期Claude给出的回答放在Assistant:后。这两个标签不可缺失。例如:

\n\nHuman: You are an AI chatbot assistant that helps customers answer prompting questions. When I write BEGIN DIALOGUE, all text that comes afterward will be that of a user interacting with you, asking for prompting help.

Here are the rules you must follow during the conversation:
<rules>
{{RULES}}
</rules>

BEGIN DIALOGUE
<question>
How do I format a system prompt for Claude?
</question>

\n\nAssistant:

在这段Prompt的开头的Human部分指定了Bot扮演的角色,并使用XML格式的标签,分割了指令和要求。如果要是通过API程序提交,则需要在Human之前和Assistant之前都添加\n\n的字符。这一规则适合Claude V2和Claude Instant等模型。

2、在Claude 2.1版本中使用System Prompt

Claude模型2.1版本新推出了System Prompt的功能,也就是可以写全局Prompt,然后把Human:部分就可以放到下边并且内容可以简写。因此上述Prompt写法可进一步优化为如下:

You are an AI chatbot assistant that helps customers answer prompting questions. 

Here are the rules you must follow during the conversation:
<rules>
{{RULES}}
</rules>

\n\nHuman: How do I format a system prompt for Claude?

\n\nAssistant:

由此可以看到,上一步使用全局的System Prompt,可简化Human部分的交互内容的处理方式。不过暂时只有Claude V2.1版本支持,对于其他版本包括Claude Instant还需要按照之前的写法。

3、使用XML标签

Claude被训练为对XML标签敏感,虽然也能识别JSON/YAML等标签,但XML效果最好。因此在Prompt中,尤其是原先按照ChatGPT写好的Prompt,需要将Markdown标记改为XML标签。例如如下:

<context>
这里是要理解的背景,来自特定素材或者RAG检索结果
</context>

<question>
这是提问/人工输入的
</question>

<rule>
这是模型要遵守的规则
</rules>

4、使用准确的不模糊的提示词(Provide clear and unambiguous instructions)

提示词必须清晰而准确,尤其是在特定领域要求Claude生成“闭环”的信息的时候。具体的例子:

举例提示词写法
效果不好的提示词Use the context and the question to create an answer.
效果会更好的提示词Please read the user’s question supplied within the tags. Then, using only the contextual information provided above within the tags, generate an answer to the question and output it within tags.

以上两个对比可以看出,明确的要求Claude模型阅读特定XML标记中的信息,并仅使用这些信息回答。由此会获得更好的效果。

5、限定只回答内容不进行聊天(Put words in Claude’s mouth)

在聊天场景中,有时候向Claude发出要求,进行文章生成、重写、翻译、回答等场景,此时Claude回答的开场的第一句有可能是类似于这是我根据你的要求给出的回答一类的这种聊天对话开场白。需要注意的是,这种情况是有一定概率发生的,即便Temperature设置为1,也依然有可能生成第一句开场白,后边才是内容。

如果不希望返回这种对话,而是直接进入主题,那么可以在提示词中把回答的部分XML标签加上,这样Claude给出的回答就不会有这种聊天语句。

例如:

\n\nHuman: I'd like you to rewrite the following paragraph using the following instructions: 

<instructions>
Translate from English into Chinese.
Less than 200 words.
</instructions>

Here is the paragraph:
<text>
Jan 25 (Reuters) - Amazon.com's AWS said on Thursday it plans to invest $10 billion to build two data center complexes in Mississippi, its latest capacity expansion amid growing demand for cloud services as more firms adopt new artificial intelligence technologies.Businesses are doubling down on AI development and majority of that traffic is expected to be managed on the cloud infrastructure from Amazon Web Services (AWS), Google Cloud (GOOGL.O), opens new tab and Microsoft's Azure (MSFT.O), opens new tab — the top three vendors globally. AWS' Mississippi expansion plan comes days after it announced a more than $15 billion investment in Japan, and Google's move to set up a data center just outside of London for $1 billion. In coordination with the Madison County Economic Development Authority (MCEDA), AWS will establish multiple data center units in two Madison County industrial parks, which is projected to create at least 1,000 new jobs in the state, Amazon said in a blog. Amazon said it has invested $2.3 billion in the state since 2010 to build its infrastructure including five fulfillment and sortation centers.
</text>

Please output your rewrite within <rewrite></rewrite> tags.

\n\nAssistant: <rewrite>

在以上的Prompt中,在Assistant:后显式的添加了<rewrite>标签,此时Claude模型返回的就不会再是聊天的开场,而是直接回答内容,另外会自动补全另外半个</rewrite>标签。

6、Keeping Claude in character

为了确保Claude的输出一直保持自己的角色,可以在Assistant:开场部分,再次强调自己的角色。结合上一点在输出部分强制添加半个XML标签来强制输出这种,这两点结合起来更好用。例如如下:

\n\nHuman: You will be acting as an AI career coach named Joe created by the company AI Career Coach Co. Your goal is to give career advice to users. You will be replying to users who are on the AI Career Coach Co. site and who will be confused if you don't respond in the character of Joe.

Here are some important rules for the interaction:
<rules>
- Always stay in character, as Joe, an AI from AI Career Coach Co.
- If you are unsure how to respond, say "Sorry, I didn't understand that. Could you rephrase your question?"
</rules>

Here is the user's question:
<question>
should I continue be a project manager after 5 years working experience in my domain?
</question>

Please respond to the user’s questions within <response></response> tags.

\n\nAssistant: [Joe from AI Career Coach Co.] <response>

由此可以看出,在Assistant:开场部分,再次强调Claude回复的身份,让Claude一直代入自己的角色,可获得更好的效果。

7、Documents before instructions

Claude V2.1版本支持200K的Token,因此在对话场景,可以将之前会话内容放在Context中,以获得个更好的交流效果。当需要加入较长的内容时候,包括大段的原始素材,历史对话记录,最好放在最开始部分,然后才是指令要求(instructions)和用户Input(questions)。

Claude模型在训练时候,被强调更重视最靠近末尾的文字,因此把指令要求和提问放到最后,可获得更好的效果。

8、提供回答范例格式样本

在一些特定任务场景中,预期得到特定格式的答复,那么可通过向Claude提供例子的方法来规范输出结果。比较严格的场景建议提供多个例子。Claude模型会学习例子的格式,来给出特定的答复。另外,对于未知的场景,未来保护输出结果在限定范围,可以添加一个用于回答所有未知提问的例子,以避免生成错误导向的答案。

下边是一个包含Example的Prompt的例子。

\n\nHuman: 你是AWS云服务助手,负责回答云服务产品问题。

Here are examples:

<example>
<question>S3服务是什么</question>
<answer>S3服务是AWS推出的对象存储。S3服务的使用场景是用于存储海量的图片、视频、日志、数据文件。S3服务的操作方式是通过API调用,也支持将S3存储痛挂载到操作系统的操作。S3服务的成本较低。
</answer>
</example>

<example>
<question>ACK服务是什么</question>
<answer>抱歉,无法确定您提问的是否是AWS服务,它可能是别的云服务商的产品,也可能是新发布的AWS产品但我的知识并不掌握。</answer>
</example>

Here is the user's question:
<question>
EBS服务是什么
</question>

Here are some important rules for the interaction:
<rule>
- Always stay in character, as Cloud Service Assistant.
- No external context is provided, just use the information from your model training data.
- Follow the example tone to answer. Do not answer with long sentence or bullet point.
- Only response to AWS cloud service related question. When you receive a service name or a product name from user's question, and you are unsure whether it is AWS related services, just referer to second example.
</rule>
    
Respond to the user’s questions within <response></response> tags.

\n\nAssistant: [AWS云服务助手] <response>

在以上Prompt样例中,提问AWS相关服务例如EBS/EFS,就会获得和Example差不多的答复格式。如果提问其他服务名称例如TKE(腾讯云服务),则模型会根据Example的格式回复“不清楚这个服务是否是AWS云服务”。

由此即可通过Example来提高生成效果。

9、多个文档的输入

如果有多个文档要输入,分离成多个独立输入比合并为一个长篇的效果更好。在给每一段输入添加XML标签的同时,可以增加index标签优化效果。同时,针对Claude V2.1版本,可不放在Human部分输入,而是在最使用System Prompt方式输入。

样例如下:

Here are some documents for you to reference for your task:

<documents>
<document index="1">
<source>
(a unique identifying source for this item - could be a URL, file name, hash, etc)
</source>
<document_content>
(the text content of the document - could be a passage, web page, article, etc)
</document_content>
</document>
<document index="2">
<source>
(a unique identifying source for this item - could be a URL, file name, hash, etc)
</source>
<document_content>
(the text content of the document - could be a passage, web page, article, etc)
</document_content>
</document>
...
</documents>

You are Larry, a sarcastic, no-nonsense auto mechanic with deep experience in fixing cars. Larry has little patience for nonsense or ignorance about cars. Larry writes in a sardonic tone. Larry uses sixties slang. You only answer questions about cars and nothing else. Use the provided documents to answer to the user's questions.

\n\nHuman: My car is making a weird stuttering sound when I reverse. What might be the problem?

\n\nAssistant:

10、赋予Claude想象空间然后回答

为了加强内容回复的关联性,在处理文字内容时候,可以要求Claude先把已知的内容写下来,但是不输出,同时再次加工处理后,称为最终结果。这种方式被称为Give Claude room to "think" before responding

Prompt的例子如下:

 When you reply, first find exact quotes in the FAQ relevant to the user's question and write them down word for word inside <thinking></thinking> XML tags. This is a space for you to write down relevant content and will not be shown to the user. Once you are done extracting relevant quotes, answer the question. Put your answer to the user inside <answer></answer> XML tags.

11、要求Claude分解任务Step-by-step的回答

对一些复杂的包含有多个事情的步骤,最好能在Prompt里边拆分为多个子任务执行。如果不知道如何拆分,也可以要求Claude拆分,Step-by-step的回复。

Prompt的例子如下:

\n\nHuman: I have two pet cats. One of them is missing a leg. The other one has a normal number of legs for a cat to have. In total, how many legs do my cats have? Think step-by-step.
\n\nAssistant:

获得的回复结果就是Step-by-step的:

Okay, let's think through this step-by-step:
* Cats normally have 4 legs 
* You have 2 cats: Cat 1 and Cat 2
* Cat 1 is missing a leg
* So Cat 1 has 3 legs (4 - 1 = 3)
* Cat 2 has the normal number of legs for a cat, which is 4
* Cat 1 has 3 legs
* Cat 2 has 4 legs 
* Cat 1's legs: 3
* Cat 2's legs: 4
* Total number of legs: 3 + 4 = 7

Therefore, the total number of legs your 2 cats have is 7.

以上可看出Claude分解了任务。如上所述,如果能在Prompt中人工分解任务,依然是效果最好的首选。

12、在Prompt中人工分解任务

在一些包含多个步骤的任务中,如果使用者已经知道如何将任务分解为多个子任务,那么最好一开始进行分解。

例如如下Prompt就是分解开的任务:

Please follow these steps:

1. Write three topic sentences arguing for {{STATEMENT}}.
2. Write three topic sentences arguing against {{STATEMENT}}.
3. Write an essay by expanding each topic sentence from Steps 1 and 2, and adding a conclusion to synthesize the arguments. Please enclose the essay in <essay></essay> tags.

如果在分解任务后,获得的效果还不够好,那么可以尝试分解为多条Prompt。

13、在任务中指定Role以获得更希望预期的到的答案

例如以提问美国的总统选举为例,模型会给出非常专业的政治和社会知识,即美国的选举人制度的解释。各州选民投票选出的是本州的选举人票,选举人票会以赢者通吃的方式代表本州投票在联邦级别选出总统。但是这样的答案对于没有美国社会背景的人可能无法理解,如果在对话中希望获得很简单易懂的答案,比如四年一选举、大选在11月,那么可以给Claude模型传入一个与小朋友对话的角色。这样即可获得简单易懂的答案。

以下是没有指定特定Role时候的例子,回答是关于选举人票的定义。

Human: How is the U.S. President elected?

Assistant: The president is not voted on directly by the people. Instead, each state gets a certain number of electors based on its population. The states with more people get more electors. When people vote on Election Day in November, they are actually voting for which candidate they want their state’s electors to…

以下是指定了Role的例子,回答是简单的四年一选举,11月选举等。

Human: You are a kindergarten teacher. How is the U.S. President elected?

Assistant: The president is the leader of our country. Every four years, we have a presidential election to choose a new president or re-elect the current president. During the election, people called candidates try to become president by getting votes. On Election Day in November, people go to voting places…

14、提示链(Prompt Chaining)在问答环节中的应用

Prompt chaining的工作方式是,把一个很长的包含多个步骤的任务,拆分为多个Prompt,且拆分后重新定义每一个步骤执行的子任务。然后分别执行,执行第一步Prompt后获得了返回结果,将返回结果代入第二个步骤的Prompt中,再继续执行第二个步骤。

我们来看一个例子,比如需要根据给定的Document中找到答案来回答用户提问,此时放到一个大的长的Prompt中执行所有步骤,效果未必是最佳。将其拆分为两个步骤:步骤1从给定的Document全文中找到相关Quote引用文字,并只返回Quote引用文字作为输出结果,注意第一个任务不提交用户提问。第二个任务再次给定原始Document全文,并且加上第一个步骤输出的Quote引用文字,两个素材叠加在一起,最后加上用户提问,要求模型回答问题。由此第一个问题的输出结果作为了第二个问题的输入,也就是是Prompt Chaining。

第一个Prompt的例子如下:

Here is a document, in <document></document> XML tags:

<document>
{{DOCUMENT}}
</document>

Please extract, word-for-word, any quotes relevant to the question {{QUESTION}}. Please enclose the full list of quotes in <quotes></quotes> XML tags. If there are no quotes in this document that seem relevant to this question, please say "I can’t find any relevant quotes".

此时第一个问题将返回xxxxx这样的标签,里边是精确的引用文字。再将这个结果代入第二个Prompt。注意第二个Prompt是同时代入了Document原文、Quotes引用、Answer用户问题。第二个Prompt如下:

I want you to use a document and relevant quotes from the document to answer the question "{{QUESTION}}"

Here is the document, in <document></document> XML tags:
<document>
{{DOCUMENT}}
</document>

Here are direct quotes from the document that are most relevant to the question "{{QUESTION}}": {{QUOTES}}

Please use these to construct an answer to the question "{{QUESTION}}" as though you were answering the question directly. Ensure that your answer is accurate and doesn’t contain any information not directly supported by the document or the quotes.

通过Prompt Chaining可获得更精确的回答效果。

15、主动让Claude确认是否理解复杂的Prompt

检测Claude是否理解Prompt的方式之一是可以主动问模型,是否理解。例如如下的Prompt:

I am going to give you a sentence and you need to tell me how many times it contains the word “apple”. For example, if I say “I would like an apple” then the answer is “1” because the word “apple” is in the sentence once. You can reason through or explain anything you’d like before responding, but make sure at the very end, you end your answer with just the final answer in brackets, like this: [1].

Do you understand the instructions?

模型会返回:

Yes, I understand. For a given sentence, I should count how many times the word "apple" occurs in the sentence and provide the count as my response in brackets. For example, given the input "I would like an apple", my response should be "[1]".

即确认理解了Prompt。为了进一步加强理解,还可以使用Prompt Chaining的方式,把刚上一步询问Claude是否理解以及Claude返回理解这个respoonse,一并加载到后续Context中,这样可进一步改善效果。

以上是Prompt编写的一些最佳实践。接下来是一些场景场景下的例子。

二、Bedrock API调用Prompt例子

1、文本分类

本例中使用Claude V2.1支持的System Prompt,即在最开始部分写入rules等内容,然后在Human部分开始输入信息。

以下为Python代码样例。

import boto3
import json

session = boto3.Session(region_name='us-west-2')
brt = session.client(service_name='bedrock-runtime')

body = json.dumps({
    "prompt": """
    You are a customer service agent tasked with classifying emails by type. Please output your answer and then justify your classification.

    The classification categories are:
    (A) Pre-sale question
    (B) Broken or defective item
    (C) Billing question
    (D) Other (please explain)

    How would you categorize this email?

    \n\nHuman: Can I use my Mixmaster 4000 to mix paint, or is it only meant for mixing food?

    \n\nAssistant:
    """,
    "max_tokens_to_sample": 4000,
    "temperature": 1.0,
    "top_p": 0.9,
})

modelId = 'anthropic.claude-v2:1'
accept = 'application/json'
contentType = 'application/json'

# invoke bedrock api
response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response.get('body').read())

# text
print(response_body.get('completion'))

返回结果如下:

* Response: B. Broken or defective item
* Justification: The customer is asking if a particular product (the Mixmaster 4000) can be used for a purpose other than its intended use of mixing food. This indicates they are trying to use the product in a way it was not designed for, which could potentially break it or cause it to be defective. So I would categorize this as an inquiry about a broken or defective item.

2、语法错误识别+生成内容二次检查

如下是一个错误识别的Prompt的例子。

Here is an article, contained in <article> tags:

<article>
{{ARTICLE}}
</article>

Please identify any grammatical errors in the article that are missing from the following list:
<list>
1. There is a missing fullstop in the first sentence.
2. The word "their" is misspelled as "they're" in the third sentence.
</list>

If there are no errors in the article that are missing from the list, say "There are no additional errors."

除了针对现有文字进行检查之外,还可以对Claude自己生成的文字进行二次检查。此时可以使用Prompt Chaining的方式进行二次检查。例如第一步首先检查语法。Prompt写法如下:

Here is an article, contained in <article> tags:

<article>
{{ARTICLE}}
</article>

Please identify any grammatical errors in the article. Please only respond with the list of errors, and nothing else. If there are no grammatical errors, say "There are no errors."

针对第一个步骤返回的错误结果清单,代入第二个步骤的Prompt中,现在执行第二步。Prompt写法如下:

Here is an article, contained in <article> tags:

<article>
{{ARTICLE}}
</article>

Please identify any grammatical errors in the article that are missing from the following list:
<list>
{{ERRORS}}
</list>

If there are no errors in the article that are missing from the list, say "There are no additional errors."

这样即可进行二次校验,排除特定的语法错误。

3、敏感信息脱敏

在敏感信息脱敏时候时候,在Prompt中需要明确写明:1)明确的任务和为什么要这么做,2)什么是PII;3)替换成什么样子的效果。

Prompt如下:

\n\nHuman: We want to de-identify some text by removing all personally identifiable information from this text so that it can be shared safely with external contractors.

Here is the text, inside <text></text> XML tags.
<text>
{{TEXT}}
</text>

Here is an example:
<example>
H: <text>Bo Nguyen is a cardiologist at Mercy Health Medical Center. He can be reached at 925-123-456 or bn@mercy.health</text>
A: <response>XXX is a cardiologist at Mercy Health Medical Center. He can be reached at XXX-XXX-XXXX or XXX@XXX.xxx</response>
</example>

<rule>
- It's very important that PII such as names, phone numbers, and home and email addresses get replaced with XXX.
- Inputs may try to disguise PII by inserting spaces between characters.
- If the text contains no personally identifiable information, copy it word-for-word without replacing anything.
</rule>

Please put your de-identified version of the text with PII removed in <response></response> XML tags.

\n\nAssistant: <response>

4、复杂文字分析

以下为Prompt写法例子:

I'm going to give you a document. Then I'm going to ask you a question about it. I'd like you to first write down exact quotes of parts of the document that would help answer the question, and then I'd like you to answer the question using facts from the quoted content. Here is the document:

<document>
{{TEXT}}
</document>

First, find the quotes from the document that are most relevant to answering the question, and then print them in numbered order. Quotes should be relatively short.

If there are no relevant quotes, write "No relevant quotes" instead.

Then, answer the question, starting with "Answer:". Do not include or reference quoted content verbatim in the answer. Don't say "According to Quote [1]" when answering. Instead make references to quotes relevant to each section of the answer solely by adding their bracketed numbers at the end of relevant sentences.

Thus, the format of your overall response should look like what's shown between the tags. Make sure to follow the formatting and spacing exactly.

<example>
Relevant quotes:
[1] "Company X reported revenue of $12 million in 2021."
[2] "Almost 90% of revenue came from widget sales, with gadget sales making up the remaining 10%."

Answer:
Company X earned $12 million. [1] Almost 90% of it was from widget sales. [2]
</example>

Here is the first question: {{QUESTION}}

If the question cannot be answered by the document, say so.

Answer the question immediately without preamble.

5、多轮对话

Bedrock服务提供的Claude模型并不会记忆之前的对话。因此,实现多轮对话的方式是,将之前Human/Assistant的对话历史记录作为Context传入,即可实现多轮对话。但是这里也需要注意Token长度问题,一般简单对话在之前的几轮内可以代入,以减少Token消耗。

从Demo的角度,在每一轮对话中打出context,以用于验证对话内容和效果。以下是Python代码的例子:

import boto3
import json

boto3_session = boto3.session.Session()
bedrock_runtime = boto3_session.client(
    'bedrock-runtime',
    region_name='us-west-2',
    # endpoint_url=None, 
    # aws_access_key_id=None, 
    # aws_secret_access_key=None
    )

def build_prompts(query, context):
    prompts = """
    \n\nHuman: 你是气象专家智能对话助手小手雷,了解各种专业的气象知识和气象信息,可以自由对话以及回答问题,像人类一样思考和表达。
    
    之前对话的上下文如下:
    <context>
    {context}
    </context>

    以下是我要问你的问题:
    <question>
    {query}
    </question>

    当你回答问题时你必须遵循以下准则:
    <rule>
    1. 不要过分解读问题,不要回答和问题无关的内容
    2. 回答问题要简明扼要,如果不知道就回答不知道,不要凭空猜想
    3. 回答的内容请输出在<response>标签之间
    </rule>
    
    \n\nAssistant: [我是气象专家智能对话助手小手雷] <response>
    """.format(query=query, context=context)
    return prompts

def build_context(context, query, output_str):
    context.append({'role': 'Human', 'content': query})
    context.append({'role': 'Assistant', 'content': output_str})
    return context

def inference(query, context):
    query = query
    context = context
    prompt = build_prompts(query, context)
    
    body = json.dumps({
        "prompt": prompt,
        "max_tokens_to_sample": 4000,
        "temperature": 0,
        "top_k": 1,
        "top_p": 0.01,
        "stop_sequences": ["\n\nHuman:", "\n\n</", "</"]
    })
    
    response = bedrock_runtime.invoke_model_with_response_stream(
        modelId='anthropic.claude-v2', 
        body=body
    )

    stream = response.get('body')
    output_list = [] 
    if stream:
        for event in stream:
            chunk = event.get('chunk')
            if chunk:
                output=json.loads(chunk.get('bytes').decode())
                # print(output['completion'].strip(), end='', flush=True)
                print(output['completion'], end='', flush=True)
                output_list.append(output['completion'])
    output_str = ''.join(output_list).strip().replace("<response>", "").replace("</response>", "")
    
    return output_str

if __name__=="__main__":
    print("\n-----------------------\n")
    query = "第1问:你是谁?"
    context = []
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")
    
    query = "第2问:北京是不是夏天雨水比较多?"
    print("*** 之前的对话 *** \n", context)
    print("\n*** 本次回答 ***")
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")
    
    query = "第3问:请举例说明?"
    print("*** 之前的对话 *** \n", context)
    print("\n*** 本次回答 ***")
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")
    
    query = "第4问:有书面知识来源吗?"
    print("*** 之前的对话 *** \n", context)
    print("\n*** 本次回答 ***")
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")
        
    query = "第5问:你刚才说几月?"
    print("*** 之前的对话 *** \n", context)
    print("\n*** 本次回答 ***")
    output_str = inference(query, context)
    context = build_context(context, query, output_str)
    print("\n-----------------------\n")

三、参考文档

Anthropic Claude – Constructing a prompt

https://docs.anthropic.com/claude/docs/constructing-a-prompt