Prompt Engineering 提示词工程开发指南：八、应用实战（上）

2023-08-07 | 0 评论 | 0 浏览

一、程序辅助语言模型

高等人（2022）提出了一种使用LLM（Language Model）读取自然语言问题并生成程序作为中间推理步骤的方法。这种方法被称为程序辅助语言模型（PAL），它与思路链提示不同，它不是使用自由形式的文本来获得解决方案，而是将解决步骤转移到程序运行时，例如Python解释器中。

PAL

让我们以LangChain和OpenAI GPT-3为例，看一个实例。我们有兴趣开发一个简单的应用程序，能够解释被问的问题，并利用Python解释器提供答案。

具体来说，我们有兴趣创建一个功能，允许使用LLM来回答需要日期理解的问题。我们将向LLM提供一个包含一些示例的提示，这些示例是从这里采用的（打开新标签）。

这是我们需要的导入语句：

import openai
from datetime import datetime
from dateutil.relativedelta import relativedelta
import os
from langchain.llms import OpenAI
from dotenv import load_dotenv

Let's first configure a few things:

load_dotenv()
 
# API configuration
openai.api_key = os.getenv("OPENAI_API_KEY")
 
# for LangChain
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

Setup model instance:

llm = OpenAI(model_name='text-davinci-003', temperature=0)

Setup prompt + question:

question = "Today is 27 February 2023. I was born exactly 25 years ago. What is the date I was born 
in MM/DD/YYYY?"
 
DATE_UNDERSTANDING_PROMPT = """
# Q: 2015 is coming in 36 hours. What is the date one week from today in MM/DD/YYYY?
# If 2015 is coming in 36 hours, then today is 36 hours before.
today = datetime(2015, 1, 1) - relativedelta(hours=36)
# One week from today,
one_week_from_today = today + relativedelta(weeks=1)
# The answer formatted with %m/%d/%Y is
one_week_from_today.strftime('%m/%d/%Y')
# Q: The first day of 2019 is a Tuesday, and today is the first Monday of 2019. What is the date today
 in MM/DD/YYYY?
# If the first day of 2019 is a Tuesday, and today is the first Monday of 2019, then today is 6 days
 later.
today = datetime(2019, 1, 1) + relativedelta(days=6)
# The answer formatted with %m/%d/%Y is
today.strftime('%m/%d/%Y')
# Q: The concert was scheduled to be on 06/01/1943, but was delayed by one day to today. What is the
 date 10 days ago in MM/DD/YYYY?
# If the concert was scheduled to be on 06/01/1943, but was delayed by one day to today, then today is
 one day later.
today = datetime(1943, 6, 1) + relativedelta(days=1)
# 10 days ago,
ten_days_ago = today - relativedelta(days=10)
# The answer formatted with %m/%d/%Y is
ten_days_ago.strftime('%m/%d/%Y')
# Q: It is 4/19/1969 today. What is the date 24 hours later in MM/DD/YYYY?
# It is 4/19/1969 today.
today = datetime(1969, 4, 19)
# 24 hours later,
later = today + relativedelta(hours=24)
# The answer formatted with %m/%d/%Y is
today.strftime('%m/%d/%Y')
# Q: Jane thought today is 3/11/2002, but today is in fact Mar 12, which is 1 day later. What is the
 date 24 hours later in MM/DD/YYYY?
# If Jane thought today is 3/11/2002, but today is in fact Mar 12, then today is 3/1/2002.
today = datetime(2002, 3, 12)
# 24 hours later,
later = today + relativedelta(hours=24)
# The answer formatted with %m/%d/%Y is
later.strftime('%m/%d/%Y')
# Q: Jane was born on the last day of Feburary in 2001. Today is her 16-year-old birthday. What is the
 date yesterday in MM/DD/YYYY?
# If Jane was born on the last day of Feburary in 2001 and today is her 16-year-old birthday, then
 today is 16 years later.
today = datetime(2001, 2, 28) + relativedelta(years=16)
# Yesterday,
yesterday = today - relativedelta(days=1)
# The answer formatted with %m/%d/%Y is
yesterday.strftime('%m/%d/%Y')
# Q: {question}
""".strip() + '\n'

llm_out = llm(DATE_UNDERSTANDING_PROMPT.format(question=question))
print(llm_out)

exec(llm_out)
print(born)

这将输出以下结果：02/27/1998

二、生成数据

LLM具有生成连贯文本的强大能力。使用有效的提示策略可以引导模型产生更好、更一致和更真实的回复。LLM还可以特别用于生成数据，这对于进行各种实验和评估非常有用。例如，我们可以使用它快速生成情感分类器的样本，如下所示：

提示词：

生成10个用于情感分析的示例。这些示例被分为正面或负面。产生2个负面示例和8个正面示例。对于这些示例，
请使用以下格式：
问题： <句子>
回答： <情感>

输出结果：

问题：我今天过得很糟糕，碰到了好多麻烦。
回答：负面

问题：我非常失望地发现这件商品的质量很糟糕。
回答：负面

问题：我很开心地收到了一封工作面试的邀请函。
回答：正面

问题：这家餐厅的食物非常美味，服务也很周到。
回答：正面

问题：我非常喜欢这个音乐会，音乐和表演都太棒了。
回答：正面

问题：这本小说的情节扣人心弦，让我无法放下。
回答：正面

问题：我感到无比感激，因为有一群好朋友一直支持着我。
回答：正面

问题：看到那只可爱的小猫咪，我感到非常愉快和放松。
回答：正面

问题：这个假期我去了一个令人惊叹的海滩度假胜地，度过了愉快的时光。
回答：正面

问题：听到这个好消息，我兴奋得无法用言语形容。
回答：正面

这非常有用。实际上，我们在指南的另一部分中使用这个例子进行了不同的测试。

三、生成代码

像ChatGPT这样的LLM在代码生成方面非常有效。在这一部分，我们将介绍如何使用ChatGPT进行代码生成的多个示例。

在下面的所有示例中，使用OpenAI的Playground（聊天模式）和gpt-3.5-turbo模型。

与OpenAI的所有聊天模型一样，您可以使用系统消息来定义响应的行为和格式。我们将使用以下System Message作为提示示例：

您是一位有帮助的代码助手，可以教初级开发者如何编写代码。您选择的编程语言是Python。不要解释代码，
只需生成代码块本身。

1、基础样例

User Message将是提示本身，包括您希望模型生成的具体代码的指示。以下是代码生成的第一个基本提示的快照。请注意，最终的代码是由助手生成的。

请用python实现下面这段逻辑：询问用户的姓名并且对用户说“嗨”

提示词：

请用python实现下面这段逻辑：询问用户的姓名并且对用户说“嗨”

2、将注释转换为代码

您还可以使用这些LLM的代码生成能力，仅通过注释生成代码。让我们看另一个示例，将指示作为注释块传递：

提示词：

"""
1. 创建一个电影列表
2. 创建一个电影评分列表
3. 将它们组合起来，生成一个包含10部电影及其评分的JSON对象。
"""

输出结果：

movies = ["The Shawshank Redemption", "The Godfather", "The Dark Knight", "Schindler's List", 
"Pulp Fiction", "The Lord of the Rings: The Return of the King", "Forrest Gump", 
"Star Wars: Episode V - The Empire Strikes Back", "Inception", "The Silence of the Lambs"]
 
ratings = [9.3, 9.2, 9.0, 8.9, 8.9, 8.9, 8.8, 8.7, 8.7, 8.6]
 
movie_ratings = {}
 
for i in range(len(movies)):
    movie_ratings[movies[i]] = ratings[i]
 
json_object = json.dumps(movie_ratings, indent=4)
 
print(json_object)

生成的代码看起来像是可以工作的代码。然而，输出中缺少一个重要的导入语句，即import json。虽然这些模型生成的代码是可行的，但您必须注意这样的细节，并始终确保测试您的代码。

3、完成函数或下一行

这些LLM还被整合到了像GitHub Copilot这样的工具中，使它们对开发者非常有用。一个有用的功能是模型能够完成函数。

提示词：

# function to multiply two numbers and add 75 to it
 
def multiply(

输出结果：

a, b):
    result = a * b
    result += 75
    return result

4、MySQL查询SQL生成

除了上面的基本代码生成示例，您还可以使用模型生成在编程的其他方面中可以使用的有用代码，比如生成和测试MySQL查询语句。

假设您有一个包含一些信息的数据集，您可以将其作为提示的一部分，并指示模型生成特定的查询语句。例如：

"""
Table departments, columns = [DepartmentId, DepartmentName]
Table students, columns = [DepartmentId, StudentId, StudentName]
Create a MySQL query for all students in the Computer Science department
"""

输出结果：

SELECT students.StudentId, students.StudentName
FROM students
INNER JOIN departments
ON students.DepartmentId = departments.DepartmentId
WHERE departments.DepartmentName = 'Computer Science';

模型生成了一个查询语句！然而，在这个示例中，我们编造的数据库，所以我们没有数据库可以测试生成的查询语句。如果您正在测试这个示例，创建这个数据库可能会很有用。幸运的是，您也可以使用ChatGPT来为您生成这个数据库。一种方法是使用上面稍微不同指令的修改版本来生成数据库模式：

Table departments, columns = [DepartmentId, DepartmentName]
Table students, columns = [DepartmentId, StudentId, StudentName]
Create a valid database schema with the above tables and columns

模型生成了以下数据库模式：

CREATE TABLE departments (
  DepartmentId INT PRIMARY KEY,
  DepartmentName VARCHAR(50)
);
CREATE TABLE students (
  DepartmentId INT,
  StudentId INT PRIMARY KEY,
  StudentName VARCHAR(50),
  FOREIGN KEY (DepartmentId) REFERENCES departments(DepartmentId)
);

您可以使用类似sqliteonline.com的工具创建一个临时数据库。

您还需要一些虚拟数据来测试原始生成的查询语句是否有效。我们也可以使用ChatGPT来生成虚拟数据！提示内容可能如下所示：

CREATE TABLE departments (
  DepartmentId INT PRIMARY KEY,
  DepartmentName VARCHAR(50)
);
CREATE TABLE students (
  DepartmentId INT,
  StudentId INT PRIMARY KEY,
  StudentName VARCHAR(50),
  FOREIGN KEY (DepartmentId) REFERENCES departments(DepartmentId)
);
Given the database schema above, generate valid insert statements include 4 rows for each table.

输出与插入语句如下所示：

INSERT INTO departments (DepartmentId, DepartmentName)
VALUES (1, 'Computer Science'), (2, 'Mathematics'), (3, 'Physics'), (4, 'Biology');
INSERT INTO students (DepartmentId, StudentId, StudentName)
VALUES (1, 101, 'John Doe'), (1, 102, 'Jane Doe'), (2, 201, 'Alice Smith'), (2, 202, 'Bob Johnson');

现在您有了数据库和虚拟数据，可以测试原始的查询语句了。根据虚拟数据，结果应该是两行，分别是Jane Doe和John Doe。

5、解释代码

如果您正在学习某种编程语言，提示模型解释某些代码片段可能会很有用。让我们重用上面生成的查询语句，并要求模型解释它。

如果您正在使用之前的系统消息，请注意指导模型的方式。例如，如果我们使用以下示例，模型会抱怨它是一个Python的代码助手，但它仍然提供了解释。有时模型会拒绝生成您的指导，因为它设计成遵循System Message。

提示词：

SELECT students.StudentId, students.StudentName
FROM students
INNER JOIN departments
ON students.DepartmentId = departments.DepartmentId
WHERE departments.DepartmentName = 'Computer Science';
请解释上面的SQL语句

输出结果：

这条SQL语句用于根据指定的条件查询学生表中与计算机科学系相关的学生信息。在这个例子中，使用了两个表：
学生表(students)和系别表(departments)。

其中，"SELECT students.StudentId, students.StudentName"表示选择学生表中的学生ID和学生姓名两个字段作为
输出结果。

"FROM students INNER JOIN departments ON students.DepartmentId = departments.DepartmentId"表示使用
INNER JOIN（内连接）将学生表和系别表连接起来，通过学生表的系别ID字段与系别表的系别ID字段进行匹配。这样，
只有在两个表中都有对应的记录时，才会返回结果。

"WHERE departments.DepartmentName = 'Computer Science'"表示通过指定一个条件，限制只返回系别名称
为'Computer Science'的记录。

因此，上述SQL语句的目的是查询与计算机科学系相关的学生的学生ID和学生姓名。

标题：Prompt Engineering 提示词工程开发指南：八、应用实战（上）
作者：michael
地址：https://blog.junxworks.cn/articles/2023/07/25/1690276186492.html