A Prompt-Based Framework for the Automated Generation of Natural Language Instructions Across Diverse Domains and Tasks

Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/93002

Title:	A Prompt-Based Framework for the Automated Generation of Natural Language Instructions Across Diverse Domains and Tasks
Author:	林禹彤; Lin, Yu-Tung
Contributor:	Department of Computer Science and Information Engineering
Keywords:	prompt engineering; data generation; instruction data
Date:	2023-05-30
Uploaded:	2024-09-19 16:38:12 (UTC+8)
Publisher:	National Central University
Abstract:	With the emergence of large language models such as ELMo, GPT, and BERT, research in natural language processing has shifted toward a two-stage training paradigm: pretraining a large language model, then fine-tuning it on downstream tasks. Subsequent work on adapting models to multiple tasks and to unseen tasks revealed the generalization potential of pretrained models and led to the concept of instruction tuning. This shift also changed the kind of labeled data required: instead of data designed for individual tasks, instruction tuning needs instructional data. Earlier approaches added instructions to existing annotated NLP datasets, which is a substantial undertaking and prompted research into generating instruction data automatically. This thesis proposes a new prompt-based framework for automated instruction data generation. We design five prompts that guide existing instruction-tuned language models to generate more than 10,000 instruction examples across different domain topics and tasks. Because the framework allows the target task to be specified, it addresses the imbalance of task types found both in instruction data derived from existing NLP datasets and in data produced by previously proposed automated generation methods. We are also the first to attempt to automatically generate data usable for training reward models in reinforcement learning. Although the experiments do not directly test the effect of this data in reinforcement learning, we instruction-tune GPT-3 and use the recently proposed G-Eval method to automatically evaluate both the generated data and the instruction-tuned outputs, obtaining results that exceed the baseline by 0.15 to 0.45.
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	13

All items in NCUIR are protected by the original copyright.
