基於提示學習的中文事實查核任務之研究;The Study of Prompt Based Learning for Chinese Fact Checking

Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93208

Title:	基於提示學習的中文事實查核任務之研究;The Study of Prompt Based Learning for Chinese Fact Checking
Authors:	丁于晏;Ting, Yu-Yen
Contributors:	Department of Computer Science and Information Engineering
Keywords:	Fact Checking;Prompt Based Learning;Prompt Tuning;Parameter-Efficient Fine-Tuning
Date:	2023-07-24
Issue Date:	2024-09-19 16:48:19 (UTC+8)
Publisher:	National Central University
Abstract:	With information spreading widely online, the Web contains many claims whose truthfulness is hard to judge, and verifying them manually is difficult. Automated fact-checking is therefore needed. This thesis focuses on Chinese fact-checking. Previous work has concentrated on English or multilingual datasets and on the conventional pre-train and fine-tune approach. We instead aim to improve Chinese fact-checking with prompt-based learning, which follows the newer "pre-train, prompt, predict" paradigm. The fact-checking task consists of two subtasks: evidence retrieval and claim verification. For claim verification, we explore several prompt-based learning strategies. Since prompt-based learning requires a template to be added to the input, we consider both manually designed and automatically generated templates; for the latter, we generate templates with Automatic Prompt Engineer (APE) [1]. For evidence retrieval, we use a supervised SentenceBERT [2] model and an unsupervised PromptBERT [3] model.
We show that prompt-based learning improves the F1 score of claim verification by 1 to 2 percentage points (from 78.99% to 80.70%). Both evidence retrieval models also improve substantially: unsupervised PromptBERT raises F1 by 18 percentage points (from 12.66% to 30.61%), and supervised SentenceBERT reaches an F1 of 88.15%. Finally, we combine evidence retrieval with claim verification into a complete fact-checking pipeline. On the Chinese fact-checking dataset CHEF, the pipeline achieves an F1 score of 80.54%, well above the baseline of 63.47% and even above claim verification with human-annotated gold evidence (78.99%). Overall, prompt-based learning improves on conventional fine-tuning for Chinese fact-checking.
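The idea of adding a template to the input for claim verification can be illustrated with a minimal sketch. This is not the thesis's implementation: the template wording, the label words, the three-way label set, and the `toy_scorer` stand-in for a masked language model are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List

# Hypothetical manually designed template: claim and evidence surround a [MASK] slot.
TEMPLATE = "Evidence: {evidence} Claim: {claim} The claim is [MASK]."

# Hypothetical verbalizer: maps label words (candidate fillers for [MASK])
# to an assumed three-way label set.
VERBALIZER: Dict[str, str] = {
    "true": "supported",
    "false": "refuted",
    "unknown": "not enough information",
}


def build_prompt(claim: str, evidence: List[str]) -> str:
    """Fill the template with the claim and its retrieved evidence sentences."""
    return TEMPLATE.format(evidence=" ".join(evidence), claim=claim)


def predict(
    prompt: str,
    score_label_words: Callable[[str, List[str]], Dict[str, float]],
) -> str:
    """Pick the label whose label word scores highest at the [MASK] position."""
    scores = score_label_words(prompt, list(VERBALIZER))
    best_word = max(scores, key=lambda word: scores[word])
    return VERBALIZER[best_word]


def toy_scorer(prompt: str, words: List[str]) -> Dict[str, float]:
    """Stand-in for a masked language model (NOT a real model).

    It always prefers the word "false"; a real system would return the
    model's probability of each label word at the [MASK] position.
    """
    return {word: (1.0 if word == "false" else 0.0) for word in words}


if __name__ == "__main__":
    prompt = build_prompt("The moon is made of cheese.", ["The moon is rocky."])
    print(prompt)
    print(predict(prompt, toy_scorer))
```

In a real setting, `toy_scorer` would be replaced by a pre-trained masked language model, and an automatically generated template would simply replace the `TEMPLATE` string.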
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation
