Context-Aware Question-Answer Pairing and Dialogue Act Tagging from Instant Message Chatlog

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/86537

Title:	Context-Aware Question-Answer Pairing and Dialogue Act Tagging from Instant Message Chatlog
Author:	陶玫婉; Poopradubsil, Thamolwan
Contributor:	Department of Computer Science and Information Engineering
Keywords:	dialog system; instant messaging; conversation disentanglement; response selection; text classification; dialogue act tagging; information retrieval
Date:	2021-07-26
Uploaded:	2021-12-07 12:57:02 (UTC+8)
Publisher:	National Central University
Abstract:	In this thesis, we study two tasks in the data preparation process: question-answer pair preparation and dialogue act tagging. Unlike prior work, our data comes from an instant messaging (IM) platform, where participants often split long sentences into short utterances and send them as multiple messages. Therefore, when preparing question-answer pairs, we also consider a message merging task, which determines whether such messages need to be merged before message pairs are generated for the reply-to prediction task. We propose a CONTEXT-AOA model that takes the context (the previous dialogue) as additional input alongside the pairwise messages. For the dialogue act tagging task, we explore the possibility of using out-of-domain datasets when more annotated data cannot be obtained. We conduct two experiments on this task. The first is a zero-shot learning experiment, in which we train the models using only out-of-domain datasets and test them on our dataset. In the second, we add part of our dataset to the out-of-domain training data and test the models on the remaining data. We also propose a CONTEXT-BERT-CRF model, which leverages BERT fine-tuning while still feeding all of the utterances in a conversation to the model. Our experiments on both the question-answer pair preparation task and the dialogue act tagging task show that our proposed models outperform all of the existing models in most of the experiments. To demonstrate the use of these two tasks, we built a retrieval-based chatbot. The chatbot selects a response from the Q&A pairs prepared in the question-answer pair preparation task based on the user's input and returns it to the user. We also apply dialogue act tagging to help with answer selection.
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	86	View/Open

All items in NCUIR are protected by original copyright.
