Graduate Thesis 110525004: Detailed Record




Author Yu-Kai Lee (李聿鎧)   Graduate Program Institute of Software Engineering
Thesis Title Story Co-telling Dialogue Generation via Reinforcement Learning and Knowledge Graph
(應用強化學習與知識圖譜於故事共述生成之研究)
  1. Access rights for this electronic thesis: the author consented to immediate open access.
  2. The open electronic full text is licensed to users for academic research only: personal, non-profit searching, reading, and printing.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

Abstract (Chinese) Retelling a story from a model is one way to cultivate students' narrative ability, but it can be difficult for students with weaker memory or those unable to describe a story on their own. We therefore aim to use natural language processing to build a story co-telling dialogue module that tells an English story together with the student, cultivating narrative skill along the way. Story co-telling, however, is a relatively new task that few have explored, and no ready-made story co-telling dialogue corpus is available. Letting a chatbot learn from live interaction with students would be costly in both time and money, so we generate the dataset with a machine-to-machine approach combined with reinforcement learning; the lack of a reward function for that reinforcement learning is a further design challenge.
In story co-telling the model needs two capabilities: (1) understanding the story content so as to grasp its plot and information, and (2) discussing the rest of the story based on the current conversation. We use open-domain information extraction to build a story knowledge graph, which both captures key information and provides a structured knowledge representation that helps the model understand and organize the story. We then apply multi-agent reinforcement learning: two agents select relevant facts from the knowledge graph according to the dialogue history to generate replies, jointly completing the co-telling task. With these capabilities, the dialogue module can introduce story elements effectively during co-telling; for example, when the user mentions a particular plot point or character, the model can develop the plot further and supply related background.
Through reinforcement learning, the model makes better-informed choices given the current dialogue history and the candidate replies. Compared with replying purely in chronological order, our model's performance improves from 67.01% to 70.81% under self-trained reward evaluation, a gain of about 3.8 percentage points.
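The story knowledge graph described above can be sketched as a simple index of subject-relation-object facts. The triples below are hand-written stand-ins for illustration; the thesis extracts such facts automatically with open-domain information extraction.

```python
from collections import defaultdict

def build_graph(triples):
    """Index (subject, relation, object) facts by subject entity."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return graph

# Hand-written toy facts standing in for automatically extracted triples.
triples = [
    ("Cinderella", "lives_with", "stepmother"),
    ("Cinderella", "attends", "ball"),
    ("prince", "searches_for", "Cinderella"),
]
graph = build_graph(triples)
# graph["Cinderella"] now lists every fact about that character,
# giving the model a structured view of the story to draw replies from.
```

A dialogue agent can then look up all facts about an entity the user mentions and choose one to continue the story with.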
Abstract (English) We aim to develop a dialogue module for story co-telling using natural language processing techniques to help students improve their narrative abilities. However, this task is relatively unexplored and lacks readily available dialogue datasets. To overcome this, we adopt a machine-to-machine approach with reinforcement learning to generate the dataset, although the absence of a reward function presents a design challenge.
In story co-telling, the model needs two main capabilities: (1) understanding the story content and (2) discussing relevant plot points based on the ongoing conversation. We use open-domain information extraction to create a knowledge graph for the story, which captures essential information and helps the model comprehend and organize the story details. Using multi-agent reinforcement learning, two agents select relevant facts from the knowledge graph based on the conversation history to generate responses and complete the story co-telling task together. This enables the dialogue module to effectively introduce story elements during the co-telling process, such as providing background and progression when the user mentions specific plots or characters.
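The machine-to-machine generation loop described above can be sketched as two agents taking turns drawing facts from a shared pool. `select_fact` here is a hypothetical placeholder: it picks the first unused fact, whereas the thesis trains this choice with multi-agent reinforcement learning.

```python
def select_fact(history, facts):
    # Placeholder policy: take the first unused fact.
    # In the thesis this decision is learned via multi-agent RL.
    return facts[0]

def co_tell(facts, turns=4):
    """Two agents alternately pick facts to co-tell a story."""
    history, pool = [], list(facts)
    for turn in range(turns):
        if not pool:
            break  # story exhausted
        fact = select_fact(history, pool)
        pool.remove(fact)
        agent = "A" if turn % 2 == 0 else "B"
        history.append((agent, fact))
    return history
```

Running the loop yields an alternating dialogue trace, which is the kind of synthetic co-telling data the machine-to-machine approach produces.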
Through reinforcement learning, we can make more informed choices based on the current conversation history and candidate responses. Compared to merely responding in chronological order, our model's performance improved from 67.01% to 70.81% through self-training with reward evaluation, an increase of about 3.8 percentage points.
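Reward-guided response selection can be sketched as scoring each candidate reply against the dialogue history and taking the argmax. Bag-of-words cosine similarity below is a stand-in for the trained reward models (the thesis uses learned dialogue-history and entity-relation evaluators instead).

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two texts under bag-of-words counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def select_response(history, candidates):
    """Pick the candidate reply that scores highest against the history."""
    context = " ".join(history)
    return max(candidates, key=lambda c: cosine(context, c))
```

Replacing `cosine` with a learned reward function, while keeping the argmax over candidates, recovers the selection scheme the abstract describes.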
Keywords (Chinese) ★ 強化學習
★ 知識圖譜
★ 故事共述
★ 對話機器人
Keywords (English) ★ Reinforcement Learning
★ Knowledge Graph
★ Story Co-telling
★ Chatbot
Table of Contents Abstract (Chinese) i
Abstract (English) ii
Acknowledgements iii
Table of Contents iv
List of Figures vi
List of Tables vii
1. Introduction 1
1-1 Objectives 2
1-2 Challenges 3
1-3 Contributions 3
2. Related Work 5
2-1 Story Chatbots 5
2-2 Long-Text Story Comprehension 6
2-3 Language Models 7
2-3-1 T5: Text-To-Text Transfer Transformer 7
2-3-2 Sentence Transformers 8
2-3-3 ChatGPT 8
2-4 Reinforcement Learning 8
2-4-1 Value-Based Methods 9
2-4-2 Policy-Based Methods 10
2-4-3 Actor-Critic Methods 10
2-5 Multi-Agent Reinforcement Learning 11
2-6 Reinforcement Learning for Chatbots 11
3. Method 12
3-1 Long-Text Story Representation 12
3-2 Agents 13
3-2-1 Knowledge Graph Fact Selection 14
3-2-2 Candidate Response Generation 15
3-3 Rewards 15
3-3-1 Dialogue History Evaluation 15
3-3-2 Entity-Relation Evaluation 18
3-4 Reinforcement Learning 18
4. Experiments 21
4-1 Dataset 21
4-2 Environment 22
4-2-1 Training Results of the Dialogue History Evaluation Model 22
4-2-2 Effect of Different Rewards on Action Selection 24
4-2-3 Story Co-telling Results 25
5. Conclusion and Future Work 27
References 28
Advisor Chia-Hui Chang (張嘉惠)   Date of Approval 2023-07-28
