Abstract: The goal of generative (abstractive) summarization is to condense a long text into a shorter summary that preserves its meaning and key information. It has many applications, such as news headline generation, academic paper abstracts, automated report generation, and question-answering chatbots. The main objective of this research is question understanding for retrieval-based medical question-answering systems: users' medical questions often contain unnecessary information that degrades the question-answer matching accuracy of the retrieval system. We therefore develop a generative summarization technique as a question-understanding solution that produces a summarized version of each user question, which is then fed into the retrieval-based medical QA system to improve the matching of relevant answers. We propose an Intent-based Medical Question Summarization (IMQS) model: an entity recognizer first extracts the medical entities in the original question, and these entities are added to the original question as an entity prompt to form the input to the summarization model; the model then jointly learns question-intent classification and summarization, fine-tuning the encoder and decoder of the summarization language model so that the generated summary pays more attention to the medical entities and retains the intent of the original question.

We crawled users' questions from the MedNet (醫聯網) physician consultation platform and selected suitable questions for medical-entity tagging, intent labeling, and question summarization, resulting in a Chinese medical question summarization dataset, Med-QueSumm. It contains 2,468 Chinese medical questions; each original question averages about 110 characters and 7.75 entities and is annotated with one of six predefined intent categories (symptoms, drugs, departments, treatments, examinations, and information), while each summarized question averages about 45 characters, roughly 40% of the original length. Experimental results and model analysis show that our IMQS model achieves the best summarization performance, with ROUGE-1 of 69.59%, ROUGE-2 of 51.32%, ROUGE-L of 61.69%, and BERTScore of 64.08%, outperforming related models including BERTSum-abs, PEGASUS, ProphetNet, CPT, BART, GSum, and SpanCopy, and it also reaches a Micro-F1 of 85.54% in intent classification. Overall, IMQS is a Chinese medical question summarization method that combines summary quality with intent analysis.
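The joint-learning design described above (an entity-prompted input plus simultaneous intent classification and summarization) can be illustrated with a minimal PyTorch/Transformers sketch. This is not the authors' implementation: the prompt template, the Chinese BART checkpoint name (fnlp/bart-base-chinese), the mean-pooling of encoder states, and the 0.5 loss weight are assumptions made only for illustration.

```python
# Minimal sketch of joint summarization + intent-classification fine-tuning.
# All concrete choices below (checkpoint, prompt format, pooling, loss weight)
# are illustrative assumptions, not the IMQS paper's released configuration.
import torch
import torch.nn as nn
from transformers import BertTokenizer, BartForConditionalGeneration

NUM_INTENTS = 6  # symptoms, drugs, departments, treatments, examinations, information


class IntentAwareSummarizer(nn.Module):
    """BART-style summarizer with an extra intent-classification head on the encoder."""

    def __init__(self, model_name: str = "fnlp/bart-base-chinese", num_intents: int = NUM_INTENTS):
        super().__init__()
        self.bart = BartForConditionalGeneration.from_pretrained(model_name)
        self.intent_head = nn.Linear(self.bart.config.d_model, num_intents)

    def forward(self, input_ids, attention_mask, labels, intent_labels, intent_weight=0.5):
        # Summarization branch: standard seq2seq cross-entropy computed by the BART LM head.
        out = self.bart(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
        summ_loss = out.loss

        # Intent branch: mean-pool the shared encoder states and classify
        # (a simple pooling choice assumed here for illustration).
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (out.encoder_last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
        intent_logits = self.intent_head(pooled)
        intent_loss = nn.functional.cross_entropy(intent_logits, intent_labels)

        # Joint objective: both losses back-propagate into the shared encoder,
        # while the summarization loss also fine-tunes the decoder.
        return summ_loss + intent_weight * intent_loss, intent_logits


def build_prompted_input(question: str, entities: list[str]) -> str:
    # Hypothetical entity-prompt template: prepend the recognized medical entities.
    return "實體: " + "、".join(entities) + " 問題: " + question


if __name__ == "__main__":
    tokenizer = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")
    model = IntentAwareSummarizer()

    src = build_prompted_input("最近常常頭痛想吐,吃普拿疼也沒用,需要掛哪一科?", ["頭痛", "普拿疼"])
    tgt = "頭痛服藥無效應掛哪一科?"

    enc = tokenizer(src, return_tensors="pt", truncation=True, max_length=256)
    dec = tokenizer(tgt, return_tensors="pt", truncation=True, max_length=64)
    loss, logits = model(enc["input_ids"], enc["attention_mask"],
                         dec["input_ids"], torch.tensor([2]))  # assumed intent id 2 = departments
    loss.backward()  # a full training step would follow with an optimizer update
```

In this sketch, a single forward pass yields both the summarization loss and the intent logits, so one optimizer step updates the intent head, the decoder, and the shared encoder together, mirroring the joint fine-tuning idea stated in the abstract.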