Constructing a Financial Statement Fraud Detection Model Using Hierarchical Attention Networks

Name

Hong-Wei Chen (陳弘偉)

Department

Graduate Institute of Industrial Management

Thesis Title

Constructing a Financial Statement Fraud Detection Model Using Hierarchical Attention Networks
(以分層注意力網路建構財務報表欺詐檢測模型)

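This electronic thesis is authorized for immediate open access.
The open-access full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.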

Abstract

Financial statement fraud is a white-collar crime with serious consequences, including financial losses for investors and creditors, damage to corporate reputation, and legal and regulatory consequences for the individuals and companies involved. Traditional methods of detecting fraud are time-consuming and require extensive manual work. This study proposes an architecture for a financial statement fraud detection system built on four considerations. These include using a Hierarchical Attention Network (HAN) model to extract textual features from the Management's Discussion and Analysis (MD&A) section of annual reports, which captures management's view of the company's operations; using Bi-LSTM and similarity analysis to extract the change in the MD&A over time; and using the company's financial statements to understand its operating condition. Unlike the typical black-box neural networks of earlier work, this study combines the predictive power of deep models with attention mechanisms to obtain an interpretable model. It also uses a Bayesian neural network (BNN) to quantify the model's uncertainty, and uses HAN to improve the accuracy and efficiency of textual feature extraction. The study contributes to the literature by improving the accuracy and efficiency of predictive analytics and by adding interpretability and uncertainty to the model, and it gives regulators a way to monitor and predict financial statement fraud by examining large volumes of public text and data.
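As a rough illustration of the attention pooling behind a hierarchical attention network, the sketch below weights word vectors into sentence vectors and sentence vectors into a document vector, and returns the weights that make the result inspectable. It is a minimal sketch and not the thesis's implementation: the function names (`softmax`, `attention_pool`, `encode_document`), the context vectors, and the toy 2-dimensional embeddings are all assumptions made for illustration, and the FinBERT embeddings, Bi-LSTM encoders, and learned projection layers that a full model would use are omitted.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Score each vector by its dot product with a context vector.

    Returns the attention-weighted sum and the attention weights.
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def encode_document(sentences, word_context, sentence_context):
    """Pool words into sentence vectors, then sentences into a document vector."""
    sentence_vectors, word_weights = [], []
    for words in sentences:
        vec, weights = attention_pool(words, word_context)
        sentence_vectors.append(vec)
        word_weights.append(weights)
    doc_vector, sentence_weights = attention_pool(sentence_vectors, sentence_context)
    return doc_vector, word_weights, sentence_weights


# Toy 2-dimensional "embeddings" (hypothetical values, not from the thesis).
toy_document = [
    [[1.0, 0.0], [0.0, 1.0]],              # sentence 1: two words
    [[0.5, 0.5], [2.0, 0.0], [0.0, 0.0]],  # sentence 2: three words
]
doc_vector, word_weights, sentence_weights = encode_document(
    toy_document, word_context=[1.0, 0.0], sentence_context=[0.0, 1.0]
)
```

The weights sum to 1 at each level, so the highest-weighted words and sentences can be read as the parts of the text that contributed most to the document vector.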

Keywords

★ financial statement fraud
★ deep learning
★ text analysis
★ sentiment analysis
★ annual reports

Table of Contents

Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Challenges
1.3 Research Objectives
1.4 Research Methods
Chapter 2 Literature Review
2.1 Research on Financial Statement Fraud Detection
2.2 Text Analysis
2.3 Hierarchical Attention Network (HAN)
2.3.1 Word Embedding
2.3.2 Bi-directional Long Short-Term Memory (Bi-LSTM)
2.3.3 Attention Mechanism
2.4 Bayesian Neural Network (BNN)
Chapter 3 Methodology
3.1 Hierarchical Attention Network (HAN) Model
3.1.1 FinBERT Embedding
3.1.2 Feature Selection
3.1.4 Bi-directional Long Short-Term Memory (Bi-LSTM)
3.1.5 Sentence Attention
3.2 Temporal Change in MD&A
3.3 Bayesian Neural Network (BNN)
3.4 Evaluation Metrics
Chapter 4 Experimental Results
4.1 Data and Preprocessing
4.1.1 Numerical Data
4.1.2 Text Data
4.1.3 Label Data
4.1.4 Dataset
4.2 Experimental Setup
4.3 Classification Results
4.3.1 Financial Data (FIN)
4.3.2 Text Data (TXT)
4.3.3 Text and Financial Data (TXT+FIN)
4.4 Interpretability
4.4.1 Word Level
4.4.2 Sentence Level
Chapter 5 Conclusion
Appendix 1 Financial Variables
References

Advisor

Ying-Chieh Yeh (葉英傑)

Review Date

2023-7-18