Master's Thesis 110426006: Detailed Record




Author: Ching-Siang You (游景翔)   Department: Graduate Institute of Industrial Management
Thesis Title: Design of Transformer Architecture Based on Different Position Encoding in Time Series Forecasting
(時間序列預測中變換器架構之位置編碼設計)
Related Theses
★ A Two-Stage Operations Research Model for Three-Dimensional Facility Planning: The Cases of Semiconductor Manufacturers X and Y
★ Implementing TPM Activities to Improve Overall Equipment Effectiveness and Enhance Competitiveness: The Case of Company U's Taoyuan Plant
★ A Study of Marketing Channel Strategies for Information Systems Integrators
★ Inducing Key Processes with Decision Trees and Identifying Critical Paths with Clustering
★ Establishing and Implementing Key Performance Indicators (KPI) in the Paper Industry
★ Applying Design of Experiments to Optimize the Cross-Section Quality of Solder Balls on IC Substrates
★ Deriving New Design Rules from Historical Drilling Cp Values to Balance Quality and Lower Production Costs
★ Building and Implementing a Product Data Management System: The Case of Semiconductor IC Packaging Company C
★ Operations Management in the Transformation from ODM to Own Brand
★ Applying Six Sigma Steps and FMEA to Cold-Slug Improvement in Plastic Injection Molding (The Case of Company S)
★ A Study of the Operating Performance of Taiwan's Tire Industry
★ Setting Time Standards for Organic Material Replacement in OLED Panel Evaporation Using Methods-Time Measurement
★ Using Six Sigma Management to Improve Production Efficiency: Coating-Filling Process Improvement at Company A
★ Cluster Analysis of Target Groups by Process Similarity: In-House Repaired Parts at an Aircraft Engine Maintenance Facility
★ Establishing Design Chain Performance Metrics: The Case of Company A in the Electric Bus Industry
★ Applying Data Mining to Identify Factors Affecting Solar Module Process Yield
  1. The author has agreed to make the electronic full text of this thesis available immediately.
  2. The released electronic full text is licensed only for personal, non-profit retrieval, reading, and printing for academic research purposes.
  3. Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.

Abstract (Chinese): Time series analysis and forecasting are an important area of data mining. Time series data are large collections of values gathered at uniform time intervals, such as years, months, weeks, or days. By analyzing a time series, we can model how the data change and forecast future values. In recent years, time series forecasting has been a focus of research, spurring a wide range of work in machine learning and artificial intelligence. With increasing data availability and computing power, many deep learning-based models have appeared, and the diversity of problems across domains has given rise to many different deep learning model designs. Time series trend forecasting has long been an important topic, and its results provide a basis for applications in many fields, such as the control and optimization of production planning.
The Transformer model is a neural network architecture originally proposed for processing natural language. It uses a mechanism called attention, or self-attention, to detect how the elements of a sequence influence and depend on one another. In this study, we apply the Transformer model to time series forecasting and investigate whether its parallel computation can overcome the limitations of the long short-term memory (LSTM) model in learning long sequences.
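As a minimal illustration of the self-attention mechanism described above, the following NumPy sketch computes single-head scaled dot-product attention, the standard Transformer formulation. The sequence length and model width are toy values assumed for the example, not the configuration used in the thesis.

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # X: (seq_len, d_model) embedded time steps.
        # Project the input to queries, keys, and values.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        # Scaled pairwise similarity between every pair of time steps.
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        # Row-wise softmax turns similarities into attention weights.
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        # Each output step is a weighted sum over all value vectors,
        # so every position can attend to every other position.
        return w @ V

    # Toy usage with assumed dimensions: 24 time steps, d_model = 8.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(24, 8))
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)  # (24, 8)

Because all positions are updated at once through matrix products, the computation parallelizes over the sequence instead of iterating step by step as an LSTM does.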
In addition, we use different positional encoding mechanisms to provide the position of each time step within the sequence, and we investigate how different positional encodings affect time series forecasting with the Transformer model. In the experiments we use five real-world time series datasets to evaluate each model's predictions across different time trends.
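The table of contents lists the two positional encoding variants compared in the thesis: learnable and sinusoidal. The sinusoidal variant has a closed form and can be sketched as below; this follows the standard formulation from the Transformer literature, assumes an even model dimension, and is not the thesis's exact code. A learnable encoding instead trains a position-indexed table of the same shape.

    import numpy as np

    def sinusoidal_positional_encoding(seq_len, d_model):
        # PE[pos, 2i]   = sin(pos / 10000**(2i / d_model))
        # PE[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
        # Assumes d_model is even.
        pos = np.arange(seq_len)[:, None]                      # (seq_len, 1)
        div = 10000.0 ** (np.arange(0, d_model, 2) / d_model)  # (d_model/2,)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(pos / div)   # even dimensions
        pe[:, 1::2] = np.cos(pos / div)   # odd dimensions
        return pe

    # The encoding is added to the embedded inputs before the encoder, e.g.
    # X_in = X + sinusoidal_positional_encoding(len(X), X.shape[1])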
Abstract (English): Time series analysis and forecasting are essential components of data mining. Time series data are collections of values gathered at regular time intervals, such as yearly, monthly, weekly, or daily. By analyzing time series data, we can model the changes occurring within a dataset and forecast future trends. Time series prediction has been a research hotspot over the past ten years. With the increase in data availability and the improvement of computing power, many deep learning-based models have emerged in recent years, and the diversity of time series problems across domains has given rise to many different model designs. Time series trend forecasting has always been an important topic, and its results can provide a basis for applications in various fields, such as the control and optimization of production planning.
The Transformer is a neural network architecture that was originally proposed for natural language processing. It uses a mechanism called attention, or self-attention, to detect how elements of a sequence influence and depend on one another. In this study, we apply the Transformer to time series forecasting and examine whether its parallel computation can overcome the sequence-length limitations of the long short-term memory (LSTM) model. In addition, we use different positional encoding mechanisms to give time series data positional information within the sequence, and we discuss how different ways of encoding the positions of time points affect the Transformer's time series forecasts. In Chapter 4, we use five real-world time series datasets to examine how well each model predicts different time trends.
Keywords ★ Data mining
★ Deep learning
★ Time series
★ Transformer model
Table of Contents
Chinese Abstract
Abstract
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Background
1.2 Motivation
1.3 Research Objectives
1.4 Research Framework
Chapter 2 Literature Review
2.1 Introduction to Time Series Data Forecasting
2.1.1 Time Series Forecasting Problems
2.1.2 Formulating the Problem of Temporal Sequence Forecasting
2.2 Time Series Forecasting Techniques
2.2.1 Traditional Forecasting Techniques
2.2.2 Deep Learning Techniques
2.3 Introduction to the Transformer Model
2.4 Positional Encoding
Chapter 3 Methodology
3.1 Problem Description
3.2 Transformer Model Architecture
3.3 Self-Attention in the Transformer Model
3.3.1 Multi-Head Self-Attention
3.4 Positional Encoding
3.4.1 Learnable Positional Encoding
3.4.2 Sinusoidal Positional Encoding
3.5 The Transformer Architecture We Use
3.6 Evaluation Metrics
Chapter 4 Experiment
4.1 Experimental Procedure
4.1.1 Data Preprocessing
4.1.2 Cross-Validation for Time Series Data
4.2 Introduction to the Datasets
4.3 Experimental Results
Chapter 5 Conclusion and Summary
Appendix
References
Advisor: Fu-Shiang Tseng (曾富祥)   Date of Approval: 2023-07-10