Industrial control systems (ICSs) combine information technology (IT) and operational technology (OT) to monitor, control, and manage large-scale production systems or critical infrastructure over a network. When an ICS suffers an information security attack, the consequences range from degraded performance and loss of function in mild cases to environmental pollution, economic losses, casualties, and even threats to national security in severe ones. It is therefore important to develop intrusion detection systems and intrusion classification systems that detect and classify the anomalies caused by such attacks. This thesis proposes a flow-based anomaly classification method that combines multi-attention blocks and residual blocks in a deep neural network (DNN) to build an intrusion classification system for ICSs. The proposed method first aggregates records belonging to the same data flow to obtain more features. It then uses multi-attention blocks to extract features in different dimensions, and employs residual blocks to derive the residual between input and output, which removes the portions common to both and highlights small changes.
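The flow aggregation step can be pictured with a minimal sketch. The abstract does not say which fields define a flow or which features are derived. The record keys (`src`, `dst`, `time`, `fc`) and the three statistics below are therefore hypothetical choices made only for illustration, not the thesis's actual feature set.

```python
from collections import defaultdict


def aggregate_flows(packets):
    """Group packet records by (src, dst) and derive simple per-flow features.

    The field names and the chosen statistics are illustrative assumptions.
    """
    groups = defaultdict(list)
    for pkt in packets:
        groups[(pkt["src"], pkt["dst"])].append(pkt)

    flows = {}
    for key, pkts in groups.items():
        times = [p["time"] for p in pkts]
        flows[key] = {
            # Number of records observed in this flow
            "packet_count": len(pkts),
            # Time span between the first and last record of the flow
            "duration": max(times) - min(times),
            # How many different function codes appear in the flow
            "distinct_function_codes": len({p["fc"] for p in pkts}),
        }
    return flows


# Hypothetical sample records: two flows between a master and two slaves.
sample_packets = [
    {"src": "master", "dst": "slave1", "time": 0.0, "fc": 3},
    {"src": "master", "dst": "slave1", "time": 0.5, "fc": 3},
    {"src": "master", "dst": "slave1", "time": 1.5, "fc": 6},
    {"src": "master", "dst": "slave2", "time": 2.0, "fc": 3},
]
sample_flows = aggregate_flows(sample_packets)
```

Each flow then carries features that no single record has on its own, which is the sense in which aggregation yields "more features".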
To make training more robust, we choose Ranger (RAdam + LookAhead) as the optimizer to reduce the variance of the gradient, and Focal Loss as the loss function, which assigns each sample a corresponding loss weight and strengthens the network's ability to handle imbalanced data. The Electra Modbus dataset is used to evaluate the proposed method. We compare different combinations of the method's mechanisms against one another and also compare the method with other related methods. The results show that the proposed method achieves the best precision, recall, and F1 score for intrusion classification.
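The per-sample weighting of Focal Loss can be illustrated with a short sketch. It uses the commonly cited form FL(p_t) = -α (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The values γ = 2 and α = 1 and the example logits are assumptions chosen for illustration; the abstract does not state the settings used in the thesis.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for a single sample."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for a single sample; gamma and alpha are assumed defaults."""
    p_t = softmax(logits)[target]
    # The factor (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical logits: one confidently correct sample, one badly misclassified.
easy_logits, hard_logits, true_class = [4.0, 0.0, 0.0], [0.0, 4.0, 0.0], 0

# Fraction of the cross-entropy loss that survives the focal weighting
easy_ratio = focal_loss(easy_logits, true_class) / cross_entropy(easy_logits, true_class)
hard_ratio = focal_loss(hard_logits, true_class) / cross_entropy(hard_logits, true_class)
```

With γ = 0 the expression reduces to ordinary cross-entropy. For γ > 0 the easy sample keeps a much smaller share of its loss than the hard one, so rare, hard-to-classify attack classes contribute relatively more to the gradient.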