Robust Near-Optimal PD-like Control Strategy for Robot Manipulators via Reinforcement Learning and Integral Sliding-Mode Momentum Observer

Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/95690

Title:	通過強化學習與積分滑模動量觀測器實現機器手臂的強健近佳PD控制策略;Robust Near-Optimal PD-like Control Strategy for Robot Manipulators via Reinforcement Learning and Integral Sliding-Mode Momentum Observer
Author:	顏佑丞;Yan, You-Cheng
Contributor:	Department of Electrical Engineering
Keywords:	Robot manipulator;Reinforcement learning;Momentum observer;Trajectory tracking control;Optimal control
Date:	2024-07-23
Upload time:	2024-10-09 17:09:34 (UTC+8)
Publisher:	National Central University
Abstract:	Robot manipulators are widely used in today's factory automation production lines because of their high precision and consistency, which in turn improves productivity and quality. Their tasks often require the end-effector mounted on the arm to move along a predefined position trajectory. However, uncertainties encountered during the motion inevitably degrade tracking accuracy. This thesis proposes a control strategy for trajectory tracking control of robot manipulators that consists of an uncertainty estimator and a reinforcement learning-based actor-critic optimal tracking controller. First, building on the momentum observer, which is already used in commercial applications, we design a momentum observer combined with integral sliding-mode control. This observer inherits the advantages of the traditional momentum observer and gains the robustness of sliding-mode control, which improves uncertainty estimation; the estimate is then used for compensation. Second, within existing reinforcement learning tracking control theory, we integrate a traditional PD-plus-feedforward controller and design a neural network parameter selection procedure. This procedure avoids time-consuming tuning of the neural network activation functions and initial weights and ensures the admissibility of the initial control policy. During control, the actor-critic architecture of reinforcement learning adaptively adjusts the control output. For the closed-loop system, the Lyapunov method is used to prove that all error signals are bounded. To verify the effectiveness and superiority of the proposed control strategy, it is compared with the traditional PD-plus-feedforward controller and an adaptive RBF neural network controller in a numerical simulation of a two-link robot manipulator. The results show that the proposed strategy converges faster and has a smaller steady-state error than the other two controllers. Experiments on a real two-link robot manipulator also confirm its practical feasibility.
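The idea of estimating a lumped uncertainty from the generalized momentum and feeding the estimate back as compensation can be illustrated with a small simulation. The sketch below is not the thesis's method: it uses the classic first-order momentum observer rather than the integral sliding-mode variant, it contains no actor-critic learning, and it replaces the two-link arm with a one-link pendulum. All names, gains, plant parameters, and the constant disturbance `tau_d` are assumptions chosen for illustration.

```python
import math


def simulate(compensate, t_end=5.0, dt=1e-3):
    """Track q_d(t) = 0.5*sin(t) with a one-link arm under a constant disturbance.

    Returns (largest |tracking error| over the final second, final residual r).
    """
    # Hypothetical plant, gains, and disturbance (none taken from the thesis).
    m, l, grav = 1.0, 0.5, 9.81
    M = m * l * l            # constant inertia of a point-mass link
    kp, kd = 100.0, 20.0     # PD gains
    k_obs = 50.0             # observer gain
    tau_d = 2.0              # unknown constant disturbance torque

    q, dq = 0.0, 0.0
    p0 = M * dq              # initial generalized momentum
    integral = 0.0           # running integral of (tau - g(q) + r)
    r = 0.0                  # residual: estimate of tau_d
    max_err = 0.0

    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        q_d = 0.5 * math.sin(t)
        dq_d = 0.5 * math.cos(t)
        ddq_d = -0.5 * math.sin(t)
        e, de = q_d - q, dq_d - dq
        if t >= t_end - 1.0:
            max_err = max(max_err, abs(e))

        # Classic momentum observer: r = K * (p - p0 - integral).
        r = k_obs * (M * dq - p0 - integral)

        # PD plus feedforward, optionally minus the estimated disturbance.
        g_q = m * grav * l * math.sin(q)
        tau = M * ddq_d + m * grav * l * math.sin(q_d) + kp * e + kd * de
        if compensate:
            tau -= r

        integral += (tau - g_q + r) * dt

        # Plant: M*ddq + g(q) = tau + tau_d, integrated with Euler steps.
        ddq = (tau + tau_d - g_q) / M
        dq += ddq * dt
        q += dq * dt

    return max_err, r


if __name__ == "__main__":
    print("PD + feedforward only:      ", simulate(compensate=False))
    print("with observer compensation: ", simulate(compensate=True))
```

In this toy setting the residual behaves like a first-order low-pass filter of the disturbance, so it settles near `tau_d`, and subtracting it from the control torque removes most of the steady-state offset that the PD-plus-feedforward law alone leaves behind.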
Appears in collections:	[Graduate Institute of Electrical Engineering] Master's and Doctoral Theses

Files in this item:

File	Description	Size	Format	Views
index.html		0 KB	HTML	43

