Novel Reinforcement Learning-based Multiuser Uplink Power Control with Solar Energy Harvesting (改良式強化學習於太陽能獵取之多用戶上鏈功率控制研究)


Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/85044

Title:	Novel Reinforcement Learning-based Multiuser Uplink Power Control with Solar Energy Harvesting (改良式強化學習於太陽能獵取之多用戶上鏈功率控制研究)
Author:	KAO, YI-CHEN (高懿辰)
Contributor:	Department of Communication Engineering
Keywords:	Reinforcement Learning; Multilayer Perceptron; Deep Neural Networks; Energy Harvesting; Power Control
Date:	2021-01-06
Upload time:	2021-03-18 17:28:13 (UTC+8)
Publisher:	National Central University
Abstract:	Energy harvesting is regarded as an effective technology for extending the operating lifetime of wireless communication devices and allowing them to sustain themselves. With the rise of the Internet of Things in recent years, the number of small-scale wireless communication devices has grown rapidly. Because their batteries have limited capacity, these devices face power-consumption and operating-lifetime problems. Energy harvesting addresses these problems by collecting energy from the surrounding environment, and power control can then be optimized around the harvested energy. To maximize the overall transmission throughput of a wireless communication system, conventional optimization methods such as convex optimization and Markov decision processes can be applied to power control. However, these methods require knowledge of future channel gains, future energy harvesting conditions, and the state transition probabilities of the system, which makes them difficult to implement in real applications. Reinforcement learning, by contrast, explores through interaction with the environment and can approximate the optimal action-value function over many iterations. In this thesis, reinforcement learning is used to optimize the policy of the wireless communication system.
The state of the Markov decision process is defined by the harvested energy, the level of the limited-capacity battery, and the wireless channel gain, and the thesis investigates power control for energy-harvesting uplink wireless communication systems in a multiuser environment. Observation of the state information shows that the battery state component exhibits a piecewise linear correlation, that is, a linear relationship within small segments. Based on this observation, conventional reinforcement learning is improved by weighting its exploration and evaluation values. With only a small increase in complexity, the improved method significantly increases the transmission throughput of the energy-harvesting multiuser uplink system.
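The state definition in the abstract (harvested energy, battery level, channel gain) can be illustrated with a minimal tabular Q-learning sketch. This is not the thesis's method: it models a single user rather than the multiuser uplink, it uses plain Q-learning without the weighted exploration and evaluation described above, and every constant and function name (`BATTERY_MAX`, `HARVEST_LEVELS`, `CHANNEL_GAINS`, `step`, `train`, and so on) is a hypothetical choice made only for illustration. It also assumes that energy harvested in one slot becomes usable in the next slot and that harvest and channel levels are drawn independently and uniformly.

```python
import math
import random

# All constants below are hypothetical toy values, not taken from the thesis.
BATTERY_MAX = 4                  # battery capacity in energy units
HARVEST_LEVELS = [0, 1, 2]       # harvested energy per slot (energy units)
CHANNEL_GAINS = [0.2, 1.0, 3.0]  # quantized channel power gains
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration


def reward(action, gain_idx):
    """Throughput proxy log2(1 + gain * energy), assuming unit noise power."""
    return math.log2(1.0 + CHANNEL_GAINS[gain_idx] * action)


def step(state, action, rng):
    """Spend `action` units, store the harvested energy, draw the next slot."""
    harvest_idx, battery, gain_idx = state
    next_battery = min(BATTERY_MAX,
                       battery - action + HARVEST_LEVELS[harvest_idx])
    next_state = (rng.randrange(len(HARVEST_LEVELS)), next_battery,
                  rng.randrange(len(CHANNEL_GAINS)))
    return next_state, reward(action, gain_idx)


def feasible_actions(state):
    """The device can transmit with at most the energy currently stored."""
    return range(state[1] + 1)


def greedy_action(q, state):
    """Pick the feasible action with the highest estimated value."""
    return max(feasible_actions(state), key=lambda a: q.get((state, a), 0.0))


def train(num_steps=20000, seed=0):
    """Epsilon-greedy tabular Q-learning over (harvest, battery, gain) states."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated action value
    state = (0, BATTERY_MAX, 0)
    for _ in range(num_steps):
        if rng.random() < EPSILON:
            action = rng.choice(list(feasible_actions(state)))
        else:
            action = greedy_action(q, state)
        next_state, r = step(state, action, rng)
        best_next = max(q.get((next_state, a), 0.0)
                        for a in feasible_actions(next_state))
        old = q.get((state, action), 0.0)
        # Standard Q-learning update toward the bootstrapped target.
        q[(state, action)] = old + ALPHA * (r + GAMMA * best_next - old)
        state = next_state
    return q
```

Calling `q = train()` and then `greedy_action(q, (harvest_idx, battery, gain_idx))` returns the number of energy units the learned table would spend in that state. No future channel or harvest information and no transition probabilities are supplied to the learner, which is the property the abstract contrasts with convex optimization and model-based Markov decision process solutions.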
Appears in collections:	[Graduate Institute of Communication Engineering] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	131

All items in NCUIR are protected by the original copyright.
