Reinforcement learning for task offloading in next generation networks : algorithms and hardware acceleration

Bibliographic record details
Main author: Γερογιάννης, Γεράσιμος
Other authors: Gerogiannis, Gerasimos
Language: English
Published: 2021
Subjects:
Available online: http://hdl.handle.net/10889/14963
Description
Abstract: Mobile Edge Computing is going to evolve into a platform that is both AI-enabled and AI-enabling in the beyond-5G network era. Thus, the need to offload tasks to distributed dedicated edge hardware will only increase, and the orchestration algorithms governing the offloading operations will have to become even more sophisticated and operate with even smaller latencies, consistent with the data rates of future generations of wireless communications. In this thesis, Reinforcement Learning is adopted as a solution mechanism for Task/Computation Offloading orchestration problems in next generation networks and is investigated from two different perspectives. From the perspective of the employed algorithms, original work is presented on the extension and application of the Bandit Learning Discounted Upper Confidence Bound (D-UCB) algorithm to a swarm of users offloading their computations to a set of Edge Servers. The resulting algorithm, Certainty Aggregation Reward Decomposition Discounted Upper Confidence Bound (CARD-D-UCB), efficiently tackles the information asymmetry and uncertainty introduced when transitioning from the single-user scenario to the swarm-of-users scenario, and achieves performance comparable to the single-user case. From the perspective of hardware acceleration of Deep Reinforcement Learning based orchestration algorithms, the design and implementation of an FPGA-based accelerator for the real-time, ultra-low-latency solution of Mixed Integer Programming problems is presented. The design is implemented using High Level Synthesis. The accelerator's performance is evaluated in a Task Offloading and Resource Allocation scenario supported by 5G-and-beyond technologies. The implemented accelerator achieves near-optimal performance in the selected use case while reducing the training-inference execution latency to 4.3 μs per timestep, an order of magnitude lower than a high-end CPU-based implementation.
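For readers unfamiliar with the base algorithm that the thesis extends, the following is a minimal sketch of Discounted UCB (D-UCB) for nonstationary bandits: discounted empirical means plus an exploration bonus. The class name, the parameters `gamma`, `xi`, and `B`, and the edge-server framing in the comments follow common presentations of D-UCB and are illustrative assumptions, not the thesis's actual code or the CARD-D-UCB extension.

```python
import math
import random

class DiscountedUCB:
    """Sketch of the D-UCB bandit: rewards are averaged with an
    exponential discount so the policy tracks drifting arm means,
    e.g. edge servers whose load (and thus latency) changes over time."""

    def __init__(self, n_arms, gamma=0.99, xi=0.6, B=1.0):
        self.gamma = gamma            # discount factor for old observations
        self.xi = xi                  # exploration scaling constant
        self.B = B                    # assumed bound on rewards
        self.sums = [0.0] * n_arms    # discounted reward sums per arm
        self.counts = [0.0] * n_arms  # discounted pull counts per arm

    def select(self):
        # Pull each arm once before trusting the confidence bounds.
        for i, c in enumerate(self.counts):
            if c == 0.0:
                return i
        n = sum(self.counts)  # discounted total number of pulls

        def ucb(i):
            mean = self.sums[i] / self.counts[i]
            pad = 2 * self.B * math.sqrt(self.xi * math.log(n) / self.counts[i])
            return mean + pad

        return max(range(len(self.counts)), key=ucb)

    def update(self, arm, reward):
        # Discount every arm's statistics, then credit the pulled arm.
        self.sums = [self.gamma * s for s in self.sums]
        self.counts = [self.gamma * c for c in self.counts]
        self.sums[arm] += reward
        self.counts[arm] += 1.0
```

In an offloading setting, each arm would be a candidate edge server and the reward a function of observed task completion latency; the discount factor lets the user re-explore servers whose performance may have changed.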