Quantile Regression Deep Q-Networks for Multi-Agent System Control

Access: Use of this item is restricted to the UNT Community
Training autonomous agents that can perform their assigned jobs without fail is the ultimate goal of deep reinforcement learning. This thesis introduces a dueling Quantile Regression Deep Q-network, in which the network learns the state value quantile function and the advantage quantile function separately. With this network architecture, the agent is able to learn to control simulated robots in the Gazebo simulator. Carefully crafted reward functions and state spaces must be designed for the agent to learn in complex non-stationary environments. When trained for only 100,000 timesteps, the agent is able to reach asymptotic performance in environments with moving and stationary obstacles using only data from the inertial measurement unit, LIDAR, and positional information. Through the use of transfer learning, the agents are also capable of formation control and flocking patterns. The performance of agents with frozen networks is improved through advice giving in Deep Q-networks by use of normalized Q-values and majority voting.
Date: May 2019
Creator: Howe, Dustin
System: The UNT Digital Library
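
The abstract above says the value and advantage quantile functions are learned separately but does not say how they are recombined. As a minimal sketch, assuming the mean-subtracted aggregation common to dueling networks is applied per quantile, Z_i(s, a) = V_i(s) + A_i(s, a) - mean over a' of A_i(s, a'), the combination might look like the following. The function names and input layout are hypothetical and are not taken from the thesis.

```python
from statistics import fmean


def dueling_quantiles(value_q, adv_q):
    """Combine value and advantage quantiles (assumed aggregation rule).

    value_q: list of N quantile estimates of the state value.
    adv_q:   one list of N advantage quantiles per action.
    Returns one list of N return quantiles per action.
    """
    n_actions = len(adv_q)
    n_quantiles = len(value_q)
    # Mean advantage over actions, computed separately for each quantile.
    mean_adv = [
        fmean([adv_q[a][i] for a in range(n_actions)])
        for i in range(n_quantiles)
    ]
    return [
        [value_q[i] + adv_q[a][i] - mean_adv[i] for i in range(n_quantiles)]
        for a in range(n_actions)
    ]


def greedy_action(z):
    """Pick the action whose quantiles have the highest mean (its Q-value)."""
    q_values = [fmean(row) for row in z]
    best = max(range(len(q_values)), key=q_values.__getitem__)
    return best, q_values
```

Averaging the quantiles of each action recovers a scalar Q-value, so action selection stays the same as in an ordinary Deep Q-network.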
Mesh Networking for Inter-UAV Communications (open access)

Unmanned aerial systems (UASs) have great potential to enhance situational awareness in public safety operations. Many UASs operating in the same airspace can cause mid-air collisions. NASA and the FAA are developing a UAS traffic management (UTM) system, which could be used in public safety operations to manage the UAS airspace. UTM relies on an existing communication backhaul; however, natural disasters may disrupt existing communications infrastructure or occur in areas where no backhaul exists. This thesis outlines a robust communications alternative that interfaces a fleet of UASs with a UTM service supplier (USS) over a mesh network. Additionally, this thesis outlines an algorithm for vehicle-to-vehicle discovery and communication over the mesh network.
Date: May 2019
Creator: Walton, Michael Tanner
System: The UNT Digital Library

Proximal Policy Optimization in StarCraft

Access: Use of this item is restricted to the UNT Community
Deep reinforcement learning is an area of research that has blossomed tremendously in recent years and has shown remarkable potential in computer games. Real-time strategy games have been an important field of game artificial intelligence for several years. This paper introduces an algorithm used to train agents to fight against computer bots. Games are excellent tools for testing deep reinforcement learning algorithms because they offer valuable insight into how well an algorithm can perform in isolated environments without real-life consequences, and real-time strategy games are a very complex genre that challenges artificial intelligence agents in both short-term and long-term planning. In this paper, we introduce some history of deep learning and reinforcement learning and then combine them with StarCraft. Proximal policy optimization (PPO) has some of the benefits of trust region policy optimization (TRPO), but it is much simpler to implement, more general across environments, and has better sample complexity. The StarCraft environment, the Brood War Application Programming Interface (BWAPI), is open source and available for testing. The results show that PPO can work well in BWAPI and train units to defeat the opponents. The algorithm presented in the thesis is …
Date: May 2019
Creator: Liu, Yuefan
System: The UNT Digital Library
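
The simplicity that the abstract above attributes to PPO relative to TRPO comes largely from its clipped surrogate objective, which replaces an explicit trust-region constraint with a clip on the probability ratio. The sketch below shows the standard clipped objective from the PPO literature, averaged over a batch. It is illustrative only and is not the thesis implementation; the function name and the default `eps=0.2` are assumptions.

```python
def clipped_surrogate(ratios, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective over a batch.

    ratios:     new-policy / old-policy probability ratios, one per sample.
    advantages: advantage estimates, one per sample.
    eps:        clip range (assumed default; tuned per task in practice).
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Keep the ratio within [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(r, 1.0 + eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(r * a, clipped * a)
    return total / len(ratios)
```

The objective is maximized during training. Taking the minimum removes any incentive to move the ratio outside the clip range in the direction that would increase the objective.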
Design of Voltage Boosting Rectifiers for Wireless Power Transfer Systems (open access)

This thesis presents a multi-stage rectifier for wireless power transfer in biomedical implant systems. The rectifier is built using Schottky diodes. The design has been simulated in 0.5 µm and 130 nm CMOS processes. The challenges for a rectifier in a wireless power transfer system are observed to be efficiency, output voltage yield, operating frequency range, and the minimum input voltage the rectifier can convert. The rectifier outperformed contemporary works on these criteria.
Date: May 2019
Creator: Suri, Ramaa Saket
System: The UNT Digital Library
Implementations of Fuzzy Adaptive Dynamic Programming Controls on DC to DC Converters (open access)

DC to DC converters stabilize the voltage obtained from voltage sources such as solar power systems, wind energy sources, wave energy sources, rectified voltage from alternators, and so forth. Hence, the need to improve their control algorithms is inevitable. Many algorithms have been applied to DC to DC converters. This thesis designs a fuzzy adaptive dynamic programming (Fuzzy ADP) algorithm. It also implements both adaptive dynamic programming (ADP) and Fuzzy ADP on DC to DC converters to observe the performance of the output voltage trajectories.
Date: May 2019
Creator: Chotikorn, Nattapong
System: The UNT Digital Library