UAV spatiotemporal crowdsourcing resource allocation based on deep reinforcement learning
Abstract
Spatiotemporal crowdsourcing uses Internet of Things (IoT) devices distributed across industrial environments to collect and transmit spatiotemporal data about industrial operations, and unmanned aerial vehicles (UAVs) play a crucial role in gathering this data from the IoT devices. In industrial IoT energy management, allocating spatiotemporal crowdsourcing resources to UAVs poses substantial challenges. Traditional approaches optimize the Age of Information (AoI) to ensure timely and equitable data updates, but they often overlook critical operational constraints such as UAV no-fly zones and the risk of data interception by eavesdroppers, both of which can degrade the freshness and integrity of the gathered data. To address these shortcomings, this paper presents a deep reinforcement learning-based framework for UAV spatiotemporal crowdsourcing resource allocation that minimizes the average AoI across the network while reducing the energy consumption of the IoT devices. The framework respects the spatial constraints imposed by UAV no-fly zones and actively transmits jamming signals to mitigate the threat posed by eavesdroppers, thereby ensuring data security. The allocation problem is notably complex: the number of decision variables grows linearly with the service duration, the relationship between performance metrics and decision variables is intricate, and quality-of-service requirements must be satisfied. We therefore formalize the problem as a Markov decision process (MDP), which provides a structured model of the sequential decisions UAVs face in a dynamic environment. To solve the MDP, we employ the soft actor-critic (SAC) algorithm, an advanced deep reinforcement learning method known for its sample efficiency and stability. Because SAC handles the continuous action spaces typical of UAV flight-path and power-control problems, it is particularly well suited to our application. We evaluate the proposed methods in scenarios with multiple UAVs and show that SAC converges faster and finds better solutions than state-of-the-art baselines such as the twin delayed deep deterministic policy gradient (TD3) and deep deterministic policy gradient (DDPG) algorithms. We further examine, analytically and empirically, how the UAV fleet size affects system performance, providing guidance on selecting the number of UAVs that best balances coverage, energy consumption, and operational efficiency. In conclusion, our research introduces a robust and intelligent framework for UAV resource allocation, and the demonstrated efficacy of SAC in this context paves the way for its application in other domains where secure, efficient, and intelligent resource management is paramount.
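To make the AoI objective concrete, the following is a common discrete-time AoI model consistent with the abstract's description; it is an illustrative sketch, not the paper's exact formulation. The AoI of device k resets when a UAV collects its update and otherwise grows by one per slot:

\[
A_k(t+1) =
\begin{cases}
1, & \text{if a UAV collects device } k\text{'s update in slot } t,\\
A_k(t) + 1, & \text{otherwise,}
\end{cases}
\]

and a policy \(\pi\) is sought that minimizes the time-averaged AoI across all K devices together with device energy consumption,

\[
\min_{\pi} \ \frac{1}{TK} \sum_{t=1}^{T} \sum_{k=1}^{K} A_k(t) \;+\; \frac{\lambda}{T} \sum_{t=1}^{T} E(t),
\]

where \(E(t)\) denotes the total IoT-device transmission energy in slot t and \(\lambda\) is a weighting factor; both symbols are assumptions introduced here for illustration.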
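The sketch below illustrates the SAC machinery the abstract invokes: a squashed-Gaussian actor for bounded continuous actions and an entropy-regularized actor loss over the minimum of two critics. All dimensions, network sizes, and the action semantics (e.g., heading, speed, jamming power) are hypothetical placeholders, not the paper's implementation.

```python
# Minimal SAC actor-update sketch for continuous UAV control (illustrative only).
import torch
import torch.nn as nn
from torch.distributions import Normal

STATE_DIM = 8    # e.g., UAV position plus device AoI values (assumed)
ACTION_DIM = 3   # e.g., heading, speed, jamming power (assumed)

class GaussianPolicy(nn.Module):
    """Squashed-Gaussian actor: outputs a tanh-bounded continuous action."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mu = nn.Linear(hidden, action_dim)
        self.log_std = nn.Linear(hidden, action_dim)

    def forward(self, state):
        h = self.net(state)
        mu, log_std = self.mu(h), self.log_std(h).clamp(-20, 2)
        dist = Normal(mu, log_std.exp())
        u = dist.rsample()                 # reparameterized sample
        a = torch.tanh(u)                  # bound action to [-1, 1]
        # log-probability with tanh change-of-variables correction
        log_prob = dist.log_prob(u) - torch.log(1 - a.pow(2) + 1e-6)
        return a, log_prob.sum(-1, keepdim=True)

class QNetwork(nn.Module):
    """Critic Q(s, a) for a state-action pair."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def sac_actor_loss(policy, q1, q2, states, alpha=0.2):
    """Entropy-regularized actor loss: minimize alpha*log_pi minus min-Q."""
    actions, log_prob = policy(states)
    q = torch.min(q1(states, actions), q2(states, actions))
    return (alpha * log_prob - q).mean()

# Example: one actor update step on a dummy batch.
policy = GaussianPolicy(STATE_DIM, ACTION_DIM)
q1, q2 = QNetwork(STATE_DIM, ACTION_DIM), QNetwork(STATE_DIM, ACTION_DIM)
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
batch = torch.randn(64, STATE_DIM)         # placeholder replay-buffer batch
loss = sac_actor_loss(policy, q1, q2, batch)
opt.zero_grad(); loss.backward(); opt.step()
```

The clipped double-Q minimum and the entropy bonus (the alpha * log_prob term) are the ingredients generally credited for SAC's stability and exploration advantages over TD3 and DDPG, the baselines cited in the abstract.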