

ZHAO H W, LYU S K, PANG Z X, et al. A green task offloading strategy for mobile edge computing based on MDP and Q-learning [J]. Journal of Henan Polytechnic University (Natural Science), 2025, 44(5): 9-16.


DOI: 10.16186/j.cnki.1673-9787.2024070047

Funding: National Natural Science Foundation of China (71672117); Regional Innovation Fund of the Northeast Geological Science and Technology Innovation Center (QCJJ2023-49)

Received: 2024/07/10

Revised: 2024/09/29

Published: 2025/07/23

A green task offloading strategy for mobile edge computing based on MDP and Q-learning

Zhao Hongwei1,2, Lyu Shengkai1, Pang Zhixi1, Ma Zihan1, Li Yu1

1. School of Information Engineering, Shenyang University, Shenyang 110044, Liaoning, China; 2. Institute of Carbon Neutral Technology and Policy, Shenyang University, Shenyang 110044, Liaoning, China

Abstract: Objectives To achieve carbon neutrality in manufacturing-oriented industrial Internet enterprises such as automobile and air-conditioner makers, edge computing task offloading technology was applied to the task offloading problem of production equipment, aiming to reduce the central server load as well as the energy consumption and carbon emissions of data centers.  Methods A green edge computing task offloading strategy based on the Markov decision process (MDP) and Q-learning was proposed. The strategy accounted for constraints including computing frequency, transmission power, and carbon emissions. Based on a cloud-edge-end collaborative computing model, the carbon emission optimization problem was formulated as a mixed-integer linear programming model, which was solved via MDP and Q-learning. The convergence performance, carbon emissions, and total latency of the proposed method were compared with the random allocation, Q-learning, and SARSA (state-action-reward-state-action) algorithms.  Results Compared with existing computation offloading strategies, the proposed task scheduling algorithm demonstrated superior convergence performance, improving by 5% and 2% over the SARSA and Q-learning algorithms, respectively. The system's carbon emission cost was reduced by 8% and 22% compared with the Q-learning and SARSA algorithms, respectively. As the number of terminals increased, the new strategy continued to outperform, achieving carbon emission reductions of 6% and 7% compared with the Q-learning and SARSA algorithms. In terms of total system computation latency, the proposed strategy significantly outperformed the other methods, with reductions of 27%, 14%, and 22% compared with the random allocation, Q-learning, and SARSA algorithms, respectively.  Conclusions The proposed task offloading strategy effectively optimized computation task distribution and resource allocation in mobile edge computing scenarios. It struck a balance between latency and energy consumption while significantly reducing system carbon emissions, making it a promising solution for green edge computing.
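The Q-learning solver described in the Methods can be illustrated with a minimal tabular sketch. Everything below is an assumption for illustration only: the abstract does not give the paper's state/action spaces, transition model, reward weights, or carbon-emission coefficients, so this toy uses a discretized load state, three offloading actions (local / edge / cloud), and a reward that negatively weights a hypothetical latency-plus-carbon cost.

```python
import random

N_STATES = 4          # hypothetical discretized system load levels
ACTIONS = [0, 1, 2]   # assumed action set: 0 = local, 1 = edge server, 2 = cloud
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate

def reward(state, action):
    # Hypothetical cost combining latency and carbon emission; lower cost,
    # higher reward. Coefficients are invented for this sketch.
    latency = [5.0, 2.0, 3.0][action] * (1 + 0.2 * state)
    carbon = [1.0, 0.5, 2.0][action]
    return -(0.5 * latency + 0.5 * carbon)

def step(state, action):
    # Toy MDP transition: local execution adds load, offloading relieves it.
    if action == 0:
        return min(state + 1, N_STATES - 1)
    return max(state - 1, 0)

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        s = rng.randrange(N_STATES)
        for _ in range(20):  # bounded episode length
            # epsilon-greedy action selection
            if rng.random() < EPSILON:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[s][x])
            s2 = step(s, a)
            # Q-learning update: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))
            Q[s][a] += ALPHA * (reward(s, a) + GAMMA * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = train()
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)]
print(policy)
```

In this toy instance the edge action dominates in both immediate cost and transitions, so the learned greedy policy offloads to the edge in every state; the paper's actual model, with its computing-frequency, transmission-power, and carbon constraints, would of course yield state-dependent decisions.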

Key words: carbon emission; edge computing; reinforcement learning; Markov decision process; task offloading
