This paper brings robot learning to life by showing how a humble TurtleBot3 can teach itself to navigate using an approach inspired by how humans learn through trial and error. We've created a custom training playground where the robot learns from its 360-degree laser "vision" (like constantly feeling its surroundings with outstretched arms) to move smoothly through spaces without collisions. By integrating the Robot Operating System (ROS 2) with an established AI training method (the PPO algorithm), we have linked virtual practice sessions to actual performance. After countless simulated trial runs, the robotic equivalent of a student taking practice exams, our learner achieved an 82% success rate in navigating unfamiliar spaces. The main finding is intriguing: with the right training framework, robots can acquire surprisingly human-like navigation skills, even though the robot performs somewhat better in simulation than in messy reality, where unexpected lighting and textures can confuse its sensors. This work stands out because we have kept things realistic, concentrating on solutions that can be deployed in homes or workplaces and using reasonably priced hardware. Although the system isn't flawless (it occasionally pauses in confined spaces like a cautious driver), it shows how artificial intelligence can enable machines to move more naturally.
Introduction
1. Overview of Path Planning in Autonomous Systems
Path planning involves determining a collision-free path from a start to a goal point in a given environment.
Path planning is vital for autonomous vehicles (AVs), underpinning safety, adaptability, and efficiency.
Robot Operating System (ROS) enables flexible robot development by providing communication protocols and shared functionality for various robot types.
2. Role of AI and Deep Reinforcement Learning (DRL)
AI techniques (e.g., DRL, GANs, Deep Learning) enhance robot autonomy in dynamic environments.
DRL is especially effective in real-time navigation using raw sensory data, excelling in partially observable and changing scenarios.
DRL methods outperform traditional rule-based systems by learning directly from experience rather than requiring handcrafted logic.
3. DRL Pros & Cons (Table I)
Strengths | Weaknesses
High adaptability to dynamic environments | Requires large amounts of data and long training times
Can learn complex policies | Sim-to-real transfer challenges
Less reliance on manual rules | Risk of unsafe actions during training
4. Training Architecture and Methodology
Robot used: TurtleBot3
Sensors: 360° LiDAR for environmental perception
DRL Algorithm: Proximal Policy Optimization (PPO), favored for stable learning via its clipped objective function (sketched after this list)
Training system: Built using ROS 2, allowing seamless communication and realistic simulations
Reward System: Encourages forward movement and penalizes collisions
Observation Space: 360 LiDAR readings (range 0.1–3.5 m)
Action Space: 2D (linear and angular velocities)
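For context, PPO's stability comes from its clipped surrogate objective (the standard formulation, not a detail specific to this paper's implementation):

L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}

Clipping the probability ratio r_t(\theta) to [1 - \epsilon, 1 + \epsilon] keeps any single policy update small, which is the stable-learning property noted above.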
Training Setup (a minimal code sketch follows this list):
Algorithm: PPO
Learning Rate: 3e-4
Hardware: NVIDIA RTX 3060, 32GB RAM
Training Time: ~4.2 hours (~100,000 steps)
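The paper does not publish its code, so the following is only a minimal sketch of the interfaces listed above, assuming a Gymnasium-style environment and the stable-baselines3 PPO implementation; the placeholder scan dynamics stand in for the ROS 2/Gazebo simulation.

# Minimal sketch (not the authors' code): a Gymnasium stand-in for the TurtleBot3
# environment plus a stable-baselines3 PPO run with the settings listed above.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class LidarNavEnv(gym.Env):
    def __init__(self):
        super().__init__()
        # Observation: 360 LiDAR range readings, 0.1-3.5 m
        self.observation_space = spaces.Box(low=0.1, high=3.5, shape=(360,), dtype=np.float32)
        # Action: linear velocity (m/s) and angular velocity (rad/s); limits are illustrative
        self.action_space = spaces.Box(low=np.array([0.0, -1.5], dtype=np.float32),
                                       high=np.array([0.22, 1.5], dtype=np.float32))

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        return self._scan(), {}

    def step(self, action):
        scan = self._scan()
        collided = bool(scan.min() < 0.15)  # treat anything closer than 15 cm as a crash
        # Reward shaping as described: encourage forward motion, penalize collisions
        reward = float(action[0]) - (10.0 if collided else 0.0)
        return scan, reward, collided, False, {}

    def _scan(self):
        # Placeholder: random ranges; the real environment reads the simulated /scan topic
        return self.np_random.uniform(0.1, 3.5, size=360).astype(np.float32)

model = PPO("MlpPolicy", LidarNavEnv(), learning_rate=3e-4, verbose=1)
model.learn(total_timesteps=100_000)  # roughly the paper's ~100,000-step budget

On the hardware listed above, this step budget corresponds to the reported ~4.2 hours of training; the sketch omits the ROS 2 bridging and sim-to-real details.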
5. Key Results
Success Rate: 82% in simulation
Average Speed: 0.15 m/s (safe and cautious)
Performance Trends:
Consistent learning with smoother, longer navigation episodes
Stable simulation speed (~9 fps) over training
Emergence of human-like behavior: slowing in narrow spaces, careful navigation
6. Comparative Analysis
Our system vs. Traditional methods:
Component | Our Approach | Traditional Methods
Perception | Raw LiDAR | Processed features
Control | Direct velocity | PID controllers
Safety | Reward shaping | Hard-coded constraints
(A minimal sketch of the direct-velocity control interface follows the table.)
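To make the "direct velocity" row concrete, here is a minimal rclpy sketch of how a trained policy's output can be published straight as velocity commands, with no PID layer in between; node, topic, and checkpoint names are illustrative assumptions, not taken from the paper.

# Sketch only: maps raw /scan readings through the trained policy to /cmd_vel.
import numpy as np
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import LaserScan
from geometry_msgs.msg import Twist
from stable_baselines3 import PPO

class PolicyDriver(Node):
    def __init__(self, policy):
        super().__init__('policy_driver')
        self.policy = policy
        self.cmd_pub = self.create_publisher(Twist, '/cmd_vel', 10)
        self.create_subscription(LaserScan, '/scan', self.on_scan, 10)

    def on_scan(self, msg):
        # Clip raw ranges to the sensor's usable window before feeding the policy
        obs = np.clip(np.asarray(msg.ranges, dtype=np.float32), 0.1, 3.5)
        action, _ = self.policy.predict(obs, deterministic=True)
        cmd = Twist()
        cmd.linear.x = float(action[0])   # linear velocity straight from the policy
        cmd.angular.z = float(action[1])  # angular velocity straight from the policy
        self.cmd_pub.publish(cmd)

def main():
    rclpy.init()
    policy = PPO.load("ppo_turtlebot3")  # hypothetical checkpoint name
    rclpy.spin(PolicyDriver(policy))
    rclpy.shutdown()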
Compared to MIT’s work (2023):
MIT: focused on precision in robotic arms
This study: focuses on flexibility and adaptability in mobile navigation under unpredictable conditions
7. Innovation and Contribution
Developed a ROS 2-compatible training playground for DRL.
Balanced realism and computational feasibility, avoiding the extremes of basic simulations and resource-heavy models.
Showcased how reward-based learning can replicate real-world navigation strategies without manually programmed rules.
Conclusion
Our experiment shows that robots can indeed learn navigation much like living creatures do, through exploration and feedback. While the simulation-to-reality gap persists (like how flight simulators can't capture all real flying conditions), the potential is undeniable. Future improvements might include giving the robot better "peripheral vision" with cameras, or programming instinctive emergency stops, essentially developing robotic reflexes.