This paper brings robot learning to life by showing how a humble TurtleBot3 can teach itself to navigate using an approach inspired by how humans learn through trial and error. We've created a custom training playground where the robot learns from its 360-degree laser "vision" (like constantly feeling its surroundings with outstretched arms) to move smoothly through spaces without collisions. By integrating the Robot Operating System (ROS 2) with an established AI training method (the PPO algorithm), we have linked virtual practice sessions to actual performance. After countless simulated trial runs, the robotic equivalent of a student taking practice exams, our learner achieved an 82% success rate in navigating unfamiliar spaces. The main finding is intriguing: with the right training framework, robots can acquire surprisingly human-like navigation skills, even though the robot performs somewhat better in simulation than in messy reality, where unexpected lighting and textures can confuse its sensors. This work stands out because we have kept things realistic, concentrating on solutions that can be deployed in homes or workplaces and using reasonably priced hardware. Although the system isn't flawless (it occasionally pauses in confined spaces like a cautious driver), it shows how artificial intelligence can enable machines to move more naturally.
Introduction
1. Overview of Path Planning in Autonomous Systems
Path planning involves determining a collision-free path from a start to a goal point in a given environment.
Path planning is vital for autonomous vehicles (AVs), underpinning safety, adaptability, and efficiency.
Robot Operating System (ROS) enables flexible robot development by providing communication protocols and shared functionality for various robot types.
2. Role of AI and Deep Reinforcement Learning (DRL)
AI techniques (e.g., DRL, GANs, Deep Learning) enhance robot autonomy in dynamic environments.
DRL is especially effective in real-time navigation using raw sensory data, excelling in partially observable and changing scenarios.
DRL methods outperform traditional rule-based systems by learning directly from experience rather than requiring handcrafted logic.
3. DRL Pros & Cons (Table I)
Strengths | Weaknesses
High adaptability to dynamic environments | Requires large amounts of data and long training times
Can learn complex policies | Sim-to-real transfer challenges
Less reliance on manual rules | Risk of unsafe actions during training
4. Training Architecture and Methodology
Robot used: TurtleBot3
Sensors: 360° LiDAR for environmental perception
DRL Algorithm: Proximal Policy Optimization (PPO), favored for stable learning via its clipped objective function (sketched after this list)
Training system: Built using ROS 2, allowing seamless communication and realistic simulations
Reward System: Encourages forward movement and penalizes collisions
Observation Space: 360 LiDAR readings (range 0.1–3.5 m)
Action Space: 2D (linear and angular velocities)
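For context, PPO's stability comes from its clipped surrogate objective (the standard formulation, not a detail specific to this paper's implementation):

L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}

Clipping the probability ratio r_t(\theta) to [1 - \epsilon, 1 + \epsilon] keeps any single policy update small, which is the stable-learning property noted above.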
Training Setup (a minimal code sketch follows this list):
Algorithm: PPO
Learning Rate: 3e-4
Hardware: NVIDIA RTX 3060, 32GB RAM
Training Time: ~4.2 hours (~100,000 steps)
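The paper does not publish its code, so the following is only a minimal sketch of the interfaces listed above, assuming a Gymnasium-style environment and the stable-baselines3 PPO implementation; the placeholder scan dynamics stand in for the ROS 2/Gazebo simulation.

# Minimal sketch (not the authors' code): a Gymnasium stand-in for the TurtleBot3
# environment plus a stable-baselines3 PPO run with the settings listed above.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class LidarNavEnv(gym.Env):
    def __init__(self):
        super().__init__()
        # Observation: 360 LiDAR range readings, 0.1-3.5 m
        self.observation_space = spaces.Box(low=0.1, high=3.5, shape=(360,), dtype=np.float32)
        # Action: linear velocity (m/s) and angular velocity (rad/s); limits are illustrative
        self.action_space = spaces.Box(low=np.array([0.0, -1.5], dtype=np.float32),
                                       high=np.array([0.22, 1.5], dtype=np.float32))

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        return self._scan(), {}

    def step(self, action):
        scan = self._scan()
        collided = bool(scan.min() < 0.15)  # treat anything closer than 15 cm as a crash
        # Reward shaping as described: encourage forward motion, penalize collisions
        reward = float(action[0]) - (10.0 if collided else 0.0)
        return scan, reward, collided, False, {}

    def _scan(self):
        # Placeholder: random ranges; the real environment reads the simulated /scan topic
        return self.np_random.uniform(0.1, 3.5, size=360).astype(np.float32)

model = PPO("MlpPolicy", LidarNavEnv(), learning_rate=3e-4, verbose=1)
model.learn(total_timesteps=100_000)  # roughly the paper's ~100,000-step budget

On the hardware listed above, this step budget corresponds to the reported ~4.2 hours of training; the sketch omits the ROS 2 bridging and sim-to-real details.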
5. Key Results
Success Rate: 82% in simulation
Average Speed: 0.15 m/s (safe and cautious)
Performance Trends:
Consistent learning with smoother, longer navigation episodes
Stable simulation speed (~9 fps) over training
Emergence of human-like behavior: slowing in narrow spaces, careful navigation
6. Comparative Analysis
Our system vs. Traditional methods:
Component | Our Approach | Traditional Methods
Perception | Raw LiDAR | Processed features
Control | Direct velocity | PID controllers
Safety | Reward shaping | Hard-coded constraints
(A minimal sketch of the direct-velocity control interface follows the table.)
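To make the "direct velocity" row concrete, here is a minimal rclpy sketch of how a trained policy's output can be published straight as velocity commands, with no PID layer in between; node, topic, and checkpoint names are illustrative assumptions, not taken from the paper.

# Sketch only: maps raw /scan readings through the trained policy to /cmd_vel.
import numpy as np
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import LaserScan
from geometry_msgs.msg import Twist
from stable_baselines3 import PPO

class PolicyDriver(Node):
    def __init__(self, policy):
        super().__init__('policy_driver')
        self.policy = policy
        self.cmd_pub = self.create_publisher(Twist, '/cmd_vel', 10)
        self.create_subscription(LaserScan, '/scan', self.on_scan, 10)

    def on_scan(self, msg):
        # Clip raw ranges to the sensor's usable window before feeding the policy
        obs = np.clip(np.asarray(msg.ranges, dtype=np.float32), 0.1, 3.5)
        action, _ = self.policy.predict(obs, deterministic=True)
        cmd = Twist()
        cmd.linear.x = float(action[0])   # linear velocity straight from the policy
        cmd.angular.z = float(action[1])  # angular velocity straight from the policy
        self.cmd_pub.publish(cmd)

def main():
    rclpy.init()
    policy = PPO.load("ppo_turtlebot3")  # hypothetical checkpoint name
    rclpy.spin(PolicyDriver(policy))
    rclpy.shutdown()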
Compared to MIT’s work (2023):
MIT: focused on precision in robotic arms
This study: focuses on flexibility and adaptability in mobile navigation under unpredictable conditions
7. Innovation and Contribution
Developed a ROS 2-compatible training playground for DRL.
Balanced realism and computational feasibility, avoiding the extremes of basic simulations and resource-heavy models.
Showcased how reward-based learning can replicate real-world navigation strategies without manually programmed rules.
Conclusion
Our experiment shows that robots can indeed learn navigation much like living creatures do, through exploration and feedback. While the simulation-to-reality gap persists (like how flight simulators can't capture all real flying conditions), the potential is undeniable. Future improvements might include giving the robot better "peripheral vision" with cameras, or programming instinctive emergency stops, essentially developing robotic reflexes.