This paper presents a system for speech-controlled optical wearables that dynamically adjust lens focus through motorized control. The system integrates the AI Thinker VCO2 module for offline voice recognition with an ESP32 microcontroller for motor control, giving users hands-free focus adjustment through natural-language commands such as "Zoom In" or "Zoom Out". We analyze the architecture, process flow, and mathematical models that govern motor control, and evaluate performance through real-world testing. Experimental results demonstrate the system's feasibility in various environmental conditions.
Introduction
Overview:
Traditional optical wearables such as glasses have static lenses that do not adapt to different focal needs. With advances in AI and rising user expectations, there is growing demand for dynamic focus adjustment, especially for people with vision impairments. This study introduces a speech-controlled smart-glasses system that adjusts lens focus in real time via voice commands, enhancing accessibility, usability, and independence.
Key Components & Technologies:
Speech Recognition Module:
AI Thinker VCO2 – enables offline speech recognition using a fixed set of commands such as "Zoom In" or "Zoom Out." It uses MFCCs (Mel-frequency cepstral coefficients) for feature extraction and DTW (dynamic time warping) for template matching.
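As a sketch of this matching step, the following Python snippet illustrates DTW template matching. It is a host-side illustration only, not the VCO2 firmware: MFCC extraction is abstracted away, and utterances and command templates are represented as pre-computed sequences of feature vectors.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of
    equal-dimension feature vectors (e.g. lists of tuples)."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = minimum cumulative cost aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

def recognize(utterance, templates):
    """Return the name of the command template with the smallest
    DTW distance to the utterance's feature frames."""
    return min(templates,
               key=lambda name: dtw_distance(utterance, templates[name]))
```

Because DTW warps the time axis, a command spoken slightly faster or slower than the stored template still aligns frame-by-frame, which is why it suits small fixed-vocabulary recognizers like this one.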
Microcontroller: ESP32 – processes recognized commands and controls the motorized lens adjustment.
Optical System:
A multi-lens system with motorized adjustment alters the focal length using lens curvature and distance. Real-time control allows users to shift between near and far vision hands-free.
System Design:
Speech input is captured via microphone.
Voice commands are processed offline by AI Thinker VCO2.
ESP32 microcontroller interprets commands and generates motor signals.
Lens focus is adjusted using a motorized multi-lens system.
Feedback is given via LEDs or beeps to confirm action.
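The five steps above reduce to a recognize–dispatch–actuate–confirm loop. The sketch below simulates that loop on a host machine; the step size and feedback strings are hypothetical placeholders, and on the actual ESP32 the motor and feedback calls would be GPIO/PWM writes rather than return values.

```python
# Host-side simulation of the command path:
# recognized text -> motor action -> user feedback.
STEP_SIZE = 10  # motor steps per zoom command (assumed value)

def handle_command(command, lens_position):
    """Map a recognized command to a new lens position and a feedback cue."""
    if command == "zoom in":
        return lens_position + STEP_SIZE, "beep: focus nearer"
    if command == "zoom out":
        return lens_position - STEP_SIZE, "beep: focus farther"
    return lens_position, "error tone: unknown command"

pos = 0
for cmd in ["zoom in", "zoom in", "zoom out"]:
    pos, feedback = handle_command(cmd, pos)
print(pos)  # net lens displacement after the three commands
```

Keeping the dispatcher a pure function of (command, state) makes the firmware easy to extend: adding a new voice command is one new branch plus one new template in the recognizer.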
Mathematical Modeling:
Motor control is modeled by a first-order differential equation, J·dω/dt + B·ω = K·i, where K is the torque constant, J the rotor inertia, and B the damping coefficient.
PWM (Pulse Width Modulation) controls motor position.
Lens position and surface curvatures (R1, R2) determine the total focal length, modeled with the lensmaker's equation and the two-lens combination formula 1/f = 1/f1 + 1/f2 - d/(f1·f2), where d is the inter-lens separation.
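A minimal numerical sketch of both models follows. The parameter values (K, J, B, lens focal lengths) are placeholders for illustration, not values measured from the prototype, and the PWM duty cycle is treated as a normalized drive input.

```python
def motor_speed_step(omega, duty, dt, K=0.05, J=1e-4, B=2e-3):
    """One forward-Euler step of J*domega/dt + B*omega = K*u,
    where the PWM duty cycle (0..1) scales the effective input u."""
    u = duty
    return omega + dt * (K * u - B * omega) / J

def combined_focal_length(f1, f2, d):
    """Two-lens combination: 1/f = 1/f1 + 1/f2 - d/(f1*f2).
    Moving one lens (changing d) shifts the system focal length."""
    return 1.0 / (1.0 / f1 + 1.0 / f2 - d / (f1 * f2))

# The motor model settles to the steady state omega = K*u / B:
omega = 0.0
for _ in range(20000):          # 2 s of simulated time at dt = 0.1 ms
    omega = motor_speed_step(omega, duty=0.5, dt=1e-4)
```

The first-order model gives an exponential approach to steady-state speed with time constant J/B, which is what makes the focus adjustment feel smooth rather than abrupt; the combination formula shows why a small motorized change in d produces a controllable shift in f.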
Experimental Results:
Knob-controlled mechanism tested: rotating a movable lens through an angular displacement of 0° to 360° adjusted the focal length from 66.67 mm to 67.17 mm.
The system shows smooth, precise adjustments of focus.
Voice command response time: ~2 seconds.
Recognition accuracy:
95% in controlled settings
85% in noisy environments
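For illustration, assuming a simple linear relation between knob angle and the two reported focal-length endpoints (the true angle-to-focus curve of the mechanism may well be nonlinear), the mapping can be sketched as:

```python
def focal_length_mm(angle_deg):
    """Interpolate focal length between the reported endpoints:
    0 deg -> 66.67 mm, 360 deg -> 67.17 mm (linearity assumed)."""
    f_min, f_max = 66.67, 67.17
    angle = max(0.0, min(360.0, angle_deg))  # clamp to the tested range
    return f_min + (f_max - f_min) * angle / 360.0
```

Under this assumption the 0.5 mm total travel corresponds to roughly 1.4 µm of focal-length change per degree of rotation, consistent with the smooth, fine-grained adjustment reported above.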
Comparative Analysis of Speech Modules:

Module                 Cost   Offline Support   Accuracy   Latency   Power    Customizability
AI Thinker VCO2        $5     Yes               95%        100 ms    50 mW    High
Google Assistant       $150   No                98%        120 ms    300 mW   High
Azure Speech Service   $100   No                99%        500 ms    400 mW   Moderate
ReSpeaker              $20    Yes               94%        150 ms    120 mW   Medium
Conclusion
This paper presents a speech-controlled optical wearable system capable of dynamically adjusting lens focus based on spoken commands. By integrating the AI Thinker VCO2 module for offline voice recognition and the ESP32 microcontroller for precise motor control, we have successfully developed a hands-free, user-friendly device designed to enhance the quality of life for users with vision impairments. The seamless interaction between voice commands and the system's mechanical components exemplifies how modern technology can bridge accessibility gaps and deliver tailored solutions for specific needs.
The system demonstrates excellent performance in controlled environments, where voice commands such as "Zoom In" and "Zoom Out" are recognized and executed with high accuracy. The ability to operate without internet connectivity ensures user privacy and reliability, while the compact design of the glasses makes them practical and comfortable for everyday use. Furthermore, the modular architecture of the system allows for potential upgrades, such as adding additional voice commands or integrating with other assistive technologies.

However, certain challenges must be addressed before this system can achieve widespread adoption. Noise interference in real-world environments remains a critical issue, potentially affecting the accuracy of voice recognition. Similarly, optimizing the motor's response time and minimizing mechanical lag are essential to ensure a smooth and intuitive user experience. Addressing these challenges will require advanced signal-processing techniques, enhanced noise-cancellation algorithms, and improvements in hardware design.

Future work will focus on overcoming these limitations by enhancing the system's robustness in noisy settings, exploring the integration of adaptive algorithms for improved speech recognition, and optimizing motor control for quicker and more accurate adjustments. Additionally, extending the functionality of the device with features such as gesture recognition or smartphone-app integration could further enhance its versatility and usability.

This project demonstrates the potential of combining speech recognition, motor control, and wearable technology to create an innovative and accessible solution for vision-impaired individuals. By addressing the identified challenges and building upon the current design, this system could become a valuable tool in assistive technology, empowering users to navigate their world with greater independence and ease.