Enhancing IoT Security using Feature Reduction Techniques

Authors: Dr. Pradeep Kumar, Prakhar Gupta, Harsh Som, Om Gupta

DOI Link: https://doi.org/10.22214/ijraset.2025.69739

Abstract

As Internet of Things (IoT) devices become more common, they are also becoming a bigger target for cyberattacks. These devices often have limited resources, making it tough to keep them secure. One promising solution is using machine learning in Network Intrusion Detection Systems (NIDS). However, for these systems to work efficiently, they need to reduce the number of features they analyze, which saves computational power without sacrificing accuracy. In this study, we take a close look at different ways to reduce features, comparing feature selection (FS) techniques, such as Pearson Correlation and Chi-square, with feature extraction (FE) methods like Principal Component Analysis (PCA) and Autoencoders (AE). We tested these approaches using two IoT-specific datasets, TON-IoT and BoT-IoT, and evaluated five machine learning models: Decision Tree, Random Forest, k-Nearest Neighbors, Naive Bayes, and Multi-Layer Perceptron. The results show that FE methods tend to perform better in terms of accuracy and robustness, especially with complex datasets, though they require more computational power. On the other hand, FS techniques like Chi-square offer a good balance between performance and efficiency. Among FE methods, PCA is faster than AE. Interestingly, FS works better with smaller datasets, while FE is more effective for handling a variety of attack types. This research provides practical advice on choosing the right feature reduction method for IoT environments, helping strike a balance between accuracy and computational efficiency. These insights are crucial for building scalable, real-time NIDS that can handle the unique challenges of IoT systems, benefiting both researchers and industry professionals working on IoT security.

Introduction

I. Introduction

The Internet of Things (IoT) is transforming daily life and industries, with over 30.9 billion IoT devices projected by 2025. However, these devices often have limited resources and weak security, making them prime targets for cyberattacks (e.g., DDoS, ransomware). To address this, machine learning-based Network Intrusion Detection Systems (NIDS) are being explored, enhanced by feature reduction techniques for greater efficiency and accuracy.

II. Goal of the Project

This study focuses on:

Improving IoT security through NIDS using feature selection (FS) and feature extraction (FE)
Comparing FS and FE techniques on TON-IoT and BoT-IoT datasets
Enhancing attack detection while conserving computational resources

III. Methodology

The approach is divided into three phases:

Data Preprocessing & Class Balancing:
- Cleaning: Handle missing values, remove duplicates, encode categorical features.
- Normalization: Apply min-max scaling.
- Balancing: Use SMOTE to handle class imbalance.
- Data Splitting: 80% training / 20% testing.
Dimensionality Reduction:
- Feature Selection (FS):
  - Pearson Correlation: Chooses features with high correlation to target.
  - Chi-Square Test: Selects statistically significant categorical features.
- Feature Extraction (FE):
  - Principal Component Analysis (PCA): Compresses data by projecting it onto the directions of greatest variance.
  - Autoencoders (AE): Neural networks that reduce and reconstruct data, good for non-linear relationships.
Classification:
- Train and test ML models using reduced data.
- Metrics used: Accuracy, precision, recall, F1-score, training/prediction time.
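
The preprocessing phase above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`min_max_scale`, `smote_like`, `split_80_20`) are hypothetical, the toy rows are made up, and `smote_like` only mimics the core idea of SMOTE (interpolating between a minority sample and one of its nearest minority neighbours) rather than reproducing a library implementation.

```python
import random


def min_max_scale(rows):
    """Rescale every column to [0, 1]; a constant column becomes all 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def smote_like(minority, n_new, k=2, seed=0):
    """Make n_new synthetic minority rows by interpolating between a random
    minority row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        others = [row for i, row in enumerate(minority) if i != base_idx]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def split_80_20(rows, labels, seed=0):
    """Shuffle indices and return (train_x, train_y, test_x, test_y)."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = len(order) - len(order) // 5  # last 20% (rounded down) is held out
    train, test = order[:cut], order[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


# Toy data (made up): two numeric features, label 1 = attack (minority class).
rows = [[10, 200], [12, 220], [11, 210], [50, 900], [52, 880],
        [9, 190], [13, 230], [10, 205], [12, 215], [11, 225]]
labels = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]

scaled = min_max_scale(rows)
attacks = [r for r, y in zip(scaled, labels) if y == 1]
extra = smote_like(attacks, n_new=6)  # 2 real + 6 synthetic = 8 attack rows
balanced_x = scaled + extra
balanced_y = labels + [1] * len(extra)
train_x, train_y, test_x, test_y = split_80_20(balanced_x, balanced_y)
print(len(train_x), len(test_x))
```

The sketch follows the order in which the steps are listed above (balance, then split). Oversampling only the training portion after splitting is a common alternative that keeps synthetic rows out of the test set; the source does not say which order was used.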

IV. Datasets Used

TON-IoT:
- Simulates realistic IoT settings (edge, fog, cloud layers).
- Includes diverse data: system logs, telemetry, and network traffic.
- Captures benign and attack behavior for smart city scenarios.
BoT-IoT:
- Simulates various IoT-based attacks (DDoS, keylogging, data theft).
- Strong labeling and documentation.
- Focuses on network forensics.

V. Feature Reduction

Why? IoT datasets are high-dimensional, and resource-constrained devices need lighter models.
Goal: Reduce features while retaining core information.

1. Feature Selection (FS):

Pearson Correlation: Selects non-redundant, relevant features.
Chi-Square Test: Ideal for classifying categorical attack data with limited computation.
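
To make the two selection criteria concrete, here is a small from-scratch sketch of how each one scores a feature against the class label. It is an illustration only: the function names, the `top_k` helper, and the feature names (`pkt_rate`, `jitter`, `proto`, `flag`) are hypothetical and the values are made up. It ranks features by relevance to the label; it does not attempt the redundancy check between features that correlation-based selection can also include.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def chi_square(values, labels):
    """Chi-square statistic of independence between a categorical feature
    and the class label (larger means stronger dependence)."""
    n = len(values)
    value_counts = Counter(values)
    label_counts = Counter(labels)
    observed = Counter(zip(values, labels))
    stat = 0.0
    for value, v_count in value_counts.items():
        for label, l_count in label_counts.items():
            expected = v_count * l_count / n
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat


def top_k(columns, labels, score, k):
    """Names of the k columns with the largest absolute score."""
    ranked = sorted(columns,
                    key=lambda name: abs(score(columns[name], labels)),
                    reverse=True)
    return ranked[:k]


labels = [0, 0, 1, 1]  # 0 = normal, 1 = attack (toy labels)
numeric = {"pkt_rate": [1, 2, 3, 4], "jitter": [1, -1, 1, -1]}
categorical = {"proto": ["tcp", "tcp", "udp", "udp"],
               "flag": ["a", "b", "a", "b"]}

print(top_k(numeric, labels, pearson, k=1))
print(top_k(categorical, labels, chi_square, k=1))
```

Both scores need only one pass of counting or summing per feature, which is consistent with the low computational cost attributed to FS in this study.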

2. Feature Extraction (FE):

PCA: Reduces dimensionality while preserving variance.
Autoencoders: Good for capturing non-linear patterns, but more computationally expensive.
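
As a rough illustration of what PCA does, the sketch below centres a tiny made-up dataset, builds its covariance matrix, finds the direction of greatest variance by power iteration, and projects each row onto it. The function names are hypothetical, and power iteration is used only to keep the example dependency-free; it is an assumption of this sketch, not a statement about how the study computed PCA. An Autoencoder is not sketched here because it would need a neural network and a training loop.

```python
import math


def center(rows):
    """Subtract each column's mean so the data is centred at the origin."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iters=200):
    """Dominant eigenvector of cov by power iteration (unit length)."""
    d = len(cov)
    # Slightly uneven start vector, so it is unlikely to be exactly
    # orthogonal to the dominant direction.
    v = [1.0 + 0.1 * i for i in range(d)]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero covariance: no direction to find
            break
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """One reduced feature per row: its coordinate along the component."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]


# Made-up rows whose two features are perfectly correlated.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(rows)
component = first_component(covariance(centered))
reduced = project(centered, component)  # 2 features -> 1 feature per row
print(component, reduced)
```

Because the two toy features move together, a single component keeps all of the variance, which is the situation in which PCA compresses data with no loss.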

VI. Attack Classification and ML Models

ML models are trained to distinguish:

Normal vs. malicious traffic
Specific attack types (e.g., DDoS, DoS, ransomware)
Rare attack patterns

Models Used:

Decision Tree (DT) – Simple, low-resource.
Random Forest (RF) – High accuracy, ensemble of DTs.
k-Nearest Neighbors (kNN) – Distance-based, slower on large data.
Naive Bayes (NB) – Probabilistic, fast and efficient.
Multi-Layer Perceptron (MLP) – Deep learning model, resource-intensive.
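
The following sketch shows one of the listed models, kNN, together with the evaluation metrics named in the methodology (accuracy, precision, recall, F1-score, prediction time). It is a minimal from-scratch illustration on made-up points; the function names and data are hypothetical and it does not reproduce the study's experimental setup.

```python
import time
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training rows closest to query."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 with `positive` as the attack class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Toy data (made up): label 0 = normal traffic, 1 = attack.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [[0.2, 0.1], [5.5, 5.5], [1, 1], [4, 5]]
test_y = [0, 1, 0, 1]

start = time.perf_counter()
predictions = [knn_predict(train_x, train_y, q) for q in test_x]
elapsed = time.perf_counter() - start

scores = binary_metrics(test_y, predictions)
print(scores, f"prediction time: {elapsed:.6f}s")
```

kNN has no training step, so every prediction measures the distance to all stored training rows. That per-query scan is why it slows down on large datasets, and why fewer features per row (after FS or FE) directly reduce its prediction time.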

VII. Key Takeaways

Feature reduction enhances NIDS efficiency in resource-constrained IoT environments.
PCA and Autoencoders improve detection but vary in computational demand.
SMOTE effectively addresses class imbalance, improving detection of rare attacks.
Combining strong datasets (TON-IoT & BoT-IoT) with optimized ML models leads to scalable, accurate intrusion detection for IoT networks.

Conclusion

This 'Improving IoT Security Using Feature Reduction Methods' project aims to mitigate increasing vulnerabilities in IoT networks. IoT networks are frequent targets of cyberattacks because of their restricted processing capabilities and lack of resources. The study compared feature selection (FS) techniques, such as Pearson Correlation and Chi-square, with feature extraction (FE) techniques, such as Principal Component Analysis (PCA) and Autoencoders (AE), on the IoT datasets TON-IoT and BoT-IoT. The results indicated that FS methods are computationally less intensive and adequate for real-time applications, while FE methods are more accurate and robust but come at the cost of increased computation. Five machine learning models (Decision Trees, Random Forests, k-Nearest Neighbors, Naive Bayes, and Multi-Layer Perceptrons) were compared to determine the most effective combinations of feature reduction algorithms and classifiers. The findings illustrated that Autoencoders combined with Random Forests delivered the greatest accuracy, and Chi-square in combination with PCA offered a good balance of computational efficiency and detection performance. By addressing problems like high-dimensional data, class imbalance, and resource limitation, this research helps in the creation of scalable, real-time security solutions specific to IoT networks. The findings of this research provide useful insights for both academic researchers and industry professionals looking to design more robust and adaptive security frameworks for IoT ecosystems. In conclusion, this project highlights the central role of feature reduction methods in future-proofing IoT security. It establishes the foundation for the next generation of innovations in intrusion detection systems that are not only fast and efficient but also specially crafted to suit the specific needs of IoT environments.


Copyright

Copyright © 2025 Dr. Pradeep Kumar, Prakhar Gupta, Harsh Som, Om Gupta. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET69739

Publish Date : 2025-04-25

ISSN : 2321-9653

Publisher Name : IJRASET
