2025 Volume 10 Issue 1 Supplementary
Creative Commons License

A Proposed Approach for Managing Noise and Uncertainty in Big Data


Abstract

Searching large datasets is both critical and time-consuming, and sorting and clustering the data can increase search speed. In recent years, extensive research has addressed outlier detection through classification with artificial intelligence and machine learning methods. A significant challenge in applying these methods is the presence of noise in the data. Because big data often exhibits imbalanced distributions across its classes, models can be trained on transformed versions of the data, and the classification can then be tailored to the specific problem. In this study, the data are first clustered using the K-means method and hierarchical analysis. After clustering, a hybrid approach combining neural networks, decision trees, and boosting is applied to manage noise in the dataset. The proposed approach is then compared with two other methods, a structural feature method and data mining techniques such as support vector machines, on accuracy, precision, and recall. The results show an improvement of 2 to 4 percent across all metrics for the proposed method.
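
The three comparison criteria named in the abstract can be made concrete with a short sketch. The Python example below assumes a binary labelling in which 1 marks a noisy record and 0 a clean one. The function name binary_metrics and the sample labels are hypothetical illustrations, not taken from the study, and the abstract does not state how the authors computed or averaged their metrics.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary labelling.

    Hypothetical helper for illustration only; the study's own
    evaluation code is not described in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # noisy record correctly flagged
        elif pred == positive:
            fp += 1  # clean record wrongly flagged as noisy
        elif truth == positive:
            fn += 1  # noisy record that was missed
        else:
            tn += 1  # clean record correctly left alone

    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is empty (an assumed convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Assumed example labels: 1 = noisy record, 0 = clean record.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = binary_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

Reporting precision and recall alongside accuracy matters for imbalanced classes, since a classifier that never flags the minority class can still score a high accuracy.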

