Fuzzified Clustering and Sample Reduction for Intelligent High Performance Distributed Classification of Heterogeneous Uncertain Big Data
Keywords:
Big Data, Fuzzified Clustering, Classifier Ensemble, Weighted Subsampling, Parallel Classification, Sample Reduction, Veracity.
Abstract
diverse datasets efficiently. This paper introduces FCPC, a Fuzzified Clustering technique with sample reduction and distributed Parallel Classification. Fuzzified clustering is well suited to Big Data because it partitions datasets intelligently while handling uncertainty and overlapping data points. FCPC uses this capability to reduce dataset size while capturing the essential data structure and improving classification performance. Benchmark Big Data sets are used to compare FCPC with traditional classifiers, which require the entire dataset to fit in memory. Four classification techniques were evaluated on three metrics: Accuracy, Area Under the ROC Curve, and F1 Score. The proposed model showed improved predictive power with a sample reduction of approximately 90%, which improves performance and may reduce the computational resources required.
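The abstract does not spell out the FCPC procedure, so the following is only a minimal sketch of the general idea of fuzzy clustering followed by membership-weighted sample reduction. It assumes standard fuzzy c-means and an illustrative weighting rule (sampling probability proportional to each point's highest cluster membership); the function names `fuzzy_c_means` and `reduce_sample`, the fuzzifier `m=2.0`, and the synthetic data are all hypothetical choices, not taken from the paper. The distributed classification stage is not shown.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=50, seed=0):
    """Standard fuzzy c-means. Returns (centers, memberships U of shape (n, c))."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row is normalised to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U


def reduce_sample(X, y, U, keep_fraction=0.1, seed=0):
    """Keep a fraction of the points, sampled without replacement.

    Assumed weighting (illustrative only): a point's sampling probability
    is proportional to its highest cluster membership.
    """
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(round(keep_fraction * X.shape[0])))
    weights = U.max(axis=1)
    idx = rng.choice(X.shape[0], size=n_keep, replace=False, p=weights / weights.sum())
    return X[idx], y[idx], idx


# Synthetic two-class data standing in for a real benchmark set.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0, 1, (500, 2)), data_rng.normal(5, 1, (500, 2))])
y = np.repeat([0, 1], 500)

centers, U = fuzzy_c_means(X, n_clusters=2)
X_small, y_small, idx = reduce_sample(X, y, U, keep_fraction=0.1)
print(X_small.shape)  # 10% of the rows are kept, i.e. a 90% reduction
```

A classifier would then be trained on `X_small` and `y_small` and scored on held-out data with Accuracy, Area Under the ROC Curve, and F1 Score. Other weighting rules, for example favouring ambiguous points near cluster boundaries, are equally plausible under the description given here.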
Published
2025-01-29
How to Cite
Moawad, S. S., Osman, M., & Moussa, A. S. (2025). Fuzzified Clustering and Sample Reduction for Intelligent High Performance Distributed Classification of Heterogeneous Uncertain Big Data. Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-2275
Section
Research Articles
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see The Effect of Open Access).