AN IMPROVED FEATURE SELECTION TECHNIQUE FOR ANDROID MALWARE DETECTION SYSTEM IN SOCIAL MEDIA USING AN ENSEMBLE OF MACHINE LEARNING MODEL

Authors

  • Saqib Malik, Narendra Sharma

Abstract

This study proposes a cross-platform detection system that provides comprehensive defense by monitoring and analyzing data across several social media platforms (Twitter, Facebook, and Instagram). An ensemble of machine learning (ML) methods is used to address real-time challenges such as detecting Android malware on multiple social media platforms in order to prevent cyber fraud. The method first trains four ML models: random forest, decision tree, naive Bayes, and stochastic gradient tree boosting. A support vector machine meta-learner then combines the outputs of these pre-trained models. TF-IDF and bag-of-words were used for feature extraction for these ML algorithms. The approach was evaluated on Kaggle datasets, with Natural Language Processing (NLP) preprocessing applied to improve data quality. With the proposed feature selection method, the model achieved accuracies of 98.23% and 96.54% on the two datasets, outperforming the prior models it was compared against. The study contributes a novel feature selection technique that improves the performance of Android malware detection systems in real-time environments, and it uses NLP to classify social media text containing malware. The use of an ensemble of ML models with NLP for Android malware detection on social platforms has not been widely investigated. The method also improves detection accuracy through its unique feature extraction techniques.
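
As a rough illustration of the feature extraction step named in the abstract, the following minimal sketch computes bag-of-words counts and TF-IDF weights for a few short texts using only the Python standard library. The sample posts, the function names, and the smoothed IDF formula are assumptions made for illustration; the abstract does not specify the study's actual tokenization, weighting variant, or tooling.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: i for i, term in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(toks).items():
            row[index[term]] = count
        vectors.append(row)
    return vocab, vectors


def tfidf(count_vectors):
    """Weight relative term frequencies by a smoothed inverse document frequency."""
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0]) if count_vectors else 0
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in count_vectors if row[j] > 0) for j in range(n_terms)]
    # Smoothed IDF (one common variant, assumed here): ln((1 + N) / (1 + df)) + 1.
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    weighted = []
    for row in count_vectors:
        total = sum(row) or 1  # avoid division by zero for empty documents
        weighted.append([(c / total) * w for c, w in zip(row, idf)])
    return weighted


# Hypothetical example posts, invented for illustration only.
posts = [
    "Download this free APK now",
    "Free gift card, install the APK",
    "Meeting moved to Friday",
]
vocab, counts = bag_of_words(posts)
weights = tfidf(counts)
```

In this sketch, a term that appears in fewer documents receives a larger weight than an equally frequent term that appears in many, which is the property that makes TF-IDF useful as input to downstream classifiers such as those listed in the abstract.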

Published

2025-01-17

Section

Articles