Application of Pre-trained Models (PTMs) in sentiment analysis, news classification, anti-spam detection, and information extraction

Norliza Binti Ahmad

Keywords: Anti-spam Detection, Fine-tuning, Information Extraction, Pre-trained Models, Sentiment Analysis, Transfer Learning


Abstract

This research investigates the application of Pre-trained Models (PTMs) in Natural Language Processing (NLP), focusing on four tasks: sentiment analysis, news classification, anti-spam detection, and information extraction. Using PTMs such as BERT, GPT, RoBERTa, and T5, we explore methodologies tailored to each task. For sentiment analysis, we consider fine-tuning on the IMDb dataset, zero-shot and few-shot learning, and embedding-based approaches that pass PTM representations to classical classifiers such as SVM or Random Forest. For news classification, we employ fine-tuning on labeled news articles, hierarchical attention to handle longer texts, and transfer learning to adapt models to smaller datasets. For anti-spam detection, we investigate fine-tuning on spam-specific datasets, anomaly detection techniques, and active learning methods that adapt to the evolving nature of spam. For information extraction, we apply Named Entity Recognition (NER), relation extraction, coreference resolution, and template filling to derive structured information from unstructured text. The advantages of PTMs include data efficiency (strong performance with less labeled data), generalization across tasks and domains owing to their extensive pre-training, and speed, since transfer learning and fine-tuning are usually quicker than training models from scratch. There are also challenges: PTMs require significant computational resources, may overfit small datasets without proper regularization, and offer limited interpretability because of their complex architectures.
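
As an illustration of the active learning idea mentioned for anti-spam detection, the following is a minimal sketch of one common selection strategy, uncertainty sampling. The abstract does not say which strategy is used, so this choice is an assumption. The function names (`uncertainty`, `select_for_labeling`) and the example probabilities are hypothetical; in practice the probabilities would come from a fine-tuned spam classifier.

```python
from typing import List, Sequence


def uncertainty(prob_spam: float) -> float:
    """Score in [0, 1]; 1 means the model is maximally unsure (p = 0.5)."""
    return 1.0 - abs(prob_spam - 0.5) * 2.0


def select_for_labeling(probs: Sequence[float], budget: int) -> List[int]:
    """Return indices of the `budget` messages whose spam probability is closest to 0.5."""
    ranked = sorted(
        range(len(probs)),
        key=lambda i: uncertainty(probs[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:budget]


# Hypothetical spam probabilities for five unlabeled messages (illustrative values only).
example_probs = [0.98, 0.52, 0.03, 0.40, 0.75]
picked = select_for_labeling(example_probs, budget=2)
print(picked)  # prints [1, 3]
```

The selected messages would be sent to a human annotator and added to the training set before the next fine-tuning round, so that labeling effort concentrates on the messages the current model finds hardest to classify.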


Author Biography

Norliza Binti Ahmad

Universiti Teknologi MARA, Kampus Jasin, 77000 Jasin, Melaka, Malaysia