Intent Classification using Hugging Face and BERT

Intent classification is a classic task in Natural Language Processing (NLP) where the goal is to classify the intent behind a user’s input. It is useful in many applications such as chatbots, virtual assistants, and search engines. In this project, we use the Hugging Face Transformers library to fine-tune a pre-trained BERT model on the Amazon MASSIVE intent classification dataset.

Data

The dataset we are using is the Amazon MASSIVE intent classification dataset, which contains about 16.5 thousand labeled utterances already split into training, validation, and test sets.

The dataset can be found here.
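
As a quick illustration, the dataset can be loaded directly through the Hugging Face `datasets` library. The sketch below assumes the `AmazonScience/massive` Hub identifier, the `en-US` configuration, and the `utt`/`intent` field names; check the dataset card for the exact names.

```python
from datasets import load_dataset

# Load the English (en-US) configuration of MASSIVE from the Hugging Face Hub
# (assumed Hub identifier and field names).
dataset = load_dataset("AmazonScience/massive", "en-US")

print(dataset)              # DatasetDict with train / validation / test splits
print(dataset["train"][0])  # each example includes an utterance ("utt") and its "intent" label
```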

BERT

Transformers have seen a recent boom thanks to their parallelization capabilities and their ability to capture long-range dependencies, and new transformer models are released constantly. For this project we focus on BERT (Bidirectional Encoder Representations from Transformers), a transformer model developed by Google. The base BERT model is pre-trained on the Toronto Book Corpus and English Wikipedia, and the specific variant we use is bert-base-uncased, the smaller of the two original BERT models, which is trained on lower-cased text and well suited to intent classification.

More on the BERT model can be found here.
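
To make this concrete, here is a minimal sketch of loading bert-base-uncased for sequence classification. It assumes MASSIVE's 60 intent classes as the label set; adjust `num_labels` for a different label set.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# MASSIVE defines 60 intent classes (assumed here); the classification head is
# randomly initialized and must be fine-tuned before its predictions are useful.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=60)

inputs = tokenizer("wake me up at seven tomorrow", return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # torch.Size([1, 60]): one score per intent class
```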

Models

We evaluated four models in total to gauge the performance of BERT on the Amazon MASSIVE intent classification dataset:

  1. Baseline Model - BERT with no fine-tuning, achieving an accuracy of 0.0148.
  2. Custom Tuned Model - Added a warmup scheduler and re-initialized some of the top layers of the BERT encoder, achieving an accuracy of 0.8675 (a sketch of these two tricks follows this list).
  3. SupContrast Model - Trained with a Supervised Contrastive Loss, achieving an accuracy of 0.8141 (a minimal loss sketch also follows the list).
  4. SLIMCLR Model - Trained with the SLIMCLR loss function, achieving an accuracy of 0.8773.
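
Below is a rough sketch of the two tricks used in the Custom Tuned Model. The number of re-initialized layers, learning rate, and step counts are illustrative placeholders, not the values used in the project.

```python
import torch
from transformers import AutoModelForSequenceClassification, get_linear_schedule_with_warmup

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=60)

# Re-initialize the top k encoder layers using the model's own weight-init routine,
# so they are fine-tuned from scratch (k = 2 is an illustrative choice).
k = 2
for layer in model.bert.encoder.layer[-k:]:
    layer.apply(model._init_weights)

# Linear warmup followed by linear decay (placeholder step counts).
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=1000
)
```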

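For the SupContrast model, here is a minimal sketch of a supervised contrastive loss in the style of Khosla et al. It operates on L2-normalized sentence embeddings (e.g., pooled BERT outputs) and is an illustration under those assumptions, not the project's exact implementation.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.07):
    """features: (batch, dim) sentence embeddings; labels: (batch,) intent ids."""
    features = F.normalize(features, dim=1)
    sim = features @ features.T / temperature          # pairwise cosine similarities
    n = features.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=features.device)
    sim = sim.masked_fill(self_mask, float("-inf"))    # exclude self-pairs

    # Positives share a label with the anchor (self excluded).
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Average log-probability over positives, for anchors with at least one positive.
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    sum_pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return (-sum_pos_log_prob[valid] / pos_counts[valid]).mean()
```

During training, each mini-batch of utterance embeddings and their intent labels is passed through this loss in place of (or alongside) the standard cross-entropy objective.
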
Deliverables

For a more in-depth look at the project, you can download the full write-up below, or view the code by clicking the image below.