The CCS Nationals 2024 was held at the IU Natatorium in Indianapolis, Indiana, from the 5th to the 7th of April. This was my second time attending CCS Nationals, but this time I was not only a competitor but also one of the captains of my team, UC San Diego Club Swim.
On the 13th of March 2024, after a two-quarter journey through the UCSD Senior Capstone sequence, we capped it off with a showcase alongside the projects from the other teams. Our project, ClimateBench Plus, improved upon the baseline models set by the original ClimateBench paper by our mentor. For more on the project, click here.
The project is an extension of my professor Duncan Watson-Parris's paper ClimateBench, a benchmarking framework that leverages data from the Coupled Model Intercomparison Project (CMIP), AerChemMIP, and the Detection and Attribution Model Intercomparison Project (DAMIP), which are extremely complex simulations performed by state-of-the-art Earth System Models (ESMs). The goal is to create a lighter, more accessible benchmark that can be used for climate research and for better understanding our climate.
The task was to classify French and Spanish words using Machine Learning, building the classifier from scratch. The dataset we were given consisted of 1,200 words: 600 French and 600 Spanish.
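The write-up doesn't specify the model or features used, so as one hedged sketch of a from-scratch approach: a Naive Bayes classifier over character bigrams, since language identity shows up strongly in spelling patterns. The class name and feature choice here are illustrative, not the project's actual design.

```python
import math
from collections import Counter

def bigrams(word):
    """Character bigrams with boundary markers, so prefixes/suffixes count too."""
    w = f"^{word.lower()}$"
    return [w[i:i + 2] for i in range(len(w) - 1)]

class BigramNaiveBayes:
    """Multinomial Naive Bayes over character bigrams, written from scratch."""

    def fit(self, words, labels):
        self.labels = sorted(set(labels))
        self.counts = {c: Counter() for c in self.labels}
        self.totals = {c: 0 for c in self.labels}
        self.vocab = set()
        for word, label in zip(words, labels):
            for bg in bigrams(word):
                self.counts[label][bg] += 1
                self.totals[label] += 1
                self.vocab.add(bg)
        return self

    def predict(self, word):
        # Uniform class prior: the dataset is balanced (600 words per language).
        scores = {}
        for c in self.labels:
            lp = 0.0
            for bg in bigrams(word):
                # Laplace smoothing keeps unseen bigrams from zeroing out a class.
                lp += math.log((self.counts[c][bg] + 1)
                               / (self.totals[c] + len(self.vocab)))
            scores[c] = lp
        return max(scores, key=scores.get)
```

Trained on the 1,200-word dataset, usage would look like `BigramNaiveBayes().fit(train_words, train_labels).predict("fromage")`.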
Intent Classification is a classic task in Natural Language Processing (NLP) where the goal is to classify the intent of a user's input. This is useful in many applications such as chatbots, virtual assistants, and search engines. In this project, we will be using the Hugging Face Transformers library to fine-tune a pre-trained BERT model on the Amazon MASSIVE intent classification dataset.
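As a hedged sketch of what that fine-tuning setup typically looks like with the Transformers `Trainer` (the dataset id, column names, and hyperparameters below are assumptions, not the project's exact values):

```python
import numpy as np

def compute_metrics(eval_pred):
    """Accuracy over (logits, labels), used as the Trainer's metric callback."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": float((preds == np.asarray(labels)).mean())}

def finetune():
    """Fine-tune bert-base-uncased on MASSIVE intents (call this to run)."""
    # Heavy imports are deferred so the metric helper stays dependency-light.
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    # Dataset id and column names are assumptions about the MASSIVE mirror used.
    ds = load_dataset("mteb/amazon_massive_intent", "en")
    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    ds = ds.map(lambda batch: tok(batch["text"], truncation=True,
                                  padding="max_length", max_length=64),
                batched=True)

    # MASSIVE defines 60 intent classes.
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=60)

    args = TrainingArguments(output_dir="bert-intent", num_train_epochs=3,
                             per_device_train_batch_size=32)
    trainer = Trainer(model=model, args=args, train_dataset=ds["train"],
                      eval_dataset=ds["validation"],
                      compute_metrics=compute_metrics)
    trainer.train()
    return trainer
```

The `Trainer` handles batching, the optimizer, and checkpointing; only the metric callback needs to be written by hand.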
Recurrent Neural Networks (RNNs) are a special kind of neural network that maintain a state, or memory, of previous inputs, allowing them to remember important information over a sequence of inputs. This makes them ideal for tasks involving sequential information, where the order of the data matters. Long Short-Term Memory (LSTM) networks are a variation of RNNs that fix the vanishing-gradient problem of traditional RNNs, which arises when errors are backpropagated over many time steps and the gradients shrink toward zero. This makes both RNNs and LSTMs well suited to tasks like music generation, where the order of the notes is important.
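To make the "memory" idea concrete, here is a minimal vanilla RNN cell in NumPy (dimensions and initialization are illustrative only). The same hidden-state recurrence, with extra gating added, is what an LSTM builds on:

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions are arbitrary, chosen just for illustration.
INPUT_SIZE, HIDDEN_SIZE = 4, 8
W_xh = rng.normal(scale=0.1, size=(HIDDEN_SIZE, INPUT_SIZE))
W_hh = rng.normal(scale=0.1, size=(HIDDEN_SIZE, HIDDEN_SIZE))
b_h = np.zeros(HIDDEN_SIZE)

def rnn_forward(xs):
    """Run a vanilla RNN over a sequence of input vectors.

    The hidden state h is the network's memory: each step mixes the new
    input with everything seen so far via h_t = tanh(W_xh x_t + W_hh h_{t-1} + b).
    """
    h = np.zeros(HIDDEN_SIZE)
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h
```

Because each state depends on the previous one, feeding the same inputs in a different order produces a different final state, which is exactly the order sensitivity that music generation relies on.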
On the 6th of February 2024, I organized an event in collaboration with ASML, consisting of an ASML campus tour, a talk on the future of Data Science, and a Q&A session with the ASML team. The event was a huge success, with everyone learning a lot about the future of Data Science and ASML's role in it. Below are some pictures from the event.
Semantic Segmentation is the task of classifying each pixel in an image into a category. It is an extremely important task in computer vision, with many real-life applications such as autonomous driving, medical imaging, and satellite imaging. In this project, my group and I implemented a Fully Convolutional Network (FCN) to perform semantic segmentation on the VOC2007 dataset.
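The post doesn't go into evaluation details, but the standard way to score a segmentation model like this is per-class Intersection-over-Union averaged across classes (mean IoU). A small NumPy sketch of that metric, applied to the label maps produced by taking an argmax over the FCN's per-pixel logits:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union between two (H, W) integer label maps.

    Classes absent from both prediction and target are skipped so they
    don't distort the average.
    """
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class c appears in neither map
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))
```

A perfect prediction scores 1.0; predicting a class everywhere that appears almost nowhere in the target is penalized by its large union.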
For this project, I implemented a Softmax Regression neural network from scratch to classify the MNIST dataset. MNIST is a dataset of handwritten digits from 0 to 9, and a popular benchmark for testing new Machine Learning models. The task was to classify the handwritten digits into their respective classes using Softmax Regression.
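As a hedged sketch of the core of such a from-scratch model (shapes assume flattened 28x28 images; the learning rate and epoch count are illustrative, not the values I actually used):

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax_regression(X, y, num_classes, lr=0.1, epochs=100):
    """Batch gradient descent on cross-entropy loss for a linear softmax model."""
    n, d = X.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    Y = np.eye(num_classes)[y]          # one-hot targets
    for _ in range(epochs):
        P = softmax(X @ W + b)          # predicted class probabilities
        grad = P - Y                    # gradient of cross-entropy w.r.t. logits
        W -= lr * (X.T @ grad) / n
        b -= lr * grad.mean(axis=0)
    return W, b

def predict(X, W, b):
    """Assign each row of X to its highest-probability class."""
    return softmax(X @ W + b).argmax(axis=1)
```

The `P - Y` gradient is the part that makes softmax regression pleasant to implement by hand: the softmax and cross-entropy derivatives collapse into a single subtraction.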