Automatic Labeling Model

2023
Machine Learning, NLP

Overview

Kakaostyle is a mid-sized e-commerce company within the Kakao Group in South Korea. As a data quality manager on the Data Science team, I analyzed model prediction data and built dashboards to check that the training data was well distributed. This included verifying that all items, particularly popular ones, were represented in the training data and assessing how that coverage related to page views.

The team launched a new app feature that lets users browse reviews by clothing characteristics, such as fit and length, that they might consider when shopping online. Training the language model behind this feature sometimes required me to label review data by hand, which was time-consuming and inefficient. I proposed an automatic labeling model to my manager and was given the opportunity to design and implement it.

I fine-tuned a BERT model on internal clothing review data using supervised learning. When you upload an unlabeled file, the model labels the sentiment of each review as Positive, Neutral, or Negative and returns the labeled file. I also built a simple user interface with Streamlit. This approach proved very efficient, as we only needed to proofread the labeled files. If you’re interested, check out the demo!
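The upload, label, and return flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `review` column name, the label order in `LABELS`, and the helper names `scores_to_label` and `label_csv` are all assumptions, and `toy_score_fn` is a keyword-based placeholder standing in for the fine-tuned BERT classifier.

```python
import csv
import io
from typing import Callable, Iterable, List

# Assumed class order; the real model's index-to-label mapping may differ.
LABELS = ("Negative", "Neutral", "Positive")


def scores_to_label(scores: Iterable[float]) -> str:
    """Map one score per class (e.g. classifier logits) to a label name."""
    scores = list(scores)
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    best = max(range(len(scores)), key=scores.__getitem__)
    return LABELS[best]


def label_csv(
    csv_text: str,
    score_fn: Callable[[str], Iterable[float]],
    text_column: str = "review",  # hypothetical column name
) -> str:
    """Return the CSV text with a 'label' column appended to every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if text_column not in fieldnames:
        raise KeyError(f"column {text_column!r} not found in uploaded file")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames + ["label"], lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["label"] = scores_to_label(score_fn(row[text_column]))
        writer.writerow(row)
    return out.getvalue()


def toy_score_fn(text: str) -> List[float]:
    """Placeholder for the fine-tuned model: keyword counts, NOT a real classifier."""
    lowered = text.lower()
    negative = sum(word in lowered for word in ("tight", "small", "bad"))
    positive = sum(word in lowered for word in ("perfect", "love", "great"))
    return [float(negative), 0.5, float(positive)]


if __name__ == "__main__":
    unlabeled = "review\nFits perfect\nToo tight\nArrived on Tuesday\n"
    print(label_csv(unlabeled, toy_score_fn))
```

In a Streamlit app, a function like `label_csv` could sit between `st.file_uploader` (to receive the unlabeled file) and `st.download_button` (to return the labeled one), with `score_fn` replaced by a call into the trained model. The labeled output would still be proofread by a person, as described above.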

Technologies Used

Python, Streamlit

Project Details

Year

2023

Status

Completed

Links