Capturing Training Data Using Natural Language Processing
Ubineer is looking to add to our data set by integrating advanced Natural Language Processing (NLP) techniques. We are currently seeking to build a massive data set by using NLP to interpret complex queries and capture the relevant data. The goal of this project is to expand and improve our data sets so that we can train a large language model (LLM). We are seeking students who want to understand how NLP works and have a passion for data analysis. The project will involve tasks such as data preprocessing, capture, and storage. By the end of the project, students will understand key NLP techniques such as chunking.