O.C. Tanner, a well-known HR company that has been in business since 1927 and has over 1,500 employees, wanted to research and design two GPT-3 software products to use as internal tools with their clients. GPT-3 based products can be difficult to outline and design because so little public information exists on optimizing and improving these systems to a production level. We drew on our experience building production GPT-3 systems for companies like Keap.com, Mission Holdings, and EAS.
Let’s take a look at the two GPT-3 based products Width.ai designed.
Our unconscious bias detection pipeline recognizes unconscious bias in the language of messages submitted by the user. The pipeline has a learned understanding of how the language in a message should change given known personal information about the two people in the conversation. For example, the words and language that are acceptable can change when a person of one gender is talking to a person of another, or when race relations are taken into account. Our GPT-3 pipeline handles the sentiment and language understanding that bias detection requires, and it also handles the learned relationship with multiple variables such as gender, age, and race to steer the model. Given the multi-task learning required, we’ve designed various testing and optimization modules to understand where the GPT-3 pipeline struggles to cover variance.
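One way to picture how participant information can steer a few-shot model is to place it next to every message in the prompt. The sketch below is only an illustration under stated assumptions, not the pipeline Width.ai built: the `Participant` fields, the label set, and the prompt layout are all hypothetical, and no GPT-3 call is made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    # Hypothetical fields; the article only names gender, age, and race as variables.
    role: str
    gender: str
    age_range: str


@dataclass(frozen=True)
class LabeledExample:
    sender: Participant
    recipient: Participant
    message: str
    label: str  # assumed label set: "biased" or "not biased"


def describe(p: Participant) -> str:
    """Render a participant as one short line of context."""
    return f"{p.role}, {p.gender}, {p.age_range}"


def build_prompt(examples, sender, recipient, message):
    """Assemble a few-shot prompt that pairs each message with participant context."""
    header = (
        "Decide whether the message shows unconscious bias, "
        "given who is speaking to whom.\n"
    )
    blocks = [
        f"Sender: {describe(ex.sender)}\n"
        f"Recipient: {describe(ex.recipient)}\n"
        f"Message: {ex.message}\n"
        f"Label: {ex.label}\n"
        for ex in examples
    ]
    # The final block leaves the label empty for the model to complete.
    blocks.append(
        f"Sender: {describe(sender)}\n"
        f"Recipient: {describe(recipient)}\n"
        f"Message: {message}\n"
        "Label:"
    )
    return header + "\n" + "\n".join(blocks)
```

The returned string would be sent to a completion endpoint, and the text the model generates after the final `Label:` would be read as the prediction.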
As always, we outlined the prompt optimization and model fine-tuning methods that we’ve found greatly improve accuracy on this task. Because large language models are applied to such a wide variety of tasks, no single prompt optimization algorithm works across the board, so we designed a custom algorithm for this specific use case.
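The article does not describe the custom algorithm, so the following is not it. As a generic illustration of one common prompt optimization idea, the sketch below picks the few-shot examples most similar to the incoming message, using plain token-overlap (Jaccard) similarity as a stand-in for whatever similarity measure a production system would use. The function names and the `(message, label)` tuple format are assumptions.

```python
import re


def tokens(text: str) -> set:
    """Lowercase word tokens; a deliberately simple stand-in for embeddings."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a: set, b: set) -> float:
    """Share of tokens the two sets have in common."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_examples(pool, query, k=2):
    """Return the k (message, label) pairs from pool most similar to query."""
    q = tokens(query)
    ranked = sorted(pool, key=lambda ex: jaccard(tokens(ex[0]), q), reverse=True)
    return ranked[:k]
```

Selecting examples per input keeps the prompt short while showing the model cases close to the one it has to judge, which is one way to cope with a training set that covers many topics.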
This product works like a text editor that detects negative or harmful wording and language throughout a given input message. For each instance it finds, the software also suggests a rewritten alternative. The GPT-3 model uses a learned understanding of subjects such as gendered language and offensive terms to detect these problems.
Given the nature of the language deemed negative, this was difficult to detect with a few-shot learner. Similar systems often rely on specific keywords and phrases, such as vulgar words or easily identifiable phrases that even sentiment analysis models would catch. In this case, the negative language examples depended more on the meaning and context of what was being said. This led to inputs with higher variance from the training examples becoming false negatives. Our software design also generates new phrases to replace the negative language, based on state of the art designs for generative models.
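To see where false negatives concentrate, a labeled evaluation set can be split into slices and the miss rate computed per slice. The sketch below assumes a hypothetical record format of `(slice_name, actual, predicted)`, where `True` means negative language is present; the slice names in the usage note are illustrative and not taken from the project.

```python
from collections import defaultdict


def false_negative_rate_by_slice(records):
    """Map each slice to the fraction of truly negative messages the model missed.

    records: iterable of (slice_name, actual, predicted) tuples, where
    actual and predicted are booleans and True means "negative language present".
    Slices that contain no truly negative messages are omitted.
    """
    positives = defaultdict(int)
    misses = defaultdict(int)
    for slice_name, actual, predicted in records:
        if actual:
            positives[slice_name] += 1
            if not predicted:
                misses[slice_name] += 1
    return {name: misses[name] / positives[name] for name in positives}
```

For example, tagging evaluation messages as `"explicit"` (keyword-style) or `"contextual"` (meaning-dependent) would show whether the misses come mostly from the contextual cases described above.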
Just as we did for the product above, Width.ai outlined all the architecture and supporting modules required for this pipeline.
Our software pipeline takes advantage of state of the art GPT-3 APIs and resources, which we combine with our knowledge of recent breakthroughs in optimizing and tuning these generative models towards specific use cases. Although the data points come from a similar domain, they cover a wide range of topics, so we must account for some level of generalization in our training examples. Our cutting edge understanding of GPT-3 prompt optimization allowed us to account for a high level of variance in the training set and create a system that can handle much more as the training set grows with new examples.
The end result is two powerful GPT-3 based products that are designed and architected for production use.
Interested in seeing how GPT-3 and other NLP tools can be used in your industry? Let’s talk - Contact Us