Bombora, the leader in B2B intent data, provides sales and marketing solutions that identify when a business's prospective customers are actively searching for relevant solutions. Armed with Bombora’s intent data, sales and marketing teams focus their efforts on the businesses that are most likely to buy their solution now.
How does Bombora know when a business is actively looking for solutions? Bombora works with a consent-based data cooperative of B2B publishers to capture buying signals across 4,000 sites. Bombora processes those signals with natural language processing (NLP), including a BERT-based machine learning model, to produce Company Surge® scores that reflect each business’s level of purchasing intent.
May the Best Model Win
After an innovative year working on the next version of the company’s flagship Content Understanding Model, it was time to put this challenger model to the test. This business-critical NLP model evaluates and categorizes web page content into one or more of the 8,500+ Bombora content topics. Bombora customers select topics relevant to their business, and Bombora reports when businesses are actively engaging with content in the selected topics. As you can imagine, releasing a new version of a model with this level of business importance doesn’t happen without exhaustive testing and analysis of changes in model performance. Bombora data scientist Zhuoru (Simon) Lin explains:
“We set up tests to compare the performance of the current production model with the new challenger model. Each model was given the same sample of content and, for each piece, returned the five topic predictions with the highest confidence scores.”
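Conceptually, that top-5 selection step can be sketched as follows. This is a minimal illustration, not Bombora's actual pipeline; the topic names and scores below are invented for the example.

```python
import numpy as np

def top_k_topics(scores, topic_names, k=5):
    """Return the k topics with the highest confidence scores, highest first."""
    idx = np.argsort(scores)[::-1][:k]  # indices of top-k scores, descending
    return [(topic_names[i], float(scores[i])) for i in idx]

# Hypothetical confidence scores for a handful of topics
topics = ["Intent Data", "ABM", "CRM", "Data Privacy",
          "NLP", "Sales Enablement", "B2B Marketing"]
scores = np.array([0.91, 0.12, 0.45, 0.30, 0.88, 0.67, 0.74])

preds = top_k_topics(scores, topics)
print(preds)  # five (topic, score) pairs, best first
```

In the actual test, both the production and challenger models would produce such a top-5 list for the same content sample, and those lists would become the annotation tasks.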
The Bombora annotation team, armed with Label Studio Enterprise, was enlisted to review and label whether the topic predictions for each article were accurate. The model with the more accurate predictions would be crowned the winner. Did the challenger model unseat the reigning champion? Or did the production model prove it still has what it takes?
Behind the Scenes
Lin and his colleagues leveraged Label Studio’s XML-like scripting language to quickly create a custom labeling interface designed specifically for the task at hand. Lin explains why Label Studio’s customizable interface is so important:
“One thing I love about Label Studio is the flexibility it provides. Other data labeling platforms offer a generic interface that can’t really be customized. With Label Studio’s custom config, we tailored the entire interface to our specific needs. Having a well-designed interface improves labeling accuracy. And the scripting is very easy. Anyone can do it.”
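A Label Studio labeling config for this kind of prediction-review task might look like the following sketch. The field names, variables, and choice values here are illustrative, not Bombora's actual configuration.

```xml
<View>
  <!-- Show the article text the model classified -->
  <Text name="article" value="$text"/>
  <!-- Show the model's predicted topic for this task -->
  <Header value="Predicted topic: $predicted_topic"/>
  <!-- Annotators judge whether the prediction fits the article -->
  <Choices name="accuracy" toName="article" choice="single" required="true">
    <Choice value="Accurate"/>
    <Choice value="Not accurate"/>
  </Choices>
</View>
```

Because the interface is just markup, a data scientist can adjust what annotators see (and in what order) without waiting on a platform vendor.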
Lin used Label Studio to assign labeling tasks, which included topic predictions from both models, to each annotator. Annotators were not informed that they were testing predictions from two competing models.
Determining the Winner
Lin and his colleagues watched the live results stream in as annotators worked through their labeling queues. Using Label Studio, Lin tracked annotator progress, label results (accurate or not accurate), and inter-annotator agreement via the platform’s Agreement Matrix. Lin had each topic prediction labeled by four to five annotators, and the Agreement Matrix highlighted when, and to what degree, their conclusions did not match. According to Lin:
“The Agreement Matrix showed us in real time whether annotators agreed on the accuracy of the prediction. Disagreement was actually quite interesting. It showed us when the model and the annotators had difficulty assessing topics for certain content. This feedback will help us continue to improve the model.”
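Pairwise agreement of the kind the Agreement Matrix surfaces can be sketched in a few lines. This is a simplified illustration of the idea (fraction of tasks on which two annotators gave the same label); the annotator names and labels below are invented, and Label Studio computes this for you.

```python
from itertools import combinations

def agreement_matrix(annotations):
    """annotations: dict of annotator -> list of labels over the same tasks.
    Returns the fraction of tasks on which each pair gave the same label."""
    matrix = {}
    for a, b in combinations(sorted(annotations), 2):
        pairs = list(zip(annotations[a], annotations[b]))
        matrix[(a, b)] = sum(1 for x, y in pairs if x == y) / len(pairs)
    return matrix

# Hypothetical labels from three annotators over five predictions
labels = {
    "ann1": ["Accurate", "Accurate", "Not accurate", "Accurate", "Accurate"],
    "ann2": ["Accurate", "Not accurate", "Not accurate", "Accurate", "Accurate"],
    "ann3": ["Accurate", "Accurate", "Accurate", "Accurate", "Not accurate"],
}
matrix = agreement_matrix(labels)
print(matrix)
```

Low agreement on a task is a useful signal in itself: as Lin notes, it flags content where both humans and the model struggle to assign topics.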
And the winner was… the challenger model! After a year in development and a battery of testing and validation, the challenger model bested its predecessor with a 30% improvement in accuracy. Lin said:
“We needed to have a lot of confidence in this model before we could put it in production. From orchestrating our tests to analyzing the results, Label Studio was easy to use, insightful, and reduced how much time we spent setting up and running the tests. In the end, Label Studio gave us the tools we needed to validate the new model and feel confident in putting it into production.”
The Bombora team is now looking into using Label Studio for active learning. The platform is already being used for a different model, helping to increase labeling productivity from 2,500 labels per week by a single internal annotator to over 17,000 labels per week with a growing annotation team.