What does it take to build a state-of-the-art Artificial Intelligence content moderation tool? We caught up with Besedo’s semantics expert and computational linguistics engineer, Evgeniya Bantyukova.
Interviewer: Nice to meet you! Tell us a little about yourself.
Evgeniya: I’m Evgeniya and I’m based in Besedo’s Paris office. I’m originally from Russia but I’ve been in France for the past five or so years. I started at ioSquare about a year and a half ago, and have continued to work there as part of Besedo since the two companies merged last year.
Interviewer: What do you do? What is your job title and what does it really mean?
Evgeniya: As a computational linguistics engineer, I guess you could describe me as part linguist and part computer programmer. The work I do bridges the gap between what people search for and post online and the way content is moderated.
I work with semantics. This means I spend a lot of time researching information and looking at the different ways words and phrases are presented and expressed. I also build filters to analyze and identify the information I’ve manually researched. It’s an iterative process of constant refinement that takes time to perfect.
The filters can then be used by us, on behalf of our clients, to identify when a piece of text using these terms and phrases is submitted to their site, before it gets posted. The ultimate aim is to ensure that incorrect, defamatory, or just plain rude content doesn’t make it onto our clients’ sites.
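The core idea Evgeniya describes can be sketched very simply: check a submission against a list of blocked patterns before it goes live. The patterns below are invented placeholders, and production filters are far more sophisticated, but this illustrates the pre-publication check.

```python
import re

# Hypothetical blocked patterns (illustrative only, not real filter rules).
BLOCKED_PATTERNS = [
    r"\bfree\s+money\b",
    r"\bguaranteed\s+winner\b",
]

def is_allowed(text: str) -> bool:
    """Return False if the text matches any blocked pattern."""
    lowered = text.lower()
    return not any(re.search(pattern, lowered) for pattern in BLOCKED_PATTERNS)
```

In practice this check would run as part of the submission pipeline, so flagged content never reaches the site.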
Interviewer: What kind of projects have you worked on? Could you give us an example?
Evgeniya: Sure. Recently I was tasked with creating a filter for profanity terms in several different languages – not just the words themselves, but variations on them, like different ways to spell them or alternative phrasings.
This also involved analyzing them and creating a program or model that could detect their use. There was a lot of data capture and testing involved, across millions of data points, which helped ensure the filters we built were as effective as possible.
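One common way to catch spelling variations like the ones mentioned is to normalize frequent character substitutions before matching. The substitution table and the target word here are assumptions for illustration, not Besedo's actual rules.

```python
# Map common character substitutions back to their plain letters
# (an illustrative assumption, not a production ruleset).
SUBSTITUTIONS = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "$": "s", "@": "a"})

def normalize(text: str) -> str:
    """Lowercase the text and undo common character substitutions."""
    return text.lower().translate(SUBSTITUTIONS)

def contains_variant(text: str, blocked_word: str) -> bool:
    """Return True if a spelling variant of blocked_word appears in the text."""
    return blocked_word in normalize(text)
```

A real system would also handle inserted punctuation, repeated letters, and alternative phrasings, which is where the iterative refinement Evgeniya describes comes in.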
One thing I’m working on right now is a project tackling fake profiles on dating sites: analyzing scam messages and extracting the expressions and words that are most frequently used. One thing I have discovered in this process is that those posting fake profiles often use sequences of adjectives – words like ‘nice’, ‘honest’, or ‘cool’ – so now I’m looking at creating a model that finds profiles fitting that description. That approach on its own would create many false positives, but with discoveries like these we get a much more precise idea of what fake profiles look like, and that helps us create filters that limit the number that go live on our clients’ sites.
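The adjective-sequence signal could be sketched as a simple heuristic: flag a profile description that contains a run of several consecutive adjectives from a watch list. The adjective set and run-length threshold below are assumptions for the example, and, as noted above, such a heuristic would only be one signal among many.

```python
# Illustrative watch list of adjectives often seen in scam profiles
# (the list and threshold are assumptions for this sketch).
COMMON_ADJECTIVES = {"nice", "honest", "cool", "kind", "loyal", "caring"}

def looks_suspicious(description: str, min_run: int = 3) -> bool:
    """Return True if min_run consecutive watch-list adjectives appear."""
    words = [w.strip(".,!") for w in description.lower().split()]
    run = 0
    for word in words:
        run = run + 1 if word in COMMON_ADJECTIVES else 0
        if run >= min_run:
            return True
    return False
```

On its own this would flag plenty of genuine profiles too, which is exactly why such signals are combined with other discoveries to sharpen precision.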
Interviewer: How does the work you do feed into AI moderation?
Evgeniya: Crafting filters involves working on a set amount of data. The more data we have, the more accurate we can make our filters. It’s an iterative and human-driven process, but engineered to be very precise.
Filters like these, when used as verification models, can help improve the precision and quality of manual content moderation. And when used in combination with our machine learning/deep learning pipeline, they improve our AI’s overall accuracy and efficiency.
The filters I build are quite generic, so they are used as a framework for multiple clients, depending on their moderation needs, and they can be tailored to specific assignments as required. On top of that, to keep our filters “sharp”, we continuously update them as language evolves and new trends and words appear.
Interviewer: Do you have any heroes or role models that you admire in your field?
Evgeniya: Well, as you might imagine, role models in computational linguistics are kind of hard to come by. But I’m a big fan of theoretical linguists like Noam Chomsky.
Interviewer: What qualities do you need to succeed in your field?
Evgeniya: I think you need to be genuinely curious about the world in general. Every new trend and phenomenon should interest you, as each one brings new expressions and words that will impact the filters you are crafting.
You also need to have a knack for languages, or at least for the structure of how different languages are built.
Finally, you need to be open-minded and able to stay objective. When working on a profanity filter, it doesn’t help if you are continuously offended. You need to stay neutral and focus on the endgame: keeping people safe online.
This is why I enjoy my job so much; it’s very rewarding knowing that you are making a difference – whether that’s ensuring a site is secure for users or, more generally, seeing the positive impact of something you’ve done. Take dating sites, for instance: the fact that the work I do can help someone find love is the greatest reward I can think of. I guess I’m something of a hopeless romantic!
Evgeniya is a linguistic engineer at Besedo.
She combines her programming and linguistic skills in order to automatically process natural languages.
Her work allows Besedo to build better and more accurate filters and machine learning algorithms.