Creating a moderation filter is the easy part. Ensuring that it produces the desired result is a science. Accurate content moderation filters don’t just catch what they are set up to find; they also keep the number of false positives to a minimum.
With good filters alone, our clients have achieved up to 80% automation of their content moderation. But what is the secret to building a filter that is both accurate and efficient?
8 steps of filter moderation
Our expert filter manager Kevin Martinez shares his 8 steps for building content moderation filters that work.
In this example, we are using drugs as the target of the filter, but the process is the same for building anything from profanity filters to rules aimed at catching ads for endangered species.
1. Define the goal of the filter. In this case, we want it to help us prevent illegal drugs from getting posted to our site. As such, we created a filter in Implio called “drugs”. It is always advisable to give your filters descriptive names. You are likely to end up with a lot of filters, so it helps to be able to understand their function at a glance.
2. Check the laws of the country your site operates in. Laws can vary widely between countries and sometimes even between regions. In Spain, for instance, you are allowed to sell cannabis seeds and the growing box for cannabis, but not the plant itself.
3. Decide on the action you want the filter to take. Should it refuse the content, send it to manual moderation, or is it just a test filter that takes no action other than highlighting the ads it would have caught? In Implio the default is to automatically accept any ads that don’t match the filter, but you have full control over what happens to content that matches your rules.
4. Create a list of all drug-related keywords (cocaine, heroin, cannabis, etc.). Make sure you also include any slang words your users are likely to use for drugs.
5. Now it is time to set up your rule. Make sure that the rule pulls from the list and that you add exceptions to avoid false positives. For example, in step 2 we discovered that selling cannabis seeds is okay. As such, we must ensure our filter excludes “cannabis” + “seeds”.
6. Once your filter is set up with the list and all relevant exclusions, it is time for the first quality check. Upload your data to Implio and review the matches. Are you getting any false positives?
7. For all false positives, add exceptions (also called whitelisting specific content). Think your exceptions through so you don’t create a blanket whitelist that lets unwanted content through.
8. Rinse & repeat:
Once you have added new exceptions, run your data through again to quality check your updated filter. Repeat steps 6 to 8 as many times as needed to reach your target quality rate. At Besedo we aim for at least 95% accuracy, and most of our filters exceed that.
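The keyword list, the cannabis seeds exception and the quality check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Implio's rule syntax or API: the names `DRUG_KEYWORDS`, `EXCEPTIONS`, `matches_filter` and `match_accuracy` are made up for the example, and "accuracy" is read here as the share of matched ads that a human reviewer agrees are unwanted.

```python
import re

# Hypothetical keyword list; a real list would also include slang terms.
DRUG_KEYWORDS = ["cocaine", "heroin", "cannabis"]

# Hypothetical exception: "cannabis seeds" is allowed in the Spain example.
EXCEPTIONS = [r"\bcannabis\s+seeds?\b"]

KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in DRUG_KEYWORDS) + r")\b",
    re.IGNORECASE,
)
EXCEPTION_RES = [re.compile(pattern, re.IGNORECASE) for pattern in EXCEPTIONS]


def matches_filter(text):
    """Return True if a keyword appears outside every exception phrase."""
    cleaned = text
    # Blank out the allowed phrases first, then search what is left.
    # This keeps the exception narrow: an ad that mentions cannabis seeds
    # AND cocaine is still caught.
    for exception in EXCEPTION_RES:
        cleaned = exception.sub(" ", cleaned)
    return KEYWORD_RE.search(cleaned) is not None


def match_accuracy(labelled_ads):
    """Share of matched ads that reviewers labelled as truly unwanted.

    labelled_ads is a list of (text, is_unwanted) pairs.
    Returns None when the filter matched nothing.
    """
    verdicts = [is_unwanted for text, is_unwanted in labelled_ads
                if matches_filter(text)]
    if not verdicts:
        return None
    return sum(verdicts) / len(verdicts)


if __name__ == "__main__":
    sample = [
        ("Cannabis plant for sale", True),
        ("Cannabis seeds, fast delivery", False),
        ("Heroin chic dress, size M", False),  # a false positive to whitelist
        ("Vintage bicycle", False),
    ]
    print(match_accuracy(sample))
```

Each ad that is matched but labelled as acceptable, such as the dress in the sample, is a candidate for a new, narrowly scoped exception before the data is run through again.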
Even though you can refuse content automatically, we do not recommend it unless your filter reaches 100% accuracy. That level is rare, but one example of a filter that could reach it is an IP-related rule that stops users at certain IP addresses from posting.
If you are not 100% certain that all ads matched by the refusal filter should be refused, then you should send matches for manual review instead. Otherwise, you run the risk of ruining your user experience.
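The routing advice above could look like the following sketch. It assumes a hypothetical `Action` enum, a made-up `BLOCKED_IPS` blocklist (using addresses reserved for documentation) and a `route` function. None of these are part of Implio. Only the rule that is certain to be right refuses automatically, while keyword matches go to a human.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    MANUAL_REVIEW = "manual_review"
    REFUSE = "refuse"


# Hypothetical blocklist; these addresses come from ranges reserved for documentation.
BLOCKED_IPS = {"203.0.113.7", "198.51.100.23"}


def route(matched_keyword_filter, poster_ip):
    """Pick an action for one ad.

    matched_keyword_filter: True if a keyword rule (below 100% accuracy) matched.
    poster_ip: the IP address the ad was posted from.
    """
    # An exact IP match leaves no room for false positives, so refusing is safe.
    if poster_ip in BLOCKED_IPS:
        return Action.REFUSE
    # Keyword rules can misfire, so a moderator makes the final call.
    if matched_keyword_filter:
        return Action.MANUAL_REVIEW
    # Anything that matches no rule is accepted.
    return Action.ACCEPT
```

For example, `route(True, "192.0.2.1")` returns `Action.MANUAL_REVIEW`, so a legitimate ad that trips the keyword filter is delayed rather than rejected outright.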
If you follow Kevin’s 8 steps, you should be well equipped to create your own accurate filters. If you want to improve your filter management skills even further, we offer training sessions where our expert filter managers teach you all about regular expressions and rule crafting through step-by-step guides and exercises.