How to create accurate content moderation filters that work

Creating a filter for moderation is the easy part; ensuring that the filter produces the desired result is a science. Truly accurate content moderation filters don’t just catch what they are set up to find, they also work in a way that minimizes false positives.

With good filters alone, our clients have achieved up to 80% automation of their content moderation. So what is the secret to building a filter that is both accurate and efficient?

8 steps of filter moderation

Our expert filter manager Kevin Martinez shares his 8 steps for building content moderation filters that work.

In this example, we are using drugs as the target of the filter, but the process is the same for building anything from profanity filters to rules aimed at catching ads for endangered species.

1. Mission:
Define the goal of the filter. In this case, we want it to help us prevent illegal drugs from getting posted to our site. As such, we create a filter in Implio and name it “drugs”. It is always advisable to give your filters descriptive names: you are likely to end up with many filters, so it is good to be able to understand their function at a glance.

2. Local laws:
Check the laws of the country your site operates in. Laws can vary widely by country and sometimes even by region. In Spain, for instance, you are allowed to sell the growing box for cannabis and cannabis seeds, but not the plant itself.

3. Action:
Decide on the action you want the filter to take. Should it refuse the content, send it to manual moderation, or is it just a test filter that shouldn’t take any action other than highlighting the ads it would have caught? In Implio, the default action is to automatically accept any ads that don’t match the filter, but you have full control over what happens to content that matches your rules.
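The action choice above can be sketched as a simple dispatch. This is an illustration only, with hypothetical action names, not Implio's actual API:

```python
from enum import Enum

class Action(Enum):
    REFUSE = "refuse"                 # automatically reject matching ads
    MANUAL_REVIEW = "manual_review"   # queue matching ads for a moderator
    NO_ACTION = "no_action"           # test filter: only highlight matches

def apply_action(ad: dict, matched: bool, action: Action) -> str:
    """Return the moderation outcome for one ad (illustrative only)."""
    if not matched:
        return "accepted"             # default: accept non-matching ads
    if action is Action.REFUSE:
        return "refused"
    if action is Action.MANUAL_REVIEW:
        return "queued_for_review"
    return "accepted_with_highlight"  # test filter just flags the match

print(apply_action({"text": "example ad"}, True, Action.MANUAL_REVIEW))
```

Starting with a test filter (no action) is the safest choice while you are still tuning your keyword list, since it lets you inspect matches without affecting live content.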

4. List:
Create a list of all drug-related keywords (cocaine, heroin, cannabis, etc.). Make sure you also include any slang words your users are likely to use for drugs.

5. Rule:
Now it is time to set up your rule. Make sure that the rule pulls from the list and that you add exceptions to avoid false positives. For example, in step 2 we discovered that selling cannabis seeds is okay. As such, we need to make sure our filter excludes “cannabis” + “seeds”.
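The list-plus-exceptions logic from steps 4 and 5 can be sketched in plain Python. This is an illustration of the matching logic, not Implio's rule syntax; the keyword list and exception pairs are example values:

```python
import re

# Example keyword list and exception pairs (illustrative, not exhaustive).
DRUG_KEYWORDS = ["cocaine", "heroin", "cannabis"]
EXCEPTIONS = [("cannabis", "seeds")]  # "cannabis seeds" is legal to sell

def matches_filter(text: str) -> bool:
    """Return True if the ad text should be flagged by the drugs filter."""
    lowered = text.lower()
    for keyword in DRUG_KEYWORDS:
        # Whole-word match, so e.g. "scannabis" is not flagged.
        if re.search(rf"\b{re.escape(keyword)}\b", lowered):
            # Skip the match if an exception applies, e.g. "cannabis seeds".
            if any(kw == keyword and
                   re.search(rf"\b{re.escape(kw)}\s+{re.escape(ctx)}\b", lowered)
                   for kw, ctx in EXCEPTIONS):
                continue
            return True
    return False

print(matches_filter("Selling pure cocaine"))     # True
print(matches_filter("Cannabis seeds for sale"))  # False
```

Note how the exception is scoped to the specific phrase “cannabis seeds” rather than white-listing the word “seeds” everywhere, which is exactly the blanket-exception pitfall step 7 warns about.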

6. QA:
Once your filter is set up with the list and all relevant exclusions, it is time for the first quality check. Upload your data to Implio and review the matches. Are you getting any false positives?

7. Exceptions:
For every false positive, add an exception (also called white-listing specific content). Think your exceptions through so you don’t create a blanket white-list that lets unwanted content through.

8. Rinse & repeat:
Once you have added new exceptions, run your data through again to quality-check your updated filter. Repeat steps 6-8 as many times as needed to reach your target quality rate. At Besedo we aim for a minimum of 95% accuracy, and for most of our filters we reach higher.
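The quality check in the loop above boils down to comparing the filter's decisions against a manually reviewed sample. A minimal sketch, assuming you have parallel lists of filter decisions and ground-truth labels:

```python
def filter_accuracy(decisions: list[bool], ground_truth: list[bool]) -> float:
    """Fraction of ads where the filter's decision matches the manual
    review (True = flagged / should be flagged). Illustrative only."""
    correct = sum(d == g for d, g in zip(decisions, ground_truth))
    return correct / len(decisions)

# Example QA sample: five ads, one false positive.
decisions    = [True, True, False, False, True]
ground_truth = [True, False, False, False, True]
print(f"{filter_accuracy(decisions, ground_truth):.0%}")  # 80%
```

If the result falls below your target (e.g. the 95% minimum mentioned above), add exceptions and rerun until it clears the bar.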

Even though you have the option to refuse content automatically, we do not recommend it unless your filter reaches 100% accuracy. (That level is quite rare, but an example of a filter that could reach it would be an IP-related rule blocking posts from certain IPs.)

If you are not 100% certain that every ad matched by a refusal filter should be refused, send matches for manual review instead. Otherwise, you risk ruining your user experience.

If you follow Kevin’s 8 steps, you should be well equipped to create your own accurate filters. If you want to improve your filter management skills even further, we offer training sessions where our expert filter managers teach you all about regular expressions and rule crafting through step-by-step guides and exercises.

Learn more about filter training
