Give Your Algorithms A Daily Dose Of Data

Feed your AI content moderation algorithms the right data if you want them to stay smart enough to keep improving by themselves.

Think of the term ‘Artificial Intelligence’, and you’d be forgiven for conjuring up mental images of servant robots, android sentience, and an eventual machine uprising. While such scenarios are still a way off (!) here in 2017, AI continues to gain significant ground — and is now sufficiently sophisticated to begin its formal education. But, as its (current) masters, AI’s schooling’s on us. It can only ever be as good as the information we give it — and giving it the wrong info can only do more harm than good.

If you still feel apprehensive about a machine uprising, you might first want to learn the basic concepts of AI’s potential before moving into your doomsday bunker.

Upskilling Your AI

Although some AI systems have reportedly passed versions of the Turing Test, the most advanced AI that most of us have access to is the kind that can solve problems. In its simplest form, artificial intelligence is an algorithm: a computer program that can identify patterns in large data sets and learn what ‘good’ and ‘bad’ look like.

Yup, that’s right, it can differentiate. We’re not talking about ethical and moral judgments yet; empowering an AI to differentiate means allowing it to automate complex and monotonous tasks skillfully. Given richer, more diverse data, it can incrementally improve the processes it’s given: optimizing efficiency, reducing error margins, and maintaining accuracy. This is essentially what machine learning brings to the moderation process.

However, the problem is that machines don’t learn independently. Developing and maintaining algorithms that perform consistently takes time, hard work, dedication, and expense. And it takes clean, relevant data to train them properly. When you consider how critical AI accuracy has become to a number of different industries, including online marketplaces, healthcare, and finance, getting this right is vital.

Avoiding Data Entropy

Maintaining a high quality level for your AI is a bit like gardening. What will happen if we don’t keep our algorithms fed and watered correctly? In a word: ‘entropy’, the idea that the amount of order in a given system will deteriorate unless external energy is applied. Or in simpler terms: your data quality will drop unless you do what’s needed to maintain its accuracy.

But this isn’t all theoretical: there are numerous practical ways to ensure your algorithms blossom. Firstly, give your AI new data and replace old data. The algorithm will eventually stop learning and improving if the same information is continually fed in. Cleaner, more accurate data leads to better results too, as do new data processing techniques and statistical models. Filters are important as well: the kind that detect language, personal information, and preferences, as well as industry-specific filters, among many other criteria.
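To make the idea of replacing old data and filtering for personal information concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `RollingTrainingSet` class, the `contains_email` helper, the sample listings, and the rough email pattern are hypothetical and not a description of any particular product.

```python
import re
from collections import deque

# Hypothetical pattern: a rough email matcher, not a complete
# personal-information detector.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def contains_email(text: str) -> bool:
    """Return True if the text appears to contain an email address."""
    return EMAIL_RE.search(text) is not None


class RollingTrainingSet:
    """Keeps only the most recent `max_size` labelled examples."""

    def __init__(self, max_size: int):
        self._examples = deque(maxlen=max_size)

    def add(self, text: str, label: str) -> None:
        # Once the deque is full, appending drops the oldest example.
        self._examples.append((text, label))

    def examples(self) -> list:
        return list(self._examples)


# Made-up sample data: four entries fed into a window of three.
training = RollingTrainingSet(max_size=3)
for text, label in [
    ("old bike for sale", "approve"),
    ("cheap watches!!!", "reject"),
    ("sofa, good condition", "approve"),
    ("contact me at seller@example.com", "reject"),
]:
    training.add(text, label)
```

After the fourth entry arrives, the oldest one has been dropped, so the training set always reflects recent content. A real system would likely weight or archive old data rather than simply discard it.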

Basically, the more specific you are about the kind of output you want from your AI, and the more effort you put into it, the better it’ll become.

Think of AI like a garden in this respect: to keep it manageable, you need to remove weeds, mow the lawn, prune flowers, pare back branches, gather leaves — all that time-consuming rigmarole — otherwise, it becomes unkempt and difficult to maintain.

As you’ve probably gathered, creating a reliable and effective machine learning algorithm is no simple feat, even for tech companies employing data scientists. And if you have data scientists working for you, you’ll surely want them to work on bigger proprietary stuff, particularly when you can easily outsource moderation work.

AI is Content Moderation’s New Best Friend

At Besedo, we’re all about content moderation, which at its core involves understanding how our clients build trust and credibility across their businesses — crucial for online marketplaces, dating sites and shared services.

Admittedly, moderation is a job a living, breathing person can do. In fact, we still offer manual moderation for some clients. But automation is a cost-effective addition to the manual approach, especially for larger businesses with big incoming volumes. It’s also very helpful for businesses that are sensitive to volume spikes. Rather than take on large cohorts of temporary support staff to match seasonal demand, and then face the unenviable task of onboarding them all to keep pace with content moderation, online marketplaces can make use of automation, allowing them to easily scale up and down with no real impact on day-to-day business.

Considering that most platform users expect to see their ads live as soon as they hit ‘submit’, for bigger businesses, there’s no contest: nothing offers the same speed, scale, or consistent accuracy as automated software — the kind that can cross-reference, check, and monitor thousands of previous data entries simultaneously. No one single human has that type of mental recall.

That said, because we offer both manual and automated moderation services, our AI offering continues to expand and evolve. How? Because using manual moderation as a final line of defence gives the algorithm new information to learn from.

For example: say an uploaded data entry contained a mild swear word. You can teach an AI to screen for swear words, but what if everything else in the entry fulfills the publishable criteria? The AI might not feel confident enough to make a decision. For ambiguous instances like this, the AI can forward the content to a manual moderation team, so the entry is reviewed, and eventually approved or rejected, by a human. This decision can then be fed back to the AI so it learns and next time has a better idea of the context in which the ‘offending term’ was used. Then it can make a judgment on similar situations in the future.
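The escalation loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ModerationQueue` class, the `p_ok` score (the model's estimated probability that an entry is publishable), and both threshold values are hypothetical, and real thresholds would need tuning for each platform.

```python
from dataclasses import dataclass, field

# Assumed thresholds for illustration only.
APPROVE_ABOVE = 0.90
REJECT_BELOW = 0.10


@dataclass
class ModerationQueue:
    pending_review: list = field(default_factory=list)
    training_feedback: list = field(default_factory=list)

    def route(self, text: str, p_ok: float) -> str:
        """Decide automatically when the model is confident, else escalate."""
        if p_ok >= APPROVE_ABOVE:
            return "approve"
        if p_ok <= REJECT_BELOW:
            return "reject"
        # Ambiguous case: hand the entry to a human moderator.
        self.pending_review.append(text)
        return "manual_review"

    def record_human_decision(self, text: str, decision: str) -> None:
        """Store the moderator's verdict as a new labelled training example."""
        self.pending_review.remove(text)
        self.training_feedback.append((text, decision))


queue = ModerationQueue()
entry = "Damn good sofa, barely used"
outcome = queue.route(entry, p_ok=0.55)        # model is unsure
queue.record_human_decision(entry, "approve")  # human settles it
```

The entries collected in `training_feedback` are the new information the algorithm learns from at its next retraining, which is how the manual team's decisions end up improving the automated ones.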

In this way, having manual and automated moderation work together can lead to a better, more accurate solution. As mentioned before, the cleaner (as in less conflicting) the data, the more accurate the result.

Efficiency = Stability + Accuracy

To be wholly effective, machine learning algorithms must be specifically configured to take on a single task at scale. It’s not a case of putting your AI on a ‘finger in the air’ trajectory and hoping it’ll improve.

To ensure your algorithms stay accurate, they must be continuously updated with new data. Better algorithms mean higher accuracy, which in effect increases the amount of work that can be automated. For online marketplaces this is now standard practice: part of making the content as safe and as relevant as possible.
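One way to picture the link between accuracy and automation is to measure the share of entries a model is confident enough to decide on its own. The sketch below is a simplified illustration: the `automation_rate` function, the 0.9 threshold, and the confidence scores are all made up for the example.

```python
def automation_rate(confidences, threshold=0.9):
    """Share of items whose confidence is high enough to decide automatically."""
    if not confidences:
        return 0.0
    automated = sum(1 for c in confidences if c >= threshold)
    return automated / len(confidences)


# Made-up confidence scores for the same five entries,
# before and after retraining on fresh data.
before = [0.95, 0.60, 0.72, 0.91, 0.55]
after = [0.97, 0.92, 0.80, 0.96, 0.93]

rate_before = automation_rate(before)  # 2 of 5 entries clear the threshold
rate_after = automation_rate(after)    # 4 of 5 entries clear the threshold
```

In this toy case, the retrained model handles four of five entries by itself instead of two, leaving fewer for the manual team. Confidence only translates into safe automation if the model's scores are actually reliable, which is why the quality of the training data matters.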

Ultimately AI is aware but not self-aware, and only aware of the things we input. Machines may be learning fast, but they’re only as good as the parameters we set for them. For now…
