Feed your AI content moderation algorithms the right data if you want to continuously keep them smart enough to improve — by themselves.
Think of the term ‘Artificial Intelligence’, and you’d be forgiven for conjuring up mental images of servant robots, android sentience, and an eventual machine uprising. While such scenarios are still a way off (!) here in 2017, AI continues to gain significant ground — and is now sufficiently sophisticated to begin its formal education. But, as its (current) masters, AI’s schooling’s on us. It can only ever be as good as the information we give it — and giving it the wrong info can only do more harm than good.
If you still feel apprehensive about a machine uprising, you might first want to learn the basics of AI’s potential before moving into your doomsday bunker.
Upskilling Your AI
Although chatbots have reportedly passed versions of the Turing Test, the most advanced AI that most of us have access to is the kind that can problem-solve. In its simplest form, artificial intelligence is an algorithm: a computer program that can identify patterns in large data sets and learn what ‘good’ and ‘bad’ look like.
Yup, that’s right, it can differentiate. We’re not talking about ethical and moral judgments yet; empowering an AI to differentiate means allowing it to automate complex and monotonous tasks skillfully. Given richer, more diverse data, it can incrementally improve the processes it’s given: optimizing efficiency, reducing error margins, and maintaining accuracy. This is essentially what machine learning brings to the moderation process.
However, the problem is that machines don’t learn independently. Developing and maintaining algorithms that perform consistently takes time, hard work, dedication, and expense. And it takes clean, relevant data to train them properly. When you consider how critical AI accuracy has become to a number of different industries (including online marketplaces, healthcare, and finance), getting this right is vital.
Avoiding Data Entropy
Maintaining the quality of your AI is a bit like gardening. What will happen if we don’t keep our algorithms fed and watered correctly? In a word: ‘entropy’, the idea that the amount of order in a given system will deteriorate unless external energy is applied. Or in simpler terms: your data quality will drop unless you do what’s needed to maintain its accuracy.
But this isn’t all theoretical: there are numerous practical ways to ensure your algorithms blossom. Firstly, give your AI new data and retire old data; if the same information is continually fed in, the algorithm will eventually stop learning and improving. Cleaner, more accurate data leads to better results too, as do new data processing techniques and statistical models. Filters matter as well: the kind that detect language, personal information, and preferences, along with industry-specific filters, among many other criteria.
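As a rough illustration of refreshing training data and applying a filter, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than a description of any real system: the `TrainingWindow` class, the `contains_personal_info` helper, and the deliberately simple email pattern are all hypothetical. It keeps only the most recent examples and skips entries that appear to contain an email address.

```python
import re
from collections import deque

# Hypothetical, deliberately simple pattern for spotting an email address.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def contains_personal_info(text: str) -> bool:
    """Return True if the text appears to contain an email address."""
    return EMAIL_RE.search(text) is not None


class TrainingWindow:
    """Keeps only the most recent `max_size` labelled examples."""

    def __init__(self, max_size: int) -> None:
        # A deque with maxlen silently drops the oldest item when full.
        self._examples = deque(maxlen=max_size)

    def add(self, text: str, label: str) -> bool:
        """Store an example unless the filter rejects it; return True if stored."""
        if contains_personal_info(text):
            return False
        self._examples.append((text, label))
        return True

    def examples(self) -> list:
        """Return the current training examples, oldest first."""
        return list(self._examples)
```

With `max_size=2`, adding a third accepted example pushes the oldest one out, so the set the algorithm learns from keeps turning over instead of going stale.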
Basically, the more specific you are about the kind of output you want from your AI, and the more effort you put into it, the better it’ll become.
Think of AI like a garden in this respect: to keep it manageable, you need to remove weeds, mow the lawn, prune flowers, pare back branches, gather leaves — all that time-consuming rigmarole — otherwise, it becomes unkempt and difficult to maintain.
As you’ve probably gathered, creating a reliable and effective machine learning algorithm is no simple feat, even for tech companies employing data scientists. And if you have data scientists working for you, you’ll surely want them to work on bigger proprietary projects, particularly when you can easily outsource moderation work.
AI is Content Moderation’s New Best Friend
At Besedo, we’re all about content moderation, which at its core involves understanding how our clients build trust and credibility across their businesses — crucial for online marketplaces, dating sites and shared services.
Admittedly, moderation is a job a living, breathing person can do. In fact, we still offer manual moderation for some clients. But automation is a cost-effective addition to the manual approach, especially for larger businesses with big incoming volumes. It’s also very helpful for businesses that are sensitive to volume spikes. Rather than take on large cohorts of temporary support staff to match seasonal demand, and then face the unenviable task of onboarding them all to keep pace with content moderation, online marketplaces can make use of automation, allowing them to easily scale up and down with no real impact on day-to-day business.
Considering that most platform users expect to see their ads live as soon as they hit ‘submit’, for bigger businesses, there’s no contest: nothing offers the same speed, scale, or consistent accuracy as automated software — the kind that can cross-reference, check, and monitor thousands of previous data entries simultaneously. No one single human has that type of mental recall.
That said, because we offer both manual and automated moderation services, our AI offering continues to expand and evolve. How? Because using manual moderation as a final line of defence gives the algorithm new information to learn from.
For example: say an uploaded data entry contained a mild swear word. You can teach an AI to screen for swear words, but what if the rest of the entry fulfills the publishable criteria? The AI might not be confident enough to make a decision. For ambiguous instances like this, the AI can forward the content to a manual moderation team, so the entry is reviewed and eventually approved or rejected by a human. That decision can then be fed back to the AI, so next time it has a better idea of the context in which the ‘offending term’ was used and can make a judgment on similar situations in the future.
In this way, having manual and automated moderation work together can lead to a better, more accurate solution. As mentioned before, the cleaner (as in less conflicting) the data, the more accurate the result.
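The hand-off between automation and manual review described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Router` class, its threshold values, and the idea of a single ‘publishable’ score between 0 and 1 are all hypothetical, not a description of Besedo’s system.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Routes content by model confidence; thresholds are assumed values."""

    approve_above: float = 0.9   # assumed: confident the entry is publishable
    reject_below: float = 0.1    # assumed: confident the entry is not
    feedback: list = field(default_factory=list)

    def route(self, publishable_score: float) -> str:
        """Decide automatically when confident, otherwise defer to a human."""
        if publishable_score >= self.approve_above:
            return "approve"
        if publishable_score <= self.reject_below:
            return "reject"
        return "manual_review"

    def record_human_decision(self, text: str, decision: str) -> None:
        """Keep the moderator's verdict as a new labelled training example."""
        self.feedback.append((text, decision))
```

An entry scoring 0.5, such as an otherwise fine ad containing a mild swear word, would go to `manual_review`; the moderator’s verdict is then stored in `feedback`, ready to be used the next time the model is retrained.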
Efficiency = Stability + Accuracy
To be wholly effective, machine learning algorithms must be specifically configured to take on a single task at scale. It’s not a case of putting your AI on a ‘finger in the air’ trajectory and hoping it’ll improve.
To ensure your algorithms stay accurate, they must be continuously updated with new data. Better algorithms mean higher accuracy, which in effect increases the amount of work that can be automated. For online marketplaces this is now standard practice: part of making the content as safe and as relevant as possible.
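One way to picture the link between accuracy and automation is to measure how much content the model can decide on without human help. The sketch below is a simplified assumption, not an established metric: the `automation_rate` function and its threshold defaults are hypothetical, and it treats a score near 0 or 1 as a confident decision.

```python
def automation_rate(scores, approve_above=0.9, reject_below=0.1):
    """Return the fraction of items decided without human review.

    `scores` are assumed model confidences between 0 and 1 that an
    item is publishable; the thresholds are illustrative defaults.
    """
    if not scores:
        return 0.0
    automated = sum(
        1 for s in scores if s >= approve_above or s <= reject_below
    )
    return automated / len(scores)
```

As a model improves, more of its scores tend to move toward the confident extremes, so a larger share of incoming content clears the thresholds and less of it needs a manual check.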
Ultimately, AI is aware but not self-aware, and only aware of the things we input. Machines may be learning fast, but they’re only as good as the parameters we set for them. For now…
This is Besedo
Global, full-service leader in content moderation
We provide automated and manual moderation for online marketplaces, online dating, sharing economy, gaming, communities and social media.