Although the rapid development of the Internet and social media has contributed significantly to human connection, it has undeniably also made toxic behavior more common online. As a result, toxic comment classification has been an active research topic in the Machine Learning field for the past few years. Recently, one of our clients asked us to teach Ebbot to detect toxic messages in conversations. Thanks to this special request, we got a chance to work on one of the most difficult topics in the Natural Language Processing (NLP) field. And yes, we could not be more excited! 🥳

Challenges with collecting a dataset

To implement this classification task, we have to train Ebbot on a dataset of text labeled for toxicity. Although large labeled training datasets exist, they are not available in Swedish. Machine-translating them is not a good approach either, since they contain a lot of slang that machines cannot translate accurately.

Ebbot's solution to toxic message detection

After some research, we found an open-source yet highly accurate pretrained model, built by Laura Hanu at Unitary. In addition to the original version, which only supports English and was trained on Wikipedia comments, Unitary also provides a multilingual model trained on 7 languages (English, French, Spanish, Italian, Portuguese, Turkish and Russian).

At the same time, we also found a machine translation model by the Language Technology Research Group at the University of Helsinki. Combining the two lets us work around the lack of a Swedish dataset and meet our client's request. When Ebbot receives input text in Swedish, it first translates the text to English and then runs it through the toxicity classifier. The output is a score for each of six categories of toxic messages: toxicity, severe toxicity, obscene, threat, insult and identity hate. With this method, we can not only decide whether a message is toxic, but also see which type of inappropriate behavior it contains.
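To make the translate-then-classify flow concrete, here is a minimal sketch in Python. It is not Ebbot's actual implementation: the function names, the `ToxicityResult` type and the 0.5 threshold are all assumptions made for illustration, and the translator and classifier are tiny stand-ins so that the example runs without downloading any models. In a real setup, the two callables would wrap the Helsinki translation model and Unitary's classifier, whose output keys may differ from the category names used here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Category names follow the list in the text above; a real classifier
# may use different keys, so treat these as an assumption.
CATEGORIES = (
    "toxicity", "severe_toxicity", "obscene",
    "threat", "insult", "identity_hate",
)

Translator = Callable[[str], str]
Classifier = Callable[[str], Dict[str, float]]


@dataclass
class ToxicityResult:
    translated: str           # English text that was actually classified
    scores: Dict[str, float]  # one score per category
    flagged: List[str]        # categories at or above the threshold

    @property
    def is_toxic(self) -> bool:
        return bool(self.flagged)


def detect_toxicity(text: str, translate: Translator, classify: Classifier,
                    threshold: float = 0.5) -> ToxicityResult:
    """Translate Swedish input to English, then score it per category."""
    translated = translate(text)
    scores = classify(translated)
    flagged = [c for c in CATEGORIES if scores.get(c, 0.0) >= threshold]
    return ToxicityResult(translated, scores, flagged)


# --- Hypothetical stand-ins so the sketch runs without any models ---

def fake_translate(text: str) -> str:
    # A real translator would call a Swedish-to-English model here.
    lookup = {"du är dum": "you are stupid", "hej": "hello"}
    return lookup.get(text.lower(), text)


def fake_classify(text: str) -> Dict[str, float]:
    # A real classifier would return model scores; these are made up.
    scores = {c: 0.01 for c in CATEGORIES}
    if "stupid" in text.lower():
        scores["toxicity"] = 0.93
        scores["insult"] = 0.88
    return scores


result = detect_toxicity("Du är dum", fake_translate, fake_classify)
print(result.is_toxic, result.flagged)  # True ['toxicity', 'insult']
```

Passing the translator and classifier in as plain functions keeps the two steps independent, so either model could be swapped out later (for example, if a Swedish toxicity dataset becomes available) without touching the rest of the flow. The threshold is a tuning knob, not a recommended value.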

[Image: detecting toxic messages]

We are aware that this is not the best solution when it comes to solving Machine Learning/Artificial Intelligence problems. Nevertheless, when no training dataset is available, we consider it one of the quickest and easiest ways to tackle multilingual NLP challenges. We are currently testing the model and gathering user feedback to improve the app's performance. Please feel free to contact us if you have any inquiries about our bot-builder product or special NLP integrations 🙌 We are usually very responsive 😉