Nonprofit artificial intelligence research company OpenAI Inc. today announced a major update to its free content moderation tool, which is now available to all developers.
The updated Moderation endpoint gives developers access, through an application programming interface, to transformer-based generative pre-trained classifiers, AI models that OpenAI has trained to detect unwanted content in apps, the company said in a blog post.
When it receives a text input, the Moderation endpoint analyzes it to determine whether it contains anything that needs to be filtered, such as sexual content, hateful or violent language, or messages that encourage self-harm. It flags any content prohibited by OpenAI's content policy so that it can be blocked.
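As a rough sketch of how a developer might work with the endpoint's output: a request is sent to OpenAI's public `/v1/moderations` API with a text input, and the response reports an overall `flagged` verdict plus per-category results. The helper function and sample response below are illustrative assumptions for this article, not code taken from OpenAI's documentation.

```python
import json

# A sample response in the general shape returned by
# POST https://api.openai.com/v1/moderations (sent with an
# Authorization: Bearer <API key> header and {"input": "..."} body).
# The values here are illustrative, not real API output.
SAMPLE_RESPONSE = json.loads("""
{
  "id": "modr-example",
  "model": "text-moderation-latest",
  "results": [
    {
      "flagged": true,
      "categories": {
        "hate": false,
        "self-harm": false,
        "sexual": false,
        "violence": true
      },
      "category_scores": {
        "hate": 0.01,
        "self-harm": 0.0,
        "sexual": 0.0,
        "violence": 0.92
      }
    }
  ]
}
""")

def flagged_categories(response: dict) -> list:
    """Return the category names the endpoint flagged for the first input."""
    result = response["results"][0]
    if not result["flagged"]:
        return []
    return [name for name, hit in result["categories"].items() if hit]

print(flagged_categories(SAMPLE_RESPONSE))  # -> ['violence']
```

An app would typically block or filter the input whenever the returned list is non-empty, or apply finer-grained rules per category.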
The enhanced version of the Moderation endpoint was designed to be fast, accurate, and powerful across multiple types of applications, including AI-powered chatbots, messaging systems, and social media sites. Importantly, OpenAI said it greatly reduces the chances of an AI model “saying” the wrong thing. This means that AI can be used in more sensitive contexts, such as educational applications, where people may have previously had reservations about deploying the technology.
The Moderation endpoint is free when used with content generated by the OpenAI API. For example, Theai Inc., the company behind Inworld AI, uses OpenAI's tools to let developers create AI-powered virtual characters for the metaverse, virtual worlds and VR games. Inworld relies on the Moderation endpoint to make sure these characters stay “on script” and don't start talking about anything untoward. This allows the company to focus on creating memorable characters rather than worrying about what those characters are saying.
In addition to moderating bots, the Moderation endpoint can also block harmful content that is generated not by OpenAI's APIs but by humans. The anonymous messaging platform NGL, which gives young people a place to share their feelings and opinions, uses OpenAI's tool to detect hate speech and bullying. NGL said the Moderation endpoint is able to generalize to the latest slang, allowing it to keep up with the evolving language used by teenagers, for example.
For non-API traffic, use of the Moderation endpoint is subject to charges.
OpenAI said developers can get started with the Moderation endpoint by checking its documentation. The company has also published a paper detailing the endpoint's training process and performance, along with an evaluation dataset that it says will hopefully inspire further research into AI-powered moderation.