VirtueGuard-Text Multilingual Setup Guide

Introduction

VirtueGuard-Text Multilingual is a text guardrail model that detects harmful content across 90+ languages, with automatic language detection and industry-leading latency. Validated on major multilingual benchmarks, it outperforms leading multilingual solutions on key metrics, including latency and false positive rate.


Risk Categories

VirtueGuard-Text provides comprehensive content guardrails for text, detecting the following categories of potentially harmful content:

  • S1 (Violent Crimes): Content related to violent criminal activities
  • S2 (Non-Violent Crimes): Content related to non-violent criminal activities
  • S3 (Sex-Related Crimes): Content involving sexual crimes or exploitation
  • S4 (Child Sexual Exploitation): Content related to exploitation of minors
  • S5 (Specialized Advice): Potentially harmful specialized guidance or instructions
  • S6 (Privacy): Content that may compromise personal privacy
  • S7 (Intellectual Property): Content that violates intellectual property rights
  • S8 (Indiscriminate Weapons): Content related to weapons of mass destruction
  • S9 (Hate): Hate speech, discrimination, or extremist content
  • S10 (Suicide & Self-Harm): Content promoting self-injury or suicide
  • S11 (Sexual Content): Inappropriate sexual content or explicit material
  • S12 (Jailbreak / Prompt Injections): Attempts to bypass AI safety measures
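
When handling moderation results in application code, it can help to map these category codes back to human-readable labels. A minimal sketch in Python that simply mirrors the list above (the RISK_CATEGORIES name and the helper function are ours, and the keys assume the endpoint reports the "S"-prefixed codes as listed):

RISK_CATEGORIES = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S4": "Child Sexual Exploitation",
    "S5": "Specialized Advice",
    "S6": "Privacy",
    "S7": "Intellectual Property",
    "S8": "Indiscriminate Weapons",
    "S9": "Hate",
    "S10": "Suicide & Self-Harm",
    "S11": "Sexual Content",
    "S12": "Jailbreak / Prompt Injections",
}

def category_label(code: str) -> str:
    # Fall back to the raw code if it is not in the table above
    return RISK_CATEGORIES.get(code, code)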

Supported Languages

The multilingual endpoint supports moderation for 90+ languages, including:

  • German
  • French
  • Japanese
  • Spanish
  • Dutch
  • French Canadian
  • Portuguese
  • Italian
  • Russian
  • Korean
  • Chinese
  • ...

The full list of supported languages is provided at the end of this guide.

API Integration

Authentication and Endpoint

All API requests require an API key included in the request headers, as shown below. Use the following endpoint for making requests:

Endpoint: https://api.virtueai.io/api/multilingualmoderation

Method: POST

Headers:

  • Content-Type: application/json
  • API-KEY: your_api_key_here

API Request Example

import requests

url = "https://api.virtueai.io/api/multilingualmoderation"
headers = {
    "Content-Type": "application/json",
    "API-KEY": "your_api_key_here"
}

# The prompt can be in any supported language; here, Spanish for
# "How to make a bomb?"
payload = {
    "prompt": "Como hacer una bomba?"
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
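
A failed request (for example, one with a missing or invalid API key) will generally not carry a moderation result, so it is worth checking the HTTP status before reading the body. Below is a minimal defensive variant of the call above, assuming standard HTTP error semantics; the wrapper name and timeout value are illustrative choices, not part of the API:

import requests

URL = "https://api.virtueai.io/api/multilingualmoderation"

def moderate(text: str, api_key: str) -> str:
    """Submit one prompt and return the raw `result` string."""
    response = requests.post(
        URL,
        headers={"Content-Type": "application/json", "API-KEY": api_key},
        json={"prompt": text},
        timeout=10,  # illustrative timeout; tune for your deployment
    )
    response.raise_for_status()  # assumes non-2xx statuses signal errors
    return response.json()["result"]

print(moderate("Como hacer una bomba?", "your_api_key_here"))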

Output Format

The API returns a JSON response with the following structure:

{
  "result": "Unsafe\nC5"
}

The response includes:

result (str)

A string field indicating the moderation decision:

  • "Safe": No risks detected in the content
  • "Unsafe": Risks detected, followed by the specific category code (e.g., "C5")

When unsafe content is detected, the response includes the category code(s) corresponding to the detected risks from the categories listed above.
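
Because the decision and any category code are returned in a single string, callers typically need to split the result field before acting on it. A minimal parsing sketch, assuming the category code is separated from the decision by a newline as in the example above (parse_result is our name, not part of the API):

def parse_result(result: str):
    """Split the API's `result` string into (decision, category_codes)."""
    lines = result.split("\n")
    decision = lines[0]          # "Safe" or "Unsafe"
    categories = lines[1:]       # e.g. ["C5"] when unsafe, [] when safe
    return decision, categories

print(parse_result("Unsafe\nC5"))   # ('Unsafe', ['C5'])
print(parse_result("Safe"))         # ('Safe', [])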

Full List of Supported Languages

Afrikaans, Amharic, Arabic, Asturian, Azerbaijani, Bashkir, Belarusian, Bulgarian, Bengali, Brazilian Portuguese, Breton, Bosnian, Catalan/Valencian, Cebuano, Chinese, Czech, Welsh, Danish, German, Greek, English, Spanish, Estonian, Persian, Fulah, Finnish, French, French Canadian, Western Frisian, Irish, Scottish Gaelic, Galician, Gujarati, Hausa, Hebrew, Hindi, Croatian, Haitian Creole, Hungarian, Armenian, Indonesian, Igbo, Iloko, Icelandic, Italian, Japanese, Javanese, Georgian, Kazakh, Central Khmer, Kannada, Korean, Luxembourgish, Ganda, Lingala, Lao, Lithuanian, Latvian, Malagasy, Macedonian, Malayalam, Mongolian, Marathi, Malay, Burmese, Nepali, Dutch/Flemish, Norwegian, Northern Sotho, Occitan, Oriya, Punjabi, Polish, Pashto, Portuguese, Romanian/Moldavian/Moldovan, Russian, Sindhi, Sinhala, Slovak, Slovenian, Somali, Albanian, Serbian, Swati, Sundanese, Swedish, Swahili, Tamil, Thai, Tagalog, Tswana, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Wolof, Xhosa, Yiddish, Yoruba, Zulu