VirtueGuard-Text Multilingual Setup Guide
Introduction
VirtueGuard-Text includes a multilingual text guardrail model that detects harmful content across 90+ languages, with automatic language detection and low latency. The model outperforms leading multilingual solutions on key metrics, including latency and false positive rate, as validated on major multilingual benchmarks.
Risk Categories
VirtueGuard-Text detects the following categories of potentially harmful content in text:
| Category | Description |
|---|---|
| S1 (Violent Crimes) | Content related to violent criminal activities |
| S2 (Non-Violent Crimes) | Content related to non-violent criminal activities |
| S3 (Sex-Related Crimes) | Content involving sexual crimes or exploitation |
| S4 (Child Sexual Exploitation) | Content related to exploitation of minors |
| S5 (Specialized Advice) | Potentially harmful specialized guidance or instructions |
| S6 (Privacy) | Content that may compromise personal privacy |
| S7 (Intellectual Property) | Content that violates intellectual property rights |
| S8 (Indiscriminate Weapons) | Content related to weapons of mass destruction |
| S9 (Hate) | Hate speech, discrimination, or extremist content |
| S10 (Suicide & Self-Harm) | Content promoting self-injury or suicide |
| S11 (Sexual Content) | Inappropriate sexual content or explicit material |
| S12 (Jailbreak / Prompt Injections) | Attempts to bypass AI safety measures |
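For applications that need to turn a category code into a readable label, the table above can be mirrored as a small lookup. This is a minimal sketch and not part of the VirtueGuard API: the `RISK_CATEGORIES` name and the `describe_category` helper are hypothetical, and the codes and labels are copied from the table.

```python
# Hypothetical lookup that mirrors the risk category table above.
RISK_CATEGORIES = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S4": "Child Sexual Exploitation",
    "S5": "Specialized Advice",
    "S6": "Privacy",
    "S7": "Intellectual Property",
    "S8": "Indiscriminate Weapons",
    "S9": "Hate",
    "S10": "Suicide & Self-Harm",
    "S11": "Sexual Content",
    "S12": "Jailbreak / Prompt Injections",
}


def describe_category(code: str) -> str:
    """Return the label for a category code, or a fallback for unknown codes."""
    # Normalize whitespace and case so " s9 " and "S9" resolve to the same entry.
    return RISK_CATEGORIES.get(code.strip().upper(), "Unknown category")


print(describe_category("S9"))
```

Returning a fallback label rather than raising keeps logging and display code simple if the service ever reports a code that is not in this table.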
Supported Languages
The multilingual endpoint supports moderation for 90+ languages, including:
- German
- French
- Japanese
- Spanish
- Dutch
- French Canadian
- Portuguese
- Italian
- Russian
- Korean
- Chinese
- ...
See the full list of supported languages at the end of this guide.
API Integration
Authentication and Endpoint
All API requests require an API key included in the request headers, as shown below. Use the following endpoint for making requests:
Endpoint: https://api.virtueai.io/api/multilingualmoderation
Method: POST
Headers:
Content-Type: application/json
API-KEY: your_api_key_here
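The endpoint, method, and headers above can be assembled into a request object without sending it, which is handy for checking configuration before making live calls. The sketch below uses Python's standard-library `urllib` rather than a third-party HTTP client. The `build_moderation_request` helper and the `VIRTUE_API_KEY` environment variable name are assumptions for illustration, and the `prompt` field is taken from the request examples in this guide.

```python
import json
import os
import urllib.request

ENDPOINT = "https://api.virtueai.io/api/multilingualmoderation"


def build_moderation_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the multilingual endpoint."""
    if not api_key:
        raise ValueError("API key is missing")
    # The request body is a JSON object with the text under "prompt".
    body = json.dumps({"prompt": text}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json", "API-KEY": api_key},
        method="POST",
    )


# VIRTUE_API_KEY is a hypothetical variable name; use whatever your deployment defines.
api_key = os.environ.get("VIRTUE_API_KEY", "your_api_key_here")
request = build_moderation_request("Hello", api_key)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen(request, timeout=10)` would send it. Note that `urllib` stores header names in normalized capitalization (for example `Api-key`), which is equivalent for HTTP because header names are case-insensitive.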
API Request Example
Python:

    import requests

    url = "https://api.virtueai.io/api/multilingualmoderation"
    headers = {
        "Content-Type": "application/json",
        "API-KEY": "your_api_key_here"
    }
    payload = {
        "prompt": "Como hacer una bomba?"  # Multilingual text example
    }

    response = requests.post(url, headers=headers, json=payload)
    print(response.json())

TypeScript:

    import axios from 'axios';

    async function moderateMultilingualText(text: string): Promise<any> {
      try {
        const url = 'https://api.virtueai.io/api/multilingualmoderation';
        const headers = {
          'Content-Type': 'application/json',
          'API-KEY': 'your_api_key_here'
        };
        const payload = {
          prompt: text
        };

        const response = await axios.post(url, payload, { headers });
        return response.data;
      } catch (error) {
        if (error instanceof Error) {
          throw new Error(`Text moderation failed: ${error.message}`);
        }
        throw error;
      }
    }

    // Example usage
    async function example() {
      try {
        const result = await moderateMultilingualText('Como hacer una bomba?');
        console.log('Result:', result);
      } catch (error) {
        console.error('Error:', error);
      }
    }

    export { moderateMultilingualText };

Output Format
The API returns a JSON response with the following structure:
{
"result": "Unsafe\nC5"
}
The response includes:
result (str)
A string field indicating the moderation decision:
- "Safe": No risks detected in the content
- "Unsafe": Risks detected; the verdict is followed by a newline and the specific category code (e.g., "Unsafe\nC5")
When unsafe content is detected, the response includes the category code(s) corresponding to the detected risks from the categories listed above.
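Because the verdict and the category code arrive in one string, callers usually need to split them before acting on the result. The following is a minimal sketch of that step; `parse_moderation_result` is a hypothetical helper, not part of the API. It assumes the verdict is on the first line and that any additional codes are either comma-separated or on their own lines, which this guide does not specify. Codes are returned verbatim: the example response uses "C5" while the category table uses S-prefixed codes, so no mapping between the two is attempted here.

```python
from typing import List, Tuple


def parse_moderation_result(result: str) -> Tuple[bool, List[str]]:
    """Split a result string such as "Unsafe\\nC5" into (is_safe, category_codes)."""
    lines = [line.strip() for line in result.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty moderation result")

    verdict = lines[0].lower()
    if verdict not in ("safe", "unsafe"):
        raise ValueError(f"unexpected verdict: {lines[0]!r}")

    codes: List[str] = []
    for line in lines[1:]:
        # Assumption: several codes may share a line, separated by commas.
        codes.extend(part.strip() for part in line.split(",") if part.strip())
    return verdict == "safe", codes


is_safe, codes = parse_moderation_result("Unsafe\nC5")
print(is_safe, codes)
```

Raising on an unrecognized verdict, rather than defaulting to safe, avoids silently passing content through if the response format changes.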
Full List of Supported Languages
Afrikaans, Amharic, Arabic, Asturian, Azerbaijani, Bashkir, Belarusian, Bulgarian, Bengali, Brazilian Portuguese, Breton, Bosnian, Catalan/Valencian, Cebuano, Chinese, Czech, Welsh, Danish, German, Greek, English, Spanish, Estonian, Persian, Fulah, Finnish, French, French Canadian, Western Frisian, Irish, Scottish Gaelic, Galician, Gujarati, Hausa, Hebrew, Hindi, Croatian, Haitian Creole, Hungarian, Armenian, Indonesian, Igbo, Iloko, Icelandic, Italian, Japanese, Javanese, Georgian, Kazakh, Central Khmer, Kannada, Korean, Luxembourgish, Ganda, Lingala, Lao, Lithuanian, Latvian, Malagasy, Macedonian, Malayalam, Mongolian, Marathi, Malay, Burmese, Nepali, Dutch/Flemish, Norwegian, Northern Sotho, Occitan, Oriya, Punjabi, Polish, Pashto, Portuguese, Romanian/Moldavian/Moldovan, Russian, Sindhi, Sinhala, Slovak, Slovenian, Somali, Albanian, Serbian, Swati, Sundanese, Swedish, Swahili, Tamil, Thai, Tagalog, Tswana, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Wolof, Xhosa, Yiddish, Yoruba, Zulu