VirtueGuard-Image Setup Guide

Introduction

VirtueGuard-Image is our image guardrail model. It identifies and flags potentially harmful or risky content in images while remaining fast and efficient. It outperforms leading industry solutions, including GPT-4o and the Azure Moderation API, on key metrics across existing harmful and unsafe image benchmarks.


Risk Categories

VirtueGuard-Image provides guardrails for the harmful and unsafe content categories listed below. The set of categories can be further customized:

| Category | Description |
| --- | --- |
| Hate_Humiliation_Harassment | Content promoting hatred, discrimination, bullying, or harassment |
| Violence_Harm_Cruelty | Physical violence, gore, injury, or acts of cruelty |
| Sexual | Inappropriate sexual content, nudity, or suggestive material |
| Criminal_Planning | Content related to illegal activities or criminal operations |
| Weapons_Substance_Abuse | Weapons, drug paraphernalia, or substance abuse content |
| Self_Harm | Content depicting or promoting self-injury or suicide |
| Animal_Cruelty | Content showing harm or abuse to animals |
| Disasters_Emergencies | Natural disasters, accidents, or emergency situations |
| Political | Sensitive political content, propaganda, or extremist messaging |

API Integration

Authentication and Endpoint

Every API request must include an API key in the request headers, as shown below. Send requests to the following endpoint:

Endpoint: https://api.virtueai.io/api/imagemoderation

Method: POST

Headers:

  • Content-Type: application/json
  • Authorization: Bearer your_api_key_here

API Request Example

import base64

import requests

# Fetch a sample image from the web (a landscape photo of a tree)
image_content = requests.get("https://www.gstatic.com/webp/gallery/4.sm.jpg", timeout=30).content

# Alternatively, read the image from a local file
# with open("test.jpg", "rb") as image_file:
#     image_content = image_file.read()

# Base64-encode the image bytes so they can be sent in the JSON payload
image_data = base64.b64encode(image_content).decode("utf-8")

# Prepare the request
payload = {
    "input": image_data
}
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer your_api_key_here"
}

# Send the request and print the moderation result
response = requests.post(
    "https://api.virtueai.io/api/imagemoderation",
    json=payload,
    headers=headers,
    timeout=30,
)
response.raise_for_status()  # raise an exception on HTTP error status codes
print(response.json())
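
For repeated calls, the request preparation above can be factored into small helpers. The following is a minimal sketch, not part of an official client: the function names `encode_image_file` and `build_request` and the environment variable `VIRTUEAI_API_KEY` are hypothetical choices made for illustration. Only the endpoint, the `input` field, and the two headers come from the example above.

```python
import base64
import os

ENDPOINT = "https://api.virtueai.io/api/imagemoderation"


def encode_image_file(path):
    """Read a local image file and return its bytes as a base64 string."""
    with open(path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


def build_request(path, api_key=None):
    """Return (payload, headers) for a moderation request.

    Assumption: when api_key is not passed, it is read from the
    hypothetical environment variable VIRTUEAI_API_KEY.
    """
    if api_key is None:
        api_key = os.environ["VIRTUEAI_API_KEY"]
    payload = {"input": encode_image_file(path)}
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return payload, headers


# Usage (sends a real request, so it is left commented out here):
# import requests
# payload, headers = build_request("test.jpg")
# response = requests.post(ENDPOINT, json=payload, headers=headers, timeout=30)
# print(response.json())
```

Reading the key from the environment keeps it out of source files; any other secret store would work equally well.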

Output Format

The API returns a JSON response with the following structure:

{
  "results": {
    "Hate_Humiliation_Harassment": false,
    "Violence_Harm_Cruelty": false,
    "Sexual": false,
    "Criminal_Planning": false,
    "Weapons_Substance_Abuse": false,
    "Self_Harm": false,
    "Animal_Cruelty": false,
    "Disasters_Emergencies": false,
    "Political": false
  },
  "reason": "No harmful content detected.",
  "flag": false
}

The response is a JSON object with the following properties:

results (dict)

An object containing a boolean flag for each risk category. Each key is the name of a risk category, and its value is:

  • true: The specified risk was detected in the image
  • false: The specified risk was not detected in the image
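
To act on individual categories, the `results` object can be reduced to the list of category names whose value is `true`. A minimal sketch, assuming the response has already been parsed into a Python dict (for example with `response.json()`); the helper name `flagged_categories` is hypothetical:

```python
def flagged_categories(response):
    """Return the names of the risk categories reported as detected."""
    results = response.get("results", {})
    # Keep only categories whose value is exactly True.
    return [name for name, detected in results.items() if detected is True]


# Hand-written sample in the documented response shape (not real API output).
sample = {
    "results": {"Sexual": False, "Violence_Harm_Cruelty": True, "Political": False},
    "flag": True,
}
print(flagged_categories(sample))  # ['Violence_Harm_Cruelty']
```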

reason (str)

A string field providing a human-readable explanation of the moderation decision:

  • When no risks are detected: Returns a message like "No harmful content detected."
  • When risks are detected: Provides a detailed explanation of which categories triggered the flag and why

flag (bool)

A single boolean that summarizes the result across all risk categories:

  • true: At least one risk category in the results object was detected
  • false: No risks were detected in any category
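
Because `flag` is described as true when at least one category in `results` is true, a client can use it as a single gate for accepting or rejecting an image. The sketch below shows one way to do that, assuming a parsed response dict. `should_block` is a hypothetical helper, and also checking `results` when `flag` is absent or false is a defensive assumption rather than documented behavior.

```python
def should_block(response):
    """Return (blocked, reason) for a parsed moderation response."""
    results = response.get("results", {})
    any_category = any(value is True for value in results.values())
    flag = response.get("flag")
    # Conservative assumption: block if either the top-level flag or any
    # individual category is true, so a missing flag does not let content pass.
    blocked = (flag is True) or any_category
    return blocked, response.get("reason", "")


# Hand-written sample in the documented response shape (not real API output).
safe = {
    "results": {"Sexual": False},
    "reason": "No harmful content detected.",
    "flag": False,
}
print(should_block(safe))  # (False, 'No harmful content detected.')
```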