VirtueGuard-Image Setup Guide

Introduction

VirtueGuard-Image is an image guardrail model that identifies and flags potentially harmful or risky content in images while maintaining high performance and efficiency. It surpasses leading industry solutions, including GPT-4o and the Azure Moderation API, across key metrics on existing harmful and unsafe image benchmarks.

Risk Categories

VirtueGuard-Image provides image guardrails for the harmful and unsafe content categories listed below. The set of categories can be further customized:

Category	Description
Hate_Humiliation_Harassment	Content promoting hatred, discrimination, bullying, or harassment
Violence_Harm_Cruelty	Physical violence, gore, injury, or acts of cruelty
Sexual	Inappropriate sexual content, nudity, or suggestive material
Criminal_Planning	Content related to illegal activities or criminal operations
Weapons_Substance_Abuse	Weapons, drug paraphernalia, or substance abuse content
Self_Harm	Content depicting or promoting self-injury or suicide
Animal_Cruelty	Content showing harm or abuse to animals
Disasters_Emergencies	Natural disasters, accidents, or emergency situations
Political	Sensitive political content, propaganda, or extremist messaging
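
Client code that consumes these categories may find it convenient to keep the names in one place. The following is a minimal sketch, not part of the VirtueGuard API: `RISK_CATEGORIES` and `unexpected_categories` are hypothetical names, and the tuple simply mirrors the table above, so it would need updating if the categories are customized.

```python
# Category names copied from the table above (hypothetical client-side constant).
RISK_CATEGORIES = (
    "Hate_Humiliation_Harassment",
    "Violence_Harm_Cruelty",
    "Sexual",
    "Criminal_Planning",
    "Weapons_Substance_Abuse",
    "Self_Harm",
    "Animal_Cruelty",
    "Disasters_Emergencies",
    "Political",
)


def unexpected_categories(results: dict) -> list:
    """Return keys of `results` that are not in RISK_CATEGORIES, sorted by name."""
    return sorted(set(results) - set(RISK_CATEGORIES))
```

A non-empty return value would suggest that the deployment uses customized categories the client does not yet know about.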

API Integration

Authentication and Endpoint

Every API request must include an API key in the request headers, as shown below. Send requests to the following endpoint:

Endpoint: https://api.virtueai.io/api/imagemoderation

Method: POST

Headers:

Content-Type: application/json
Authorization: Bearer your_api_key_here
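
Hard-coding the key in source files makes it easy to leak. One possible approach, sketched here, is to build the headers from an environment variable. The variable name `VIRTUE_API_KEY` and the helper `build_headers` are assumptions made for illustration; neither is defined by the API.

```python
import os
from typing import Optional


def build_headers(api_key: Optional[str] = None) -> dict:
    """Build the request headers.

    Falls back to the VIRTUE_API_KEY environment variable (an assumed name)
    when no key is passed explicitly.
    """
    key = api_key or os.environ.get("VIRTUE_API_KEY")
    if not key:
        raise ValueError("No API key provided and VIRTUE_API_KEY is not set")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }
```

The returned dictionary can be passed as the `headers` argument of `requests.post`.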

API Request Example

Python

import base64

import requests

# Fetch a sample image from the web (a landscape photo of a tree)
image_content = requests.get("https://www.gstatic.com/webp/gallery/4.sm.jpg", timeout=30).content

# Alternatively, read the image from a local file
# with open("test.jpg", "rb") as image_file:
#     image_content = image_file.read()

# Base64-encode the image bytes so they can be sent in a JSON body
image_data = base64.b64encode(image_content).decode("utf-8")

# Prepare the request
payload = {
    "input": image_data
}
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer your_api_key_here"
}

# Send the request and print the JSON response
response = requests.post("https://api.virtueai.io/api/imagemoderation", json=payload, headers=headers, timeout=30)
response.raise_for_status()  # fail early on HTTP error status codes
print(response.json())

TypeScript

import fs from 'fs/promises';
import axios from 'axios';

async function moderateImage(imageSource: string, isUrl: boolean = true): Promise<any> {
  try {
    // Get image content
    let imageContent: Buffer;
    
    if (isUrl) {
      // Fetch image from URL
      const response = await axios.get(imageSource, {
        responseType: 'arraybuffer'
      });
      imageContent = Buffer.from(response.data);
    } else {
      // Read image from local file
      imageContent = await fs.readFile(imageSource);
    }

    // Convert to base64
    const imageData = imageContent.toString('base64');

    // Prepare request
    const payload = {
      input: imageData
    };

    const headers = {
      'Content-Type': 'application/json',
      'Authorization': 'Bearer your_api_key_here'
    };

    // Make request
    const response = await axios.post(
      'https://api.virtueai.io/api/imagemoderation',
      payload,
      { headers }
    );

    return response.data;
  } catch (error) {
    if (error instanceof Error) {
      throw new Error(`Image moderation failed: ${error.message}`);
    }
    throw error;
  }
}

// Example usage:
async function example() {
  try {
    // Example with web image
    const webResult = await moderateImage('https://www.gstatic.com/webp/gallery/4.sm.jpg');
    console.log('Web image result:', webResult);

    // Example with local image
    const localResult = await moderateImage('./test.jpg', false);
    console.log('Local image result:', localResult);
  } catch (error) {
    console.error('Error:', error);
  }
}

export { moderateImage };

Output Format

The API returns a JSON response with the following structure:

{
    "results": {
        "Hate_Humiliation_Harassment": false,
        "Violence_Harm_Cruelty": false,
        "Sexual": false,
        "Criminal_Planning": false,
        "Weapons_Substance_Abuse": false,
        "Self_Harm": false,
        "Animal_Cruelty": false,
        "Disasters_Emergencies": false,
        "Political": false
    },
    "reason": "No harmful content detected.",
    "flag": false
}

The response is a JSON object with the following properties:

`results` (dict)

An object containing a boolean value for each risk category. Each key is the name of a risk category, and its value is:

true: The specified risk was detected in the image
false: The specified risk was not detected in the image
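
Given that structure, the detected categories can be collected with a short helper. This is a sketch only: `flagged_categories` is a hypothetical name, and it assumes the response has already been parsed into a Python dict (for example with `response.json()`).

```python
def flagged_categories(response: dict) -> list:
    """Return the names of the categories whose value in `results` is true."""
    results = response.get("results", {})
    return [name for name, detected in results.items() if detected is True]
```

For the sample response shown above, where every category is `false`, the helper returns an empty list.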

`reason` (str)

A string field providing a human-readable explanation of the moderation decision:

When no risks are detected: Returns a message like "No harmful content detected."
When risks are detected: Provides a detailed explanation of which categories triggered the flag and why

`flag` (bool)

A boolean field that summarizes the overall result in a single value:

true: At least one risk category in the results object was detected
false: No risks were detected in any category
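
From the description above, `flag` should agree with the per-category values in `results`. The sketch below turns a parsed response into a one-line summary and checks that agreement. The name `summarize` is hypothetical, and treating a mismatch as an error is a client-side choice rather than documented API behavior.

```python
def summarize(response: dict) -> str:
    """Return a one-line summary; raise if `flag` disagrees with `results`."""
    results = response["results"]
    any_detected = any(results.values())
    if response["flag"] != any_detected:
        raise ValueError("`flag` does not match the per-category results")
    if not response["flag"]:
        return "allowed"
    # List the detected categories in the order the response reports them.
    hits = ", ".join(name for name, hit in results.items() if hit)
    return f"blocked ({hits}): {response['reason']}"
```

Checking `flag` alone is enough for a simple allow/block decision; the per-category values matter when different categories call for different handling.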
