
Text and Image to Video Risks

Text and Image to Video generation models combine reference images with text prompts to create video content, which introduces risks unique to image-guided video synthesis. VirtueRed tests these systems across five risk categories covering content safety vulnerabilities in image-conditioned video generation.

Overview

Image-to-video models use reference images to guide video generation, enabling capabilities such as animating still photos, extending images into video sequences, and creating videos that feature specific visual elements. This expands the attack surface: a harmful image can propagate into video content and gain a temporal dimension.

Risk Category | Subcategories
Guided Video Violence Generation | Disturbing/Shocking, Bloody, Gore, Fights/Real-Life, Animal Abuse, Threats, Child Violence
Guided Video Hate Generation | Hateful Meme, Racism, Body-Shaming, Harassment, Extremism, Bigotry, Hateful Symbols
Guided Video Self-Harm Generation | Encouragement, Risk
Guided Video NSFW Generation | General Sexual Acts, Fetish, Hentai, Deepfake, Sexual Assault, Child Sexual Exploitation
Guided Video Illegal Activity Generation | Identity Theft, Terrorism, Illegal Goods, Copyright, Celebrity Face
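
For test tooling, the taxonomy above can be captured as a plain mapping from category to subcategories. This is a minimal sketch; the names `RISK_TAXONOMY` and `count_subcategories` are illustrative assumptions and not part of any VirtueRed API.

```python
# Illustrative encoding of the risk taxonomy listed above.
# RISK_TAXONOMY and count_subcategories are hypothetical names,
# not part of any VirtueRed API.
RISK_TAXONOMY = {
    "Guided Video Violence Generation": [
        "Disturbing/Shocking", "Bloody", "Gore", "Fights/Real-Life",
        "Animal Abuse", "Threats", "Child Violence",
    ],
    "Guided Video Hate Generation": [
        "Hateful Meme", "Racism", "Body-Shaming", "Harassment",
        "Extremism", "Bigotry", "Hateful Symbols",
    ],
    "Guided Video Self-Harm Generation": [
        "Encouragement", "Risk",
    ],
    "Guided Video NSFW Generation": [
        "General Sexual Acts", "Fetish", "Hentai", "Deepfake",
        "Sexual Assault", "Child Sexual Exploitation",
    ],
    "Guided Video Illegal Activity Generation": [
        "Identity Theft", "Terrorism", "Illegal Goods", "Copyright",
        "Celebrity Face",
    ],
}


def count_subcategories(taxonomy):
    """Return the total number of subcategories across all categories."""
    return sum(len(subs) for subs in taxonomy.values())


if __name__ == "__main__":
    for category, subs in RISK_TAXONOMY.items():
        print(f"{category}: {len(subs)} subcategories")
    print("total:", count_subcategories(RISK_TAXONOMY))
```

Keeping the taxonomy as data makes it straightforward to iterate over every category and subcategory pair when organizing test cases or reporting coverage.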

Guided Video Violence Generation

Testing generation of violent, gory, or disturbing video content from reference images depicting harm to humans or animals.

Subcategories

Subcategory | Description
Disturbing/Shocking | Animating disturbing still images into deeply unsettling video content designed to shock or traumatize
Bloody | Extending bloody or injury images into video content featuring blood, wounds, or bloody scenes with motion
Gore | Animating graphic images into video depictions of severe injuries, mutilation, or extreme violence
Fights/Real-Life | Creating realistic video depictions of physical altercations or violent confrontations from reference images
Animal Abuse | Generating video depicting cruelty, torture, or harm to animals from reference images
Threats | Creating video content from images that threatens violence against individuals or groups
Child Violence | Any video depicting violence, abuse, or harm to minors from reference images - absolute prohibition

Guided Video Hate Generation

Testing generation of discriminatory, hateful, or dehumanizing video content from reference images targeting individuals or groups.

Subcategories

Subcategory | Description
Hateful Meme | Animating hateful meme images into video content that spreads hate, discrimination, or derogatory messages
Racism | Extending racially offensive images into video including stereotypical representations and discriminatory content
Body-Shaming | Creating video content from images that mock or degrade individuals based on body type or appearance
Harassment | Generating video content from images designed to intimidate, threaten, or harass specific individuals
Extremism | Animating extremist imagery into video promoting radical ideologies or violent organizations
Bigotry | Creating video expressing intolerance from images targeting groups based on protected characteristics
Hateful Symbols | Animating hate symbol images into video featuring Nazi imagery, white supremacist symbols, or hate group insignia

Guided Video Self-Harm Generation

Testing generation of video content from reference images that promotes, glorifies, or provides guidance on self-harm behaviors.

Subcategories

Subcategory | Description
Encouragement | Animating images into video that encourages, normalizes, or glorifies self-harm, suicide, or self-destructive behaviors
Risk | Creating video from images depicting dangerous challenges or risky activities that could lead to physical harm

Guided Video NSFW Generation

Testing generation of explicit, inappropriate, or exploitative sexual video content from reference images.

Subcategories

Subcategory | Description
General Sexual Acts | Animating images into explicit video depicting sexual activities, nudity, or pornographic content
Fetish | Creating extreme fetish video content from images that may be harmful, degrading, or non-consensual
Hentai | Generating animated explicit sexual video content from illustrated images, including fictional minor depictions
Deepfake | Creating non-consensual intimate video by animating real individuals' photos into sexual contexts
Sexual Assault | Generating video depicting non-consensual sexual activities from reference images
Child Sexual Exploitation | Any video sexualizing minors from reference images - absolute prohibition

Guided Video Illegal Activity Generation

Testing generation of video facilitating illegal activities, identity violations, or intellectual property infringement from reference images.

Subcategories

Subcategory | Description
Identity Theft | Creating fake identity videos by animating photos for impersonation or synthetic identity fraud
Terrorism | Generating video depicting terrorist activities or propaganda from reference images
Illegal Goods | Animating images into video depicting controlled substances, weapons, or other illegal items
Copyright | Creating video by animating copyrighted characters, artworks, or protected intellectual property
Celebrity Face | Unauthorized animation of celebrity or public figure photos into video content, including deepfakes

Technical Exploitation

Multi-Image Attacks

Attack Vector | Description
Face-body swap | Combining faces and bodies from different images
Scene composition | Creating composite video from multiple sources
Identity mixing | Blending multiple identities in generated video
Progressive manipulation | Using sequences of images to bypass filters
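
Progressive manipulation suggests that checking each image in isolation may not be enough. The sketch below assumes a hypothetical per-image risk score in [0, 1] produced by some upstream classifier, and flags a sequence when either a single image or the sum over a short sliding window crosses a threshold. The function name, window size, thresholds, and scoring scale are all assumptions chosen for illustration, not a description of how VirtueRed or any particular filter works.

```python
from collections import deque
from typing import Sequence


def flag_sequence(
    scores: Sequence[float],
    per_image_threshold: float = 0.8,   # assumed cutoff for a single image
    window: int = 3,                    # assumed number of recent images to sum
    window_threshold: float = 1.5,      # assumed cutoff for the window sum
) -> bool:
    """Return True if a sequence of per-image risk scores should be flagged.

    Scores are hypothetical values in [0, 1] from an upstream classifier.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` scores
    for score in scores:
        # A single clearly risky image is enough to flag the sequence.
        if score >= per_image_threshold:
            return True
        recent.append(score)
        # Several borderline images in a row also trigger a flag,
        # even though each one stays under the per-image cutoff.
        if sum(recent) >= window_threshold:
            return True
    return False


if __name__ == "__main__":
    print(flag_sequence([0.6, 0.6, 0.6]))  # borderline images in a row
    print(flag_sequence([0.1] * 20))       # long, consistently low-risk sequence
```

The sliding window is a design choice: a plain running total would eventually flag any long sequence of low-risk images, while a window only reacts to borderline inputs that arrive close together.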

Prompt-Image Coordination

Attack Vector | Description
Innocent image + harmful prompt | Using benign images with harmful text
Harmful image + innocent prompt | Using concerning images with benign text
Encoded instructions | Hiding instructions in image metadata
Split intent attacks | Distributing harmful intent across modalities
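
The coordination patterns above can be told apart by comparing per-modality checks with a check over the combined input. The sketch below is a minimal illustration under stated assumptions: `ModerationVerdicts` and `classify_pair` are hypothetical names, and the three boolean verdicts stand in for whatever image-only, prompt-only, and joint classifiers a test harness actually uses. Encoded instructions in metadata are not covered by this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModerationVerdicts:
    """Hypothetical verdicts from three independent checks."""
    image_flagged: bool   # result of an image-only check
    prompt_flagged: bool  # result of a prompt-only check
    joint_flagged: bool   # result of a check over the (image, prompt) pair


def classify_pair(v: ModerationVerdicts) -> str:
    """Map a set of verdicts to one of the coordination patterns."""
    if v.image_flagged and v.prompt_flagged:
        return "harmful image + harmful prompt"
    if v.prompt_flagged:
        return "innocent image + harmful prompt"
    if v.image_flagged:
        return "harmful image + innocent prompt"
    if v.joint_flagged:
        # Neither modality is flagged alone, but the combination is:
        # the intent is distributed across modalities.
        return "split intent"
    return "benign"


if __name__ == "__main__":
    case = ModerationVerdicts(
        image_flagged=False, prompt_flagged=False, joint_flagged=True
    )
    print(classify_pair(case))
```

The split-intent branch is the reason for a joint check: a system that evaluates the image and the prompt only in isolation has no signal to act on in that case.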
