
Text and Video to Video Risks

Text and Video to Video generation models transform existing video content under the guidance of text prompts, enabling video editing, style transfer, and content modification. VirtueRed tests these systems across six risk categories that cover the content safety risks specific to video-to-video generation.

Overview

Video-to-video models take existing video as input and generate modified video output based on text instructions. This provides powerful editing capabilities but also enables sophisticated manipulation attacks, including deepfake enhancement, content alteration, and unauthorized video modification. Because harmful intent can reside in the source video, in the text prompt, or only in the combination of the two, these models expose attack surfaces that single-input generators do not.

| Risk Category | Subcategories |
|---|---|
| Illegal Activity Video Synthesis | Arson And Explosion, Drugs, Robbery And Burglary, Shoplifting And Stealing, War And Military Actions |
| Self-Harm Video Synthesis | Extremely Disturbing Content, Incitement To Mental Depression, Incitement To Violence, Suicide And Self-Harm |
| Abuse Video Synthesis | General Abuse, Animal Abuse, Campus Bullying, Child Abuse, Sexual Bullying |
| Misinformation Video Synthesis | Acting, AIGC, Misinformation, Out-Of-Date |
| Sexual Video Synthesis | Evident, Hentai, Implication, Subtle |
| Violence Video Synthesis | Assault, Fighting, Sexual Violence, Shooting, Vandalism |
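
For test harnesses that need to enumerate every category and subcategory, the taxonomy above can be encoded as a plain mapping. The following is a minimal sketch; the names `RISK_TAXONOMY` and `iter_test_targets` are illustrative assumptions and are not part of any VirtueRed API.

```python
# Hypothetical encoding of the risk taxonomy listed above.
# RISK_TAXONOMY and iter_test_targets are illustrative names, not VirtueRed APIs.
RISK_TAXONOMY = {
    "Illegal Activity Video Synthesis": [
        "Arson And Explosion", "Drugs", "Robbery And Burglary",
        "Shoplifting And Stealing", "War And Military Actions",
    ],
    "Self-Harm Video Synthesis": [
        "Extremely Disturbing Content", "Incitement To Mental Depression",
        "Incitement To Violence", "Suicide And Self-Harm",
    ],
    "Abuse Video Synthesis": [
        "General Abuse", "Animal Abuse", "Campus Bullying",
        "Child Abuse", "Sexual Bullying",
    ],
    "Misinformation Video Synthesis": [
        "Acting", "AIGC", "Misinformation", "Out-Of-Date",
    ],
    "Sexual Video Synthesis": ["Evident", "Hentai", "Implication", "Subtle"],
    "Violence Video Synthesis": [
        "Assault", "Fighting", "Sexual Violence", "Shooting", "Vandalism",
    ],
}


def iter_test_targets(taxonomy):
    """Yield (category, subcategory) pairs in the order they are listed."""
    for category, subcategories in taxonomy.items():
        for subcategory in subcategories:
            yield category, subcategory


targets = list(iter_test_targets(RISK_TAXONOMY))
print(len(RISK_TAXONOMY), len(targets))  # 6 27
```

Keeping the taxonomy in one structure makes it straightforward to confirm that a test run covered all six categories and all 27 subcategories.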

Illegal Activity Video Synthesis

Testing generation of modified video content that depicts, enhances, or facilitates illegal activities from source videos.

Subcategories

| Subcategory | Description |
|---|---|
| Arson And Explosion | Modification or enhancement of video depicting arson, explosions, or incendiary activities |
| Drugs | Synthesis of drug-related video content including enhanced manufacturing demonstrations, distribution activities, or usage glorification |
| Robbery And Burglary | Video modification depicting or enhancing theft activities, security bypass techniques, or planning guidance |
| Shoplifting And Stealing | Synthesis of retail theft videos with enhanced concealment techniques or theft methodology demonstrations |
| War And Military Actions | Modification of combat or military operation videos that could provide tactical information or glorify violence |

Self-Harm Video Synthesis

Testing generation of modified video content that promotes, enhances, or depicts self-harm, suicide, or psychologically damaging material.

Subcategories

| Subcategory | Description |
|---|---|
| Extremely Disturbing Content | Video modification creating or enhancing deeply traumatic content designed to cause psychological harm |
| Incitement To Mental Depression | Synthesis of video content designed to worsen mental health, promote hopelessness, or trigger depressive episodes |
| Incitement To Violence | Video modification creating content that incites viewers toward violent behavior or harmful actions |
| Suicide And Self-Harm | Synthesis of self-harm or suicide-related video that provides methods, encouragement, or normalization |

Abuse Video Synthesis

Testing generation of modified video content depicting various forms of abuse, bullying, or harassment.

Subcategories

| Subcategory | Description |
|---|---|
| General Abuse | Video modification depicting or enhancing physical, emotional, or psychological abuse of individuals |
| Animal Abuse | Synthesis of video content showing cruelty, torture, or harm to animals |
| Campus Bullying | Video modification depicting school or university bullying, harassment, or student victimization |
| Child Abuse | Synthesis of videos depicting abuse, neglect, or harm to minors - requires absolute prohibition |
| Sexual Bullying | Video modification depicting sexually-oriented harassment, intimidation, or exploitation |

Misinformation Video Synthesis

Testing generation of modified video content that creates, enhances, or propagates misinformation and deceptive media.

Subcategories

| Subcategory | Description |
|---|---|
| Acting | Synthesis of staged or scripted content modified to appear as authentic real events |
| AIGC | Enhancement or modification of AI-generated video to appear more authentic, including deepfake improvement |
| Misinformation | Video modification creating factually incorrect content, false claims, or deliberately misleading narratives |
| Out-Of-Date | Synthesis of old video modified to appear as current events, temporal manipulation, or context-shifting |

Sexual Video Synthesis

Testing generation of modified video content with sexual themes, from explicit to implied, through video transformation.

Subcategories

| Subcategory | Description |
|---|---|
| Evident | Video modification creating or enhancing explicitly sexual content with clear adult themes |
| Hentai | Synthesis of animated or illustrated sexual video content through style transfer or generation |
| Implication | Video modification adding implied sexual content, suggestive themes, or sexual undertones to source material |
| Subtle | Synthesis of video with subtle sexual elements, mild suggestiveness, or borderline content |

Violence Video Synthesis

Testing generation of modified video content depicting various forms of violence through video transformation.

Subcategories

| Subcategory | Description |
|---|---|
| Assault | Video modification depicting or enhancing physical attacks, battery, or violent confrontations |
| Fighting | Synthesis of combat videos including enhanced street fights, brawls, or organized violence |
| Sexual Violence | Video modification depicting sexual assault or coerced sexual acts - requires absolute prohibition |
| Shooting | Synthesis of firearm-related violence, enhanced gun attacks, or shooting incident modifications |
| Vandalism | Video modification depicting property destruction, deliberate damage, or destructive behavior |

Technical Exploitation

Source-Prompt Coordination Attacks

| Attack Vector | Description |
|---|---|
| Benign source + harmful prompt | Using innocent video with harmful modification instructions |
| Harmful source + benign prompt | Using concerning video with innocent-seeming text |
| Progressive prompting | Gradually escalating modification requests |
| Split intent attacks | Distributing harmful intent across source and prompt |
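
One consequence of these vectors is that a filter scoring the source video and the prompt independently can miss split intent attacks, where neither input crosses a threshold on its own. The sketch below illustrates the idea with hypothetical risk scores in [0, 1]. The `Submission` type, the noisy-OR combination rule, and the 0.5 threshold are assumptions chosen for illustration; they are not VirtueRed's scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """Hypothetical per-input risk scores in [0, 1] from upstream classifiers."""
    source_risk: float  # risk score for the source video alone
    prompt_risk: float  # risk score for the text prompt alone


def independent_flag(s: Submission, threshold: float = 0.5) -> bool:
    """Flag only if either input alone reaches the threshold."""
    return s.source_risk >= threshold or s.prompt_risk >= threshold


def joint_risk(s: Submission) -> float:
    """Noisy-OR combination: two moderately risky inputs yield a higher joint score."""
    return 1.0 - (1.0 - s.source_risk) * (1.0 - s.prompt_risk)


def joint_flag(s: Submission, threshold: float = 0.5) -> bool:
    """Flag if the combined score reaches the threshold."""
    return joint_risk(s) >= threshold


# Split intent: each input looks acceptable alone, the pair does not.
split_intent = Submission(source_risk=0.4, prompt_risk=0.4)
print(independent_flag(split_intent))  # False
print(joint_flag(split_intent))        # True (joint score is about 0.64)
```

A combination rule of this kind only covers the inputs; checking the generated output video as well would still be needed for cases where both inputs score low.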

Output Enhancement Attacks

| Attack Vector | Description |
|---|---|
| Quality improvement | Enhancing quality of harmful source video |
| Stabilization weaponization | Stabilizing shaky harmful footage |
| Artifact removal | Removing detection artifacts from manipulated video |
| De-censoring attempts | Removing censorship or blurring from content |

Chained Modification Attacks

| Attack Vector | Description |
|---|---|
| Progressive modification | Making incremental changes toward harmful content |
| Filter bypass through iteration | Using multiple passes to bypass safety |
| Cross-model chaining | Using multiple models for complex manipulation |
| Modification stacking | Combining multiple types of changes |
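
Progressive modification and filter bypass through iteration both rely on each step looking acceptable in isolation. A minimal sketch of a session-level check follows, assuming each edit step has already been assigned a hypothetical risk score by an upstream classifier. `SessionMonitor`, both limits, and the example scores are illustrative assumptions rather than a description of any real system.

```python
class SessionMonitor:
    """Tracks cumulative risk across a chain of edits to one source video.

    Hypothetical sketch: step scores are assumed to estimate how far each
    edit moves the content toward a harmful result.
    """

    def __init__(self, step_limit: float = 0.3, session_limit: float = 0.6):
        self.step_limit = step_limit        # maximum risk allowed for one edit
        self.session_limit = session_limit  # maximum risk allowed for the chain
        self.cumulative = 0.0

    def allow(self, step_risk: float) -> bool:
        """Return True if the edit may proceed; rejected edits are not accumulated."""
        if step_risk > self.step_limit:
            return False
        if self.cumulative + step_risk > self.session_limit:
            return False
        self.cumulative += step_risk
        return True


# Four edits that each pass the per-step limit but together exceed the session limit.
monitor = SessionMonitor()
decisions = [monitor.allow(risk) for risk in [0.25, 0.25, 0.25, 0.25]]
print(decisions)  # [True, True, False, False]
```

A monitor scoped to a single model cannot observe edits made elsewhere, so cross-model chaining would need the accumulated state to travel with the video or be inferred from the video itself.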
