Privacy
Privacy testing evaluates AI systems' handling of personal and sensitive information, including resistance to revealing training data, inferring private information, and leaking Personally Identifiable Information (PII). VirtueRed tests 3 core privacy risk subcategories with comprehensive attack scenarios.
Overview
Privacy in AI systems encompasses multiple dimensions, from protecting training data to safeguarding user interactions. Privacy failures can expose individuals to identity theft, discrimination, and other harms, while creating legal liability for AI operators under regulations like GDPR and CCPA.
| Privacy Dimension | Description | Risk |
|---|---|---|
| Training Data Privacy | Protecting data used to train models | Data subject exposure |
| Inference Privacy | Preventing unauthorized data inference | Derived information exposure |
| Interaction Privacy | Protecting user conversations | Conversation leakage |
Risk Subcategories
Privacy Extraction
Evaluates the model's tendency to memorize and expose private information from training data. This tests extraction accuracy of sensitive information and the model's ability to maintain data confidentiality.
| Extraction Type | Description |
|---|---|
| Training Data Extraction | Recovering verbatim training examples (e.g., Enron email dataset) |
| Sensitive Information Recall | Model's consistency in extracting/not extracting sensitive information |
| Prompt Format Variations | Testing across zero-shot, few-shot, and context-based prompts |
| Email Pair Extraction | Accuracy assessment for extracting name-email pairs |
Privacy Understanding
Tests the AI system's ability to recognize and appropriately handle privacy-sensitive contexts and conversations.
| Understanding Aspect | Description |
|---|---|
| Privacy Context Recognition | Understanding and respecting privacy contexts in conversations |
| Privacy Event Identification | Recognizing sensitive discussions (divorce, confidential information) |
| Information Leakage Prevention | Preventing disclosure rates in privacy-related contexts |
| User Trust Maintenance | Maintaining confidentiality in sensitive discussions |
Privacy PII Leakage
Evaluates whether the AI system can be manipulated to generate, reveal, or process PII inappropriately during inference.
| PII Type | Description |
|---|---|
| Direct Identifiers | Names, SSN, passport numbers |
| Contact Information | Addresses, phone numbers, emails |
| Financial Data | Credit cards, bank accounts |
| Health Information | Medical records, conditions |
| Biometric Data | Facial features, fingerprints |
| Behavioral Data | Usage patterns, preferences |
Risk Level Assessment
Low Risk Privacy Profile
- Strong robustness against privacy risks
- Excellent ability to avoid generating sensitive information from pretraining datasets
- Effective detection and prevention of PII leakage
- Strong privacy context understanding
- High trustworthiness in safeguarding sensitive information
Moderate Risk Privacy Profile
- Moderate robustness against privacy risks
- Some capability in avoiding sensitive information generation
- Partial effectiveness in PII leakage detection
- Adequate privacy context understanding
- Room for improvement in privacy protection mechanisms
High Risk Privacy Profile
- High vulnerability concerning privacy
- Struggles with privacy concern scenarios
- Potential to memorize and leak sensitive training data
- PII leakage during inference
- Failure to understand privacy contexts
- Urgent need for privacy protection enhancements
Testing Methodology
Pretraining Data Privacy Testing
Tests extraction accuracy of sensitive information from training data sources like the Enron email dataset:
- Zero-shot extraction - Direct queries for sensitive information
- Few-shot extraction - Providing examples to guide extraction
- Context-based extraction - Using contextual cues to elicit information
- Email pair extraction - Testing name-email association recall
Inference-Time PII Testing
Evaluates recognition and protection of PII during real-time interactions:
- PII recognition - Identifying sensitive data in user inputs
- PII protection - Preventing generation of realistic PII
- Data regulation compliance - Adhering to GDPR, CCPA requirements
- Sensitive data safeguarding - Protecting user trust
Privacy-Sensitive Conversation Testing
Assesses information leakage in privacy-related contexts:
- Context recognition - Understanding privacy implications
- Confidentiality maintenance - Respecting disclosed private information
- Appropriate responses - Providing helpful answers without privacy violations
- Trust preservation - Maintaining user confidence
Regulatory Alignment
Privacy testing supports compliance with:
| Regulation | Requirements |
|---|---|
| GDPR | Data minimization, consent, right to erasure |
| CCPA/CPRA | Consumer rights, sale restrictions |
| HIPAA | Health information protection |
| FERPA | Educational record privacy |
| GLBA | Financial data protection |
See Also
- GDPR Compliance - EU data protection
- Societal Harmfulness - Content harm prevention
- Brand Risk - Reputation protection