Skip to main content

Privacy

Privacy testing evaluates AI systems' handling of personal and sensitive information, including resistance to revealing training data, inferring private information, and leaking Personally Identifiable Information (PII). VirtueRed tests 3 core privacy risk subcategories with comprehensive attack scenarios.

Overview

Privacy in AI systems encompasses multiple dimensions, from protecting training data to safeguarding user interactions. Privacy failures can expose individuals to identity theft, discrimination, and other harms, while creating legal liability for AI operators under regulations like GDPR and CCPA.

Privacy DimensionDescriptionRisk
Training Data PrivacyProtecting data used to train modelsData subject exposure
Inference PrivacyPreventing unauthorized data inferenceDerived information exposure
Interaction PrivacyProtecting user conversationsConversation leakage

Risk Subcategories

Privacy Extraction

Evaluates the model's tendency to memorize and expose private information from training data. This tests extraction accuracy of sensitive information and the model's ability to maintain data confidentiality.

Extraction TypeDescription
Training Data ExtractionRecovering verbatim training examples (e.g., Enron email dataset)
Sensitive Information RecallModel's consistency in extracting/not extracting sensitive information
Prompt Format VariationsTesting across zero-shot, few-shot, and context-based prompts
Email Pair ExtractionAccuracy assessment for extracting name-email pairs

Privacy Understanding

Tests the AI system's ability to recognize and appropriately handle privacy-sensitive contexts and conversations.

Understanding AspectDescription
Privacy Context RecognitionUnderstanding and respecting privacy contexts in conversations
Privacy Event IdentificationRecognizing sensitive discussions (divorce, confidential information)
Information Leakage PreventionPreventing disclosure rates in privacy-related contexts
User Trust MaintenanceMaintaining confidentiality in sensitive discussions

Privacy PII Leakage

Evaluates whether the AI system can be manipulated to generate, reveal, or process PII inappropriately during inference.

PII TypeDescription
Direct IdentifiersNames, SSN, passport numbers
Contact InformationAddresses, phone numbers, emails
Financial DataCredit cards, bank accounts
Health InformationMedical records, conditions
Biometric DataFacial features, fingerprints
Behavioral DataUsage patterns, preferences

Risk Level Assessment

Low Risk Privacy Profile

  • Strong robustness against privacy risks
  • Excellent ability to avoid generating sensitive information from pretraining datasets
  • Effective detection and prevention of PII leakage
  • Strong privacy context understanding
  • High trustworthiness in safeguarding sensitive information

Moderate Risk Privacy Profile

  • Moderate robustness against privacy risks
  • Some capability in avoiding sensitive information generation
  • Partial effectiveness in PII leakage detection
  • Adequate privacy context understanding
  • Room for improvement in privacy protection mechanisms

High Risk Privacy Profile

  • High vulnerability concerning privacy
  • Struggles with privacy concern scenarios
  • Potential to memorize and leak sensitive training data
  • PII leakage during inference
  • Failure to understand privacy contexts
  • Urgent need for privacy protection enhancements

Testing Methodology

Pretraining Data Privacy Testing

Tests extraction accuracy of sensitive information from training data sources like the Enron email dataset:

  1. Zero-shot extraction - Direct queries for sensitive information
  2. Few-shot extraction - Providing examples to guide extraction
  3. Context-based extraction - Using contextual cues to elicit information
  4. Email pair extraction - Testing name-email association recall

Inference-Time PII Testing

Evaluates recognition and protection of PII during real-time interactions:

  1. PII recognition - Identifying sensitive data in user inputs
  2. PII protection - Preventing generation of realistic PII
  3. Data regulation compliance - Adhering to GDPR, CCPA requirements
  4. Sensitive data safeguarding - Protecting user trust

Privacy-Sensitive Conversation Testing

Assesses information leakage in privacy-related contexts:

  1. Context recognition - Understanding privacy implications
  2. Confidentiality maintenance - Respecting disclosed private information
  3. Appropriate responses - Providing helpful answers without privacy violations
  4. Trust preservation - Maintaining user confidence

Regulatory Alignment

Privacy testing supports compliance with:

RegulationRequirements
GDPRData minimization, consent, right to erasure
CCPA/CPRAConsumer rights, sale restrictions
HIPAAHealth information protection
FERPAEducational record privacy
GLBAFinancial data protection

See Also