Data Cleaning Process
The data cleaning process allows you to systematically remove or anonymize personal data in compliance with GDPR requirements, particularly the "right to be forgotten" (Article 17).
Overview
Data cleaning involves:
- Selecting which data to clean based on criteria
- Filtering records to target specific data subjects or time periods
- Executing the cleaning operation with comprehensive logging
- Verifying that the cleaning was successful and complete
Cleaning Workflow
Step 1: Prepare for Cleaning
Pre-Cleaning Checklist
- Legal Basis Confirmed: Verify you have legal authority to delete the data
- Data Subject Request: Have written request from data subject (if applicable)
- Business Impact Assessment: Understand business implications of data removal
- Backup Strategy: Ensure data can be recovered if needed for legal purposes
- Dependencies Check: Identify related data that may be affected
Access Requirements
- Permission Set: GDPR-USER or GDPR-ADMIN permission set assigned
- Table Access: Read/Modify permissions for target tables
- Functional Access: Access to QTEAMDataCleaner codeunit
Step 2: Access Data Cleaning Interface
Navigate to Cleaning Functions
- From GDPR Register List:
  - Open GDPR Register List
  - Select data elements to clean
  - Use Actions > Clean Data (if available)
- From Table Pages:
  - Navigate to specific Business Central table (e.g., Customer list)
  - Select records to clean
  - Use custom GDPR cleaning actions
Understanding the Interface
- Filter Options: Define which records to process
- Field Selection: Choose which fields to clean
- Cleaning Method: Select how to clean the data (clear, anonymize, etc.)
- Session Settings: Configure logging and error handling
Step 3: Configure Cleaning Parameters
Filter Configuration
Use QteamFilterSelector functionality to define filters:
Customer Data Example:
Table: Customer (18)
Filters:
- Last Date Modified < 01/01/2020
- Customer Posting Group = 'INACTIVE'
- Balance (LCY) = 0
Employee Data Example:
Table: Employee (5200)
Filters:
- Status = Terminated
- Termination Date < 01/01/2022
- Employment Date > 01/01/2010
Field Selection
Choose which fields to clean for each table:
Personal Identification Fields:
- Name, Name 2, Search Name
- Social Security No., Birth Date
- Phone No., E-Mail, Address fields
Contact Information Fields:
- Address, Address 2, City, Post Code
- Country/Region Code, County
- Contact person names and details
Financial/Sensitive Fields (with caution):
- Bank account information
- Credit card details
- Salary information
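The Customer example above can be pictured as a small configuration object that combines filters with a field selection. The Python sketch below is illustrative only: `CleaningConfig`, `matches`, and the record dictionaries are hypothetical names chosen for this example and are not part of the QteamFilterSelector or QTEAMDataCleaner interface. It assumes that all filters on a table are combined with AND.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Callable

Record = dict[str, Any]


@dataclass
class CleaningConfig:
    """Hypothetical container for one table's filters and target fields."""
    table_no: int
    table_name: str
    filters: list[Callable[[Record], bool]] = field(default_factory=list)
    fields_to_clean: list[str] = field(default_factory=list)

    def matches(self, record: Record) -> bool:
        # A record is selected only if every filter accepts it (assumed AND logic).
        return all(f(record) for f in self.filters)


# Mirrors the Customer (18) example: inactive, zero balance, not modified since 2020.
customer_config = CleaningConfig(
    table_no=18,
    table_name="Customer",
    filters=[
        lambda r: r["Last Date Modified"] < date(2020, 1, 1),
        lambda r: r["Customer Posting Group"] == "INACTIVE",
        lambda r: r["Balance (LCY)"] == 0,
    ],
    fields_to_clean=["Name", "Name 2", "Phone No.", "E-Mail", "Address"],
)

# Made-up sample records, used only to show how the filters select data.
customers = [
    {"No.": "C001", "Last Date Modified": date(2019, 5, 1),
     "Customer Posting Group": "INACTIVE", "Balance (LCY)": 0},
    {"No.": "C002", "Last Date Modified": date(2023, 2, 1),
     "Customer Posting Group": "DOMESTIC", "Balance (LCY)": 150},
]

selected = [c["No."] for c in customers if customer_config.matches(c)]
print(selected)  # ['C001']
```

Keeping filters and field selection in one object makes it easy to review exactly what a cleaning run will touch before it is executed.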
Step 4: Execute Cleaning Operation
Start Cleaning Session
- Initialize Session:
  - System creates new session log entry
  - Assigns unique session ID
  - Records start time and user
- Progress Monitoring:
  - Real-time progress indicator
  - Current table and field being processed
  - Success/failure counters
- Error Handling:
  - Continues processing on non-critical errors
  - Logs all failures for review
  - Option to stop on first error if needed
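The session behavior described above (unique ID, start time, counters, continue-on-error with an optional stop) can be sketched as follows. This is a minimal model under stated assumptions, not the product's implementation: `CleaningSession`, `run_session`, and `clear_name` are hypothetical names, and the "locked" flag simply simulates a failure.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Iterable


@dataclass
class CleaningSession:
    """Hypothetical in-memory stand-in for a session log entry."""
    user: str
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    succeeded: int = 0
    failed: int = 0
    errors: list[str] = field(default_factory=list)


def run_session(user: str,
                records: Iterable[dict[str, Any]],
                clean_one: Callable[[dict[str, Any]], None],
                stop_on_first_error: bool = False) -> CleaningSession:
    session = CleaningSession(user=user)
    for record in records:
        try:
            clean_one(record)
            session.succeeded += 1
        except Exception as exc:
            # Every failure is logged; processing continues unless told to stop.
            session.failed += 1
            session.errors.append(f"{record.get('No.', '?')}: {exc}")
            if stop_on_first_error:
                break
    return session


def clear_name(record: dict[str, Any]) -> None:
    # Simulated failure: a record flagged as locked cannot be modified.
    if record.get("locked"):
        raise RuntimeError("record is locked by another user")
    record["Name"] = ""


records = [
    {"No.": "C001", "Name": "A"},
    {"No.": "C002", "Name": "B", "locked": True},
    {"No.": "C003", "Name": "C"},
]
session = run_session("ADMIN", records, clear_name)
print(session.succeeded, session.failed)  # 2 1
```

With `stop_on_first_error=True`, the same input would stop after the locked record, leaving the third record untouched.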
Cleaning Methods
Clear Fields (Default):
- Sets field values to blank/zero
- Maintains record structure
- Preserves foreign keys where possible
Anonymization:
- Replaces with generic values
- Maintains data format and relationships
- Example: "John Smith" becomes "Person 001"
Conditional Cleaning:
- Cleans only if certain conditions are met
- Preserves data needed for legal/business purposes
- Example: Keep data if open transactions exist
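To make the difference between the three methods concrete, here is a small Python sketch. The function names (`clear_fields`, `anonymize_name`, `clean_if`) and the "Open Entries" field are assumptions made for illustration; the actual cleaning logic lives in the extension and may differ.

```python
from typing import Any, Callable

Record = dict[str, Any]


def clear_fields(record: Record, fields: list[str]) -> Record:
    """Blank out the listed fields while keeping the record and its key."""
    for name in fields:
        value = record.get(name)
        # Numeric fields become zero; everything else becomes an empty string.
        record[name] = 0 if isinstance(value, (int, float)) else ""
    return record


def anonymize_name(record: Record, sequence: int) -> Record:
    """Replace the name with a generic value such as 'Person 001'."""
    record["Name"] = f"Person {sequence:03d}"
    return record


def clean_if(record: Record,
             condition: Callable[[Record], bool],
             cleaner: Callable[[Record], Any]) -> bool:
    """Run the cleaner only when the condition holds; report whether it ran."""
    if condition(record):
        cleaner(record)
        return True
    return False


customer = {"No.": "C001", "Name": "John Smith",
            "Phone No.": "555-0100", "Open Entries": 0}

anonymize_name(customer, 1)
print(customer["Name"])  # Person 001

# Conditional cleaning: clear the phone number only if no open transactions exist.
was_cleaned = clean_if(customer,
                       lambda r: r["Open Entries"] == 0,
                       lambda r: clear_fields(r, ["Phone No."]))
print(was_cleaned, repr(customer["Phone No."]))  # True ''
```

In all three cases the record key ("No.") is left untouched, which is what keeps related records pointing at a valid row.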
Step 5: Monitor and Verify
Session Monitoring
During cleaning operation:
- Progress Indicators: Show completion percentage
- Real-Time Counts: Display processed, successful, and failed operations
- Error Messages: Show specific errors as they occur
- Performance Metrics: Track processing speed and resource usage
Completion Verification
After cleaning completes:
- Review Session Log: Check final statistics and any errors
- Verify Data: Spot-check that targeted data was actually cleaned
- Check Dependencies: Ensure related data maintains integrity
- Document Results: Record completion for audit purposes
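A spot-check like the one in the verification list can be automated on exported data. The sketch below assumes records are available as Python dictionaries keyed by field caption and that a cleaned field is blank, zero, or empty; `find_uncleaned` is a hypothetical helper, not a function provided by the extension.

```python
import random
from typing import Any, Optional


def find_uncleaned(records: list[dict[str, Any]],
                   fields: list[str],
                   sample_size: Optional[int] = None,
                   seed: Optional[int] = None) -> list[tuple[Any, str]]:
    """Return (record key, field) pairs that still hold a value after cleaning."""
    pool = list(records)
    if sample_size is not None and sample_size < len(pool):
        # Spot-check a random sample instead of the full set.
        pool = random.Random(seed).sample(pool, sample_size)
    leftovers = []
    for record in pool:
        for name in fields:
            if record.get(name) not in ("", 0, None):
                leftovers.append((record.get("No."), name))
    return leftovers


# Made-up post-cleaning data: C002 still has an e-mail address.
cleaned = [
    {"No.": "C001", "Name": "", "E-Mail": ""},
    {"No.": "C002", "Name": "", "E-Mail": "jane@example.com"},
]
print(find_uncleaned(cleaned, ["Name", "E-Mail"]))  # [('C002', 'E-Mail')]
```

An empty result means every checked field was cleaned; any returned pair is a candidate for follow-up in the session log.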
Cleaning Results and Logging
Session Log Information
Each cleaning session records:
- Session ID: Unique identifier for the cleaning operation
- Start/End Time: Duration of cleaning operation
- User: Who performed the cleaning
- Tables/Fields: Which data was targeted
- Statistics: Total processed, successful, and failed operations
- Errors: Detailed error messages for failed operations
Individual Operation Logs
For each field cleaning operation:
- Table/Field: Specific table and field cleaned
- Record Key: Identifies which record was processed
- Status: Success or failure
- Error Details: Specific error message if operation failed
- Timestamp: When the operation was performed
Accessing Logs
- Session Logs:
  - Open Cleaner Session Log page
  - Filter by date, user, or session status
  - Drill down to see individual operations
- Operation Logs:
  - Open Data Cleaner Log Entries page
  - Filter by session ID, table, or status
  - Review error details for failed operations
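If log entries are exported for offline review, the filtering and statistics described above can be reproduced in a few lines. The sketch below is a hypothetical model: `OperationLogEntry` mirrors the operation log fields listed earlier, but the real table layout and field names of the Data Cleaner Log Entries page are not documented here and may differ.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class OperationLogEntry:
    """Assumed shape of one operation log line (illustrative only)."""
    session_id: str
    table: str
    field_name: str
    record_key: str
    succeeded: bool
    error: Optional[str]
    timestamp: datetime


def failed_operations(entries, session_id, table=None):
    """Filter by session ID, optionally by table, keeping failures only."""
    return [e for e in entries
            if e.session_id == session_id
            and not e.succeeded
            and (table is None or e.table == table)]


def session_statistics(entries, session_id):
    """Compute the processed/successful/failed totals for one session."""
    counts = Counter(e.succeeded for e in entries if e.session_id == session_id)
    return {"processed": counts[True] + counts[False],
            "successful": counts[True],
            "failed": counts[False]}


now = datetime(2024, 1, 15, 10, 0, 0)
entries = [
    OperationLogEntry("S1", "Customer", "Name", "C001", True, None, now),
    OperationLogEntry("S1", "Customer", "E-Mail", "C001", False, "validation failed", now),
    OperationLogEntry("S1", "Employee", "Phone No.", "E007", True, None, now),
    OperationLogEntry("S2", "Customer", "Name", "C009", False, "record locked", now),
]

print(session_statistics(entries, "S1"))
# {'processed': 3, 'successful': 2, 'failed': 1}
print([e.record_key for e in failed_operations(entries, "S1", table="Customer")])
# ['C001']
```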
Best Practices
Planning and Preparation
Legal Compliance
- Document Requests: Maintain records of data subject requests
- Legal Basis: Ensure you have legal authority to delete data
- Retention Policies: Follow organizational data retention policies
- Audit Trail: Maintain complete audit trail of cleaning operations
Technical Preparation
- Test Environment: Always test cleaning operations first
- Backup Data: Create backups before large cleaning operations
- Dependencies: Map data dependencies before cleaning
- Performance: Plan cleaning during off-peak hours for large operations
Execution Best Practices
Phased Approach
- Start Small: Begin with small, low-risk data sets
- Validate Results: Check results thoroughly before proceeding
- Scale Gradually: Increase scope as confidence builds
- Monitor Impact: Watch for unexpected business impacts
Error Management
- Review Errors: Investigate all cleaning failures
- Fix Issues: Address underlying problems before retrying
- Document Solutions: Maintain knowledge base of common issues
- Escalate if Needed: Involve technical support for complex problems
Quality Assurance
Pre-Cleaning Validation
- Filter Testing: Verify filters select correct records
- Field Verification: Confirm correct fields are targeted
- Impact Assessment: Understand business process implications
- Rollback Plan: Have plan to recover if needed
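Filter testing and field verification can be combined into a dry run that reports what would be cleaned without changing anything. The sketch below uses the Employee filter example from Step 3; `dry_run` and `is_target` are hypothetical helpers, and the sample records are made up.

```python
from datetime import date
from typing import Any, Callable

Record = dict[str, Any]


def dry_run(records: list[Record],
            predicate: Callable[[Record], bool],
            fields: list[str]) -> dict[str, dict[str, Any]]:
    """Return {record key: {field: current value}} for data that would be cleaned.

    Nothing is modified; the result is only a preview.
    """
    preview = {}
    for record in records:
        if predicate(record):
            affected = {f: record[f] for f in fields
                        if record.get(f) not in ("", 0, None)}
            if affected:
                preview[record["No."]] = affected
    return preview


def is_target(r: Record) -> bool:
    # Same conditions as the Employee (5200) filter example.
    return (r["Status"] == "Terminated"
            and r["Termination Date"] < date(2022, 1, 1)
            and r["Employment Date"] > date(2010, 1, 1))


employees = [
    {"No.": "E001", "Status": "Terminated", "Termination Date": date(2021, 6, 30),
     "Employment Date": date(2015, 3, 1), "Phone No.": "555-0101", "E-Mail": ""},
    {"No.": "E002", "Status": "Active", "Termination Date": date(9999, 12, 31),
     "Employment Date": date(2018, 9, 1), "Phone No.": "555-0102", "E-Mail": "a@example.com"},
]

preview = dry_run(employees, is_target, ["Phone No.", "E-Mail"])
print(preview)  # {'E001': {'Phone No.': '555-0101'}}
```

Comparing the preview against the expected record count is a quick way to catch a filter that is too broad or too narrow.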
Post-Cleaning Validation
- Data Verification: Confirm target data was cleaned
- System Integrity: Verify system functions normally
- Business Process: Check that business processes still work
- Compliance: Ensure cleaning meets regulatory requirements
Advanced Cleaning Scenarios
Partial Data Cleaning
For situations where only some personal data should be removed:
- Field-Specific Cleaning: Clean only specific fields
- Conditional Cleaning: Clean based on business rules
- Retention Rules: Keep some data for legal/business requirements
Bulk Data Operations
For large-scale cleaning operations:
- Batch Processing: Process records in smaller batches
- Progress Checkpoints: Create recovery points during long operations
- Resource Management: Monitor system resources during processing
- Parallel Processing: Use multiple sessions for different data types
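Batch processing with checkpoints can be sketched as below. This is an assumption-laden illustration: `batched` and `clean_in_batches` are hypothetical helpers, and the checkpoint is modeled as a simple count of keys already processed, which a real system would need to persist between runs.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, Sequence


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def clean_in_batches(record_keys: Sequence[str],
                     clean_batch: Callable[[list[str]], None],
                     batch_size: int,
                     checkpoint: int = 0) -> int:
    """Process keys in batches and return the number of keys completed.

    `checkpoint` is the count of keys finished in an earlier run, so an
    interrupted operation can resume where it stopped.
    """
    done = checkpoint
    for batch in batched(record_keys[checkpoint:], batch_size):
        clean_batch(batch)
        done += len(batch)  # recovery point after each completed batch
    return done


keys = [f"C{i:03d}" for i in range(1, 8)]  # C001 .. C007
seen: list[list[str]] = []
completed = clean_in_batches(keys, seen.append, batch_size=3)
print(completed, [len(b) for b in seen])  # 7 [3, 3, 1]
```

Smaller batches keep each unit of work short, which limits how much has to be repeated after a failure and reduces contention with other users.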
Integration with Business Processes
- Workflow Integration: Incorporate cleaning into business workflows
- Automated Triggers: Set up automatic cleaning based on business events
- Approval Processes: Require approval for certain types of cleaning
- Notification Systems: Alert stakeholders when cleaning is completed
Troubleshooting
Common Issues
Cleaning Operation Fails
- Check Permissions: Verify user has modify rights on target tables
- Review Filters: Ensure filters don't exclude all records
- Check Dependencies: Look for foreign key constraints preventing deletion
- System Resources: Verify sufficient system resources available
Partial Cleaning Success
- Review Error Log: Check which operations failed and why
- Data Constraints: Look for business logic preventing field updates
- Concurrent Usage: Check if other users are accessing the same data
- Field Validation: Verify cleaning doesn't violate field validation rules
Performance Issues
- Batch Size: Reduce number of records processed per batch
- Index Usage: Ensure proper indexing on filtered fields
- System Load: Avoid cleaning during peak usage times
- Resource Allocation: Ensure adequate server resources
Error Resolution
For specific error types, see:
- Data Cleaning Issues
- Permission Problems
- Common Errors
Next Steps
After mastering data cleaning:
- Session Management: Learn advanced session management
- API Overview: Explore automation possibilities
- FAQ: Review frequently asked questions about cleaning