
Data Governance

Policies, Ownership, and Compliance Frameworks for ComplyAI Data Assets


📋 Table of Contents

  1. Data Governance Framework
  2. Data Ownership Model
  3. Data Classification
  4. Data Quality Standards
  5. Privacy & Compliance
  6. Data Retention Policies
  7. Access Control
  8. Change Management
  9. Incident Response
  10. Audit & Monitoring

Data Governance Framework​

Governance Principles​

Principle | Description
Accountability | Every data asset has a designated owner responsible for its quality and security
Transparency | Data definitions, lineage, and policies are documented and accessible
Integrity | Data is accurate, complete, and consistent across systems
Security | Data is protected according to its classification level
Compliance | Data handling meets regulatory and contractual requirements
Stewardship | Data is treated as a valuable organizational asset

Governance Structure

┌───────────────────────────────────────────────────────────────┐
│                    DATA GOVERNANCE COUNCIL                    │
│        (Executive Sponsor + Domain Leads + Compliance)        │
└───────────────────────────────────────────────────────────────┘
                                │
         ┌──────────────────────┼──────────────────────┐
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   DATA OWNERS   │    │  DATA STEWARDS  │    │ DATA CUSTODIANS │
│                 │    │                 │    │                 │
│    Business     │    │    Quality &    │    │    Technical    │
│ Accountability  │    │    Standards    │    │ Implementation  │
│   for domains   │    │   Enforcement   │    │  & Operations   │
└─────────────────┘    └─────────────────┘    └─────────────────┘

Roles & Responsibilities​

Data Governance Council​

  • Meets: Monthly (or as needed for urgent matters)
  • Members: CEO/CTO (Sponsor), Engineering Lead, Product Lead, Compliance Lead
  • Responsibilities:
    • Set data governance strategy and priorities
    • Approve major policy changes
    • Resolve cross-domain data conflicts
    • Allocate resources for governance initiatives

Data Owners (by Domain)​

Domain | Owner Role | Responsibilities
Customer Domain | Head of Product | Users, Organizations, Subscriptions
Ad Content Domain | Head of Engineering | Ad Accounts, Ads, Media Assets
Compliance Domain | Head of Compliance | Reviews, Policies, Violations
Operational Domain | Head of Engineering | Activity Events, Notifications

Data Stewards​

  • Role: Cross-functional team members who ensure data quality
  • Responsibilities:
    • Monitor data quality metrics
    • Enforce naming conventions
    • Review data change requests
    • Maintain documentation accuracy

Data Custodians​

  • Role: Engineering/DevOps team members
  • Responsibilities:
    • Implement technical controls
    • Manage database access
    • Execute backup/recovery
    • Monitor system health

Data Ownership Model​

Domain Ownership Matrix​

Data Entity | Domain | Owner | Steward | Primary System
users | Customer | Product | Engineering | complyai-core
organizations | Customer | Product | Engineering | complyai-core
subscriptions | Customer | Finance | Engineering | complyai-core
roles | Customer | Product | Engineering | complyai-core
org_business_accounts | Ad Content | Engineering | Engineering | complyai-core
org_ad_accounts | Ad Content | Engineering | Engineering | complyai-core
org_ads | Ad Content | Engineering | Engineering | complyai-api
org_ads_score | Compliance | Compliance | Engineering | complyai-violin
facebook_ad_status | Compliance | Compliance | Engineering | complyai-maestro
activity_events | Operational | Engineering | Engineering | complyai-core
notifications | Operational | Product | Engineering | complyai-triangle

Ownership Responsibilities​

Data Owners Must:

  1. Define business rules for their data
  2. Approve access requests
  3. Review data quality reports monthly
  4. Sign off on schema changes
  5. Ensure compliance with regulations

System of Record (SOR)

Entity | System of Record | Rationale
User Profile | complyai-core | Central authentication/identity
Organization Details | complyai-core | Central customer management
Ad Account Metadata | Meta Graph API | External authoritative source
Ad Content/Creative | Meta Graph API | External authoritative source
Compliance Scores | complyai-violin | ComplyAI's proprietary analysis
Billing/Subscription | Stripe | External payment processor

Data Classification​

Classification Levels​

Level | Label | Description | Examples
1 | 🔴 Restricted | Highly sensitive, regulatory impact | Access tokens, passwords, SSN, payment data
2 | 🟠 Confidential | Business-sensitive, limited access | User PII, organization financials, contracts
3 | 🟡 Internal | Internal use only | Internal metrics, employee data, system configs
4 | 🟢 Public | Can be shared externally | Marketing content, public documentation

Data Classification by Table​

Table / Column | Classification | Rationale
users.password | 🔴 Restricted | Authentication credential
users.access_token | 🔴 Restricted | Meta API authentication
org_business_accounts.system_user_access_token | 🔴 Restricted | Service authentication
subscriptions.stripe_customer_id | 🔴 Restricted | Payment reference
users.email | 🟠 Confidential | Personally identifiable information
users.first_name, users.last_name | 🟠 Confidential | Personally identifiable information
organizations.name | 🟠 Confidential | Business information
org_ads.* | 🟠 Confidential | Client advertising data
activity_events.* | 🟡 Internal | Audit/operational data
notifications.* | 🟡 Internal | System messages
policies.* (CMS) | 🟢 Public | Published compliance guidance

Handling Requirements by Classification​

Classification | Storage | Transmission | Access | Logging
🔴 Restricted | Encrypted at rest (AES-256) | TLS 1.2+ required | Role-based, MFA required | Full audit trail
🟠 Confidential | Encrypted at rest | TLS 1.2+ required | Role-based | Access logged
🟡 Internal | Standard database | TLS recommended | Team-based | Standard logging
🟢 Public | Standard | Any | Open | Minimal
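
The classification levels and handling requirements above lend themselves to a lookup that application code can query before touching a column. The following is a minimal sketch, not ComplyAI's actual implementation: the names `Classification`, `Handling`, `classify`, and `requires_mfa` are hypothetical, only a subset of columns is listed, and the choice to treat unlisted columns as Restricted is an assumption (fail closed).

```python
from dataclasses import dataclass
from enum import IntEnum


class Classification(IntEnum):
    RESTRICTED = 1
    CONFIDENTIAL = 2
    INTERNAL = 3
    PUBLIC = 4


@dataclass(frozen=True)
class Handling:
    storage: str
    transmission: str
    access: str
    logging: str


# Values copied from the handling requirements table.
HANDLING = {
    Classification.RESTRICTED: Handling(
        "Encrypted at rest (AES-256)", "TLS 1.2+ required",
        "Role-based, MFA required", "Full audit trail"),
    Classification.CONFIDENTIAL: Handling(
        "Encrypted at rest", "TLS 1.2+ required", "Role-based", "Access logged"),
    Classification.INTERNAL: Handling(
        "Standard database", "TLS recommended", "Team-based", "Standard logging"),
    Classification.PUBLIC: Handling("Standard", "Any", "Open", "Minimal"),
}

# A subset of the per-column classifications (illustrative, not exhaustive).
COLUMN_CLASSIFICATION = {
    "users.password": Classification.RESTRICTED,
    "users.access_token": Classification.RESTRICTED,
    "users.email": Classification.CONFIDENTIAL,
    "org_ads.*": Classification.CONFIDENTIAL,
    "activity_events.*": Classification.INTERNAL,
    "notifications.*": Classification.INTERNAL,
}


def classify(column: str) -> Classification:
    """Exact match first, then the table wildcard.

    Assumption: unlisted columns fail closed to RESTRICTED.
    """
    if column in COLUMN_CLASSIFICATION:
        return COLUMN_CLASSIFICATION[column]
    table = column.split(".", 1)[0]
    return COLUMN_CLASSIFICATION.get(f"{table}.*", Classification.RESTRICTED)


def requires_mfa(column: str) -> bool:
    """True when the handling rule for the column mentions MFA."""
    return "MFA" in HANDLING[classify(column)].access
```

For example, `requires_mfa("users.access_token")` is true, while `HANDLING[classify("notifications.body")].transmission` yields "TLS recommended".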

Data Quality Standards​

Data Quality Dimensions​

Dimension | Definition | Target | Measurement
Accuracy | Data correctly represents reality | 99%+ | Comparison with source systems
Completeness | Required fields are populated | 98%+ | NULL/empty field counts
Consistency | Data agrees across systems | 99%+ | Cross-system reconciliation
Timeliness | Data is up-to-date | Per SLA | Lag time monitoring
Uniqueness | No unintended duplicates | 100% | Duplicate detection queries
Validity | Data conforms to business rules | 99%+ | Constraint violation counts

Quality Rules by Entity​

Users Table​

Field | Rule | Validation
email | Must be valid email format | Regex validation
email | Must be unique | Unique constraint
auth0_user_id | Must match Auth0 record | Periodic sync check
created_at | Must not be future date | Check constraint
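
As an illustration of the users rules above, the sketch below applies the format, uniqueness, and future-date checks to a single row in application code. It is not the production validator: the function name `validate_user`, the deliberately loose regex, and the case-insensitive treatment of email uniqueness are all assumptions, and the Auth0 sync check is omitted because it needs an external call.

```python
import re
from datetime import datetime, timezone

# Deliberately loose pattern: one "@", no whitespace, a dot in the domain.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_user(row, seen_emails, now=None):
    """Return a list of rule violations for one users row.

    `seen_emails` is a set shared across calls to detect duplicates.
    """
    now = now or datetime.now(timezone.utc)
    errors = []
    email = (row.get("email") or "").strip().lower()
    if not EMAIL_RE.match(email):
        errors.append("email: invalid format")
    elif email in seen_emails:
        errors.append("email: duplicate")
    else:
        seen_emails.add(email)
    created_at = row.get("created_at")
    if created_at is not None and created_at > now:
        errors.append("created_at: in the future")
    return errors
```

In a database these rules would normally live in unique and check constraints, as the table states; a function like this is mainly useful for validating batches before load.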

Organizations Table​

Field | Rule | Validation
name | Required, non-empty | NOT NULL constraint
stripe_customer_id | Must exist in Stripe (if set) | API verification
subscription_status | Must be valid enum | Check constraint

Ad Accounts Table​

Field | Rule | Validation
meta_id | Must be valid Meta ID format | Format validation
meta_id | Must be unique per organization | Unique constraint
status | Must match Meta current status | 15-min sync check

Data Quality Monitoring​

Automated Checks (Run Daily)

  1. NULL field analysis across critical tables
  2. Orphan record detection (FKs pointing to deleted records)
  3. Duplicate detection on unique business keys
  4. Cross-system reconciliation (Meta ↔ Local counts)
  5. Stale data detection (records not updated per expected cadence)
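
The first three daily checks can be expressed as plain SQL. The sketch below runs them against an in-memory SQLite database so that it is self-contained; the demo schema (including the `organization_id` column) and the helper names are hypothetical, and a real job would target the production database engine instead. Table and column names are interpolated into the SQL, so they must come from trusted constants, never from user input.

```python
import sqlite3


def null_rate(conn, table, column):
    """Share of rows where `column` is NULL or empty (check 1)."""
    total, empty = conn.execute(
        f"SELECT COUNT(*), SUM({column} IS NULL OR {column} = '') FROM {table}"
    ).fetchone()
    return (empty or 0) / total if total else 0.0


def orphan_count(conn, child, fk, parent, pk="id"):
    """Child rows whose foreign key points at a missing parent (check 2)."""
    (count,) = conn.execute(
        f"SELECT COUNT(*) FROM {child} c LEFT JOIN {parent} p "
        f"ON c.{fk} = p.{pk} WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL"
    ).fetchone()
    return count


def duplicate_keys(conn, table, key_columns):
    """Business-key values that occur more than once (check 3)."""
    cols = ", ".join(key_columns)
    rows = conn.execute(
        f"SELECT {cols} FROM {table} GROUP BY {cols} "
        f"HAVING COUNT(*) > 1 ORDER BY {cols}"
    )
    return [tuple(r) for r in rows]


def build_demo_db():
    """Tiny fixture with one NULL name, one orphan, and one duplicate key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE org_ad_accounts (
            id INTEGER PRIMARY KEY, organization_id INTEGER, meta_id TEXT);
        INSERT INTO organizations VALUES (1, 'Acme'), (2, NULL);
        INSERT INTO org_ad_accounts VALUES
            (1, 1, 'act_1'), (2, 1, 'act_1'), (3, 99, 'act_2');
    """)
    return conn
```

With the fixture, `null_rate(conn, "organizations", "name")` is 0.5, one ad account is orphaned, and `(1, 'act_1')` is reported as a duplicate of the per-organization `meta_id` key.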

Quality Dashboards

  • Daily quality score by domain
  • Trend analysis (week over week)
  • Alert thresholds for quality degradation

Quality Issue Resolution Process

  1. Automated alert triggers on threshold breach
  2. Data Steward triages and assigns to owner
  3. Root cause analysis performed
  4. Fix implemented and validated
  5. Post-mortem for systemic issues

Privacy & Compliance​

Regulatory Framework​

Regulation | Applicability | Key Requirements
GDPR | EU users/customers | Consent, right to erasure, data portability
CCPA/CPRA | California residents | Disclosure, opt-out, deletion rights
SOC 2 | All operations | Security, availability, confidentiality
Meta Platform Terms | Ad data | Data use restrictions, retention limits

Personal Data Inventory​

Data Element | Legal Basis | Retention | Subject Rights
User email | Contract performance | Account lifetime + 30 days | Access, Rectification, Erasure
User name | Contract performance | Account lifetime + 30 days | Access, Rectification, Erasure
Activity logs | Legitimate interest | 2 years | Access
Ad account data | Contract performance | Account lifetime + 90 days | Access, Portability
Payment info | Contract/Legal | 7 years (financial records) | Access

Data Subject Rights Procedures​

Right to Access​

  1. User submits request via Settings or email
  2. Identity verification performed
  3. Data export generated within 30 days
  4. Delivered securely to verified email

Right to Erasure ("Right to be Forgotten")​

  1. User submits deletion request
  2. Verify no legal hold requirements
  3. Soft delete user record (anonymize PII)
  4. Cascade to related records per retention policy
  5. Confirm deletion to user within 30 days

Erasure Exceptions:

  • Active subscription with outstanding balance
  • Legal hold or regulatory requirement
  • Data required for legal defense
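
The erasure steps and exceptions above can be sketched as a guard followed by anonymization. This is a simplified illustration under stated assumptions: `UserRecord`, `ErasureBlocked`, and the three flag fields are hypothetical, the "active subscription with outstanding balance" exception is reduced to a positive balance, and the cascade to related records is left out.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class UserRecord:
    id: int
    email: Optional[str]
    first_name: Optional[str]
    last_name: Optional[str]
    outstanding_balance_cents: int = 0
    legal_hold: bool = False
    needed_for_legal_defense: bool = False
    deleted: bool = False


class ErasureBlocked(Exception):
    """Raised when one of the documented exceptions applies."""

    def __init__(self, reasons: List[str]):
        super().__init__("; ".join(reasons))
        self.reasons = reasons


def erasure_blockers(user: UserRecord) -> List[str]:
    """Map the documented erasure exceptions onto the record's flags."""
    reasons = []
    if user.outstanding_balance_cents > 0:
        reasons.append("outstanding balance")
    if user.legal_hold:
        reasons.append("legal hold")
    if user.needed_for_legal_defense:
        reasons.append("legal defense")
    return reasons


def erase(user: UserRecord) -> UserRecord:
    """Soft delete: keep the row and its id, blank the PII fields."""
    blockers = erasure_blockers(user)
    if blockers:
        raise ErasureBlocked(blockers)
    return replace(user, email=None, first_name=None, last_name=None, deleted=True)
```

Keeping the row and its `id` lets related records be handled later according to the retention policy instead of being removed in the same step.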

Right to Rectification​

  1. User updates their profile in-app or submits a correction request
  2. Data steward reviews and applies submitted corrections
  3. Confirmation sent to user

Consent Management

Purpose | Consent Type | Collection Point | Withdrawal Method
Account creation | Explicit | Registration | Account deletion
Email notifications | Explicit | Onboarding | Settings toggle
Marketing emails | Opt-in | Registration/Dashboard | Unsubscribe link
Analytics cookies | Opt-in | Cookie banner | Cookie settings
Data sharing (Meta) | Explicit | OAuth flow | Disconnect account

Data Retention Policies​

Retention Schedule​

Data Category | Retention Period | Archive Policy | Deletion Method
User Accounts | Account lifetime + 30 days | N/A | Hard delete after anonymization
Organization Data | Account lifetime + 90 days | Archive to cold storage | Hard delete
Ad Data (org_ads) | 2 years from creation | Archive after 1 year | Hard delete
Compliance Scores | 2 years from creation | Archive after 1 year | Hard delete
Activity Logs | 2 years | Archive after 6 months | Hard delete
Webhook Events | 90 days | N/A | Hard delete
Session Data | 30 days | N/A | Auto-expire
Backup Data | 30 days (daily), 1 year (monthly) | Encrypted S3 | Lifecycle policy
Financial Records | 7 years | Required for compliance | Legal hold

Retention Triggers​

Trigger Event | Action | Timeline
User requests account deletion | Begin deletion workflow | Within 30 days
Organization churns | Mark for retention countdown | 90-day countdown
Subscription expires (no renewal) | Mark inactive | After 30-day grace
Retention period expires | Automated deletion job | Nightly job
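
The nightly deletion job reduces to a date comparison per record. The following sketch shows one way to compute deletion dates from the retention schedule; the category keys and function names are hypothetical, "2 years" is approximated as 730 days, and the anchor date (account closure, churn, or creation, depending on category) is assumed to be supplied by the caller.

```python
from datetime import date, timedelta

# Days retained after the anchor event; "2 years" is approximated as 730 days.
RETENTION_DAYS = {
    "user_accounts": 30,       # after account closure
    "organization_data": 90,   # after churn
    "ad_data": 730,            # after creation
    "compliance_scores": 730,  # after creation
    "activity_logs": 730,
    "webhook_events": 90,
    "session_data": 30,
}


def deletion_date(category: str, anchor: date) -> date:
    """First day on which a record of this category may be deleted."""
    return anchor + timedelta(days=RETENTION_DAYS[category])


def due_for_deletion(records, today: date):
    """Select record ids whose retention period has expired.

    `records` is an iterable of (record_id, category, anchor_date) tuples.
    """
    return [
        record_id
        for record_id, category, anchor in records
        if deletion_date(category, anchor) <= today
    ]
```

For instance, a webhook event created on 2024-01-01 becomes deletable on 2024-03-31.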

Archive Process​

  1. Identify records meeting archive criteria
  2. Export to compressed, encrypted format
  3. Upload to S3 Glacier
  4. Verify archive integrity
  5. Remove from production database
  6. Update audit log

Legal Hold Process

  1. Legal/Compliance issues hold request
  2. Tag affected records with hold flag
  3. Exclude from automated deletion
  4. Maintain until hold released
  5. Document hold duration and reason
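
The legal hold steps above amount to a registry that the deletion job consults before removing anything. A minimal in-memory sketch follows; `Hold` and `HoldRegistry` are hypothetical names, and a real system would persist holds in the database alongside the hold flag.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Hold:
    reason: str
    placed_on: date
    released_on: Optional[date] = None  # None while the hold is active


class HoldRegistry:
    def __init__(self) -> None:
        self._holds: Dict[str, List[Hold]] = {}

    def place(self, record_id: str, reason: str, placed_on: date) -> None:
        """Tag a record with a hold and document the reason."""
        self._holds.setdefault(record_id, []).append(Hold(reason, placed_on))

    def release(self, record_id: str, released_on: date) -> None:
        """Close every active hold on the record, keeping the history."""
        for hold in self._holds.get(record_id, []):
            if hold.released_on is None:
                hold.released_on = released_on

    def is_held(self, record_id: str) -> bool:
        return any(h.released_on is None for h in self._holds.get(record_id, []))

    def filter_deletable(self, candidate_ids):
        """Drop held records from a list of deletion candidates."""
        return [rid for rid in candidate_ids if not self.is_held(rid)]

    def history(self, record_id: str) -> List[Hold]:
        """Past and present holds, for documenting duration and reason."""
        return list(self._holds.get(record_id, []))
```

Released holds stay in the history, so the duration and reason of each hold remain documented after the record becomes deletable again.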

Access Control​

Access Control Model​

ComplyAI uses Role-Based Access Control (RBAC) with the following hierarchy:

┌─────────────────────────────────────────┐
│             PLATFORM ADMIN              │
│  (ComplyAI Staff - Full System Access)  │
└─────────────────────────────────────────┘
                     │
┌─────────────────────────────────────────┐
│           ORGANIZATION OWNER            │
│   (Customer Admin - Full Org Access)    │
└─────────────────────────────────────────┘
                     │
         ┌───────────┴───────────┐
         ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│      ADMIN      │     │     MEMBER      │
│                 │     │                 │
│  Manage Users   │     │  View/Edit Ads  │
│  Manage Accts   │     │  View Reports   │
│  View Reports   │     │                 │
└─────────────────┘     └─────────────────┘

Permission Matrix​

Action | Platform Admin | Org Owner | Admin | Member
View organization data | ✅ | ✅ | ✅ | ✅
Edit organization settings | ✅ | ✅ | ✅ | ❌
Add/remove users | ✅ | ✅ | ✅ | ❌
Connect ad accounts | ✅ | ✅ | ✅ | ❌
View ads | ✅ | ✅ | ✅ | ✅
Edit ads/feedback | ✅ | ✅ | ✅ | ✅
View billing | ✅ | ✅ | ❌ | ❌
Change subscription | ✅ | ✅ | ❌ | ❌
Access other organizations | ✅ | ❌ | ❌ | ❌
Access admin dashboard | ✅ | ❌ | ❌ | ❌
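
The permission matrix translates directly into a lookup table. The sketch below is one possible encoding, not the platform's actual authorization code: the `Role` enum, the action identifiers, and `is_allowed` are hypothetical names, and denying unknown actions by default is an assumption.

```python
from enum import Enum


class Role(Enum):
    PLATFORM_ADMIN = "platform_admin"
    ORG_OWNER = "org_owner"
    ADMIN = "admin"
    MEMBER = "member"


EVERYONE = frozenset(Role)
MANAGERS = frozenset({Role.PLATFORM_ADMIN, Role.ORG_OWNER, Role.ADMIN})
OWNERS = frozenset({Role.PLATFORM_ADMIN, Role.ORG_OWNER})
STAFF = frozenset({Role.PLATFORM_ADMIN})

# One entry per row of the permission matrix.
PERMISSIONS = {
    "view_organization_data": EVERYONE,
    "edit_organization_settings": MANAGERS,
    "add_remove_users": MANAGERS,
    "connect_ad_accounts": MANAGERS,
    "view_ads": EVERYONE,
    "edit_ads_feedback": EVERYONE,
    "view_billing": OWNERS,
    "change_subscription": OWNERS,
    "access_other_organizations": STAFF,
    "access_admin_dashboard": STAFF,
}


def is_allowed(role: Role, action: str) -> bool:
    """Unknown actions are denied (assumption: default deny)."""
    return role in PERMISSIONS.get(action, frozenset())
```

For example, `is_allowed(Role.MEMBER, "edit_ads_feedback")` is true, while `is_allowed(Role.ADMIN, "view_billing")` is false.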

Database Access Control​

Access Level | Who | Access Method | Audit
Production Read/Write | Designated DBAs only | Bastion host + MFA | Full query logging
Production Read-Only | Senior Engineers (approved) | Bastion host + MFA | Full query logging
Staging Full Access | Engineering team | VPN + credentials | Standard logging
Local Development | All developers | Local containers | N/A

Access Request Process​

  1. Submit access request via internal tool
  2. Manager approval
  3. Data Owner approval (for sensitive data)
  4. IT provisions access
  5. Access reviewed quarterly

Service-to-Service Authentication​

  • Internal services use API keys stored in AWS Secrets Manager
  • Service mesh validates service identity
  • All inter-service calls logged

Change Management​

Schema Change Process

┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   Request    │───▶│    Review    │───▶│     Test     │───▶│    Deploy    │
│              │    │              │    │              │    │              │
│ RFC Document │    │  Data Owner  │    │   Staging    │    │  Production  │
│    Impact    │    │  DBA Review  │    │ Integration  │    │   Rollback   │
│   Analysis   │    │   Security   │    │    Tests     │    │  Plan Ready  │
└──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘

Change Categories​

Category | Examples | Approval Required | Lead Time
Emergency | Security patches, critical bugs | Post-hoc approval | Immediate
Standard | New columns (nullable), new indexes | Team lead | 1 sprint
Significant | New tables, column type changes | Data Owner + DBA | 2 sprints
Major | Schema redesign, migrations | Governance Council | 1 quarter

Schema Change Checklist​

Before Change:

  • Impact analysis completed
  • Backward compatibility assessed
  • Rollback plan documented
  • Data migration script tested
  • Documentation updated
  • Stakeholders notified

During Change:

  • Change window communicated
  • Monitoring in place
  • Rollback ready to execute

After Change:

  • Verification tests passed
  • Performance validated
  • Documentation finalized
  • Change logged in changelog

API Change Management​

Change Type | Versioning | Deprecation Notice
Breaking changes | New version (v2, v3) | 6 months minimum
Additive changes | Same version | N/A
Bug fixes | Same version | N/A
Deprecations | Mark deprecated | 3 months minimum

Incident Response​

Data Incident Classification​

Severity | Description | Response Time | Escalation
P1 - Critical | Data breach, mass data loss | Immediate | CEO, Legal, All Hands
P2 - High | Data corruption, unauthorized access | < 1 hour | CTO, Security Lead
P3 - Medium | Data quality issues, minor exposure | < 4 hours | Data Owner, Engineering
P4 - Low | Documentation gaps, minor inconsistencies | < 24 hours | Data Steward

Incident Response Procedure​

Phase 1: Detection & Triage

  1. Incident detected (monitoring, user report, audit)
  2. On-call engineer triages
  3. Severity assigned
  4. Incident channel created (#incident-YYYYMMDD)
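
Triage output (severity, response deadline, escalation list, and channel name) can be derived mechanically from the severity table and the channel naming convention above. The sketch below is illustrative only: `open_incident` is a hypothetical helper, and modeling "Immediate" as a zero-length window is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Response targets from the severity table; "Immediate" is modeled as zero.
RESPONSE_TARGET = {
    "P1": timedelta(0),
    "P2": timedelta(hours=1),
    "P3": timedelta(hours=4),
    "P4": timedelta(hours=24),
}

ESCALATION = {
    "P1": ["CEO", "Legal", "All Hands"],
    "P2": ["CTO", "Security Lead"],
    "P3": ["Data Owner", "Engineering"],
    "P4": ["Data Steward"],
}


def open_incident(severity: str, detected_at: datetime) -> dict:
    """Build the triage record for a newly detected incident."""
    return {
        "severity": severity,
        "channel": f"#incident-{detected_at:%Y%m%d}",
        "respond_by": detected_at + RESPONSE_TARGET[severity],
        "escalate_to": ESCALATION[severity],
    }
```

A P2 incident detected at 14:30 UTC on 2024-12-08 would get the channel `#incident-20241208` and a response deadline of 15:30 UTC.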

Phase 2: Containment

  1. Isolate affected systems if needed
  2. Preserve evidence (logs, snapshots)
  3. Stop ongoing data exposure
  4. Notify affected parties (if P1/P2)

Phase 3: Investigation

  1. Root cause analysis
  2. Scope determination (what data, how many records)
  3. Timeline reconstruction
  4. Document findings

Phase 4: Remediation

  1. Fix vulnerability/issue
  2. Restore data if needed
  3. Verify fix effectiveness
  4. Update monitoring/alerts

Phase 5: Post-Incident

  1. Blameless post-mortem
  2. Update runbooks/procedures
  3. Regulatory notifications (if required)
  4. Customer communications (if required)

Breach Notification Requirements​

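Regulation | Notification Timeline | Who to Notify
GDPR | 72 hours after becoming aware (supervisory authority); without undue delay for affected users when the risk to them is high | Supervisory authority + affected users
CCPA | "Without unreasonable delay" | Affected California residents
Contractual | Per agreement | Affected customers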
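
Because the GDPR window is a fixed 72 hours, the deadline for notifying the supervisory authority can be computed as soon as the breach is confirmed. A small sketch, assuming the clock starts when the organization becomes aware of the breach; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

GDPR_AUTHORITY_WINDOW = timedelta(hours=72)


def gdpr_authority_deadline(aware_at: datetime) -> datetime:
    """Latest time to notify the supervisory authority."""
    return aware_at + GDPR_AUTHORITY_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    return (gdpr_authority_deadline(aware_at) - now).total_seconds() / 3600
```

Using timezone-aware datetimes throughout avoids off-by-hours errors when responders work across regions.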

Audit & Monitoring​

Audit Logging Requirements​

Event Type | Logged Fields | Retention
User login | user_id, timestamp, IP, success/fail | 2 years
Data access (sensitive) | user_id, table, record_id, timestamp | 2 years
Data modification | user_id, table, record_id, old/new values, timestamp | 2 years
Admin actions | user_id, action, target, timestamp | 2 years
API calls | endpoint, user/service, params, response_code | 90 days
System events | service, event_type, details, timestamp | 90 days

Audit Log Structure​

{
  "timestamp": "2024-12-08T14:30:00Z",
  "event_type": "data_access",
  "actor": {
    "type": "user",
    "id": "user_123",
    "ip": "192.168.1.1"
  },
  "resource": {
    "type": "table",
    "name": "users",
    "record_id": "456"
  },
  "action": "read",
  "outcome": "success",
  "metadata": {
    "fields_accessed": ["email", "first_name"],
    "query_context": "user_profile_view"
  }
}
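
A small builder keeps emitted events consistent with the structure shown above. This is a sketch rather than the logging library actually in use: `audit_event` and `missing_keys` are hypothetical helpers, and timestamps are assumed to be timezone-aware so they can be normalized to UTC.

```python
import json
from datetime import datetime, timezone

REQUIRED_KEYS = ("timestamp", "event_type", "actor", "resource", "action", "outcome")


def audit_event(event_type, actor, resource, action, outcome,
                metadata=None, timestamp=None):
    """Build an audit record in the documented shape.

    `timestamp` should be timezone-aware; it defaults to the current UTC time.
    """
    ts = (timestamp or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "event_type": event_type,
        "actor": dict(actor),
        "resource": dict(resource),
        "action": action,
        "outcome": outcome,
        "metadata": dict(metadata or {}),
    }


def missing_keys(event):
    """Required top-level keys that are absent from an event."""
    return [key for key in REQUIRED_KEYS if key not in event]


example = audit_event(
    "data_access",
    actor={"type": "user", "id": "user_123", "ip": "192.168.1.1"},
    resource={"type": "table", "name": "users", "record_id": "456"},
    action="read",
    outcome="success",
    metadata={"fields_accessed": ["email", "first_name"],
              "query_context": "user_profile_view"},
    timestamp=datetime(2024, 12, 8, 14, 30, tzinfo=timezone.utc),
)
serialized = json.dumps(example)
```

Serializing with `json.dumps` produces one line per event, which suits append-only log storage.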

Monitoring Dashboards​

Data Quality Dashboard

  • Completeness scores by table
  • Freshness (time since last update)
  • Anomaly detection alerts
  • Cross-system reconciliation status

Security Dashboard

  • Failed login attempts
  • Unusual access patterns
  • Privilege escalations
  • API rate limit breaches

Compliance Dashboard

  • Data subject requests (pending/completed)
  • Consent status distribution
  • Retention policy compliance
  • Encryption status

Periodic Reviews​

Review Type | Frequency | Participants | Output
Access review | Quarterly | Data Owners, IT | Access adjustments
Quality review | Monthly | Data Stewards | Quality improvement plan
Policy review | Annually | Governance Council | Policy updates
Compliance audit | Annually | External auditor | Audit report
Penetration test | Annually | Security firm | Remediation plan

Governance Metrics & KPIs​

Metric | Target | Current | Trend
Data quality score (overall) | >95% | TBD | N/A
Critical field completeness | >99% | TBD | N/A
Data subject request SLA | 100% within 30 days | TBD | N/A
Incident response SLA | 100% within target | TBD | N/A
Documentation coverage | 100% of tables | TBD | N/A
Access review completion | 100% quarterly | TBD | N/A


Changelog​

Version | Date | Author | Changes
1.0 | 2024-12 | Data Team | Initial governance framework

Last Updated: December 2024