by Monika Andraos
Founder & Principal Consultant, Dunamis Compliance
Emerging life science companies frequently treat data integrity (DI) as a downstream compliance obligation rather than an upstream design principle. This approach generates significant operational, regulatory, and financial risk that compounds rapidly with organizational growth and diverges from the path to Quality Management Maturity.
The concept of Data Integrity by Design, as defined by the International Society for Pharmaceutical Engineering (ISPE) GAMP® Records and Data Integrity Good Practice Guide (2021), requires that DI considerations be incorporated from the initial planning of a business process through the implementation, operation, and retirement of supporting computerized systems.
Despite clear regulatory expectations (FDA 2018 Data Integrity Guidance, MHRA 2018 GxP Data Integrity Guidance, PIC/S PI 041-1, WHO Annex 4), many early-stage life science organizations continue to address data integrity reactively. The consequences (clinical holds, delayed filings, Form 483 observations, recalls, and multi-year remediation programs) are well documented yet largely preventable.
This article examines five recurrent failure modes observed in pre-commercial biotechnology and pharmaceutical organizations, along with a corresponding structured framework for embedding Data Integrity by Design from inception. Implementing these principles reduces technical debt, improves data return on investment, and eliminates many traditional regulatory pain points without reliance on reactive, compliance-driven controls.
1. Delegation of Data Integrity to a Third Party
A prevalent misconception is that DI can be procured through vendor-supplied validation packages or certificates of compliance (e.g., 21 CFR Part 11). In practice, such artifacts represent point-in-time evidence of a vendor's configuration potential rather than assurance that the system, as implemented, operated, and maintained within the sponsor organization, maintains integrity throughout the data lifecycle. The absence of organization-specific System Requirements Specifications (SRS) and rigorous supplier assessment frequently results in systems that, while compliant on paper, are vulnerable in execution.
2. Compromised Risk Management in Resource-Constrained Environments
In smaller, less mature organizations, individuals routinely assume multiple critical roles (e.g., Quality Manager, CSV SME, QA Investigator, and System Administrator). This convergence of responsibilities undermines the fundamental risk management principle of independent evaluation. Risk assessments become exercises in conflicted judgment rather than objective analyses, with risk documentation shaped by operational urgency rather than evidence.
3. Absence of Formal Data Governance Frameworks
Almost all early-stage organizations lack foundational data governance elements: a data governance policy, defined data stewardship roles, and a master data classification scheme. Instead, critical data elements are managed ad hoc by system administrators whose primary responsibilities lie elsewhere. The resulting proliferation of local conventions for naming, storage, and retention creates latent inconsistencies that become exponentially costly to resolve at scale.
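To make the missing elements concrete, the sketch below shows one minimal way to encode a master data classification scheme, a naming convention, and a stewardship check as an automatable governance rule set. The classification tiers, naming pattern, and fields are hypothetical illustrations, not prescriptions; an actual scheme must come from the organization's own governance policy.

```python
import re
from dataclasses import dataclass

# Hypothetical classification tiers; real tiers come from the governance policy.
CLASSIFICATIONS = {"gxp-critical", "gxp-supporting", "non-gxp"}

# Hypothetical naming convention: <system>-<datatype>-<YYYYMMDD>-<seq>,
# e.g. "lims-stability-20240115-0042".
NAMING_PATTERN = re.compile(r"^[a-z]+-[a-z]+-\d{8}-\d{4}$")

@dataclass
class DataAsset:
    name: str
    classification: str
    steward: str            # a named owner, not "system admin by default"
    retention_years: int

def validate_asset(asset: DataAsset) -> list[str]:
    """Return governance findings for one data asset."""
    findings = []
    if asset.classification not in CLASSIFICATIONS:
        findings.append(f"{asset.name}: unknown classification '{asset.classification}'")
    if not NAMING_PATTERN.match(asset.name):
        findings.append(f"{asset.name}: violates naming convention")
    if not asset.steward:
        findings.append(f"{asset.name}: no data steward assigned")
    return findings

findings_ok = validate_asset(
    DataAsset("lims-stability-20240115-0042", "gxp-critical", "J. Doe", 10))
findings_bad = validate_asset(DataAsset("MyFile.xlsx", "unknown", "", 1))
```

Even a rule set this small replaces ad hoc local conventions with something reviewable, versionable, and enforceable at the point of data creation.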
4. Dependence on Tribal Knowledge and Informal Practices
Undocumented procedural knowledge remains the de facto standard operating procedure in many early-stage firms. This manifests in two predominant forms: "we've always done it this way" practices and the "this is how we did it at my last company" approach. Both patterns are equally problematic.
Procedures carried over from previous organizations—often from larger, later-stage companies or entirely different modalities—are frequently misaligned with the current organization’s scale, risk profile, or regulatory strategy. What was acceptable in a commercial-phase multinational with dedicated data stewardship teams becomes a critical vulnerability in a small preclinical company.
Single points of failure emerge when key individuals depart, taking with them the only comprehensive understanding of data provenance, transformation rules, or reconciliation processes—whether those processes were developed in-house or inherited from a prior employer.
5. Accumulation of Technical Debt
Expediency in early system selection and implementation frequently yields fragmented architectures: disconnected LIMS, QMS, eDMS, CTMS, and other systems held together by manual exports, vendor repositories, and duplicated master data. Each workaround creates technical debt that accrues compound interest in the form of reconciliation effort, error rates, foregone analytical insight, and difficulty scaling systems and processes later. The resulting inability to perform cross-batch trending or longitudinal clinical data analysis is both a regulatory risk and a strategic forfeiture of data-derived competitive advantage.
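The reconciliation burden that duplicated master data creates can be illustrated with a short sketch. It compares batch-status records exported from two hypothetical systems (a LIMS and a QMS); the system names, batch IDs, and statuses are invented for illustration, but the pattern, detecting records missing from one side and conflicting values on shared records, is exactly the manual work that fragmented architectures force onto staff.

```python
def reconcile(lims_batches: dict[str, str], qms_batches: dict[str, str]) -> dict:
    """Compare batch-status master data duplicated across two systems."""
    lims_ids, qms_ids = set(lims_batches), set(qms_batches)
    return {
        "missing_in_qms": sorted(lims_ids - qms_ids),
        "missing_in_lims": sorted(qms_ids - lims_ids),
        # Shared batches whose status disagrees between systems.
        "status_mismatch": sorted(
            b for b in lims_ids & qms_ids if lims_batches[b] != qms_batches[b]
        ),
    }

report = reconcile(
    {"B-001": "released", "B-002": "quarantined", "B-003": "released"},
    {"B-001": "released", "B-002": "released"},
)
# B-003 never reached the QMS, and B-002 carries conflicting statuses:
# precisely the discrepancies an integrated architecture prevents by design.
```

Every such script, and the investigation each discrepancy triggers, is interest paid on the original architectural debt.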
A Framework for Data Integrity by Design
Successful implementation requires deliberate integration of DI considerations across five domains:
- System Design: Develop concise, risk-based SRS that reflect actual business processes rather than generic templates. Do not leave this to end users alone; involve all stakeholders, including IT, Engineering, Procurement, Quality, Clinical, and Regulatory, among others. Limit documentation to requirements that can impact patient safety, product quality, or DI, each with clear acceptance criteria.
- Strict Supplier Selection and Oversight: Treat providers of GxP computerized systems as critical suppliers under ICH Q9(R1) and EU GMP Chapter 7. Conduct documented supplier assessments, maintain ongoing performance monitoring, and retain organizational ownership and understanding of critical configuration, data, and system security decisions even when their execution is outsourced. A symbiotic working relationship reflects continual improvement, integrity, and honesty on both sides.
- Embed Data Governance Early: Establish a fit-for-purpose data governance framework that is deliberately lightweight at inception yet explicitly designed to scale without re-architecture. The objective is not bureaucratic overhead but the systematic assignment of accountability for the completeness, consistency, and accuracy of critical data throughout its lifecycle. Its ultimate purpose is to deliver accurate, consistent, secure, and accessible data—the prerequisite for data integrity, reliable information, informed decision-making, and, ultimately, product quality and patient safety. The framework should address people and structure, processes and policies, technology, leading indicators, and metrics and accountability.
- Employee Empowerment and Knowledge Management: Institute lightweight but enforceable mechanisms for knowledge capture: version-controlled procedures, short-form video walkthroughs, and explicit succession planning for data-critical roles. Data literacy must be positioned across the organization as a core scientific competency rather than quality overhead. A culture of continual learning, combined with good digital hygiene practices, safeguards DI and enables risk management.
- Technical Architecture and Integration: Establish an integration backbone from the outset, even at modest scale, by first conducting a rapid but disciplined assessment of the available technology and data landscape. This assessment must explicitly identify and prioritize two dimensions: (1) the processes and data flows that pose the greatest risk to patient safety, product quality, or regulatory submission integrity; and (2) the datasets and use cases offering the highest potential return on investment from analytical insight. With these priorities defined, intentionally design and enforce standardization across three foundational pillars: data security, integration, and reliability.
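The two-dimensional prioritization in the final domain above can be sketched as a simple weighted scoring exercise. The candidate data flows, scores, and the choice to weight GxP risk twice as heavily as analytical ROI are all illustrative assumptions; the point is that ranking integration work against explicit, documented criteria beats ad hoc sequencing.

```python
def prioritize(flows: list[dict]) -> list[dict]:
    """Rank candidate integrations by a combined risk-and-ROI score."""
    for f in flows:
        # Illustrative weighting: patient-safety/product-quality risk
        # counts double relative to analytical return on investment.
        f["score"] = 2 * f["gxp_risk"] + f["analytic_roi"]
    return sorted(flows, key=lambda f: f["score"], reverse=True)

# Hypothetical candidate data flows, each scored 1-3 on both dimensions.
candidates = [
    {"flow": "LIMS -> stability trending", "gxp_risk": 3, "analytic_roi": 3},
    {"flow": "eDMS -> training records",   "gxp_risk": 1, "analytic_roi": 1},
    {"flow": "CTMS -> safety reporting",   "gxp_risk": 3, "analytic_roi": 2},
]
ranked = prioritize(candidates)
```

Under these assumed weights, the highest-risk, highest-insight flow rises to the top of the integration roadmap, giving the organization a defensible rationale for sequencing its backbone build-out.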
Emerging life science organizations that treat data integrity as an upstream design principle rather than a downstream "last-minute" fix consistently avoid the clinical holds, delayed submissions, and multi-million-dollar remediations that plague the industry. Data cease to be a liability and become the most powerful driver of scientific insight, operational excellence, and investor confidence. For emerging companies, deliberate design is the difference between surviving the next regulatory milestone and leading in their category.
About the Author:
Monika Andraos, ASQ CQE, leads Dunamis Compliance, which helps regulated organizations embed data integrity principles in their day-to-day operations by refining their data governance framework using current science-backed tools. Find her on LinkedIn.
References:
1. International Society for Pharmaceutical Engineering (ISPE). (2021). GAMP® Records and Data Integrity Good Practice Guide: Data Integrity by Design.
2. U.S. Food and Drug Administration (FDA). (2018). Data Integrity and Compliance with Drug CGMP: Questions and Answers. Guidance for Industry. U.S. Department of Health and Human Services, FDA, Center for Drug Evaluation and Research (CDER), Center for Biologics Evaluation and Research (CBER), and Center for Veterinary Medicine (CVM).
3. Medicines and Healthcare products Regulatory Agency (MHRA). (2018). MHRA GxP Data Integrity Guidance and Definitions.
4. Pharmaceutical Inspection Co-operation Scheme (PIC/S). (2021). PI 041-1 Good Practices for Data Management and Integrity in Regulated GMP/GDP Environments.
5. World Health Organization (WHO). (2021). Annex 4: Guideline on data integrity. WHO Technical Report Series, No. 1033.
6. International Council for Harmonisation (ICH). (2023). ICH Q9(R1) Quality Risk Management.
7. European Commission. (2013). EudraLex – Volume 4: Good Manufacturing Practice (GMP) Guidelines, Chapter 7: Outsourced Activities.
8. U.S. Food and Drug Administration (FDA). (2021). 21 CFR Part 11: Electronic Records; Electronic Signatures. Code of Federal Regulations, Title 21, Chapter I, Subchapter A, Part 11.
9. European Commission. (2011). EudraLex – Volume 4: Good Manufacturing Practice (GMP) Guidelines, Annex 11: Computerised Systems.