Data Transparency Autumn Event 2024 – Presentation Information


The PHUSE Data Transparency Autumn Event took place from 17–19 September 2024. During this virtual event, presentations were delivered across the three days in bitesize chunks. Each day also hosted a panel discussion and Q&A session focused on the day's themes.

Presentations and event recordings will be available soon. 

Day 1 – 17 September
Day 1 Recording
Presentation Title | Speaker(s) | Abstract

Event-Based Access Control – Game Changer for Autonomous Security and Transparent Data Sharing

Amit Gautam, Abluva

In today’s interconnected world, the ability to share data securely has emerged as a key to organisational success. However, as data volumes soar and cyber threats proliferate, the task of safeguarding sensitive information becomes daunting. Traditional access control methods, characterised by rigid permissions and manual oversight, are ill equipped to meet the demands. Enter Purpose and Event-Based Access Control (PEBAC) – an approach poised to redefine the way organisations manage data access. At its core, PEBAC represents a departure from conventional access control paradigms, offering a dynamic, context-aware solution to the challenges of data sharing and breach prevention. By tethering access privileges to specific purposes and real-world events, PEBAC empowers organisations to enforce granular controls over their data, enhancing transparency and accountability in the process.

In this presentation, we will explore how PEBAC’s unique blend of purpose-driven access and event-triggered policies can revolutionise data security, paving the way for a future where organisations can share data with confidence, knowing their information remains protected from unauthorised access and breaches.
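The idea of tying access to a purpose and a triggering event can be illustrated with a minimal sketch. This is not Abluva's implementation; the `Grant` and `PolicyEngine` classes, the event name, and the 24-hour window are all hypothetical and chosen only to show the principle.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Grant:
    """Hypothetical access grant bound to one purpose and one triggering event."""
    user: str
    dataset: str
    purpose: str
    event: str            # real-world event that opened the grant
    expires_at: datetime  # the grant lapses when the event window closes


@dataclass
class PolicyEngine:
    """Toy policy engine: access exists only while an event-based grant is live."""
    grants: list = field(default_factory=list)

    def on_event(self, event, user, dataset, purpose, now, window_hours=24):
        # An event (e.g. an approved data request) opens a time-boxed grant.
        grant = Grant(user, dataset, purpose, event,
                      now + timedelta(hours=window_hours))
        self.grants.append(grant)
        return grant

    def is_allowed(self, user, dataset, purpose, now):
        # Access requires a matching user, dataset and purpose, and an unexpired grant.
        return any(
            g.user == user
            and g.dataset == dataset
            and g.purpose == purpose
            and now < g.expires_at
            for g in self.grants
        )


engine = PolicyEngine()
t0 = datetime(2024, 9, 17, 9, 0)
engine.on_event("data_request_approved", "researcher_a", "trial_123",
                "safety_analysis", t0)

print(engine.is_allowed("researcher_a", "trial_123", "safety_analysis",
                        t0 + timedelta(hours=1)))
print(engine.is_allowed("researcher_a", "trial_123", "marketing",
                        t0 + timedelta(hours=1)))
```

In this sketch the same user is refused as soon as the stated purpose differs or the event window has passed, which is the contrast with static, role-only permissions that the abstract draws.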

Rare Disease Data – Overcoming Barriers to Controlled Access Sharing

Helen Spotswood, Roche & Karolina Stępniak, AstraZeneca

Rare disease data sharing to qualified researchers on controlled data access platforms brings hope for accelerated identification of molecular mechanisms/traits underpinning rare diseases and the development of new therapies. Rare disease clinical trial data may be more difficult to anonymise while maintaining utility compared to other clinical trial data and thus is not routinely shared. Considering the term ‘rare diseases’ covers a spectrum regarding prevalence, sensitivity of the data associated with phenotypic manifestations, stigma, and complexity of clinical data sharing, it is difficult to develop a one-size-fits-all procedure for sharing rare disease data.

The PHUSE Rare Disease/Small Population Data Sharing Working Group project has been working on a white paper to review potential barriers to rare disease data sharing, e.g. risk of re-identification and invasion of privacy, and to provide recommendations to encourage the sharing of rare disease data with the research community.

This presentation will provide an update on this initiative and the progress made up to August 2024.

Optimising the Utility of Anonymised Data: A Clinical Case Study

Lisa Pilgram, University of Ottawa

Data sharing has been mandated by several regulatory agencies and is desirable for many reasons (e.g. transparency, reproducibility, collaboration and innovation). At the same time, it comes with relevant privacy considerations. Anonymisation is one approach to address these concerns. To anonymise data, the data is manipulated in a way that it can no longer be related to a person. Manipulation can, however, result in relevant utility constraints. This is typically referred to as the privacy-utility trade-off. Consequently, one main concern when using anonymised data is the reproducibility of scientific results.

Our research investigated this concern in a clinical case study, addressing three main questions:

Can we reproduce scientific results in health research with anonymised data?

How relevant is use case-specific (and potentially more costly) anonymisation for reproducibility?

Do broad utility metrics reflect reproducibility?

Using data and scientific results from the German Chronic Kidney Disease study, we compared two anonymisation configurations: a generic scenario that aims to support multiple likely analytics use cases and a use case-specific scenario that was tailored to the scientific research question.

Our findings show that anonymisation can preserve data utility for downstream analyses, but the choice of anonymisation strategy impacts on reproducibility outcomes. While use case-specific anonymisation may incur additional costs in terms of domain expert knowledge and time, it can improve utility. The utility in downstream analysis is, however, not necessarily reflected in broad metrics.

Based on these findings, utility concerns should be as much a part of discussion as privacy concerns. It is recommended that data controllers account for the downstream analysis whenever feasible to maximise utility in anonymised data. In addition, generic utility metrics should be interpreted with caution as they may under- or overestimate actual utility.
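The difference between a generic and a use case-specific configuration can be illustrated with a toy example. This is not the method or data used in the study; the ages, band edges, and the "share aged 45 or over" analysis are invented for illustration, and age banding stands in for anonymisation in general.

```python
from bisect import bisect_right
from collections import Counter


def generalise(ages, edges):
    """Replace each age with the half-open band [lo, hi) it falls into.

    Assumes every age lies within [edges[0], edges[-1]).
    """
    bands = []
    for age in ages:
        i = bisect_right(edges, age) - 1
        bands.append((edges[i], edges[i + 1]))
    return bands


def smallest_class(bands):
    """Size of the smallest group of records sharing the same band."""
    return min(Counter(bands).values())


def share_at_least(bands, cutoff):
    """Lower and upper bounds on the share aged >= cutoff, given only bands."""
    certain = sum(lo >= cutoff for lo, hi in bands)
    possible = sum(hi > cutoff for lo, hi in bands)
    return certain / len(bands), possible / len(bands)


ages = [31, 38, 42, 45, 53, 57, 61, 69]        # invented data
true_share = sum(a >= 45 for a in ages) / len(ages)

generic = generalise(ages, [30, 40, 50, 60, 70])   # ten-year bands
tailored = generalise(ages, [30, 45, 70])          # band edge placed at the cutoff

print(true_share)
print(smallest_class(generic), share_at_least(generic, 45))
print(smallest_class(tailored), share_at_least(tailored, 45))
```

With these invented numbers, the ten-year bands can only bound the result, while the tailored bands reproduce it exactly and also yield larger groups. A broad metric such as band width would rank the tailored configuration as coarser, which is one way a generic utility measure can misstate utility for a specific analysis.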

Diversity in Clinical Trial Data Transparency: A New Horizon for Data Sharing with Truthful Statistics

Luk Arbuckle, Privacy Analytics & Stephen Bamford, Johnson & Johnson

As clinical trials increasingly prioritise diverse participant representation, the complexities of data anonymisation and transparency increase. This presentation will delve into the transformative shift from raw clinical data sharing to the innovative practice of statistical sharing, a method designed to uphold data quality and transparency and mitigate biases.

The transition to producing accurate and reliable statistics, while ensuring privacy and data utility, emerges as a robust solution to the challenges posed by managing diverse anonymised data. This approach prioritises the generation of truthful statistics which adhere to the principles of data anonymisation without compromising the integrity of the research. Key strategies include using synthetic datasets, aggregated counts, and advanced analytics, including artificial intelligence and machine learning (AI/ML). Each technique offers distinct advantages and challenges, balancing the need for insightful research and ethical data handling. These methods collectively create a pathway for transforming protected clinical trial data into secure and practical statistics, enabling researchers to derive meaningful insights without exposing raw data.

Join us as we explore the opportunities and challenges in clinical trial data transparency, emphasising the importance of diversity, accuracy, and ethical data practices. This session will provide a comprehensive overview of how statistical sharing can transform data management in clinical trials, to ensure robust research while safeguarding participant privacy.
Day 2 – 18 September
Day 2 Recording
Presentation Title | Speaker(s) | Abstract

Disclosure Without Exposure – Upstream Strategies for Enabling AI to Effectively and Efficiently Protect Clinical Data

Honz Slipka, Certara

For most new drugs coming to market, the clinical trials leading up to marketing authorisation must be publicly disclosed. As a result of this public disclosure, clinical trial data must be anonymised to protect both personal and commercially sensitive data. While this is a small part of the end-to-end process of getting drugs to patients, clinical data protection is a critically important, labour-intensive process that often creates bottlenecks. Pharmaceutical companies are turning towards AI-based software that enables automated identification of protected personal data (PPD) or commercially confidential information (CCI). The evolution in the way data is identified and anonymised has forced other positive shifts towards lean authoring, terminology harmonisation, and establishing sensitive data libraries.

This presentation will cover some of the major changes that can be implemented in clinical research to enable more efficient clinical data protection using emerging AI tools.

Harnessing the Power of Artificial Intelligence (AI) in Accelerating Production of Clinical Documentation: A Case Study with Plain Language Summaries (PLS)

Kathi Künnemann, Staburo

Plain language summaries (PLS) are required to accompany the summary of clinical trial results submissions according to the European Union Clinical Trials Regulation 536/2014 Annex V. AI tools are fast evolving and play an increasingly important role in many fields, including healthcare and medicine.

Our aim is to find out whether AI can generate text of the same quality as that written by a medical writer (MW), especially regarding correct interpretation of study results and the requirements of lay language.

We created PLS with an AI tool to:
• Find the best balance in terms of PLS quality and MW working time between AI-created, MW-created, and AI-created & MW-reviewed PLS.
• Compare the comprehensibility of AI- vs MW-created PLS in a group of lay persons (using a questionnaire).

We also want to compare AI tools and discuss aspects of data protection when creating PLS.

De-Identification of Medical Data for AI Training, Speeding Up Clinical Trials, and Enabling Healthcare Research

Patricia Thaine, Private AI

The rapid advancements in artificial intelligence (AI) have the potential to revolutionise healthcare, from accelerating clinical trials to enhancing the accuracy of medical research. However, the integration of AI in healthcare is impeded by concerns over patient privacy and data security. In this talk, we will explore the critical role of de-identification in medical data to mitigate these challenges while ensuring compliance with privacy regulations.

We will delve into advanced de-identification techniques that preserve the utility of data for AI training, enabling the development of robust models without compromising patient confidentiality. By facilitating secure and privacy-conscious data sharing, these techniques not only expedite clinical trials but also unlock new avenues for healthcare research, leading to more personalised and effective treatments. Attendees will gain insights into how de-identification can bridge the gap between privacy and innovation, driving the future of AI in healthcare.

Leveraging Agentic AI Networks to Automate Transparency Activities

Woo Song, Xogene

This presentation will delve into the application of AI agents in solving mundane tasks currently performed by humans in clinical trial transparency. We will explore how AI can be leveraged to generate plain language protocols, summaries, and ICFs, thereby reducing the workload of medical writers. Additionally, we will discuss the potential of AI in automating disclosure activities, such as registration, results posting, project management, and reporting.

Furthermore, we will examine how AI agents not only enhance human productivity but also create virtual medical writers and disclosure analysts to augment human resources. By training AI models on vast amounts of clinical trial data and disclosure requirements, we can develop intelligent systems capable of assisting and even replacing human experts in certain tasks.

The presentation will include interactive elements, such as live demonstrations of AI-generated content and audience participation in identifying potential use cases for AI in their respective organisations.

Attendees will gain valuable insights into the transformative potential of AI in clinical trial transparency and learn how to harness its power to streamline processes, reduce costs and improve overall efficiency.
Day 3 – 19 September
Day 3 Recording
Presentation Title | Speaker(s) | Abstract
Privacy Methodology Implementation: Sharing Experience from the Field

Véronique Poinsot, Sanofi

Adoption of the TransCelerate Privacy Methodology in a scalable mode requires decisions, development, practices, and communication with stakeholders.

This presentation proposes to share this experience from the field, from an operational angle, considering the different aspects of the journey of implementing the Privacy Methodology.

True Re-Identification Risk Threshold – Aggregated-Level Clinical Information Impact on Anonymisation Methods Applied to Individual Patient-Level Data (IPLD)

Agnieszka Głowińska & Łukasz Szyszka, AstraZeneca

Aggregated-level patient data is generally considered anonymous. Aggregated data is no longer subject to anonymisation, as reflected in the latest Anonymization Report template and in the Manufacturer PRCI Deck. However, fully retained aggregated-level data may disrupt anonymisation methods applied to individual patient-level data and affect risk simulation measurements, leading to altered risk values that may even exceed the 0.09 threshold. This is particularly evident in the case of summary demographic tables and tables that contain S/AEs subgrouped by demographic identifiers. These tables, if not properly analysed and secured, can do more than disrupt the anonymisation methods applied to demographic identifiers: they can allow a full patient demography profile to be built, accompanied by visible, identifying, and/or stigmatising medical information. Aggregated-level data also has enormous influence on unmasking re-identifying S/AEs, rendering the redaction or generalisation methods ineffective.

This presentation will focus on the linkage between summary tables and narratives and its significant impact on anonymisation methods applied to IPLD. Its main objective is to raise awareness of the need to analyse full CSR data, to avoid distorting the calculated risk of participant re-identification. Awareness that personal information can be derived from tabular data is still rather low, despite the number of assessments of re-identification risk and the anonymisation of tabular data. Many publications also demonstrate that data reported in the patient listings can be reconciled with data in the narratives, and that linkage data (i.e. treatment assignment, coded term, and timing of event) from the listings and narratives can be used to map both formats to the data in summary tables. Appropriate anonymisation protects participant privacy by evaluating additional available information that could unmask or reveal PPD through linked parameters and residual context factors.
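The effect of a retained summary table on measured risk can be illustrated with a simplified model. This is not the authors' risk simulation; it assumes risk is taken as 1 divided by the size of the smallest group of records sharing the same quasi-identifiers, and the records and the `subgroup` attribute are invented. Only the 0.09 threshold comes from the abstract.

```python
from collections import Counter

THRESHOLD = 0.09  # threshold value quoted in the abstract


def max_risk(records, keys):
    """Highest per-record risk, modelled as 1 / equivalence-class size."""
    classes = Counter(tuple(r[k] for k in keys) for r in records)
    return 1 / min(classes.values())


# Invented anonymised listing: two groups of 12 on (age_band, sex).
records = (
    [{"age_band": "40-49", "sex": "F", "subgroup": "A"}] * 9
    + [{"age_band": "40-49", "sex": "F", "subgroup": "B"}] * 3
    + [{"age_band": "50-59", "sex": "M", "subgroup": "A"}] * 12
)

# Risk as measured on the listing's own quasi-identifiers.
listing_only = max_risk(records, ["age_band", "sex"])

# Risk once a retained summary table lets a reader recover the subgroup too.
with_table = max_risk(records, ["age_band", "sex", "subgroup"])

print(listing_only <= THRESHOLD)
print(with_table <= THRESHOLD)
```

Under these assumptions the listing alone stays within the threshold (groups of 12), but linking in the subgroup attribute splits one group into 9 and 3, so the smallest group drives the risk well above 0.09.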

Writing Better CSRs to Facilitate Anonymisation for Clinical Data Publication

Cathal Gallagher & Laura Dodd, Instem

Transparency writing shares several goals with lean writing: including only text that is important to the study and keeping details that belong in the safety narratives out of the CSR body. Less text allows the key messages to be emphasised in the document, which is ideal both for regulatory reviewers and for those reading these documents online as a result of clinical data publication requirements. Better ways to write subject identifiers in text and tables will be discussed, with an emphasis on the source of proper terms to use in text.

A Data Transparency Odyssey – A Decade of Data Sharing

Alex Hughes, Roche & Brent Caldwell, Novartis

Over the past decade, the commitment to data transparency by various sponsors has not only transformed individual organisations but has also fostered a broader culture of data sharing and collaboration across the industry. The routine sharing of data and documents from clinical trials for secondary research, once considered improbable, has now become standard practice across the industry.

From ad hoc data requests to large multi-sponsor enquiries in a shifting regulatory landscape, this presentation will reflect on the industry’s journey in the data sharing space. We will discuss the challenges encountered along the way, addressing the complexities and obstacles that accompanied the data sharing journey leading up to the first data sharing packages.

Looking ahead, we will examine the current data transparency landscape and consider its implications for the future of data sharing in the industry.

Data Transparency Autumn Event Sponsors 

Virtual Event Sponsors

Sponsor Flyers 

                                                                  

Sponsorship 

Hosting the Data Transparency Event digitally means that no matter where you are in the world you can participate. It provides the industry with a broader opportunity to share knowledge on a global scale, connecting through the virtual event platform. The sponsor options offer a range of benefits with ample company exposure. See the prospectus for more detail. 

Data Transparency Working Group Leads