Research
Identifying and Mitigating Potential Biases in Predicting Drug Approvals
Xu, Qingyang, Elaheh Ahmadi, Alexander Amini, Daniela Rus, and Andrew W. Lo, Identifying and Mitigating Potential Biases in Predicting Drug Approvals, Drug Safety 45, 521–533.
Introduction
Machine learning models are increasingly applied to predict the drug development outcomes based on intermediary clinical trial results. A key challenge to this task is to address various forms of bias in the historical drug approval data.
Objective
We aimed to identify and mitigate the bias in drug approval predictions and quantify the impacts of debiasing in terms of financial value and drug safety.
Methods
We instantiated the Debiasing Variational Autoencoder, the state-of-the-art model for automated debiasing. We trained and evaluated the model on the Citeline dataset provided by Informa Pharma Intelligence to predict the final drug development outcome from phase II trial results.
Results
The debiased Debiasing Variational Autoencoder model achieved better performance (measured by an F1 score of 0.48) in predicting the drug development outcomes than its un-debiased baseline (F1 score of 0.25). It had a much higher true-positive rate than baseline (60% vs 15%), while its true-negative rate was slightly lower (88% vs 99%). The Debiasing Variational Autoencoder distinguished between drugs developed by large pharmaceutical firms and those by small biotech companies. The model prediction is strongly influenced by multiple factors such as prior approval of the drug for another indication, whether the trial meets the positive/negative endpoints, and the year when the trial is completed. We estimate that the debiased model generates financial value for the drug developer in six major therapeutic areas, with a range of US$763–1,365 million.
Conclusions
Our analysis shows that debiasing improves the financial efficiency of late-stage drug development. From the pharmacovigilance perspective, the debiased model is more likely to identify drugs that are both safe and effective. Meanwhile, it may predict a higher probability of success for drugs with potential adverse effects (because of its lower true-negative rate), thus it must be used with caution to predict the development outcomes of drug candidates currently in the pipeline.
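The abstract above reports true-positive and true-negative rates alongside F1 scores. As a minimal sketch of how these quantities relate, the function below computes the F1 score implied by a true-positive rate, a true-negative rate, and the share of positive cases. The function name and the prevalence value are hypothetical and are not taken from the paper, which does not state the share of approved drugs in its abstract.

```python
def f1_from_rates(tpr: float, tnr: float, prevalence: float) -> float:
    """F1 score implied by TPR, TNR, and the share of positives (prevalence)."""
    tp = prevalence * tpr                  # true positives, as a share of all cases
    fn = prevalence * (1.0 - tpr)          # positives the model missed
    fp = (1.0 - prevalence) * (1.0 - tnr)  # negatives predicted as positives
    denom = 2.0 * tp + fp + fn
    return 2.0 * tp / denom if denom > 0 else 0.0


# Rates from the abstract; the prevalence of 0.5 is an assumption
# chosen only for illustration, not a figure from the study.
example = f1_from_rates(tpr=0.60, tnr=0.88, prevalence=0.5)
```

Because F1 depends on the class balance, the same pair of rates yields different F1 scores at different prevalence levels, so the illustrative value here should not be compared with the reported 0.48.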
Lo, Andrew W., and Richard T. Thakor (2022), Financing Biomedical Innovation, Annual Review of Financial Economics 14, 231-270.
We review the recent literature on financing biomedical innovation, with a specific focus on the drug development process and how it may be enhanced to improve outcomes. We begin by laying out stylized facts about the structure of the drug development process and its associated costs and risks, and we present evidence that the rate of discovery for life-saving treatments has declined over time while costs have increased. We make the argument that these structural features require drug development (i.e., biopharmaceutical) firms to rely on external financing and at the same time amplify market frictions that may hinder the ability of these firms to obtain financing, especially for treatments that may have large societal value relative to the benefits going to the firms and their investors. We then provide an overview of the evidence for various types of market frictions to which these drug development firms are exposed and discuss how these frictions affect their incentive to invest in the development of new drugs, leading to underinvestment in valuable treatments. In light of this evidence, numerous studies have proposed ways to overcome this funding gap, including the use of financial innovation. We discuss the potential of these approaches to improve outcomes.
Estimates of Probabilities of Successful Development of Pain Medications: An Analysis of Pharmaceutical Clinical Development Programs from 2000 to 2020
Maher, Dermot P., Chi Heem Wong, Kien Wei Siah, and Andrew W. Lo (2022), Estimates of Probabilities of Successful Development of Pain Medications: An Analysis of Pharmaceutical Clinical Development Programs from 2000 to 2020, Anesthesiology 137, 243-251.
Background: The authors estimate the probability of successful development and duration of clinical trials for medications to treat neuropathic and nociceptive pain. The authors also consider the effect of the perceived abuse potential of the medication on these variables.
Methods: This study uses the Citeline database to compute the probabilities of success, duration, and survivorship of pain medication development programs between January 1, 2000, and June 30, 2020, conditioned on the phase, type of pain (nociceptive vs. neuropathic), and the abuse potential of the medication.
Results: The overall probability of successful development of all pain medications from phase 1 to approval is 10.4% (standard error, 1.5%). Medications to treat nociceptive and neuropathic pain have a probability of successful development of 13.3% (standard error, 2.3%) and 7.1% (standard error, 1.9%), respectively. The probabilities of successful development of medications with high abuse potential and low abuse potential are 27.8% (standard error, 4.6%) and 4.7% (standard error, 1.2%), respectively. The most common period for attrition is between phase 3 and approval.
Conclusions: The authors’ data suggest that the unique attributes of pain medications, such as their abuse potential and intended pathology, can influence the probability of successful development and duration of development.
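An overall probability of success from phase 1 to approval is commonly thought of as the product of the phase-to-phase transition probabilities. The sketch below illustrates only that multiplication; it is not the estimation procedure used in the study, and the function name and the transition probabilities are hypothetical values chosen for illustration.

```python
from math import prod


def overall_pos(transition_probs):
    """Multiply phase-transition probabilities into an overall probability of success."""
    for p in transition_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return prod(transition_probs)


# Hypothetical transition probabilities (not from the study):
# phase 1 -> phase 2, phase 2 -> phase 3, phase 3 -> approval.
hypothetical = [0.5, 0.4, 0.5]
pos = overall_pos(hypothetical)
```

Under this simplification, a low probability at any single transition caps the overall figure, which is why identifying the phase with the most attrition matters.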
Predicting drug approvals: The Novartis data science and artificial intelligence challenge
Siah, Kien Wei, Nicholas W. Kelley, Steffen Ballerstedt, Bjorn Holzhauer, Tianmeng Lyu, David Mettler, Sophie Sun, Simon Wandel, Yang Zhong, Bin Zhou, Shifeng Pan, Yingyao Zhou, and Andrew W. Lo (2021), Predicting drug approvals: The Novartis data science and artificial intelligence challenge, Patterns 2 (8), 1-9.
We describe a novel collaboration between academia and industry, an in-house data science and artificial intelligence challenge held by Novartis to develop machine-learning models for predicting drug-development outcomes, building upon research at MIT using data from Informa as the starting point. With over 50 cross-functional teams from 25 Novartis offices around the world participating in the challenge, the domain expertise of these Novartis researchers was leveraged to create predictive models with greater sophistication. Ultimately, two winning teams developed models that outperformed the baseline MIT model (areas under the curve of 0.88 and 0.84 versus 0.78, respectively) through state-of-the-art machine-learning algorithms and the use of newly incorporated features and data. In addition to validating the variables shown to be associated with drug approval in the earlier MIT study, the challenge also provided new insights into the drivers of drug-development success and failure.
Estimating clinical trial success rates and related parameters in oncology
Wong, Chi Heem, Kien Wei Siah, and Andrew W. Lo (2019), Estimating Clinical Trial Success Rates and Related Parameters in Oncology, Biostatistics 20 (1), 273-286.
We extend earlier large-scale studies of clinical trial statistics by focusing on the performance of oncology trials. Using 108,248 data points between January 1, 2005, and September 31, 2018, compiled from the Citeline database, we investigate the duration of clinical trials and compute the probabilities of success of 24,448 oncology drug development programs by disease group. While the overall phase 1 to approval rate for all oncology-related drug development programs is 3.3%, individual disease groups have approval rates ranging from 0% to 10.1%. Similar patterns can be seen for oncology orphan drug development programs, where the overall probability of success ranges from 0% to 8.3%, with an overall average of 1.9%. We find overwhelming evidence that using biomarkers for patient selection is effective in almost all disease groups within oncology, raising the overall probability of success by an average of 13.3%.
Lo, Andrew W., Ruixun Zhang, and Chaoyi Zhao (2022), Measuring and Optimizing the Risk and Reward of Green Portfolios, The Journal of Impact and ESG Investing 3 (2), 55-99.
We study the performance of green portfolios in both the US and Chinese markets, constructed using a broad range of climate-related environmental metrics, including carbon emissions, water consumption, waste disposal, land and water pollutants, air pollutants, and natural resource use. We compare several popular long-only and long–short green portfolio construction methodologies and find that a method based on Treynor–Black weights offers the most robust performance, thanks to its ability to quantify alphas for individual assets using only a small number of parameters. In the United States, green portfolios (e.g., low-carbon portfolios) have realized positive alphas in excess of Fama–French factors, a significant portion of which can be explained by an unexpected increase in climate concerns over the past decade, rather than positive expected returns. In contrast, Chinese investors have borne a cost for holding green assets instead of brown assets over the past seven years, implying a positive carbon premium, the opposite of US markets.
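In the textbook Treynor–Black construction, each asset's weight in the active portfolio is proportional to its alpha divided by its residual variance. The sketch below shows that textbook rule only; the paper's own construction may differ in its details, and the function name and input values are hypothetical.

```python
def treynor_black_weights(alphas, residual_vars):
    """Active-portfolio weights proportional to alpha / residual variance."""
    if len(alphas) != len(residual_vars):
        raise ValueError("alphas and residual_vars must have the same length")
    raw = [a / v for a, v in zip(alphas, residual_vars)]
    total = sum(raw)
    if total == 0:
        raise ValueError("alpha-to-variance ratios sum to zero; cannot normalize")
    # Normalize so that the weights sum to one.
    return [r / total for r in raw]


# Hypothetical alphas and residual variances for two assets.
weights = treynor_black_weights(alphas=[0.02, 0.01], residual_vars=[0.04, 0.01])
```

In this example the second asset receives the larger weight despite its smaller alpha, because its residual variance is much lower.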
The Wisdom of Crowds Versus the Madness of Mobs: An Evolutionary Model of Bias, Polarization, and Other Challenges to Collective Intelligence
Lo, Andrew W., and Ruixun Zhang (2022), The Wisdom of Crowds Versus the Madness of Mobs: An Evolutionary Model of Bias, Polarization, and Other Challenges to Collective Intelligence, Collective Intelligence 1(1). https://doi.org/10.1177/26339137221104785.
Despite its success in financial markets and other domains, collective intelligence seems to fall short in many critical contexts, including infrequent but repeated financial crises, political polarization and deadlock, and various forms of bias and discrimination. We propose an evolutionary framework that provides fundamental insights into the role of heterogeneity and feedback loops in contributing to failures of collective intelligence. The framework is based on a binary choice model of behavior that affects fitness; hence, behavior is shaped by evolutionary dynamics and stochastic changes in environmental conditions. We derive collective intelligence as an emergent property of evolution in this framework, and also specify conditions under which it fails. We find that political polarization emerges in stochastic environments with reproductive risks that are correlated across individuals. Bias and discrimination emerge when individuals incorrectly attribute random adverse events to observable features that may have nothing to do with those events. In addition, path dependence and negative feedback in evolution may lead to even stronger biases and levels of discrimination, which are locally evolutionarily stable strategies. These results suggest potential policy interventions to prevent such failures by nudging the “madness of mobs” towards the “wisdom of crowds” through targeted shifts in the environment.
The reaction of sponsor stock prices to clinical trial outcomes: An event study analysis
Singh, Manish, Roland Rocafort, Cathy Cai, Kien Wei Siah, and Andrew W. Lo (2022), The reaction of sponsor stock prices to clinical trial outcomes: An event study analysis, PLoS ONE 17(9).
We perform an event study analysis that quantifies the market reaction to clinical trial result announcements for 13,807 trials from 2000 to 2020, one of the largest event studies of clinical trials to date. We first determine the specific dates in the clinical trial process on which the greatest impact on the stock prices of their sponsor companies occurs. We then analyze the relationship between the abnormal returns observed on these dates due to the clinical trial outcome and the properties of the trial, such as its phase, target accrual, design category, and disease and sponsor company type (biotechnology or pharmaceutical). We find that the classification of a company as “early biotechnology” or “big pharmaceutical” had the most impact on abnormal returns, followed by properties such as disease, outcome, the phase of the clinical trial, and target accrual. We also find that these properties and classifications by themselves were insufficient to explain the variation in excess returns observed due to clinical trial outcomes.
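The abstract does not say which benchmark the authors use to define abnormal returns. A common choice in event studies is the market model, in which a stock's expected return is a linear function of the market return estimated over a pre-event window. The sketch below assumes that model; the function name and all return series are hypothetical.

```python
def market_model_abnormal_returns(stock_est, market_est, stock_evt, market_evt):
    """Abnormal returns: actual return minus the market-model prediction.

    alpha and beta are fitted by ordinary least squares on the estimation
    window, then applied to the event window.
    """
    n = len(stock_est)
    mean_m = sum(market_est) / n
    mean_s = sum(stock_est) / n
    cov = sum((m - mean_m) * (s - mean_s) for m, s in zip(market_est, stock_est))
    var = sum((m - mean_m) ** 2 for m in market_est)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return [s - (alpha + beta * m) for s, m in zip(stock_evt, market_evt)]


# Hypothetical daily returns: the stock moves exactly 2x the market plus 0.001.
market_window = [0.01, -0.02, 0.005, 0.015, -0.01]
stock_window = [2.0 * m + 0.001 for m in market_window]

# Hypothetical announcement day: market up 1%, stock up 7.1%.
abnormal = market_model_abnormal_returns(
    stock_window, market_window, stock_evt=[0.071], market_evt=[0.01]
)
```

Summing abnormal returns over several days around the announcement gives a cumulative abnormal return, which is the usual quantity compared across events.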
An Artificial Intelligence-Based Industry Peer Grouping System
Bonne, George, Andrew W. Lo, Abilash Prabhakaran, Kien Wei Siah, Manish Singh, Xinxin Wang, Peter Zangari, and Howard Zhang (2022), An Artificial Intelligence-Based Industry Peer Grouping System, The Journal of Financial Data Science 4 (2), 1-3.
In this article, the authors develop a data-driven peer grouping system using artificial intelligence (AI) tools to capture market perception and, in turn, group companies into clusters at various levels of granularity. In addition, they develop a continuous measure of similarity between companies; they use this measure to group companies into clusters and construct hedged portfolios. In the peer groupings, companies grouped in the same clusters had strong homogeneous risk and return profiles, whereas different clusters of companies had diverse, varying risk exposures. The authors extensively evaluated the clusters and found that companies grouped by their method had higher out-of-sample return correlation but lower stability and interpretability than companies grouped by a standard industry classification system. The authors also develop an interactive visualization system for identifying AI-based clusters and similar companies.