Publications
Deep-Learning Models for Forecasting Financial Risk Premia and Their Interpretations
2023
The measurement of financial risk premia, the amount by which a risky asset will outperform a risk-free one, is an important problem in asset pricing. The noisiness and non-stationarity of asset returns make the estimation of risk premia using machine learning (ML) techniques challenging. In this work, we develop ML models that solve the problems associated with risk premia forecasting by separating risk premia prediction into two independent tasks, a time series model and a cross-sectional model, and by using neural networks with skip connections to enable the training of deep networks. These models are tested robustly with different metrics, and we observe that our models outperform several existing standard ML models. A known issue with ML models is their ‘black box’ nature, i.e., their resistance to interpretation. We interpret these deep neural networks using local approximation-based techniques that provide explanations for our model’s predictions.
From ELIZA to ChatGPT: The Evolution of NLP and Financial Applications
2023
Natural language processing (NLP) has revolutionized the financial industry, providing advanced techniques for the processing, analyzing, and understanding of unstructured financial text. The authors provide a comprehensive overview of the historical development of NLP, starting from early rules-based approaches to recent advances in deep learning–based NLP models. They also discuss applications of NLP in finance along with its challenges, including data scarcity and adversarial examples, and speculate about the future of NLP in the financial industry. To illustrate the capability of current NLP models, a state-of-the-art chatbot is employed as a co-author of this article.
Leveraging Patient Preference Information in Medical Device Clinical Trial Design
2023
Use of robust, quantitative tools to measure patient perspectives within product development and regulatory review processes offers the opportunity for medical device researchers, regulators, and other stakeholders to evaluate what matters most to patients and support the development of products that can best meet patient needs. The Medical Device Innovation Consortium (MDIC) undertook a series of projects, including multiple case studies and expert consultations, to identify approaches for utilizing patient preference information (PPI) to inform clinical trial design in the US regulatory context. Based on these activities, this paper offers a cogent review of considerations and opportunities for researchers seeking to leverage PPI within their clinical trial development programs and highlights future directions to enhance this field. This paper also discusses various approaches for maximizing stakeholder engagement in the process of incorporating PPI into the study design, including identifying novel endpoints and statistical considerations, crosswalking between attributes and endpoints, and applying findings to the population under study. These strategies can help researchers ensure that clinical trials are designed to generate evidence that is useful to decision makers and captures what matters most to patients.
Incorporating patient preferences and burden-of-disease in evaluating ALS drug candidate AMX0035: a Bayesian decision analysis perspective
2023
OBJECTIVE: Provide US FDA and amyotrophic lateral sclerosis (ALS) society with a systematic, transparent, and quantitative framework to evaluate the efficacy of the ALS therapeutic candidate AMX0035 in its phase 2 trial, which showed statistically significant effects (p-value of 3%) in slowing the rate of ALS progression on a relatively small sample size of 137 patients.
METHODS: We apply Bayesian decision analysis (BDA) to determine the optimal type I error rate (p-value threshold) under which the clinical evidence of AMX0035 supports FDA approval. Using rigorous estimates of ALS disease burden, our BDA framework strikes the optimal balance between FDA’s need to limit adverse effects (type I error) and patients’ need for expedited access to a potentially effective therapy (type II error). We apply BDA to evaluate long-term patient survival based on clinical evidence from AMX0035 and Riluzole.
RESULTS: The BDA-optimal type I error for approving AMX0035 is higher than the 3% p-value reported in the phase 2 trial if the probability of the therapy being effective is at least 30%. Assuming a 50% probability of efficacy and a signal-to-noise ratio of treatment effect between 25% and 50% (benchmark: 33%), the optimal type I error rate ranges from 2.6% to 26.3% (benchmark: 15.4%). The BDA-optimal type I error rate is robust to perturbations in most assumptions except for a probability of efficacy below 5%.
CONCLUSION: BDA provides a useful framework to incorporate subjective perspectives of ALS patients and objective burden-of-disease metrics to evaluate the therapeutic effects of AMX0035 in its phase 2 trial.
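The trade-off described in this abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes a one-sided z-test, a prior probability `p_effective` that the therapy works, hypothetical relative costs `cost_type1` and `cost_type2` for false approvals and false rejections, and `snr` defined here as the expected z-statistic when the therapy is effective (which may differ from the signal-to-noise ratio used in the abstract). Under those assumptions, the threshold that minimizes expected loss has a closed form.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def bda_optimal_alpha(p_effective, cost_type1, cost_type2, snr):
    """Return (critical z-value, type I error rate) minimizing expected loss.

    Expected loss for a one-sided z-test with critical value z:
        (1 - p) * cost_type1 * alpha(z) + p * cost_type2 * beta(z)
    with alpha(z) = 1 - Phi(z) and beta(z) = Phi(z - snr).
    All inputs are illustrative assumptions, not values from the paper.
    """
    if not 0.0 < p_effective < 1.0:
        raise ValueError("p_effective must lie strictly between 0 and 1")
    if snr <= 0 or cost_type1 <= 0 or cost_type2 <= 0:
        raise ValueError("snr and both costs must be positive")

    # Setting the derivative of the expected loss to zero gives
    # z* = snr / 2 + ln(((1 - p) * c1) / (p * c2)) / snr.
    ratio = ((1.0 - p_effective) * cost_type1) / (p_effective * cost_type2)
    z_crit = snr / 2.0 + log(ratio) / snr
    alpha = 1.0 - STD_NORMAL.cdf(z_crit)
    return z_crit, alpha


if __name__ == "__main__":
    # Hypothetical inputs: equal costs, snr of 2, varying prior probability.
    for prior in (0.3, 0.5, 0.7):
        z_crit, alpha = bda_optimal_alpha(prior, 1.0, 1.0, 2.0)
        print(f"prior={prior:.1f}  z*={z_crit:.3f}  alpha*={alpha:.3f}")
```

In this toy setting, a higher prior probability of efficacy or a higher cost of rejecting an effective therapy pushes the optimal type I error rate above the conventional fixed threshold, which is the qualitative effect the abstract reports.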
Financial Intermediation and the Funding of Biomedical Innovation: A Review
2023
We review the literature on financial intermediation in the process by which new medical therapeutics are financed, developed, and delivered. We discuss the contributing factors that lead to a key finding in the literature (underinvestment in biomedical R&D) and focus on the role that banks and other intermediaries can play in financing biomedical R&D and potentially closing this funding gap. We conclude with a discussion of the role of financial intermediation in the delivery of healthcare to patients.
Social Contagion and the Survival of Diverse Investment Styles
2023
We examine the contagion of investment ideas in a multiperiod setting in which investors are more likely to transmit their ideas to other investors after experiencing higher payoffs in one of two investment styles with different return distributions. We show that heterogeneous investment styles are able to coexist in the long run, implying a greater diversity than predicted by traditional theory. We characterize the survival and popularity of styles in relation to the distribution of security returns. In addition, we demonstrate that psychological effects such as conformist preference can lead to oscillations and bubbles in the choice of style. These results remain robust under a wide class of replication rules and endogenous returns. They offer empirically testable predictions, and provide new insights into the persistence of the wide range of investment strategies used by individual investors, hedge funds, and other professional portfolio managers.
Explainable Machine Learning Models of Consumer Credit Risk
2023
In this work, the authors create machine learning (ML) models to forecast home equity credit risk for individuals using a real-world dataset and demonstrate methods to explain the output of these ML models to make them more accessible to the end user. They analyze the explainability for various stakeholders: loan companies, regulators, loan applicants, and data scientists, incorporating their different requirements with respect to explanations. For loan companies, they generate explanations for every model prediction of creditworthiness. For regulators, they perform a stress test for extreme scenarios. For loan applicants, they generate diverse counterfactuals to guide them with steps toward a favorable classification from the model. Finally, for data scientists, they generate simple rules that accurately explain 70%–72% of the dataset. Their study provides a synthesized ML explanation framework for all stakeholders and is intended to accelerate the adoption of ML techniques in domains that would benefit from explanations of their predictions.
Optimal Financing for R&D-Intensive Firms (Working Paper)
2023
We develop a theory of optimal financing for R&D-intensive firms. With only market financing, the firm relies exclusively on equity financing and carries excess cash, but underinvests in R&D. We use mechanism design to examine how intermediated financing can attenuate this underinvestment. The mechanism combines equity with put options such that investors insure firms against R&D failure and firms insure investors against high R&D payoffs not being realized.
Macro-Finance Models with Nonlinear Dynamics
2023
We provide a review of macro-finance models featuring nonlinear dynamics. We survey the models developed recently in the literature, including models with amplification effects of financial constraints, models with households' leverage constraints, and models with financial networks. We also construct an illustrative model for those readers who are unfamiliar with the literature. Within this framework, we highlight several important limitations of local solution methods compared with global solution methods, including the fact that local-linearization approximations omit important nonlinear dynamics, yielding biased impulse-response analysis.
Jack Bogle: Champion of the People
2022
A tribute to John (Jack) C. Bogle, founder of The Vanguard Group, who passed away on January 16, 2019.
Estimation and Prediction for Algorithmic Models of Investor Behavior
2022
We propose a Markov chain Monte Carlo (MCMC) algorithm for estimating the parameters of algorithmic models of investor behavior. We show that this method can successfully infer the relative importance of each heuristic among a large cross-section of investors, even when the number of observations per investor is quite small. We also compare the accuracy of the MCMC approach to regression analysis in predicting the relative importance of heuristics at the individual and aggregate levels and conclude that MCMC predicts aggregate weights more accurately while regression outperforms in predicting individual weights.
Pandemic Readiness Requires Bold Federal Financing for Vaccines
2022
Most people will experience a severe pandemic within their lifetime, and the world remains dangerously unprepared. In fact, scientists predict a nearly 50% chance, about the odds of a coin flip, that we will endure another COVID-19-level pandemic within the next 25 years. Shifting America’s pandemic response capability from reactive to proactive is, therefore, urgent. Failure to do so risks the country’s welfare.
Getting ahead of the next pandemic is impossible without government financing. Vaccine production is costly, and these expenses will hinder industries from preemptively developing the tools needed to halt disease transmission. For example, the total expected revenues over a 20-year vaccine patent lifecycle would cover just half of the upfront research and development (R&D) costs.
However, research suggests that a portfolio-based approach to vaccine development, especially now with new, broadly applicable mRNA technology, dramatically increases the returns on investment while also guarding against an estimated 31 of the next 45 epidemic outbreaks. With lessons learned from Operation Warp Speed, Congress can deploy this approach by (i) authorizing and appropriating $10 billion to the Biomedical Advanced Research and Development Authority (BARDA), (ii) developing a vaccine portfolio for 10 emerging infectious diseases (EIDs), and (iii) establishing a White House Office of Science and Technology Policy (OSTP)-led interagency effort focused on scaling up production of priority vaccines.
Identifying and Mitigating Potential Biases in Predicting Drug Approvals
2022
INTRODUCTION: Machine learning models are increasingly applied to predict drug development outcomes based on intermediary clinical trial results. A key challenge in this task is to address various forms of bias in the historical drug approval data.
OBJECTIVE: We aimed to identify and mitigate the bias in drug approval predictions and quantify the impacts of debiasing in terms of financial value and drug safety.
METHODS: We instantiated the Debiasing Variational Autoencoder, the state-of-the-art model for automated debiasing. We trained and evaluated the model on the Citeline dataset provided by Informa Pharma Intelligence to predict the final drug development outcome from phase II trial results.
RESULTS: The debiased Debiasing Variational Autoencoder model predicted drug development outcomes better (F1 score of 0.48) than its un-debiased baseline (F1 score of 0.25). It had a much higher true-positive rate than the baseline (60% vs 15%), while its true-negative rate was slightly lower (88% vs 99%). The Debiasing Variational Autoencoder distinguished between drugs developed by large pharmaceutical firms and those by small biotech companies. The model prediction is strongly influenced by multiple factors such as prior approval of the drug for another indication, whether the trial meets the positive/negative endpoints, and the year when the trial is completed. We estimate that the debiased model generates financial value for the drug developer in six major therapeutic areas, with a range of US$763–1,365 million.
CONCLUSIONS: Our analysis shows that debiasing improves the financial efficiency of late-stage drug development. From the pharmacovigilance perspective, the debiased model is more likely to identify drugs that are both safe and effective. Meanwhile, it may predict a higher probability of success for drugs with potential adverse effects (because of its lower true-negative rate), so it must be used with caution to predict the development outcomes of drug candidates currently in the pipeline.
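For readers less familiar with the metrics quoted in this abstract, the sketch below shows how true-positive rate, true-negative rate, precision, and F1 score follow from a confusion matrix. The function name and the counts are hypothetical and are not taken from the study. The counts are chosen only so that the two rates equal 60% and 88%; the F1 score also depends on the class balance, which the abstract does not state, so the value computed here need not match the reported 0.48.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fn: approved drugs predicted as approved / not approved.
    tn/fp: failed drugs predicted as failed / approved.
    Any ratio with a zero denominator is reported as 0.0.
    """
    def safe_div(num, den):
        return num / den if den else 0.0

    tpr = safe_div(tp, tp + fn)        # true-positive rate (recall)
    tnr = safe_div(tn, tn + fp)        # true-negative rate (specificity)
    precision = safe_div(tp, tp + fp)  # share of predicted approvals that are real
    f1 = safe_div(2 * precision * tpr, precision + tpr)  # harmonic mean
    return {"tpr": tpr, "tnr": tnr, "precision": precision, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts: 100 eventual approvals and 1,000 eventual failures.
    metrics = classification_metrics(tp=60, fp=120, tn=880, fn=40)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Because precision falls as the share of failures grows, the same pair of rates can correspond to quite different F1 scores, which is why both kinds of metric are worth reporting.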
Estimates of Probabilities of Successful Development of Pain Medications: An Analysis of Pharmaceutical Clinical Development Programs from 2000 to 2020
2022
BACKGROUND: The authors estimate the probability of successful development and duration of clinical trials for medications to treat neuropathic and nociceptive pain. The authors also consider the effect of the perceived abuse potential of the medication on these variables.
METHODS: This study uses the Citeline database to compute the probabilities of success, duration, and survivorship of pain medication development programs between January 1, 2000, and June 30, 2020, conditioned on the phase, type of pain (nociceptive vs. neuropathic), and the abuse potential of the medication.
RESULTS: The overall probability of successful development of all pain medications from phase 1 to approval is 10.4% (standard error, 1.5%). Medications to treat nociceptive and neuropathic pain have a probability of successful development of 13.3% (standard error, 2.3%) and 7.1% (standard error, 1.9%), respectively. The probabilities of successful development of medications with high abuse potential and low abuse potential are 27.8% (standard error, 4.6%) and 4.7% (standard error, 1.2%), respectively. The most common period for attrition is between phase 3 and approval.
CONCLUSIONS: The authors’ data suggest that the unique attributes of pain medications, such as their abuse potential and intended pathology, can influence the probability of successful development and duration of development.
World of EdCraft: Teaching Healthcare Finance at MIT
2022
In this article, I describe my approach to dealing with the challenges and opportunities of synchronous online teaching during the Fall semester of 2020 in the specific context of a 90-student graduate course in Healthcare Finance at the MIT Sloan School of Management.
