We are in the midst of a transition in which software is becoming increasingly important for drug discovery. While parts of the drug discovery pipeline have used sophisticated computational methods for decades, other domains such as clinical trials use scarcely any software beyond Word or Excel. This situation is unfortunate, because there already exists a strong, albeit fragmented, ecosystem of open source datasets. Unifying this ecosystem and making it accessible from open source software packages like DeepChem will accelerate the process of discovering new medicines. In this essay, we explore what good open source tooling for disease selection and clinical trial management might look like, and comment on good directions for open algorithmic development.
1/ What can’t deep learning do? Worth putting together a list of known failures to guide algorithmic development.
Machine learning and big data are broadly believed to be synonymous. The story goes that large amounts of training data are needed for algorithms to discern signal from noise. As a result, machine learning techniques have been most used by web companies with troves of user data. For Google, Facebook, Microsoft, Amazon, Apple (or the “Fearsome Five” as Farhad Manjoo of the New York Times has dubbed them), obtaining large amounts of user data is no issue. Data usage policies have become increasingly broad, allowing these companies to make use of everything from our keystrokes to our personal locations as we use company products. As a result, web companies have been able to offer very useful, but intrusive, products and services that rely on large datasets. Datasets with billions to trillions of datapoints are not unreasonable for these companies.
Businesses need safeguards and moats to grow. In order for investors to justify allocating capital to a business, there have to be mechanisms to ensure that value created doesn’t leak out. Traditionally, this value has been safeguarded by a variety of legal mechanisms. For example, in markets where products are scarcely differentiated from one another, branding becomes critical. Tide detergent and Costco generic detergent are likely near identical, but Tide has the advantage of strong branding and packaging. In order to protect this value, Tide can take out trademarks protecting its distinctive packaging and visuals. In the pharmaceutical industry, a company’s critical intellectual property has resided in novel chemical entities. The effort expended in these inventions is protected by patents that provide the right to use the novel molecule for a given purpose. The rule of law ensures that knock-offs can’t ignore these protections and piggy-back on Tide’s brand or Merck’s molecules.
Over the last century, science has been conducted largely in English. After the First World War, renowned scientists gathered in England and America. As a result, their discoveries were written in English.
But two hundred years ago, the situation was different. At that time, science was flourishing in Germany. Scholars such as Gauss and Euler made many discoveries. A thousand years ago, in Arabia, Ibn Sina achieved many feats in medicine. Two thousand years ago, in Greece, Pythagoras developed mathematics.
Why have I given this list? To show that scientists have lived in every corner of the world!
Therefore, there is no rule that science must be done only in English. There are good reasons to cultivate science in other languages: science is not born from mathematics alone. A new discovery can have many root causes. The ideas of Mach, a German writer, provided the stimulus for Einstein’s discovery of relativity. In the same way, the ideas of geniuses such as Valluvar, Kamban, and Bharathiyar have helped the research that I do.
Many ideas may prove useful for the growth of science in the future. Concepts from many different cultures can help with that. Let us do research in many languages so that science may advance!
The drug discovery process is a complex one, consisting of many stages and typically requiring many years. In this essay, I’ll provide a brief overview of the traditional drug discovery pipeline, and comment a bit on the various technologies, both biological and computational, used in the process. Listed in sequential order, the stages of the pipeline are target selection and validation, hit finding, lead optimization and toxicity handling, animal model testing, Phase I clinical testing, Phase II clinical testing, and Phase III clinical testing. Let’s cover each of these in order.
Target discovery is the process of identifying the biological system the drug will target. The idea is to identify a critical driver of a disease. This approach is called target-driven drug discovery, as opposed to phenotypic drug discovery, which looks directly for compounds that appear to treat diseased individuals or animals. Target-driven drug discovery has a number of advantages over the phenotypic approach. The rapid growth of nanoscience techniques to study biological molecules has meant that scientists have a fine-grained understanding of the structure of a vast array of biomolecules. This fine-grained understanding has allowed for the design of a number of new targeted cancer therapeutics and has resulted in target-driven discovery becoming the dominant form of modern discovery efforts.
There are a number of critics who argue that this emphasis has gone too far. Drugs never affect only one target, and designing them as if they did will inevitably lead to prominent failures. A number of companies have started platforms that hybridize the strengths of phenotypic and target-driven discovery.
This early stage of drug discovery is often the most nebulous. For complicated diseases such as Alzheimer’s, pancreatic cancers, or obesity, scientists have very little idea about critical biological drivers. Multiple potential Alzheimer’s drugs have suffered high profile failures, likely due to mistaken hypotheses about the root causes of the disease. This fact emphasizes the reality that drug discovery is intimately linked to deep questions about biology, and until human knowledge of biological systems advances dramatically, many diseases will remain resistant to our best efforts at discovering treatments.
Returning to the mechanics of the pipeline, target validation is the flip side of the target discovery process. Some biomolecules are not good targets for drug design projects. Such biomolecules may be harder to reach than others for metabolic or biophysical reasons. Choosing a challenging target is usually a bad idea unless some new biochemical technology makes the targeting feasible. Target validation seeks to establish that a putative target is, to a first approximation, reachable by standard therapeutics and amenable to chemical modulation. This process is largely one of guesswork, and tends to rely upon experienced biologists’ intuitions.
After a drug discovery project has obtained and validated a target, it proceeds to hit finding, namely the process of finding a first molecule which will modulate the biomolecular target in question. An important note is that the process of finding a hit might be significantly different depending on the type of drug you’d like to design. The two major types of drugs are small molecules and biologics. Small molecules are molecules with between ten and a few hundred atoms. They tend to be designed and created by human chemists. Biologics are much larger molecules, ranging from thousands to hundreds of thousands of atoms. Such molecules are often antibodies, a special type of protein created by immune systems. Biologics are often manufactured by co-opting the machinery of living organisms (living mice, or perhaps human cells).
For small molecules, there are some tried-and-true ways of hit finding. The first is to simply search public databases (or the patent literature) to find compounds known to modulate the biomolecular target in question. More often than not, some compounds will be available. Another strategy is to use a high throughput screening (HTS) assay. HTS assays are roboticized experiments that will test up to hundreds of thousands of standard drug-like molecules against a designated biomolecule. These tests have a number of caveats; the detection mechanism for hits is known to have many issues, and the experiments tend to be quite expensive to run since procuring chemicals to test takes significant effort. For antibodies, the set of hit-finding techniques is entirely different. Common technologies include phage display, which expresses a library of candidate antibody fragments on the surfaces of bacteriophages (a type of virus) and selects those that bind the desired target.
Hit finding is the first part of drug discovery which has materially benefited from computational methods. Such computational drug discovery methods got a big boost in the 1970s and 1980s with the advent of “virtual screening” methods which attempted to speed up the hit finding process. Since HTS assays remained (and still remain) expensive, virtual screening methods used machine learning from initial runs of the HTS system to guide future iterations of the search. Virtual screening has found a niche in the pharmaceutical world, with many companies maintaining an internal virtual screening platform. A complementary approach to computational hit finding is based on the structure of the target biomolecule. The structure, along with physics-based computations, is used to guide the design of initial compound hits. It’s worth noting that computational methods are almost exclusively used for small molecule therapeutics. The larger size and complexity of biologic molecules have rendered them resistant to common computational methods. It’s widely expected that the increasing power of modern computers will change this situation over the next few years.
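To make the idea of virtual screening a little more concrete, here is a minimal sketch of one of the simplest ligand-based approaches: ranking untested compounds by their similarity to compounds that already came up as hits. This is a similarity search rather than a trained machine learning model, and it is not a description of any particular company’s platform. The compound names and fingerprint bits below are made up for illustration; in practice the fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(known_hits: dict, library: dict) -> list:
    """Rank untested compounds by their best similarity to any known hit."""
    scored = [
        (name, max(tanimoto(bits, hit) for hit in known_hits.values()))
        for name, bits in library.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each integer stands for one substructure "bit".
known_hits = {
    "hit_A": frozenset({1, 2, 3, 4}),
    "hit_B": frozenset({2, 3, 5, 8}),
}
library = {
    "cmpd_1": frozenset({1, 2, 3, 9}),
    "cmpd_2": frozenset({6, 7, 10}),
    "cmpd_3": frozenset({2, 3, 5, 8, 11}),
}

ranking = rank_candidates(known_hits, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
# cmpd_3: 0.80
# cmpd_1: 0.60
# cmpd_2: 0.00
```

The compounds at the top of the ranking would be the ones to prioritize for the next (expensive) experimental run, which is the sense in which the computation guides the search.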
Computational methods raised much enthusiasm when they gained broad adoption in the late 1970s and early 1980s (landing on the cover of Fortune magazine in 1981). Unfortunately though, these early computational methods failed to live up to the hype, and the field of computational drug discovery went through a sort of “AI Winter.” Funding and enthusiasm for computational methods have only recently returned with the advent of deep learning methods, but many pharmaceutical companies retain institutional memories of the failure of the last wave of computational drug discovery.
In most pharmaceutical organizations, the role of computational modelers is limited. There exists a computational biology or computational chemistry division that handles most of the modelling work. The influence of these divisions is largely governed by how capable their members are at explaining their insights to biologists and chemists.
Returning once again to the drug discovery pipeline, once a suitable first hit is found, the next stages in the drug discovery process involve lead optimization and toxicity testing. The idea of lead optimization is to modify the hit molecule to tightly interact with the desired target biomolecule. The hope is that the tighter the interaction with the target, the weaker the interaction will be with any off-target biomolecules. For small molecules, interactions are usually optimized by teams of medicinal chemists. These chemists iteratively modify the structure of the hit molecule until it binds tightly with the target biomolecule. For antibodies, iterative screening platforms try generations of antibodies, with each generation interacting a little more tightly with the target. As a quick note, making synthetic modifications to small molecules is a feat of considerable technical sophistication, but one that can be reliably contracted out to external labs. For antibody design, it is often easier to do the work in house since materials are easier to obtain, but external contractors can help as well.
Toxicity testing assesses whether the compound is safe for human ingestion. A large part of this process involves ADME (absorption, distribution, metabolism, and excretion) property characterization. There are a number of standard experiments for small molecules (such as hERG and CYP450 inhibition assays) that measure whether there are harmful interactions with biological systems the drug is not meant to affect. None of these methods are perfect, however, and it is routine to find lead molecules which behave perfectly in the standard toxicity panel yet still cause failures later. Testing toxicity in rats or mice is usually a reasonable proxy for gauging toxicity, but the real test comes in humans.
For antibodies, metabolism poses major challenges. Our stomachs are designed to easily digest large biomolecules like antibodies. As a result, antibodies cannot be ingested in pill format the way small molecules can and have to be injected into patient veins, a major inconvenience which limits antibody based treatments to more serious medical disorders than small molecule treatments. Less is known about toxicity and metabolism properties of antibodies than about small molecules since the former have been in use for a shorter period. However, this is rapidly changing as many more biotech companies invest heavily in biologic development.
Computational lead optimization and toxicity prediction have made some inroads in the last few years. Schrödinger’s FEP suite has gained some traction amongst pharmaceutical companies for lead optimization, and the recent Tox21 challenge for computational toxicology (won by deep learning) raised hope that deep learning methods could advance toxicity testing. Both of these efforts are still works in progress, with partial successes but no definitive advances yet. As we noted previously, these computational advances are entirely limited to small molecules.
Once the lead therapeutic, small molecule or biologic, has been lead optimized and designed to be (hopefully) nontoxic, scientists move into phase I trials. In these trials, healthy human volunteers are dosed with the proposed treatment. These trials are often done in dose escalation format, where volunteers are first dosed with a very low dose likely to be safe. The dose is progressively escalated and any side effects found are noted, with the trial stopped if the side effects become too severe. Traditionally, phase I trials have not been used to gauge whether the therapeutic actually treats the disease in question. These trials typically only have a small number of volunteers, on the order of a few dozen or so. The success rate of phase I trials is around 60% for a new therapeutic, demonstrating that the current process of estimating toxicity still leaves much to be desired.
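As a rough illustration of how a dose escalation rule can be written down, here is a minimal sketch of a simplified “3+3” design, a commonly used escalation scheme. The essay does not say which escalation rule a given trial uses, so treat the choice of 3+3 as an assumption, and the dose levels and toxicity counts below as made-up numbers.

```python
def three_plus_three(doses, toxicity_counts):
    """Return the highest dose cleared by a simplified 3+3 rule, or None.

    doses: dose levels in increasing order.
    toxicity_counts: for each dose, a pair (first, second) giving the number
        of dose-limiting toxicities seen in the first cohort of 3 volunteers
        and in the second cohort of 3 (only enrolled if first == 1).
    """
    cleared = None
    for dose, (first, second) in zip(doses, toxicity_counts):
        if first == 0:
            cleared = dose  # 0 of 3 volunteers affected: escalate
        elif first == 1 and second == 0:
            cleared = dose  # 1 of 6 volunteers affected: escalate
        else:
            break  # 2 or more affected: stop at the previous dose
    return cleared


# Hypothetical dose levels (in mg) and observed toxicity counts.
doses = [10, 20, 40, 80]
toxicity_counts = [(0, 0), (1, 0), (2, 0), (0, 0)]

max_cleared_dose = three_plus_three(doses, toxicity_counts)
print(max_cleared_dose)  # 20
```

Real phase I protocols add many details (cohort timing, grading of side effects, stopping for safety at any point), so this only captures the skeleton of the escalate-or-stop decision.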
Phase II trials are significantly larger, with typically a few hundred patients treated (note that unlike phase I, these patients actually suffer from the disease being treated). These trials can be conducted as open label trials, or as blinded trials. In open label trials, both patients and doctors know what treatment is being provided. In single blinded trials, typically the patient does not know whether the novel treatment or a standard treatment is being administered. In double blinded trials, neither the patient nor the doctor knows which treatment is being administered. Phase II trials have the highest failure rates of all the stages in the clinical pipeline, with only roughly 35% of compounds passing this phase. This extreme failure rate is due to modern science’s very weak understanding of human biology. It is common to find, upon reaching this stage of the pipeline, that hypotheses about targets are fatally flawed. Computational methods for systems biology might one day make it feasible to decrease this failure rate, but such methods appear to be significantly out of the reach of today’s scientists.
Phase III trials are basically a scaled-up version of phase II trials. Up to a few thousand patients are enrolled in the trials. In general, phase III trials are the gold standard of proof required by the FDA in order to approve a medicine for consumers. The success rate for phase III trials is higher than that for phase II, at around 60%. This relatively low rate demonstrates the many statistical challenges present in designing clinical trials. Many therapeutics display moderate to low effectiveness in phase II trials, and these results turn out to be simple statistical flukes when tested in the larger phase III setting.
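It helps to multiply the three clinical success rates quoted in this essay together. The short calculation below is a back-of-the-envelope sketch that assumes the phases succeed or fail independently and that the rough figures above (60%, 35%, 60%) apply to every candidate, which is a simplification.

```python
# Approximate per-phase success rates quoted in the text.
phase_success = {"Phase I": 0.60, "Phase II": 0.35, "Phase III": 0.60}

overall = 1.0
for phase, rate in phase_success.items():
    overall *= rate
    print(f"Chance of surviving through {phase}: {overall:.1%}")
# Chance of surviving through Phase I: 60.0%
# Chance of surviving through Phase II: 21.0%
# Chance of surviving through Phase III: 12.6%

# Expected number of candidates entering phase I per approved medicine.
candidates_per_approval = 1 / overall
print(f"Candidates needed per approval: {candidates_per_approval:.1f}")
# Candidates needed per approval: 7.9
```

Under these assumptions, only about one in eight therapeutics that enter human testing makes it all the way through phase III.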
Computational methods have not penetrated the clinical testing phase of drug discovery. The development of effective computational methods for clinical drug discovery may instead be spurred by the recent wave of progress in AI medicine; it’s possible that technologies like convolutional neural networks will allow for careful monitoring of patient response to treatments, permitting effective trials with fewer patients.
The entire process of drug discovery is extremely expensive. Estimates of the exact cost vary depending on the analysis method, but a few billion dollars of expenditure per approved medicine is not uncommon. According to one estimate by Tufts University, preclinical expenditures run up to 430 million dollars, and clinical expenditures run up to 960 million dollars on average. The entire process takes roughly 10 years from start to finish. The future of drug discovery belongs to those who can invent techniques (algorithmic, biological, logistical) which succeed in lowering costs, increasing success rates, and decreasing time for discovering new medicines.
A few weeks ago, I had an interesting conversation at a conference. The person I was talking to asked me about my work. I mentioned that I worked on DeepChem, an open source library facilitating deep learning for drug discovery. I explained that the tools we build help make the drug discovery process a little easier. Nothing revolutionary yet, but given the powers of deep learning and our promising early results, we had cause to be hopeful. My questioner was a little surprised to hear my answer. He asked, “Since pharma companies have a lot of money, why does it make sense to make your tools open source? I can see the point of open source for projects where there aren’t resources available, but why bother making free tools given that your biggest users have plenty of cash?” This question stumped me a bit.
I’ve left this blog on a bit of a hiatus due to a number of other projects taking up the majority of my time. In an effort to revive the blog, I’m experimenting with tweet-storm blog posts. For those unfamiliar, tweet storms are a rapid fire sequence of tweets which tell a story or make an argument. I tweet more often than I write blog posts, so occasionally I’ll transcribe some of my tweet storms in blog form.
Can an AI believe in God? This is a seldom-asked question in the philosophical community that thinks about artificial intelligence (perhaps since that community is primarily rationalist). The question is worth asking. There have been numerous pieces of art that think about the capacity of AI to feel emotion; see the movie Her for a love story between man and AI. If an AI can feel intense emotion for a human being, why couldn’t an AI feel a similarly fervent devotion for an abstract deity?
Modern drug discovery remains an artisanal pursuit, driven in large part by luck and expert knowledge. While this approach has worked spectacularly in the past, the last few years have seen a systematic decrease in the number of new drugs discovered per dollar spent. Eroom’s law empirically demonstrates that the number of new drugs per dollar has been falling exponentially year over year. Eroom is of course Moore spelled backward, where Moore’s law observes that transistor densities on computer chips have been increasing exponentially year over year for the past fifty years. The opposite trends in increasing computational power per dollar versus decreasing number of drugs discovered serve as a reminder that naive computation is insufficient to solve hard biological problems (a topic I’ve written about previously). To reverse Eroom’s law, scientists must combine deep biological insights with computational modeling, and I hypothesize that the best path forward is systematically learning causal models of human disease and drug actions from available experimental data.
Fifty years ago, the first molecular dynamics papers allowed scientists to exhaustively simulate systems with a few dozen atoms for picoseconds. Today, due to tremendous gains in computational capability from Moore’s law, and due to significant gains in algorithmic sophistication from fifty years of research, modern scientists can simulate systems with hundreds of thousands of atoms for milliseconds at a time. Put another way, scientists today can study systems tens of thousands of times larger, for billions of times longer than they could fifty years ago. The effective reach of physical simulation techniques has expanded handleable computational complexity ten-trillion fold. The scope of this achievement should not be underestimated; the advent of these techniques along with the maturation of deep learning has permitted a host of start-ups (1, 2, 3, etc) to investigate diseases using tools that were hitherto unimaginable.
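The ten-trillion figure follows from multiplying the gain in system size by the gain in simulated time. Here is that arithmetic spelled out, using representative values that I’m assuming for “a few dozen” and “hundreds of thousands” of atoms (30 and 300,000); other reasonable choices give the same order of magnitude.

```python
# Assumed representative system sizes, in atoms.
atoms_then = 30        # "a few dozen atoms"
atoms_now = 300_000    # "hundreds of thousands of atoms"
size_gain = atoms_now // atoms_then

# Simulated timescales as powers of ten, in seconds.
picosecond_exponent = -12   # 1 ps = 10**-12 s
millisecond_exponent = -3   # 1 ms = 10**-3 s
time_gain = 10 ** (millisecond_exponent - picosecond_exponent)

combined_gain = size_gain * time_gain

print(f"Size gain:     {size_gain:,}")       # 10,000
print(f"Time gain:     {time_gain:,}")       # 1,000,000,000
print(f"Combined gain: {combined_gain:,}")   # 10,000,000,000,000
```

A factor of ten thousand in size times a factor of a billion in time gives ten trillion, matching the claim in the text.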
There’s been a lot of recent attention to the threat of antibiotic resistance (see this recent NYTimes piece for example). As a quick summary, overuse of antibiotics by doctors and farmers has triggered the evolution of bacteria that are resistant to all available antibiotics. The conclusion follows that more needs to be done to curtail unnecessary antibiotic use, and to develop novel antibiotics that can cope with the coming onslaught of antibiotic-resistant bacteria.
I just read the fascinating paper Could a neuroscientist understand a microprocessor? The paper simulates a simple microprocessor (the MOS 6502, used in the Apple I and in the Atari video game system) and uses neuroinformatics techniques (mostly statistics/machine-learning) to analyze the simulated microprocessor. More specifically, the authors analyze the connections between transistors on the microprocessor (see Connectomics), ablation of single transistors, covariances of transistor activities, and whole-chip recordings (analogous to whole-brain recordings).