“Artificial Intelligence in Drug Design” Book Highlights. What can you learn about the latest trends of AI in pharma?
During the last winter holiday period, I decided to rediscover an "old" habit: reading a real paper book instead of using an e-reader. The book that I packed in my luggage and carried with me during the stay with my family in Italy is the excellent "Artificial Intelligence in Drug Design", edited by Alexander Heifetz from Evotec.
To cut the suspense short, I'll go straight to the point: the book is simply great! Basically, it reflects what has recently been cooking in the pharma industry around the use of AI and machine learning for drug design. It contains a collection of the finest and most up-to-date methods applied today; it elucidates their potential and guides the reader towards understanding their different strengths and weaknesses. It also warns about potential pitfalls, something I find less often in the literature than I would expect, and that adds huge value.
The hot topics in pharma
The book addresses many topics. The hot ones (meaning those I found mentioned most often throughout the entire work) include Generative Chemistry, Target Profiling, ADMET Prediction and Scoring and Synthesis Planning (Chapter 4). If I were asked to pick the "hottest" (here meaning the ones described in most detail across the entire tome), I would say that Generative Chemistry and ADMET Prediction are at the top of the list. These two topics are, in fact, significant. For example, to get an idea of why it is so crucial to find new molecular structures: at the very beginning of the book (Chapter 1) it is reported that, according to a 2014 study, 83% of the rings in drugs were developed prior to 1983. The boost that AI could provide is already quite clear, especially if one keeps reading and, just two lines later, finds out that all known ring systems in chemical databases (study from 2017) account for only 1.4% of the chemically feasible rings!
Of course, as with any other method, tool, or technique, AI has to be used with some caution. In this regard, I can report that several authors clearly stress the importance of two particular subjects. I'm referring to:
The need for using only high-quality data coming from standardized comparable assays to create good predictive models.
The inclusion of a reliability measure for model results - Applicability Domain (Chapter 2 and Chapter 15).
For the first, the take-home message is "simply" that all three checkboxes (high-quality, standardized, and comparable) should be ticked when searching for, compiling, or creating a dataset for such modeling purposes. For the second, having an idea of the applicability domain matters because a model will always return a result, but only the confidence in that result allows the user to benefit from it and make meaningful decisions. Let's not forget that the goal of these models is to provide insights that inform the next business decision and guide the scientist towards the most favorable compounds.
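As a rough illustration of the applicability-domain idea, here is a minimal sketch in Python. It is not a method taken from the book: it assumes molecules are represented as sets of "on" fingerprint bits, and it flags a query as outside the domain when its Tanimoto similarity to the nearest training compound falls below a cutoff. The names `tanimoto` and `applicability`, the toy fingerprints, and the 0.5 threshold are all hypothetical choices made for the example.

```python
from typing import FrozenSet, Iterable, Tuple

# Assumption: a molecule is described by the indices of its "on" fingerprint bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def applicability(
    query: Fingerprint,
    training: Iterable[Fingerprint],
    threshold: float = 0.5,  # hypothetical cutoff, to be tuned per model
) -> Tuple[float, bool]:
    """Return (similarity to nearest training compound, inside-domain flag)."""
    nearest = max((tanimoto(query, fp) for fp in training), default=0.0)
    return nearest, nearest >= threshold


# Toy data: two training compounds and two queries.
training_set = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5, 8})]
similar_query = frozenset({1, 2, 3, 5})
unrelated_query = frozenset({10, 11, 12})

for name, query in [("similar", similar_query), ("unrelated", unrelated_query)]:
    similarity, inside = applicability(query, training_set)
    label = "inside" if inside else "outside"
    print(f"{name}: nearest similarity {similarity:.2f}, {label} the domain")
```

A model would still return a prediction for both queries; the flag only indicates how much that prediction should be trusted. Real applicability-domain methods can be considerably more elaborate than this nearest-neighbor check.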
But where to start?
Chapter 12 provides a comprehensive list of deep generative models available in the literature from 2017 to 2020. I find this table, which extends over multiple pages, probably the best entry point for anyone who wants to start getting their hands dirty with AI and ML. If instead one is looking for specific tools that are powered by AI under the hood, excellent examples are reported as well: Chapter 17 has a summary of available retrosynthesis planning tools, Chapter 16 reports examples of drug discovery tools, and Chapter 11 collects scoring functions for docking experiments based on Deep Neural Networks.
One more thing I liked about this volume is that it is not only packed with food for thought but also a very nice source of references for people wanting to apply what is available in the literature.
Personal favorites
The book really manages to stimulate curiosity about the variety of AI applications. For someone like me, who played a lot with High-Throughput Virtual Screening (HTVS) during the PhD, it was simply fascinating to read about the emerging field of uHTVS (Ultrahigh-Throughput Virtual Screening) in Chapter 13. The power of this method, and the amount of data that can be processed with the boost from AI, is honestly really cool!
Speaking of personal favorites (and to give in to a bit of shameless pride), it was also delightful to find out that my paper "A structure-kinetic relationship study using matched molecular pair analysis", co-written with my ex-colleague Doris when I was part of Prof. Gerhard Ecker's group at the University of Vienna, is referenced in this book! In Chapter 8, which addresses the intriguing topic of predicting the drug-target residence time of GPCR ligands with Machine Learning, our KIND (KINetic Dataset) is mentioned as by far the largest collection of kinetic data.
What a nice cherry on top of such an interesting read!
Check out the original article on LinkedIn and don’t hesitate to like and comment!
References:
Chapters ordered according to their mention in the text:
Chapter 4: Rishi Gupta, Novartis
Chapter 1: Chris de Graaf, Sosei Heptares and Andreas Bender and colleagues, University of Cambridge
Chapter 2: Alexander Hillisch and colleagues, Bayer
Chapter 15: Gerhard Hessler and colleagues, Sanofi
Chapter 12: Ferruccio Palazzesi and Alfonso Pozzan, Evotec
Chapter 17: Govinda Bhisetti and Cheng Fang, Biogen
Chapter 16: Constantino Diaz Gonzalez and colleagues, Evotec
Chapter 11: Andrew Anighoro, Evotec
Chapter 13: Austin Clyde, University of Chicago and Argonne National Laboratory
Chapter 8: Andrea Townsend-Nicholson and colleagues, UCL and Evotec