
Machine Learning for COVID-19: What Can We Do?

Last week, MIT moved to a new virtual existence to contain the COVID-19 spread. This whole situation is surreal for most of us: empty classrooms and corridors make MIT feel like a ghost town. At the same time, these sudden changes prompted many of us to ask: what can we do to help?

Less than a month ago, Cell published our paper on using AI tools to help identify a potent new antibacterial drug, Halicin. The method required a relatively small E. coli inhibition screen of 2,500 compounds, and was then used to identify Halicin and other candidates from much larger compound libraries. The work was completed in just a few months. The natural question now is: what can we do for COVID-19?

A group of faculty and our students decided to dig deeper into this question. Our initial goal is to identify promising COVID-19 antiviral molecules, and/or cocktails of them, that could then be tested in the lab. Since bringing novel molecules to the clinic is both a long and an expensive process, we are instead looking into repurposing existing drugs or compounds that are already in phase 1 or later clinical trials.

The majority of our team comes from CS backgrounds, with little prior experience in viruses (other than those we have personally hosted). Luckily, we were able to connect with the local biotech community and with outside collaborators who have extensive experience in developing vaccines and antivirals. Our team is expanding and rapidly ramping up its efforts. Since the machine learning community is not yet actively engaged, we hope to accelerate this process by openly sharing relevant resources and baselines. We hope other researchers will join and contribute their expertise to this vital effort as well.

Lastly, it is worth noting that a lot of our time has been spent on obtaining data pertaining to the virus or its close relatives.
We have connected with the COVID-19 Therapeutics Accelerator (sponsored by the Gates Foundation, Wellcome, and Mastercard) and with Pfizer, which have started dedicated initiatives on COVID-19. At the moment, they do not have data to share, and local labs lack the capacity to generate this data, a situation we hope will soon change. Ideally, we would have access to molecular screens measuring inhibitory activity against the SARS-CoV-2 virus or related protein targets. Currently, our data is scraped from public sources (PubChem), pertains to the related virus SARS-CoV-1, and was identified with the generous help of Dr. Malone. More details on these and other datasets will be posted on our project page. More updates to come.

The team

PIs: Pulkit Agrawal, Regina Barzilay, James Collins, Connor Coley, Rafael Gomez-Bombarelli, Tommi Jaakkola, Stefanie Jegelka, Klavs Jensen, Caroline Uhler

Postdocs and students: Benson Chen, Adam Fisch, Xiang Fu, Octavian Ganea, Lior Hirschfeld, Wengong Jin, Guang-He Lee, Peter Michaels, Victor Quach, Amit Schechter, Tal Schuster, Jonathan Stokes, Kyle Swanson, Allison Tam, Shangyuan Tong, Rachel Wu, Kevin Yang

Collaborators: Robert Malone
