Our team reviewed the papers accepted to NeurIPS 2020 and shortlisted the most interesting ones across different research areas. Here are the topics we cover:
- Natural Language Processing & Conversational AI
- Computer Vision
- Reinforcement Learning & More
- Tackling COVID-19 with AI & Machine Learning
If you’re interested in the remarkable keynote presentations, interesting workshops, and exciting tutorials presented at the conference, check our guide to NeurIPS 2020.
Subscribe to our AI Research mailing list at the bottom of this article to be alerted when we release new summaries.
COVID-19 Research at NeurIPS 2020
NeurIPS 2020 had a special call for COVID-19 related research. Out of a total number of 40 research papers submitted on this topic, one was accepted for oral presentation, 4 for spotlight presentation, and 4 for a poster presentation.
Here are the accepted COVID-19 research papers.
When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes
Zhaozhi Qian (University of Cambridge), Ahmed Alaa (UCLA), Mihaela van der Schaar (University of Cambridge)
The coronavirus disease 2019 (COVID-19) global pandemic has led many countries to impose unprecedented lockdown measures in order to slow down the outbreak. Questions on whether governments have acted promptly enough, and whether lockdown measures can be lifted soon have since been central in public discourse. Data-driven models that predict COVID-19 fatalities under different lockdown policy scenarios are essential for addressing these questions, and for informing governments on future policy directions. To this end, this paper develops a Bayesian model for predicting the effects of COVID-19 containment policies in a global context – we treat each country as a distinct data point, and exploit variations of policies across countries to learn country-specific policy effects. Our model utilizes a two-layer Gaussian process (GP) prior – the lower layer uses a compartmental SEIR (Susceptible, Exposed, Infected, Recovered) model as a prior mean function with “country-and-policy-specific” parameters that capture fatality curves under different “counterfactual” policies within each country, whereas the upper layer is shared across all countries, and learns lower-layer SEIR parameters as a function of country features and policy indicators. Our model combines the solid mechanistic foundations of SEIR models (Bayesian priors) with the flexible data-driven modeling and gradient-based optimization routines of machine learning (Bayesian posteriors) – i.e., the entire model is trained end-to-end via stochastic variational inference. We compare the projections of our model with other models listed by the Center for Disease Control (CDC), and provide scenario analyses for various lockdown and reopening strategies highlighting their impact on COVID-19 fatalities.
Sercan Arik (Google), Chun-Liang Li (Google), Jinsung Yoon (Google), Rajarishi Sinha (Google), Arkady Epshteyn (Google), Long Le (Google), Vikas Menon (Google), Shashank Singh (Google), Leyou Zhang (Google), Martin Nikoltchev (Google), Yash Sonthalia (Google), Hootan Nakhost (Google), Elli Kanal (Google), Tomas Pfister (Google)
We propose a novel approach that integrates machine learning into compartmental disease modeling (e.g., SEIR) to predict the progression of COVID-19. Our model is explainable by design as it explicitly shows how different compartments evolve and it uses interpretable encoders to incorporate covariates and improve performance. Explainability is valuable to ensure that the model’s forecasts are credible to epidemiologists and to instill confidence in end-users such as policy makers and healthcare institutions. Our model can be applied at different geographic resolutions, and we demonstrate it for states and counties in the United States. We show that our model provides more accurate forecasts compared to the alternatives, and that it provides qualitatively meaningful explanatory insights.
Yingxiang Yang (University of Illinois at Urbana-Champaign), Negar Kiyavash (Ecole polytechnique federale de Lausanne), Le Song (Georgia Institute of Technology), Niao He (UIUC & ETH Zurich)
Macroscopic data aggregated from microscopic events are pervasive in machine learning, such as country-level COVID-19 infection statistics based on city-level data. Yet, many existing approaches for predicting macroscopic behavior only use aggregated data, leaving a large amount of fine-grained microscopic information unused. In this paper, we propose a principled optimization framework for macroscopic prediction by fitting microscopic models based on conditional stochastic optimization. The framework leverages both macroscopic and microscopic information, and adapts to individual microscopic models involved in the aggregation. In addition, we propose efficient learning algorithms with convergence guarantees. In our experiments, we show that the proposed learning framework clearly outperforms other plug-in supervised learning approaches in real-world applications, including the prediction of daily infections of COVID-19 and medicare claims.
Mrinank Sharma (University of Oxford), Sören Mindermann (University of Oxford), Jan Brauner (University of Oxford), Gavin Leech (University of Bristol), Anna Stephenson (Harvard University), Tomáš Gavenčiak (Independent researcher), Jan Kulveit (University of Oxford), Yee Whye Teh (University of Oxford, DeepMind), Leonid Chindelevitch (Simon Fraser University), Yarin Gal (University of Oxford)
To what extent are effectiveness estimates of nonpharmaceutical interventions (NPIs) against COVID-19 influenced by the assumptions our models make? To answer this question, we investigate 2 state-of-the-art NPI effectiveness models and propose 6 variants that make different structural assumptions. In particular, we investigate how well NPI effectiveness estimates generalise to unseen countries, and their sensitivity to unobserved factors. Models which account for noise in disease transmission compare favourably. We further evaluate how robust estimates are to different choices of epidemiological parameters and data. Focusing on models that assume transmission noise, we find that previously published results are robust across these choices and across different models. Finally, we mathematically ground the interpretation of NPI effectiveness estimates when certain common assumptions do not hold.
Michael Widrich (LIT AI Lab / University Linz), Bernhard Schäfl (LIT AI Lab / University Linz), Hubert Ramsauer (LIT AI Lab / University Linz), Milena Pavlović (University of Oslo), Lukas Gruber (LIT AI Lab / University Linz), Markus Holzleitner (LIT AI Lab / University Linz), Johannes Brandstetter (LIT AI Lab / University Linz), Geir Kjetil Sandve (University of Oslo), Victor Greiff (University of Oslo), Sepp Hochreiter (LIT AI Lab / University Linz/ IARAI), Günter Klambauer (LIT AI Lab / University Linz)
A central mechanism in machine learning is to identify, store, and recognize patterns. How to learn, access, and retrieve such patterns is crucial in Hopfield networks and the more recent transformer architectures. We show that the attention mechanism of transformer architectures is actually the update rule of modern Hopfield networks that can store exponentially many patterns. We exploit this high storage capacity of modern Hopfield networks to solve a challenging multiple instance learning (MIL) problem in computational biology: immune repertoire classification. In immune repertoire classification, a vast number of immune receptors are used to predict the immune status of an individual. This constitutes a MIL problem with an unprecedentedly massive number of instances, two orders of magnitude larger than currently considered problems, and with an extremely low witness rate. Accurate and interpretable machine learning methods solving this problem could pave the way towards new vaccines and therapies, which is currently a very relevant research topic intensified by the COVID-19 crisis. In this work, we present our novel method DeepRC that integrates transformer-like attention, or equivalently modern Hopfield networks, into deep learning architectures for massive MIL such as immune repertoire classification. We demonstrate that DeepRC outperforms all other methods with respect to predictive performance on large-scale experiments including simulated and real-world virus infection data and enables the extraction of sequence motifs that are connected to a given disease class.
Code: official code implementation and datasets are available here.
Atalanti Mastakouri (Max Planck Institute for Intelligent Systems), Bernhard Schölkopf (MPI for Intelligent Systems, Tübingen)
In this work, we study the causal relations among German regions in terms of the spread of Covid-19 since the beginning of the pandemic, taking into account the restriction policies that were applied by the different federal states. We loose a strictly formulated assumption for a causal feature selection method for time series data, robust to latent confounders, which we subsequently apply on Covid-19 case numbers. We present findings about the spread of the virus in Germany and the causal impact of restriction measures, discussing the role of various policies in containing the spread. Since our results are based on rather limited target time series (only the numbers of reported cases), care should be exercised in interpreting them. However, it is encouraging that already such limited data seems to contain causal signals. This suggests that as more data becomes available, our causal approach may contribute towards meaningful causal analysis of political interventions on the development of Covid-19, and thus also towards the development of rational and data-driven methodologies for choosing interventions.
Vijil Chenthamarakshan (IBM Research), Payel Das (IBM Research), Samuel Hoffman (IBM Research), Hendrik Strobelt (IBM Research), Inkit Padhi (IBM Research), Kar Wai Lim (IBM Singapore), Ben Hoover (IBM), Matteo Manica (IBM Research Zürich), Jannis Born (IBM Research), Teodoro Laino (IBM Research Zurich), Aleksandra Mojsilovic (IBM Research)
The novel nature of SARS-CoV-2 calls for the development of efficient de novo drug design approaches. In this study, we propose an end-to-end framework, named CogMol (Controlled Generation of Molecules), for designing new drug-like small molecules targeting novel viral proteins with high affinity and off-target selectivity. CogMol combines adaptive pre-training of a molecular SMILES Variational Autoencoder (VAE) and an efficient multi-attribute controlled sampling scheme that uses guidance from attribute predictors trained on latent features. To generate novel and optimal drug-like molecules for unseen viral targets, CogMol leverages a protein-molecule binding affinity predictor that is trained using SMILES VAE embeddings and protein sequence embeddings learned unsupervised from a large corpus. We applied the CogMol framework to three SARS-CoV-2 target proteins: main protease, receptor-binding domain of the spike protein, and non-structural protein 9 replicase. The generated candidates are novel at both the molecular and chemical scaffold levels when compared to the training data. CogMol also includes insilico screening for assessing toxicity of parent molecules and their metabolites with a multi-task toxicity classifier, synthetic feasibility with a chemical retrosynthesis predictor, and target structure binding with docking simulations. Docking reveals favorable binding of generated molecules to the target protein structure, where 87–95% of high affinity molecules showed docking free energy < -6 kcal/mol. When compared to approved drugs, the majority of designed compounds show low predicted parent molecule and metabolite toxicity and high predicted synthetic feasibility. In summary, CogMol can handle multi-constraint design of synthesizable, low-toxic, drug-like molecules with high target specificity and selectivity, even to novel protein target sequences, and does not need target-dependent fine-tuning of the framework or target structure information.
Uchenna Akujuobi (King Abdullah University of Science and Technology and Sony AI), Jun Chen (King Abdullah University of Science and Technology), Mohamed Elhoseiny (King Abdullah University of Science and Technology), Michael Spranger (Sony AI), Xiangliang Zhang (King Abdullah University of Science and Technology)
Understanding the relationships between biomedical terms like viruses, drugs, and symptoms is essential in the fight against diseases. Many attempts have been made to introduce the use of machine learning to the scientific process of hypothesis generation(HG), which refers to the discovery of meaningful implicit connections between biomedical terms. However, most existing methods fail to truly capture the temporal dynamics of scientific term relations and also assume unobserved connections to be irrelevant (i.e., in a positive-negative (PN) learning setting). To break these limits, we formulate this HG problem as future connectivity prediction task on a dynamic attributed graph via positive-unlabeled (PU) learning. Then, the key is to capture the temporal evolution of node pair (term pair) relations from just the positive and unlabeled data. We propose a variational inference model to estimate the positive prior, and incorporate it in the learning of node pair embeddings, which are then used for link prediction. Experiment results on real-world biomedical term relationship datasets and case study analyses on a COVID-19 dataset validate the effectiveness of the proposed model.
Mukund Sudarshan (New York University), Wesley Tansey (Memorial Sloan Kettering Cancer Center), Rajesh Ranganath (New York University)
Predictive modeling often uses black box machine learning methods, such as deep neural networks, to achieve state-of-the-art performance. In scientific domains, the scientist often wishes to discover which features are actually important for making the predictions. These discoveries may lead to costly follow-up experiments and as such it is important that the error rate on discoveries is not too high. Model-X knockoffs enable important features to be discovered with control of the FDR. However, knockoffs require rich generative models capable of accurately modeling the knockoff features while ensuring they obey the so-called “swap” property. We develop Deep Direct Likelihood Knockoffs (DDLK), which directly minimizes the KL divergence implied by the knockoff swap property. DDLK consists of two stages: it first maximizes the explicit likelihood of the features, then minimizes the KL divergence between the joint distribution of features and knockoffs and any swap between them. To ensure that the generated knockoffs are valid under any possible swap, DDLK uses the Gumbel-Softmax trick to optimize the knockoff generator under the worst-case swap. We find DDLK has higher power than baselines while controlling the false discovery rate on a variety of synthetic and real benchmarks including a task involving a large dataset from one of the epicenters of COVID-19.
Code: official code implementation is available here.
Top Research Papers From 2020
To be prepared for NeurIPS, you should be aware of the major research papers published in the last year in popular topics such as computer vision, NLP, and general machine learning approaches, even if they are not being presented at this specific event.
We’ve shortlisted top research papers in these areas so you can review them quickly:
- Novel Computer Vision Research Papers From 2020
- GPT-3 & Beyond: 10 NLP Research Papers You Should Read
- 2020’s Top AI & Machine Learning Research Papers
Enjoy this article? Sign up for more AI research updates.
We’ll let you know when we release more summary articles like this one.