1 code implementation • ICML 2020 • Sai Krishna Gottipati, Boris Sattarov, Sufeng Niu, Hao-Ran Wei, Yashaswi Pathak, Shengchao Liu, Simon Blackburn, Karam Thomas, Connor Coley, Jian Tang, Sarath Chandar, Yoshua Bengio
In this work, we propose a novel reinforcement learning (RL) setup for drug discovery that addresses this challenge by embedding the concept of synthetic accessibility directly into the de novo compound design system.
no code implementations • EACL (LTEDI) 2021 • Olawale Onabola, Zhuang Ma, Xie Yang, Benjamin Akera, Ibraheem Abdulrahman, Jia Xue, Dianbo Liu, Yoshua Bengio
In this work, we present hBERT, where we modify certain layers of the pretrained BERT model with the new Hopfield Layer.
no code implementations • 10 May 2024 • David "davidad" Dalrymple, Joar Skalse, Yoshua Bengio, Stuart Russell, Max Tegmark, Sanjit Seshia, Steve Omohundro, Christian Szegedy, Ben Goldhaber, Nora Ammann, Alessandro Abate, Joe Halpern, Clark Barrett, Ding Zhao, Tan Zhi-Xuan, Jeannette Wing, Joshua Tenenbaum
Ensuring that AI systems reliably and robustly avoid harmful or dangerous behaviours is a crucial challenge, especially for AI systems with a high degree of autonomy and general intelligence, or systems used in safety-critical contexts.
no code implementations • 2 May 2024 • Maksym Korablyov, Cheng-Hao Liu, Moksh Jain, Almer M. van der Sloot, Eric Jolicoeur, Edward Ruediger, Andrei Cristian Nica, Emmanuel Bengio, Kostiantyn Lapchevskyi, Daniel St-Cyr, Doris Alexandra Schuetz, Victor Ion Butoi, Jarrid Rector-Brooks, Simon Blackburn, Leo Feng, Hadi Nekoei, SaiKrishna Gottipati, Priyesh Vijayan, Prateek Gupta, Ladislav Rampášek, Sasikanth Avancha, Pierre-Luc Bacon, William L. Hamilton, Brooks Paige, Sanchit Misra, Stanislaw Kamil Jastrzebski, Bharat Kaul, Doina Precup, José Miguel Hernández-Lobato, Marwin Segler, Michael Bronstein, Anne Marinier, Mike Tyers, Yoshua Bengio
Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge.
no code implementations • 15 Apr 2024 • Michał Koziarski, Mohammed Abukalam, Vedant Shah, Louis Vaillancourt, Doris Alexandra Schuetz, Moksh Jain, Almer van der Sloot, Mathieu Bourgey, Anne Marinier, Yoshua Bengio
DNA-encoded libraries (DELs) are a powerful approach for rapidly screening large numbers of diverse compounds.
1 code implementation • 15 Apr 2024 • Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs).
no code implementations • 21 Mar 2024 • Nasim Rahaman, Martin Weiss, Manuel Wüthrich, Yoshua Bengio, Li Erran Li, Chris Pal, Bernhard Schölkopf
This work addresses the buyer's inspection paradox for information markets.
2 code implementations • 11 Mar 2024 • Minsu Kim, Sanghyeok Choi, Jiwoo Son, Hyeonah Kim, Jinkyoo Park, Yoshua Bengio
This paper introduces the Generative Flow Ant Colony Sampler (GFACS), a novel neural-guided meta-heuristic algorithm for combinatorial optimization.
no code implementations • 7 Mar 2024 • Yoshua Bengio, Nikolay Malkin
The current state-of-the-art in artificial intelligence is impressive, especially in terms of mastery of language, but not so much in terms of mathematical reasoning.
1 code implementation • 15 Feb 2024 • Tristan Deleu, Padideh Nouri, Nikolay Malkin, Doina Precup, Yoshua Bengio
We consider the problem of sampling from a discrete and structured distribution as a sequential decision problem, where the objective is to find a stochastic policy such that objects are sampled at the end of this sequential process proportionally to some predefined reward.
1 code implementation • 9 Feb 2024 • Tara Akhound-Sadegh, Jarrid Rector-Brooks, Avishek Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong
Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science.
1 code implementation • 7 Feb 2024 • Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin
We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function.
1 code implementation • 2 Feb 2024 • Thomas Jiralerspong, Xiaoyin Chen, Yash More, Vedant Shah, Yoshua Bengio
We propose a novel framework that leverages LLMs for full causal graph discovery.
1 code implementation • 12 Dec 2023 • Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein
In these graphs, the geometric attributes transform according to the inherent physical symmetries of 3D atomic systems, including rotations and translations in Euclidean space, as well as node permutations.
1 code implementation • 6 Dec 2023 • Pablo Lemos, Nikolay Malkin, Will Handley, Yoshua Bengio, Yashar Hezaveh, Laurence Perreault-Levasseur
We present a performant, general-purpose gradient-guided nested sampling algorithm, ${\tt GGNS}$, combining the state of the art in differentiable programming, Hamiltonian slice sampling, clustering, mode separation, dynamic nested sampling, and parallelization.
no code implementations • 26 Nov 2023 • Vedant Shah, Frederik Träuble, Ashish Malik, Hugo Larochelle, Michael Mozer, Sanjeev Arora, Yoshua Bengio, Anirudh Goyal
Machine "unlearning", which involves erasing knowledge about a "forget set" from a trained model, can be costly or infeasible with existing techniques.
no code implementations • 23 Nov 2023 • Luca Scimeca, Alexander Rubinstein, Damien Teney, Seong Joon Oh, Armand Mihai Nicolicioiu, Yoshua Bengio
Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to a phenomenon known as shortcut learning, where a model relies on erroneous, easy-to-learn cues while ignoring reliable ones.
1 code implementation • 2 Nov 2023 • Mélisande Teng, Amna Elmustafa, Benjamin Akera, Yoshua Bengio, Hager Radi Abdelwahed, Hugo Larochelle, David Rolnick
The wide availability of remote sensing data and the growing adoption of citizen science tools to collect species observations data at low cost offer an opportunity for improving biodiversity monitoring and enabling the modelling of complex ecosystems.
1 code implementation • 29 Oct 2023 • Amin Mansouri, Jason Hartford, Yan Zhang, Yoshua Bengio
Causal representation learning has identified a variety of settings in which latent variables can be disentangled with identifiability guarantees (up to some reasonable equivalence class).
no code implementations • 28 Oct 2023 • Rim Assouel, Pau Rodriguez, Perouz Taslakian, David Vazquez, Yoshua Bengio
A key aspect of human intelligence is the ability to imagine -- composing learned concepts in novel ways -- to make sense of new scenarios.
no code implementations • 26 Oct 2023 • Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila Mcilraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner, Sören Mindermann
In this short consensus paper, we outline risks from upcoming, advanced AI systems.
no code implementations • 23 Oct 2023 • Alejandro Tejada-Lapuerta, Paul Bertin, Stefan Bauer, Hananeh Aliee, Yoshua Bengio, Fabian J. Theis
Advances in single-cell omics allow for unprecedented insights into the transcription profiles of individual cells.
no code implementations • 20 Oct 2023 • Alexandra Volokhova, Michał Koziarski, Alex Hernández-García, Cheng-Hao Liu, Santiago Miret, Pablo Lemos, Luca Thiede, Zichao Yan, Alán Aspuru-Guzik, Yoshua Bengio
Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule.
no code implementations • 12 Oct 2023 • Charles C. Onu, Samantha Latremouille, Arsenii Gorin, Junhao Wang, Innocent Udeogu, Uchenna Ekwochi, Peter O. Ubuane, Omolara A. Kehinde, Muhammad A. Salisu, Datonye Briggs, Yoshua Bengio, Doina Precup
Since the 1960s, neonatal clinicians have known that newborns suffering from certain neurological conditions exhibit altered crying patterns such as the high-pitched cry in birth asphyxia.
1 code implementation • 12 Oct 2023 • Mingyang Zhou, Zichao Yan, Elliot Layne, Nikolay Malkin, Dinghuai Zhang, Moksh Jain, Mathieu Blanchette, Yoshua Bengio
Phylogenetics is a branch of computational biology that studies the evolutionary relationships among biological entities.
no code implementations • 10 Oct 2023 • Alvaro Carbonero, Alexandre Duval, Victor Schmidt, Santiago Miret, Alex Hernandez-Garcia, Yoshua Bengio, David Rolnick
The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms.
1 code implementation • 7 Oct 2023 • Mila AI4Science, Alex Hernandez-Garcia, Alexandre Duval, Alexandra Volokhova, Yoshua Bengio, Divya Sharma, Pierre Luc Carrier, Yasmine Benabed, Michał Koziarski, Victor Schmidt
Accelerating material discovery holds the potential to greatly help mitigate the climate crisis.
1 code implementation • 6 Oct 2023 • Edward J. Hu, Moksh Jain, Eric Elmoznino, Younesse Kaddar, Guillaume Lajoie, Yoshua Bengio, Nikolay Malkin
Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions.
no code implementations • 5 Oct 2023 • Ling Pan, Moksh Jain, Kanika Madan, Yoshua Bengio
However, as they are typically trained from a given extrinsic reward function, how to leverage the power of pre-training and train GFlowNets in an unsupervised fashion for efficient adaptation to downstream tasks remains an important open challenge.
no code implementations • 5 Oct 2023 • Trang Nguyen, Alexander Tong, Kanika Madan, Yoshua Bengio, Dianbo Liu
Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular processes.
1 code implementation • 4 Oct 2023 • Minsu Kim, Joohwan Ko, Taeyoung Yun, Dinghuai Zhang, Ling Pan, Woochang Kim, Jinkyoo Park, Emmanuel Bengio, Yoshua Bengio
We find that the challenge is greatly reduced if a learned function of the temperature is used to scale the policy's logits directly.
2 code implementations • 4 Oct 2023 • Minsu Kim, Taeyoung Yun, Emmanuel Bengio, Dinghuai Zhang, Yoshua Bengio, Sungsoo Ahn, Jinkyoo Park
Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards.
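The reward-proportional target that this line (and several other GFlowNet entries in this list) describes can be made concrete on a toy problem. The sketch below uses an invented environment, not anything from the paper: objects are 3-bit strings built one bit at a time, and the reward is the count of ones plus one. Computing exact "flows" over prefixes shows the fixed point a trained GFlowNet policy approximates, namely sampling each object with probability R(x)/Z.

```python
from itertools import product

# Hypothetical toy environment (not from the paper): 3-bit strings
# built left to right, reward = number of ones + 1.
def reward(bits):
    return sum(bits) + 1

states = list(product([0, 1], repeat=3))
Z = sum(reward(s) for s in states)            # partition function
target = {s: reward(s) / Z for s in states}   # p(x) = R(x) / Z

# Exact flow through a prefix: total reward reachable from it.
def flow(prefix):
    k = 3 - len(prefix)
    return sum(reward(prefix + rest) for rest in product([0, 1], repeat=k))

# Forward policy P(bit | prefix) = flow(prefix + (bit,)) / flow(prefix).
# Multiplying these conditionals along a trajectory recovers the
# reward-proportional target exactly.
def prob(bits):
    p, prefix = 1.0, ()
    for b in bits:
        p *= flow(prefix + (b,)) / flow(prefix)
        prefix += (b,)
    return p
```

On this toy problem `prob(s)` matches `target[s]` for every string; GFlowNet training objectives (flow matching, trajectory balance, etc.) learn a parametric policy whose conditionals approximate these exact ratios when enumeration is intractable.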
1 code implementation • 4 Oct 2023 • Marco Jiralerspong, Bilun Sun, Danilo Vucetic, Tianyu Zhang, Yoshua Bengio, Gauthier Gidel, Nikolay Malkin
Generative flow networks (GFlowNets) are sequential sampling models trained to match a given distribution.
2 code implementations • 4 Oct 2023 • Dinghuai Zhang, Ricky T. Q. Chen, Cheng-Hao Liu, Aaron Courville, Yoshua Bengio
We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine learning and statistics.
no code implementations • 3 Oct 2023 • Andrew Nam, Eric Elmoznino, Nikolay Malkin, Chen Sun, Yoshua Bengio, Guillaume Lajoie
Compositionality is an important feature of discrete symbolic systems, such as language and programs, as it enables them to have infinite capacity despite a finite symbol set.
1 code implementation • 3 Oct 2023 • Jean-Pierre Falet, Hae Beom Lee, Nikolay Malkin, Chen Sun, Dragos Secrieru, Thomas Jiralerspong, Dinghuai Zhang, Guillaume Lajoie, Yoshua Bengio
We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call $\Delta$-amortized inference ($\Delta$-AI).
no code implementations • 3 Oct 2023 • Luca Scimeca, Alexander Rubinstein, Armand Mihai Nicolicioiu, Damien Teney, Yoshua Bengio
Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to shortcut learning phenomena, where a model may rely on erroneous, easy-to-learn, cues while ignoring reliable ones.
1 code implementation • 30 Sep 2023 • Mingde Zhao, Safa Alver, Harm van Seijen, Romain Laroche, Doina Precup, Yoshua Bengio
Inspired by human conscious planning, we propose Skipper, a model-based reinforcement learning framework utilizing spatio-temporal abstractions to generalize better in novel situations.
1 code implementation • 29 Sep 2023 • Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed
In this work, we propose Tree Cross Attention (TCA) - a module based on Cross Attention that only retrieves information from a logarithmic $\mathcal{O}(\log(N))$ number of tokens for performing inference.
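A back-of-the-envelope illustration of the complexity claim: organizing the N tokens in a balanced binary tree means one root-to-leaf retrieval path touches only about log2(N) levels. The function below is not TCA itself, just the depth count behind the $\mathcal{O}(\log(N))$ figure.

```python
def tree_path_length(n_tokens):
    # Depth of a balanced binary tree with n_tokens leaves: a single
    # root-to-leaf retrieval path visits this many levels, which is why
    # tree-organized token retrieval costs O(log N) instead of the
    # O(N) of attending to every token.
    depth = 0
    while n_tokens > 1:
        n_tokens = (n_tokens + 1) // 2  # halve (rounding up) per level
        depth += 1
    return depth
```

For a million tokens the path is only 20 levels deep, versus a million token comparisons for full cross attention.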
no code implementations • 17 Aug 2023 • Patrick Butlin, Robert Long, Eric Elmoznino, Yoshua Bengio, Jonathan Birch, Axel Constant, George Deane, Stephen M. Fleming, Chris Frith, Xu Ji, Ryota Kanai, Colin Klein, Grace Lindsay, Matthias Michel, Liad Mudrik, Megan A. K. Peters, Eric Schwitzgebel, Jonathan Simon, Rufin VanRullen
From these theories we derive "indicator properties" of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties.
no code implementations • 11 Jul 2023 • Chris Chinenye Emezue, Alexandre Drouin, Tristan Deleu, Stefan Bauer, Yoshua Bengio
Nevertheless, a notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference.
no code implementations • 10 Jul 2023 • Yoshua Bengio, Prateek Gupta, Lu Li, Soham Phade, Sunil Srinivasa, Andrew Williams, Tianyu Zhang, Yang Zhang, Stephan Zheng
On the other hand, an interdisciplinary panel of human experts in law, policy, sociology, economics and environmental science, evaluated the solutions qualitatively.
1 code implementation • 7 Jul 2023 • Alexander Tong, Nikolay Malkin, Kilian Fatras, Lazar Atanackovic, Yanlei Zhang, Guillaume Huguet, Guy Wolf, Yoshua Bengio
We present simulation-free score and flow matching ([SF]$^2$M), a simulation-free objective for inferring stochastic dynamics given unpaired samples drawn from arbitrary source and target distributions.
no code implementations • 4 Jul 2023 • Tristan Deleu, Yoshua Bengio
While Markov chain Monte Carlo methods (MCMC) provide a general framework to sample from a probability distribution defined up to normalization, they often suffer from slow convergence to the target distribution when the latter is highly multi-modal.
no code implementations • 30 Jun 2023 • Jarrid Rector-Brooks, Kanika Madan, Moksh Jain, Maksym Korablyov, Cheng-Hao Liu, Sarath Chandar, Nikolay Malkin, Yoshua Bengio
Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over compositional objects as a sequential decision-making problem with a learnable action policy.
2 code implementations • NeurIPS 2023 • Eric Nguyen, Michael Poli, Marjan Faizi, Armin Thomas, Callum Birch-Sykes, Michael Wornow, Aman Patel, Clayton Rabideau, Stefano Massaroli, Yoshua Bengio, Stefano Ermon, Stephen A. Baccus, Chris Ré
Leveraging Hyena's new long-range capabilities, we present HyenaDNA, a genomic foundation model pretrained on the human reference genome with context lengths of up to 1 million tokens at the single-nucleotide level, an up to 500x increase over previous dense attention-based models.
1 code implementation • 26 Jun 2023 • Shreshth A. Malik, Salem Lahlou, Andrew Jesson, Moksh Jain, Nikolay Malkin, Tristan Deleu, Yoshua Bengio, Yarin Gal
We introduce BatchGFN -- a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward.
no code implementations • 21 Jun 2023 • Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed
Modern foundation model architectures rely on attention mechanisms to effectively capture context.
2 code implementations • 20 Jun 2023 • Alex Hernandez-Garcia, Nikita Saxena, Moksh Jain, Cheng-Hao Liu, Yoshua Bengio
For example, in scientific discovery, we are often faced with the problem of exploring very large, high-dimensional spaces, where querying a high fidelity, black-box objective function is very expensive.
1 code implementation • NeurIPS 2023 • Alexandre Lacoste, Nils Lehmann, Pau Rodriguez, Evan David Sherwin, Hannah Kerner, Björn Lütjens, Jeremy Andrew Irvin, David Dao, Hamed Alemohammad, Alexandre Drouin, Mehmet Gunturkun, Gabriel Huang, David Vazquez, Dava Newman, Yoshua Bengio, Stefano Ermon, Xiao Xiang Zhu
Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to substantial increases in generalization to downstream tasks.
no code implementations • 3 Jun 2023 • Aniket Didolkar, Anirudh Goyal, Yoshua Bengio
To tackle the second limitation, we apply the learned object-centric representations from the proposed method to two downstream reinforcement learning tasks, demonstrating considerable performance enhancements compared to conventional slot-based and monolithic representation learning methods.
1 code implementation • 1 Jun 2023 • Oussama Boussif, Ghait Boukachab, Dan Assouline, Stefano Massaroli, Tianle Yuan, Loubna Benabbou, Yoshua Bengio
Solar power harbors immense potential in mitigating climate change by substantially reducing CO$_{2}$ emissions.
no code implementations • 31 May 2023 • Ayush Chakravarthy, Trang Nguyen, Anirudh Goyal, Yoshua Bengio, Michael C. Mozer
The aim of object-centric vision is to construct an explicit representation of the objects in a scene.
no code implementations • 27 May 2023 • Dianbo Liu, Samuele Bolotta, He Zhu, Yoshua Bengio, Guillaume Dumas
A strong prediction of this theory is that an agent can use its own AS to also infer the states of other agents' attention and consequently enhance coordination with other agents.
1 code implementation • 26 May 2023 • Dinghuai Zhang, Hanjun Dai, Nikolay Malkin, Aaron Courville, Yoshua Bengio, Ling Pan
In this paper, we design Markov decision processes (MDPs) for different combinatorial problems and propose to train conditional GFlowNets to sample from the solution space.
no code implementations • 24 May 2023 • Toby Shevlane, Sebastian Farquhar, Ben Garfinkel, Mary Phuong, Jess Whittlestone, Jade Leung, Daniel Kokotajlo, Nahema Marchal, Markus Anderljung, Noam Kolt, Lewis Ho, Divya Siddarth, Shahar Avin, Will Hawkins, Been Kim, Iason Gabriel, Vijay Bolina, Jack Clark, Yoshua Bengio, Paul Christiano, Allan Dafoe
Current approaches to building general-purpose AI systems tend to produce systems with both beneficial and harmful capabilities.
2 code implementations • 24 May 2023 • Salem Lahlou, Joseph D. Viviano, Victor Schmidt, Yoshua Bengio
The growing popularity of generative flow networks (GFlowNets or GFNs) from a range of researchers with diverse backgrounds and areas of expertise necessitates a library which facilitates the testing of new features such as training losses that can be easily compared to standard benchmark implementations, or on a set of common environments.
no code implementations • 23 May 2023 • Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed
Neural Processes (NPs) are popular meta-learning methods for efficiently modelling predictive uncertainty.
1 code implementation • 28 Apr 2023 • Alexandre Duval, Victor Schmidt, Alex Hernandez Garcia, Santiago Miret, Fragkiskos D. Malliaros, Yoshua Bengio, David Rolnick
Applications of machine learning techniques for materials modeling typically involve functions known to be equivariant or invariant to specific symmetries.
6 code implementations • 21 Feb 2023 • Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, Christopher Ré
Recent advances in deep learning have relied heavily on the use of large Transformers due to their ability to learn at scale.
1 code implementation • 19 Feb 2023 • Ling Pan, Dinghuai Zhang, Moksh Jain, Longbo Huang, Yoshua Bengio
Generative Flow Networks (or GFlowNets for short) are a family of probabilistic agents that learn to sample complex combinatorial structures through the lens of "inference as control".
1 code implementation • 13 Feb 2023 • Edward J. Hu, Nikolay Malkin, Moksh Jain, Katie Everett, Alexandros Graikos, Yoshua Bengio
Latent variable models (LVMs) with discrete compositional latents are an important but challenging setting due to a combinatorially large number of possible configurations of the latents.
no code implementations • 13 Feb 2023 • Xu Ji, Eric Elmoznino, George Deane, Axel Constant, Guillaume Dumas, Guillaume Lajoie, Jonathan Simon, Yoshua Bengio
Conscious states (states that there is something it is like to be in) seem both rich or full of detail, and ineffable or hard to fully describe or recall.
1 code implementation • 11 Feb 2023 • Dinghuai Zhang, Ling Pan, Ricky T. Q. Chen, Aaron Courville, Yoshua Bengio
Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a stochastic policy for generating complex combinatorial structure through a series of decision-making steps.
1 code implementation • NeurIPS 2023 • Lazar Atanackovic, Alexander Tong, Bo wang, Leo J. Lee, Yoshua Bengio, Jason Hartford
In this paper we leverage the fact that it is possible to estimate the "velocity" of gene expression with RNA velocity techniques to develop an approach that addresses both challenges.
2 code implementations • 3 Feb 2023 • Ling Pan, Nikolay Malkin, Dinghuai Zhang, Yoshua Bengio
Generative Flow Networks or GFlowNets are related to Monte-Carlo Markov chain methods (as they sample from a distribution specified by an energy function), reinforcement learning (as they learn a policy to sample composed objects through a sequence of steps), generative models (as they learn to represent and sample from a distribution) and amortized variational methods (as they can be used to learn to approximate and sample from an otherwise intractable posterior, given a prior and a likelihood).
2 code implementations • 1 Feb 2023 • Alexander Tong, Kilian Fatras, Nikolay Malkin, Guillaume Huguet, Yanlei Zhang, Jarrid Rector-Brooks, Guy Wolf, Yoshua Bengio
CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models.
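The "stable regression objective" can be sketched concretely for the simplest case of a straight-line probability path between paired points (a minimal assumption on my part; the paper covers more general conditional paths): the network regresses onto the constant velocity of the interpolation.

```python
def cfm_pair(x0, x1, t):
    # Conditional flow matching sketch for the linear path
    # x_t = (1 - t) * x0 + t * x1, whose target velocity is
    # u_t = x1 - x0. A model v_theta(x_t, t) is trained by
    # regression onto u_t at randomly sampled t.
    xt = [(1 - t) * a + t * b for a, b in zip(x0, x1)]
    ut = [b - a for a, b in zip(x0, x1)]
    return xt, ut
```

Training then minimizes the squared error between the model's predicted velocity at `(xt, t)` and `ut`, a plain regression, which is what makes the objective stable compared to simulation-based flow training.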
no code implementations • 1 Feb 2023 • Moksh Jain, Tristan Deleu, Jason Hartford, Cheng-Hao Liu, Alex Hernandez-Garcia, Yoshua Bengio
However, in order to truly leverage large-scale data sets and high-throughput experimental setups, machine learning methods will need to be further improved and better integrated in the scientific discovery pipeline.
1 code implementation • 30 Jan 2023 • Salem Lahlou, Tristan Deleu, Pablo Lemos, Dinghuai Zhang, Alexandra Volokhova, Alex Hernández-García, Léna Néhale Ezzine, Yoshua Bengio, Nikolay Malkin
Generative flow networks (GFlowNets) are amortized variational inference algorithms that are trained to sample from unnormalized target distributions over compositional objects.
no code implementations • 27 Jan 2023 • Sumukh Aithal, Anirudh Goyal, Alex Lamb, Yoshua Bengio, Michael Mozer
We evaluate these two approaches on three different SSL methods -- BYOL, SimSiam, and SwAV -- using ImageNette (a 10-class subset of ImageNet), ImageNet-100, and ImageNet-1k datasets.
no code implementations • 21 Jan 2023 • Xu Tan, Tao Qin, Jiang Bian, Tie-Yan Liu, Yoshua Bengio
Regeneration learning extends the concept of representation learning to data generation tasks, and can be regarded as a counterpart of traditional representation learning, since 1) regeneration learning handles the abstraction (Y') of the target data Y for data generation while traditional representation learning handles the abstraction (X') of source data X for data understanding; 2) both the processes of Y'-->Y in regeneration learning and X-->X' in representation learning can be learned in a self-supervised way (e.g., pre-training); 3) both the mappings from X to Y' in regeneration learning and from X' to Y in representation learning are simpler than the direct mapping from X to Y.
1 code implementation • 27 Dec 2022 • Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham, Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi
Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels.
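The interpolation described here is simple enough to state in a few lines. A minimal sketch (function name and toy vectors are my own; real pipelines apply this to whole tensor batches):

```python
import random

def mixup(x1, y1, x2, y2, lam=None, alpha=0.2):
    # Mixup: blend two examples and their (one-hot) labels with a
    # coefficient lam drawn from Beta(alpha, alpha) when not given.
    if lam is None:
        lam = random.betavariate(alpha, alpha)
    blend = lambda a, b: [lam * u + (1 - lam) * v for u, v in zip(a, b)]
    return blend(x1, x2), blend(y1, y2)
```

With `lam = 0.25`, mixing inputs `[0, 4]` and `[4, 0]` gives `[3.0, 1.0]`, and one-hot labels `[1, 0]` and `[0, 1]` give the soft label `[0.25, 0.75]`.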
1 code implementation • 26 Nov 2022 • Sébastien Lachapelle, Tristan Deleu, Divyat Mahajan, Ioannis Mitliagkas, Yoshua Bengio, Simon Lacoste-Julien, Quentin Bertrand
Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding is limited.
2 code implementations • 22 Nov 2022 • Alexandre Duval, Victor Schmidt, Santiago Miret, Yoshua Bengio, Alex Hernández-García, David Rolnick
Catalyst materials play a crucial role in the electrochemical reactions involved in numerous industrial processes key to this transition, such as renewable energy storage and electrofuel synthesis.
1 code implementation • 15 Nov 2022 • Leo Feng, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed
We demonstrate that LBANPs can trade-off the computational cost and performance according to the number of latent vectors.
no code implementations • 11 Nov 2022 • Sékou-Oumar Kaba, Arnab Kumar Mondal, Yan Zhang, Yoshua Bengio, Siamak Ravanbakhsh
Symmetry-based neural networks often constrain the architecture in order to achieve invariance or equivariance to a group of transformations.
no code implementations • 7 Nov 2022 • Alexandre Adam, Adam Coogan, Nikolay Malkin, Ronan Legin, Laurence Perreault-Levasseur, Yashar Hezaveh, Yoshua Bengio
Inferring accurate posteriors for high-dimensional representations of the brightness of gravitationally-lensed sources is a major challenge, in part due to the difficulties of accurately quantifying the priors.
no code implementations • 4 Nov 2022 • Nasim Rahaman, Martin Weiss, Frederik Träuble, Francesco Locatello, Alexandre Lacoste, Yoshua Bengio, Chris Pal, Li Erran Li, Bernhard Schölkopf
Geospatial Information Systems are used by researchers and Humanitarian Assistance and Disaster Response (HADR) practitioners to support a wide variety of important applications.
no code implementations • 4 Nov 2022 • Mizu Nishikawa-Toomey, Tristan Deleu, Jithendaraa Subramanian, Yoshua Bengio, Laurent Charlin
We extend the method of Bayesian causal structure learning using GFlowNets to learn not only the posterior distribution over the structure, but also the parameters of a linear-Gaussian model.
no code implementations • 1 Nov 2022 • Riashat Islam, Hongyu Zang, Anirudh Goyal, Alex Lamb, Kenji Kawaguchi, Xin Li, Romain Laroche, Yoshua Bengio, Remi Tachet des Combes
Goal-conditioned reinforcement learning (RL) is a promising direction for training agents that are capable of solving multiple tasks and reach a diverse set of objectives.
no code implementations • 1 Nov 2022 • Chanakya Ekbote, Moksh Jain, Payel Das, Yoshua Bengio
We hypothesize that this can lead to incompatibility between the inductive optimization biases in training $R$ and in training the GFlowNet, potentially leading to worse samples and slow adaptation to changes in the distribution.
no code implementations • 31 Oct 2022 • Sharut Gupta, Kartik Ahuja, Mohammad Havaei, Niladri Chatterjee, Yoshua Bengio
Federated learning aims to train predictive models for data that is distributed across clients, under the orchestration of a server.
no code implementations • 24 Oct 2022 • Dianbo Liu, Moksh Jain, Bonaventure Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio
These methods face two important challenges: (a) the posterior distribution over masks can be highly multi-modal which can be difficult to approximate with standard variational inference and (b) it is not trivial to fully utilize sample-dependent information and correlation among dropout masks to improve posterior estimation.
1 code implementation • 23 Oct 2022 • Moksh Jain, Sharath Chandra Raparthy, Alex Hernandez-Garcia, Jarrid Rector-Brooks, Yoshua Bengio, Santiago Miret, Emmanuel Bengio
We study the problem of generating diverse candidates in the context of Multi-Objective Optimization.
no code implementations • 15 Oct 2022 • Anthony Zador, Sean Escola, Blake Richards, Bence Ölveczky, Yoshua Bengio, Kwabena Boahen, Matthew Botvinick, Dmitri Chklovskii, Anne Churchland, Claudia Clopath, James DiCarlo, Surya Ganguli, Jeff Hawkins, Konrad Koerding, Alexei Koulakov, Yann Lecun, Timothy Lillicrap, Adam Marblestone, Bruno Olshausen, Alexandre Pouget, Cristina Savin, Terrence Sejnowski, Eero Simoncelli, Sara Solla, David Sussillo, Andreas S. Tolias, Doris Tsao
Neuroscience has long been an essential driver of progress in artificial intelligence (AI).
no code implementations • 14 Oct 2022 • Nasim Rahaman, Martin Weiss, Francesco Locatello, Chris Pal, Yoshua Bengio, Bernhard Schölkopf, Li Erran Li, Nicolas Ballas
Recent work has seen the development of general purpose neural architectures that can be trained to perform tasks across diverse data modalities.
1 code implementation • NeurIPS 2023 • Chen Sun, Wannan Yang, Thomas Jiralerspong, Dane Malenfant, Benjamin Alsbury-Nealy, Yoshua Bengio, Blake Richards
Distinct from other contemporary RL approaches to credit assignment, ConSpec takes advantage of the fact that it is easier to retrospectively identify the small set of steps that success is contingent upon (ignoring other states) than it is to prospectively predict reward at every step taken.
1 code implementation • 11 Oct 2022 • Oussama Boussif, Dan Assouline, Loubna Benabbou, Yoshua Bengio
The computational complexity of classical numerical methods for solving Partial Differential Equations (PDE) scales significantly as the resolution increases.
no code implementations • 11 Oct 2022 • Ruixiang Zhang, Tong Che, Boris Ivanovic, Renhao Wang, Marco Pavone, Yoshua Bengio, Liam Paull
Humans are remarkably good at understanding and reasoning about complex visual scenes.
no code implementations • 7 Oct 2022 • Ling Pan, Dinghuai Zhang, Aaron Courville, Longbo Huang, Yoshua Bengio
We specify intermediate rewards by intrinsic motivation to tackle the exploration problem in sparse reward environments.
2 code implementations • 4 Oct 2022 • Dianbo Liu, Vedant Shah, Oussama Boussif, Cristian Meo, Anirudh Goyal, Tianmin Shu, Michael Mozer, Nicolas Heess, Yoshua Bengio
We formalize the notions of coordination level and heterogeneity level of an environment and present HECOGrid, a suite of multi-agent RL environments that facilitates empirical evaluation of different MARL approaches across different levels of coordination and environmental heterogeneity by providing a quantitative control over coordination and heterogeneity levels of the environment.
1 code implementation • 3 Oct 2022 • Dinghuai Zhang, Aaron Courville, Yoshua Bengio, Qinqing Zheng, Amy Zhang, Ricky T. Q. Chen
While the maximum entropy (MaxEnt) reinforcement learning (RL) framework -- often touted for its exploration and robustness capabilities -- is usually motivated from a probabilistic perspective, the use of deep probabilistic models has not gained much traction in practice due to their inherent complexity.
1 code implementation • 2 Oct 2022 • Nikolay Malkin, Salem Lahlou, Tristan Deleu, Xu Ji, Edward Hu, Katie Everett, Dinghuai Zhang, Yoshua Bengio
This paper builds bridges between two families of probabilistic algorithms: (hierarchical) variational inference (VI), which is typically used to model distributions over continuous spaces, and generative flow networks (GFlowNets), which have been used for distributions over discrete structures such as graphs.
1 code implementation • 1 Oct 2022 • Jiaye Teng, Chuan Wen, Dinghuai Zhang, Yoshua Bengio, Yang Gao, Yang Yuan
Conformal prediction is a distribution-free technique for establishing valid prediction intervals.
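The split conformal recipe behind such intervals is simple enough to sketch. The following is a generic illustration of split conformal prediction for regression, not the specific method of the paper above; all names are illustrative.

```python
import numpy as np

def conformal_interval(residuals, y_pred, alpha=0.1):
    # Split conformal prediction for regression: `residuals` are absolute
    # errors |y - yhat| on a held-out calibration set. Under exchangeability,
    # the returned interval covers the true value with probability >= 1 - alpha.
    scores = np.sort(np.asarray(residuals))
    n = scores.size
    # 1-based rank of the conformal quantile: ceil((n + 1)(1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = scores[min(k, n) - 1]
    return y_pred - q, y_pred + q

# Hypothetical usage with residuals from any fitted regressor.
rng = np.random.default_rng(0)
cal_residuals = np.abs(rng.normal(0.0, 1.0, size=200))
lo, hi = conformal_interval(cal_residuals, y_pred=3.0, alpha=0.1)
```

The guarantee is distribution-free: it relies only on the calibration scores being exchangeable with the test point, not on the model being well specified.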
3 code implementations • 26 Sep 2022 • Kanika Madan, Jarrid Rector-Brooks, Maksym Korablyov, Emmanuel Bengio, Moksh Jain, Andrei Nica, Tom Bosc, Yoshua Bengio, Nikolay Malkin
Generative flow networks (GFlowNets) are a family of algorithms for training a sequential sampler of discrete objects under an unnormalized target density and have been successfully used for various probabilistic modeling tasks.
1 code implementation • 24 Sep 2022 • Kartik Ahuja, Divyat Mahajan, Yixin Wang, Yoshua Bengio
Can interventional data facilitate causal representation learning?
no code implementations • 18 Sep 2022 • Bonaventure F. P. Dossou, Dianbo Liu, Xu Ji, Moksh Jain, Almer M. van der Sloot, Roger Palou, Michael Tyers, Yoshua Bengio
As antibiotic-resistant bacterial strains spread rapidly worldwide, the infections they cause are emerging as a global crisis, causing the deaths of millions of people every year.
no code implementations • 13 Sep 2022 • Leo Feng, Padideh Nouri, Aneri Muni, Yoshua Bengio, Pierre-Luc Bacon
The problem can be framed as global optimization of an expensive black-box function, where we can query large batches of candidates but are restricted to a small number of rounds.
no code implementations • 6 Sep 2022 • Dinghuai Zhang, Ricky T. Q. Chen, Nikolay Malkin, Yoshua Bengio
Our framework provides a means of unifying training and inference algorithms, and sheds a unifying light on many generative models.
2 code implementations • 15 Aug 2022 • Tianyu Zhang, Andrew Williams, Soham Phade, Sunil Srinivasa, Yang Zhang, Prateek Gupta, Yoshua Bengio, Stephan Zheng
To facilitate this research, here we introduce RICE-N, a multi-region integrated assessment model that simulates the global climate and economy, and which can be used to design and evaluate the strategic outcomes for different negotiation and agreement frameworks.
no code implementations • 10 Aug 2022 • Siba Moussa, Michael Kilgour, Clara Jans, Alex Hernandez-Garcia, Miroslava Cuperlovic-Culf, Yoshua Bengio, Lena Simine
Inverse design of short single-stranded RNA and DNA sequences (aptamers) is the task of finding sequences that satisfy a set of desired criteria.
1 code implementation • 8 Aug 2022 • Jose Gallego-Posada, Juan Ramirez, Akram Erraqabi, Yoshua Bengio, Simon Lacoste-Julien
The performance of trained neural networks is robust to harsh levels of pruning.
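The observation quoted above is usually demonstrated with magnitude pruning, which the following sketches; this is a common baseline, not the constrained-optimization method of the paper itself, and all names are illustrative.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    # Zero out the smallest-magnitude fraction `sparsity` of entries,
    # globally across all given weight arrays. Ties at the threshold may
    # prune slightly more than the requested fraction.
    flat = np.concatenate([w.ravel() for w in weights])
    k = int(sparsity * flat.size)
    if k == 0:
        return [w.copy() for w in weights]
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(flat), k - 1)[k - 1]
    return [np.where(np.abs(w) <= threshold, 0.0, w) for w in weights]
```

In practice the pruned network is then fine-tuned briefly, which is where the surprising robustness to harsh sparsity levels shows up.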
1 code implementation • 22 Jul 2022 • Frederik Träuble, Anirudh Goyal, Nasim Rahaman, Michael Mozer, Kenji Kawaguchi, Yoshua Bengio, Bernhard Schölkopf
Deep neural networks perform well on classification tasks where data streams are i.i.d.
no code implementations • 30 Jun 2022 • Prateek Gupta, Elias B. Khalil, Didier Chetélat, Maxime Gasse, Yoshua Bengio, Andrea Lodi, M. Pawan Kumar
Given that B&B results in a tree of sub-MILPs, we ask (a) whether there are strong dependencies exhibited by the target heuristic among the neighboring nodes of the B&B tree, and (b) if so, whether we can incorporate them in our training procedure.
no code implementations • 26 Jun 2022 • Yezhen Wang, Tong Che, Bo Li, Kaitao Song, Hengzhi Pei, Yoshua Bengio, Dongsheng Li
Autoregressive generative models are commonly used, especially for those tasks involving sequential data.
1 code implementation • 9 Jun 2022 • Giancarlo Kerg, Sarthak Mittal, David Rolnick, Yoshua Bengio, Blake Richards, Guillaume Lajoie
Recent work has explored how forcing relational representations to remain distinct from sensory representations, as seems to be the case in the brain, can help artificial systems.
no code implementations • 9 Jun 2022 • Nino Scherrer, Anirudh Goyal, Stefan Bauer, Yoshua Bengio, Nan Rosemary Ke
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes and offer robust generalization.
1 code implementation • 7 Jun 2022 • Dinghuai Zhang, Hongyang Zhang, Aaron Courville, Yoshua Bengio, Pradeep Ravikumar, Arun Sai Suggala
Consequently, an emerging line of work has focused on learning an ensemble of neural networks to defend against adversarial attacks.
1 code implementation • 6 Jun 2022 • Sarthak Mittal, Yoshua Bengio, Guillaume Lajoie
Inspired by human cognition, machine learning systems are gradually revealing the advantages of sparser and more modular architectures.
1 code implementation • 2 Jun 2022 • Kartik Ahuja, Jason Hartford, Yoshua Bengio
We show that if the perturbations are applied only on mutually exclusive blocks of latents, we identify the latents up to those blocks.
no code implementations • 30 May 2022 • Benjamin Scellier, Siddhartha Mishra, Yoshua Bengio, Yann Ollivier
This work establishes that a physical system can perform statistical learning without gradient computations, via an Agnostic Equilibrium Propagation (Aeqprop) procedure that combines energy minimization, homeostatic control, and nudging towards the correct response.
2 code implementations • 30 May 2022 • Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh B. Gundavarapu, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio
A slow stream that is recurrent in nature aims to learn a specialized and compressed representation, by forcing chunks of $K$ time steps into a single representation which is divided into multiple vectors.
no code implementations • 23 May 2022 • Sharut Gupta, Kartik Ahuja, Mohammad Havaei, Niladri Chatterjee, Yoshua Bengio
Federated learning aims to train predictive models for data that is distributed across clients, under the orchestration of a server.
no code implementations • 21 May 2022 • Dianbo Liu, Vedant Shah, Oussama Boussif, Cristian Meo, Anirudh Goyal, Tianmin Shu, Michael Mozer, Nicolas Heess, Yoshua Bengio
In Multi-Agent Reinforcement Learning (MARL), specialized channels are often introduced that allow agents to communicate directly with one another.
1 code implementation • 19 May 2022 • Mike He Zhu, Léna Néhale Ezzine, Dianbo Liu, Yoshua Bengio
Federated learning is a distributed machine learning approach in which a shared server model learns by aggregating parameter updates computed locally on the training data held in spatially-distributed client silos.
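The server-side aggregation step can be sketched as a data-weighted parameter average (FedAvg-style; a generic sketch rather than this paper's specific scheme, with illustrative names):

```python
import numpy as np

def fedavg(client_params, client_sizes):
    # Aggregate each client's parameter list into a single server model,
    # weighting every client by its number of local training examples.
    total = sum(client_sizes)
    n_tensors = len(client_params[0])
    return [
        sum((n / total) * params[i] for params, n in zip(client_params, client_sizes))
        for i in range(n_tensors)
    ]

# Hypothetical usage: two clients, one weight tensor each.
client_a = [np.array([1.0, 1.0])]
client_b = [np.array([3.0, 5.0])]
server_model = fedavg([client_a, client_b], client_sizes=[1, 3])
```

Weighting by dataset size keeps the aggregate equivalent to training on the pooled data in the idealized one-step case; real systems add rounds of local training between aggregations.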
no code implementations • 6 May 2022 • Sanghyun Yoo, Inchul Song, Yoshua Bengio
In this paper, we propose a novel acoustic modeling technique for accurate multi-dialect speech recognition with a single AM.
no code implementations • 21 Mar 2022 • Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio
In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting, with applications ranging from skill discovery to reward shaping.
no code implementations • 3 Mar 2022 • Francois St-Hilaire, Dung Do Vu, Antoine Frau, Nathan Burns, Farid Faraji, Joseph Potochny, Stephane Robert, Arnaud Roussel, Selene Zheng, Taylor Glazier, Junfel Vincent Romano, Robert Belfer, Muhammad Shayan, Ariella Smofsky, Tommy Delarosbil, Seulmin Ahn, Simon Eden-Walker, Kritika Sony, Ansona Onyi Ching, Sabina Elkins, Anush Stepanyan, Adela Matajova, Victor Chen, Hossein Sahraei, Robert Larson, Nadia Markova, Andrew Barkett, Laurent Charlin, Yoshua Bengio, Iulian Vlad Serban, Ekaterina Kochmar
AI-powered learning can provide millions of learners with a highly personalized, active and practical learning experience, which is key to successful learning.
no code implementations • ICLR 2022 • Tristan Deleu, David Kanaa, Leo Feng, Giancarlo Kerg, Yoshua Bengio, Guillaume Lajoie, Pierre-Luc Bacon
Drawing inspiration from gradient-based meta-learning methods with infinitely small gradient steps, we introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.
1 code implementation • 2 Mar 2022 • Moksh Jain, Emmanuel Bengio, Alex Hernandez-Garcia, Jarrid Rector-Brooks, Bonaventure F. P. Dossou, Chanakya Ekbote, Jie Fu, Tianyu Zhang, Michael Kilgour, Dinghuai Zhang, Lena Simine, Payel Das, Yoshua Bengio
In this work, we propose an active learning algorithm leveraging epistemic uncertainty estimation and the recently proposed GFlowNets as a generator of diverse candidate solutions, with the objective to obtain a diverse batch of useful (as defined by some utility function, for example, the predicted anti-microbial activity of a peptide) and informative candidates after each round.
1 code implementation • 28 Feb 2022 • Tristan Deleu, António Góis, Chris Emezue, Mansi Rankawat, Simon Lacoste-Julien, Stefan Bauer, Yoshua Bengio
In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) structure of Bayesian networks, from data.
1 code implementation • 28 Feb 2022 • Edoardo M. Ponti, Alessandro Sordoni, Yoshua Bengio, Siva Reddy
By jointly learning these and a task-skill allocation matrix, the network for each task is instantiated as the average of the parameters of active skills.
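That instantiation step (a task network as the average of its active skills' parameters) can be sketched as follows; shapes and names are illustrative, not the paper's API.

```python
import numpy as np

def instantiate_task_params(skill_params, allocation_row):
    # skill_params: (K, D) array, one flattened parameter vector per skill.
    # allocation_row: (K,) binary row of the task-skill allocation matrix.
    # Returns the task's parameters: the mean over the active skills.
    active = np.flatnonzero(allocation_row)
    return skill_params[active].mean(axis=0)

# Hypothetical usage: a task that activates skills 0 and 1 out of 3.
skills = np.array([[0.0, 2.0], [4.0, 6.0], [10.0, 10.0]])
task_params = instantiate_task_params(skills, np.array([1, 1, 0]))
```

Because the allocation matrix is learned jointly with the skills, tasks that share structure end up selecting overlapping subsets of skills.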
1 code implementation • 7 Feb 2022 • Paul Bertin, Jarrid Rector-Brooks, Deepak Sharma, Thomas Gaudelet, Andrew Anighoro, Torsten Gross, Francisco Martinez-Pena, Eileen L. Tang, Suraj M S, Cristian Regep, Jeremy Hayter, Maksym Korablyov, Nicholas Valiante, Almer van der Sloot, Mike Tyers, Charles Roberts, Michael M. Bronstein, Luke L. Lairson, Jake P. Taylor-King, Yoshua Bengio
For large libraries of small molecules, exhaustive combinatorial chemical screens become infeasible to perform when considering a range of disease models, assay conditions, and dose ranges.
2 code implementations • 3 Feb 2022 • Dinghuai Zhang, Nikolay Malkin, Zhen Liu, Alexandra Volokhova, Aaron Courville, Yoshua Bengio
We present energy-based generative flow networks (EB-GFN), a novel probabilistic modeling algorithm for high-dimensional discrete data.
no code implementations • 2 Feb 2022 • Dianbo Liu, Alex Lamb, Xu Ji, Pascal Notsawo, Mike Mozer, Yoshua Bengio, Kenji Kawaguchi
Vector Quantization (VQ) is a method for discretizing latent representations and has become a major part of the deep learning toolkit.
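As a concrete picture of the discretization step, the following is a generic VQ-style nearest-codebook lookup (the standard mechanism the excerpt refers to, not this paper's specific contribution):

```python
import numpy as np

def vector_quantize(z, codebook):
    # z: (N, D) continuous latent vectors; codebook: (K, D) learned codes.
    # Each latent is replaced by its nearest codebook entry (Euclidean
    # distance); the integer indices are the discrete representation.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    idx = d.argmin(axis=1)
    return codebook[idx], idx
```

In VQ-VAE-style training, gradients are passed through the non-differentiable lookup with a straight-through estimator, and the codebook is pulled toward the encoder outputs.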
3 code implementations • 31 Jan 2022 • Nikolay Malkin, Moksh Jain, Emmanuel Bengio, Chen Sun, Yoshua Bengio
Generative flow networks (GFlowNets) are a method for learning a stochastic policy for generating compositional objects, such as graphs or strings, from a given unnormalized density by sequences of actions, where many possible action sequences may lead to the same object.
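The trajectory balance objective introduced in this paper can be sketched per trajectory as follows; this is a simplified rendering in the usual GFlowNet notation (log-partition estimate log Z, forward policy P_F, backward policy P_B, reward R), not a full training loop.

```python
import numpy as np

def trajectory_balance_loss(log_Z, log_pf_steps, log_pb_steps, log_reward):
    # Squared trajectory-balance residual for one complete trajectory:
    #   (log Z + sum_t log P_F(s_{t+1}|s_t) - log R(x) - sum_t log P_B(s_t|s_{t+1}))^2
    # At the optimum (residual zero for all trajectories), the sampler draws
    # terminal objects x with probability proportional to R(x).
    residual = (log_Z + np.sum(log_pf_steps)
                - log_reward - np.sum(log_pb_steps))
    return residual ** 2
```

Because the residual ties together an entire trajectory at once, credit assignment does not have to propagate step by step, which is the paper's motivation relative to flow-matching objectives.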
1 code implementation • 31 Jan 2022 • Maxence Ernoult, Fabrice Normandin, Abhinav Moudgil, Sean Spinney, Eugene Belilovsky, Irina Rish, Blake Richards, Yoshua Bengio
As such, it is important to explore learning algorithms that come with strong theoretical guarantees and can match the performance of backpropagation (BP) on complex tasks.
1 code implementation • 27 Jan 2022 • Ramnath Kumar, Tristan Deleu, Yoshua Bengio
Recent studies show that task distribution plays a vital role in the meta-learner's performance.
1 code implementation • 27 Jan 2022 • Ramnath Kumar, Tristan Deleu, Yoshua Bengio
Our proposed adversarial training regime for Multi-Task Reinforcement Learning (MT-RL) addresses the limitations of conventional training methods in RL, especially in meta-RL environments where the agent faces new tasks.
1 code implementation • 27 Dec 2021 • Enoch Tetteh, Joseph Viviano, Yoshua Bengio, David Krueger, Joseph Paul Cohen
Learning models that generalize under different distribution shifts in medical imaging has been a long-standing research challenge.
1 code implementation • 6 Dec 2021 • Mohammad Pezeshki, Amartya Mitra, Yoshua Bengio, Guillaume Lajoie
A key challenge in building theoretical foundations for deep learning is the complex optimization dynamics of neural networks, resulting from the high-dimensional interactions between the large number of network parameters.
2 code implementations • 17 Nov 2021 • Yoshua Bengio, Salem Lahlou, Tristan Deleu, Edward J. Hu, Mo Tiwari, Emmanuel Bengio
Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates in an active learning context, with a training objective that makes them approximately sample in proportion to a given reward function.
no code implementations • ICLR 2022 • Kartik Ahuja, Jason Hartford, Yoshua Bengio
These results suggest that by exploiting inductive biases on mechanisms, it is possible to design a range of new identifiable representation learning approaches.
no code implementations • 28 Oct 2021 • Nicholas Roy, Ingmar Posner, Tim Barfoot, Philippe Beaudoin, Yoshua Bengio, Jeannette Bohg, Oliver Brock, Isabelle Depatie, Dieter Fox, Dan Koditschek, Tomas Lozano-Perez, Vikash Mansinghka, Christopher Pal, Blake Richards, Dorsa Sadigh, Stefan Schaal, Gaurav Sukhatme, Denis Therien, Marc Toussaint, Michiel Van de Panne
Machine learning has long since become a keystone technology, accelerating science and applications in a broad range of domains.
1 code implementation • ICLR 2022 • Max Morrison, Rithesh Kumar, Kundan Kumar, Prem Seetharaman, Aaron Courville, Yoshua Bengio
We show that simple pitch and periodicity conditioning is insufficient for reducing this error relative to using autoregression.
3 code implementations • ICLR 2022 • Sarthak Mittal, Sharath Chandra Raparthy, Irina Rish, Yoshua Bengio, Guillaume Lajoie
Through our qualitative analysis, we demonstrate that Compositional Attention leads to dynamic specialization based on the type of retrieval needed.
1 code implementation • ICLR 2022 • Vijay Prakash Dwivedi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, Xavier Bresson
An approach to tackle this issue is to introduce Positional Encoding (PE) of nodes, and inject it into the input layer, like in Transformers.
no code implementations • NeurIPS 2021 • Nasim Rahaman, Muhammad Waleed Gondal, Shruti Joshi, Peter Gehler, Yoshua Bengio, Francesco Locatello, Bernhard Schölkopf
Modern neural network architectures can leverage large amounts of data to generalize well within the training distribution.
no code implementations • ICLR 2022 • Dinghuai Zhang, Jie Fu, Yoshua Bengio, Aaron Courville
Black-box optimization formulations for biological sequence design have drawn recent attention due to their promising potential impact on the pharmaceutical industry.
2 code implementations • ICLR 2022 • Victor Schmidt, Alexandra Sasha Luccioni, Mélisande Teng, Tianyu Zhang, Alexia Reynaud, Sunand Raghupathi, Gautier Cosne, Adrien Juraver, Vahe Vardanyan, Alex Hernandez-Garcia, Yoshua Bengio
Climate change is a major threat to humanity, and the actions required to prevent its catastrophic consequences include changes in both policy-making and individual behaviour.
no code implementations • 29 Sep 2021 • Xiao Jing, Zhenwei Zhu, Hongliang Li, Xin Pei, Yoshua Bengio, Tong Che, Hongyong Song
One of the greatest challenges of reinforcement learning is efficient exploration, especially when training signals are sparse or deceptive.
1 code implementation • 6 Sep 2021 • Nino Scherrer, Olexa Bilaniuk, Yashas Annadani, Anirudh Goyal, Patrick Schwab, Bernhard Schölkopf, Michael C. Mozer, Yoshua Bengio, Stefan Bauer, Nan Rosemary Ke
Discovering causal structures from data is a challenging inference problem of fundamental importance in all areas of science.
no code implementations • NeurIPS 2021 • Dianbo Liu, Alex Lamb, Kenji Kawaguchi, Anirudh Goyal, Chen Sun, Michael Curtis Mozer, Yoshua Bengio
Deep learning has advanced from fully connected architectures to structured models organized into components, e.g., the transformer composed of positional elements, modular architectures divided into slots, and graph neural nets made up of nodes.
2 code implementations • NeurIPS 2021 • Kevin Xia, Kai-Zhan Lee, Yoshua Bengio, Elias Bareinboim
Given this property, one may be tempted to surmise that a collection of neural nets is capable of learning any SCM by training on data generated by that SCM.
1 code implementation • 2 Jul 2021 • Nan Rosemary Ke, Aniket Didolkar, Sarthak Mittal, Anirudh Goyal, Guillaume Lajoie, Stefan Bauer, Danilo Rezende, Yoshua Bengio, Michael Mozer, Christopher Pal
A central goal for AI and causality is thus the joint discovery of abstract representations and causal structure.
1 code implementation • 14 Jun 2021 • Yashas Annadani, Jonas Rothfuss, Alexandre Lacoste, Nino Scherrer, Anirudh Goyal, Yoshua Bengio, Stefan Bauer
However, a crucial aspect to acting intelligently upon the knowledge about causal structure which has been inferred from finite data demands reasoning about its uncertainty.
no code implementations • ICML Workshop URL 2021 • Akram Erraqabi, Mingde Zhao, Marlos C. Machado, Yoshua Bengio, Sainbayar Sukhbaatar, Ludovic Denoyer, Alessandro Lazaric
In this work, we introduce a method that explicitly couples representation learning with exploration when the agent is not provided with a uniform prior over the state space.
2 code implementations • NeurIPS 2021 • Kartik Ahuja, Ethan Caballero, Dinghuai Zhang, Jean-Christophe Gagnon-Audet, Yoshua Bengio, Ioannis Mitliagkas, Irina Rish
To answer these questions, we revisit the fundamental assumptions in linear regression tasks, where invariance-based approaches were shown to provably generalize OOD.
4 code implementations • NeurIPS 2021 • Emmanuel Bengio, Moksh Jain, Maksym Korablyov, Doina Precup, Yoshua Bengio
Using insights from Temporal Difference learning, we propose GFlowNet, based on a view of the generative process as a flow network, making it possible to handle the tricky case where different trajectories can yield the same final state, e.g., there are many ways to sequentially add atoms to generate some molecular graph.
4 code implementations • 8 Jun 2021 • Mirco Ravanelli, Titouan Parcollet, Peter Plantinga, Aku Rouhe, Samuele Cornell, Loren Lugosch, Cem Subakan, Nauman Dawalatabad, Abdelwahab Heba, Jianyuan Zhong, Ju-chieh Chou, Sung-Lin Yeh, Szu-Wei Fu, Chien-Feng Liao, Elena Rastorgueva, François Grondin, William Aris, Hwidong Na, Yan Gao, Renato de Mori, Yoshua Bengio
SpeechBrain is an open-source and all-in-one speech toolkit.
1 code implementation • NeurIPS 2021 • Mingde Zhao, Zhen Liu, Sitao Luan, Shuyuan Zhang, Doina Precup, Yoshua Bengio
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state during planning.
no code implementations • 18 May 2021 • Kanika Madan, Nan Rosemary Ke, Anirudh Goyal, Bernhard Schölkopf, Yoshua Bengio
To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks.
1 code implementation • 15 May 2021 • Minkai Xu, Wujie Wang, Shitong Luo, Chence Shi, Yoshua Bengio, Rafael Gomez-Bombarelli, Jian Tang
Specifically, the molecular graph is first encoded in a latent space, and then the 3D structures are generated by solving a principled bilevel optimization program.
no code implementations • 15 Apr 2021 • Francois St-Hilaire, Nathan Burns, Robert Belfer, Muhammad Shayan, Ariella Smofsky, Dung Do Vu, Antoine Frau, Joseph Potochny, Farid Faraji, Vincent Pavero, Neroli Ko, Ansona Onyi Ching, Sabina Elkins, Anush Stepanyan, Adela Matajova, Laurent Charlin, Yoshua Bengio, Iulian Vlad Serban, Ekaterina Kochmar
Personalization and active learning are key aspects to successful learning.
no code implementations • 6 Apr 2021 • Olawale Onabola, Zhuang Ma, Yang Xie, Benjamin Akera, Abdulrahman Ibraheem, Jia Xue, Dianbo Liu, Yoshua Bengio
In this work, we present hBERT, where we modify certain layers of the pretrained BERT model with the new Hopfield Layer.
no code implementations • NeurIPS 2021 • Anirudh Goyal, Aniket Didolkar, Nan Rosemary Ke, Charles Blundell, Philippe Beaudoin, Nicolas Heess, Michael Mozer, Yoshua Bengio
First, GNNs do not predispose interactions to be sparse, as relationships among independent entities are likely to be.
1 code implementation • Proceedings of Machine Learning Research 1:1–13 2021 • Margaux Luck*, Tristan Sylvain*, Joseph Paul Cohen, Heloise Cardinal, Andrea Lodi, Yoshua Bengio
Survival analysis is a type of semi-supervised task where the target output (the survival time) is often right-censored.
1 code implementation • ICLR 2022 • Anirudh Goyal, Aniket Didolkar, Alex Lamb, Kartikeya Badola, Nan Rosemary Ke, Nasim Rahaman, Jonathan Binas, Charles Blundell, Michael Mozer, Yoshua Bengio
We explore the use of such a communication channel in the context of deep learning for modeling the structure of complex environments.
no code implementations • 27 Feb 2021 • Alex Lamb, Di He, Anirudh Goyal, Guolin Ke, Chien-Feng Liao, Mirco Ravanelli, Yoshua Bengio
In this work we explore a way in which the Transformer architecture is deficient: it represents each position with a large monolithic hidden representation and a single set of parameters which are applied over the entire hidden representation.
no code implementations • 22 Feb 2021 • Bernhard Schölkopf, Francesco Locatello, Stefan Bauer, Nan Rosemary Ke, Nal Kalchbrenner, Anirudh Goyal, Yoshua Bengio
The two fields of machine learning and graphical causality arose and developed separately.
3 code implementations • ICLR 2021 • Minkai Xu, Shitong Luo, Yoshua Bengio, Jian Peng, Jian Tang
Inspired by the recent progress in deep generative models, in this paper, we propose a novel probabilistic framework to generate valid and diverse conformations given a molecular graph.
1 code implementation • 16 Feb 2021 • Salem Lahlou, Moksh Jain, Hadi Nekoei, Victor Ion Butoi, Paul Bertin, Jarrid Rector-Brooks, Maksym Korablyov, Yoshua Bengio
Epistemic Uncertainty is a measure of the lack of knowledge of a learner which diminishes with more evidence.
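One standard way to operationalize this quantity is ensemble disagreement, which shrinks as the members converge on more evidence. The sketch below is that generic baseline, not necessarily the estimator proposed in the paper above.

```python
import numpy as np

def ensemble_epistemic(predictions):
    # predictions: (n_members, n_inputs) array of per-member point predictions.
    # Epistemic uncertainty is estimated as the variance across ensemble
    # members at each input: high where members disagree (little evidence),
    # zero where they agree exactly.
    preds = np.asarray(predictions, dtype=float)
    return preds.var(axis=0)
```

Aleatoric (irreducible) noise is not captured by this estimate; separating the two is exactly what makes direct epistemic-uncertainty estimation useful for active learning.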
2 code implementations • 7 Feb 2021 • Tristan Deleu, Yoshua Bengio
The parameters of a neural network are naturally organized in groups, some of which might not contribute to its overall performance.
no code implementations • 14 Jan 2021 • Axel Laborieux, Maxence Ernoult, Benjamin Scellier, Yoshua Bengio, Julie Grollier, Damien Querlioz
Equilibrium Propagation (EP) is a biologically-inspired counterpart of Backpropagation Through Time (BPTT) which, owing to its strong theoretical guarantees and the locality in space of its learning rule, fosters the design of energy-efficient hardware dedicated to learning.
no code implementations • 1 Jan 2021 • Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Bernhard Schölkopf, Michael Curtis Mozer, Hugo Larochelle, Christopher Pal, Yoshua Bengio
Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data.
no code implementations • ICLR 2021 • Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Charles Blundell, Sergey Levine, Yoshua Bengio, Michael Curtis Mozer
To use a video game as an illustration, two enemies of the same type will share schemata but will have separate object files to encode their distinct state (e.g., health, position).
no code implementations • ICLR 2021 • Faruk Ahmed, Yoshua Bengio, Harm van Seijen, Aaron Courville
We consider situations where the presence of dominant simpler correlations with the target variable in a training set can cause an SGD-trained neural network to be less reliant on more persistently-correlating complex features.
no code implementations • 1 Jan 2021 • Anthony Ortiz, Kris Sankaran, Olac Fuentes, Christopher Kiekintveld, Pascal Vincent, Yoshua Bengio, Doina Precup
In this work we tackle the problem of out-of-distribution generalization through conditional computation.
no code implementations • ICLR 2021 • Kanika Madan, Nan Rosemary Ke, Anirudh Goyal, Bernhard Schölkopf, Yoshua Bengio
Decomposing knowledge into interchangeable pieces promises a generalization advantage when there are changes in distribution.
no code implementations • 1 Jan 2021 • Devansh Arpit, Huan Wang, Caiming Xiong, Richard Socher, Yoshua Bengio
Disjoint Manifold Separation: Neural Bayes allows us to formulate an objective which can optimally label samples from disjoint manifolds present in the support of a continuous distribution.
no code implementations • ICLR 2021 • Nasim Rahaman, Anirudh Goyal, Muhammad Waleed Gondal, Manuel Wuthrich, Stefan Bauer, Yash Sharma, Yoshua Bengio, Bernhard Schölkopf
Capturing the structure of a data-generating process by means of appropriate inductive biases can help in learning models that generalise well and are robust to changes in the input distribution.
1 code implementation • ICCV 2021 • Yuwei Cheng, Jiannan Zhu, Mengxin Jiang, Jie Fu, Changsong Pang, Peidong Wang, Kris Sankaran, Olawale Onabola, Yimin Liu, Dianbo Liu, Yoshua Bengio
To promote the practical application for autonomous floating wastes cleaning, we present FloW, the first dataset for floating waste detection in inland water areas.
1 code implementation • 9 Dec 2020 • Shimaa Baraka, Benjamin Akera, Bibek Aryal, Tenzing Sherpa, Finu Shresta, Anthony Ortiz, Kris Sankaran, Juan Lavista Ferres, Mir Matin, Yoshua Bengio
Glacier mapping is key to ecological monitoring in the Hindu Kush Himalaya (HKH) region.
no code implementations • NeurIPS 2020 • Giancarlo Kerg, Bhargav Kanuparthi, Anirudh Goyal, Kyle Goyette, Yoshua Bengio, Guillaume Lajoie
Attention and self-attention mechanisms are now central to state-of-the-art deep learning on sequential tasks.
no code implementations • 30 Nov 2020 • Anirudh Goyal, Yoshua Bengio
A fascinating hypothesis is that human and animal intelligence could be explained by a few principles (rather than an encyclopedic list of heuristics).
no code implementations • 25 Nov 2020 • Cheng-Hao Liu, Maksym Korablyov, Stanisław Jastrzębski, Paweł Włodarczyk-Pruszyński, Yoshua Bengio, Marwin H. S. Segler
A natural idea to mitigate this problem is to bias the search process towards more easily synthesizable molecules using a proxy for synthetic accessibility.
2 code implementations • NeurIPS 2021 • Mohammad Pezeshki, Sékou-Oumar Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie
We identify and formalize a fundamental gradient descent phenomenon resulting in a learning proclivity in over-parameterized neural networks.
no code implementations • 30 Oct 2020 • Prateek Gupta, Tegan Maharaj, Martin Weiss, Nasim Rahaman, Hannah Alsdurf, Abhinav Sharma, Nanor Minoyan, Soren Harnois-Leblanc, Victor Schmidt, Pierre-Luc St. Charles, Tristan Deleu, Andrew Williams, Akshay Patel, Meng Qu, Olexa Bilaniuk, Gaétan Marceau Caron, Pierre Luc Carrier, Satya Ortiz-Gagné, Marc-Andre Rousseau, David Buckeridge, Joumana Ghosn, Yang Zhang, Bernhard Schölkopf, Jian Tang, Irina Rish, Christopher Pal, Joanna Merckx, Eilif B. Muller, Yoshua Bengio
The rapid global spread of COVID-19 has led to an unprecedented demand for effective methods to mitigate the spread of the disease, and various digital contact tracing (DCT) methods have emerged as a component of the solution.
1 code implementation • ICLR 2021 • Yoshua Bengio, Prateek Gupta, Tegan Maharaj, Nasim Rahaman, Martin Weiss, Tristan Deleu, Eilif Muller, Meng Qu, Victor Schmidt, Pierre-Luc St-Charles, Hannah Alsdurf, Olexa Bilaniuk, David Buckeridge, Gaétan Marceau Caron, Pierre-Luc Carrier, Joumana Ghosn, Satya Ortiz-Gagne, Chris Pal, Irina Rish, Bernhard Schölkopf, Abhinav Sharma, Jian Tang, Andrew Williams
Predictions are used to provide personalized recommendations to the individual via an app, as well as to send anonymized messages to the individual's contacts, who use this information to better predict their own infectiousness, an approach we call proactive contact tracing (PCT).
no code implementations • 22 Oct 2020 • Rithesh Kumar, Kundan Kumar, Vicki Anand, Yoshua Bengio, Aaron Courville
In this paper, we propose NU-GAN, a new method for resampling audio from lower to higher sampling rates (upsampling).
no code implementations • 20 Oct 2020 • Tristan Sylvain, Francis Dutil, Tess Berthier, Lisa Di Jorio, Margaux Luck, Devon Hjelm, Yoshua Bengio
In hospitals, data are siloed to specific information systems that make the same information available under different modalities, such as the different medical imaging exams the patient undergoes (CT scans, MRI, PET, Ultrasound, etc.).
no code implementations • 15 Oct 2020 • Alex Lamb, Anirudh Goyal, Agnieszka Słowik, Michael Mozer, Philippe Beaudoin, Yoshua Bengio
Feed-forward neural networks consist of a sequence of layers, in which each layer performs some processing on the information from the previous layer.
1 code implementation • ICLR 2021 • Ossama Ahmed, Frederik Träuble, Anirudh Goyal, Alexander Neitz, Yoshua Bengio, Bernhard Schölkopf, Manuel Wüthrich, Stefan Bauer
To facilitate research addressing this problem, we propose CausalWorld, a benchmark for causal structure and transfer learning in a robotic manipulation environment.
2 code implementations • ICLR 2021 • Meng Qu, Junkun Chen, Louis-Pascal Xhonneux, Yoshua Bengio, Jian Tang
Then in the E-step, we select a set of high-quality rules from all generated rules with both the rule generator and reasoning predictor via posterior inference; and in the M-step, the rule generator is updated with the rules selected in the E-step.
no code implementations • 26 Aug 2020 • Taesup Kim, Sungwoong Kim, Yoshua Bengio
It approximates sparsely connected networks by explicitly defining multiple branches to simultaneously learn representations with different visual concepts or properties.
1 code implementation • 14 Aug 2020 • Lucas Willems, Salem Lahlou, Yoshua Bengio
Recent automatic curriculum learning algorithms, and in particular Teacher-Student algorithms, rely on the notion of learning progress, making the assumption that the good next tasks are the ones on which the learner is making the fastest progress or regress.
no code implementations • 29 Jul 2020 • Yoshua Bengio
We show that a particular form of target propagation (one relying on learned inverses of each layer, and differential in that the target is a small perturbation of the forward propagation) gives rise to an update rule corresponding to an approximate Gauss-Newton gradient-based optimization, without requiring the manipulation or inversion of large matrices.
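In the usual difference target propagation notation (assumed here for illustration, not quoted from the paper), the layer-local targets the abstract refers to take the form

```latex
\hat{h}_{i-1} = h_{i-1} + g_i\!\left(\hat{h}_i\right) - g_i\!\left(h_i\right),
```

where $h_i = f_i(h_{i-1})$ is the forward activation, $g_i$ is the learned approximate inverse of layer $f_i$, and each layer is trained to reduce $\lVert f_i(h_{i-1}) - \hat{h}_i \rVert^2$. When $\hat{h}_i$ is a small perturbation of $h_i$, the difference of inverses linearizes, which is what connects the update to an approximate Gauss-Newton step.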
1 code implementation • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2020 • Md Rifat Arefin, Vincent Michalski, Pierre-Luc St-Charles, Alfredo Kalaitzis, Sookyung Kim, Samira E. Kahou, Yoshua Bengio
High-resolution satellite imagery is critical for various earth observation applications related to environment monitoring, geoscience, forecasting, and land use analysis.
3 code implementations • 24 Jul 2020 • David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio
This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning performance on the hardest level from 77% to 90.4%.
no code implementations • 13 Jul 2020 • Nasim Rahaman, Anirudh Goyal, Muhammad Waleed Gondal, Manuel Wuthrich, Stefan Bauer, Yash Sharma, Yoshua Bengio, Bernhard Schölkopf
Capturing the structure of a data-generating process by means of appropriate inductive biases can help in learning models that generalize well and are robust to changes in the input distribution.
2 code implementations • ICML 2020 • William Fedus, Prajit Ramachandran, Rishabh Agarwal, Yoshua Bengio, Hugo Larochelle, Mark Rowland, Will Dabney
Experience replay is central to off-policy algorithms in deep reinforcement learning (RL), but there remain significant gaps in our understanding.
no code implementations • ACL 2020 • Jacob Russin, Jason Jo, Randall O'Reilly, Yoshua Bengio
Standard methods in deep learning for natural language processing fail to capture the compositional structure of human language that allows for systematic generalization outside of the training distribution.
1 code implementation • ICML 2020 • Sarthak Mittal, Alex Lamb, Anirudh Goyal, Vikram Voleti, Murray Shanahan, Guillaume Lajoie, Michael Mozer, Yoshua Bengio
To effectively utilize the wealth of potential top-down information available, and to prevent the cacophony of intermixed signals in a bidirectional architecture, mechanisms are needed to restrict information flow.
no code implementations • 29 Jun 2020 • Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Sergey Levine, Charles Blundell, Yoshua Bengio, Michael Mozer
To use a video game as an illustration, two enemies of the same type will share schemata but will have separate object files to encode their distinct state (e.g., health, position).
1 code implementation • NeurIPS 2020 • Prateek Gupta, Maxime Gasse, Elias B. Khalil, M. Pawan Kumar, Andrea Lodi, Yoshua Bengio
First, in a more realistic setting where only a CPU is available, is the GNN model still competitive?
no code implementations • 23 Jun 2020 • Matthew Amodio, Rim Assouel, Victor Schmidt, Tristan Sylvain, Smita Krishnaswamy, Yoshua Bengio
Unsupervised image-to-image translation consists of learning a pair of mappings between two domains without known pairwise correspondences between points.
no code implementations • 23 Jun 2020 • Bo Li, Yezhen Wang, Tong Che, Shanghang Zhang, Sicheng Zhao, Pengfei Xu, Wei Zhou, Yoshua Bengio, Kurt Keutzer
In this paper, in order to devise robust DA algorithms, we first systematically analyze the limitations of DM based methods, and then build new benchmarks with more realistic domain shifts to evaluate the well-accepted DM methods.
1 code implementation • 22 Jun 2020 • Yihe Dong, Will Sawin, Yoshua Bengio
Hypergraphs provide a natural representation for many real world datasets.
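A hypergraph generalizes a graph by letting a single edge connect any number of vertices, which is why it represents many real-world datasets naturally (e.g., a co-authorship dataset where one paper links several authors). A minimal sketch, with illustrative example data:

```python
# Hyperedges as named vertex sets: "e1" joins three vertices at once,
# something an ordinary graph edge cannot express.
hyperedges = {
    "e1": {"a", "b", "c"},
    "e2": {"b", "d"},
}

def incident_edges(vertex):
    """Return the set of hyperedges containing the given vertex."""
    return {e for e, verts in hyperedges.items() if vertex in verts}

print(incident_edges("b"))  # {'e1', 'e2'} (set order may vary)
```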
no code implementations • 16 Jun 2020 • Giancarlo Kerg, Bhargav Kanuparthi, Anirudh Goyal, Kyle Goyette, Yoshua Bengio, Guillaume Lajoie
Attention and self-attention mechanisms are now central to state-of-the-art deep learning on sequential tasks.
1 code implementation • 12 Jun 2020 • Khurram Javed, Martha White, Yoshua Bengio
One solution for achieving strong generalization is to incorporate causal structures in the models; such structures constrain learning by ignoring correlations that contradict them.