Machine learning research with positive real-world impact.
Ambitious applied research, positive outcomes
Recent highlights
Our research is supported by access to massive datasets, close collaboration with world-renowned academic faculty, and a uniquely scalable machine learning platform.
NeurIPS 2023 | Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models

Abstract
We systematically study a wide variety of image-based generative models spanning semantically-diverse datasets to understand and improve the feature extractors and metrics used to evaluate them. Using best practices in psychophysics, we measure human perception of image realism for generated samples by conducting the largest experiment evaluating generative models to date, and find that no existing metric strongly correlates with human evaluations. Comparing to 16 modern metrics for evaluating the overall performance, fidelity, diversity, and memorization of generative models, we find that the state-of-the-art perceptual realism of diffusion models as judged by humans is not reflected in commonly reported metrics such as FID. This discrepancy is not explained by diversity in generated samples, though one cause is over-reliance on Inception-V3. We address these flaws through a study of alternative self-supervised feature extractors, find that the semantic information encoded by individual networks strongly depends on their training procedure, and show that DINOv2-ViT-L/14 allows for much richer evaluation of generative models. Next, we investigate data memorization, and find that generative models do memorize training examples on simple, smaller datasets like CIFAR10, but not necessarily on more complex datasets like ImageNet. However, our experiments show that current metrics do not properly detect memorization; none in the literature is able to separate memorization from other phenomena such as underfitting or mode shrinkage. To facilitate further development of generative models and their evaluation we release all generated image datasets, human evaluation data, and a modular library to compute 16 common metrics for 8 different encoders.
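One quantity at the heart of this study, the Fréchet distance between real and generated feature distributions (FID when the features come from Inception-V3), can be computed with any encoder. Below is a minimal encoder-agnostic sketch, assuming Gaussian feature statistics; the random arrays stand in for actual encoder outputs and the function name is illustrative, not the released library's API.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussian fits of two feature sets.

    Encoder-agnostic: rows are samples, columns are feature
    dimensions from any extractor (Inception-V3, DINOv2, ...).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):  # sqrtm can leave tiny imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

# Stand-ins for encoder features of real vs. generated images.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 16))
fake = rng.normal(0.5, 1.0, size=(500, 16))
```

Swapping Inception-V3 features for DINOv2-ViT-L/14 features in `feats_a`/`feats_b` is exactly the kind of encoder substitution the paper investigates.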
NeurIPS 2023 | Adversarially robust learning with uncertain perturbation sets

Abstract
In many real-world settings, the exact perturbation sets to be used by an adversary are not plausibly available to a learner. While prior literature has studied both scenarios with completely known and completely unknown perturbation sets, we propose an in-between setting of learning with respect to a class of perturbation sets. We show that in this setting we can improve on previous results with completely unknown perturbation sets, while still addressing the concerns of not having perfect knowledge of these sets in real life. In particular, we give the first positive results for the learnability of infinite Littlestone classes when having access to a perfect-attack oracle. We also consider a setting of learning with abstention, where predictions are considered robustness violations only when the wrong prediction is made within the perturbation set. We show there are classes for which perturbation-set unaware learning without query access is possible, but abstention is required.
RecSys Challenge 2023 1st Place | Robust User Engagement Modeling with Transformers and Self-Supervision

Abstract
Online advertising has seen exponential growth, transforming into a vast and dynamic market that encompasses many diverse platforms such as web search, e-commerce, social media, and mobile apps. The rapid growth of products and services presents a formidable challenge for advertising platforms, and accurately modeling user intent is increasingly critical for targeted ad placement. The 2023 ACM RecSys Challenge, organized by ShareChat, provides a standardized benchmark for developing and evaluating user intent models using a large dataset of impressions from the ShareChat and Moj apps. In this paper we present our approach to this challenge. We use Transformers to automatically capture interactions between different types of input features, and propose a self-supervised optimization framework based on a contrastive objective. Empirically, we demonstrate that self-supervised learning effectively reduces overfitting, improving model generalization and leading to significant gains in performance. Our team, Layer 6 AI, achieved 1st place on the final leaderboard out of over 100 teams.
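The contrastive objective mentioned in the abstract can be illustrated with an InfoNCE-style loss, where two augmented views of the same batch form positive pairs and the rest of the batch serves as negatives. This is a generic sketch under that assumption, not the team's actual training code; the function name and temperature are illustrative.

```python
import numpy as np

def info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE-style contrastive loss over two views of a batch.

    Row i of view_a and row i of view_b form a positive pair;
    every other row in the batch serves as a negative.
    """
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature               # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(1)
x = rng.normal(size=(32, 8))                # one "view" of a feature batch
noise = 0.05 * rng.normal(size=(32, 8))
aligned = info_nce(x, x + noise)            # positives agree -> low loss
shuffled = info_nce(x, rng.permutation(x))  # positives broken -> high loss
```

Minimizing such a loss pulls representations of the same user interaction together while pushing apart unrelated ones, which is one way self-supervision can curb overfitting.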
MLHC 2023 | DuETT: Dual Event Time Transformer for Electronic Health Records

Abstract
Electronic health records (EHRs) recorded in hospital settings typically contain a wide range of numeric time series data that is characterized by high sparsity and irregular observations. Effective modelling for such data must exploit its time series nature, the semantic relationship between different types of observations, and information in the sparsity structure of the data. Self-supervised Transformers have shown outstanding performance in a variety of structured tasks in NLP and computer vision. But multivariate time series data contains structured relationships over two dimensions: time and recorded event type, and straightforward applications of Transformers to time series data do not leverage this distinct structure. The quadratic scaling of self-attention layers can also significantly limit the input sequence length without appropriate input engineering. We introduce the DuETT architecture, an extension of Transformers designed to attend over both time and event type dimensions, yielding robust representations from EHR data. DuETT uses an aggregated input where sparse time series are transformed into a regular sequence with fixed length; this lowers the computational complexity relative to previous EHR Transformer models and, more importantly, enables the use of larger and deeper neural networks. When trained with self-supervised prediction tasks that provide rich and informative signals for model pre-training, our model outperforms state-of-the-art deep learning models on multiple downstream tasks from the MIMIC-IV and PhysioNet-2012 EHR datasets.
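The aggregation step described above, turning sparse, irregular observations into a fixed-length grid over time bins and event types, can be sketched as follows. The function name, shapes, and binning rule are illustrative assumptions, not the DuETT implementation.

```python
import numpy as np

def bin_events(events, n_bins, n_types, t_max):
    """Aggregate sparse (time, event_type, value) triples onto a
    regular (n_bins x n_types) grid of mean values and counts.

    The fixed-length result lets a Transformer attend along both
    the time axis and the event-type axis, while the counts keep
    the sparsity structure (which observations were present).
    """
    sums = np.zeros((n_bins, n_types))
    counts = np.zeros((n_bins, n_types))
    for t, typ, val in events:
        b = min(int(t / t_max * n_bins), n_bins - 1)
        sums[b, typ] += val
        counts[b, typ] += 1
    means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    return means, counts

# Three observations over a 48-hour stay: two of event type 0 early on,
# one of event type 1 near the end.
events = [(0.0, 0, 2.0), (0.1, 0, 4.0), (47.0, 1, 1.0)]
means, counts = bin_events(events, n_bins=4, n_types=2, t_max=48.0)
```

Because the grid size is fixed regardless of how many raw events occurred, self-attention cost no longer grows with the number of observations.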
Nature Communications | Decentralized federated learning through proxy model sharing

Abstract
Institutions in highly regulated domains such as finance and healthcare often have restrictive rules around data sharing. Federated learning is a distributed learning framework that enables multi-institutional collaborations on decentralized data with improved protection for each collaborator’s data privacy. In this paper, we propose a communication-efficient scheme for decentralized federated learning called ProxyFL, or proxy-based federated learning. Each participant in ProxyFL maintains two models: a private model, and a publicly shared proxy model designed to protect the participant’s privacy. Proxy models allow efficient information exchange among participants without the need for a centralized server. The proposed method eliminates a significant limitation of canonical federated learning by allowing model heterogeneity; each participant can have a private model with any architecture. Furthermore, our protocol for communication by proxy leads to stronger privacy guarantees using differential privacy analysis. Experiments on popular image datasets, and a cancer diagnostic problem using high-quality gigapixel histology whole slide images, show that ProxyFL can outperform existing alternatives with much less communication overhead and stronger privacy.
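The core idea, each participant keeping a private model of arbitrary architecture while exchanging only a common-shape proxy, can be sketched in a few lines. This is an illustrative toy, not the ProxyFL protocol itself: the class name, mixing rule, and dimensions are assumptions, and the differential-privacy mechanism is omitted.

```python
import numpy as np

class Participant:
    """Toy ProxyFL-style participant.

    The private model can have any shape (model heterogeneity);
    only the common-shape proxy is ever shared with peers.
    """
    def __init__(self, private_dim, proxy_dim, seed):
        rng = np.random.default_rng(seed)
        self.private = rng.normal(size=private_dim)  # never transmitted
        self.proxy = rng.normal(size=proxy_dim)      # common shared shape

    def share(self):
        return self.proxy.copy()

    def absorb(self, peer_proxy, alpha=0.5):
        # Peer-to-peer mixing of proxies -- no central server involved.
        self.proxy = (1.0 - alpha) * self.proxy + alpha * peer_proxy

# Two institutions with different private architectures (dims 10 vs. 7)
# but a common 4-dimensional proxy.
alice = Participant(private_dim=10, proxy_dim=4, seed=0)
bob = Participant(private_dim=7, proxy_dim=4, seed=1)
alice_proxy_before = alice.share()
alice.absorb(bob.share())
```

Because only `share()` output crosses institutional boundaries, each collaborator's private weights, and hence its architecture choices, stay local.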
Our research areas include recommendation systems, computer vision, time series forecasting, and natural language processing.

Big vision, deep roots
The co-founders of Layer 6, Jordan Jacobs and Tomi Poutanen, are also founders of the Vector Institute for Artificial Intelligence, and we maintain multiple research initiatives with Vector faculty. Current and former scientific advisors include professors Raquel Urtasun, Sanja Fidler, Rich Zemel, David Duvenaud, Laura Rosella and Scott Sanner.
Meaningful partnerships
Originally founded in 2011, Layer 6 now forms the AI research lab of TD Bank Group. Layer 6 impacts the lives of 25 million customers, helping more people achieve their financial goals. A partnership with TD Securities provides Layer 6 with market data for training algorithmic trading systems.
Layer 6 embraces opportunities to collaborate with Toronto’s world-leading medical research community, offering deep learning solutions to transform healthcare delivery and improve health outcomes. We are the first to deploy deep learning models on health data covering a large population.
Passion to learn, driven to succeed
Our team represents 18 different countries of birth and we care deeply about fostering an inclusive culture where we learn from each other and win together. We are united by our passion for deep learning and a desire to apply our skills to have an outsized and positive impact on the future.
Meet some of our team
Develop your career at Layer 6
We’re growing our team exclusively with people driven to be at the top of their game in machine learning.

In the news
- Inside TD’s AI play: How Layer 6’s technology hopes to improve old-fashioned banking advice (Globe and Mail)
- Tomi Poutanen: Geoffrey Hinton's Turing Award celebrates a life devoted to ground-breaking AI research (TD Newsroom)
- Tomi Poutanen: Chief Artificial Intelligence Officers Enter the C-Suite (Wall Street Journal)
- TD Bank’s ‘Layer 6’ to bring machine learning personalization to diabetes care (IT Business)
- TD Advances Innovation in Canadian Healthcare (TD Bank Group)
- Jordan Jacobs, co-founder of Vector Institute on Canada as a global AI leader (IT Business)
- Layer 6’s Jordan Jacobs: Canada needs to promote itself as an AI leader (BetaKit)
- U of T alumni and graduate students part of Layer 6 AI's win in global competition (U of T News)
- Tomi Poutanen and Michael Rhodes discuss the future of artificial intelligence with Amanda Lang (TD Bank Group)
- Get Smart – Artificial intelligence is transforming business and life (Ivey Business School)
- How TD Bank plans to use artificial intelligence (BNN Bloomberg)