Alex Tamkin

Email: atamkin_stanford_edu | Twitter: @alextamkin | Mastodon: sigmoid.social/@alextamkin

I'm a fifth-year PhD student in Computer Science at Stanford, advised by Noah Goodman and part of the Stanford AI Lab and the Stanford NLP Group.

My research focuses on how to make foundation models (e.g. GPT-3) safe and useful for real-world problems in the natural sciences, engineering, and healthcare.

I study both language models and general machine learning methods that can be applied broadly to images, organic molecules, astronomical data, satellite imagery, wearable sensors, and more.

I'm grateful to be supported by an Open Philanthropy AI Fellowship.

In Fall 2021 I was the instructor of Stanford's CS 197: Computer Science Research. (Slides and materials)

Research Overview

Foundation models are machine learning models (e.g. SimCLR or GPT-3) that are trained on large amounts of unlabeled data and can be easily adapted to many downstream tasks.
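To make that pretrain-then-adapt workflow concrete, here is a minimal, hypothetical sketch in PyTorch: a toy encoder pretrained with a simplified SimCLR-style contrastive loss on unlabeled data, then adapted to a downstream task with a linear probe. The shapes, data, and loss are illustrative stand-ins, not taken from any specific paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy encoder; real foundation models are large Transformers or CNNs.
encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))

def contrastive_loss(z1, z2, temperature=0.5):
    """Simplified InfoNCE loss: each example's two views should match."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature     # (B, B) similarity matrix
    targets = torch.arange(z1.size(0))     # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Stage 1: self-supervised pretraining on unlabeled data.
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
for _ in range(100):
    x = torch.randn(128, 32)               # stand-in for unlabeled data
    view1 = x + 0.1 * torch.randn_like(x)  # two "augmented" views
    view2 = x + 0.1 * torch.randn_like(x)
    loss = contrastive_loss(encoder(view1), encoder(view2))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: adapt cheaply to a downstream task with a linear probe.
probe = nn.Linear(16, 2)                   # small labeled head
x_labeled, y = torch.randn(64, 32), torch.randint(0, 2, (64,))
probe_opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(100):
    with torch.no_grad():                  # encoder stays frozen
        feats = encoder(x_labeled)
    loss = F.cross_entropy(probe(feats), y)
    probe_opt.zero_grad()
    loss.backward()
    probe_opt.step()
```

The key property of this pattern is that Stage 1 is expensive but task-agnostic, while Stage 2 is cheap and can be repeated for many different downstream tasks.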

My research focuses on making foundation models safe and useful for real-world problems in the sciences, engineering, and healthcare.

My work addresses both sides of this problem, training these models well and adapting them fruitfully:

  1. Enabling foundation models to learn from any kind of data, beyond just text or images
    – State-of-the-art methods for training foundation models are specialized for particular modalities, such as text or images. My work has shown that a single method can work across diverse data from 12 different fields, including real-world scientific applications in genomics, wearable sensors, and multispectral satellite imagery.
    – I've also proposed Viewmaker Networks, which let a foundation model learn to pose its own training task, and shown how this kind of training implements a behavior called feature dropout that helps foundation models perform well on a broad range of tasks.

  2. Enabling foundation models to behave well when tasks are ambiguous
    – Machine learning research is typically conducted on benchmarks targeting a well-specified problem. In the real world, however, much of the challenge often lies in defining the task well for a foundation model in the first place.
    – My work includes the first study of how language models respond to ambiguous tasks, and has shown that pretrained models can be finetuned using active learning to resolve this ambiguity and generalize more robustly (see the sketch below).
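As a rough illustration of the general pattern behind that last point, here is a hypothetical sketch of pool-based active learning with uncertainty sampling; it is not the specific method from the paper, and `model`, `oracle`, and every other name are invented for this example. `model` is assumed to be a pretrained classifier with scikit-learn-style `fit` / `predict_proba` methods.

```python
import numpy as np

def entropy(probs):
    """Predictive entropy: higher means the model finds the example ambiguous."""
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def active_learning_loop(model, pool_x, oracle, rounds=5, batch_size=8):
    """Repeatedly query labels for the examples the model is least sure about.

    pool_x: unlabeled candidate examples, shape (N, D)
    oracle: callable that returns labels for requested examples
    """
    labeled_x = np.empty((0, pool_x.shape[1]))
    labeled_y = np.empty((0,), dtype=int)
    for _ in range(rounds):
        probs = model.predict_proba(pool_x)                 # model beliefs on the pool
        query = np.argsort(-entropy(probs))[:batch_size]    # most ambiguous examples
        labeled_x = np.concatenate([labeled_x, pool_x[query]])
        labeled_y = np.concatenate([labeled_y, oracle(pool_x[query])])
        pool_x = np.delete(pool_x, query, axis=0)           # remove queried points
        model.fit(labeled_x, labeled_y)                     # finetune on labels so far
    return model
```

Querying labels for the highest-entropy examples is one simple way to surface exactly the cases where the intended task is underspecified.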

Talks

  • UC Berkeley, October 2022
  • MIT, October 2022
  • Cornell University, September 2022
  • Columbia University, September 2022
  • University of Washington, June 2022, Self-Supervised Learning for the Real World
  • Harvard Medical School, February 2022, DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning
  • Invited Talk, NeurIPS Workshop on Controllable Generative Modeling in Language and Vision, December 2021, Off the Beaten Path: Domain-Agnostic ML for Controllable Generation and Beyond
  • Stanford Center for Research on Foundation Models, October 2021, Active Learning Helps Pretrained Models Learn the Intended Task
  • Stanford Vision and Learning Lab, August 2021, Towards Universal Self-Supervision
  • Stanford OVAL Seminar, May 2021, Understanding and Controlling Transfer Learning in Large Language Models
  • FAIR, December 2020, Language Through a Prism: A Spectral Approach for Multiscale Language Representations
  • Google Brain, September 2018, Searching for Planets with WaveNet
  • NASA Ames Research Center, September 2018, Overcoming Dataset Challenges for Vetting Exoplanets with Machine Learning

Publications

BenchMD: A Benchmark for Modality-Agnostic Learning on Medical Images and Sensors

Kathryn Wantlin, Chenwei Wu, Shih-Cheng Huang, Oishi Banerjee, Farah Dadabhoy, Veeral Vipin Mehta, Ryan Wonhee Han, Fang Cao, Raja R. Narayan, Errol Colak, Adewole S. Adamson, Laura Heacock, Geoffrey H. Tison, Alex Tamkin*, Pranav Rajpurkar*. arXiv preprint

Feature Dropout: Revisiting the Role of Augmentations in Contrastive Learning

Alex Tamkin, Margalit Glasgow, Xiluo He, Noah Goodman. arXiv preprint

Task Ambiguity in Humans and Language Models

Alex Tamkin*, Kunal Handa*, Avash Shrestha, Noah Goodman. ICLR 2023

DABS 2.0: Improved Datasets and Algorithms for Universal Self-Supervision [🐦thread]

Alex Tamkin, Gaurab Banerjee, Mohamed Owda, Vincent Liu, Shashank Rammoorthy, Noah Goodman. NeurIPS 2022

Active Learning Helps Pretrained Models Learn the Intended Task [🐦thread]

Alex Tamkin*, Dat Nguyen*, Salil Deshpande*, Jesse Mu, Noah Goodman. NeurIPS 2022

Oolong: Investigating What Makes Crosslingual Transfer Hard with Controlled Studies [🐦thread]

Zhengxuan Wu*, Isabel Papadimitriou*, Alex Tamkin*. arXiv preprint

DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning [🌐site] [🐦thread]

Alex Tamkin, Vincent Liu, Rongfei Lu, Daniel Fein, Colin Schultz, Noah Goodman. NeurIPS 2021. Press: [Redshift Magazine] [AIM Magazine] [Stanford HAI]

Tradeoffs Between Contrastive and Supervised Learning: An Empirical Study

Ananya Karthik, Mike Wu, Noah Goodman, Alex Tamkin. NeurIPS 2021 Workshop on Self-Supervised Learning: Theory and Practice

C5T5: Controllable Generation of Organic Molecules with Transformers

Daniel Rothchild, Alex Tamkin, Julie Yu, Ujval Misra, Joseph Gonzalez. arXiv preprint

On the Opportunities and Risks of Foundation Models

Center for Research on Foundation Models (full list of authors)
  • Section 4.2: Training and Self-Supervision, Alex Tamkin
  • Section 4.9: AI Safety and Alignment, Alex Tamkin, Geoff Keeling, Jack Ryan, Sydney von Arx
Coauthor: Sections §2.2: Vision, §3.3: Education, §4.1: Modeling, §5.6: Ethics of Scale
Press: [Forbes] [The Economist] [VentureBeat]

Viewmaker Networks: Learning Views for Unsupervised Representation Learning [📝blogpost] [🐦thread]

Alex Tamkin, Mike Wu, Noah Goodman. ICLR 2021

Language Through a Prism: A Spectral Approach for Multiscale Language Representations [🐦thread] [📝blogpost]

Alex Tamkin, Dan Jurafsky, Noah Goodman. NeurIPS 2020

Investigating Transferability in Pretrained Language Models [🐦thread]

Alex Tamkin, Trisha Singh, Davide Giovanardi, Noah Goodman. Findings of EMNLP 2020; presented at CoNLL 2020

Distributionally-Aware Exploration for CVaR Bandits

Alex Tamkin, Ramtin Keramati, Christoph Dann, Emma Brunskill. NeurIPS 2019 Workshop on Safety and Robustness in Decision Making; RLDM 2019

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy

Ramtin Keramati, Christoph Dann, Alex Tamkin, Emma Brunskill. AAAI 2020

Recursive Routing Networks: Learning to Compose Modules for Language Understanding

Ignacio Cases, Clemens Rosenbaum, Matthew Riemer, Atticus Geiger, Tim Klinger, Alex Tamkin, Olivia Li, Sandhini Agarwal, Joshua D. Greene, Dan Jurafsky, Christopher Potts, Lauri Karttunen. NAACL 2019

Drone.io: A Gestural and Visual Interface for Human-Drone Interaction

Jessica R. Cauchard, Alex Tamkin, Cheng Yao Wang, Luke Vink, Michelle Park, Tommy Fang, James A. Landay. HRI 2019

Identifying Exoplanets with Deep Learning: Towards Improved Planet Occurrence Rates with Kepler, K2, and TESS

Andrew Vanderburg, Christopher Shallue, Liang Yu, Anne Dattilo, Alex Tamkin. American Astronomical Society Meeting Abstracts, 2019

Other Writing

Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models [📝blogpost]

Alex Tamkin*, Miles Brundage*, Jack Clark, Deep Ganguli. arXiv preprint. Press: [VentureBeat] [Datanami] [Slator]

Input on the European Commission White Paper on Artificial Intelligence

Marietje Schaake, Elisabeth Appel, Dathan M. Duplichen, Lisa Einstein, Wren Elhai, Muhammad Dhafer, Muhammad Faishal, Agata Foryciarz, Sydney L. Frankenberg, Toni Friedman, Zoe Huczok, Kyra Jasper, Danielle Jablanski, Jennifer King, Cindy Kuang, Heajune Lee, Shreya Mantha, Vidyangi Patil, Gailyn Portelance, Adriana Stephan, Alex Tamkin, Alessandro Vecchiato, Eva Zhang, Jason Zhao

(See Essays for more)

Media

AI Artwork in PC Magazine (Twitter thread: DALL-E Meets WALL-E: An Art History)

Interview on The Gradient Podcast - Alex Tamkin on Self-Supervised Learning and Large Language Models

Press: [Communications of the ACM]

Interview on The Engineered Mind Podcast - NLP, AI Ethics & PhD Life

Personal


Outside of research, I organize the Stanford Queer in AI Dinner with Stanford Inclusion in AI.

I also like making art, especially ceramics and photography!