
Gustaf Ahdritz

I'm a third-year PhD student in Computer Science at Harvard University. I'm a member of the Machine Learning Foundations Group and am advised by Boaz Barak and Jonathan Frankle. I'm supported by a fellowship from Harvard's Kempner Institute.

I'm broadly interested in empirical investigations of the properties of realistic deep neural networks. At the moment, I'm thinking about uncertainty in large language models.

In the summer of 2024, I interned at Apple, where I worked with Parikshit Gopalan, Udi Wieder, and others. I graduated from Columbia with a B.A. in Computer Science & History (2020) and an M.S. in Computer Science (2021). There, I worked with Mohammed AlQuraishi on the applied task of protein structure prediction and led the development of OpenFold. I also spent time in Kathleen McKeown's lab and the History Lab.

Here's my CV.

Papers

* denotes equal contribution

Preprints

  • Modeling Real-Time Interactive Conversations as Timed Diarized Transcripts
    Garrett Tanzer, Gustaf Ahdritz, Luke Melas-Kyriazi
    arXiv, 2024.
    [paper] [code] [tweetorial] [bibtex]

    TL;DR: Training language models directly on timed, diarized transcripts (e.g. instant messenger logs) permits true real-time interactivity.

Workshop papers

  • Soft prompting might be a bug, not a feature
    Luke Bailey, Gustaf Ahdritz, Anat Kleiman, Siddharth Swaroop, Finale Doshi-Velez, Weiwei Pan
    Workshop on Challenges in Deployable Generative AI, ICML 2023.
    [paper] [bibtex]

    TL;DR: Contrary to prior speculation, we find that soft prompts (created with "prompt-" or "prefix-tuning") differ from natural token embeddings in key ways, complicating attempts to decode them back into natural language.

Publications

  • Distinguishing the Knowable from the Unknowable with Language Models
    Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, Benjamin L. Edelman
    ICML, 2024.
    [paper] [code] [blog] [tweetorial] [bibtex]

    TL;DR: Linear probes of language model activations can predict when the predictive entropy of much larger and more knowledgeable models is close to zero, and they even work out-of-distribution!

  • OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization
    Gustaf Ahdritz, Nazim Bouatta, Christina Floristean, Sachin Kadyan, Qinghui Xia, William Gerecke, Timothy J. O'Donnell, Daniel Berenberg, Ian Fisk, Niccolò Zanichelli, Bo Zhang, Arkadiusz Nowaczynski, Bei Wang, Marta M. Stepniewska-Dziubinska, Shang Zhang, Adegoke Ojewole, Murat Efe Guney, Stella Biderman, Andrew M. Watkins, Stephen Ra, Pablo Ribalta Lorenzo, Lucas Nivon, Brian Weitzner, Yih-En Andrew Ban, Shiyang Chen, Minjia Zhang, Conglong Li, Shuaiwen Leon Song, Yuxiong He, Peter K. Sorger, Emad Mostaque, Zhao Zhang, Richard Bonneau, Mohammed AlQuraishi
    Nature Methods 21, 1514-1524 (2024).
    [paper] [code] [talk] [coverage] [tweetorial (preprint)] [tweetorial (publication)] [bibtex]

    TL;DR: We created the first trainable, open-source reproduction of AlphaFold2 and used it to study how the model learns to fold. We found surprisingly fast convergence, robustness to lack of diversity in the training set, and regular progressions in the dimensionality of predicted structures over the course of training. We also optimized the model, making inference possible on much longer sequences, and added new features to improve training stability.

  • OpenProteinSet: Training data for structural biology at scale
    Gustaf Ahdritz, Nazim Bouatta, Sachin Kadyan, Lukas Jarosch, Daniel Berenberg, Ian Fisk, Andrew M. Watkins, Stephen Ra, Richard Bonneau, Mohammed AlQuraishi
    NeurIPS Track on Datasets and Benchmarks, 2023.
    [paper] [data] [bibtex]

    TL;DR: We present the largest open repository of precomputed multiple sequence alignments (MSAs) of proteins, representing millions of compute hours. MSAs are important primitives across bioinformatics, but their steep computational cost has previously limited their accessibility outside large research labs in industry (notably DeepMind and Meta, of AlphaFold2 and the MSA Transformer, respectively).

  • Single-sequence protein structure prediction using a language model and deep learning
    Ratul Chowdhury, Nazim Bouatta, Surojit Biswas, Christina Floristean, Anant Kharkar, Koushik Roy, Charlotte Rochereau, Gustaf Ahdritz, Joanna Zhang, George M. Church, Peter K. Sorger, Mohammed AlQuraishi
    Nature Biotechnology, 2022.
    [paper] [code] [coverage] [bibtex]

    TL;DR: We present RGN2, an end-to-end "single-sequence" protein structure prediction model that relies on a small protein language model rather than multiple sequence alignments. RGN2 outperforms AlphaFold2 and RoseTTAFold on orphan proteins and is faster by orders of magnitude.

Teaching

I've served as a teaching fellow/assistant for the following courses at Harvard and Columbia:
  • Spring 2023: Foundations of Deep Learning (Harvard COMPSCI 229br) with Boaz Barak
  • Spring 2019 - Spring 2021: Advanced Programming (Columbia COMS 3157) with Jae Woo Lee

Awards & Fellowships

Links

[github] [google scholar] [twitter]