Gonzalo Benegas

Research Scientist at Open Athena

I work on adapting large language models to new domains — understanding what it takes in terms of data curation, architecture design, and evaluation to make foundation models useful beyond text. My primary focus has been genomics, where I developed DNA language models that predict the effects of genetic variants across the human genome.

I’m currently a Research Scientist at Open Athena, working on open-source LLMs for science. I received a PhD in Computational Biology from UC Berkeley, advised by Yun S. Song, and a Licentiate in Computer Science from the University of Buenos Aires.

selected publications

2025

  1. Predicting functional constraints across evolutionary timescales with phylogeny-informed genomic language models
    Chengzhong Ye*, Gonzalo Benegas*, Carlos Albors, Jianan Canal Li, Sebastian Prillo, and 3 more authors
    bioRxiv, 2025
  2. A DNA language model based on multispecies alignment predicts the effects of genome-wide variants
    Gonzalo Benegas, Carlos Albors, Alan J Aw, Chengzhong Ye, and Yun S Song
    Nature Biotechnology, 2025
  3. Genomic language models: opportunities and challenges
    Gonzalo Benegas*, Chengzhong Ye*, Carlos Albors*, Jianan Canal Li*, and Yun S Song
    Trends in Genetics, 2025

2023

  1. DNA language models are powerful predictors of genome-wide variant effects
    Gonzalo Benegas, Sanjit Singh Batra, and Yun S. Song
    Proceedings of the National Academy of Sciences, 2023