Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
Pages
Posts
publications
BioinformaticsBench: A collaboratively built large language model benchmark for Bioinformatics reasoning
Published in ICML 2024, 2024
Most existing Large Language Model (LLM) benchmarks for bioinformatics problem reasoning focus on problems grounded in niche research domains, where datasets contain a small number of samples and are therefore not truly representative of the broad domain of bioinformatics. To systematically examine the reasoning capabilities required for solving complex bioinformatics problems, we introduce BioinformaticsBench, an expansive benchmark suite for LLMs.
Citation: Varuni Sarwal, Seungmo Lee, Rosemary He, Aingela Kattapuram, Xiaoxuan Wang, Eleazar Eskin, Wei Wang, Serghei Mangul. "BioinformaticsBench: A collaboratively built large language model benchmark for Bioinformatics reasoning." AccMLBio ICML 2024.
Download Paper
A blended genome and exome sequencing method captures genetic variation in an unbiased, high-quality, and cost-effective manner
Published in bioRxiv, 2024
We deployed the Blended Genome Exome (BGE), a DNA library blending approach that generates low-pass whole genome (1-4x mean depth) and deep whole exome (30-40x mean depth) data in a single sequencing run. This technology is cost-effective, empowers most genomic discoveries possible with deep whole genome sequencing, and provides an unbiased method to capture the diversity of common SNP variation across the globe. To evaluate this new technology at scale, we applied BGE to sequence >53,000 samples from the Populations Underrepresented in Mental Illness Associations Studies (PUMAS) Project, which included participants across African, African American, and Latin American populations.
Citation: Boltz, T., Chu, B., Liao, C., Sealock, J., … Lee, Seungmo, ... Martin, R. (2024). "A blended genome and exome sequencing method captures genetic variation in an unbiased, high-quality, and cost-effective manner." bioRxiv.
Download Paper
VISTA: an integrated framework for structural variant discovery
Published in Briefings in Bioinformatics, 2024
Here, we report an integrated structural variant calling framework, Variant Identification and Structural Variant Analysis (VISTA), that leverages the results of individual callers using a novel and robust filtering and merging algorithm.
Citation: Varuni Sarwal*, Seungmo Lee*, Jianzhi Yang, Sriram Sankararaman, Mark Chaisson, Eleazar Eskin, Serghei Mangul. "VISTA: an integrated framework for structural variant discovery." Briefings in Bioinformatics, Volume 25, Issue 5.
Download Paper
talks
RECOMB 2024 Short Talk Presentation
Published:
VISTA: an integrated framework for structural variant discovery
ISMB 2024, Satellite Talk
Published:
VISTA: an integrated framework for structural variant discovery
RECOMB 2025 Short Talk Presentation
Published:
Metagenomics agnostic Test
teaching
CS 122: Algorithms in Bioinformatics
Winter 2025, Computer Science Department, 2025



