Publications
Current publications and preprints.
Preprints
-
Population-scale Ancestral Recombination Graphs with tskit 1.0
Ben Jeffery, Yan Wong, Kevin Thornton, Georgia Tsambos, Gertjan Bisschop, and 47 more authors
Preprint, 2026
Ancestral recombination graphs (ARGs) are an increasingly important component of population and statistical genetics. The tskit library has become key infrastructure for the field, providing an expressive and general representation of ARGs together with a suite of efficient fundamental operations. In this note, we announce tskit version 1.0, describe its underlying rationale, and document its stability guarantees. These guarantees provide a foundation for durable computational artefacts and support long-term reproducibility of code and analyses.
@unpublished{tskit,
  title = {Population-scale Ancestral Recombination Graphs with tskit 1.0},
  author = {Jeffery, Ben and Wong, Yan and Thornton, Kevin and Tsambos, Georgia and Bisschop, Gertjan and Deng, Yun and Ellerman, E. Castedo and Forest, Thomas B. and Fritze, Halley and Goldstein, Daniel and Gorjanc, Gregor and Gower, Graham and Gravel, Simon and Guez, Jeremy and Haller, Benjamin C. and Kern, Andrew D. and Kirk, Lloyd and Krukov, Ivan and Lee, Hanbin and Lehmann, Brieuc and Loay, Hossameldin and Osmond, Matthew M. and Palmer, Duncan S. and Pope, Nathaniel S. and Ragsdale, Aaron P. and Robertson, Duncan and Rodrigues, Murillo F. and van Kemenade, Hugo and Weiß, Clemens L. and Wohns, Anthony Wilder and Zhan, Shing H. and Zhang, Brian C. and Aspbury, Marianne and Baya, Nikolas A. and Belsare, Saurabh and Biddanda, Arjun and Jiménez, Francisco Campuzano and Gladstein, Ariella and Guo, Bing and Karthikeyan, Savita and Kretzschmar, Warren W. and Rebollo, Inés and Saunack, Kumar and Shemirani, Ruhollah and Simon, Alexis and Smith, Chris and Sukumaran, Jeet and Terhorst, Jonathan and Unneberg, Per and Zhang, Ao and Ralph, Peter and Kelleher, Jerome},
  year = {2026},
  archiveprefix = {arXiv},
  primaryclass = {q-bio.PE},
  note = {Preprint},
  url = {https://arxiv.org/abs/2602.09649}
}
-
Faithful Reeb Graph Reconstruction of a Tectonic Subduction Zone from Earthquake Hypocenters
Halley Fritze, Sushovan Majhi, Marissa Masden, Atish Mitra, and Michael Stickney
Preprint, 2025
@unpublished{faithfulreebgraphreconstruction,
  title = {Faithful Reeb Graph Reconstruction of a Tectonic Subduction Zone from Earthquake Hypocenters},
  author = {Fritze, Halley and Majhi, Sushovan and Masden, Marissa and Mitra, Atish and Stickney, Michael},
  year = {2025},
  archiveprefix = {arXiv},
  primaryclass = {cs.CG},
  note = {Preprint},
  url = {https://arxiv.org/abs/2410.19410}
}
-
Multiscale 2-Mapper – Exploratory Data Analysis Guided by the First Betti Number
Halley Fritze
Preprint, 2025
The Mapper algorithm is a fundamental tool in exploratory topological data analysis for identifying connectivity and topological clustering in data. Derived from the nerve construction, Mapper graphs can contain additional information about clustering density when considering the higher-dimensional skeleta. To observe two-dimensional features, and capture one-dimensional topology, we construct 2-Mapper. A common issue in using Mapper algorithms is parameter choice. We develop tools to choose 2-Mapper parameters that reflect persistent Betti-1 information. Computationally, we study how cover choice affects 2-Mapper and analyze this through a computational Multiscale Mapper algorithm. We test our constructions on three-dimensional shape data, including the Klein bottle.
@unpublished{2-mapper,
  title = {Multiscale 2-Mapper -- Exploratory Data Analysis Guided by the First Betti Number},
  author = {Fritze, Halley},
  year = {2025},
  archiveprefix = {arXiv},
  primaryclass = {cs.CG},
  note = {Preprint},
  url = {https://arxiv.org/abs/2509.22816}
}
Publications
-
TopoBench: A Framework for Benchmarking Topological Deep Learning
Lev Telyatnikov, Guillermo Bernardez, Marco Montagna, Mustafa Hajij, Martin Carrasco, and 32 more authors
Journal of Data-centric Machine Learning Research, 2025
This work introduces TopoBench, an open-source library designed to standardize benchmarking and accelerate research in topological deep learning (TDL). TopoBench decomposes TDL into a sequence of independent modules for data generation, loading, transforming and processing, as well as model training, optimization and evaluation. This modular organization provides flexibility for modifications and facilitates the adaptation and optimization of various TDL pipelines. A key feature of TopoBench is its support for transformations and lifting across topological domains. Mapping the topology and features of a graph to higher-order topological domains, such as simplicial and cell complexes, enables richer data representations and more fine-grained analyses. The applicability of TopoBench is demonstrated by benchmarking several TDL architectures across diverse tasks and datasets.
-
A forest is more than its trees: haplotypes and ancestral recombination graphs
Halley Fritze, Nathaniel Pope, Jerome Kelleher, and Peter Ralph
Genetics, Jan 2026
Foreshadowing haplotype-based methods of the genomics era, it is an old observation that the “junction” between two distinct haplotypes produced by recombination is inherited as a Mendelian marker. In a genealogical context, this recombination-mediated information reflects the persistence of ancestral haplotypes across local genealogical trees in which they do not represent coalescences. We show how these non-coalescing haplotypes (“locally-unary nodes”) may be inserted into ancestral recombination graphs, a compact but information-rich data structure describing the genealogical relationships among recombinant sequences. The resulting ancestral recombination graphs are smaller, faster to compute with, and the additional ancestral information that is inserted is nearly always correct where the initial ancestral recombination graph is correct. We provide efficient algorithms to infer locally-unary nodes within existing ancestral recombination graphs, and explore some consequences for ancestral recombination graphs inferred from real data. To do this, we introduce new metrics of agreement and disagreement between ancestral recombination graphs that, unlike previous methods, consider ancestral recombination graphs as describing relationships between haplotypes rather than just a collection of trees.