SOFTWARE

Below is software developed in our lab, listed from most recent to oldest.

Acorn

acorn: An R package that works with de novo variants (DNVs) already called using a DNV caller (e.g., HAT). The toolkit is useful for extracting different types of DNVs and summarizing characteristics of the DNVs. Available at https://github.com/TNTurnerLab/acorn.

Relevant Publication: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-023-05457-z

PYRUS

PYRUS: A tool for plotting an individual's copy number estimates across user-specified regions of the genome. Options include plotting other individuals in the same region, plotting an annotation track, and writing out specific regions where individuals have a copy number below or above given values. The input is bgzipped, tabix-indexed BED files, which enables rapid plotting of the data. Available at https://github.com/TNTurnerLab/PYRUS.

Short Writeup: https://github.com/TNTurnerLab/PYRUS/blob/main/paper/paper.md
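The threshold-based region reporting described for PYRUS can be illustrated with a minimal sketch. This is not PYRUS code: the four-column layout (chromosome, start, end, copy number estimate), the function names, and the threshold values are all assumptions made for illustration, and the example reads plain tab-separated lines rather than bgzipped, tabix-indexed files.

```python
from typing import Iterable, List, NamedTuple, Optional


class CopyNumberInterval(NamedTuple):
    chrom: str
    start: int          # 0-based start, BED-style
    end: int            # exclusive end, BED-style
    copy_number: float  # assumed to be the fourth column


def parse_bed_lines(lines: Iterable[str]) -> List[CopyNumberInterval]:
    """Parse tab-separated BED-like lines; the column layout is an assumption."""
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end, cn = line.split("\t")[:4]
        intervals.append(CopyNumberInterval(chrom, int(start), int(end), float(cn)))
    return intervals


def flag_intervals(
    intervals: Iterable[CopyNumberInterval],
    below: Optional[float] = None,
    above: Optional[float] = None,
) -> List[CopyNumberInterval]:
    """Return intervals whose copy number is under `below` or over `above`."""
    flagged = []
    for iv in intervals:
        if below is not None and iv.copy_number < below:
            flagged.append(iv)
        elif above is not None and iv.copy_number > above:
            flagged.append(iv)
    return flagged


# Hypothetical data: three consecutive 1 kb windows on chr1.
example_lines = [
    "chr1\t0\t1000\t2.0",
    "chr1\t1000\t2000\t0.9",
    "chr1\t2000\t3000\t3.4",
]
flagged = flag_intervals(parse_bed_lines(example_lines), below=1.5, above=2.5)
for iv in flagged:
    print(iv.chrom, iv.start, iv.end, iv.copy_number, sep="\t")
```

With these hypothetical thresholds, the second and third windows are reported and the first (copy number 2.0) is not.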

HAT

HAT: Hare And Tortoise (HAT) is a pair of de novo variant callers we developed for parent-child trio sequencing data. Hare, as seen in Ng et al. 2022, uses NVIDIA Parabricks v4.0.0-1, which leverages GPUs to accelerate variant calling, specifically GATK HaplotypeCaller 4.2.0 and DeepVariant v1.4.0. Tortoise uses freely available, open-source versions of these variant callers. We then use GLnexus to form family-level joint-genotyped files, which are run through our custom de novo variant filter. Available at https://github.com/TNTurnerLab/hat.

Relevant Preprint: https://doi.org/10.1101/2023.01.27.525940

Relevant Publication: https://pubmed.ncbi.nlm.nih.gov/36054329/
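For readers unfamiliar with trio-based calling, the basic genotype pattern behind a candidate de novo variant can be sketched as follows. This is not the HAT filter, whose criteria are documented in the repository and publications above; the function names are hypothetical, and the sketch only checks that the child carries a non-reference allele while both parents are homozygous reference.

```python
from typing import Tuple


def normalize_genotype(gt: str) -> Tuple[str, ...]:
    """Turn a VCF-style genotype such as '0/1' or '1|0' into a sorted allele tuple."""
    return tuple(sorted(gt.replace("|", "/").split("/")))


def is_candidate_de_novo(child: str, father: str, mother: str) -> bool:
    """True when the child is heterozygous and both parents are homozygous reference.

    Illustrative only: a real filter would also consider evidence such as
    read depth and allele balance, which are not modeled here.
    """
    hom_ref = ("0", "0")
    child_alleles = normalize_genotype(child)
    if len(child_alleles) != 2:
        return False  # ignore haploid or malformed calls in this sketch
    child_is_het = child_alleles[0] == "0" and child_alleles[1] not in ("0", ".")
    parents_hom_ref = (
        normalize_genotype(father) == hom_ref
        and normalize_genotype(mother) == hom_ref
    )
    return child_is_het and parents_hom_ref


# Hypothetical trio genotypes: (child, father, mother).
trios = [("0/1", "0/0", "0/0"), ("0/1", "0/1", "0/0"), ("./.", "0/0", "0/0")]
for child, father, mother in trios:
    print(child, father, mother, is_candidate_de_novo(child, father, mother))
```

Only the first trio passes: in the second the variant is inherited from the father, and in the third the child has no genotype call.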

ACES

ACES: A workflow to query small sequences in a large set of genomes. It provides several outputs including BLAST results, a multiple sequence alignment file, a graphical fragment assembly file, and a phylogenetic tree file. Available at https://github.com/TNTurnerLab/ACES.

Relevant Publication: https://pubmed.ncbi.nlm.nih.gov/34601580/

Updated fitDNM

fitDNM for noncoding: fitDNM was originally developed by the Allen lab (http://people.duke.edu/~asallen/Software.html) in Jiang et al. 2015, Am. J. Hum. Genet. (https://www.cell.com/ajhg/fulltext/S0002-9297(15)00277-3) to incorporate functional information in tests of excess de novo mutational load. In collaboration with the Allen lab, we adapted the pipeline to use CADD scores instead of PolyPhen-2 scores so that it can run in noncoding regions of the genome, and we implemented a scalable version that tests many elements at once. Given a BED file of the regions to test for a significant excess of de novo mutations and the corresponding variants to use, the pipeline outputs two summary files: the “.fitDNM.report” file, which contains the p-values and scores calculated by fitDNM for each element in the BED file, and the “.mutation.report” file, which summarizes all mutations found in these genomic regions. Available at https://github.com/TNTurnerLab/fitDNM.

Relevant Publication: https://pubmed.ncbi.nlm.nih.gov/34256850/
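The bookkeeping step of assigning de novo mutations to the elements in a BED file can be sketched as below. This does not reproduce the fitDNM statistical test or its report formats; the function name, element names, and coordinates are hypothetical. The sketch assumes BED intervals are 0-based and half-open, and that mutation positions are 1-based as in VCF files.

```python
from typing import Dict, Iterable, Tuple

Element = Tuple[str, int, int, str]  # (chrom, start, end, name), BED coordinates
Mutation = Tuple[str, int]           # (chrom, 1-based position)


def count_mutations_per_element(
    elements: Iterable[Element], mutations: Iterable[Mutation]
) -> Dict[str, int]:
    """Count how many mutations fall inside each named element."""
    elements = list(elements)
    mutations = list(mutations)
    counts = {name: 0 for _, _, _, name in elements}
    for chrom, start, end, name in elements:
        for mut_chrom, pos in mutations:
            # A 1-based position lies in a 0-based half-open interval
            # when start < pos <= end.
            if mut_chrom == chrom and start < pos <= end:
                counts[name] += 1
    return counts


# Hypothetical elements and de novo mutation positions.
elements = [("chr1", 100, 200, "element_A"), ("chr1", 500, 600, "element_B")]
mutations = [("chr1", 101), ("chr1", 200), ("chr1", 201), ("chr2", 150), ("chr1", 550)]
counts = count_mutations_per_element(elements, mutations)
print(counts)
```

The nested loop is fine for a handful of elements; a scalable run over many elements would use an indexed lookup instead.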