DATASETS

The datasets from our lab are listed below, from most recent to oldest.

Neuro-2a Cell Line Deep Coverage Revio WGS

We deeply sequenced the Neuro-2a cell line on a PacBio Revio sequencer to 98x coverage.

Relevant Preprint: https://www.biorxiv.org/content/10.1101/2023.06.06.543940v1

Globus Link: https://app.globus.org/file-manager?origin_id=4865823e-01af-11ee-a924-63e0d97254cd&path=%2F

Code Link: https://github.com/TNTurnerLab/Revio_N2A

HT-22 Project

HT-22 Genome, Epigenome, and Functional Genomics Characterization: In this project, we assess genomic, epigenomic, and functional genomic characteristics of the HT-22 mouse hippocampal neuronal cell line, including karyotype, Illumina short-read whole-genome sequencing, PacBio HiFi long-read whole-genome sequencing, Illumina PolyA RNA sequencing, PacBio long-read RNA Iso-Seq, Illumina ATAC-sequencing, and Illumina Hi-C sequencing.

We deposited the raw data at https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA938057.

The GitHub link for this project is at https://github.com/TNTurnerLab/HT22_genome_epigenome_functional_genomics_project.

Ng et al. 2022, Human Mutation

1000G Variant Calls:

These data are from our publication on de novo variation in the 1000 Genomes Project.

The Globus endpoint is “Turner Lab at WashU – DNV in 1000 Genomes Paper” (direct link: https://app.globus.org/file-manager?origin_id=3eff453a-88f4-11eb-954f-752ba7b88ebe&origin_path=%2F). It includes:

  • DeepVariant_by_family_VCF_files: VCF files for the 602 family trios, called with Parabricks DeepVariant and genotyped with GLnexus.

  • GATK_by_family_VCF_files: VCF files for the 602 family trios, called with Parabricks GATK HaplotypeCaller and genotyped with GLnexus.

  • Pedigree_file_from_KG_website: Pedigree file of the relationships between samples from the 1000 Genomes Project.

  • SAMtools_tview_images_at_select_sites: tview images, as .txt files, of manually validated de novo mutation sites. The tviews are always presented in the order father, mother, child. Also included is tview_all_actual_sorted_with_filenames.txt, a tab-delimited file that lists, for every tview, the chromosome, position, ref and alt alleles, validation score, and sample.

  • full_denovo_callset: A bgzipped .txt file that contains all de novo mutations called by our pipeline in the 602 trios.

  • phasing_results: A bgzipped .txt file that contains the phasing results from Unfazed for all 602 trios.

  • DeepVariant_Joint_Genotyped_Callset: A bgzipped and tabix-indexed VCF file of joint genotypes we generated with DeepVariant and GLnexus across the individuals in the 1000 Genomes Project.

  • Browser_Track_of_DNVs_for_UCSC_Genome_Browser: A BED file for upload to and visualization in the UCSC Genome Browser.
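Several of the files listed above are bgzipped, tab-delimited text. Because bgzip output is a valid gzip stream, such files can be read with a standard gzip reader once they have been downloaded. The following is a minimal sketch in Python, not part of our pipeline: the function name iter_bgzipped_rows is hypothetical, and the column layout and the presence of a header line in any given file are assumptions that should be checked against the file itself.

```python
import gzip
from typing import Iterator, List


def iter_bgzipped_rows(path: str, delimiter: str = "\t") -> Iterator[List[str]]:
    """Yield each non-empty line of a bgzipped text file as a list of fields.

    bgzip writes a series of concatenated gzip blocks, which Python's gzip
    module reads transparently as one stream.
    """
    with gzip.open(path, mode="rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            yield line.split(delimiter)


# Example usage (the path is a placeholder for a locally downloaded file):
# for fields in iter_bgzipped_rows("path/to/downloaded_file.txt.gz"):
#     print(fields)
```

The function streams one line at a time, so it does not need to hold a whole callset in memory. For region queries against the tabix-indexed VCF, a tabix-aware tool is the better choice.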

Relevant GitHub Link: https://github.com/TNTurnerLab/GPU_accelerated_de_novo_workflow

Link to UCSC Genome Browser Track Hub: https://data.cyverse.org/dav-anon/iplant/home/turnerlabwashu/Turner_Lab_Track_Hubs/Ng_et_al_1000G_DNV_Paper/hub.txt

 

Padhi et al. 2021, Bioinformatics

Specific versions of the reference genomes we used in our ACES paper are available on the Turner Lab Public Globus Endpoint: https://app.globus.org/file-manager?origin_id=97668938-bcc8-11eb-9d92-5f1f6f07872f&origin_path=%2F.

Relevant GitHub Link: https://github.com/tnTurnerLab/aces