20258
Scalable Sequencing Pipeline on Cloud
Objectives: We introduce our cloud-based next-generation sequencing (NGS) analysis pipeline and present benchmark results on all public whole-exome and whole-genome data from autism studies.
Methods: We built a generic workflow management system that runs on cloud platforms, then implemented the Genome Analysis Toolkit (GATK) workflow on it as our NGS pipeline. We tested this system on the Amazon Web Services (AWS) platform with all autism whole-exome and whole-genome data sets available to us, in order to examine the scalability and cost-effectiveness of our pipeline.
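As an illustration of the kind of dependency-driven scheduling a generic workflow manager performs, the following is a minimal sketch and not the system described above. The step names (`align`, `mark_duplicates`, `call_variants`, `joint_genotyping`) and both function names are hypothetical, chosen only to resemble a typical variant-calling workflow. It assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical per-sample steps; the names are illustrative assumptions,
# not taken from the pipeline described in the abstract.
PER_SAMPLE_STEPS = ["align", "mark_duplicates", "call_variants"]


def build_workflow(samples):
    """Return a dict mapping each task name to the set of tasks it depends on."""
    deps = {}
    last_tasks = set()
    for sample in samples:
        previous = None
        for step in PER_SAMPLE_STEPS:
            task = f"{step}:{sample}"
            # Each step depends only on the previous step for the same sample.
            deps[task] = {previous} if previous else set()
            previous = task
        last_tasks.add(previous)
    # The joint step waits for the final per-sample step of every sample.
    deps["joint_genotyping"] = last_tasks
    return deps


def schedule_in_waves(deps):
    """Group tasks into waves; tasks within one wave do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for i, wave in enumerate(schedule_in_waves(build_workflow(["s1", "s2"]))):
        print(i, wave)
```

In this sketch the number of waves does not grow with the number of samples; only the width of each per-sample wave does. That is the property that lets per-sample work be spread across additional cloud instances, with the joint step as the single synchronization point.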
Results: Test results showed that the pipeline scales up to a hundred exomes or genomes, which is a typical batch size in sequencing. We will also discuss our findings on autism-specific data in joint variant calling and the characteristics of rare, de novo, and knockout variants.
Conclusions: N/A