Date & Time (GMT+7) : November 5, 2022 | 11:00 - 16:30
19th International Symposium on Rice Functional Genomics (ISRFG) 2022, November 4-7, 2022
Workshop: Accelerated Workflow and Software Tools for Plant Genomes: Data Processing from Raw Data to Variants
Abstract:
The objective of this workshop is to discuss the automation and acceleration of a variant calling workflow on high-performance computing (HPC) platforms for rice and other crop genomes. The workflow is designed for HPC platforms, where the variants of 3,000 rice samples can be processed in less than one week, and it is also suitable for other system architectures, including clusters (or cloud platforms) and high-end workstations. During the workshop, we will demonstrate the workflow on example datasets from rice, maize, and soybean. We classify the workflow into four phases: (1) genome mapping, (2) variant discovery, (3) call set refinement and combining variants, and (4) variant matrices. We use the Genome Analysis Toolkit (GATK) best practices for large-scale variant discovery, together with a customized algorithm for data parallelization. Every stage of the workflow is built as an independent and flexible entity that can be executed seamlessly across different system architectures. The automation of every stage scales across multiple nodes through our novel data parallelization algorithm, which handles data distribution by splitting each chromosome into multiple chunks that can be processed independently across nodes to reduce execution time. The flexibility of the workflow supports collaboration in data sharing and data processing (e.g., collaborators running different stages according to their computational limitations), improves resource utilization (e.g., using different system architectures simultaneously), and minimizes overall execution time.
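The chromosome-splitting idea in the abstract can be illustrated with a small sketch. This is not the workshop's actual algorithm, which is not described here. It assumes the simplest possible scheme: fixed-size chunks assigned to nodes round-robin. The function names, the chunk size, and the toy chromosome names and lengths are all hypothetical.

```python
from typing import Dict, List


def split_into_chunks(chrom_lengths: Dict[str, int], chunk_size: int) -> List[str]:
    """Split each chromosome into consecutive 1-based, inclusive intervals.

    Intervals are returned as "chrom:start-end" strings. The last chunk of a
    chromosome may be shorter than chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    intervals = []
    for chrom, length in chrom_lengths.items():
        start = 1
        while start <= length:
            end = min(start + chunk_size - 1, length)
            intervals.append(f"{chrom}:{start}-{end}")
            start = end + 1
    return intervals


def assign_to_nodes(intervals: List[str], n_nodes: int) -> List[List[str]]:
    """Distribute intervals across nodes round-robin (an assumed policy)."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    return [intervals[i::n_nodes] for i in range(n_nodes)]


# Toy lengths for illustration only; these are not real genome sizes.
toy_lengths = {"chrA": 2500, "chrB": 1000}
chunks = split_into_chunks(toy_lengths, chunk_size=1000)
plan = assign_to_nodes(chunks, n_nodes=2)

for node_id, node_chunks in enumerate(plan):
    print(f"node {node_id}: {node_chunks}")
```

Each interval string uses the `chrom:start-end` form that GATK tools accept through the `-L` option, so one plausible use is to launch one job per chunk and merge the per-chunk outputs afterwards. A production scheme would likely also balance chunks by expected workload rather than by count alone.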
Agenda:
| Sl. No. | Details | Speaker | Duration |
|---|---|---|---|
| | Overview of KAUST Rice genome project and data processing | | |
| 1. | Opening Remarks & Keynote | Prof. Rod A Wing | 15 Min |
| 2. | KAUST Computational Resources | Dr. Saber Feki | 15 Min |
| 3. | Overview of 3k Rice Genome project | Dr. Yong Zhou | 10 Min |
| 4. | Overview of Accelerated workflow for variant discovery | Dr. Nagarajan Kathiresan | 10 Min |
| | Coffee break | | 10 Min |
| | Accelerated workflow for data processing and demos | | |
| 1. | Tools for data preprocessing and demo (Phase #1) | Dr. Nagarajan Kathiresan, Dr. Yong Zhou | 30 Min |
| | Coffee break | | 10 Min |
| 2. | Tools for Variant discovery and demo (Phase #2) | | 30 Min |
| | Coffee break | | 10 Min |
| 3. | Data parallelization methods, call set refinement and demo (Phase #3) | | 50 Min |
| | Coffee break | | 10 Min |
| 4. | Variant tables and demo (Phase #4) | | 30 Min |
| 5. | Open discussion and feedback | | 10 Min |