Whole Genome Resequencing Pipeline v2.0
New in v2.0: the SAMtools and Picard dependencies have been replaced by built-in MATLAB capabilities, so the pipeline now requires only BWA and GATK.
Automates processing of single-end whole genome resequencing (WGRS) data: pre-installed dependencies map reads from FASTQ files to a reference and realign indels. BWA must be installed and available on the system path, and GenomeAnalysisTK.jar must be available on the MATLAB path. If no arguments are provided, the user is prompted for one or more FASTQ read files and a reference FASTA. Developers are encouraged to adapt this template to their needs. The pipeline steps are:
(0a) FM-index reference (BWA index)
(0b) Create FASTA index (Internal fai)
(0c) Create sequence dictionary (Internal dict)
(1) Map reads (BWA mem)
(2) Convert SAM to BAM (MATLAB sam2bam)
(3) Sort BAM (MATLAB bamsort)
(4) Index BAM (MATLAB BioMap)
(5) Discover indels (GATK RealignerTargetCreator)
(6) Realign indels (GATK IndelRealigner)
(7) Cleanup
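For orientation, the external-tool steps above (0a, 1, 5 and 6) can be sketched as a dry run that only builds the command lines and executes nothing. This is a minimal Python sketch, not the submission's MATLAB code. It assumes a GATK 3.x-style invocation (`java -jar GenomeAnalysisTK.jar -T ...`). The function name `external_commands` and the intermediate file names (`.sam`, `.sorted.bam`, `.intervals`, `.realigned.bam`) are hypothetical and chosen only for illustration. Steps 0b, 0c, 2, 3 and 4 run inside MATLAB and are not shown.

```python
import os
import shlex


def external_commands(reference, fastq, gatk_jar="GenomeAnalysisTK.jar"):
    """Build (step, argv, stdout_file) tuples for the BWA and GATK steps.

    Nothing is executed. File names derived from the FASTQ stem are
    hypothetical placeholders, not the names the pipeline itself uses.
    """
    stem = os.path.splitext(fastq)[0]
    sam = stem + ".sam"                 # written by step 1 (bwa mem prints SAM to stdout)
    sorted_bam = stem + ".sorted.bam"   # assumed result of the MATLAB steps 2-4
    intervals = stem + ".intervals"     # realignment targets found in step 5
    realigned = stem + ".realigned.bam" # final output of step 6
    gatk = ["java", "-jar", gatk_jar]   # GATK 3.x-style launcher (assumption)
    return [
        ("0a", ["bwa", "index", reference], None),
        ("1", ["bwa", "mem", reference, fastq], sam),
        ("5", gatk + ["-T", "RealignerTargetCreator",
                      "-R", reference, "-I", sorted_bam,
                      "-o", intervals], None),
        ("6", gatk + ["-T", "IndelRealigner",
                      "-R", reference, "-I", sorted_bam,
                      "-targetIntervals", intervals,
                      "-o", realigned], None),
    ]


if __name__ == "__main__":
    # Print each command as a shell line for inspection.
    for step, argv, stdout_file in external_commands("ref.fasta", "reads.fastq"):
        line = " ".join(shlex.quote(arg) for arg in argv)
        if stdout_file is not None:
            line += " > " + shlex.quote(stdout_file)
        print(f"({step}) {line}")
```

Printing the commands before running anything makes it easy to check that BWA and the GATK jar are being called with the intended reference and read files.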
Cite as
Turner Conrad (2024). Whole Genome Resequencing Pipeline v2.0 (https://www.mathworks.com/matlabcentral/fileexchange/46078-whole-genome-resequencing-pipeline-v2-0), MATLAB Central File Exchange. Retrieved .
Platform Compatibility
Windows macOS Linux
Categories
- Industries > Biotech and Pharmaceutical > Genomics and Next Generation Sequencing
- Sciences > Physics > Biological Physics
Version | Published | Release Notes |
---|---|---|
1.1.0.0 | | See CHANGELOG section in code |
1.0.0.0 | | |