|
|
|
< < | NGS Aligner To Do List |
> > | Specifications |
| |
|
< < | Features
- Prevent losing SNPs at the end of reads from clipping
- Rescue transcriptome reads with huge chunks of mismatched introns
|
> > | File Formats
- Reads input as FASTQ
- Alignments output as SAM
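As a rough illustration of the two file formats above, the following Python sketch reads four-line FASTQ records and writes each one as an unmapped SAM line. It is not the aligner's own I/O code: the names `FastqRecord`, `read_fastq` and `unmapped_sam_line` are hypothetical, the reader assumes unwrapped letter-space FASTQ (one sequence line and one quality line per record), and no SAM header is produced.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read (hypothetical container, not part of the project)."""
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from an unwrapped, four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # Keep only the text before the first space as the read name.
        yield FastqRecord(header[1:].split(" ", 1)[0], sequence, quality)


def unmapped_sam_line(record: FastqRecord) -> str:
    """Format a read as a SAM line with the 11 mandatory fields, flagged unmapped."""
    fields = [
        record.name,      # QNAME
        "4",              # FLAG: 0x4 = segment unmapped
        "*",              # RNAME
        "0",              # POS
        "0",              # MAPQ
        "*",              # CIGAR
        "*",              # RNEXT
        "0",              # PNEXT
        "0",              # TLEN
        record.sequence,  # SEQ
        record.quality,   # QUAL
    ]
    return "\t".join(fields)


if __name__ == "__main__":
    demo = io.StringIO("@read1 sample\nACGTACGT\n+\nIIIIIIII\n")
    for rec in read_fastq(demo):
        print(unmapped_sam_line(rec))
```

A real aligner would fill in FLAG, RNAME, POS, MAPQ and CIGAR from the alignment. Colour-space SOLiD reads would need their own handling, which this sketch does not attempt.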
General Needs
- Works in letter space with RNA-Seq data
- Works in colour space with SOLiD data
- Fast
- Small memory footprint
- Align against reference with dynamic splice graph
- Minimize the size of the driver (i.e. abstract functions into classes)
Read Specific Needs
- Works with 75bp reads, up to 128bp
- Rescue transcriptome reads with huge chunks of mismatched introns (e.g.
NNNIIIII ; I = intron)
|
|
- Split transcriptome reads to align across an intron junction
|
|
> > |
- Handle residual introns within exome reads (e.g.
NNNIINNN )
- Handle residual introns flanking exome reads (e.g.
NNNNNNII )
- Prevent losing SNPs at the end of reads from clipping (e.g.
NNNNNXN ; N = base; X = SNP)
Alignment Specific Needs
- User-specified method of handling multimapped reads (i.e. reads that map equally well to multiple positions)
- Supports extended CIGAR encoding (including P for Padding)
- Confirm if traceback ever needs to return multiple best alignments
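To make the extended CIGAR requirement concrete, here is a minimal Python sketch that parses a CIGAR string using the SAM operation set (`M`, `I`, `D`, `N`, `S`, `H`, `P`, `=`, `X`) and reports how many reference and read bases it covers. The function names are hypothetical and the code only illustrates the encoding; it is not the project's traceback or output code.

```python
import re

# One CIGAR token: a length followed by a single operation character.
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that advance along the reference or the read, per the SAM format.
CONSUMES_REFERENCE = frozenset("MDN=X")
CONSUMES_QUERY = frozenset("MIS=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs; '*' means unavailable."""
    if cigar == "*":
        return []
    ops = []
    position = 0
    for match in CIGAR_TOKEN.finditer(cigar):
        if match.start() != position:
            raise ValueError(f"unexpected text in CIGAR {cigar!r}")
        ops.append((int(match.group(1)), match.group(2)))
        position = match.end()
    if position != len(cigar) or not ops:
        raise ValueError(f"malformed CIGAR {cigar!r}")
    return ops


def reference_span(ops):
    """Number of reference bases covered, including skipped regions (N)."""
    return sum(length for length, op in ops if op in CONSUMES_REFERENCE)


def query_length(ops):
    """Number of read bases described, including soft-clipped bases (S)."""
    return sum(length for length, op in ops if op in CONSUMES_QUERY)


if __name__ == "__main__":
    ops = parse_cigar("3S10M2P1I500N20M")
    print(ops)
    print(reference_span(ops), query_length(ops))
```

Padding (`P`) and hard clipping (`H`) consume neither the reference nor the read, so they change neither total. A spliced read that crosses an intron junction shows up as an `N` operation between two matched blocks.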
|
| |
|
< < | Implementation
- Decide if traceback ever needs to return more than 1 alignment
|
> > | Quality Specific Needs
- Alignment Score (from alignment)
- Mapping Quality (as defined by MAQ, using Mosaik's implementation)
- Uniqueness (100% divided by the number of best mapped positions)
- Fragment Quality (paired-end (PE) specific value: probability that the mapped location of the fragment flanked by the read pair is correct or incorrect)
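Two of the quality values above reduce to simple arithmetic, sketched here in Python. The uniqueness function follows the definition in the list directly. The second function shows only the Phred scaling on which MAQ-style mapping qualities are reported (Q = -10 log10 of the probability that the mapping is wrong); it does not reproduce how MAQ or Mosaik estimate that probability. Both function names are hypothetical.

```python
import math


def uniqueness_percent(best_position_count):
    """100% divided by the number of equally good best mapped positions."""
    if best_position_count < 1:
        raise ValueError("a mapped read has at least one best position")
    return 100.0 / best_position_count


def phred_scaled(error_probability):
    """Convert a probability of a wrong mapping into a rounded Phred score."""
    if not 0.0 < error_probability <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return round(-10.0 * math.log10(error_probability))


if __name__ == "__main__":
    print(uniqueness_percent(1))  # uniquely mapped read
    print(uniqueness_percent(4))  # read with four equally good positions
    print(phred_scaled(0.001))    # 1 in 1000 chance the mapping is wrong
```

Under this definition a uniquely mapped read scores 100 and a read with four equally good positions scores 25, so the value falls as multimapping increases.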
|
| |
|
< < | Compilation |
> > | Build and Compilation |
|
- Have CMake detect CPU architecture and link appropriate libraries
|
|
> > |
- Build as "Release" in ccmake for full optimization
|
| |