> > | Some notes about output...
The below three are considered "identical" for our purposes
baligner
SLXA-EAS1_89:1:1:672:654/1 0 gi|170079663|ref|NC_010473.1| 4017473 65 35M * 0 0 GCTACGGAATAAAACCAGGAACAACAGACCCAGCA cccccccccccccccccccc]c``cVcZccbSYbY NM:i:0 MD:Z:35 AS:i:35
bwa
SLXA-EAS1_89:1:1:672:654 0 gi|170079663|ref|NC_010473.1| 4017473 37 35M * 0 0 GCTACGGAATAAAACCAGGAACAACAGACCCAGCA cccccccccccccccccccc]c``cVcZccbSYbY XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:35
bowtie
SLXA-EAS1_89:1:1:672:654/1 0 gi|170079663|ref|NC_010473.1| 4017473 255 35M * 0 0 GCTACGGAATAAAACCAGGAACAACAGACCCAGCA cccccccccccccccccccc]c``cVcZccbSYbY XA:i:0 MD:Z:35 NM:i:0
Some notes about the columns:
- name: may vary slightly only
- flag: should be identical
- reference: should be identical
- position: identical for unique reads only (not for multimap, to see if this is true, check the
XT tag at the end bwa output
-
XT:A:U means the read is a single Unique read, which XT:A:R means it's a Randomly selected multimap read
- mapping quality: different (our numbers should be nearly identical to MOSAIK, another aligner)
- cigar: identical for unique reads only (an encoding of the aligned read)
- mate reference: * for single end
- mate position: 0 for single end
- insert size: 0 for single end
- sequence: should be identical
- quality: should be identical
- everything beyond is a tag, which vary greatly between programs but when present...
-
NM is the number of mismatches, and should be identical
-
MD is an encoding of the aligned reference, and should be identical
A more interesting example:
baligner
SLXA-EAS1_89:1:1:713:81/1 0 gi|170079663|ref|NC_010473.1| 4058875 5 17M1D18M * 0 0 AAAAGCTGGGTGAGTGGCGATGACGCGCAATAAAA cccccccccccccccccccccccccccJcc^^bb^ NM:i:1 MD:Z:17^G18 AS:i:32
bwa
SLXA-EAS1_89:1:1:713:81 0 gi|170079663|ref|NC_010473.1| 4058875 37 15M1D20M * 0 0 AAAAGCTGGGTGAGTGGCGATGACGCGCAATAAAA cccccccccccccccccccccccccccJcc^^bb^ XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:0 XO:i:1 XG:i:1 MD:Z:15^G20
bowtie
SLXA-EAS1_89:1:1:713:81/1 4 * 0 0 * * 0 0 AAAAGCTGGGTGAGTGGCGATGACGCGCAATAAAA cccccccccccccccccccccccccccJcc^^bb^ XM:i:0
In this case, bowtie returns an unmapped hit (it doesn't allow indels by default), while bwa returns an identical unique hit. Note that the CIGAR and MD are slightly different, but upon further investigation, the different is due to where the gap is introduced: we place the gap on the last G in the triplet of G's, while bwa places the gap on the first.
And finally, a multimap example, when allow_multimap is true :
baligner
SLXA-EAS1_89:1:1:721:668/1.1 0 gi|170079663|ref|NC_010473.1| 521562 62 35M * 0 0 GCTGTAGATCTGGAAATCGCAACGGAGGAAGAAAG ccccccccccccccccccccccccccccccbbbbb NM:i:0 MD:Z:35 AS:i:35
SLXA-EAS1_89:1:1:721:668/1.2 0 gi|170079663|ref|NC_010473.1| 634822 62 35M * 0 0 GCTGTAGATCTGGAAATCGCAACGGAGGAAGAAAG ccccccccccccccccccccccccccccccbbbbb NM:i:0 MD:Z:35 AS:i:35
bwa
SLXA-EAS1_89:1:1:721:668 0 gi|170079663|ref|NC_010473.1| 521562 0 35M * 0 0 GCTGTAGATCTGGAAATCGCAACGGAGGAAGAAAG ccccccccccccccccccccccccccccccbbbbb XT:A:R NM:i:0 X0:i:2 X1:i:2 XM:i:0 XO:i:0 XG:i:0 MD:Z:35 XA:Z:gi|170079663|ref|NC_010473.1|,+634822,35M,0;gi|170079663|ref|NC_010473.1|,+633775,35M,1;gi|170079663|ref|NC_010473.1|,+520515,35M,1;
bowtie
SLXA-EAS1_89:1:1:721:668/1 0 gi|170079663|ref|NC_010473.1| 521562 255 35M * 0 0 GCTGTAGATCTGGAAATCGCAACGGAGGAAGAAAG ccccccccccccccccccccccccccccccbbbbb XA:i:0 MD:Z:35 NM:i:0
Note that our programs finds two exact match hits, and lists them both with unique names.
bwa is slightly interesting:
- it randomly selects one of the two positions we found
-
XT:A:R means that it's a Random multimap
-
X0:i:2 means that it found 2 exact matches (0 mismatches)
-
X1:i:2 means that it found 2 matches with 1 mismatch
-
XA:Z:gi|170079663|ref|NC_010473.1|,+634822,35M,0;gi|170079663|ref|NC_010473.1|,+633775,35M,1;gi|170079663|ref|NC_010473.1|,+520515,35M,1; lists the three other matches (the 1 other perfect match + the two 1 mismatch hits)
bowtie just picks one, and doesn't tell you that it's a multimap at all... (consequently, bowtie runs twice as fast as bwa) |