> > | Concerning the Class Hierarchy
As the library starts to take shape, we have to decide upon a class hierarchy which the project will be built upon. I imagine that changing the hierarchy down the road will be difficult, so in hopes of avoiding that, let's commit ourselves to a single hierarchy.
Some history about the existing hierarchy directories:
- Originally, there were only IO, alignment, and index
- IO would read in the reference and reads
- The index (Kmer) would return positions in the reference that matched the first k bases of a read
- The aligner would align the entire read to the reference at the specified position
- Then the index was swapped... the aligner was completely replaced when searching for exact reads
- The index would "locate" the position in the reference where the entire read was found
- Inexact reads were supported, leading to the need for Mapper classes
- Would "map" reads to the reference, but allowed some form of variation (e.g. mismatches, gaps, etc...)
- Some required aligner classes, bringing back the need for them
- To reduce the code seen in /tools/, Drivers were created
- Essentially, these took in a mapper, input and output classes, and ran through every read in the given file
- Pairend classes were introduced to handle the post-processing to make reads paired...
- These were fed into some specific Drivers, and work independently from index and mappers
As you can see, the entire hierarchy wasn't carefully planned, but rather extended as the need arose... so I wouldn't be surprised if there was room for improvement... or a complete restructuring.
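To make the original IO / index / alignment split from the history above concrete, here is a rough Python sketch of that flow. It is not the library's code: every name in it (`build_kmer_index`, `count_mismatches`, `seed_and_align`) is made up for illustration, and the "aligner" is just an ungapped mismatch count standing in for whatever the real aligner did.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def build_kmer_index(reference: str, k: int) -> Dict[str, List[int]]:
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return dict(index)


def count_mismatches(reference: str, read: str, pos: int) -> Optional[int]:
    """Stand-in for the aligner: compare the whole read at pos, no gaps."""
    window = reference[pos:pos + len(read)]
    if len(window) < len(read):
        return None  # the read would run off the end of the reference
    return sum(1 for a, b in zip(window, read) if a != b)


def seed_and_align(reference: str, index: Dict[str, List[int]],
                   read: str, k: int) -> List[Tuple[int, int]]:
    """Look up the first k bases of the read, then check the full read at each hit."""
    hits = []
    for pos in index.get(read[:k], []):
        mismatches = count_mismatches(reference, read, pos)
        if mismatches is not None:
            hits.append((pos, mismatches))
    return hits


reference = "ACGTACGTTAGC"
index = build_kmer_index(reference, 4)
print(seed_and_align(reference, index, "ACGTTA", 4))  # [(0, 2), (4, 0)]
```

Each hit is a (position, mismatch count) pair, so the read above is found exactly at position 4 and with two mismatches at position 0.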
Some personal concerns:
- Some classes in IO are actually Types... these could be pulled out
- The creation of every Mapper class requires the addition of a new Locate function in the Index class
- Should the index simply be a container? And the mapper classes take care of the actual "locating", using the index?
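One possible shape for the "index as a container" idea in the last concern, sketched in Python only to make the question easier to discuss. The class and method names (`KmerIndex`, `lookup`, `ExactMapper`, `MismatchMapper`, `locate`) are hypothetical and not taken from the existing code. The point is that the index exposes one generic lookup, and each mapper owns its own locating logic, so adding a mapper would not touch the index class.

```python
class KmerIndex:
    """Index as a plain container: it stores k-mer positions and nothing else."""

    def __init__(self, reference, k):
        self.reference = reference
        self.k = k
        self._positions = {}
        for pos in range(len(reference) - k + 1):
            self._positions.setdefault(reference[pos:pos + k], []).append(pos)

    def lookup(self, kmer):
        """Return every reference position where this k-mer starts."""
        return self._positions.get(kmer, [])


class ExactMapper:
    """Locates reads that occur in the reference without any variation."""

    def __init__(self, index):
        self.index = index

    def locate(self, read):
        ref = self.index.reference
        seeds = self.index.lookup(read[:self.index.k])
        return [pos for pos in seeds if ref.startswith(read, pos)]


class MismatchMapper:
    """Locates reads allowing a bounded number of mismatches (no gaps).

    Simplification: the first k bases are still used as an exact seed,
    so a mismatch inside the seed is not found by this sketch.
    """

    def __init__(self, index, max_mismatches):
        self.index = index
        self.max_mismatches = max_mismatches

    def locate(self, read):
        ref = self.index.reference
        hits = []
        for pos in self.index.lookup(read[:self.index.k]):
            window = ref[pos:pos + len(read)]
            if len(window) != len(read):
                continue  # the read would run off the end of the reference
            mismatches = sum(a != b for a, b in zip(window, read))
            if mismatches <= self.max_mismatches:
                hits.append(pos)
        return hits


index = KmerIndex("ACGTACGTTAGC", 4)
print(ExactMapper(index).locate("ACGTTA"))        # [4]
print(MismatchMapper(index, 2).locate("ACGTTA"))  # [0, 4]
```

The trade-off is that each mapper now depends on what the container exposes, so `lookup` (or whatever the real primitive turns out to be) would have to be general enough for every kind of index we plan to support.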
-- Main.jujubix - 26 May 2010 |