Library Construction & Finishing


Library Construction

To sequence a genome or genomic region, we must first construct a shotgun library. DNA is randomly sheared into small pieces using a hydrodynamic shearing device. These pieces are run on an agarose gel to select fragments of the appropriate length, which are then inserted into bacterial vector systems. The bacteria then amplify many copies of the small piece of the genome that each vector contains. Shotgun libraries come in several insert sizes: small (3-4 kilobases), medium (6-10 kilobases), large (28-40 kilobases) and very large (60-250 kilobases). These libraries break the larger genome sequencing problem down into small pieces of DNA that we can sequence in the laboratory.
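
As a rough illustration of the size-selection step, the sketch below keeps only the fragments whose length falls inside the window for a chosen library class. The size windows come from the text above; the names LIBRARY_WINDOWS_KB and size_select are hypothetical and chosen only for this example, and real size selection happens on a gel, not in software.

```python
from typing import Dict, List, Tuple

# Insert-size windows in kilobases, taken from the library classes listed above.
# The dictionary name and keys are illustrative assumptions.
LIBRARY_WINDOWS_KB: Dict[str, Tuple[float, float]] = {
    "small": (3.0, 4.0),
    "medium": (6.0, 10.0),
    "large": (28.0, 40.0),
    "very large": (60.0, 250.0),
}


def size_select(fragments_kb: List[float], library: str) -> List[float]:
    """Return the fragments whose length lies inside the library's window."""
    low, high = LIBRARY_WINDOWS_KB[library]
    return [length for length in fragments_kb if low <= length <= high]


# Hypothetical fragment lengths (kb) after shearing.
sheared = [1.2, 3.5, 3.9, 5.0, 7.4, 12.0, 35.0]
small_inserts = size_select(sheared, "small")    # [3.5, 3.9]
medium_inserts = size_select(sheared, "medium")  # [7.4]
```

Fragments that fall between windows (5.0 kb and 12.0 kb here) are not used for any of the listed library classes.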

Bacterial Colony Picking

To sequence a shotgun library, individual bacterial colonies must be transferred from an agar plate into a microtiter plate. This can be done either manually using toothpicks or automatically using a Genetix colony-picking robot. A single colony is transferred from the agar plate into a single well that has been filled in advance with growth media. The microtiter plate is then grown overnight, glycerol is added to protect the cells during freezing, and the plate is sealed and stored in a -80 degrees Celsius freezer.

Genome Improvement and Finishing

Genome finishing and sequence improvement begin with the draft assembly of Sanger shotgun reads (500-800 base pairs in length) taken from the ends of clones in 3 kb and 6 kb plasmid, fosmid and bacterial artificial chromosome (BAC) libraries. The draft assembly uses computational algorithms that align the individual reads into a mosaic forming large linear segments of DNA sequence. This initial assembly contains gaps, low-quality regions and incorrect joins. Genome improvement and finishing takes the draft assembly and corrects these errors through additional computational analysis and laboratory experiments. Methods for making these improvements include primer walks with a variety of chemistries and templates, transposons, and shatters of clones (the latter two techniques being essentially subprojects of the subprojects). Additional corrections and validations are made during manual inspection of the sequence. Efforts are currently underway to explore practical uses of next-generation sequencing technologies for the extremely large, multi-copy plant genomes.
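
To make the idea of locating problem areas in a draft assembly concrete, here is a minimal sketch that scans one assembled sequence for gaps and low-quality stretches. It assumes, purely for illustration, that gaps are written as runs of "N" and that each base carries a Phred-style quality score with 20 as the cutoff; the function names find_gaps and find_low_quality are hypothetical and do not describe the actual finishing software.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end), 0-based coordinates


def find_gaps(sequence: str) -> List[Interval]:
    """Return intervals covering each run of 'N' (assumed gap symbol)."""
    gaps: List[Interval] = []
    start = None
    for i, base in enumerate(sequence.upper()):
        if base == "N":
            if start is None:
                start = i          # a new gap begins here
        elif start is not None:
            gaps.append((start, i))  # the gap ended at the previous base
            start = None
    if start is not None:            # sequence ends inside a gap
        gaps.append((start, len(sequence)))
    return gaps


def find_low_quality(quals: List[int], threshold: int = 20) -> List[Interval]:
    """Return intervals where every quality score is below the threshold."""
    regions: List[Interval] = []
    start = None
    for i, q in enumerate(quals):
        if q < threshold:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(quals)))
    return regions


# Hypothetical draft contig and per-base quality scores.
draft = "ACGTNNNNACGTNN"
scores = [40, 40, 10, 5, 40, 15]
gap_targets = find_gaps(draft)             # [(4, 8), (12, 14)]
weak_targets = find_low_quality(scores)    # [(2, 4), (5, 6)]
```

Intervals like these could then serve as a worklist for primer walks or other finishing experiments; detecting incorrect joins requires additional evidence such as paired-end read placement and is not covered by this sketch.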

Genome finishing and improvement moves beyond the draft assembly by targeting prioritized areas of the genome. Each target area is isolated as a subset of the whole to give a more workable subproject size; subprojects range from fosmid or BAC size up to 2 Mb. Targets are selected for a variety of reasons, including gene-rich areas, quantitative trait loci (QTL) regions, or regions with a significant number of sequence gaps and unresolved repetitive sequence. The objective is to completely resolve each base call in the target subproject and then incorporate that "complete" sequence back into the entire genome assembly, improving on the draft assembly. The finished or improved genome sequence more accurately reflects the actual genome and can be used by researchers as a reference sequence for further study.