Source: bioresearch.byu.edu
Next-Generation Sequencing
Computational Sciences Laboratory, Brigham Young University

Problem Statement
• Map next-generation sequence reads with variable nucleotide confidence to a model reference genome that may be different from the subject genome.
  ▫ Speed
      Tens of millions of reads to a 3 Gbp genome
  ▫ Accuracy
      Mismatches included?
      Repetitive regions
  ▫ Visualization

Workflow

Indexing the genome
• Fast lookup of possible hit locations for the reads
  ▫ Hashing groups locations in the genome that have similar sequence content
      A k-mer hash of exact matches in the genome can be used to narrow down possible match locations for reads
  ▫ Sorting genome locations provides for content addressing of the genome
• GNUMap uses indexing of all 10-mers in the genome as seed points for read mapping

Building the Hash Table
• Sliding window indexes all locations in the genome
[Figure: hash table entries AACCAT and CATAC pointing to their locations in the genome sequence ACTGAACCATACGGGTACTGAACCATGAATGGCACCTATACGAGATACGC]
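
The sliding-window indexing described above can be sketched in a few lines of Python. This is only an illustrative sketch, not GNUMap's actual implementation: the function names `build_kmer_index` and `seed_locations` are hypothetical, and the example uses k = 6 on the short sequence from the slide rather than the 10-mers GNUMap indexes on a full genome.

```python
from collections import defaultdict

def build_kmer_index(genome, k):
    """Slide a window of width k over the genome and record, for each
    k-mer, every 0-based position at which it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return dict(index)

def seed_locations(read, index, k):
    """Use exact k-mer hits as seeds: each hit implies a candidate
    genome position for the start of the read (hit position minus
    the k-mer's offset within the read)."""
    candidates = set()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], []):
            start = pos - offset
            if start >= 0:  # discard seeds that would start before the genome
                candidates.add(start)
    return sorted(candidates)

# Sequence taken from the slide; k = 6 is an assumption for this small example.
GENOME = "ACTGAACCATACGGGTACTGAACCATGAATGGCACCTATACGAGATACGC"
index = build_kmer_index(GENOME, 6)

print(index["AACCAT"])                         # [4, 20]
print(seed_locations("GAACCATAC", index, 6))   # [3, 19]
```

Each candidate position would still need a full alignment step to decide whether the read really maps there; the index only narrows down where to look.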