Algorithms for DNA Sequencing

We will learn computational methods -- algorithms and data structures -- for analyzing DNA sequencing data. We will learn a little about DNA, genomics, and how DNA sequencing is used. We will use Python to implement key algorithms and data structures and to analyze real genomes and DNA sequencing datasets.

About The Course

DNA sequencing is now a ubiquitous tool in life science. You can observe this trend just by reading the news. This course examines the computational problems that come with this onslaught of DNA sequencing data. How do we take a huge collection of DNA "puzzle pieces" and assemble them into a genome? How do we make it quick and easy to find a DNA "needle" in an enormous genomic "haystack"? We will spend the bulk of the course understanding the algorithms and data structures that underlie software tools for analyzing sequencing data. The course is also an opportunity to practice programming skills and gain exposure to basic algorithms and data structures.
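To give a flavor of the "needle in a haystack" problem, here is a minimal Python sketch of naive exact matching, which checks every offset of a text for an occurrence of a pattern. The function name `naive_match` and the toy sequences are made up for illustration and are not taken from the course materials; real genomes are far too large for this approach to be practical, which is what motivates faster methods.

```python
def naive_match(pattern, text):
    """Return every offset at which pattern occurs exactly in text."""
    occurrences = []
    # Try each possible alignment of the pattern against the text.
    for i in range(len(text) - len(pattern) + 1):
        if text[i:i + len(pattern)] == pattern:
            occurrences.append(i)
    return occurrences


# Hypothetical toy data: a short "genome" and a short "read".
genome = "ACGTTGCATGCATGA"
read = "GCATG"
print(naive_match(read, genome))  # prints [5, 9]
```

The two reported offsets overlap, since the second occurrence begins on the last character of the first. The sketch compares up to len(pattern) characters at each of roughly len(text) offsets, so its running time grows with the product of the two lengths.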

Frequently Asked Questions

  • What resources will I need for this class?

You will need to be able to write and run Python programs. For information on how to get started with Python, see the Python.org getting started guide.

IPython is a more interactive way to build and run Python programs. Many of our practical videos use IPython notebooks. While you might find it useful to develop Python code within IPython, you do not need it for this course.

Recommended Background

You should be comfortable programming in Python, or be willing and able to learn as you go. Some undergraduate-level computer science background is helpful but not necessary. No background in biology or genomics is necessary.