Computer Science for Bioinformatics (140.636)
1st term, 2018-2019
MWF 1:30-2:20 in W5008
F 10:30-11:20 in W4013
Instructor: Fernando Pineda
Course materials
Description
Please read ALL of this document carefully.
This course uses multiple programming languages to introduce the skills and concepts needed to process and interpret data from high-throughput technologies in the biological sciences. The course focuses on generally applicable computer-science concepts rather than statistical or biological concepts. Lectures with live computer demonstrations and hands-on laboratories will be used to introduce key concepts. These will be reinforced and extended with weekly readings and programming exercises. Exercises and examples will draw heavily from biological sequence analysis, proteomics, genetics, and computational biology. Occasional guest lecturers will present case studies. Students will be introduced to the wealth of bioinformatics and computational software-development resources available on the World Wide Web.
Students will be introduced to necessary fundamentals in computer science, including: (1) salient machine and network basics, (2) data representation, data structures, algorithms, and complexity, (3) parsing and pattern matching, (4) programming languages (HTML, Perl, Python, SQL, regular expressions), (5) style and best practices, and (6) object-oriented programming.
Applied topics to be covered include: (1) biological sequence analysis, (2) middleware, (3) how to use scripts in Bash, Perl, and Python to manage and process datasets, (4) relational databases, including automated interaction with local (e.g. SQLite and MySQL) and biological (e.g. GenBank) databases, (5) high-performance computing, (6) parallel processing, and (7) simulation.