10  Sequence alignments

10.1 Why do we align sequences?

We align sequences to search for homology and to measure identity.

10.2 What is homology?

10.3 Pairwise alignment algorithms

10.3.1 Hamming distance
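
As a minimal sketch, the Hamming distance between two sequences of equal length can be computed by counting the positions at which they differ. The function name `hamming_distance` and the example sequences are illustrative assumptions, not part of the course material.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        # The Hamming distance is only defined for sequences of equal length.
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Illustrative sequences: they differ at positions 3 and 6.
print(hamming_distance("GATTACA", "GACTATA"))
```

Because it compares position by position, this measure cannot handle insertions or deletions, which is the motivation for the edit distance.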

10.3.2 Edit distance

10.3.2.1 Dynamic programming
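
The following is a sketch of how dynamic programming can be used to compute the edit (Levenshtein) distance, assuming unit costs for substitutions, insertions and deletions. The function name `edit_distance` and the unit costs are assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs, filled row by row."""
    rows, cols = len(a) + 1, len(b) + 1
    # dp[i][j] holds the distance between a[:i] and b[:j].
    dp = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dp[i][0] = i  # delete all i characters of a
    for j in range(cols):
        dp[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # match or substitution
            )
    return dp[-1][-1]


print(edit_distance("kitten", "sitting"))
```

Each cell depends only on its upper, left and upper-left neighbours, so the table is filled once in O(len(a) * len(b)) time instead of enumerating every possible sequence of edits.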

10.3.3 Needleman-Wunsch (global alignment)
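
A minimal sketch of Needleman-Wunsch global alignment is shown below. It assumes a simple scoring scheme (match = 1, mismatch = -1, linear gap penalty = -1) rather than a substitution matrix, and the function name `needleman_wunsch` is illustrative.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    n, m = len(a), len(b)

    def sub(x, y):
        return match if x == y else mismatch

    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,  # gap in b
                score[i][j - 1] + gap,  # gap in a
            )

    # Trace back from the bottom-right corner to the top-left corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

When several alignments share the best score, this traceback prefers the diagonal move, so it returns only one of the optimal alignments.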

10.3.4 Smith-Waterman (local alignment)
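
A minimal sketch of Smith-Waterman local alignment follows. Compared with the global case, cell scores are never allowed to drop below zero, and the traceback starts at the highest-scoring cell and stops at the first zero. The scoring values (match = 2, mismatch = -1, gap = -2) and the function name `smith_waterman` are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    n, m = len(a), len(b)

    def sub(x, y):
        return match if x == y else mismatch

    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                0,  # start a new local alignment here
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-scoring cell is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# The two illustrative sequences share only the region ACGT.
print(smith_waterman("TTTACGTAAA", "GGACGTCC"))
```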

10.4 The genetic code and scoring matrices

10.5 BLAST and its families

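PSI-BLAST (Position-Specific Iterated BLAST) is used to find true homologs, including distant ones. It runs BLAST repeatedly: the significant hits from each round are used to refine a position-specific scoring matrix, which is then used for the next round, so that the matrix is polished over several iterations.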

10.6 Multiple sequence alignments

Challenge

Your professor is working with species from the genus Bacillus and wants to align an orthologous gene from the genomes of 10 different isolates. He gives you the GenBank accession numbers of these isolates and asks you to select one orthologous gene (nucleotide sequence) that you think might be useful for differentiating the bacterial isolates, and to align those genes with the method you consider best. Finally, he asks you to document each step and to send him the sequence alignment file in FASTA format, along with the general alignment statistics in a TXT file (length, count of each nucleotide, and any other statistics you consider important).

Accessions: GCA_012225885.1, GCA_000196735.1, GCA_000742895.1, GCA_001584335.1, GCA_000007825.1, GCA_000832905.1, GCA_000008425.1, GCA_000507105.1, GCA_000832605.1, GCA_900186955.1
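
As a starting point for the statistics part of the challenge, the sketch below computes the alignment length, the per-sequence character counts (including gaps) and the number of fully conserved columns from an aligned FASTA. The function names `parse_fasta` and `alignment_stats`, the choice of statistics and the toy records are assumptions for illustration, and the example data are not real Bacillus sequences.

```python
from collections import Counter


def parse_fasta(lines):
    """Parse FASTA lines into {record_name: sequence} (sequence uppercased)."""
    records, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = (line[1:].split() or ["unnamed"])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {k: "".join(v) for k, v in records.items()}


def alignment_stats(records):
    """Summarise an alignment in which every sequence has the same length."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # A column is conserved when it has one distinct character and no gap.
    conserved = sum(
        1 for col in zip(*records.values())
        if len(set(col)) == 1 and "-" not in col
    )
    return {
        "n_sequences": len(records),
        "alignment_length": lengths.pop(),
        "conserved_columns": conserved,
        "counts": {name: Counter(seq) for name, seq in records.items()},
    }


# Toy alignment for demonstration only.
toy = ">iso1\nACGT-A\n>iso2\nACGTTA\n>iso3\nACCT-A\n".splitlines()
stats = alignment_stats(parse_fasta(toy))
print(stats["alignment_length"], stats["conserved_columns"])
```

To use it on a real alignment, the lines can be read from the FASTA file produced by the alignment tool, and the resulting dictionary written to the TXT file.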