Pairwise Alignment tool
This tool lets you drop one sequence, or a selected set of sequences, onto a reference sequence. The dropped sequences are then aligned to the reference sequence, preserving whatever gaps are present in the reference sequence. If gaps need to be inserted into the reference sequence in the process, they will also be inserted into the other, non-selected sequences.
The dropdown menu for this tool allows you to change the gap opening cost (default 8) and the gap extension cost (default 3) within a sequence, as well as the equivalent costs at the ends of the sequence (default 2 and 2, respectively).
The default substitution costs are:
- for DNA and RNA data, 5 for a transition and 10 for a transversion
- for protein data, 5 for each substitution.
These costs can be changed using the Substitution Costs dialog box available from the tool's dropdown menu.
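To make the default costs concrete, here is a minimal Python sketch that scores an already-aligned pair of nucleotide sequences with them. It is an illustration only and is not taken from Mesquite's code. The function names are hypothetical, and the sketch assumes that a run of n gaps costs one opening plus n - 1 extensions, and that a gap counts as terminal when the gapped sequence has no bases on one side of it; Mesquite's exact accounting may differ. Protein data would instead use the flat cost of 5 per substitution.

```python
# Default costs quoted in the text above.
GAP_OPEN, GAP_EXTEND = 8, 3
TERMINAL_GAP_OPEN, TERMINAL_GAP_EXTEND = 2, 2
TRANSITION, TRANSVERSION = 5, 10

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_cost(a, b):
    """Cost of pairing two unambiguous nucleotides (U is treated as T)."""
    a = a.upper().replace("U", "T")
    b = b.upper().replace("U", "T")
    if a == b:
        return 0
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return TRANSITION if same_class else TRANSVERSION


def gap_run_cost(length, terminal):
    """ASSUMPTION: opening cost for the first gap, extension cost for each further gap."""
    if length == 0:
        return 0
    open_cost = TERMINAL_GAP_OPEN if terminal else GAP_OPEN
    extend_cost = TERMINAL_GAP_EXTEND if terminal else GAP_EXTEND
    return open_cost + (length - 1) * extend_cost


def alignment_cost(ref, seq, gap="-"):
    """Total cost of two equal-length aligned nucleotide strings."""
    if len(ref) != len(seq):
        raise ValueError("aligned sequences must have the same length")
    total, i, n = 0, 0, len(ref)
    while i < n:
        r, s = ref[i], seq[i]
        if r == gap and s == gap:
            i += 1  # a column of shared gaps costs nothing
        elif r == gap or s == gap:
            gapped, other = (ref, seq) if r == gap else (seq, ref)
            j = i
            while j < n and gapped[j] == gap and other[j] != gap:
                j += 1
            # Terminal if the gapped sequence has no bases before or after the run.
            before = gapped[:i].replace(gap, "")
            after = gapped[j:].replace(gap, "")
            total += gap_run_cost(j - i, terminal=(not before or not after))
            i = j
        else:
            total += substitution_cost(r, s)
            i += 1
    return total
```

Under these assumptions, `alignment_cost("ACGTAC", "AC--AC")` charges one internal gap of length two (8 + 3), while `alignment_cost("ACGT", "--GT")` charges a terminal gap of length two (2 + 2).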
Aligning nucleotide sequences to match an amino acid alignment
This feature allows one to take a matrix of nucleotides, and an existing alignment of their translated amino acids, and have the nucleotides realigned to match the amino acid alignment. To do this, you will need to have in the same Mesquite file both the nucleotide matrix and the protein matrix. For example, you could do the following:
- Assign codon positions and genetic code to a nucleotide matrix (see the main Mesquite manual for details).
- Adjust each sequence so that its reading frame is correct, by using the Shift To Minimize Stops feature.
- Trim any incomplete codons from the ends of the sequences by selecting the entire matrix and choosing Matrix>Alter/Transform>Other Choices... and selecting Trim Terminal Incomplete Codons. "Terminal Incomplete Codons" are nucleotides that are only part of a codon. For example, if one sequence starts at a third position, then that third position nucleotide represents only one-third of a codon, and it will be trimmed. Once this is done, only complete codons will be left in the sequence.
- Translate the DNA matrix to amino acids by choosing Characters>Make New Matrix From>Translate DNA To Protein. You will now have the protein matrix in your file.
- Align the protein matrix. You could use, for example, the Clustal Align feature described below. If instead you export the matrix (e.g., using the File>Export options) and align the proteins in a separate program, you will then need to choose File>Include file to include the output of the alignment program in your file.
- Finally, go to your DNA matrix, and choose Matrix>Alter/Transform>Align DNA to Protein.
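The idea behind the trimming and realignment steps in the list above can be sketched in a few lines of Python. This is a hedged illustration of the general technique, not Mesquite's implementation; the function names and the `gap` parameter are hypothetical. It assumes the DNA sequence is in frame and that each non-gap character of the aligned protein corresponds to exactly one complete codon.

```python
def trim_terminal_incomplete_codons(dna, first_position):
    """Drop partial codons at both ends of a sequence.

    first_position is the codon position (1, 2, or 3) of the first nucleotide.
    """
    start = (4 - first_position) % 3      # bases to skip before the first full codon
    usable = len(dna) - start
    end = start + usable - (usable % 3)   # cut off a trailing partial codon
    return dna[start:end]


def align_dna_to_protein(dna, aligned_protein, gap="-"):
    """Thread the codons of an unaligned DNA sequence through an aligned protein."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(dna) % 3 != 0 or len(codons) != residues:
        raise ValueError("DNA must contain exactly one complete codon per amino acid")
    next_codon = iter(codons)
    # Each protein gap becomes three nucleotide gaps; each residue takes the next codon.
    return "".join(gap * 3 if aa == gap else next(next_codon) for aa in aligned_protein)
```

For example, a sequence `"TATGGCTAA"` that starts at a third codon position is trimmed to `"ATGGCT"`, and threading `"ATGGCTAAA"` through the aligned protein `"M-AK"` gives `"ATG---GCTAAA"`.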
Clustal Align
This feature allows one to select a single block of sequences, and then have ClustalW align them. To do this, select the block, then choose Matrix>Align Multiple Sequences>Clustal Align...
You will first be asked whether you want to run the ClustalW alignment on a separate thread or on the same thread. Mesquite can do multiple things at once, because it can have one thing running on one computational "thread" and another thing happening on a separate thread. The main thread of the program is the one the user deals with directly; it is what allows you to give commands to Mesquite (via menus, etc.). If this main thread is busy with a calculation, you will not be able to ask for new things to happen in Mesquite until the calculation is done. By choosing "Separate" in the query that appears, you are asking Mesquite to create a thread separate from the main thread, so that you can do other things in Mesquite while the Clustal alignment is proceeding. However, if you do this, you must remember not to edit the matrix or close the window showing the matrix while the alignment is running; if you do, Mesquite will be very unhappy. The safest thing to do is to choose "No" in that query, so that the alignment runs on the main thread.
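As a loose analogy for the main-thread versus separate-thread distinction, the following Python sketch runs a slow stand-in calculation on a worker thread while the calling thread stays free. It is not Mesquite code; `slow_alignment` and `run_on_separate_thread` are hypothetical names, and the "alignment" here merely sleeps and upper-cases its input.

```python
import threading
import time


def slow_alignment(block, results):
    """Stand-in for a long-running alignment of a block of sequences."""
    time.sleep(0.2)                       # pretend the calculation takes a while
    results["aligned"] = [s.upper() for s in block]


def run_on_separate_thread(block):
    """Start the calculation on a worker thread and return immediately."""
    results = {}
    worker = threading.Thread(target=slow_alignment, args=(block, results))
    worker.start()                        # the caller is not blocked from here on
    return worker, results


worker, results = run_on_separate_thread(["acgt", "ac-t"])
# The calling thread could respond to other commands here, but it must not
# modify the data the worker is using until the worker has finished.
worker.join()                             # wait for the result before reading it
aligned = results["aligned"]
```

The comment before `join()` mirrors the warning above: the data being aligned should be left alone until the separate thread is done.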
Once you make that choice, you will see a dialog box in which you must enter the directory location of ClustalW; if you use the Browse button in the dialog box, you can choose ClustalW and have the location filled in automatically. If you wish, you can also alter the options of ClustalW. If you then press OK, Mesquite will send that section of the matrix to ClustalW and ask for it to be aligned; it will then harvest the results and reincorporate that piece into the matrix.
The version of ClustalW used by Mesquite must be executable from the command line of your operating system.
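One way to check that a ClustalW installation can be run from the command line is to call it yourself from a small script. The Python sketch below is only an illustration and does not reproduce what Mesquite does internally. The function names, the executable name `clustalw2`, and the file names are assumptions; the `-INFILE=`, `-OUTFILE=`, and `-ALIGN` options are standard ClustalW command-line options on Unix-like systems.

```python
import shutil
import subprocess


def clustalw_command(executable, infile, outfile):
    """Build the argument list for a basic multiple alignment run."""
    return [executable, f"-INFILE={infile}", f"-OUTFILE={outfile}", "-ALIGN"]


def run_clustalw(executable, infile, outfile):
    """Run ClustalW if the executable can be found; raise otherwise."""
    path = shutil.which(executable)       # accepts a bare name or a full path
    if path is None:
        raise FileNotFoundError(f"{executable} is not executable from the command line")
    subprocess.run(clustalw_command(path, infile, outfile), check=True)
    return outfile
```

A call such as `run_clustalw("clustalw2", "block.fasta", "block.aln")` would then align the sequences in the hypothetical file `block.fasta`, provided ClustalW is installed under that name.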
MUSCLE Align
This tool is just like the ClustalW Align feature described above, except that it works for Robert C. Edgar's MUSCLE program.