T. Nishiyama, Kanazawa University (email: tomoakin@kenroku.kanazawa-u.ac.jp)
This process runs a BLAST search against a combined dataset of nr and Selmo Filtered Model 2. Chlamydomonas, Physcomitrella, Arabidopsis, and rice sequences should be represented in the nr dataset.
The process collects up to 1000 of the hit sequences, which are returned as a FASTA file named with the supplied name plus the ".nrSmoFM2" suffix.
Up to 100 sequences are selected in the order in which they appear in the BLAST results, then aligned and converted to NEXUS format, which can be processed with MacClade.
In some cases more than 100 sequences are needed to cover the whole gene family.
If you see a sudden drop in similarity, it marks the end of the gene family.
In many cases, however, the similarity declines so smoothly that there is no distinct cutoff point; then try up to several hundred genes.
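As a rough illustration of looking for a sudden drop in similarity, the following Python sketch finds the largest fall between consecutive scores in a list ordered as in the BLAST results. The function name largest_drop and the example scores are hypothetical and are not part of the server; it is only a sketch, and it cannot tell whether the largest drop is really "sudden" when the scores decline smoothly.

```python
def largest_drop(scores):
    """Return (index, drop) for the largest fall between consecutive scores.

    index is the 0-based position of the last hit before the drop.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    # Difference between each score and the next one in the list.
    drops = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    best = max(range(len(drops)), key=lambda i: drops[i])
    return best, drops[best]


# Hypothetical bit scores in BLAST result order (made-up values).
example_scores = [950.0, 910.0, 880.0, 860.0, 310.0, 300.0, 295.0]
cut, drop = largest_drop(example_scores)
print(f"largest drop of {drop} after hit {cut + 1}")
```

With a smooth decline, the reported drop will be small compared with the scores themselves, which corresponds to the case where several hundred genes should be tried.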
Add sequences that are not in Filtered Model 2 to the FASTA file that you obtained at "Get related sequences".
Choose an appropriate value N, the number of genes you are willing to analyze.
Supply the sequence of interest as query.
Supply the fasta file as the sequence collection.
The process will return an alignment of up to N sequences.
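To make the role of N concrete, here is a minimal Python sketch that reads FASTA text and keeps at most N records. It assumes the records are already in the desired order; the server itself ranks sequences against the query, and this sketch does not do that. The names read_fasta and first_n and the example sequences are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def first_n(records, n):
    """Keep at most n records, preserving their order."""
    return records[:n]


# Made-up example collection with three sequences.
example = """>seq1 query
MKTAYIAK
QRQISFVK
>seq2
MKTAYLAK
>seq3
MRTAYIAR
"""
selected = first_n(read_fasta(example), 2)
for name, seq in selected:
    print(name, len(seq))
```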
It is essential to use only homologous sites for the tree reconstruction.
Remove genes with excessive gaps unless the gene is of particular interest.
Unmark regions that are not homologous across all of the sequences or that have gaps in some sequences.
If you find negative branches or excessively long branches, it is quite likely that non-homologous sequences were included in the alignment.
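The gap-related part of this cleanup can be sketched in Python as below. The sketch only applies the gap criteria (dropping sequences with too many gaps and listing columns without gaps); judging homology still requires inspecting the alignment by eye, for example in MacClade. The function names, the 0.5 threshold, and the toy alignment are assumptions made for illustration.

```python
GAP_CHARS = "-?"  # assumed gap / missing-data symbols


def gap_fraction(seq):
    """Fraction of positions in seq that are gap characters."""
    if not seq:
        return 0.0
    return sum(seq.count(c) for c in GAP_CHARS) / len(seq)


def drop_gappy_sequences(alignment, max_gap_fraction):
    """Keep only sequences whose gap fraction is within the limit."""
    return {name: seq for name, seq in alignment.items()
            if gap_fraction(seq) <= max_gap_fraction}


def ungapped_columns(alignment):
    """Return 0-based indices of columns with no gap in any sequence."""
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    return [i for i in range(length)
            if all(s[i] not in GAP_CHARS for s in seqs)]


# Toy alignment (made-up sequences).
alignment = {"a": "MKT-LLV", "b": "MKTALLV", "c": "M-----V"}
kept = drop_gappy_sequences(alignment, 0.5)   # hypothetical threshold
columns = ungapped_columns(kept)
print(sorted(kept), columns)
```

Removing the gap-rich sequence first leaves more gap-free columns available for the remaining sequences, which is the reason for removing such genes before choosing the sites.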
A neighbor-joining tree can be constructed at http://moss.nibb.ac.jp/cgi-bin/makenjtreeSelmo
The process automatically performs neighbor-joining tree reconstruction and bootstrapping.
The trees are provided in Newick format (text) and in SVG format (vector graphics that can be processed with Adobe Illustrator CS, CS2, or CS3).
In most cases, the file to view is the one ending with treeagi.svg.
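Because the Newick output is plain text, negative or very long branches can also be looked for with a short script. The Python sketch below extracts the numbers that follow each colon in a Newick string and reports those that are negative or above a chosen limit. It assumes labels contain no colons or quoted text; the limit of 1.0 and the example tree are made up for illustration and are not output from the server.

```python
import re

# A branch length is the number following a colon in Newick text.
BRANCH_LENGTH = re.compile(r":\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def branch_lengths(newick):
    """Return all branch lengths found in a Newick string, in order."""
    return [float(m) for m in BRANCH_LENGTH.findall(newick)]


def suspicious_branches(newick, long_threshold):
    """Return (negative, long) lists of branch lengths worth checking."""
    lengths = branch_lengths(newick)
    negative = [x for x in lengths if x < 0]
    too_long = [x for x in lengths if x > long_threshold]
    return negative, too_long


# Made-up tree with bootstrap values as internal node labels.
tree = "((A:0.10,B:0.12)90:0.05,(C:-0.01,D:1.80)55:0.07,E:0.20);"
negative, too_long = suspicious_branches(tree, 1.0)  # hypothetical limit
print("negative:", negative)
print("long:", too_long)
```

If either list is non-empty, it is worth going back to the alignment and checking the corresponding sequences for non-homologous regions.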