FAQ ASAP web: Assemble Species by Automatic Partitioning
Concerning the method:
ASAP is a tool designed to propose partitions of species hypotheses using genetic distances calculated between DNA sequences.
ASAP can handle more than 10 000 sequences, but the computation can then take several hours. If you need to work with a very large dataset, we recommend downloading the command-line version of ASAP and running it on your own computer instead of overloading the server.
FASTA is the most convenient input format. If you provide a distance matrix instead (phylip dnadist or MEGA CSV), you must also provide the length of the alignment used to compute the matrix.
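To illustrate what a distance matrix and its alignment length represent, here is a minimal Python sketch. It assumes the sequences are already aligned and uses the simple uncorrected p-distance (the proportion of differing sites). The names `read_fasta` and `p_distance` and the example sequences are hypothetical and are not part of ASAP, which may compute distances under a different substitution model.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict {label: sequence}."""
    sequences = {}
    label = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            label = line[1:]
            sequences[label] = ""
        elif label is not None:
            sequences[label] += line.upper()
    return sequences


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping sites with a gap or N in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else 0.0


# Made-up aligned sequences, for illustration only.
example = """>sp1_a
ACGTACGTAC
>sp1_b
ACGTACGTAT
>sp2_a
ACGAACCTAT
"""

seqs = read_fasta(example)
labels = list(seqs)
# This is the alignment length to report alongside a distance matrix.
alignment_length = len(seqs[labels[0]])
matrix = {(x, y): p_distance(seqs[x], seqs[y]) for x in labels for y in labels}
```

In this sketch `matrix[("sp1_a", "sp2_a")]` holds the distance between the two named sequences, and `alignment_length` is the value you would supply together with a precomputed matrix.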
Please change the ".csv" extension to ".txt": some browsers interpret CSV files as special objects, which produces unexpected results.
Nbgroups is the number of species identified by ASAP in the corresponding partition.
ASAP identifies several partitions, and the score indicates which of them you should look at. It is a combination of the two following parameters (probability and slope).
Proba is the probability that the partition at step n is different from the partition at step n-1. Please refer to the publication for more details.
W is the slope of the curve shown on the right ("Ranked distances") at a given genetic distance value (see below). A high value means that the neighbouring distance values (the next larger and the next smaller) are far apart.
Dc is the value of the "jump" distance used to calculate the slope.
Text is a text file containing the different partitions. Two formats are available:
a csv file: each line is a sequence label followed by the group number, both separated with a semicolon.
a list file: each line is a group and all the sequences belonging to that group are listed
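As a rough illustration of how the two layouts relate, the following Python sketch reads lines in the "label;group" form described above and regroups them so that each group is printed on one line. It assumes the file has no header line and that group numbers are integers; the function name `groups_from_csv` and the sample labels are hypothetical, and the exact layout of the list file produced by ASAP may differ from what is printed here.

```python
from collections import defaultdict


def groups_from_csv(text):
    """Turn 'label;group' lines into a dict {group_number: [labels]}."""
    groups = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last semicolon so labels containing ';' still parse.
        label, group = line.rsplit(";", 1)
        groups[int(group)].append(label)
    return dict(groups)


# Made-up content, for illustration only.
example_csv = """seqA;1
seqB;1
seqC;2
"""

groups = groups_from_csv(example_csv)
for number in sorted(groups):
    # One line per group, listing all sequences assigned to it.
    print(number, " ".join(groups[number]))
```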
On the main ASAP page you can tune some parameters (see the help page).
All the partitions within the range of genetic distances you provided will be preceded by a star and a blue area corresponding to this range will be drawn on the curve. Default values are 0.005 and 0.05.
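The starring rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bounds are treated as inclusive, each partition is represented by a hypothetical dict with a `distance` key, and the values below are made up rather than taken from real ASAP output.

```python
def flag_partitions(partitions, low=0.005, high=0.05):
    """Return (marker, partition) pairs; marker is '*' when distance is in [low, high]."""
    flagged = []
    for partition in partitions:
        in_range = low <= partition["distance"] <= high
        flagged.append(("*" if in_range else " ", partition))
    return flagged


# Made-up values, for illustration only.
partitions = [
    {"nbgroups": 12, "distance": 0.002},
    {"nbgroups": 7, "distance": 0.021},
    {"nbgroups": 3, "distance": 0.080},
]

for marker, partition in flag_partitions(partitions):
    print(marker, partition["nbgroups"], partition["distance"])
```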
For each partition and for each node for which a probability has been calculated, the darker the color of the dots and squares, the higher the probability. When the probability was not computed, the square is grey. We chose to use different symbols (dots and squares) for the table and the curve on the one hand and for the dendrogram on the other hand, because the probability is not calculated in the same way. For the table and the curve, it corresponds to the probability of the partition. For the dendrogram, it corresponds to the probability that merging the groups within the node is compatible with the known distances inside each of these groups. A very low probability (dark color) indicates that this group is unlikely, i.e. that the groups within this node probably correspond to different species. Please refer to the publication for more details.
ASAP uses a seed to generate random partitions in order to estimate the probability of a partition. A new seed can slightly change the probabilities.
Some regions are responsive:
A click on a line of the results table will draw two lines: a green one corresponding to the grouping distance and a red one corresponding to the jumping distance (see the paper for more details).
The lines will be drawn on every graph, showing where they cut the histogram, where they cut the cumulative ranked distance curve, to which value on the ASAP score curve they correspond, and where they cut the dendrogram.
A click on a node of the dendrogram will color in red all the sequences grouped under the clicked node.
A roll over a sequence name on the dendrogram will show the full sequence label.
A roll over a node will provide some information, such as: proba (the probability associated with the node), dist (the genetic distance to which this node corresponds), nbgroup (the number of groups in a partition that would correspond to the genetic distance of the node), node number (for internal use only) and nbunder (how many sequences are grouped by this node).
Roll over the name and the full label will be displayed
Remember that ASAP is an exploratory tool designed to identify the best partitions of species, given the criteria used by ASAP (in particular the genetic distances). Your own species hypotheses might be based on other data, methods or criteria of species delimitation, and might thus be different from the best ASAP partitions. Combining all these results in an integrative taxonomy approach is generally a good idea.