MicNeSs genotypes microsatellites from a collection of NGS sequences sorted by individual. Because sequencing and PCR amplification introduce artefactual insertions or deletions, the set of sequence reads from a single microsatellite allele shows several length variants. The algorithm infers, without alignment, the true unknown allele(s) of each individual from the observed distributions of microsatellite lengths across all individuals. MicNeSs, a Python implementation of the algorithm, can be used to genotype any sequence from any organism and has been tested on 454 pyrosequencing data.
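To make the notion of an observed length distribution concrete, the sketch below counts, for one individual, how many reads carry each number of repeat units. It is an illustration only and not the MicNeSs algorithm: the helper names (longest_repeat_run, length_distribution), the assumption that the repeat motif is known in advance, and the toy reads are all made up for this example.

```python
import re
from collections import Counter


def longest_repeat_run(read, motif):
    """Return the largest number of consecutive copies of `motif` in `read`."""
    # Hypothetical helper: find every uninterrupted run of the motif
    # and keep the longest one, measured in repeat units.
    pattern = re.compile("(?:%s)+" % re.escape(motif))
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(read)]
    return max(runs) if runs else 0


def length_distribution(reads, motif):
    """Count how many reads show each repeat number (no alignment needed)."""
    return Counter(longest_repeat_run(r, motif) for r in reads)


# Toy reads for one individual (made-up data): mostly 5 repeats of "CA",
# plus a shorter and a longer variant such as indel artefacts could produce.
reads = [
    "TTG" + "CA" * 5 + "GGA",
    "TTG" + "CA" * 5 + "GGA",
    "TTG" + "CA" * 4 + "GGA",
    "TTG" + "CA" * 6 + "GGA",
    "TTG" + "CA" * 5 + "GGA",
]
distribution = length_distribution(reads, "CA")
for n_repeats in sorted(distribution):
    print("%d repeats: %d reads" % (n_repeats, distribution[n_repeats]))
```

A spread of counts around one or two dominant lengths is the kind of observation from which the true allele(s) have to be inferred.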
MicNeSs is an open source program written in Python 2.7 that can only be run from the command line. Importantly, MicNeSs relies on two widely used Python libraries, scipy and numpy, both freely available from www.scipy.org/. To run MicNeSs, download the program and simply type:
python2.7 MicNeSs.py inputfile1 inputfile2 inputfile3 ...
python2.7 specifies the version of Python. The current implementation does not work with Python 3 but can work with versions prior to 2.7. Alternatively, you can edit the source and set the interpreter in the first line (e.g. « #!/usr/bin/env python2.7 »). Each inputfile is a fasta file that includes all sequences for a single individual. The sequences must not be aligned (no gaps) but should include the locus of interest. Provide one input file per individual to be genotyped. The filename itself is used for formatting the result. MicNeSs assumes that each file is named as IndividualName LocusName.ext. « IndividualName » is different in each filename since it specifies the individual, « LocusName » is identical for all inputfiles and « ext » only refers to the file format (typically fst, fas or fasta). Please note that the filename has no influence on the genotyping itself.
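The input conventions above can be checked before launching a run. The following is a minimal sketch and not part of MicNeSs: the function names are hypothetical, and it assumes that the individual and locus names are joined by an underscore (for example « Ind1_LocusA.fas »), which may differ from the separator your files use.

```python
import os


def split_filename(path, sep="_"):
    """Split 'Individual<sep>Locus.ext' into (individual, locus, ext).

    The separator is an assumption of this sketch; adjust `sep` if needed.
    The name is split at the first occurrence of the separator.
    """
    base, ext = os.path.splitext(os.path.basename(path))
    individual, found, locus = base.partition(sep)
    if not found or not individual or not locus:
        raise ValueError("unexpected file name: %r" % path)
    return individual, locus, ext.lstrip(".")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gapped_records(lines):
    """Return headers of sequences that contain an alignment gap ('-')."""
    return [h for h, seq in read_fasta(lines) if "-" in seq]


# Made-up records: the second one contains gaps and would need cleaning.
example = [">read1", "TTGCACACAGGA", ">read2", "TTGCA--CAGGA"]
print(split_filename("Ind1_LocusA.fas"))
print(gapped_records(example))
```

To check a real file, pass an open file handle to gapped_records in place of the example list.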
python2.7 MicNeSs.py -h
will list all available options. Several options can be tuned; a description of each is available in the manual. We have provided two data sets (TestData directory) that can be used to learn how to use MicNeSs. Details on how to use these data sets are available in section 5 (Example) of the manual.
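When many individuals are genotyped at once, listing every input file by hand is tedious. The sketch below assembles the basic command (interpreter, script, then one input file per individual) from all FASTA files in a directory. It is an illustrative wrapper, not something shipped with MicNeSs: the function name, the demo file names and the extension list are assumptions, and since the layout of the TestData directory is not described here, the demo uses a temporary directory with empty placeholder files.

```python
import glob
import os
import shutil
import tempfile


def build_command(directory, extensions=("fst", "fas", "fasta"),
                  interpreter="python2.7", script="MicNeSs.py"):
    """Return the argument list for one MicNeSs run over a directory."""
    inputs = []
    for ext in extensions:
        inputs.extend(glob.glob(os.path.join(directory, "*." + ext)))
    if not inputs:
        raise ValueError("no FASTA files found in %r" % directory)
    # One input file per individual, in a reproducible order.
    return [interpreter, script] + sorted(inputs)


# Demo with empty placeholder files (hypothetical names).
demo_dir = tempfile.mkdtemp()
for name in ("Ind1_LocusA.fas", "Ind2_LocusA.fas"):
    open(os.path.join(demo_dir, name), "w").close()
command = build_command(demo_dir)
print(" ".join(command))
shutil.rmtree(demo_dir)
```

Passing the returned list to subprocess.check_call would launch the run, provided python2.7 is installed and MicNeSs.py is in the working directory.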