fsc2

Web: https://cmpg.unibe.ch/software/fastsimcoal2/

Installation

wget https://cmpg.unibe.ch/software/fastsimcoal2/downloads/fsc28_linux64.zip
unzip fsc28_linux64.zip
cd fsc28_linux64/
chmod +x fsc28
./fsc28

which gives command line options.

fsc was launched without any argument: attempting to read file "fsc_run.txt"
Unable to find file "fsc_run.txt" in current directory(/usr/local/Cluster-Apps/ceuadmin/fsc2/2.8.0/example files)


fastSimcoal2 (ver 2.8.0.0 - 22.09.23)

Usage:

 -h  --help              : prints this help
 -i  --ifile test.par    : name of parameter file
 -n  --numsims 1000      : number of simulations to perform
                           Also applies for parameter estimation
 -t  --tplfile test.tpl  : name of template parameter file (optional)
 -f  --dfile test.def    : name of parameter definition file (optional)
 -F  --dFile test.def    : same as -f, but only uses simple parameters defined
                           in template file. Complex params are recomputed
 -e  --estfile test.est  : parameter prior definition file (optional)
                           Parameters drawn from specified distributions are
                           substituted into template file.
 -E  --numest 10         : number of draws from parameter priors (optional)
                           Listed parameter values are substituted in template file
 -g  --genotypic         : generates arlequin projects with genotypic data
 -p  --phased            : specifies that phase is known in arlequin output
                           default: phase is unknown
 -s  --dnatosnp 2000     : output DNA as SNP data, and specify maximum no. of SNPs
                           to output (use 0 to output all SNPs).
 -S  --allsites          : output the whole DNA sequence, incl. monomorphic sites
 -I  --inf               : generates DNA mutations according to an
                           infinite site (IS) mutation model
 -d  --dsfs              : computes derived site frequency spectrum
                           (for SNP or DNA as SNP (-s) data only).
 -m  --msfs              : computes minor site frequency spectrum
                           (for SNP or DNA as SNP (-s) data only)
 -j  --jobs              : output one simulated or bootstrapped SFS per file
                           in a separate directory for easier analysis
                           (requires -d or -m and -s0 options)
 -b  --numboot 10        : number of bootstraps to perform on polymorphic sites to extract SFS
                           (should be used in addition to -s0 and -j options)
 -H  --header            : generates header in site frequency spectrum files
 -q  --quiet             : minimal message output to console
 -T  --tree              : outputs coalescent tree in nexus format
 -k  --keep  10000       : number of simulated polymorphic sites kept in memory
                           If the simulated no. is larger, then temporary files
                           are created. Default value is 10000
 -K  --numRandGen  20000 : number of random numbers generated in advance
                           Default value is 20000
 -r  --seed              : seed for random number generator (positive integer <= 1E6)
 -x  --noarloutput       : does not generate Arlequin output
 -G  --indgenot          : generates an individual genotype table
 -M  --maxlhood          : perform parameter estimation by max lhood from SFS
                           values between iterations
 -L  --numloops 20       : number of loops (ECM cycles) to perform during
                           lhood maximization. Default is 20
 -l  --minnumloops 2     : number of  loops (ECM cycles) for which the lhood is
                           computed on both monomorphic and polymorphic sites
                           if REFERENCE parameter is defined
 -C  --minSFSCount 1     : minimum observed SFS entry count taken into account in
                           likelihood computation (default = 1, but value can be < 1. e.g  0.5)
 -0  --removeZeroSFS     : do not take into account monomorphic sites for SFS
                           likelihood computation
 -a  --ascDeme 0         : This is the deme id where ascertainment is performed
                           when simulating SNPs. Default: no ascertainment.
 -A  --ascSize 2         : number of ascertained chromosomes used to define SNPs in
                           a given deme. Optional parameter. Default value is 2
 -u   --multiSFS         : generate or use multidimensional SFS
 -w   --brentol 0.01     : tolerance for Brent optimization
                           Default = 0.01. Smaller value imply more precise estimations
                           but require more computation time (min;max) = (1e-1;1e-5)
 -c   --cores 1          : number of openMP threads for parameter estimation
                           (default=1, max=numBatches, use 0 to let openMP choose optimal value)
 -B   --numBatches 12    : max. no. of batches for multi-threaded runs
                           (default=12)
 -P   --pooledsfs        : computes pooled SFS over all samples.
                           Assumes -d or -m, but not -u flag activated
      --recordMRCA       : records tMRCAs for each non recombining segment and outputs
                           results in <generic name>_mrca.txt. Beware: huge slow down of computing time
      --foldedSFS        : computes the 1D and 2D MAF SFS by simply folding the DAF SFS
      --logprecision 23  : precision for computation of logs of random numbers. Max value is 23
                           Default value is 23 (full precision). Recommended lower value is 18
      --initValues my.pv : specifies a file (*.pv) containing initial parameter values
                           for parameter optimization
      --nosingleton      : ignores singletons in likelihood computation
 -y   --resetParam 3     : Number of unsuccessful cycles before resetting parameters to current max lhood values
                           default is zero, implying no resetting
 -z   --finalRange 0.01  : Proportion of the initial search range remaining in the last cycle (default is 1)

Testing

When the module is built, one can use this script.

module load ceuadmin/fsc2/2.8.0
cd examples
fsc28 -i 1PopDNA.par -n 1 -d -e
module load ceadmin/R
wget https://cmpg.unibe.ch/software/fastsimcoal2/R/ParFileViewer.r
Rscript ParFileViewer.r 1PopDNA.par
Rscript ParFileViewer.r 3PopDNASFS.par
convert 3PopDNASFS.par.pdf 3PopDNASFS.png

where an R utility is used to visually inspect the validity of modeled scenarios, 3PopDNASFS.png.

References

Excoffier L, et al. fastsimcoal2: demographic inference under complex evolutionary scenarios. Bioinformatics 37 (24):4882–4885, 2021.