Low Coverage & Pool Sequencing

Download

The latest version is 0.2.

Version 0.2: Released on July 1, 2013. LRT now supports covariates. Please note that this is not yet fully stress-tested, so please send me any comments or bug reports. For more information, read the covariates_README.txt file in the covariates directory. Sample files are also provided.

Version 0.1: Released on May 29, 2013. The first released version.

Installation

The two methods are implemented as two executable JAR files, VST.jar and LRT.jar. The JAR files have no external dependencies and were tested on Java 1.6 and 1.7; they may also run on earlier versions. Each JAR prints its own usage instructions and command-line parameters. To see these, run one of the following from your command prompt:

java -jar VST.jar
java -jar LRT.jar

For your convenience, example input files are provided as well. They are:

exampleinput.txt - Contains simulated major and minor allele read counts. Also describes the input file format.

exampleci.txt - Contains c_i values, used by the LRT method. Also describes the c_i file format.

To run VST and LRT on these examples, assuming a per-base sequencing error rate of 1% (0.01) and with 1000 permutations to obtain a p-value, use the following:

java -jar VST.jar -i exampleinput.txt -e 0.01 -p 1000
java -jar LRT.jar -i exampleinput.txt -e 0.01 -p 1000 -c exampleci.txt -o outputfile.txt -t covariatesfile.txt

======================================
Note on large input files: Large input files may need more memory than the Java Virtual Machine (JVM) provides by default. If you get OutOfMemory errors, raise the maximum heap size with the -Xmx switch, for example: java -Xmx2000m -jar VST.jar
======================================
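The note on large input files shows the -Xmx switch without the other parameters. As a minimal sketch of how the two fit together, the shell helper below prints a complete VST command line for a chosen heap size. The helper name vst_command and the 4000 MB value are illustrative assumptions, not part of the software; the -i, -e and -p values are the ones from the example run.

```shell
#!/bin/sh
# Sketch only: vst_command is a hypothetical helper, not shipped with the JARs.
# It prints a VST command line that uses a larger JVM heap.
vst_command() {
    heap_mb="$1"    # maximum heap size in megabytes, e.g. 4000
    input="$2"      # input file with read counts
    # JVM switches such as -Xmx must come before -jar; everything after
    # the JAR name is passed to the program itself.
    echo "java -Xmx${heap_mb}m -jar VST.jar -i ${input} -e 0.01 -p 1000"
}

# Print the command for a 4000 MB heap (assumed value; adjust to your machine).
vst_command 4000 exampleinput.txt
```

To run the printed command rather than just display it, it can be passed to eval, for example: eval "$(vst_command 4000 exampleinput.txt)". The same placement of -Xmx applies to LRT.jar.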
For further information, please contact the authors of the above article.

Input File Formats

Input file formats are described in detail in the exampleinput.txt file included with the software. Please note that you can also run our methods on low-coverage sequencing data in which each individual is sequenced separately (rather than in pools). In that case, use the individual ID as the poolId (first column), set N to 1 (third column), and set a1 to 1 (fourth column).