filterstatsΒΆ

$ micca filterstats --help
usage: micca filterstats [-h] -i FILE [-o DIR] [-t TOPN]
                         [-e MAXEERATES [MAXEERATES ...]] [-n MAXNS]

micca filterstats reports the fraction of reads that would pass for each
specified maximum expected error (EE) rate %% and the maximum number of
allowed Ns after:

* discarding sequences that are shorter than the specified length
  (suggested for Illumina overlapping paired-end (already merged)
  reads);

* discarding sequences that are shorter than the specified length AND
  truncating sequences that are longer (suggested for Illumina and 454
  unpaired reads);

Parameters for the 'micca filter' command should be chosen for each
sequencing run using this tool.

micca filterstats returns in the output directory 3 files:

* filterstats_minlen.txt: fraction of reads that would pass the filter after
  the minimum length filtering;
* filterstats_trunclen.txt: fraction of reads that would pass the filter after
  the minimum length filtering + truncation;
* filterstats_plot.png: plot in PNG format.

optional arguments:
  -h, --help            show this help message and exit

arguments:
  -i FILE, --input FILE
                        input FASTQ file, Sanger/Illumina 1.8+ format
                        (phred+33) (required).
  -o DIR, --output DIR  output directory (default .).
  -t TOPN, --topn TOPN  perform statistics on the first TOPN sequences
                        (disabled by default)
  -e MAXEERATES [MAXEERATES ...], --maxeerates MAXEERATES [MAXEERATES ...]
                        max expected error rates (%). (default [0.25, 0.5,
                        0.75, 1, 1.25, 1.5])
  -n MAXNS, --maxns MAXNS
                        max number of Ns. (disabled by default).

Examples

Compute filter statistics on the top 10000 sequences, predicting
the fraction of reads that would pass for each maximum EE error
rate (default values):

    micca filterstats -i input.fastq -o stats -t 10000