Evaluate and compare base quality score recalibration (BQSR) tables
This tool generates plots to assess the quality of a recalibration run as part of the Base Quality Score Recalibration (BQSR) procedure.
The goal of this procedure is to correct for systematic bias that affects the assignment of base quality scores by the sequencer. The procedure runs in two passes. The first pass calculates error rates empirically and finds patterns in how the error varies with basecall features over all bases; the relevant observations are written to a recalibration table. The second pass applies numerical corrections to each individual basecall based on the patterns identified in the first pass (recorded in the recalibration table) and writes the recalibrated data to a new BAM or CRAM file.
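The two passes described above map onto a short sequence of GATK commands. The sketch below is illustrative only: all file names (sample.bam, reference.fasta, known_sites.vcf.gz and the output names) are placeholders, and the `run` helper is a hypothetical dry-run wrapper that prints each command instead of executing it unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder inputs; replace with your own files.
BAM=sample.bam
REF=reference.fasta
KNOWN=known_sites.vcf.gz

# First pass: model the error patterns and write the recalibration table.
run gatk BaseRecalibrator -I "$BAM" -R "$REF" --known-sites "$KNOWN" -O recal1.table

# Second pass: apply the corrections and write a new BAM file.
run gatk ApplyBQSR -I "$BAM" -R "$REF" --bqsr-recal-file recal1.table -O sample.recal.bam

# Optional: build a table from the recalibrated BAM for comparison.
run gatk BaseRecalibrator -I sample.recal.bam -R "$REF" --known-sites "$KNOWN" -O recal2.table

# Plot both tables side by side.
run gatk AnalyzeCovariates -before recal1.table -after recal2.table -plots AnalyzeCovariates.pdf
```

Running the script as written only prints the four command lines, which makes it easy to review them before setting DRY_RUN=0.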
The tool can take up to three different sets of recalibration tables. The resulting plots will be overlaid on top of each other to make comparisons easy.
| Set | Argument | Label | Color | Description |
|---|---|---|---|---|
| Original | -before | BEFORE | Maroon1 | First pass recalibration tables obtained from applying org.broadinstitute.hellbender.transformers.BQSRReadTransformer on the original alignment. |
| Recalibrated | -after | AFTER | Blue | Second pass recalibration tables resulting from the application of org.broadinstitute.hellbender.transformers.BQSRReadTransformer on the alignment recalibrated using the first pass tables. |
| Input | -bqsr | BQSR | Black | Any recalibration table without a specific role. |
You need to specify at least one set. Multiple sets need to have the same values for the following parameters:
covariate (order is not important), no_standard_covs, run_without_dbsnp, solid_recal_mode, solid_nocall_strategy, mismatches_context_size, mismatches_default_quality, deletions_default_quality, insertions_default_quality, maximum_cycle_value, low_quality_tail, default_platform, force_platform, quantizing_levels and binary_tag_name
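One way to check this requirement before running the tool is to compare the argument values recorded in each table. The sketch below is not part of GATK. It assumes the recalibration report is a GATKReport text file in which a line starting with `#:GATKTable:Arguments:` opens the table of recalibration arguments and a blank line closes it. The function names `recal_args` and `same_recal_args` are hypothetical.

```shell
#!/bin/sh
# Print the rows of the "Arguments" table of a recalibration report, sorted.
# Assumed layout: a "#:GATKTable:Arguments:" header line, one row per
# argument ("name  value"), and a blank line that closes the table.
recal_args() {
    awk '
        /^#:GATKTable:Arguments:/ { in_table = 1; next }
        in_table && NF == 0       { exit }
        in_table                  { print $1, $2 }
    ' "$1" | sort
}

# Succeed only when both reports contain an Arguments table and the
# recorded values are identical.
same_recal_args() {
    a=$(recal_args "$1")
    b=$(recal_args "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example use (placeholder file names):
#   same_recal_args recal1.table recal2.table || echo "argument values differ"
```

This comparison is stricter than the rule stated above: it compares every row of the table and treats a different covariate order as a mismatch, even though the tool does not.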
Currently this tool generates two outputs: a report file containing the plots (-plots) and an intermediate CSV file (-csv).
You need to specify at least one of them.
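Plot a single recalibration table:

gatk AnalyzeCovariates \
    -bqsr recal1.table \
    -plots AnalyzeCovariates.pdf

Plot the before (first pass) and after (second pass) recalibration tables to compare them:

gatk AnalyzeCovariates \
    -before recal1.table \
    -after recal2.table \
    -plots AnalyzeCovariates.pdf

Plot up to three different recalibration tables for comparison:

gatk AnalyzeCovariates \
    -bqsr recal1.table \
    -before recal2.table \
    -after recal3.table \
    -plots AnalyzeCovariates.pdf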
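The two requirements stated earlier (at least one set of recalibration tables and at least one output) can be checked before launching the tool. The wrapper below is a hypothetical sketch, not part of GATK: `analyze_covariates` takes five positional values (before, after, bqsr, plots, csv), uses an empty string for anything omitted, and only prints the resulting command line. It assumes file names without spaces.

```shell
#!/bin/sh
# Hypothetical wrapper: validate the inputs, then print the gatk command line.
# Usage: analyze_covariates BEFORE AFTER BQSR PLOTS CSV   (pass "" to omit one)
analyze_covariates() {
    before=${1:-}
    after=${2:-}
    bqsr=${3:-}
    plots=${4:-}
    csv=${5:-}

    # At least one set of recalibration tables is required.
    if [ -z "$before$after$bqsr" ]; then
        echo "error: give at least one of -before, -after, -bqsr" >&2
        return 1
    fi
    # At least one output is required.
    if [ -z "$plots$csv" ]; then
        echo "error: give at least one of -plots, -csv" >&2
        return 1
    fi

    # Assemble the command line from the values that were supplied.
    cmd="gatk AnalyzeCovariates"
    if [ -n "$before" ]; then cmd="$cmd -before $before"; fi
    if [ -n "$after" ];  then cmd="$cmd -after $after"; fi
    if [ -n "$bqsr" ];   then cmd="$cmd -bqsr $bqsr"; fi
    if [ -n "$plots" ];  then cmd="$cmd -plots $plots"; fi
    if [ -n "$csv" ];    then cmd="$cmd -csv $csv"; fi
    echo "$cmd"
}

# Example use (placeholder file names):
#   analyze_covariates recal1.table recal2.table "" AnalyzeCovariates.pdf AnalyzeCovariates.csv
```

Printing the command instead of running it keeps the sketch side-effect free; the printed line can be reviewed and then executed by hand.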
This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the list below the table.
| Argument name(s) | Default value | Summary |
|---|---|---|
| Optional Tool Arguments | | |
| --after-report-file -after | | file containing the BQSR second-pass report file |
| --arguments_file | | read one or more arguments files and add them to the command line |
| --before-report-file -before | | file containing the BQSR first-pass report file |
| --bqsr-recal-file -bqsr | | Input covariates table file for on-the-fly base quality score recalibration |
| --gcs-max-retries -gcs-retries | 20 | If the GCS bucket channel errors out, how many times it will attempt to re-initiate the connection |
| --gcs-project-for-requester-pays | | Project to bill when accessing "requester pays" buckets. If unset, these buckets cannot be accessed. User must have storage.buckets.get permission on the bucket being accessed. |
| --help -h | false | display the help message |
| --ignore-last-modification-times | false | do not emit warning messages related to suspicious last modification time order of inputs |
| --intermediate-csv-file -csv | | location of the csv intermediate file |
| --plots-report-file -plots | | location of the output report |
| --version | false | display the version number for this tool |
| Optional Common Arguments | | |
| --gatk-config-file | | A configuration file to use with the GATK. |
| --QUIET | false | Whether to suppress job-summary info on System.err. |
| --tmp-dir | | Temp directory to use. |
| --use-jdk-deflater -jdk-deflater | false | Whether to use the JdkDeflater (as opposed to IntelDeflater) |
| --use-jdk-inflater -jdk-inflater | false | Whether to use the JdkInflater (as opposed to IntelInflater) |
| --verbosity | INFO | Control verbosity of logging. |
| Advanced Arguments | | |
| --showHidden | false | display hidden arguments |
Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Inherited arguments above.
--after-report-file / -after
file containing the BQSR second-pass report file
File containing the recalibration tables from the second pass.
File null

--arguments_file
read one or more arguments files and add them to the command line
List[File] []

--before-report-file / -before
file containing the BQSR first-pass report file
File containing the recalibration tables from the first pass.
File null

--bqsr-recal-file / -bqsr
Input covariates table file for on-the-fly base quality score recalibration
Enables recalibration of base qualities, intended primarily for use with BaseRecalibrator and ApplyBQSR (see Best Practices workflow documentation). The covariates tables are produced by the BaseRecalibrator tool. Only run recalibration with a covariates file that was created from the same input BAM(s).
File null

--gatk-config-file
A configuration file to use with the GATK.
String null

--gcs-max-retries / -gcs-retries
If the GCS bucket channel errors out, how many times it will attempt to re-initiate the connection
int 20 [ [ -∞ ∞ ] ]

--gcs-project-for-requester-pays
Project to bill when accessing "requester pays" buckets. If unset, these buckets cannot be accessed. User must have storage.buckets.get permission on the bucket being accessed.
String ""

--help / -h
display the help message
boolean false

--ignore-last-modification-times
do not emit warning messages related to suspicious last modification time order of inputs
If true, no warning is shown when the last-modification times of the before and after input files suggest that they have been swapped.
boolean false

--intermediate-csv-file / -csv
location of the csv intermediate file
Output csv file name.
File null

--plots-report-file / -plots
location of the output report
Output report file name.
File null

--QUIET
Whether to suppress job-summary info on System.err.
Boolean false

--showHidden
display hidden arguments
boolean false

--tmp-dir
Temp directory to use.
GATKPath null

--use-jdk-deflater / -jdk-deflater
Whether to use the JdkDeflater (as opposed to IntelDeflater)
boolean false

--use-jdk-inflater / -jdk-inflater
Whether to use the JdkInflater (as opposed to IntelInflater)
boolean false

--verbosity
Control verbosity of logging.
The --verbosity argument is an enumerated type (LogLevel), which can have one of the following values: ERROR, WARNING, INFO, DEBUG.
LogLevel INFO

--version
display the version number for this tool
boolean false
GATK version 4.6.2.0 built at Sun, 13 Apr 2025 13:21:43 -0400.