Updating the nextflow config file
- Locate the config file in the File Browser.
- Double-click on nextflow.config to edit it.
- Update the relevant lines (see below), then save and close.
For the taxonomy module
The taxonomy module uses the maxThree database.
maxThree has two parts:
- params.fastanidbpath, the reference genome sequences (ending in .fasta), and
- params.fastanireflistpath, the names of the reference genomes, grouped in files by Genus.
Be sure to update both:
params.fastanidbpath = "/file_path/fna_YYYY-MM-DD/*"
params.fastanireflistpath = "/file_path/fna_ref_lists_YYYY-MM-DD/*"
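As an illustration, the two taxonomy entries might look like the sketch below once filled in. It assumes the usual file_path given in the Information note on this page and reuses 2022-04-29 purely as a hypothetical build date; substitute the date of the maxThree build that is actually installed.

```groovy
// Hypothetical example: the date 2022-04-29 is an assumption, not a confirmed maxThree build date.
// Keep the trailing /* from the template so that the files inside each directory are matched.
params.fastanidbpath = "/genomics/home/vol-genomics/genome_tools/reference_libraries/fna_2022-04-29/*"
params.fastanireflistpath = "/genomics/home/vol-genomics/genome_tools/reference_libraries/fna_ref_lists_2022-04-29/*"
```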
For the phylogeny module
The phylogeny module uses the maxOne database.
The location of the database is specified here:
params.lsbsrrefgenomefasta = "/file_path/fasta_YYYY-MM-DD/"
Genomes must end in .fasta.
Example
The whole path to a fasta database built on 29th April 2022 is:
params.lsbsrrefgenomefasta = "/genomics/home/vol-genomics/genome_tools/reference_libraries/fasta_2022-04-29/"
Paths to kraken2 databases
A kraken2 database can either be built or downloaded pre-built. Three processes of the Core Tool use kraken2 databases, each configured through its own parameter:
- kdbpath is used to determine the Genus of each query strain,
- kdb_minikraken_path is only used if the isolate is of an unexpected Genus, and
- kdb_reads_path is used to determine whether contamination is present in the raw reads.
Multiple kraken2 databases can be specified using:
params.kdbpath = "/file_path/database_name_YYYYMMDD"
params.kdb_minikraken_path = "/file_path/database_name_YYYYMMDD"
params.kdb_reads_path = "/file_path/database_name_YYYYMMDD"
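A filled-in version might look like the sketch below. The three database directory names are placeholders invented for illustration, and file_path is assumed to be the usual reference library location; use the names of the kraken2 databases actually present on your system.

```groovy
// Hypothetical example: genus_db, minikraken_db and reads_db are made-up names,
// and 20220429 is an assumed date. Replace them with your real database directories.
params.kdbpath = "/genomics/home/vol-genomics/genome_tools/reference_libraries/genus_db_20220429"
params.kdb_minikraken_path = "/genomics/home/vol-genomics/genome_tools/reference_libraries/minikraken_db_20220429"
params.kdb_reads_path = "/genomics/home/vol-genomics/genome_tools/reference_libraries/reads_db_20220429"
```

Unlike the taxonomy paths, these entries follow the template above in having no trailing /* and no trailing slash.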
Information
You are not likely to need to change file_path:
/genomics/home/vol-genomics/genome_tools/reference_libraries/
Take care not to modify any other text in the config file.