Updating the Nextflow config file

For the taxonomy module

The taxonomy module uses the maxThree database.

maxThree has two parts:

params.fastanidbpath, the reference genome sequences (files ending in .fasta), and
params.fastanireflistpath, the names of the reference genomes, grouped into files by genus.

Be sure to update both:

params.fastanidbpath = "/file_path/fna_YYYY-MM-DD/*"

params.fastanireflistpath = "/file_path/fna_ref_lists_YYYY-MM-DD/*"
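Because the two settings are updated together, a quick check that they point at the same release can catch a half-finished edit. The following is a minimal sketch, assuming that both parts of a maxThree release carry the same YYYY-MM-DD stamp (this page does not state that they must). The helper names date_stamp and stamps_match, and the example date, are hypothetical.

```python
import re

# Matches a YYYY-MM-DD stamp such as the one in "fna_2022-04-29".
DATE_STAMP = re.compile(r"\d{4}-\d{2}-\d{2}")


def date_stamp(path):
    """Return the first YYYY-MM-DD stamp found in path, or None."""
    match = DATE_STAMP.search(path)
    return match.group(0) if match else None


def stamps_match(db_path, ref_list_path):
    """True when both paths carry a date stamp and the stamps are equal."""
    db_stamp = date_stamp(db_path)
    return db_stamp is not None and db_stamp == date_stamp(ref_list_path)


if __name__ == "__main__":
    # Hypothetical values that follow the pattern shown above.
    fastanidbpath = "/file_path/fna_2022-04-29/*"
    fastanireflistpath = "/file_path/fna_ref_lists_2022-04-29/*"
    print(stamps_match(fastanidbpath, fastanireflistpath))  # True
```

A placeholder that was never filled in (a literal YYYY-MM-DD) has no digits to match, so it is reported as a mismatch.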

For the phylogeny module

The phylogeny module uses the maxOne database.

The location of the database is specified here:

params.lsbsrrefgenomefasta = "/file_path/fasta_YYYY-MM-DD/"

Genome file names must end in .fasta.
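The .fasta requirement can be checked before the pipeline is run. This is a rough sketch rather than part of the Core Tool: the function name non_fasta_files is hypothetical, and it only inspects file names in the top level of the directory, not file contents.

```python
import tempfile
from pathlib import Path


def non_fasta_files(directory):
    """Return sorted names of regular files in directory not ending in .fasta."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and not entry.name.endswith(".fasta")
    )


if __name__ == "__main__":
    # Demonstration on a throwaway directory with hypothetical file names.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("genome_a.fasta", "genome_b.fna", "genome_c.fa"):
            (Path(tmp) / name).touch()
        print(non_fasta_files(tmp))  # ['genome_b.fna', 'genome_c.fa']
```

An empty list means every file in the directory given to params.lsbsrrefgenomefasta has the expected extension.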

Example

The full path to a fasta database built on 29 April 2022 is:

params.lsbsrrefgenomefasta = "/genomics/home/vol-genomics/genome_tools/reference_libraries/fasta_2022-04-29/"

kraken2 databases can either be built locally or downloaded pre-built. Three processes of the Core Tool use kraken2 databases.

Multiple kraken2 databases can be specified using:

params.kdbpath = "/file_path/database_name_YYYYMMDD"

params.kdb_minikraken_path = "/file_path/database_name_YYYYMMDD"

params.kdb_reads_path = "/file_path/database_name_YYYYMMDD"
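With three similar settings it is easy to leave one on its placeholder value. The sketch below reads params assignments out of config text and lists those whose value does not end in an eight-digit YYYYMMDD stamp. It assumes each setting sits on its own line in the form params.name = "value", as shown above. The names read_params and unstamped, and the example values, are hypothetical.

```python
import re

# One assignment per line, in the form: params.name = "value"
PARAM_LINE = re.compile(r'^\s*params\.(\w+)\s*=\s*"([^"]*)"', re.MULTILINE)

# A path that ends in _YYYYMMDD, with an optional trailing slash.
STAMPED = re.compile(r"_\d{8}/?$")


def read_params(config_text):
    """Map each params name found in config_text to its quoted value."""
    return dict(PARAM_LINE.findall(config_text))


def unstamped(params):
    """Return sorted names whose value lacks a trailing _YYYYMMDD stamp."""
    return sorted(name for name, value in params.items()
                  if not STAMPED.search(value))


if __name__ == "__main__":
    # Hypothetical config text: the last entry is still a placeholder.
    text = (
        'params.kdbpath = "/file_path/database_name_20220429"\n'
        'params.kdb_minikraken_path = "/file_path/database_name_20220429"\n'
        'params.kdb_reads_path = "/file_path/database_name_YYYYMMDD"\n'
    )
    print(unstamped(read_params(text)))  # ['kdb_reads_path']
```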

Information

You are unlikely to need to change file_path, which is:

/genomics/home/vol-genomics/genome_tools/reference_libraries/

Take care not to modify any other text in the config file.
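One way to confirm that nothing else was touched is to compare the edited config with a copy saved beforehand. The following is a minimal sketch using Python's standard difflib module. The function name changed_lines and the example config text are hypothetical.

```python
import difflib


def changed_lines(original_text, edited_text):
    """Return removed ("- ") and added ("+ ") lines between two config versions."""
    diff = difflib.ndiff(original_text.splitlines(), edited_text.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Hypothetical before and after versions of a small config.
    before = (
        'params.kdbpath = "/file_path/database_name_20220101"\n'
        'params.outdir = "results"\n'
    )
    after = (
        'params.kdbpath = "/file_path/database_name_20220429"\n'
        'params.outdir = "results"\n'
    )
    for line in changed_lines(before, after):
        print(line)
```

If the output contains any line other than the settings you meant to update, something else in the file was modified.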