Updating the nextflow config file

  1. Locate the config file in the File Browser.
  2. Double-click on nextflow.config to edit it.
  3. Update the relevant lines (see below), then save and close the file.

For the taxonomy module

The taxonomy module uses the maxThree database.

maxThree has two parts:

  • params.fastanidbpath, the reference genome sequences (ending .fasta) and
  • params.fastanireflistpath, the names of the reference genomes grouped in files by Genus.

Be sure to update both:

params.fastanidbpath = "/file_path/fna_YYYY-MM-DD/*"

params.fastanireflistpath = "/file_path/fna_ref_lists_YYYY-MM-DD/*"
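
After editing, it can be easy to update one taxonomy path and forget the other. The following is a minimal Python sketch of one way to check this, not part of the Core Tool. It assumes the two parts of maxThree are built together and so carry the same YYYY-MM-DD stamp, which this page does not state. The function names and the example dates are hypothetical.

```python
import re

# Matches simple assignments of the form: params.name = "value"
ASSIGNMENT = re.compile(r'^\s*params\.(\w+)\s*=\s*"([^"]*)"', re.MULTILINE)
DATE_STAMP = re.compile(r"\d{4}-\d{2}-\d{2}")


def read_params(config_text):
    """Return a dict of the quoted params.* assignments found in the text."""
    return dict(ASSIGNMENT.findall(config_text))


def taxonomy_dates(config_text):
    """Return the date stamp found in each of the two taxonomy paths."""
    params = read_params(config_text)
    dates = {}
    for name in ("fastanidbpath", "fastanireflistpath"):
        match = DATE_STAMP.search(params.get(name, ""))
        dates[name] = match.group(0) if match else None
    return dates


# Hypothetical config text in which only one of the two lines was updated.
example = '''
params.fastanidbpath = "/file_path/fna_2022-04-29/*"
params.fastanireflistpath = "/file_path/fna_ref_lists_2022-03-01/*"
'''

dates = taxonomy_dates(example)
if len(set(dates.values())) > 1:
    print("Date stamps differ:", dates)
```

To run the same check on the real file, read its text first, for example with pathlib.Path("nextflow.config").read_text(), and pass that to taxonomy_dates.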


For the phylogeny module

The phylogeny module uses the maxOne database.

The location of the database is specified here:

params.lsbsrrefgenomefasta = "/file_path/fasta_YYYY-MM-DD/"

Genome file names must end in .fasta.

Example

The whole path to a fasta database built on 29th April 2022 is:

params.lsbsrrefgenomefasta = "/genomics/home/vol-genomics/genome_tools/reference_libraries/fasta_2022-04-29/"
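
Because every genome in the maxOne directory has to end in .fasta, it may help to list any files that do not before pointing the config at a new database. The sketch below is one possible check in Python, not something shipped with the Core Tool. The function name non_fasta_files is hypothetical, and the demonstration runs on a throwaway directory so that no real database is touched.

```python
import tempfile
from pathlib import Path


def non_fasta_files(directory):
    """Return the names of files in `directory` that do not end in .fasta."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and entry.suffix != ".fasta"
    )


# Demonstration on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("strain_a.fasta", "strain_b.fna", "notes.txt"):
        (Path(tmp) / name).touch()
    offenders = non_fasta_files(tmp)

print(offenders)
```

An empty list means every file in the directory has the expected extension. To check a real database, pass the directory set in params.lsbsrrefgenomefasta to non_fasta_files.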


Paths to kraken2 databases

You can either build a kraken2 database yourself or download a pre-built one. The Core Tool uses kraken2 databases in three places, each set by its own parameter:

  • kdbpath is used to determine the Genus of each query strain,
  • kdb_minikraken_path is only used if the isolate is of an unexpected Genus, and
  • kdb_reads_path is used to determine whether contamination is present in the raw reads.

Each parameter can point to a different kraken2 database:

params.kdbpath = "/file_path/database_name_YYYYMMDD"

params.kdb_minikraken_path = "/file_path/database_name_YYYYMMDD"

params.kdb_reads_path = "/file_path/database_name_YYYYMMDD"
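
A built kraken2 database directory contains the files hash.k2d, opts.k2d and taxo.k2d, so a quick way to confirm that a path points at a usable database is to look for them. The following Python sketch shows one way to do that. It is an illustration only, the function name missing_k2d_files is hypothetical, and the demonstration uses a temporary directory rather than a real database.

```python
import tempfile
from pathlib import Path

# Files that a built kraken2 database directory contains.
REQUIRED = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_k2d_files(db_dir):
    """Return the required kraken2 files that are absent from `db_dir`."""
    db = Path(db_dir)
    return [name for name in REQUIRED if not (db / name).is_file()]


# Demonstration: a made-up, incomplete database directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("hash.k2d", "opts.k2d"):
        (Path(tmp) / name).touch()
    missing = missing_k2d_files(tmp)

print(missing)
```

An empty list suggests the directory is a complete database. The same call can be made for each of the three paths set in the config file.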


Information

In the lines above, file_path stands for the following directory, which you are unlikely to need to change:

/genomics/home/vol-genomics/genome_tools/reference_libraries/

Take care not to modify any other text in the config file.