FAQs

What information does the Core require to run?

The Core requires raw, paired-end Illumina sequencing reads and a sample sheet.
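Because the reads must be paired-end, it can help to confirm that every forward read file has a matching reverse read file before starting a run. The following is a minimal sketch, not part of the Core itself. It assumes the input files sit in input_core/ and follow the *_R1.fastq.gz / *_R2.fastq.gz naming shown for 00_rawData/; the function name find_unpaired_reads is hypothetical.

```python
from pathlib import Path


def find_unpaired_reads(input_dir):
    """Return isolate names that have only one of the two read files.

    Assumes files are named <isolate>_R1.fastq.gz and <isolate>_R2.fastq.gz.
    """
    input_dir = Path(input_dir)
    # Strip the suffix to recover the isolate name (the '*' in the pattern).
    r1 = {p.name[: -len("_R1.fastq.gz")] for p in input_dir.glob("*_R1.fastq.gz")}
    r2 = {p.name[: -len("_R2.fastq.gz")] for p in input_dir.glob("*_R2.fastq.gz")}
    # Symmetric difference: names present in exactly one of the two sets.
    return sorted(r1 ^ r2)


if __name__ == "__main__":
    unpaired = find_unpaired_reads("input_core")
    if unpaired:
        print("Isolates missing a mate file:", ", ".join(unpaired))
    else:
        print("Every R1 file has a matching R2 file.")
```

If your raw files use a different suffix, adjust the two patterns accordingly.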

What files does the Core produce?

In the table below, '*' stands for the isolate name and '#' stands for the isolate's genus and species, which are added to the file name.

Folder name          Outputs
/                    run_parameters.txt
/                    software_version_report.txt
/                    failure_summary.csv
00_rawData/          *_R1.fastq.gz
                     *_R2.fastq.gz
01_rawPrep/          *_R1_001_val_1.fq.gz
                     *_R2_001_val_1.fq.gz
02_assemblies/       fa/#*.fa
                     gfa/#*.gfa
03_QC/               quast_report.tsv
                     seqTK.summary.csv
                     NB501042_139_HFHHCAFXY_multiqc_report_data/
                     NB501042_139_HFHHCAFXY_multiqc_report.html
                     counts/*_count.txt
                     kraken2_raw/*_kraken2_reads.report
04_mass_screening/   summary_'database'-75.tab
                     isolate/#*_'database'-75.tab
05_annotation/       #*_.faa
                     #*_.fna
                     #*_.gbk
                     #*_.gff
06_taxonomy/         summary_taxonomyTable.csv
                     isolate/*_ani.tab
07_phylogeny/        sequencerID_run#_flowCellID.nwk
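After a run, the listing above can be used as a checklist. The sketch below reports which of the listed top-level folders and files are absent. It assumes the outputs are written under output_core/ and that '/' in the table refers to the top level of that directory; the function name missing_outputs is hypothetical. A missing item is not necessarily an error, since the source does not state that every item is produced on every run.

```python
from pathlib import Path

# Folder names as listed in the output table.
EXPECTED_FOLDERS = [
    "00_rawData", "01_rawPrep", "02_assemblies", "03_QC",
    "04_mass_screening", "05_annotation", "06_taxonomy", "07_phylogeny",
]
# Files listed at the top level ('/') of the output directory.
EXPECTED_FILES = [
    "run_parameters.txt", "software_version_report.txt", "failure_summary.csv",
]


def missing_outputs(output_dir):
    """Return the expected folders and top-level files not found in output_dir."""
    output_dir = Path(output_dir)
    missing = [f + "/" for f in EXPECTED_FOLDERS if not (output_dir / f).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (output_dir / f).is_file()]
    return missing


if __name__ == "__main__":
    for item in missing_outputs("output_core"):
        print("Not found:", item)
```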

What are the different folders?

The Core's file system contains the following files and folders:

File or directory    Description of contents
core.nf              The Core Nextflow script.
nextflow.config      The Nextflow configuration file.
input_core/          Deposit the raw reads and sample sheet here.
output_core/         The Core deposits its output here.
work/                The Nextflow working directory.
config/              Ignore.
downloads/           Save your compressed data here.

How do you view the output files?

The tabular output files (.tab, .tsv and .csv) can be opened in a spreadsheet application, e.g. MS Excel.

The run_parameters.txt document can be opened in any text editor.
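If you prefer to inspect a result file from a script rather than a spreadsheet, Python's standard csv module can read it. This is a minimal sketch that assumes the .tab files are tab-delimited text; the function name read_tab is hypothetical, and the path you pass should be one of your own output files.

```python
import csv


def read_tab(path):
    """Read a tab-delimited file into a list of rows (each row a list of strings)."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))
```

For a .csv file, the same approach works with delimiter="," in place of the tab character.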

Is there a limit to how many genome assemblies the Core can screen?

There is no lower or upper limit on how many genomes the Core can analyse.