nf-core/configs: NCI Gadi HPC Configuration
nf-core pipelines have been successfully configured for use on the Gadi HPC at the National Computational Infrastructure (NCI), Canberra, Australia.
To run an nf-core pipeline on NCI Gadi, launch it with -profile singularity,nci_gadi. This will download and apply the nci_gadi.config file, which has been pre-configured for the NCI Gadi HPC cluster.
Access to NCI Gadi
Please be aware that you will need a user account, membership of a Gadi project, and a service unit allocation for that project in order to use this infrastructure. See the NCI user guide for details on getting access to Gadi.
Launch an nf-core pipeline on Gadi
Before running the pipeline, you will need to load Nextflow and Singularity, both of which are globally installed modules on Gadi (under /apps). You can do this by running the commands below:
module purge
module load nextflow
module load singularity
The version of Nextflow installed on Gadi has been modified to make it easier to specify resource options for jobs submitted to the cluster through the Nextflow process block (see NCI’s Gadi user guide for more details).
You can then run the pipeline using:
nextflow run <nf-core_pipeline>/main.nf \
-profile singularity,nci_gadi \
<additional flags>
Cluster considerations
External network access
Please be aware that NCI Gadi HPC compute nodes do not have external network access. This means you will not be able to pull the workflow codebase or containers if you submit your nextflow run command as a job on any of the standard job queues (see the nf-core documentation for instructions on running pipelines offline). NCI currently recommends you run your Nextflow head job either in a GNU screen or tmux session within a persistent session, or submit it as a job to the copyq.
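Because of this restriction, you may also want to fetch the pipeline code from a login node (which does have network access) before launching anything; this is only a sketch, and nf-core/rnaseq below is an illustrative pipeline name:
nextflow pull nf-core/rnaseq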
For example, to run Nextflow in a GNU screen session within a persistent session:
persistent-sessions start -p <project> <ps_name>
ssh <ps_name>.<user>.<project>.ps.gadi.nci.org.au
screen -S <screen_name>
nextflow run ...
You can detach from the screen session using Ctrl+A, then D, and log out of the persistent session while the pipeline continues to run. Later, you can reconnect to the persistent session using the same ssh command and reattach to the screen session with: screen -r <screen_name>.
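Alternatively, to use the copyq route, a PBS submission script for the Nextflow head job might look like the minimal sketch below (the project code ab12 is a placeholder; check the current copyq limits and adjust the walltime, memory, and storage directives to suit your run):
#!/bin/bash
#PBS -P ab12
#PBS -q copyq
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=10:00:00
#PBS -l storage=gdata/ab12+scratch/ab12
#PBS -l wd

module purge
module load nextflow
module load singularity

nextflow run <nf-core_pipeline>/main.nf \
    -profile singularity,nci_gadi \
    <additional flags>
Submit the script from a login node with qsub, for example qsub launch_nextflow.pbs (the script name is arbitrary).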
Downloading containers
This config requires Nextflow to use Singularity to execute processes. Before any process can be executed, the nf-core pipeline will first download the required container image to a local cache. This cache location can be specified using either the $NXF_SINGULARITY_CACHEDIR environment variable or the singularity.cacheDir setting in the Nextflow config file. nci_gadi.config specifies the download and storage location with:
singularity.cacheDir = "/scratch/${params.nci_gadi_project}/${System.getenv('USER')}/nxf_singularity_cache"
See the project accounting section below for details on params.nci_gadi_project.
Furthermore, Singularity uses the $SINGULARITY_CACHEDIR directory to store intermediate image layers and files during pulls (note that this cache is only used when the required container is not already available in Nextflow’s own Singularity cache, specified by $NXF_SINGULARITY_CACHEDIR or singularity.cacheDir). By default, $SINGULARITY_CACHEDIR is set to $HOME/.singularity/cache. For pipelines with many or large first-time container downloads, we recommend pointing this environment variable at a scratch location to avoid exceeding your home filesystem quota. For example, before running your nextflow run command, set the environment variable to a location on the scratch filesystem with:
export SINGULARITY_CACHEDIR=/scratch/$PROJECT/$USER/singularity_cache
Gadi queues and job submission
This config currently determines which Gadi queue each task job is submitted to based on the amount of memory requested. For the sake of resource and cost (service unit) efficiency, the following rules are applied:
- Tasks requesting less than 128 GB will be submitted to the normalbw queue
- Tasks requesting 128 GB up to 190 GB will be submitted to the normal queue
- Tasks requesting more than 190 GB and up to 1020 GB will be submitted to the hugemembw queue
Note that these are only baseline queue settings and may be adjusted depending on the goals of your pipeline run and the most efficient use of the HPC. You can make a local copy of the nci_gadi.config and modify the queue assignments as needed for specific processes or process groups. See the NCI Gadi queue limit documentation for more information on the available queues and their associated charge rates.
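For example, a minimal sketch of such an override (QUALIMAP_BAMQC is a hypothetical process name; substitute the process you want to retarget), which you could keep in a local copy of the config or pass to your run with -c my_overrides.config:
process {
    withName: 'QUALIMAP_BAMQC' {
        queue = 'normal'
    }
}
Settings under withName apply only to the matching process and leave the memory-based defaults in place for everything else.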
Project accounting
This config uses params.nci_gadi_project to assign a project code to all task job submissions for billing purposes. By default, this is set to the environment variable $PROJECT. If you are a member of multiple Gadi projects, you can choose which project will be charged for your pipeline execution by setting params.nci_gadi_project (--nci_gadi_project on the command line) to the desired project code.
Similarly, params.nci_gadi_storage (--nci_gadi_storage on the command line) is used to specify the storage locations that the pipeline needs to access. By default, this is set to gdata/${params.nci_gadi_project}+scratch/${params.nci_gadi_project}.
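For example (the project codes ab12 and cd34 below are placeholders), to charge project ab12 and also request access to another project's /g/data area:
nextflow run <nf-core_pipeline>/main.nf \
    -profile singularity,nci_gadi \
    --nci_gadi_project ab12 \
    --nci_gadi_storage gdata/ab12+scratch/ab12+gdata/cd34 \
    <additional flags>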
Note: as described above, the version of Nextflow installed on Gadi has been modified to support non-standard resource options in the Nextflow process block (see NCI’s Gadi user guide for more details). The values specified through the parameters above are passed to the project and storage directives in the process block of nci_gadi.config.
Resource usage
To help monitor the service unit (SU) cost of running workflows on Gadi, a plugin has been developed to generate a report in CSV or JSON format upon workflow completion. The nf-gadi plugin is available via the Nextflow plugin registry and can be enabled by adding -plugins nf-gadi to your Nextflow run command. See the plugin project repository for more details.
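For example, a sketch of a run command with the plugin enabled:
nextflow run <nf-core_pipeline>/main.nf \
    -profile singularity,nci_gadi \
    -plugins nf-gadi \
    <additional flags>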
Additionally, the Sydney Informatics Hub provides a script to collect per-task SU costs. Upon workflow completion, run the gadi_nfcore_report.sh script in your workflow execution directory; it collects resource requests and usage from the PBS job logs written to each task’s .command.log and summarises them per process in the output gadi-nf-core-joblogs.tsv file:
bash gadi_nfcore_report.sh
Config file
// NCI Gadi nf-core configuration profile
params {
    config_profile_description = 'NCI Gadi HPC profile provided by nf-core/configs'
    config_profile_contact = 'Georgie Samaha (@georgiesamaha), Kisaru Liyanage (@kisarur), Matthew Downton (@mattdton)'
    config_profile_url = 'https://opus.nci.org.au/display/Help/Gadi+User+Guide'
    nci_gadi_project = System.getenv("PROJECT")
    nci_gadi_storage = "gdata/${params.nci_gadi_project}+scratch/${params.nci_gadi_project}"
}

validation.ignoreParams = ["nci_gadi_project", "nci_gadi_storage"]

// Enable use of Singularity to run containers
singularity {
    enabled = true
    autoMounts = true
    cacheDir = "/scratch/${params.nci_gadi_project}/${System.getenv('USER')}/nxf_singularity_cache"
}

// Submit up to 300 concurrent jobs (Gadi exec max)
executor {
    queueSize = 300
}

// Define process resource limits
process {
    executor = 'pbspro'
    project = "${params.nci_gadi_project}" // The version of Nextflow installed on Gadi has been modified to allow usage of this non-standard directive
    storage = "${params.nci_gadi_storage}" // The version of Nextflow installed on Gadi has been modified to allow usage of this non-standard directive
    module = 'singularity'
    cache = 'lenient'
    stageInMode = 'symlink'
    queue = { task.memory < 128.GB ? 'normalbw' : (task.memory >= 128.GB && task.memory <= 190.GB ? 'normal' : (task.memory > 190.GB && task.memory <= 1020.GB ? 'hugemembw' : '')) }
}
Pipeline configs
Some nf-core pipelines have their own Gadi-specific profiles in nf-core/configs; the profile below, for example, is used for nf-core/proteinfold.
// NCI Gadi nf-core configuration profile
profiles {
    nci_gadi {
        params {
            config_profile_description = 'nf-core/proteinfold NCI Gadi HPC profile provided by nf-core/configs'
            config_profile_contact = 'Mitchell O\'Brien (@mitchob)'
            config_profile_url = 'https://opus.nci.org.au/display/Help/Gadi+User+Guide'
            project = System.getenv("PROJECT")
            storage_account = ''
        }

        // Define process resource limits
        process {
            executor = 'pbspro'
            project = System.getenv("PROJECT")
            storage = params.storage_account?.trim() ? params.storage_account : "scratch/${params.project}+gdata/${params.project}"
            module = 'singularity'
            cache = 'lenient'
            stageInMode = 'symlink'

            // Process-specific configurations
            withName: 'RUN_ALPHAFOLD2|RUN_ALPHAFOLD2_PRED|RUN_ALPHAFOLD2_MSA' {
                queue = params.use_gpu ? 'gpuvolta' : 'normal'
                cpus = 48
                gpus = 4
                time = '4h'
                memory = 380.GB
            }
            withName: COLABFOLD_BATCH {
                container = "nf-core/proteinfold_colabfold:1.1.1"
                queue = params.use_gpu ? 'gpuvolta' : 'normal'
                cpus = 48
                gpus = 4
                time = '4h'
                memory = 380.GB
            }
            withName: RUN_ESMFOLD {
                container = "nf-core/proteinfold_esmfold:1.1.1"
                queue = params.use_gpu ? 'gpuvolta' : 'normal'
                cpus = 48
                gpus = 4
                time = '4h'
                memory = 380.GB
            }
        }

        // Write custom trace file with outputs required for SU calculation
        def trace_timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss')
        trace {
            enabled = true
            overwrite = false
            file = "./gadi-nf-core-trace-${trace_timestamp}.txt"
            fields = 'name,status,exit,duration,realtime,cpus,%cpu,memory,%mem,rss'
        }
    }
}