1 Summary
2 Job Priorities
- 2.1 Usage and Contingents
- 2.2 Priority
  - 2.2.1 Quality-of-Service (QoS): qos_factor
  - 2.2.2 Partition Factor: partition_factor
  - 2.2.3 Waiting Time: age_factor
  - 2.2.4 Job Size: job_size_factor
3 How to get a Higher Priority

Summary

The priority of your jobs is determined mainly by how your project's cluster usage (CPU-core-hours and GPU/FPGA-hours) compares to its granted contingent.

Important tools:

- pc2status: view your projects and their compute-resource usage, quotas, and your usage of the express-priority
- squeue_pretty: view the priorities of your pending jobs
- spredict: show the estimated start time of your pending jobs
- sprio: show how the different factors contribute to the overall priority of your pending jobs

If you need a higher priority, you can use the express-priority or the fpgasynthesis-priority, or request an increase in resources.

Job Priorities

Usage and Contingents

You can view your compute-time projects, their cluster usage, and related information with the command-line tool pc2status. The main quantity that determines the priority of your jobs is how the project's cluster usage (CPU-core-hours and GPU/FPGA-hours) compares to the granted contingent.

Project Usage U30: The 30-day project usage is the sum of the project usage in the last 30 days plus the resources that the currently running jobs of the project would consume assuming that they run till their time limit.

Project Usage U60: The 60-day project usage is the sum of the project usage in the last 60 days plus the resources that the currently running jobs of the project would consume assuming that they run till their time limit. U60-U30 is the usage in the period between 60 days and 30 days ago.

Contingent C30: The 30-day contingent for the individual resources (CPU-Core-hours, GPU/FPGA-hours) is the total granted compute resources for 30 days, i.e. (total granted)/(project duration in days)*(30). If the project has not been running for 30 days yet or the remaining time is less than 30 days then the C30 is lowered accordingly.

Contingent C90: The 90-day contingent for the individual resources (CPU-Core-hours, GPU/FPGA-hours) is the sum of:

- the C30,
- half of max(0, C30-(U60-U30)), i.e. half of the C30 contingent that remained unused in the period between 60 days and 30 days ago,
- and half of the C30, i.e. half of the future C30 contingent.
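The two contingents can be sketched in a few lines of Python. This is a minimal illustration of the definitions above, not the scheduler's actual implementation. The function names and the `window_days` parameter are hypothetical, and the linear scaling used for young or ending projects is an assumption about what "lowered accordingly" means. The C90 function assumes a project that has been running for more than 60 days and has more than 30 days left.

```python
def contingent_c30(total_granted: float, duration_days: float,
                   window_days: float = 30.0) -> float:
    """30-day contingent: granted resources pro-rated to the window.

    window_days < 30 models a project that started recently or ends soon
    (assumption: the contingent scales linearly with the days available).
    """
    if duration_days <= 0:
        raise ValueError("duration_days must be positive")
    return total_granted / duration_days * min(window_days, 30.0)


def contingent_c90(c30: float, usage_30_to_60_days_ago: float) -> float:
    """90-day contingent: C30 + half the unused past C30 + half the future C30.

    usage_30_to_60_days_ago corresponds to U60 - U30 in the text.
    """
    unused_past = max(0.0, c30 - usage_30_to_60_days_ago)
    return c30 + 0.5 * unused_past + 0.5 * c30


# Made-up numbers: 36000 core-hours granted over 360 days,
# 1500 core-hours used in the period between 60 and 30 days ago.
c30 = contingent_c30(36000, 360)
print(c30, contingent_c90(c30, usage_30_to_60_days_ago=1500))
```

Note that usage above the C30 in the past period does not reduce the C90 below 1.5*C30, because the middle term is clamped at zero.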

Examples:

A project has started 5 days ago but hasn’t used any resources yet: C30=(total granted)/(project duration in days)*5, C90=C30+0.5*C30=1.5*C30
A project that
- has been running for more than 60 days and will run for more than 30 days in the future,
- hasn’t used any resources in the last 30 days,
- and has used half of the C30 in the time period between 30 days and 60 days ago
has C90=C30+1/2*(C30-1/2*C30)+1/2*C30=1.75*C30

Priority

The priority is an integer; the higher, the better. You can view the priority of your pending jobs with squeue_pretty.

The priority is computed as:

Job_priority = 500000 * qos_factor
             + 50000 * partition_factor
             + 35000 * age_factor
             + 15000 * job_size_factor

The qos_factor depends on the QoS level of a job.
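As a rough illustration, the weighted sum can be written as a small Python function. This is a sketch under stated assumptions, not the scheduler's code: the weights are read as plain integers (treating the dot in the formula as a thousands separator), each factor is assumed to lie between 0 and 1, and rounding to the nearest integer is an assumption about how the integer priority is obtained. The function and constant names are hypothetical.

```python
# Weights from the formula above, read as plain integers (assumption).
WEIGHT_QOS = 500_000
WEIGHT_PARTITION = 50_000
WEIGHT_AGE = 35_000
WEIGHT_JOB_SIZE = 15_000


def job_priority(qos_factor: float, partition_factor: float,
                 age_factor: float, job_size_factor: float) -> int:
    """Weighted sum of the four priority factors, each expected in [0, 1]."""
    factors = {
        "qos_factor": qos_factor,
        "partition_factor": partition_factor,
        "age_factor": age_factor,
        "job_size_factor": job_size_factor,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    total = (WEIGHT_QOS * qos_factor
             + WEIGHT_PARTITION * partition_factor
             + WEIGHT_AGE * age_factor
             + WEIGHT_JOB_SIZE * job_size_factor)
    return round(total)  # assumption: rounded to the nearest integer


# A job with qos_factor 0.6 that has waited half of the maximum age
# and requests 10 % of the cluster, on a partition with factor 0.
print(job_priority(0.6, 0.0, 0.5, 0.1))
```

The weights show that the QoS dominates: with a partition_factor of 0, a job with qos_factor 0.6 reaches at most 350000 even at maximum age and job size, which is still below a freshly submitted job with qos_factor 0.8 at 400000.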

Quality-of-Service (QoS): qos_factor

The following QoS depend on the project usage. Only one of them is active for a project at a time.

QoS name	Usage	QoS factor
cont	The project has used no more than its C30 contingent in the last 30 days, i.e. U30<=C30.	0.6
lowcont	C30<U30<=C90	0.4
nocont	U30>C90 or total usage>granted contingent	0.2
suspended	project is not active, i.e. expired, locked or hasn’t started yet	0
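The selection rules in the table can be sketched as a decision function. This is only an illustration of the table, with hypothetical names. It considers a single resource type, because how the CPU and GPU/FPGA usages are combined is not stated here, and it assumes C30<=C90, which follows from the definition of the C90.

```python
def project_qos(u30: float, c30: float, c90: float,
                total_usage: float, total_granted: float,
                active: bool = True) -> tuple[str, float]:
    """Return (QoS name, QoS factor) for a project, following the table."""
    if not active:
        # Expired, locked, or not yet started.
        return "suspended", 0.0
    if u30 > c90 or total_usage > total_granted:
        return "nocont", 0.2
    if u30 > c30:
        # Here C30 < U30 <= C90.
        return "lowcont", 0.4
    return "cont", 0.6


# Made-up numbers: C30 = 1000 and C90 = 1750 core-hours.
print(project_qos(u30=800, c30=1000, c90=1750,
                  total_usage=5000, total_granted=12000))
print(project_qos(u30=1200, c30=1000, c90=1750,
                  total_usage=5000, total_granted=12000))
```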

There are also special QoS that can be chosen by the user for jobs:

QoS name	Usage	QoS limits	QoS factor	Limits
test	for test projects	at most 2 running jobs per user and at most 2 submitted jobs per user	0.8	maxSubmitJobs per user = 2, maxRunningJobs per user = 2
express	urgent jobs with high priority	1000 CPU-core-hours and 30 GPU/FPGA-hours per user and month	0.8	maxSubmitJobs per user = 100
fpgasynthesis	FPGA synthesis for FPGA projects	allowed only on the partitions normal and largemem, only single-node jobs, at most 10 running jobs per user	0.8	maxRunningJobs per user = 10
eaccess	for employees of Paderborn University (see HPC Easy Access), no formal compute-time project needed	currently 500 CPU-core-hours plus 5 GPU-hours per user and month	0.6	maxSubmitJobs per user = 100
devel	for GPU testing and development (DGX node on Noctua2)	see detail page

Partition Factor: partition_factor

The partition_factor is 0 for most partitions. Only the fpga-partition has a factor of 1. This is done to allow more freedom in possible future overlapping of partitions.

Waiting Time: age_factor

The age_factor depends on the waiting time of a job. A job that was just submitted has an age_factor of 0. While the job is pending the age_factor grows linearly till it reaches a value of 1 after 10 days of waiting.
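The linear growth described above can be written as a one-line formula. The following Python sketch uses a hypothetical function name and measures the waiting time in days; it illustrates the description and is not taken from the scheduler configuration.

```python
def age_factor(waiting_days: float) -> float:
    """Grows linearly from 0 at submission to 1 after 10 days, then stays at 1."""
    if waiting_days < 0:
        raise ValueError("waiting_days must not be negative")
    return min(1.0, waiting_days / 10.0)


for days in (0, 5, 10, 25):
    print(days, age_factor(days))
```

With the weight of 35000 from the priority formula, each day of waiting adds 3500 to the job priority until the maximum is reached after 10 days.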

Job Size: job_size_factor

The job_size_factor prioritizes larger jobs. The value between 0 and 1 is given by requested resources divided by the total resources available in the cluster. A full-system job has a value of 1.

How to get a Higher Priority

Increasing the Project Contingent

If you either temporarily or permanently need a higher project contingent, then the PI of a project can request it from the Resource Allocation Board of PC2 via mail to pc2-support@uni-paderborn.de. Please include a justification for why you need the increase. Normal and large projects can be increased by at most 25 % over the initially granted total contingent.

express-Priority

The express-QoS (#SBATCH -q express) gives your jobs a very high priority. Each user has a monthly quota, which is listed in the table above. Once this quota is exceeded, no further jobs can be submitted with the express-priority until the quota is refreshed at the beginning of the next month.

You can only use the partitions that you could also use via the regular priority in a compute-time project. Thus, if you don’t have access to GPUs in a compute-time project, you can’t access them via the express-priority either.

fpgasynthesis-Priority (Noctua 2 only)

Projects that have been granted access to FPGAs also have access to the fpgasynthesis-QoS (#SBATCH -q fpgasynthesis). Using it gives their FPGA bitstream-synthesis jobs a high priority. The only allowed partitions are normal and largemem. Only single-node jobs are allowed, with at most 10 running jobs per user.

PC2-Documentation

Quality-of-Service (QoS) and Job Priorities