Resource limits have been added to Hive! Each user is now limited in the resources they can use on the public partitions, which makes Hive more “fair” and usable by a larger number of users.
There are now 3 hive and 3 ckpt partitions instead of a single hive and a single ckpt partition. The full list of partitions and their limits:
hive1d - DEFAULT partition, limits: WallTime: 1 day, MaxCPU’sPerUser = 200
hive7d - limits: WallTime: 7 days, MaxCPU’sPerUser = 100
hive31d - limits: WallTime: 31 days, MaxCPU’sPerUser = 50
ckpt1d - limits: WallTime: 1 day, MaxCPU’sPerUser = 800, Preemption=REQUEUE (low-priority preemptable partition)
ckpt7d - limits: WallTime: 7 days, MaxCPU’sPerUser = 600, Preemption=REQUEUE (low-priority preemptable partition)
ckpt31d - limits: WallTime: 31 days, MaxCPU’sPerUser = 400, Preemption=REQUEUE (low-priority preemptable partition)
queen - limits: WallTime: 31 days, MaxNODE’sPerUser = 1
ckptqueen - limits: WallTime: 31 days, Preemption=REQUEUE (low-priority preemptable partition)
mpi - limits: MinCPU’sPerUser = 200
*WallTime = maximum time that each job can run
*MaxCPU’sPerUser = maximum number of CPUs a user can use on a specific partition
*MaxNODE'sPerUser = maximum number of nodes a user can use on a specific partition
*MinCPU'sPerUser = minimum resource (CPU) allocation per job
*Preemption=REQUEUE = low-priority partition: any job running on that partition can be preempted/checkpointed by a higher-priority partition. A preempted job is requeued and restarts from the beginning or from its last checkpoint
Job submission recommendations:
Now that we have limits on the system, I recommend that all users give as much information as they can when submitting jobs. This helps you keep good usage statistics and a high priority relative to other users. Users who do not give this information will have lower priority, because the system takes everything into account when calculating priority.
* When submitting a job, ask for the resources your job needs and not MORE than it needs
* If your job uses memory, do not forget to allocate the required amount of memory at submission (flag: --mem=<MB>)
* Do not forget to add the time flag when submitting jobs. If you run a 3-day job on a 7-day partition, requesting 3 days saves 4 days in your statistics (flag: -t, --time=<time>)
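As an illustration of the recommendations above, here is a minimal sketch of a batch script for a 3-day job on the hive7d partition. The job name, CPU count, memory amount and the commented-out program name are hypothetical placeholders, not values taken from Hive; adjust them to what your job really needs.

```shell
#!/bin/bash
# Sketch of a job script; all resource values below are assumed examples.
#SBATCH --job-name=example_job      # hypothetical job name
#SBATCH --partition=hive7d          # 7-day partition from the list above
#SBATCH --time=3-00:00:00           # request only 3 days (days-hours:minutes:seconds)
#SBATCH --ntasks=1                  # a single task
#SBATCH --cpus-per-task=4           # assumed CPU need; must stay within MaxCPU'sPerUser
#SBATCH --mem=8000                  # memory in MB, as with --mem=<MB>

# Slurm sets SLURM_CPUS_PER_TASK inside the job; fall back to 1 elsewhere.
CPUS="${SLURM_CPUS_PER_TASK:-1}"
echo "Job starting with ${CPUS} CPU(s)"

# Replace the line below with your own program (placeholder name):
# ./my_program --threads "${CPUS}"
```

Assuming the script is saved as job.sh (a hypothetical file name), it would be submitted with: sbatch job.sh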