Limits have been added for Hive resources! Each user is now limited in the resources they can use under the public partitions, which makes Hive fairer and usable by a larger number of users.
The single hive partition has been replaced by 3 hive partitions, listed below together with the other partitions (the maximum number of CPUs per user may vary due to fine-tuning):
hive1d - DEFAULT partition, limits: WallTime: 1 day, MaxCPU’sPerUser ~ 1200
hive7d - limits: WallTime: 7 days, MaxCPU’sPerUser ~ 350
hiveunlim - limits: WallTime: unlimited, MaxCPU’sPerUser ~ 180
preempt1d - limits: WallTime: 1 day, MaxCPU’sPerUser ~ 2400
preempt7d - limits: WallTime: 7 days, MaxCPU’sPerUser ~ 1800
queen - limits: WallTime: 31 days, MaxNODE’sPerUser ~ 80
mpi - limits: MinCPU’sPerUser = 1
*WallTime= maximum time that each job can run
*MaxCPU’sPerUser= maximum number of CPUs a user can use under a specific partition
*MaxNODE'sPerUser= maximum number of nodes a user can use under a specific partition
*MinCPU'sPerUser= minimum resource allocation per job
*Preemption=REQUEUE= low-priority partition: any job running on that partition can be preempted/checkpointed by a higher-priority partition. A preempted job restarts from the beginning or from its last checkpoint
Job submission recommendations:
Now that there are limits on the system, I recommend that all users give as much information as possible when submitting jobs. This helps you keep good usage statistics and a high priority relative to other users. Users who do not provide this information will have lower priority, because the system tracks and calculates all of this.
*When submitting a job, request the resources your job needs and no more than that
* If your job uses memory, do not forget to allocate the required amount of memory during submission (flag: --mem=<MB>)
* Do not forget to add the time flag when submitting jobs: if you are running a 3-day job under a 7-day partition, you could save 4 days in your statistics… (flag: -t, --time=<time>)
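
The recommendations above could be combined in a job script along these lines. This is only a minimal sketch: the job name, the memory and CPU values, and the commented-out program are hypothetical placeholders, and the partition is simply taken from the list above. Adjust every value to what your job really needs.

```shell
#!/bin/bash
# Illustrative job script; job name, resource values and workload are assumptions.
#SBATCH --job-name=example_job
# Partition with the 7-day WallTime limit (from the list above).
#SBATCH --partition=hive7d
# Expected run time of 3 days, written as days-hours:minutes:seconds.
#SBATCH --time=3-00:00:00
# Memory required per node, in MB.
#SBATCH --mem=4000
# One task using 4 CPUs; request only what the job needs.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

# Slurm sets SLURM_CPUS_PER_TASK inside the job; fall back to 1 outside Slurm.
THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Starting job with ${THREADS} CPU(s)"

# Hypothetical workload; replace with your own program.
# ./my_program --threads "${THREADS}"
```

Such a script would typically be submitted with sbatch, for example `sbatch job.sh` (the file name is an assumption).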