FAQ

How do I use an SSH connection to Hive from Windows/Linux/Mac OS?

To connect to Hive, use the SSH protocol.

Windows: To use an SSH connection from Windows, you need a program that lets your operating system use the SSH protocol. We recommend PuTTY:

1. Download PuTTY from the official website: http://www.putty.org/

2. Install PuTTY on your computer; it takes only a few clicks.

3. In PuTTY, open the Session category.

4. In the Host Name (or IP address) field, enter our access node server, hive01.haifa.ac.il, and in the Port field enter 22.

Linux/Mac: Linux and Mac OS users do not need to download any additional software to connect to Hive over SSH. Just open the Terminal application that comes preinstalled with the OS and run the command: # ssh username@hive01.haifa.ac.il
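On Linux and Mac OS, the connection details can optionally be stored in the OpenSSH client configuration, so the full host name does not have to be typed every time. The following is a minimal sketch, not an official Hive setup: the alias "hive" and the user name "your_username" are placeholders chosen for illustration.

```shell
# Assumptions: the alias "hive" and the user name "your_username" are placeholders.
mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"

# Append a host entry to the per-user OpenSSH client configuration.
cat >> "$HOME/.ssh/config" <<'EOF'
Host hive
    HostName hive01.haifa.ac.il
    User your_username
    Port 22
EOF

# Keep the configuration file private to your user.
chmod 600 "$HOME/.ssh/config"
```

With such an entry in place, "ssh hive" should be equivalent to "ssh your_username@hive01.haifa.ac.il".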


How do I upload/download files to/from Hive?

There are a few ways to upload files to Hive and download files from it:

Without additional software (for Linux users):

If you have an SSH connection to hive01.haifa.ac.il, you can use the standard # scp command to copy files between your computer and Hive.

The examples below assume you are running the commands on your local Linux computer:

Copy the file "data.txt" from Hive to your local host:

$ scp your_username@hive01.haifa.ac.il:data.txt /some/local/directory

Copy the file "data.txt" from your local host to Hive:

$ scp data.txt your_username@hive01.haifa.ac.il:/some/remote/directory
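The same scp command can copy a whole directory when given the -r option. The sketch below wraps this in a small helper; the function name hive_upload_dir and the DRY_RUN switch are hypothetical conveniences, not part of scp or of Hive.

```shell
#!/bin/bash
# Hypothetical helper: copy a whole directory to Hive with "scp -r".
HIVE_HOST="hive01.haifa.ac.il"

hive_upload_dir() {
    # $1 = user name on Hive, $2 = local directory, $3 = remote destination directory
    local user="$1" src="$2" dest="$3"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command, which is useful for checking the paths first.
        echo "scp -r $src $user@$HIVE_HOST:$dest"
    else
        scp -r "$src" "$user@$HIVE_HOST:$dest"
    fi
}
```

For example, "DRY_RUN=1 hive_upload_dir your_username results /some/remote/directory" prints the scp command that would be run, without copying anything.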

With software:

If you are a Windows user, or you simply prefer a GUI client for transferring files, we recommend using an FTP client. You can download clients here:

FileZilla - https://filezilla-project.org/ A convenient client that runs on Windows/Mac OS/Linux.

WinSCP - http://winscp.net/eng/download.php Another popular FTP/SFTP client for Windows.


How do I use a VPN connection?

If you do not have a VPN account, you need to get one first. Ask your system administrator, or call University tech support directly (number: 2609), and request a VPN connection. The VPN gives you a university IP address so that you can work on the Hive HPC from outside the university.

If you already have an account, there are two ways to connect:

1. Go to ssl.haifa.ac.il or sslt.haifa.ac.il, enter your username and password, and follow the steps the web client shows you in order to connect.

2. Download the Junos Pulse client and connect with the native client for your operating system (it works on iPhone and Android too!).


How do I change my Hive user password?

Use the standard # passwd command. If you have forgotten your password, ask your system administrator for help.


I am a new user and I want to run a job. Which partition should I use?

Welcome to Hive!

If you are a new user submitting your first simple job to the queue, run it on one of these partitions: hive1d, hive7d or hive31d. These partitions contain only the public nodes and have no preemption (preemption is when a job with higher priority forces your job to restart and takes over its resources). You can be sure that your job will run until it finishes.

Please refer to the Slurm, Limitations and Hardware sections in the website menu to learn how it all works.
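A first job script for one of these partitions might look like the sketch below. It only uses options that appear elsewhere in this FAQ; the job name and the echoed message are placeholders.

```shell
#!/bin/bash
#SBATCH -n 1                    # one task is enough for a first test
#SBATCH -J my-first-job         # job name (placeholder)
#SBATCH --partition=hive1d      # one of the public partitions without preemption
#SBATCH --output="%j.out"       # STDOUT goes to <job number>.out
#SBATCH --error="%j.err"        # STDERR goes to <job number>.err

# The actual work: report which node ran the job.
message="Hello from $(hostname)"
echo "$message"
```

Saved as first_job.sh (a hypothetical file name), the script would be submitted with "sbatch first_job.sh", and "squeue -u your_username" would show its state in the queue.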


How can I see what public software is installed?

All public software is installed under /data/apps

Most public software is installed as a module. To check which modules are installed, use the command: # module avail

To see which public software modules are already loaded for your user, use the command: # module list

Some software is not installed as a module, so also list the public storage path to see what is installed there.
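When the list of modules is long, it can help to filter it. The sketch below defines a hypothetical helper, find_module; it assumes that "module avail" prints its list to STDERR, which is the usual behaviour but may differ between module systems.

```shell
# Hypothetical helper: search the list of available modules for a name.
find_module() {
    # "module avail" usually prints to STDERR, so redirect it into the pipe;
    # grep -i makes the match case-insensitive.
    module avail 2>&1 | grep -i -- "$1"
}
```

For example, "find_module raxml" would print only the module names that contain "raxml", and "ls /data/apps" lists everything under the public storage path, including software that is not installed as a module.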


How do I load and unload modules?

First, check which modules are currently available by running:

# module avail

To load or unload software that is installed as a module on the public storage, use the commands:

# module load MODULENAME

# module unload MODULENAME

To see the list of modules already loaded in your environment, run:

# module list
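In scripts it is sometimes convenient to load a module only when it is not loaded yet. The sketch below relies on the LOADEDMODULES environment variable, a colon-separated list that the module command maintains; the helper names is_loaded and ensure_module are hypothetical.

```shell
# Hypothetical helpers: load a module only if it is not loaded already.
is_loaded() {
    # Match the full module name (including the version) against the
    # colon-separated LOADEDMODULES list.
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

ensure_module() {
    if is_loaded "$1"; then
        echo "$1 is already loaded"
    else
        module load "$1"
    fi
}
```

For example, "ensure_module RAxML/8.1.15" loads the module on first use and only prints a message afterwards. Note that the name must match exactly, so "RAxML" alone does not match "RAxML/8.1.15".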


Where can I get some help on using the command line interface on the cluster?

The command line interface is provided by a Linux shell. Several different Linux shells are available for users. By default, users are provided with bash. For basics, refer to Introduction to Linux.


What kind of software can be installed in public storage?

Any software that is large, or that will be used by a whole user group or at least by a few users, can be installed on the public storage. Please ask your group owner to ask the system administrator to install the required software.


Can I run jobs on the access node?

Definitely NO! The access node is a small virtual machine meant only for connecting to Hive. All jobs must run on compute nodes, through the Slurm resource manager.


How should I allocate resources for RAxML?

RAxML has a different executable for each running method. A few of the methods are described below:

Single-Threaded (Serial) - The serial version of RAxML is called raxmlHPC-SSE3 and is a single-threaded application. This means that with the raxmlHPC-SSE3 executable, jobs are limited to one CPU on one node. The default Slurm allocation is one CPU on one node, so you do not have to specify resources for this job; if you want to be explicit, you can tell Slurm to use one node (-N 1) and one task (-n 1).

Multi-Threaded (Parallel) - The multi-threaded version of RAxML is called raxmlHPC-PTHREADS-SSE3 and can use multiple processors on a single compute node. This means that with the raxmlHPC-PTHREADS-SSE3 executable, a job is limited to one node but can use as many CPUs as that compute node has. Attention: you cannot tell RAxML to use more threads than the number of tasks you allocated in Slurm. Example of running RAxML in multi-threaded mode under the Slurm resource manager:

#!/bin/bash

#SBATCH -n 20 # number of tasks to allocate, for thin compute nodes maximum is 20, for queen server maximum is 32

#SBATCH -J My-RAxML-Multi-Threaded-job # Job name

#SBATCH --partition=hive

#SBATCH --mail-type=ALL # When to send mail (BEGIN, END, FAIL, REQUEUE, ALL)

#SBATCH --mail-user=yourmail@univ.haifa.ac.il # Where to send mail. A valid email address

#SBATCH --error="%j.err" # Direct STDERR here (file identifier), %j is substituted for the job number

#SBATCH --output="%j.out" # Direct STDOUT here (file identifier), %j is substituted for the job number

. /etc/profile.d/modules.sh

module load RAxML/8.1.15 openmpi/1.8.4

raxmlHPC-PTHREADS-SSE3 ... -T 20
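To avoid the thread count drifting away from the allocation, the value passed to -T can be read from the job environment instead of being typed twice. This is a sketch under one assumption: Slurm sets SLURM_NTASKS inside the job when -n is given. The helper name raxml_threads is hypothetical.

```shell
# Hypothetical helper: number of threads to pass to RAxML with -T.
raxml_threads() {
    # Inside a job submitted with -n, Slurm sets SLURM_NTASKS to that value;
    # outside a job, fall back to a single thread.
    echo "${SLURM_NTASKS:-1}"
}
```

In the script above, the last line could then end with -T "$(raxml_threads)" in place of -T 20, so changing the -n value is enough.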

Hybrid (Parallel) - The hybrid (MPI and multi-threading) executable of RAxML is called raxmlHPC-HYBRID-SSE3 and uses multiple processors on multiple compute servers. There is also an MPI-only executable, raxmlHPC-MPI-SSE3, which can use multiple processors that may or may not be on the same compute node. With raxmlHPC-HYBRID-SSE3 you can run jobs across several nodes and with a large number of CPUs (as many as you can get). Here is an example of running the hybrid executable with MPI:

#!/bin/bash

#SBATCH -N 6 # number of nodes you would like to allocate

#SBATCH -n 120 # number of tasks you would like to allocate

#SBATCH -J My-RAxML-MPI-Job # jobname

#SBATCH --partition=hive # partition name

#SBATCH --mail-type=ALL # When to send mail (BEGIN, END, FAIL, REQUEUE, ALL)
#SBATCH --mail-user=mymail@univ.haifa.ac.il # Where to send mail. A valid email address

#SBATCH --error="%j.err" # Direct STDERR here (file identifier), %j is substituted for the job number

#SBATCH --output="%j.out" # Direct STDOUT here (file identifier), %j is substituted for the job number

. /etc/profile.d/modules.sh

module load RAxML/8.1.15 openmpi/1.8.4

mpiexec --map-by node -np 6 raxmlHPC-HYBRID-SSE3 ... -T 20

Again, note that the number of threads cannot exceed the number of CPUs the compute node has. In this example we allocate 120 tasks on 6 nodes, which gives 20 CPUs on each node, and RAxML runs 20 threads on each node.
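The arithmetic behind this (tasks divided by nodes gives threads per node) can also be done inside the job script. The sketch below assumes the job was submitted with both -n and -N, so that Slurm sets SLURM_NTASKS and SLURM_JOB_NUM_NODES; the helper name threads_per_node is hypothetical.

```shell
# Hypothetical helper: threads each node should run, i.e. tasks / nodes.
threads_per_node() {
    # Both variables are set by Slurm inside a job; abort with a message otherwise.
    local tasks="${SLURM_NTASKS:?not inside a Slurm job}"
    local nodes="${SLURM_JOB_NUM_NODES:?not inside a Slurm job}"
    echo $(( tasks / nodes ))
}
```

With 120 tasks on 6 nodes this yields 20, matching the -T 20 in the example. The mpiexec line could then use -np "$SLURM_JOB_NUM_NODES" and -T "$(threads_per_node)" in place of the fixed numbers.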


My user has primary and secondary groups. How can I create files with secondary group ownership?

If you belong to more than one group and want the files in a specific folder to be owned by a secondary group, follow these steps:

Create a folder, named "secondary" for example, and change the group ownership to your secondary group:

$ mkdir secondary

$ chgrp secondary /path/to/your/folder/secondary

Next, to make every new file inside this folder be created under the secondary group's ownership, set the setgid bit on the folder with the chmod command:

$ chmod g+s /path/to/your/folder/secondary

From now on, new files created in this folder will inherit the folder's group ownership.
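The effect can be checked with a short experiment. The sketch below works in a temporary directory and assumes a Linux system with GNU stat (the -c option); on Hive you would first run chgrp on the folder with the name of your secondary group, as in the steps above.

```shell
# Create a scratch folder; on Hive this would be your own "secondary" folder.
dir="$(mktemp -d)/secondary"
mkdir "$dir"

# Set the setgid bit so that new files inherit the folder's group.
chmod g+s "$dir"

# Create a file and compare its group with the folder's group.
touch "$dir/newfile"
dir_group="$(stat -c %G "$dir")"
file_group="$(stat -c %G "$dir/newfile")"
echo "folder group: $dir_group, file group: $file_group"

# The "s" in the group permissions of the listing marks the setgid bit.
ls -ld "$dir"
```

Both group names printed by the echo line should be the same.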

