Frequently Asked Questions
Table of Contents
- What is HPC?
- Tell me about supercomputing (SC)!
- What HPC platforms do we have?
- How do I get an account?
- How can I get help?
- I am new to Linux, help!
- How do I connect?
- What are some local HPC tutorials?
- How about some Power8 and GPU examples?
- How about some machine learning examples?
- Can you show me a simple build and run?
- How do I check each machine’s status?
- Who owns nodes on Mio and what are their specs?
- How do I use the file system?
- How do I run?
- I want to run complex scripts. Any advice?
- What prebuilt apps and libs do we have? (The Module System)
- What are other people doing?
- How do I run better?
- How do I select my nodes?
- How do I manage jobs?
- How do I see my scratch usage?
- Why are my jobs stuck in the queue?
- Is there a best practice for submitting jobs to the queue so that wait time is minimized?
- My jobs are always cancelled after they have been running for six days. This is extremely frustrating. What is the cause of this and how can I avoid this?
- Is there a way to estimate when my job will start running on the compute nodes?
Question not answered here? Try Ask.CI!
Go to Ask.CI, the Research Computing Q&A site, for more ideas, examples and information!
What is HPC?
From the first reference given below: “High Performance Computing most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business.”
A high performance computer, sometimes called a supercomputer or a parallel computer, is one that has access to a number of computing resources, in particular processors. The computing resources are used together to solve a problem. They are connected by a network or bus, and they cooperate by passing messages over that network.
An individual resource could be a simple collection of processors similar to what you might have in a laptop computer. Another type of resource that is becoming popular is the GPU, or graphics processing unit. These are similar to the chips that drive a lot of video games.
The basic idea behind HPC is that if you have a problem that takes 64 hours on a single computer, then why not use 64 computers and run it in 1 hour? Or you may have a problem that does not fit on a single computer, so you could split it across several.
You may hear of a supercomputer being called a parallel computer. They are parallel because the processors work together in parallel. Parallel computing can be thought of as computing by committee, with all the same advantages and disadvantages. Each processor (committee member) works on a section of the problem. If the processors all have about the same amount of work to do, the committee approach may work well. If there is too much communication (too many committee meetings) and/or if, say, one processor is lagging behind the others, the calculation will be slowed.
An important point: the individual resources in a supercomputer might not be any more powerful than your laptop. What makes them fast is using many such resources together. So if you have software that runs on your laptop, moving it to a supercomputer might not make it run any faster unless it is rewritten to take advantage of multiple resources.
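The arithmetic behind the 64-hour example, and behind the warning about software that is not rewritten, can be sketched with the usual idealized estimate known as Amdahl's law. The text above does not name this formula, so treat it as an illustration under stated assumptions: the function name estimate_hours and the 10% serial fraction are hypothetical, and communication cost and load imbalance are ignored.

```shell
#!/bin/sh
# estimate_hours TOTAL_HOURS NPROCS SERIAL_FRACTION
# Idealized estimate (Amdahl's law): the serial fraction of the work does not
# speed up, while the remaining fraction is divided evenly among NPROCS.
# Communication cost and load imbalance are ignored, so this is a best case.
estimate_hours() {
    awk -v t="$1" -v n="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", t * (s + (1 - s) / n) }'
}

estimate_hours 64 64 0      # fully parallel work: prints 1.00
estimate_hours 64 64 1      # fully serial work: prints 64.00
estimate_hours 64 64 0.10   # 10% serial (hypothetical): prints 7.30
```

The second call is the laptop program moved over unchanged: with nothing running in parallel, 64 processors give no benefit at all.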
- What is high performance computing – insideHPC
- Overview of High Performance Computing
What HPC platforms do we have?
You can read more about our HPC systems on the Systems & Policies page.
How do I get an account?
We have three distinct high-performance computing (HPC) platforms at Mines: Mio, AuN, and Wendian.
For machine details, see our HPC Systems & Policies page.
Mio is a shared resource funded in part by the Mines Administration and in part by money from individual researchers. Mio came online in March 2010. Initially it was a relatively small cluster dedicated to a single group of research projects. Mio grew quickly into a supercomputing-class machine, now bigger than AuN.
Mines funds provide the infrastructure; individual researchers can purchase compute nodes that are added to the cluster. The researchers own their nodes, that is, they have exclusive access when they need them. A number of the nodes were purchased using TechFee money so they belong to students.
AuN.mines.edu and Mc2.mines.edu originally comprised BlueM, the front-end machine that served as an access point for their shared file system. AuN nodes are now a part of Wendian (BlueM is also no longer available).
Wendian is similar in spirit to Mio, with the exception that access is not limited to node owners. As with Mio, Mines funds provide the infrastructure and individual research groups can purchase compute nodes belonging to the cluster. With Wendian, however, Mines funds make available a certain percentage of nodes to researchers opting not to invest in node ownership. Various incentives encourage node purchases to ensure an equitable usage environment.
Access to buy nodes on Mio is closed. Only existing node owners may add new users for access on Mio. Tech Fee allows students who are not supported by a researcher to run on Mio. Use of Mio in this capacity precludes faculty from authorship of papers based on associated research.
Wendian and AuN
Access to Wendian is granted through a proposal process. We periodically issue a call for proposals. Between calls, researchers can still request an account by filling out the help center form. Only faculty may request accounts. After the account is granted, they can request that their students also be authorized for accounts.
Information about purchasing nodes on/for Wendian can be obtained by submitting a research consultation request. The most recent (February 2019) specifications for nodes are:
- Penguin Relion XO1132G Server Nodes;
- Dual Intel Xeon 6154 processors (36 cores total, 3.0GHz, 200W);
- Mellanox ConnectX-4;
- 225GB SSD;
- 192GB RAM; DDR4-2666MHz REG, ECC; 1R(12x16GB)
- Cost ~$8,500;
- 384GB RAM, DDR4-2666MHz REG, ECC; 1R(12x32GB);
- Cost ~$11,500.
How can I get help?
- Submit a help center ticket! The HPC group responds as soon as possible to HPC ticket requests. Our service hours are 8am – 5pm M-F, with off-hours assistance at our discretion. We exist to facilitate researchers’ computational goals, and we all proudly take that mission seriously. Find us at the Research/HPC help center!
- Consult the FAQs on our website!
Our FAQ page offers additional useful links.
- Avail yourself of our plentiful examples and tutorials:
- The “How do I do a simple build and run” section shows how to build and run a simple example. It is also a good sanity check: if you run into problems during your research, try this example again to confirm that your basic approach still works;
- The “local HPC tutorials” section has links to many tutorials;
- See the “How do I connect?” section for information about connecting to HPC platforms;
- If you are new to Linux then you might find the “I am new to Linux, help!” section useful.
- For higher-level, targeted scenarios, examine our campus HPC-specific Tech Reports!
The Mines HPC Group Tech Reports provide some obscure but maybe useful discussions of advanced topics.
I am new to Linux, help!
A computer operating system is a program, or rather a collection of programs, running on a computer that enables people to interact with it. It allows the computer to be controlled, it presents information to the user, and it permits the user to pass information and issue instructions.
Windows is an operating system. Apple’s OSX is an operating system, as is iOS on iPhones.
Linux is an operating system that is used on many high performance computing machines, as well as smaller computers. There are versions of Linux that use graphical user interfaces (GUIs) and those that just use a command line (typing) interface. Most of the interactions with our HPC platforms are via a command line interface.
After you get a feel for Linux you will be comfortable at just about any high performance computing site. You may be surprised to find that you also feel more comfortable using the lower-level features of the Mac’s OSX. As for Windows, you may feel a bit more comfortable there too, or you may even want to start using Linux on your laptop.
There are many tutorials available on Linux. Here is a short list.
We also have a rather extensive presentation developed locally.
More information can be found at our Other User Guides page.
You also may be interested in our local scripting tutorials, “Advanced Scripts”, under the How can I get help? section.
Finally, you may also be interested in the How do I connect section. It describes the basics of connecting to our HPC platforms as well as some advanced techniques to make your life easier, showing how you can “hop” from one machine to another without needing to enter a password.
How do I connect?
We have three High Performance Computing (HPC) systems on campus: Mio, AuN (Golden) and Wendian. This document describes how to log on to these systems once you have been granted an account. For information about how to get an account, see the “How do you get an account” FAQ section.
After you have logged in please see the “How do I do a simple build and run” FAQ section to see how to build and run applications.
The only way to access the HPC platforms is by using ssh. Unix and Unix-like operating systems (OSX, Linux, Unicos…) have ssh built in. If you are using a Windows-based machine then you must use a terminal package that supports ssh, such as MobaXterm or Windows Subsystem for Linux. As of the April 2018 update, SSH is also available by default in Windows PowerShell.
All of the HPC platforms are behind the campus firewall. The firewall blocks access from off campus. Thus you need to be on campus to get access, or you need to use the VPN software discussed on the CCIT VPN page. There is a third method for gaining access, discussed below in the section “Setting up keys to make your life much easier”. This method will allow you access for a fixed period of time without needing to reenter your password.
Assuming you are on campus and you are using a machine that supports ssh directly, you can get to Mio and Wendian, respectively, by entering the following in a terminal window:
You will be asked for your password. The password required here is your MultiPass password. The session should look like the following with “joeuser” replaced with your username and “petra” replaced with the name of the machine from which you are connecting.
[joeuser@petra ~]$ ssh joeuser@mio.mines.edu
joeuser@mio.mines.edu's password:
[joeuser@mio001 ~]$

[joeuser@petra ~]$ ssh joeuser@aun.mines.edu
joeuser@aun.mines.edu's password:
[joeuser@aun ~]$

[joeuser@petra ~]$ ssh joeuser@wendian.mines.edu
joeuser@wendian.mines.edu's password:
[joeuser@wendian001 ~]$
Setting up keys to make your life much easier
Using ssh keys might make your life easier. This can work from both on campus and off. Also, the procedure discussed below will allow you to log in while entering your passphrase only once every 8 hours.
The following is a quick guide for setting up keys and tunnels to access aun.mines.edu and mio.mines.edu from an on campus Linux or MacOS machine. The procedure for setting up off campus access via tunneling is similar, but the configuration file is different and there is an extra step. This is documented below. Note: Non-Mines people are not allowed to tunnel into campus and must use VPN. After VPN is set up, off campus users can use the procedure outlined for on campus usage.
Setting up access from an on campus Linux or MacOS machine
Generate your key pair (do not use an empty passphrase):
osage:~ joeuser$ ssh-keygen -f $HOME/.ssh/forbluem -t dsa
Generating public/private dsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /Users/joeuser/.ssh/forbluem.
Your public key has been saved in /Users/joeuser/.ssh/forbluem.pub.
The key fingerprint is:
67:60:3c:5e:42:64:23:c5:79:70:62:d1:da:74:97:45 firstname.lastname@example.org
The key's randomart image is:
+--[ DSA 1024]----+
| .+@=. +E|
| *o++ . o |
| *=.. . |
| o.=. |
| S o |
| o |
| |
| |
| |
+-----------------+
osage:~ joeuser$
Copy the public key to Wendian:
osage:.ssh joeuser$ cat ~/.ssh/forbluem.pub | ssh wendian.mines.edu "cat >> ~/.ssh/authorized_keys"
Copy the public key to Mio:
If you have an account on Mio then you will want to copy your new key there also, allowing you to log in using the same key.
osage:.ssh joeuser$ cat ~/.ssh/forbluem.pub | ssh mio.mines.edu "cat >> ~/.ssh/authorized_keys"
Add the following lines to your ~/.ssh/config file. Create one if it does not exist. Replace “joeuser” with your Mines username.
# Next 5 lines are optional if you don't do X-Windows. The location of XAuthLocation might be different.
ForwardAgent yes
ForwardX11 yes
ForwardX11Trusted yes
XAuthLocation /Users/joeuser/.Xauthority
#XAuthLocation /opt/X11/bin/xauth
ServerAliveInterval 60
PubkeyAcceptedKeyTypes=+ssh-dss
AddKeysToAgent yes

Host mio,mio.mines.edu
    HostName 188.8.131.52
    User joeuser
    Identityfile2 ~/.ssh/forbluem

Host aun,aun.mines.edu
    HostName aun.mines.edu
    User joeuser
    Identityfile2 ~/.ssh/forbluem
Set the permissions on your config file:
chmod 600 ~/.ssh/config
Run the following to set an 8-hour limit on your key:
ssh-add -t 28800 ~/.ssh/forbluem
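The value 28800 is simply 8 hours expressed in seconds. A minimal sketch of the conversion, in case you want a different limit (the helper name hours_to_seconds is hypothetical and not part of ssh):

```shell
#!/bin/sh
# hours_to_seconds HOURS: convert a whole number of hours to seconds.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

hours_to_seconds 8    # prints 28800
```

For example, ssh-add -t "$(hours_to_seconds 4)" ~/.ssh/forbluem would set a 4-hour limit on the same key.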
Log in to AuN or Mio using ssh:
This time you should not need to enter a password.
Setting up access from an off campus Linux or MacOS machine
Generate your key pair (do not use an empty passphrase):
petra:~ joeuser$ ssh-keygen -f $HOME/.ssh/forbluem -t dsa
Generating public/private dsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /Users/joeuser/.ssh/forbluem.
Your public key has been saved in /Users/joeuser/.ssh/forbluem.pub.
The key fingerprint is:
67:60:3c:5e:42:64:23:c5:79:70:62:d1:da:74:97:45 email@example.com
The key's randomart image is:
+--[ DSA 1024]----+
| .+@=. +E|
| *o++ . o |
| *=.. . |
| o.=. |
| S o |
| o |
| |
| |
| |
+-----------------+
petra:~ joeuser$
Copy the public key to jumpbox and set the permission for the keys file:
[joeuser@petra ~]$ cat ~/.ssh/forbluem.pub | ssh jumpbox.mines.edu "cat >> ~/.ssh/authorized_keys"
[joeuser@petra ~]$ ssh jumpbox.mines.edu "chmod 600 ~/.ssh/authorized_keys"
Add the following lines to your ~/.ssh/config file. Create one if it does not exist. Replace “joeuser” with your Mines username.
# Next 5 lines are optional if you don't do X-Windows. The location of XAuthLocation might be different.
ForwardAgent yes
ForwardX11 yes
ForwardX11Trusted yes
XAuthLocation /Users/joeuser/.Xauthority
#XAuthLocation /opt/X11/bin/xauth
ServerAliveInterval 60
PubkeyAcceptedKeyTypes=+ssh-dss
AddKeysToAgent yes

Host Mio
    Hostname mio.mines.edu
    User joeuser
    ProxyCommand ssh jumpbox.mines.edu -W %h:%p
    Identityfile2 ~/.ssh/forbluem

Host Wendian
    Hostname wendian.mines.edu
    User joeuser
    ProxyCommand ssh jumpbox.mines.edu -W %h:%p
    Identityfile2 ~/.ssh/forbluem

Host jumpbox.mines.edu
    Hostname jumpbox.mines.edu
    User joeuser
    Identityfile2 ~/.ssh/forbluem
    #ControlMaster auto
    #ControlPath /Users/joeuser/.ssh/tmp/%h_%p_%r
Run the following to set an 8-hour limit on your key:
ssh-add -t 28800 ~/.ssh/forbluem
This command should be run as needed to renew your key. You will enter the passphrase that you used to set up the key.
Log in to jumpbox using ssh:
Copy your key from jumpbox to Aun and/or Mio.
Copy the public key to Mio. You should not need to set the permissions.
[joeuser@petra ~]$ cat ~/.ssh/forbluem.pub | ssh mio.mines.edu "cat >> ~/.ssh/authorized_keys"
[joeuser@petra ~]$ ssh mio.mines.edu "chmod 600 ~/.ssh/authorized_keys"
If you’re also on Wendian, repeat this process for that login.
You should now be able to ssh directly to Mio or Wendian from off campus using the machine names, Wendian and Mio.
[joeuser@petra ~]$ ssh mio
Last login: Thu Jul 5 11:58:57 2018 from 184.108.40.206
[joeuser@mio001 ~]$
What are some HPC tutorials?
- Introduction to High Performance computing
- What is High performance computing? Why is it of interest? When is it applicable or not? Overview of hardware.
- Linux for HPC
- A very fast paced introduction to the common operating system for most HPC systems. Lots of tips and tricks. If you have only ever worked on a Windows machine this session is a must.
- Message Passing Interface (MPI) Introduction
- The Message Passing Interface Standard (MPI) is a message passing library standard. MPI is the basis of most large scale parallel HPC applications. This will provide a “hello world” introduction and a discussion of some of the more commonly used calls.
Slides 1, Slides 2, Slides 3
- Message Passing Interface – Sample Applications
- We will show building of a “simple” MPI application.
Slides 1, Slides 2, Slides 3
- OpenMP – Single node threaded applications
- OpenMP specifies a collection of compiler directives, library routines, and environment variables that can be used to specify shared-memory parallelism in C, C++ and Fortran programs.
- Batch Scripting for HPC
- Show a bunch of techniques and tricks for batch scripting for parallel jobs.
- Bag of Tasks / Embarrassingly Parallel / Large numbers of serial applications
- Say you have a bunch of similar but independent jobs to run. Guides for this are TBA.
- Memory Profiling and Building for multiple architectures
- Two unrelated short topics. First we will show subroutine calls for tracking memory usage and then talk about building applications that need to run on several generations of X86 chips.
Slides for both
- Hybrid Applications and Thread Affinity
- We will combine MPI and OpenMP to make a hybrid program. Also, we will show how to ensure that you are using all available cores.
- Introduction to the DDT program debugger
- Introduction to GPUs and Machine learning (Running Tensorflow)
- Discuss GPUs, GPU programming, and the in demand Tensorflow program for Machine learning. (See section below under “SHOW ME SOME MACHINE LEARNING EXAMPLES!”)
- Technical Session
- Discussion of a technique for finding the optimum function, F(x) such that F(x) closely matches a target function, T(x) and F(x) has a low curvature. Link for guide TBA
Laptop software recommendations
If you have a Linux laptop you should be good to go.
If you have a Macintosh, it is suggested that you install XQuartz. This will be needed for some of the GUI based topics such as Debugging and Profiling.
Windows Laptop software recommendations
If you run Windows on your laptop we have a set of recommendations for software. Each of these recommendations will give you various levels of functionality.
Easy install and basic functionality
Most difficult install — high functionality
This option gives you a nearly full Linux operating system running alongside Windows. The instructions under Bash on Ubuntu on Windows show how to install the base system. Unfortunately, the X Window system needed for running GUI-based programs is a separate install. One way to get the required components is to install Xming and XLaunch. Note: these can also provide X Window support for the Putty and BitWise ssh clients, but we do not recommend either of these two packages at this time.
- Bash on Ubuntu on Windows: https://www.windowscentral.com/how-install-bash-shell-command-line-windows-10
- Xming and XLaunch: https://sourceforge.net/projects/xming/files/Xming/
The following page discusses the setup of Xming. It also discusses Putty, which has been deprecated. http://www.geo.mtu.edu/geoschem/docs/putty_install.html
Relatively Easy install — good functionality — Easy to use
- MobaXterm: http://mobaxterm.mobatek.net
MobaXterm provides another Linux like subsystem operating under Windows. It also adds GUI based terminal connection tools and file transfer tools and an editor. It supports remote X Windows also.
A few notes: (1) The free version works fine for most people. There are actually two free versions. The “Installer edition” is most likely better. (2) The shortcut installed on the Windows desktop does not work. Delete it and start from the menu. (3) When you start MobaXterm if you see the message “CygUtils not installed on you system” follow the directions to install it. The plugin needs to be installed in the same folder as the MobaXterm program. You may need to save it to your desktop first and drag it into your install directory.
How about some Power8 and GPU examples?
Wendian has a total of nine compute nodes with 36 GPU devices. The GPUs are accessible to all users, with that access subject to preemption by the research groups that provided these nodes to Wendian.
To run on the GPUs, you need to explicitly request the gpu partition. You must also use the “--gres” option to request GPU devices, including the type and number of devices. We currently have two types of GPU cards in our nodes, the v100 and the a100; each node has 4 GPU devices. A proper submission script requesting four v100 devices per node would include lines that look like this:
#SBATCH -p gpu
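The line above requests only the partition. As a minimal sketch, assuming the GPU type is registered in Slurm under the name v100 (the type string, node count and time limit here are assumptions, not confirmed settings for Wendian), the partition and --gres requests together might look like this:

```shell
#!/bin/bash
#SBATCH -p gpu                 # request the gpu partition
#SBATCH --gres=gpu:v100:4      # assumed type name "v100"; 4 devices per node
#SBATCH --nodes=1              # hypothetical node count
#SBATCH --time=00:05:00        # hypothetical time limit

# Report which GPUs were made visible to this job, if the variable is set.
echo "Visible GPUs: ${CUDA_VISIBLE_DEVICES:-none reported}"
```

The same options can be given on the command line instead, for example sbatch -p gpu --gres=gpu:v100:4 followed by the script name.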
To run interactive jobs or to specify these options from the command line, one may add any of the above to the call to sbatch or to salloc.
More information will be included on these webpages during the summer of 2021. For assistance until then, please contact the Mines Help Center here:
Mio has two IBM Power 8 GPU enhanced nodes. Each node has 20 Power cores and two Nvidia K80 GPU cards, each with two GPUs. See this PDF for more details!
Building and running on these nodes is slightly different:
- There are several versions of MPI, one of which requires a special launch command.
- The vendor-supplied math library is ESSL/PESSL, not MKL.
- They have GPUs
Examples are coming soon; in the meantime, they are available upon request.
How about some machine learning examples?
Examples are coming soon; in the meantime, they are available upon request!
Can you show me a simple example build and run?
This page shows you how to build and run a simple example on Mio or Wendian. To run the example, enter or copy/paste the commands shown below.
To run the quick start example, create a directory for your example and go to it.
[joeuser@mio001 bins]$ mkdir guide
[joeuser@mio001 bins]$ cd guide
Copy the file that contains our example code to your directory and unpack it.
[joeuser@mio001 guide]$ tar -xf /sw/examples/simple_build_and_run.tgz -C .
If you like, do an ls to see what you have.
[joeuser@mio001 guide]$ ls
4MioAuNMakefile aun_script docol.f90 helloc.c makefile mio1_script phostname.c 4WenMakefile power_script simple slurm_script color.f90 example.tgz info.html mc2_script out.dat phostone.c set_alias simple_slurm add.f90 simple_slurm
[joeuser@mio001 scratch]$ cp 4WenMakefile makefile
You will see a file called 4MioAuNMakefile. This file is a copy of the original makefile; when you overwrite makefile with 4WenMakefile, as in the command above, you retain a copy of the original in 4MioAuNMakefile. Whether you perform this exercise on Mio or Wendian, it’s best to start with a clean example.tgz, but you can always copy 4MioAuNMakefile to makefile on Mio. Note that the makefile and run scripts discussed here can be used as templates for other applications.
Special instructions for building and running on the ppc001 and ppc002 (Power) nodes of Mio
Mio has two nodes, ppc001 and ppc002, that are based on IBM Power processors instead of the more common Intel x86 processor family. There are minor changes to the build and run procedures for these nodes. See the section about this below.
Next we want to ensure that your environment is set up to run parallel applications. The following two commands will give you a clean, tested environment:
[joeuser@mio001 guide]$ module purge
[joeuser@mio001 guide]$ module load StdEnv
Make the program:
[joeuser@mio001 guide]$ make
echo mio001
mio001
mpif90 -c color.f90
mpicc -DNODE_COLOR=node_color_ helloc.c color.o -lifcore -o helloc
rm -rf *.o
On Wendian you need to supply an account number to run parallel applications. Mio does not require account numbers. So, next find out which accounts you are authorized to use on each machine:
[joeuser@wendian001 scratch]$ /sw/utility/local/accounts
User       Def Acct        Account
---------- --------------- --------------------
joeuser hpcgroup
[joeuser@wendian001 scratch]$
If you run this command on Mio you will get:
[joeuser@mio001 guide]$ /opt/utility/accounts
Accounts strings are not required on Mio
So, to run a parallel application on Mio you would do the following:
[joeuser@mio001 guide]$ sbatch simple_slurm
Submitted batch job 1993
On Wendian, you add an -A option to the command line followed by the account string from the command given above.
[joeuser@wendian001 guide]$ sbatch -A test simple_slurm
Submitted batch job 1993
If you receive the message shown below that means that the account you have specified has expired. Try another.
sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
If you quickly enter the command below, you may see your job waiting to run or running. In the ST (state) column, PD means the job is waiting; R means it is running.
[joeuser@mio001 guide]$ squeue -u $USER
JOBID PARTITION NAME   USER    ST TIME NODES NODELIST(REASON)
1993  compute   hybrid joeuser PD 0:00 2     (Priority)
If this command returns no jobs listed then your job is finished. If the machine is very busy then it could take some time to run.
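If you have several jobs queued, a short filter can summarize the states for you. This is a sketch only: the helper name count_states is hypothetical, it assumes the default squeue column layout shown above (state in the fifth column, one header line), and the second job line in the sample is made up for illustration.

```shell
#!/bin/sh
# count_states: read squeue-style output on stdin and report how many jobs
# are pending (PD) and how many are running (R). Assumes the state is the
# fifth column and the first line is a header.
count_states() {
    awk 'NR > 1 && $5 == "PD" { pending++ }
         NR > 1 && $5 == "R"  { running++ }
         END { printf "pending=%d running=%d\n", pending, running }'
}

# Hypothetical captured output; on the cluster you would pipe in
# the output of "squeue -u $USER" instead.
printf '%s\n' \
    'JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)' \
    '1993 compute hybrid joeuser PD 0:00 2 (Priority)' \
    '1994 compute hybrid joeuser R 1:12 2 compute[028-029]' |
    count_states
# prints: pending=1 running=1
```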
When the job is complete there will be an output file in your directory that starts with the word “slurm” then contains the jobid from the sbatch command followed by the word out.
[joeuser@mio001 guide]$ ls slurm*
slurm-722122.out
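The naming rule described above (the word slurm, the job ID, then out) can be sketched as two small helpers. Both function names are hypothetical; the only inputs are the "Submitted batch job" message that sbatch prints and the default output name pattern.

```shell
#!/bin/sh
# jobid_from_sbatch MESSAGE: print the job ID, i.e. the last word of the
# "Submitted batch job N" line that sbatch prints.
jobid_from_sbatch() {
    echo "${1##* }"
}

# slurm_output_file JOBID: default output file name for that job.
slurm_output_file() {
    echo "slurm-$1.out"
}

jobid=$(jobid_from_sbatch "Submitted batch job 1993")
slurm_output_file "$jobid"    # prints slurm-1993.out
```

This is handy in scripts that submit a job and then want to know which file to watch.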
This simple test program is a glorified parallel “hello world” program. You will see 16 lines that start with the name of the node on which the task is running, followed by the MPI task id, which should be in the range 0-15, and then the number 16, which is the number of tasks you are running. Next comes a number that will be either 0 or 8. This is the MPI task number of the lowest task running on that node.
You will also see two additional lines that are basically the same output described above but for the words “First task”. There is one line output per node.
cat slurm*.out will show you the output of the job. To see your output in a nice order you can use the sort command:
[joeuser@mio001 guide]$ sort slurm*.out -k1,1 -k2,2n | grep 16
compute028 0 16 0
compute028 1 16 0
compute028 2 16 0
compute028 3 16 0
compute028 4 16 0
compute028 5 16 0
compute028 6 16 0
compute028 7 16 0
compute029 8 16 8
compute029 9 16 8
compute029 10 16 8
compute029 11 16 8
compute029 12 16 8
compute029 13 16 8
compute029 14 16 8
compute029 15 16 8
First task on node compute028 is 0 16 0
First task on node compute029 is 8 16 8
A note on the options: -k1,1 sorts on the first word in the output, and -k2,2n sorts on the second column numerically. The grep command filters out every line that does not contain “16”, giving us only those lines of interest.
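To see why the n in -k2,2n matters, here is a small sketch on a hypothetical four-line sample rather than a real slurm output file (the function names sample_output and sort_tasks are made up for this illustration). Without the numeric flag, task 10 would sort ahead of task 9.

```shell
#!/bin/sh
# Hypothetical unsorted sample: node name, task id, task count, first task on node.
sample_output() {
    printf '%s\n' \
        'compute029 10 16 8' \
        'compute028 2 16 0' \
        'compute029 9 16 8' \
        'compute028 0 16 0'
}

# Sort by node name, then numerically by task id, and keep lines containing "16".
sort_tasks() {
    sort -k1,1 -k2,2n | grep 16
}

sample_output | sort_tasks
# prints:
# compute028 0 16 0
# compute028 2 16 0
# compute029 9 16 8
# compute029 10 16 8
```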
Congratulations, you have run your first supercomputing program.
The script complex_slurm runs the same program but it adds a number of features to the run. It first creates a new directory for your run, then goes to it and runs your program there.
The script threads_slurm shows how to run a hybrid MPI/OpenMP program. The program it runs is /opt/utility/phostname. This is again a glorified “hello world” program that also prints thread ID. Note the source for this program is included in the directory and it can be made using the command make phostname.
Queue and Partition Information
On Mio, individual research groups own nodes. They have priority access to their nodes. You request priority access to your nodes by specifying a partition. Please ask your PI or instructor which partition you should be using on Mio.
Add the string -p PARTITION_NAME to your sbatch command line. For example:
[joeuser@wendian001 guide]$ sbatch -A test -p compute simple_slurm
Submitted batch job 1993
Special instructions for building and running on the ppc001 and ppc002 (Power) nodes of Mio
Mio has two nodes, ppc001 and ppc002, that are based on IBM Power processors instead of the more common Intel x86 processor family. It is not possible to build applications for these nodes on the Mio headnode. You must launch an interactive session on one of these two nodes to build applications for them. An interactive session can be launched by running the command:
[joeuser@mio001 guide]$ srun -N 1 --tasks-per-node=1 -p ppc --share --time=1:00:00 --pty bash
Note that the prompt has changed to ppc002 or ppc001 to show that you are now on the Power nodes.
Alternatively, you could create an alias called p8 for this command:
$ alias p8="srun -N 1 --tasks-per-node=1 -p ppc --share --time=1:00:00 --pty bash"
You may want to add this alias to your .bashrc file so it is available every time you login.
Running this command is a little different from doing an ssh. In particular you are placed in the directory from which you launched the command instead of your home directory.
Also, if the nodes are busy running batch jobs you may not get the interactive session immediately.
After you have obtained the interactive session you proceed as shown above.
We want to ensure that your environment is set up to run parallel applications. The following two commands will give you a clean, tested environment:
[joeuser@mio001 guide]$ module purge
[joeuser@mio001 guide]$ module load StdEnv
Make the program:
[joeuser@mio001 guide]$ make
make
mpicc -DNODE_COLOR=node_color_ helloc.c color.o -lgfortran -lmpi_mpifh -o helloc
rm -rf *.o
At this point you should exit your interactive session by entering exit.
[joeuser@ppc002 guide]$ exit
exit
So, to run a parallel application on Mio Power nodes you would do the following:
[joeuser@mio001 guide]$ sbatch -p ppc power_script
Submitted batch job 1299071
-p ppc forces your job to run on the Power nodes. This can also be specified in the script.
There are a few special requirements for scripts for the Power nodes. Here is a slightly edited version of the run script:
|Script for Mio Power Nodes||Explanation of the differences|
#!/bin/bash
#SBATCH --job-name="hybrid"
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --ntasks=4
##SBATCH --exclusive
#SBATCH --time=00:05:00
#SBATCH -p ppc
#SBATCH --export=NONE
#SBATCH --get-user-env=10L

# Go to the directory from
# which our job was launched
cd $SLURM_SUBMIT_DIR

module purge
module load StdEnv
srun --mpi=pmi2 --export=ALL ./helloc
How do I check each machine’s status?
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids.
The following links show a web page displaying the running jobs (same info as the command line tool).
The following links show a web page displaying each node’s status (same info as the command line tool).
How do I use the file system?
- The file system on HPC platforms is provided by the school. No individual group owns any portion of the file system.
- The file system is shared by all groups.
- No group or user will be allowed to jeopardize the access to HPC platforms by abusing the file system.
- User data is not backed up.
Each user has three base directories, which can be accessed either by name or by their environment variable:
|Your home directory||$HOME|
In addition a group may have a $SETS directory which is designed for semipermanent data sets that will be used repeatedly by the group. $SETS can contain things like equations of state or velocity fields. It may also contain programs used by multiple members of a group. $SETS will be readable on the compute nodes. Not all groups have $SETS directories.
$HOME – Should be kept very small, having only start up scripts and other simple scripts. Output from parallel jobs can not be directed to $HOME. It should only be read from compute nodes.
$BINS – Should contain programs users have built for personal use and small data sets and run scripts. Output from parallel jobs can not be directed to $BINS. It will be read only from compute nodes.
$SCRATCH – The main area for running applications. Output from parallel runs should be done to this directory.
File System Quotas
|Machine||$SCRATCH||$HOME + $BINS (Combined Total)|
|Aun/Mc2||2,000,000 Files||20 GBs|
|Mio||2,000,000 Files||20 GBs|
Note: most Unix-style file systems see a performance decrease as the number of files per directory increases; this becomes noticeable as the number of files per directory gets into the hundreds. When a user accesses files in a directory that contains a large number of files, it causes a performance hit for all users. Please keep the number of files per directory reasonable.
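One way to spot over-full directories is to count the regular files in each directory under a starting point such as $SCRATCH. The following is a minimal sketch using find and awk; the function name files_per_dir and the default threshold of 100 are illustrative choices, not an existing site utility.

```shell
#!/bin/bash
# files_per_dir DIR [MIN]
# Print "<count> <directory>" for every directory under DIR holding at
# least MIN regular files (default 100), largest first.
# The function name and the default threshold are illustrative choices.
files_per_dir() {
    local top="${1:-.}" min="${2:-100}"
    find "$top" -type f -print 2>/dev/null |
        awk -v min="$min" '
            {
                dir = $0
                sub(/\/[^\/]*$/, "", dir)   # strip the file name, keep its directory
                count[dir]++
            }
            END {
                for (d in count)
                    if (count[d] >= min) print count[d], d
            }' |
        sort -rn
}
```

A call such as files_per_dir "$SCRATCH" 500 would list only directories with 500 or more files. Walking a large tree is itself a load on the file system, so this is something to run occasionally rather than routinely.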
The organizational structure of the file system is the same on Mio, AuN and Mc2; however, Mio has its own file system while AuN and Mc2 actually share the same file system. Also, from BlueM it is possible to see the AuN/Mc2 file system and the Mio file system. Technically we say that BlueM mounts the AuN/Mc2 file system and it mounts the Mio file system.
Getting around the various filesystems
When you first login to AuN or Mc2 you will see that you have the directories:
- On AuN:
- bins scratch mc2
- On Mc2:
- bins scratch aun
The Mc2 directory on AuN is a link to your home directory on Mc2 and the AuN directory on Mc2 is the reverse.
Scratch is shared directly across AuN and Mc2. This is where runs should be done, not in your home directory. The bins directory is distinct on the two machines. Files created in bins on Mc2 are not in bins on AuN. The bins directory is where you should store applications that you build.
When you log in to BlueM you will see a directory remote that contains:
- bins home scratch
- bins home scratch
- bins home scratch
These “remote” directories have links to bins, home and scratch on the given machine. Thus, to copy a file from your desktop machine to AuN you only need to copy it to remote/aun on BlueM. The same holds for remote directories for Mc2 and Mio.
If you have an account on Mio the remote directory on BlueM will contain subdirectories for Mio. Thus it is possible to move files among Mio, AuN and Mc2 by doing a “cp”.
On Mio, by default, you have only bins and scratch directories. There is no remote directory.
I want to run complex scripts. Any advice?
The scheduler we use on our HPC platforms is SLURM. You may want to look at the documentation at: http://slurm.schedmd.com/documentation.html
We have a tutorial on scripting on the User Guides page. Subjects include:
- Bash useful concepts
- Basic Scripts
- Using Variables in Scripts
- Redirecting Output, getting output before a job finishes
- Getting Notifications
- Keeping a record of what you did
- Creating directories on the fly for each job
- Using local disk space
Multiple jobs on a node
- Multiple scripts – one node
- One Script – different MPI jobs on different cores
Mapping tasks to nodes
- Less than N tasks per node
- Running on heterogeneous nodes using all cores
- Different executables working together
- Hybrid MPI/OpenMP jobs (MPI and Threading)
- Job dependencies
- Jobs submitting new jobs
The HPC Tech Reports has a link to:
- Chaining jobs in Slurm and dealing with script errors
This note discusses how you can set up dependencies in slurm jobs so a second job waits for a first to finish before automatically starting. In particular, this shows how to set it up so that if the first job fails then the second will not start.
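As a rough illustration of the dependency idea described above (not the method from the tech report itself), the sketch below submits a list of batch scripts so that each one starts only if the previous one completed successfully. It relies on the standard sbatch options --parsable and --dependency=afterok; the function name submit_chain and the script names in the usage note are hypothetical.

```shell
#!/bin/bash
# submit_chain SCRIPT...
# Submit each batch script so that it starts only if the previous one
# finished successfully (exit code 0). Prints the job IDs in order.
# submit_chain is an illustrative name, not an existing site utility.
submit_chain() {
    local prev="" script jobid
    for script in "$@"; do
        if [ -z "$prev" ]; then
            # --parsable makes sbatch print just the job ID
            jobid=$(sbatch --parsable "$script") || return 1
        else
            jobid=$(sbatch --parsable --dependency=afterok:"$prev" "$script") || return 1
        fi
        jobid=${jobid%%;*}   # drop the ";cluster" suffix some setups append
        echo "$jobid"
        prev=$jobid
    done
}
```

For example, submit_chain step1.sh step2.sh step3.sh would queue three jobs in sequence. If an early job fails, the later jobs will not start; depending on how the scheduler is configured they may remain in the queue until cancelled.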
What prebuilt apps and libs do we have? (The Module System)
HPC@Mines has a module system. The module system allows setting up the environment for running applications using one or two simple commands. Module commands can be run from the command line or they can be placed in your .bashrc file. The primary module command is
module load Name_of_module_to_load
This would load a module, which sets your environment to run some application. This typically would involve changing your PATH environment variable and possibly your LD_LIBRARY_PATH variable. There are also modules for setting up one of several different programming environments. Module loads “go away” when you log out. That is, you need to load modules every time you log in or put the module load commands in your .bashrc file so they get run automatically when you log in.
It is important to load only the module you need. If, for example, you were to load every module it would cause your interactive session to not work properly because it would overload key environmental variables. Most nonstandard Linux applications on our machine have modules associated with them.
There are two ways to see available modules: on a web page, or by running the module avail command.
Links to list of available modules
module avail command
Running the command
module avail
on the machine in question will give you a current list.
Module Nota Bene and FAQs
Resetting the environment
Running the commands
module purge
module load StdEnv
will reset your environment to a known simple working state.
Resolving python module issues
The information below describes a common happenstance with python modules:
As a general rule, and as displayed above, HPC recommends doing a module purge, then loading the StdEnv module into your environment. The StdEnv module in turn loads the following modules:
PrgEnv/intel/15.0.090
PrgEnv/mpi/openmpi/intel/1.6.5
PrgEnv/python/gcc/3.4.3
With regard to Python, after loading StdEnv, Python 3.4.3 is available to you. This is the most recent version accessible on Mio, and requires the command python3 at the prompt to run. By default, version 2.6.6 (the system version) is in your path; the command python will run version 2.6.6. The significance of this setup is that the system version of Python (2.6.6) is kept clean, while later versions (which require the appropriate module be loaded to the environment) include non-standard Python modules.
Another salient point is that the StdEnv module forces the loading of an Intel compiler module (see list above). This module links MKL libraries to the environment, which are required by all Python versions. An error such as
ImportError: libmkl_rt.so: cannot open shared object file: No such file or directory
implies that most likely the MKL libraries made accessible by the Intel module are missing.
Setting up a virtual environment for Python
Guide on this coming soon!
How do I run better?
Every important Linux program has a manual (“man”) page that can provide you further details on how to use it. To open the manual page for an application on one of our systems, first load the relevant module and then type the command:
$ man <name of application>
During this, you can scroll through the manual page using the arrow keys and can close it by pressing ‘q’.
Some notable programs have useful manual pages listed below:
- Intel C compiler – icc
- Intel Fortran compiler – ifort
- IBM C compiler (power version) – xlc
- IBM Fortran compiler (power version) – xlf90
- sbatch (submit a batch script to Slurm)
- scancel (used to signal Slurm jobs)
- sinfo (view information about Slurm nodes and partitions)
- squeue (view information about jobs located in the Slurm scheduling queue)
- srun (run parallel job)
We have a collection of longer articles that describe aspects of high performance computing. This includes:
- FFTs and other wrapper library calls available in MKL 03/31/15
- Chaining jobs in Slurm and dealing with script errors 03/31/15
- OpenMP threading on Mio and AuN 04/01/15
- Qbox – Hybrid MPI/threading on Mc2 04/16/15
- Quantum Espresso – Optimization on Mc2 06/04/15
- Linux for High Performance Computing 06/09/15
- Threading on Power Nodes 01/10/17
Determining where your program spends its time is an important part of source code level optimization. We have a number of slides and a short video that show how to get started with the Allinea map profiler.
There are many optimizations that can be performed simply by selecting compile line options. We have full compiler documentation available on campus. (Note the pages listed below will not open off campus.)
How do I select my nodes?
Reservations, Node Selection, Interactive Runs
Reservations on AuN
Reservations on AuN are not currently supported.
Reservations on Mio
Reservations are no longer required on Mio to evict people from your nodes. In the past people would set a reservation for their nodes and in doing so purge jobs from users not belonging to their group. Now, people need only run the job, selecting to run in their group’s partition. See Selecting Nodes on Mio and Running only on nodes you own below.
Selecting Nodes on Mio
There are two ways to manually select nodes on which to run. They can be listed on the command line or by selecting a partition. The “partition” method is discussed in the next section.
Below is the section of the man page for the srun command that describes how to specify a list of nodes on which to run:
-w, --nodelist=<host1,host2,... or filename>
Request a specific list of hosts. The job will contain at least these hosts. The list may be specified as a comma-separated list of hosts, a range of hosts (compute[1-5,7,...] for example), or a filename. The host list will be assumed to be a filename if it contains a "/" character. If you specify a max node count (-N1-2) if there are more than 2 hosts in the file only the first 2 nodes will be used in the request list. Rather than repeating a host name multiple times, an asterisk and a repetition count may be appended to a host name. For example "compute1,compute1" and "compute1*2" are equivalent.
Example: running the script myscript on compute001, compute002, and compute003…
[joeuser@mio001 ~]sbatch --nodelist=compute[001-003] myscript
Example: running the “hello world” program /opt/utility/phostname interactively on compute001, compute002, and compute003…
[joeuser@mio001 ~]srun --nodelist=compute[001-003] --tasks-per-node=4 /opt/utility/phostname
compute001
compute001
compute001
compute001
compute002
compute002
compute002
compute002
compute003
compute003
compute003
compute003
Running only on nodes with particular features such as number of cores
There are several generations of nodes on Mio, each with different “features.” You can see the features by running the command:
[joeuser@mio001 ~]/opt/utility/slurmnodes -fAvailableFeatures
compute000 Features core8,nehalem,mthca,ddr
compute001 Features core8,nehalem,mthca,ddr
...
compute032 Features core12,westmere,mthca,ddr
compute033 Features core12,westmere,mthca,ddr
...
compute157 Features core24,haswell,mlx4,fdr
...
...
Features can be used to select subsets of nodes. For example, if you want to run on nodes with 24 cores you can add the option --constraint=core24 to your sbatch command line or script.
[joeuser@mio001 ~]sbatch --constraint=core24 simple_slurm
Submitted batch job 1289851
[joeuser@mio001 ~]
Which gives us:
[joeuser@mio001 ~]squeue -u joeuser
  JOBID PARTITION   NAME    USER ST  TIME NODES NODELIST(REASON)
1289851   compute hybrid joeuser  R  0:01     2 compute[157-158]
[joeuser@mio001 ~]
Running only on nodes you own (or in a particular partition)
Every normal compute node (exceptions are GPU and PHI nodes) on Mio is part of two partitions or groupings. They are part of the compute partition and they are part of a partition that is assigned to a research group. That is, each research group has a partition and their nodes are in that partition. The GPU and PHI nodes are in their own partition to prevent people from accidentally running on them.
You can see the partitions that you are allowed to use (compute, phi, gpu and your group’s partitions) by running the sinfo command:
sinfo -node will display which partitions you are allowed to run in.
sinfo -a will show all partitions.
sinfo -a --format="%P %N" shows a compact list of all partitions and nodes.
Add the option -p partition_name to your srun command to run in the named partition. The default partition is compute, which is all of the normal nodes. By default your job can end up on any nodes. Specifying your group’s partition will restrict your job to “your” nodes.
Also, starting a job in your group’s partition will purge any jobs on your nodes that were started under the default partition. Thus, it is not necessary to create a reservation to gain access to your nodes. If you do not run in your partition your jobs have the potential to be deleted by the group owning the nodes.
There is a shortcut command that will show you the partitions in which you can run, /opt/utility/partitions. For example:
[joeuser@mio001 utility]$ /opt/utility/partitions
Partitions and their nodes available to joeuser
compute compute[000-003,008-013,016-033,035-041,043-047,049-052,054-081,083-193]
phi phi[001-002]
gpu gpu[001-003]
joesgroup compute[056-061,160-167]
[joeuser@mio001 utility]$
We see that joeuser can run on nodes in the compute partition. The partitions compute, phi, and gpu are available to everyone. Joe’s group “owns” compute[056-061,160-167] and running in the joesgroup partition will allow preemption.
Running threaded jobs and/or running with fewer than N MPI tasks per node
Slurm will try to pack as many tasks on a node as it can, so that there is at least 1 task or thread per core. So if you are running fewer than N MPI tasks per node, where N is the number of cores, Slurm may put additional jobs on your node.
You can prevent this from happening by setting values for the flags --tasks-per-node and --cpus-per-task on your sbatch command line or in your Slurm script. The value for --tasks-per-node times --cpus-per-task should be the number of cores on the node. For example, if you are running on two 16-core nodes and you want 8 MPI tasks, you might say
--nodes=2 --tasks-per-node=4 --cpus-per-task=4
where 2*4*4=32 or the total number of cores on two nodes.
You can also prevent additional jobs from running on your nodes by using the --exclusive flag.
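The arithmetic behind the flag values can be sketched as a small helper: given the cores per node and the number of MPI tasks you want on each node, it prints the --cpus-per-task value that fills the node. The function name cpus_per_task is illustrative; the numbers in the example are the ones used in the text above.

```shell
#!/bin/bash
# cpus_per_task CORES_PER_NODE TASKS_PER_NODE
# Print the --cpus-per-task value that makes
#   tasks-per-node * cpus-per-task == cores per node,
# or fail if the cores do not divide evenly among the tasks.
cpus_per_task() {
    local cores="$1" tasks="$2"
    if [ "$tasks" -le 0 ] || [ $((cores % tasks)) -ne 0 ]; then
        echo "cannot split $cores cores evenly among $tasks tasks" >&2
        return 1
    fi
    echo $((cores / tasks))
}

# Example from the text: 16-core nodes, 4 MPI tasks per node.
cpt=$(cpus_per_task 16 4)
echo "--nodes=2 --tasks-per-node=4 --cpus-per-task=$cpt"
```

When the core count does not divide evenly (for example 5 tasks on a 16-core node), the helper reports an error; in that case the --exclusive flag is the simpler way to keep other jobs off the node.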
How do I manage jobs?
- Launching a job
- Launching a job using a particular account
sbatch -A ACCOUNT_NUMBER script
- Show the accounts I can use on AuN or Mc2
- Launching a job with exclusive access (recommended)
sbatch --exclusive script
- Launching a job in a particular partition or set of nodes or running interactively
- See: How do I select my nodes?
- See all jobs in the queue
- Seeing what jobs I have in the queue
squeue -u $LOGNAME
- Show an estimate of when a job will start
squeue --start --job JOB_NUMBER
- Killing a job
- Show what partitions I am allowed to use:
- Show what partitions I am allowed to use and the nodes:
- Show all partitions and their nodes:
- Formatted Slurm Man Pages
- A cross reference for other work load managers
How do I see my scratch usage?
Managing your scratch space usage
Until recently we did not have a good way for people to monitor their usage of scratch space on Mio. We can now easily show total usage. With a bit more effort you can also show the aging of your files and directories.
mmlsquota (Mio only)
We have enabled the command mmlsquota, which will show your usage.
You can do a
[joeuser@mio001 ~]$ man mmlsquota
to see the full description of the command or
[joeuser@mio001 ~]$ mmlsquota -h
to get a short description.
When you run mmlsquota you will get more information than is useful. You will see two Filesystems listed, lb and sb. The one that describes your scratch usage is lb. The sb filesystem report is not important. You may also see a line that lists a sets fileset, which is also not important.
We have a command, /opt/utility/scsize, that filters out most of the unimportant information. For example:
[joeuser@mio001 ~]$ /opt/utility/scsize
                         Block Limits
Filesystem Fileset type  GB quota  limit in_doubt grace
lb         root    USR   19 76800 102400        0  none
This shows that joeuser has 19 gigabytes in scratch. The quota is a theoretical upper limit as to the amount of space you could use. In fact, you will draw the attention of the HPC group long before you get anywhere (think small fraction) close to that limit.
As you know the HPC group reserves the right to remove files in scratch as necessary to keep the system running. Scratch by definition is for temporary storage of data. If you plan on keeping data it should be moved off of the machine.
There has been a question and debate about automatically removing files after they reach a certain age. Some institutions do that. We don’t, for three reasons. First, people are generally responsible about cleaning up after themselves. Second, it is actually an expensive operation to routinely purge files. Finally, for those few that are not responsible, it is too easy to “game” the aging tests.
However, we now have the ability for users to show their file aging information. This is a multistep process. The first step can be time consuming and hits the file system pretty hard so it is not something you will want to do on a daily basis.
The new command is /opt/utility/agedu. Again, you can get the man page for this command.
For the first step, cd to your scratch directory and then run the command
[joeuser@mio001 joeuser]$ cd $SCRATCH
[joeuser@mio001 joeuser]$ /opt/utility/agedu --no-progress -f $HOME/adedu.dat -s $SCRATCH
This will create an inventory of your scratch directory, stored in the file $HOME/adedu.dat (the name given with the -f option). This can take several minutes. In a recent test for a user with a large number of files this took about 20 minutes. For most users it should run in a minute or two.
Please delete your inventory file, $HOME/adedu.dat, after you are done with it. Inventory files can be rather large and become irrelevant after you have modified your directory. The file is binary and can only be viewed as discussed below.
Once the inventory is created there are many options for displaying the data. You can:
- Filter by age
- Create a text file report
- Create a static HTML page that can be viewed offline
- Create a navigable web page that can show subdirectories
Here are some examples of generating a text report filtering by age. The first column is the amount of data in kilobytes in the given directory of that age or older.
Find data over 2 years old
[joeuser@mio001 joeuser]$ /opt/utility/agedu -a 2y -f $HOME/adedu.dat -t $SCRATCH
89247072 /scratch/joeuser/DMOL
42528 /scratch/joeuser/QuIET
48716960 /scratch/joeuser/Siesta
395154304 /scratch/joeuser
Find data over 1 year old
[joeuser@mio001 joeuser]$ /opt/utility/agedu -a 1y -f $HOME/adedu.dat -t $SCRATCH
89247072 /scratch/joeuser/DMOL
2170464 /scratch/joeuser/Octopus
42528 /scratch/joeuser/QuIET
48717024 /scratch/joeuser/Siesta
1952 /scratch/joeuser/ddscat
397326784 /scratch/joeuser
Find data over 1 month old
[joeuser@mio001 joeuser]$ /opt/utility/agedu -a 1m -f $HOME/adedu.dat -t $SCRATCH
89247072 /scratch/joeuser/DMOL
2170528 /scratch/joeuser/Octopus
512941760 /scratch/joeuser/Qchem
42528 /scratch/joeuser/QuIET
48717024 /scratch/joeuser/Siesta
1952 /scratch/joeuser/ddscat
910268608 /scratch/joeuser
Notice the size changes as we change the reporting period. You can also specify subdirectories to get more detailed information.
[joeuser@mio001 joeuser]$ /opt/utility/agedu -a 9m -f $HOME/adedu.dat -t $SCRATCH/Qchem
8726752 /scratch/joeuser/Qchem/Aniline
1931872 /scratch/joeuser/Qchem/Benzene
32448 /scratch/joeuser/Qchem/Coronene
27296 /scratch/joeuser/Qchem/H2
135328 /scratch/joeuser/Qchem/H2O
34905824 /scratch/joeuser/Qchem/TPA
96 /scratch/joeuser/Qchem/TPBoron
20947296 /scratch/joeuser/Qchem/TPCarbon
50214656 /scratch/joeuser/Qchem/TPP
9971424 /scratch/joeuser/Qchem/TPSilicon
40411488 /scratch/joeuser/Qchem/Trinapamine
4786048 /scratch/joeuser/Qchem/Triphenylarsenic
172090528 /scratch/joeuser/Qchem
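Since the text reports list sizes in kilobytes, it can help to sort them and convert to gigabytes. The sketch below is a small filter for that purpose; agedu_gb is an illustrative helper name and is not part of agedu. It assumes each input line has the form "kilobytes path", as in the reports above, and takes 1 GB to be 1024*1024 KB.

```shell
#!/bin/bash
# agedu_gb: read "<kilobytes> <path>" lines (agedu -t output) on stdin and
# print them largest first with sizes converted to GB (1 GB = 1024*1024 KB).
# agedu_gb is an illustrative helper name, not part of agedu itself.
agedu_gb() {
    sort -rn |
        awk '{
            kb = $1
            sub(/^[ \t]*[0-9]+[ \t]+/, "")   # remove the size column, keep the path
            printf "%8.1f GB  %s\n", kb / 1048576, $0
        }'
}
```

It would be used by piping a report into it, for example /opt/utility/agedu -a 1y -f $HOME/adedu.dat -t $SCRATCH | agedu_gb, which puts the largest directories at the top.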
Create a static web page for offline viewing
[joeuser@mio001 joeuser]$ /opt/utility/agedu -a 1y -f $HOME/adedu.dat -H $SCRATCH/Qchem > agedu.html
You can then copy the file agedu.html to your local machine for viewing. This will give you a static, top-level view of your directory structure.
The next option is much more interesting.
Create a navigable web page
Finally, maybe the most useful option is to create a navigable web page that allows you to dive into subdirectories. When the page is created you can view your directory as a tree structure and navigate to see the size and ages of directories and files.
[joeuser@mio001 joeuser]$ /opt/utility/agedu -a 2y -f $HOME/adedu.dat -w --address mio001.mines.edu --auth basic
This command will block until you do a Control-C. The command shows a user name (agedu), a password and a URL. agedu actually starts a mini web server. It will display your data via the given URL. You will need to enter the requested username and password.
On a live version of the page you can click on the directory name on the right to see details.
Please note, this page is not updated if you delete files. You will need to regenerate the inventory file ($HOME/adedu.dat) to see your updates.
Finally, please delete your inventory file, $HOME/adedu.dat, after you are done with it. Inventory files can be rather large and become irrelevant after you have modified your directory. The file is binary and can only be viewed as discussed above.
Why are my jobs stuck in the queue?
Example: I have only two jobs waiting in the queue. Some users seem to submit multiple jobs (and I mean on the order of hundreds) that join the queue behind my jobs, yet start running on the compute nodes while mine are still stuck in the queue. This does not seem fair; I have grant deadlines, graduation requirements and the like. Why is this allowed to occur?
The number of jobs in the queue can make it look very intimidating, but the number alone may not be a good indicator of how quickly your job will start which in turn affects how quickly you can get your science done.
All jobs that enter the queue are given a priority that is based on three factors. The primary factor is called fair-share; it confers a higher priority in the queue on users and groups that have used fewer core hours recently than others in the queue (giving a relatively lower priority to those who’ve been running at high levels in the recent past). This should give users who have run lots of jobs recently a lower priority compared to users that haven’t run as much recently. If you aren’t running much then your jobs should get a higher priority and run sooner than the users that are submitting hundreds or thousands of jobs.
A second factor concerns the partition. When you submit to your group’s partition your jobs get a very high priority, mainly to ensure that they are evaluated first so that they will preempt any “compute” partition jobs that are currently running on your group’s nodes. Often this will allow your jobs to run immediately. If there are many other users in your group that are using the group’s partition, you may need to wait for their jobs to finish. This latter aspect is a negotiation internal to your group.
The final priority factor is age, which assigns a slightly higher priority to the job the longer it waits in the queue. That factor is mostly meant as a tie-breaker for jobs with the same fair-share value.
All of that is a long way of saying that the queue is not a simple first-in first-out algorithm and that lots of small jobs do not necessarily mean your few jobs have to wait behind all of them. If you have any questions about priority let us know and we can discuss the mechanisms in more detail, along with options to minimize your wait time.
Is there a best practice for submitting jobs to the queue so that wait time is minimized?
For a high-level summary of how the scheduling algorithm manages the queue, see “Why are my jobs stuck in the queue?”. Some additional tips:
- Estimate the resources needed to run your job as accurately as you can. (Set the values in the preamble of your script to reflect an efficient allocation request.) This will help the scheduler use the backfill method when an appropriate block arises, and will also improve start time assessment.
- Perform analyses on your structure prior to submitting your calculation to set input file parameters that will maximize the efficiency of your job (such as energy cut-off values, potentials and exchange-correlation choices, convergence intervals).
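One way to arrive at an accurate time request is to look at how long a similar job actually ran (for instance with sacct -j JOB_NUMBER --format=JobID,Elapsed) and add a safety margin. The helper below is a sketch of that arithmetic only; pad_walltime and the 20% default margin are illustrative assumptions, not site policy.

```shell
#!/bin/bash
# pad_walltime ELAPSED [PERCENT]
# ELAPSED is a run time in [D-]HH:MM:SS form. Print it increased by
# PERCENT (default 20) in D-HH:MM:SS form, suitable for --time.
# The name and the default margin are illustrative choices.
pad_walltime() {
    echo "$1" | awk -v pct="${2:-20}" '{
        days = 0; t = $0
        if (index(t, "-")) { split(t, a, "-"); days = a[1]; t = a[2] }
        split(t, p, ":")
        total = int((days * 86400 + p[1] * 3600 + p[2] * 60 + p[3]) * (100 + pct) / 100)
        printf "%d-%02d:%02d:%02d\n", int(total / 86400), int(total % 86400 / 3600), int(total % 3600 / 60), total % 60
    }'
}
```

The result can be placed in the script preamble as the value of #SBATCH --time. A request that is close to the real run time gives the scheduler more opportunities to backfill the job, while the margin protects against the job being cut off just before it finishes.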
Is there a way to estimate when my job will start running on the compute nodes?
There is a slurm command one may run to obtain the scheduler’s assessment of when a user’s queued jobs will start. Be mindful that the estimate is only as reliable as the accuracy of parameters input by users (such as wall time). That command is:
$ squeue --user=$username --start
$ man squeue
will print the help page for the squeue command to your terminal.
My jobs are always canceled after they have been running for six days. This is extremely frustrating. What is the cause of this and how can I avoid this?
A six-day/144-hour maximum wall time is policy for jobs run on Wendian, and on all Mines HPC platforms. There are reasons for this limit, including fair usage and guarding against the loss of more than six days of compute time. We encourage checkpointing of jobs, should a user anticipate runs taking more than six days. In emergency situations, we can and do extend running jobs upon request; you may submit a help request here for review.
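How checkpointing is done depends on the application, and many codes have their own restart mechanism that should be preferred. As a generic sketch of the idea, the loop below records the last completed step in a file after every step and resumes from that point when it is run again. All names here (run_steps, do_step, the checkpoint file) are hypothetical, and do_step stands in for the real computation.

```shell
#!/bin/bash
# run_steps TOTAL_STEPS CHECKPOINT_FILE [MAX_STEPS_THIS_RUN]
# Resume from the step recorded in CHECKPOINT_FILE, do the remaining work,
# and record progress after every step so a cancelled job loses at most one.
# do_step is a stand-in for the real computation; all names are illustrative.
do_step() { :; }

run_steps() {
    local total="$1" ckpt="$2" budget="${3:-$1}" step=0 done_now=0
    [ -f "$ckpt" ] && step=$(cat "$ckpt")
    while [ "$step" -lt "$total" ] && [ "$done_now" -lt "$budget" ]; do
        do_step "$step"
        step=$((step + 1))
        done_now=$((done_now + 1))
        # write to a temporary file first so the checkpoint is never half-written
        echo "$step" > "$ckpt.tmp" && mv "$ckpt.tmp" "$ckpt"
    done
    echo "$step"
}
```

A job built this way can be resubmitted with the same checkpoint file until all steps are complete, so no single run has to exceed the wall time limit.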