USING UPC at MTU

Phil Merkey

Introduction

Getting on a new machine and figuring out how to compile and run a program in a new language is always harder than it seems it should be. UPC is no different from any other language in this respect.

This usage page is based on an earlier work by Zhang and Thorsen. Errors in this document are my fault, and responses should be directed to me: merk@mtu.edu

This page gives instructions on how to run UPC programs on the machines operated by the Computational Science and Engineering Research Institute (CSERI).

UPC compilers available at MTU

Machine name: lionel.cse.mtu.edu
Architecture: Linux x86 cluster, 20 2-way 2.0GHz Pentium nodes, Myrinet interconnect (only nodes 1-16), Gigabit Ethernet interconnect (only nodes 1-16)
UPC compilers and documentation:
  MuPC: http://www.upc.mtu.edu/MuPCdistribution/
  Berkeley's UPC: http://upc.nersc.gov/docs/

Machine name: flyer.cse.mtu.edu
Architecture: AlphaServer cluster, 8 4-way ES40 nodes with 833MHz Alpha EV68 21264s, Quadrics interconnect
UPC compilers and documentation:
  MuPC: http://www.upc.mtu.edu/MuPCdistribution/
  Berkeley's UPC: http://upc.nersc.gov/docs/
  HP's UPC: http://www.hp.com/go/upc

Machine name: flash.cse.mtu.edu
Architecture: Cray T3E, 48 Alpha 21064s, Cray interconnect
UPC compilers and documentation:
  GCC-UPC: http://www.intrepid.com/upc/
  Bill's UPC: (No longer used.)

Compiling and Running

MuPC on Lionel

Start by logging in with the command ssh lionel.cse.mtu.edu. This is a Beowulf cluster where the front end (lionel) is a normal Linux box. Your shell should be bash, and you should be able to configure your environment in the usual way.

Compiling and running helloworld

/usr/local/MuPC/bin/mupcc -f 4 hello.c
/usr/local/MuPC/bin/mupcrun -n 4 ./a.out
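
The hello.c used in these commands is not reproduced on this page. As a minimal sketch, the following shell snippet writes a small UPC hello-world to hello.c. The program text is an illustrative assumption rather than the file the authors used; it relies only on the standard UPC identifiers MYTHREAD and THREADS and on upc_barrier.

```shell
# Write a minimal UPC hello-world to hello.c.
# The program is illustrative only; any valid UPC source will do.
cat > hello.c <<'EOF'
#include <stdio.h>
#include <upc.h>

int main(void)
{
    /* MYTHREAD and THREADS are predefined by UPC. */
    printf("Hello from thread %d of %d\n", MYTHREAD, THREADS);
    upc_barrier;   /* wait here until every thread has printed */
    return 0;
}
EOF
```

Compiled for 4 threads as shown above, each of the 4 threads should print one line; the order of the lines is not guaranteed.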

MuPC on Flyer

Compiling and running helloworld

/usr/local/MuPC/bin/mupcc -f 4 hello.c
prun -I -n 4 ./a.out

Berkeley's UPC on Lionel

Berkeley's UPC needs a configuration file in order to work smoothly. On lionel, you can create the configuration file by copying /home/zhazhang/.upccrc to your home directory.

Compiling and running helloworld

/home/bonachea/.upc-dist/inst/bin/upcc -T 4 hello.c
/home/bonachea/.upc-dist/inst/bin/upcrun -n 4 ./a.out

Berkeley's UPC on Flyer

Berkeley's UPC needs a configuration file in order to work smoothly. On flyer, please copy /usr/users/zhazhang/.upccrc to your home directory.

Compiling and running helloworld

/usr/local/berkeley/upc/stable/runtime/inst/bin/upcc -T num hello.c
/usr/local/berkeley/upc/stable/runtime/inst/bin/upcrun -n num ./a.out

HP's UPC on Flyer

Compiling and running helloworld

/usr/bin/upc -fthreads 4 hello.c
prun -I -n 4 ./a.out

The T3E

Two compilers are available for the T3E: billsUPC, written by Bill Carlson and Jesse Draper, and GCC-UPC, written by Intrepid. billsUPC is now considered obsolete; we recommend that you use GCC-UPC.

GCC-UPC on the T3E

The T3E differs from the other machines in two main ways. First, the system runs UNICOS, the Cray variant of UNIX, and your default shell is tcsh. If you don't want to use it, don't change anything; just type bash after you log in. You should then be able to control your environment in the usual way. Second, Cray uses modules to control which versions of various packages one uses. To set your environment variables like PATH, LD_LIBRARY and MANPATH to use gcc-upc, you will need to issue the command:
module load gcc-upc
You can of course put this in your .bashrc file as well.

Compiling and running helloworld

upc -fupc-threads-4 hello.upc
./a.out

FAQ

Q3: Where can I get more information about how to use MuPC?
Go to the MuPC web page.

Q4: Where can I get more information about how to use HP's UPC?
(1) Read the man page. (2) Go to the HP UPC web page.

Q5: Where can I get more information about how to use Berkeley's UPC?
(1) Try "upcc -help". It lists all possible command line options. (2) Go to the Berkeley UPC web page.

Q6: Where can I get more information about how to use GCC-UPC?
Go to the GCC-UPC home page.

Q7: How many UPC threads can I launch for my program?
It is generally a good idea to have one UPC thread per node. Having multiple threads per node is possible but usually leads to very poor performance. The only exception is on flyer, where we can have 2 UPC threads per node and still get good performance. Therefore, the suggested maximum values for THREADS on the three platforms are listed below:

lionel: 16 nodes, 16 UPC threads (32 threads possible, but for testing only)
flyer: 8 nodes, 16 UPC threads with 2 threads per node
flash: 48 nodes, 48 UPC threads

Please read more on Q1, Q4 and Q5 in the "Run time system configuration" section.

Run time system configuration

Q1: When running MuPC, how do I select which nodes to use on lionel?
Let lionel decide for you. Lionel will choose the least busy nodes and execute your program on them.
- OR -
Use a machine file. A machine file is a plain text file containing 16 lines, with each line specifying one node name. Node names are n1, n2, ..., n16. The order in which the nodes appear doesn't matter. The run time system always picks the first node to run Thread0, the second to run Thread1, and so on. The machine file name must be supplied in the command line using the -machinefile switch, as in the following:
mpirun -np 16 -machinefile mf ./a.out
Please note that only nodes n1 to n16 are eligible nodes to be in the machine file. Read more in Q2 and Q3.
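
As a minimal sketch, a machine file of this form can be generated rather than typed by hand. The helper name make_machinefile and the file name mf are assumptions chosen for illustration; they are not part of MuPC or MPICH.

```shell
# make_machinefile FILE [FIRST] [LAST]
# Hypothetical helper: writes node names nFIRST..nLAST, one per line.
# Defaults to n1..n16, the eligible nodes on lionel.
make_machinefile() {
    file=$1 first=${2:-1} last=${3:-16}
    : > "$file"                      # create or truncate the file
    i=$first
    while [ "$i" -le "$last" ]; do
        echo "n$i" >> "$file"
        i=$((i + 1))
    done
}

make_machinefile mf                  # writes n1 .. n16 to ./mf
```

The resulting file can then be passed to the -machinefile switch as in the mpirun command above.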

Q2: What happens if some nodes in my machine file are down?
You can still run your programs as long as there are enough running nodes for your purposes. But you have to take the faulty ones out of your machine file by deleting them from the machine file or prepending each of them with a # sign.
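
Commenting out a faulty node can be sketched as a one-line sed edit. The helper name disable_node and the file mf.example are assumptions made for this example only.

```shell
# disable_node FILE NODE
# Hypothetical helper: prepends '#' to the line naming NODE.
disable_node() {
    file=$1 node=$2
    # Match the whole line so that n1 does not also hit n10..n16.
    sed "s/^${node}\$/#${node}/" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

printf 'n1\nn2\nn10\n' > mf.example   # small sample machine file
disable_node mf.example n1            # file now holds: #n1, n2, n10
```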

Q3: When should I supply a machine file?
Only if you want to use specific nodes.

Q4: How do I specify which nodes to use on flyer? Do I need a machine file also?
On flyer you don't need a machine file.
You specify the layout of MuPC threads using the command line, for example:
1. 2 UPC threads using 2 nodes: prun -I -n 2 -N 2 ./a.out
2. 2 UPC threads using 1 node: prun -I -n 2 -N 1 ./a.out
3. 4 UPC threads using 4 nodes: prun -I -n 4 -N 4 ./a.out
4. 4 UPC threads using 2 nodes: prun -I -n 4 -N 2 ./a.out
5. 4 UPC threads using 1 node: (poor performance, not recommended) prun -I -n 4 -N 1 ./a.out
6. 16 UPC threads using 8 nodes: prun -I -n 16 -N 8 ./a.out
and so on.
You've got the idea. For more information, please refer to the man pages of prun and allocate.
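
The pattern in these examples, with 2 UPC threads packed on each node, can be sketched as a small helper that prints the matching prun command. The name prun_cmd is an assumption for illustration; the helper only prints the command and does not run it.

```shell
# prun_cmd THREADS [PROGRAM]
# Hypothetical helper: prints a prun command that places 2 UPC threads
# per node, the suggested limit on flyer.
prun_cmd() {
    threads=$1 prog=${2:-./a.out}
    nodes=$(( (threads + 1) / 2 ))   # round up to whole nodes
    echo "prun -I -n $threads -N $nodes $prog"
}

prun_cmd 16    # prints: prun -I -n 16 -N 8 ./a.out
prun_cmd 4     # prints: prun -I -n 4 -N 2 ./a.out
```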

Q5: How do I specify which nodes to use on flash?
You have no control of processor allocation on flash. The system automatically spawns UPC threads for you on the least busy nodes.

Q6: How do I set the cache size for MuPC?
Create a file named mupc.conf in your home directory. Add the following three lines to this file:
CACHE_LINE_LENGTH 1024
CACHE_TABLE_SIZE 256
SHARED_MEM_SIZE_PER_THREAD 268435456

The first two lines govern the geometry of the cache. The values shown above are the default settings. Modify those values to change the cache size. Note that both values must always be powers of 2, with one exception: setting both of them to zero turns off the cache. The maximum value for CACHE_TABLE_SIZE is 1024, and the maximum value for CACHE_LINE_LENGTH is 8192. Read more in Q7.
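
The rules above (powers of 2, the two maximum values, zero for both to disable the cache) can be checked before the file is written. The following is a sketch only: the helper names is_pow2 and write_mupc_conf and the example output path are assumptions, and the third line is simply kept at the default value shown above.

```shell
# is_pow2 N: succeeds if N is a positive power of two.
is_pow2() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# write_mupc_conf LINE_LENGTH TABLE_SIZE [FILE]
# Hypothetical helper: validates the cache geometry, then writes the
# three-line configuration. FILE defaults to mupc.conf in the home directory.
write_mupc_conf() {
    len=$1 tab=$2 conf=${3:-$HOME/mupc.conf}
    if [ "$len" -eq 0 ] && [ "$tab" -eq 0 ]; then
        :   # both zero: the cache is turned off
    elif ! is_pow2 "$len" || ! is_pow2 "$tab" ||
         [ "$len" -gt 8192 ] || [ "$tab" -gt 1024 ]; then
        echo "invalid cache geometry: $len x $tab" >&2
        return 1
    fi
    printf 'CACHE_LINE_LENGTH %s\nCACHE_TABLE_SIZE %s\nSHARED_MEM_SIZE_PER_THREAD 268435456\n' \
        "$len" "$tab" > "$conf"
}

write_mupc_conf 2048 512 ./mupc.conf.example   # example path, not the real location
```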

Q7: What exactly is the structure of MuPC's cache?
The cache in MuPC is a non-coherent, direct-mapped, write-back cache. Each UPC thread maintains a cache with (THREADS-1) blocks, one block for references made to each of the other threads. Each block has CACHE_TABLE_SIZE lines; each line has CACHE_LINE_LENGTH bytes.
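
The structure described here implies a per-thread cache size of (THREADS-1) x CACHE_TABLE_SIZE x CACHE_LINE_LENGTH bytes. As a rough sketch, the arithmetic can be done in the shell; the helper name cache_bytes is an assumption, and the figure counts cached data only, ignoring any bookkeeping the run time system may add.

```shell
# cache_bytes THREADS [TABLE_SIZE] [LINE_LENGTH]
# Hypothetical helper: (THREADS - 1) blocks x TABLE_SIZE lines x LINE_LENGTH bytes.
# TABLE_SIZE and LINE_LENGTH default to the MuPC defaults (256 and 1024).
cache_bytes() {
    threads=$1 tab=${2:-256} len=${3:-1024}
    echo $(( (threads - 1) * tab * len ))
}

cache_bytes 16    # default geometry, 16 threads: prints 3932160 (3.75 MB)
```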

Q8: What is the third line (last line) in mupc.conf for?
This line specifies the amount of heap space available for UPC's dynamic memory allocation. You can enlarge this value if you encounter an "insufficient memory" problem. But remember that the real amount of heap space is limited by the physical memory resources on the platform.

Q9: How do I set the cache size for HP's UPC?
The following environment variables control the cache behavior for HP's UPC:
UPCRTS_USE_CACHE (default FALSE, set to TRUE to turn on caching)
UPCRTS_CACHE_SETS (default 128, similar to CACHE_TABLE_SIZE in MuPC)
UPCRTS_CACHE_BLOCK_SIZE (default 64 bytes, similar to CACHE_LINE_LENGTH in MuPC)
UPCRTS_CACHE_ASSOCIATIVITY (default 4)
Please refer to the man page of HP's UPC for more details.
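
For illustration, the variables can be set in sh or bash syntax before launching the program. This is a sketch under the assumption that plain numeric values are accepted; the numbers are just the defaults quoted above, and the man page remains the authority on accepted formats. Users of csh-style shells would use setenv instead of export.

```shell
# Turn on HP UPC's cache for runs started from this shell.
# Values below are the defaults listed above; change them to resize the cache.
export UPCRTS_USE_CACHE=TRUE
export UPCRTS_CACHE_SETS=128
export UPCRTS_CACHE_BLOCK_SIZE=64
export UPCRTS_CACHE_ASSOCIATIVITY=4
```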

Q10: Do I need to re-compile my code every time I change the cache size?
No. These settings take effect at run time only.

Q11: How do I set the cache sizes for Berkeley's UPC and GCC-UPC?
As far as we know, Berkeley's UPC and GCC-UPC do not have caching facilities yet.

Q12: How do I turn caching on and off?
See Q6 or Q9.

Q13: How do I set the heap size for dynamic shared memory allocation in MuPC?
See Q8.

Q14: How do I set the heap size for dynamic shared memory allocation in HP's UPC?
Modify the LIBELAN_ALLOC_SIZE environment variable.

Q15: How do I set the heap size for dynamic shared memory allocation in Berkeley's UPC?
Use the "-shared-heap" switch at compile time. For example:
upcc -T 4 -shared-heap=256MB prog.c
upcc -T 4 -shared-heap=1GB prog.c
and so on.

Q16: How do I set the heap size for dynamic shared memory allocation in GCC-UPC?
We don't know yet. If you figure it out, please let us know.

Questions about using lionel

Q1: When running MuPC, how do I specify which nodes to use on lionel?
See Q1 in the "Run time system configuration" section.

Q2: I am used to using LAM MPI. Why can I not use LAM anymore?
MuPC on lionel is configured to use MPICH-GM. Your PATH environment variable should contain an entry for /usr/local/mpi/bin. This way MuPC can find the correct mpicc and mpirun. See Q3 for more information.

Q3: How do I edit my PATH environment variable to get the right MPI for MuPC on lionel?
Edit the .bashrc file in your home directory (assuming you are using bash).
Add /usr/local/mpi/bin and /usr/local/MuPC/bin to the front of PATH, so that they are searched before any other MPI installation.
Then log out and log in again.
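
A minimal sketch of the .bashrc change follows. The two directories are the ones named in this answer; the check on the last line is an optional addition for illustration.

```shell
# Lines for ~/.bashrc on lionel: search MPICH-GM and MuPC before anything else.
export PATH=/usr/local/mpi/bin:/usr/local/MuPC/bin:$PATH

# Optional check: show which mpirun is now picked up, if any.
command -v mpirun || echo "mpirun not found in PATH" >&2
```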

Q4: Why does MuPC hang on lionel?
The most likely cause is a bug in your program. If you are certain your program is correct, there could be something wrong with the system.
See Q1 in the "Run time system configuration" section.

Q5: Why do I get "cannot open GM port" error messages, or something similar?
It might be a problem with the system. Please see Q7 below.
Sometimes the Myrinet network on lionel misbehaves. If so, send a note to Christopher K. Pinnow (ckpinnow@mtu.edu).

Q7: I do have a machine file specified in the command line, but MuPC still fails. Why?
Check your machine file.
On lionel, only nodes n1 to n16 are connected by the Myrinet switch. Nodes n17, n18, n19 and n20 are not on the Myrinet network (nor are they on the Gigabit Ethernet network). None of them should be listed in the machine file. You should also not include the front end node, lionel.cse.mtu.edu, in the machine file. It is not on the Myrinet network either and it is not a compute node.

Questions about using flyer

Q1: What MPI implementation are we using on flyer? Do I need to edit the PATH environment variable in order to run MuPC?
We use the Quadrics MPI library on flyer. This is the default MPI installation. You do not need to edit your PATH to run MuPC on flyer.

Q2: How do I specify which nodes to use on flyer? Do I need a machine file also?
See Q4 in the "Run time system configuration" section.

Q3: How do I set the cache size for HP's UPC on flyer?
See Q9 in the "Run time system configuration" section.

Q4: Why do I get an "insufficient memory" message when using HP's UPC?
Q5: Why does dynamic allocation fail when using HP's UPC?
See Q14 in the "Run time system configuration" section.

Q6: Why do I get "cannot allocate resource now" error messages on flyer?
This means that not enough nodes are currently available because some are occupied by other users. Wait and try again later. You can use the command rinfo to see which nodes are currently free.

Q7: Why can't the block size of a shared object exceed 1024?
This is a limitation of some old versions of HP's UPC. In the current version, the -wide option in the command line allows you to have a much larger block size. Read the man page for more information.

Questions about using flash

Q1: How do I specify which nodes to use on flash?
See Q5 in the "Run time system configuration" section.

Q2: Why doesn't "logout" work on flash?
Because the default shell on flash is ksh. Use exit instead, or change your default shell to csh using the chsh command.

Q3: How do I set the cache size for GCC-UPC on flash?
GCC-UPC doesn't have a cache yet.

Q4: How do I set the heap size for dynamic shared memory allocation in GCC-UPC?
See Q16 in the "Run time system configuration" section.