
Computing Resources


1. Overview


This document describes acceptable use of NRAO computing facilities for calibrating and imaging ALMA observations at the North American ALMA Science Center (NAASC).  It describes the process of requesting an account, requesting resources, and accessing data.  In addition, it enumerates the available NRAO hardware and software and the limits on the volume and duration of resource requests.

Users of NRAO computing resources must abide by the Acceptable Use Policy. Sharing of assigned user accounts is not permitted.

Resource request types and prioritization

The NAASC has finite computing resources. In the event of oversubscription, resources will be granted in the following priority order:

  1. Pipeline reprocessing requests to re-run NRAO-supplied ALMA pipelines with modified parameters.
  2. Batch (script) submission to execute a user-defined pipeline.
  3. Interactive use, for working directly with CASA or other tools.


In the event that all resources are in use, new tier 1 jobs will move to the top of the queue, followed by tier 2 and finally tier 3.  NRAO expects roughly 50% of the compute resources to be available for tier 3 interactive use.  The distribution of job types is expected to change as more observers adopt the pipeline reprocessing and batch processing modes.

Over time, NRAO will examine finer-grained prioritization, particularly within the batch and interactive queues, based on science rank, data size, time since observation, or other parameters.


2. User Accounts and Remote Access

Requesting an account

A valid entry in the my.nrao.edu User Database (UserDB) is required for account access.  Please ensure your email address therein is correct before requesting an account.

To request a temporary computer account, perform the following steps:

  1. Ensure your default email address is correct at my.nrao.edu
  2. Submit a ticket at https://help.almascience.org/
    1. To log in, use the same user ID and password as when accessing the ALMA Science Portal.
    2. Under 'Select a department' choose "Data Reduction (NA)" which will ensure the ticket is directed to the appropriate group.
    3. Indicate how long you will need the account; the default is one month: 2 weeks for processing plus a 2-week grace period to transfer data products.

A unique UNIX computer account will be created upon receipt of the request ticket.  The account name will be 'cv-<#ID>', where <#ID> is your numeric UserID in the User Database (e.g. cv-4386); the account password will be the same as the one used above to submit the request.

You will receive an automated email delivered to the address registered in the UserDB when the account is created.  The email will include your account name, account expiration time and a pointer to this documentation.

The assigned account allows access to the NAASC ssh portal, authenticated ftp (sftp) server, NAASC Guest Workstations, the processing cluster master server and any assigned cluster nodes.  It does not grant access to other NAASC or NRAO systems or staff machines.

Requesting Resources

Note: to preserve ssh agent and X11 forwarding, you may need to add the -AX (Linux) or -AY (macOS) options to the ssh command line.

Cluster node reservation requests must be issued on the cluster master node: cvpost-master.  This server is not directly accessible from outside the NRAO.  To access it from a system outside the NRAO, you must first log in to the ssh portal:

ssh <username>@ssh.cv.nrao.edu

From there you can ssh to the correct host:

ssh cvpost-master
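If your local OpenSSH client is version 7.3 or later, the two hops can be combined with the -J (ProxyJump) option; a minimal sketch, reusing the cv-4386 example account from elsewhere in this manual:

ssh -J cv-4386@ssh.cv.nrao.edu cv-4386@cvpost-master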

To reserve a cluster node for interactive use, run the command nodescheduler on cvpost-master.  The command takes two arguments: the number of days to reserve the node and the number of nodes.  Please limit the number of nodes to 1 unless you can show a need for multiple servers.  For example:

cvpost-master$ nodescheduler --request 14 1 

would reserve one node for 14 days.  Running nodescheduler with no arguments will print usage information.  Once a node has been assigned (which can take anywhere from seconds to hours depending on demand), you will receive an email listing which node you were assigned and how to release it when you are done.  Please release the node when you are finished or will not be using it for an extended period of time (days).

Once a node has been reserved, your account, and only your account, has access to it.  For information on available software (e.g. CASA, AIPS, Miriad) see the software section of this manual.

3. Connecting via VNC

Accessing the Cluster Remotely with VNC

While ssh with X11 forwarding works fine on the internal NRAO network, we recommend VNC if you need to display graphics from a remote site.

Connect to the NRAO

From your local machine, log in to the ssh portal ssh.cv.nrao.edu with your username (e.g. cv-4386).  Skip this section if you are physically at the NRAO.

For Linux and Mac Machines

ssh cv-4386@ssh.cv.nrao.edu

For Windows Machines

Install PuTTY, fill in the Host Name field and click Open.

Start the VNC Server

From the ssh portal, or some other NRAO machine, log in to the node assigned to you (e.g. cvpost050)

ssh cvpost050

and start a VNC server with the following command

vncserver

The first time you run this, it should prompt you to set a password.  Do not reuse the password of your user account.  The system should then return something like:

New 'cvpost050:1 (cv-4386)' desktop is cvpost050:1

The 1 in this example is your session number; you will need it when you connect with your VNC client.  The session number also determines the TCP port used by some clients (5900 + session number, so port 5901 here).

Connect to the VNC Server

The VNC client used to connect to the VNC server depends on your local operating system (Linux, macOS, or Windows).

Linux (RHEL, CentOS, SL, OEL, Debian)

If your local machine runs an RHEL or Debian derivative, use vncviewer to start the VNC connection (assuming the session number is 1):

vncviewer -via cv-4386@ssh.cv.nrao.edu cvpost050:1

If you are physically at the NRAO, omit the "-via" option:

vncviewer cvpost050:1

Linux (Ubuntu)

If your local machine runs Ubuntu, use remmina to start the VNC connection (again assuming the session number is 1).

Launch the remmina program and select Connection -> New


Set the Name to something descriptive like NRAO Cluster, change the Protocol to VNC - Virtual Network Computing, set the Server to the node assigned to you followed by a colon and the session number (e.g. cvpost050:1), and set the User name (e.g. cv-4386).  If you see a Repeater field, leave it blank.  Then select the SSH tab.


Check the box for Enable SSH tunnel, select Custom and set it to ssh.cv.nrao.edu, set the User name (e.g. cv-4386), and click Save.  The window will disappear (Ubuntu 16+); right-click on the entry for this connection in the main remmina window and choose Connect.


Mac

There are two general ways of connecting on a Mac. 

Built-in Screen Sharing

This assumes you are connecting from outside the NRAO.  First, establish a tunnel to the relevant node in a terminal window:

ssh -N -L 5901:cvpost050:5901 cv-4386@ssh.cv.nrao.edu

Leave that terminal in the background.  Then in the Finder, pull down the Go menu and choose Connect to Server.  For the server address, specify:

vnc://localhost:5901

You will be prompted for the VNC password you set when you first launched the VNC server.
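For a session number other than 1, adjust both port numbers accordingly (5900 + session number).  For example, session 2 on the same node would be tunneled with:

ssh -N -L 5902:cvpost050:5902 cv-4386@ssh.cv.nrao.edu

and reached at vnc://localhost:5902.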


Add-on Software (Chicken)

CAUTION: The "Chicken of the VNC" client has not had updates in some time and may be abandoned.  Use only with caution.

If your local machine is a Mac, you can instead use a VNC client like Chicken with the following setup (assuming the session number is 1).  If you are physically at the NRAO, leave the "SSH Host" line blank.

Host: <node assigned to you> (e.g. cvpost050)

Display or port: <the session from above>  (e.g. 1)

Password: <the VNC password you created above>

Tunnel over SSH: check this box

SSH Host: <username@ssh.cv.nrao.edu> (e.g. cv-4386@ssh.cv.nrao.edu)

Windows

If your local machine is Windows, use a VNC client like the Java Viewer from TightVNC with the following setup.  The port number can be found by adding 5900 to the session number.  So in the above example, with a session number of 1, the port will be 5901.  If you are physically at the NRAO, leave the "SSH Server" line blank.

End the VNC Server

Commands that are run in this VNC session will continue to run even after you close your local VNC client.  Once all processes are done, you should shut down your VNC server by logging in to your cvpost cluster node via ssh again and running (assuming the session number is 1)

vncserver -kill :1
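If you are unsure of the session number, some VNC implementations (TigerVNC, for example) can list your active sessions on the node:

vncserver -list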

4. Resource Limits and Data Retention

Processing resource allocations and limits apply to both NRAO staff and observers.

Limits

Users are limited to 3 TB of space on the shared Lustre filesystem and up to 2 compute nodes.  For interactive sessions, nodes are assigned for 1 to 14 days; the default access duration is 1 week, with a 2-week maximum.

Compute nodes reserved for interactive use can only be accessed by the reserving account.

During periods of increased demand, users may be asked to reduce their usage to one node to allow broader community access.

Requests for increased access duration, storage space, or compute nodes should be submitted as a ticket to https://help.almascience.org/ (Data Reduction [NA] department) and will be reviewed by designated NAASC staff on a case-by-case basis.

Status

Reports on Lustre space usage are generated periodically.

Data Retention

External accounts, along with any data products, will be removed two weeks after the completion of a processing request.  For interactive sessions, the account and data products will be removed two weeks after the end of the session.  You will receive an email warning prior to account deletion.  In the event of multiple processing requests, the account expiration date will be set by the last request to complete.

Large Proposals

Observers associated with large proposals (greater than 200 hours) who plan to use NRAO computing resources are encouraged to request the creation of a project area on the Lustre filesystems and a Unix group for shared data access among proposal members.

Project area size limits will be negotiated with NRAO at the time of the request to match project data rates and imaging plans but will typically be 10TB to 20TB in size.

Large proposals may request a block of compute nodes accessible by anyone in the group.  Typically this would be 2 to 4 nodes reserved for 2 to 4 months at a time.

5. Available Hardware Resources

The NAASC post-processing environment comprises a processing cluster that supports CASA execution and a data storage cluster that hosts the Lustre filesystem.  Resource allocations of NRAO facilities are limited by the available nodes (servers) in the processing cluster and the total space within the storage cluster.

Processing Cluster

The NAASC has a 64-node compute cluster.  Each node has dual 8-core 2.6 GHz Intel E5-2670 Sandy Bridge processors (16 cores per node) and 64 GB of memory.  The compute nodes have no local disk storage; instead, they are connected to a distributed parallel filesystem (Lustre) via a 40 Gbit InfiniBand network.  The compute cluster supports automatic ALMA pipeline processing, archive retrievals, batch processing requests, and interactive processing sessions.

Lustre Filesystem

Lustre is a distributed parallel filesystem commonly used in HPC environments.  Each cluster node sees the same shared filesystem.  The NAASC Lustre filesystem is made up of many storage servers, each with multiple RAID arrays varying from 8 TB to 16 TB in size.  The total storage volume varies over time; as of 2016-02-02 it was 436 TB.  Individual nodes can read and write to the Lustre filesystem in excess of 1 GByte/sec, and the entire filesystem can sustain roughly 10 GByte/sec of aggregate I/O.

The Lustre filesystem appears as /lustre/naasc on all Lustre-enabled client computers.

Public Workstations

The NAASC has five workstations for local visitors.  The systems have 8 x Intel Xeon E5-1660 3.0 GHz processors, 32 GB RAM, 3 TB of local disk space (accessible as /home/<hostname>_1 and /home/<hostname>_2), and a 10 Gbit connection to the Lustre filesystem.  Instructions for reserving workstations can be found in the BOS system (used to make your visitor reservation at the NAASC).

6. Data Storage and Retrieval

Data Storage

Observer accounts reside on the Lustre filesystem in /lustre/naasc/observers/<account name>.  This area is also where you should store all data products and scratch files.  Lustre is a resource shared among all staff and observers; we ask that everyone keep their usage as far below the 3 TByte limit as possible.
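To check how much space you are currently using, the standard du utility works from any cluster node; for example, for the hypothetical cv-1234 account:

du -sh /lustre/naasc/observers/cv-1234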

Please consult the NAASC Data Analysts in order to retrieve data directly from the archive into your home area.

Data Retrieval

The NAASC supports the following methods for securely transferring data to remote facilities, and has plans to support XSEDE's Globus Connect platform.  In the following examples, <user> is your cv-#### account name and your data area is /lustre/naasc/observers/<user>.

SFTP

SFTP is an encrypted file transfer protocol that behaves much like classic ftp.  You can access your home area from remote facilities via sftp sftp.cv.nrao.edu, logging in with your NAASC account name.  From there, sftp behaves much like any ftp client.

The example below would connect user cv-1234 to the sftp server.  The current directory would be cv-1234's Lustre home area.

sftp cv-1234@sftp.cv.nrao.edu
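Once connected, the usual ftp-style commands apply.  A brief hypothetical session (the file names are placeholders):

sftp> cd data
sftp> ls
sftp> get image.fits
sftp> mget *.tar
sftp> quit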

SCP

SCP performs encrypted copies and can transfer files between remote hosts.  The general form is scp user@remotemachine:<remote_path> <local_path>.  From your machine you would run scp <user>@ssh.cv.nrao.edu:<relative path to files> <local path>

The example below would copy all files ('*') in cv-1234's data directory to the current directory ('.') on your machine.

scp cv-1234@ssh.cv.nrao.edu:data/* .
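To copy an entire directory tree rather than individual files, add scp's standard -r (recursive) flag; for example, to fetch the whole data directory:

scp -r cv-1234@ssh.cv.nrao.edu:data .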

LFTP

LFTP is a sophisticated file transfer client which, among other things, can use multiple parallel channels to speed up transfers.

lftp -u <user> sftp://sftp.cv.nrao.edu
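lftp's mirror command can download a whole directory over several parallel connections; a sketch, again using the data directory from the examples above:

lftp -u cv-1234 -e 'mirror --parallel=4 data; quit' sftp://sftp.cv.nrao.edu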

RSYNC

RSYNC is a versatile file-copying tool that copies only the necessary files, that is, the ones missing from your local copy.  This is useful if, for example, you have deleted some files from your local copy of your data directory and want to copy just those missing files.

By default, and in the examples shown here, rsync uses ssh as the underlying transport protocol.  It works best if you have your identity already cached in your ssh-agent.

The example below would copy all the files in cv-1234's data directory to a local directory.  Without the trailing '/', rsync would copy the directory and its contents (so your current local working directory would end up with a "data" subdirectory); with a trailing '/', it copies only the contents of the remote "data" directory into the local working directory.  Adding '--delete' to the argument list will keep the two areas exactly in sync by removing files from your local copy if they have been removed from the remote copy.

rsync -av cv-1234@ssh.cv.nrao.edu:data/ .
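For example, the mirror-and-prune variant described above would be:

rsync -av --delete cv-1234@ssh.cv.nrao.edu:data/ .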

Browser Access

You can use the bulk.cv.nrao.edu web server to retrieve tarred files after your account is closed but before it is deleted.

Go to https://bulk.cv.nrao.edu/observers/<user-account>.  You will need to log in using the observer account name and your my.nrao.edu password.  This will allow you to navigate your filesystem in a browser and view files.

GlobusOnline

The NAASC will be investigating a Globus Connect portal in coming months.

7. Software

For detailed information regarding data processing refer to the CASA and AIPS documentation pages.

The following information addresses details specific to executing CASA or AIPS at the NAASC and lists additional reduction packages available at the NAASC.

CASA

The current release of CASA is the default when executing 'casa' or its derivatives (casapy, casabrowser, etc.).  Previous releases can be found in /home/casa/packages/RHEL6/release/casapy-<version>.  In addition, casa -ls will list the available versions of CASA, and casa -r <version> will run a specific version.
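For example (the version string is a placeholder; use one reported by casa -ls):

casa -ls           # list the available CASA versions
casa -r <version>  # run that specific version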

AIPS

AIPS is not supported by the NAASC, but the information below is given for completeness.

To use AIPS on the cluster, users should have an AIPS-specific Lustre data area and a personal .dadevs.always file in their home directory.  There are six predefined public Lustre areas on both the NAASC and CV Lustre systems, but these are intended for temporary use and are periodically cleaned of old files.

To set up your own AIPS data area, do the following:

  cd                          # make sure you are in the home area
  mkdir AIPS_1                # create the directory
  touch AIPS_1/SPACE          # make an empty lock file (required)
  emacs -nw ~/.dadevs.always  # note the leading dot in the filename

(or use vi, gedit, or your preferred editor).  Then ensure this line is contained in the file:

  +  /lustre/naasc/observers/<username>/AIPS_1

The + sign MUST be in the first column, and there MUST be exactly two spaces between it and the leading slash in /lustre....
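Equivalently, assuming your account name matches the $USER environment variable, the line can be appended from the shell (note the two spaces after the +):

echo "+  /lustre/naasc/observers/$USER/AIPS_1" >> ~/.dadevs.always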

8. Reporting Problems

Please report any problems through the ALMA Science Helpdesk.  To log in, use the same user ID and password as for the ALMA Science Portal.  Note that this is a different account name from your temporary computer account but the same password.  The correct department is Data Reduction (NA).