This is www.gnqs.org, The Home Of Batch Processing




Installing Generic NQS Version 3.4x.x

Academic Computing Services, University of Sheffield

Stuart Herbert (S.Herbert@sheffield.ac.uk)

Document copyright ©. All rights reserved.


Abstract

Under grant NTI/48.2 from the New Technologies Sub Committee (NTSC) of JISC, the University of Sheffield is maintaining a freely-available version of the Network Queueing System (NQS), the de facto standard batch processing system for the UNIX operating system.

This document explains how to install, and configure, Generic NQS v3.40 or later.


Introduction


Welcome To Generic NQS

Thank you for your interest in Generic NQS.

Generic NQS is the continuing development of Monsanto-NQS, itself descended directly from the original COSMIC NQS, written under contract to NASA by Sterling Software, Inc.

Since October, 1994, Monsanto-NQS (and then Generic NQS) has been maintained by The University of Sheffield. We are funded to produce a freely-available, robust, and well-documented version of NQS for UK Academia.

For more information, please see the `README' file included with the source code distribution, or, alternatively, read it at :

> http://www.shef.ac.uk/uni/projects/nqs/Product/GNQS/v3.4x/README/


About This Document


Purpose

This document aims to provide you with all the information you need in order to compile, install, and set up NQS at your site for daily use.

Instructions are included (in the following order) for :

  • New Generic NQS users - Quick Start Guide

    One of the more frequent complaints about Generic NQS is that it appears to be very difficult to install - much more so than, say, CERN NQS.

    The Quick Start Guide is for you if you just want to get NQS up and running in a hurry.

  • New Generic NQS users - compiling, and installing.

    These are detailed instructions for users who are installing GNQS for the very first time, or who are re-installing from scratch.

  • Existing Monsanto-NQS/Generic NQS users - upgrading.

    These are detailed instructions for users who are upgrading from an older version of Generic NQS, or from Monsanto-NQS.

  • How To Configure GNQS For Batch On A Single Machine

    These are typical configurations for a single compute server.

  • How To Configure GNQS For Batch On A Cluster Of Workstations

    These are typical configurations for a set of workstations which are clustered together.


Using This Document

I recommend that you print this document out (or otherwise have the document available) so that you can follow the instructions below and still read this document at the same time.

A HTML version of this document is available from the following URL :

> http://www.shef.ac.uk/uni/projects/nqs/Product/GNQS/v3.4x/Install/


Conventions

During each step of the installation process, you will see a paragraph (or more) of instructions, followed by sample commands which demonstrate what to do. Sample commands are represented by

>  this is a sample command


Contacting The Author

If there is anything about compiling, installing, and setting up Generic NQS which this document does NOT cover, please mail the author.

> mailto:S.Herbert@sheffield.ac.uk

I normally reply within one working day.


Quick Start Guide


Introduction

In this chapter, we look at how to compile, install, and configure Generic NQS with the minimum of fuss, effort, and detail. After reading this chapter, please read the two chapters which explain typical configurations for using Generic NQS on a number of machines.


Installing Generic NQS

As the user `root' on your machine, download the Generic NQS source code. Uncompress the source code.

>  ftp ftp.shef.ac.uk
>  cd /pub/uni/projects/nqs/latest
>  binary
>  get Generic-NQS.tar.gz
>  quit
>  tar zxf Generic-NQS.tar.gz
>  cd Generic-NQS-3.40.0/proto
Edit the file `Makefile'. Scroll through the Makefile, and uncomment the line which includes the support for your version of UNIX.

>  vi Makefile
>  (scroll down to where it says STEP THREE)
>  (remove the # from the front of the correct line for your OS)
>  :wq (save and quit)
Compile and install the software.

>  make ; make directories ; make install ; make install.man
Add a service entry for Generic NQS. If your machine uses NIS, NIS+, or a similar naming service, then you may need to take additional steps - refer to your system administrator's manual.

>  vi /etc/services
>  nqs	607/tcp			# Generic NQS
Create an NQS machine id for your host.

>  rehash
>  nmapmgr
>  add host <your machine's hostname>
>  (eg add host myrddraal)
>  add alias <your machine's full DNS name> <your machine's hostname>
>  (eg add alias myrddraal.shef.ac.uk myrddraal)
>  list
>  exit
Finally, start Generic NQS for the very first time.

>  qmgr start nqs > /usr/adm/nqslog
If Generic NQS fails to start, then any error messages will be placed in the file /usr/adm/nqslog.

You may want to put this into your machine's startup scripts.


Configuring Generic NQS

First things first - add your normal, non-root user as an NQS manager, so that you do not have to be logged in as root whenever you want to change the configuration of Generic NQS.

>  qmgr add manager yourself:m
Also add any other NQS managers as required.

Next, specify where you want the NQS logfile to go.

>  qmgr set log_file /usr/adm/nqslog
Now, create your first NQS queue.

>  qmgr create batch_queue test1
>  qmgr enable queue test1
>  qmgr start queue test1
Submit your first NQS request.

>  qsub -q test1
>  date
>  ^D
Finally, view the output files created by your first NQS request.

>  ls STDIN.*
>  more STDIN.o0
>  more STDIN.e0

Compiling And Installing Generic NQS


Introduction

In this chapter, we look at how to compile and install the GNQS source code. These instructions were written for Generic NQS v3.40.0, and were last updated on Friday, 1st September 1995.

Generic NQS v3.40.0 is available from

> ftp://www.shef.ac.uk/uni/projects/nqs/v3.40/


Getting Started

Before you start, you need to know the following :

  • Which version of UNIX do you have?

    Generic NQS has been tested on the following versions of UNIX at some point in its history :

    AIX 3.2.5, 4.1; Fujitsu; HP-UX 8,9; Irix 4,5,6; Linux; NCR; OSF/1; Solaris 2; SunOS 4; ULTRIX; UNICOS 8.

    If you are successful in getting Generic NQS to work, please email me, and tell me, or use the following form on the World-Wide Web :

    > http://www.shef.ac.uk/uni/projects/nqs/Product/GNQS/v3.4x/Success.html


Edit The Makefile

Change to the `proto' directory, and edit the file `Makefile'.

>  cd Generic-NQS-3.40.0/proto
>  vi Makefile


Where NQS Is Installed

The GNQS software itself can be installed onto a central server, and shared between all of the workstations which mount their software (typically via NFS) from that server.

The GNQS software is installed into directories under NQS_ROOTDIR. NQS_ROOTDIR can be a directory which is shared via NFS between several machines. For example, if NQS_ROOTDIR was `/usr/local', then GNQS would be installed into `/usr/local/bin', `/usr/local/sbin', `/usr/local/man', and so on.

If you want to install GNQS into somewhere other than `/usr/local', then change the value of NQS_ROOTDIR.

> NQS_ROOTDIR = /usr/local
The GNQS software also requires temporary file space to work. This temporary space MUST be unique to each machine running GNQS, although it can be on an NFS partition. By default, GNQS uses the standard UNIX spool area, `/usr/spool'.

If you want GNQS to store its working files elsewhere, change the value of NQS_ROOTPRIV.

> NQS_ROOTPRIV = /usr/spool
After this, the Makefile has a set of entries (from NQS_MAN to NQS_NMAP) which specify where to install the various components of Generic NQS. We recommend that you use the default settings.


Which Optional Features Do You Want?

Generic NQS now includes a number of features which you can choose to add, or leave out, at compile time. The Makefile includes a brief list, and you can find more details in the file `doc/Features', or from the URL

> http://www.shef.ac.uk/uni/projects/nqs/Product/GNQS/v3.4x/Features/
By default, the Makefile enables the features most GNQS installations will want. If you want to change which features are available, then change the value of FEATURES.

>  FEATURES = -DTAMU


Which Version of UNIX You Are Using

The GNQS software can be compiled on a number of platforms. For each supported platform, there is an appropriate Makefile, which contains all the information specific to each version of UNIX.

Please uncomment the line which selects the Makefile for your UNIX machine.

> #include Makefile.linux
becomes

> include Makefile.linux
for example.
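
If you would rather not edit the file by hand, the same change can be made with sed. This is only a sketch: the function name `enable_platform' is my own, and it assumes the line in your Makefile looks exactly like the `#include Makefile.linux' example above. It keeps a copy of the unmodified file as `Makefile.orig'.

```shell
# enable_platform MAKEFILE PLATFORM
# Uncomment the "#include Makefile.PLATFORM" line in MAKEFILE.
# A backup of the original file is kept as MAKEFILE.orig.
enable_platform() {
    makefile=$1
    platform=$2
    cp "$makefile" "$makefile.orig" || return 1
    # Only the line for the named platform loses its leading '#'.
    sed "s/^#[[:space:]]*include[[:space:]][[:space:]]*Makefile\.$platform[[:space:]]*\$/include Makefile.$platform/" \
        "$makefile.orig" > "$makefile"
}

# Example (run from the `proto' directory):
#   enable_platform Makefile linux
```

Check the result with `grep include Makefile' before compiling; exactly one platform line should be uncommented.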


File Ownership

The GNQS software (and directories it uses) will be owned (by default) by the user `root', and the group `bin'. If you wish to change either of these, edit the values of `NQS_OWNER' and `NQS_GROUP' respectively.

> NQS_OWNER = root
> NQS_GROUP = bin


Save Your Changes

Once you have done the above, please save your changed Makefile to disk ready for compiling.
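
Before compiling, it can help to review the settings described in this section in one place. The sketch below simply greps the Makefile for the variables named above; the function name `show_nqs_settings' is hypothetical, and it assumes each setting sits on a line of its own in the form `NAME = value'.

```shell
# show_nqs_settings [MAKEFILE]
# Print the install settings and the selected platform line.
show_nqs_settings() {
    makefile=${1:-Makefile}
    # The variables discussed in this section.
    grep -E '^(NQS_ROOTDIR|NQS_ROOTPRIV|NQS_OWNER|NQS_GROUP|FEATURES)[[:space:]]*=' "$makefile"
    # The platform include line; warn if none has been uncommented yet.
    grep -E '^include[[:space:]]+Makefile\.' "$makefile" ||
        echo "warning: no platform include line is uncommented" >&2
}

# Example (run from the `proto' directory):
#   show_nqs_settings Makefile
```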


Compiling The Software

We are now ready to compile Generic NQS. To do so, please change to the `proto' directory (if you're not already there), and type `make'.

> cd Generic-NQS-3.40.0/proto
> make
This will compile Generic NQS v3.4x, ready for installing. Generic NQS is a large program, and can take over half an hour to compile (especially if your machine is heavily loaded).

While Generic NQS is compiling, you will probably see your C compiler producing warnings. The Generic NQS code is very old (much of it dates back to 1985) and while a number of people have spent time removing those warnings, we have not yet managed to remove them all.

If Generic NQS fails to compile, please contact the author with the following details :

  • Which version of UNIX you are compiling on?

    You can get this information normally by using `uname -a'.

  • Where there is a choice, which C compiler you are using?

    You can get this information normally by using `which cc'. If you are using the Free Software Foundation's GCC, please indicate which version of GCC you are using.

  • A log of the compilation.

    You can do this by typing `make >& logfile' (if you use csh/tcsh) or `make > logfile 2>&1' (if you use bash/sh/ksh). The log will contain the output of the whole build, including the errors for the file that fails to compile.

  • Anything else you think is important.

Please email all of the above to NQS-Support@mailbase.ac.uk.
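
The details listed above can be gathered into a single file with a small sh function. This is a sketch rather than part of Generic NQS: the name `nqs_build_report' and the report layout are my own, and it uses `command -v cc' in place of `which cc' because the former is built into POSIX shells.

```shell
# nqs_build_report OUTFILE [BUILD COMMAND...]
# Write the UNIX version, the C compiler location, and the full
# output of the build command (default: make) to OUTFILE.
nqs_build_report() {
    report=$1
    shift
    [ $# -gt 0 ] || set -- make
    {
        echo "== UNIX version (uname -a) =="
        uname -a
        echo "== C compiler =="
        command -v cc || echo "no cc found on PATH"
        echo "== Compilation log =="
        "$@" 2>&1
    } > "$report"
}

# Example (run from the `proto' directory):
#   nqs_build_report /tmp/nqs-report.txt
```

If you use GCC, please add the output of `gcc --version' to the report by hand.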


Installing


Creating The GNQS Working Files

Once GNQS has compiled, the next step is to build the working files GNQS uses. To do this, go to the `proto' directory, and type `make directories'.

>  cd Generic-NQS-3.40.0/proto
>  make directories


Installing The GNQS Software

Now you can install the Generic NQS software itself, by using the command `make install'.

>  cd Generic-NQS-3.40.0/proto
>  make install


Installing The GNQS Manual Pages

The Generic NQS manual pages can be installed by using the command `make install.man'.

>  cd Generic-NQS-3.40.0/proto
>  make install.man
You will then need to rebuild your manual page database. For some versions of UNIX, this is done with the command /usr/lib/whatis; for others, this is done using catman. Please refer to your vendor's documentation.


Adding The Service Entry

You next need to edit /etc/services (or modify your NIS/NIS+ database) to add an entry for GNQS :

>  nqs  607/tcp           # Network Queueing System
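
If you install GNQS on several machines, you may prefer to script this step. The following sketch appends the entry only when it is missing, and refuses to do so if another service already claims port 607/tcp. The function name `add_nqs_service' is hypothetical, and the sketch only handles a local /etc/services file, not NIS or NIS+ maps.

```shell
# add_nqs_service [SERVICES_FILE]
# Append the nqs entry to the services file unless it is already there.
add_nqs_service() {
    services=${1:-/etc/services}
    if grep -Eq '^nqs[[:space:]]' "$services"; then
        echo "nqs entry already present in $services"
    elif grep -Eq '^[^#]*[[:space:]]607/tcp' "$services"; then
        echo "port 607/tcp is already assigned in $services" >&2
        return 1
    else
        printf 'nqs\t607/tcp\t\t# Network Queueing System\n' >> "$services"
    fi
}

# Example (as root):
#   add_nqs_service /etc/services
```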


Creating A Machine ID

Your next step is to run the nmapmgr program, provided with Generic NQS, to allocate a machine id to your computer. The following commands will use the IP address of your computer to form a machine id :

>  nmapmgr
>  NMAPMGR: add host <hostname>
>  NMAPMGR: add alias <hostname>.<domainname> <hostname>
>  NMAPMGR: list
>  NMAPMGR: exit
Each machine running NQS requires a unique machine id. The machine id is used to track NQS requests as they move between different machines running NQS.

If you wish to use Generic NQS in conjunction with other versions of NQS, then you may have to assign machine id's explicitly - some versions of NQS only allow machine id's which are small in value. See the nmapmgr(1) manual page for information on how to manually assign machine id's.
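
For a cluster, typing the nmapmgr commands above on every machine soon becomes tedious. The sketch below prints them for a given full DNS name, so that you can check them before use. The function name `nmap_commands' is my own. Whether nmapmgr will read these commands from standard input, rather than interactively, is an assumption you should verify on your system.

```shell
# nmap_commands FQDN
# Print the nmapmgr commands for a host, given its full DNS name.
nmap_commands() {
    fqdn=$1
    # The short hostname is everything before the first dot.
    short=${fqdn%%.*}
    echo "add host $short"
    echo "add alias $fqdn $short"
    echo "list"
    echo "exit"
}

# Example:
#   nmap_commands myrddraal.shef.ac.uk
```

For `myrddraal.shef.ac.uk', the first two lines printed are `add host myrddraal' and `add alias myrddraal.shef.ac.uk myrddraal'.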


Start Up GNQS

To start GNQS for the very first time, use the command

>  qmgr start nqs
You should add this command to your startup scripts to restart the NQS daemon whenever the machine is restarted.
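
A startup script fragment might look like the sketch below. Where startup scripts live, and how they are written, differs between versions of UNIX, so treat this only as a starting point. The function name `start_nqs', the default path `/usr/local/bin/qmgr', and the variables QMGR and NQS_LOG are all assumptions; the log file location is the one used in the Quick Start Guide.

```shell
# start_nqs
# Start the NQS daemon, sending any startup messages to a log file.
start_nqs() {
    qmgr_bin=${QMGR:-/usr/local/bin/qmgr}   # assumed install location
    log=${NQS_LOG:-/usr/adm/nqslog}
    if [ ! -x "$qmgr_bin" ]; then
        echo "qmgr not found at $qmgr_bin" >&2
        return 1
    fi
    "$qmgr_bin" start nqs > "$log" 2>&1
}

# Example (in a startup script, as root):
#   start_nqs
```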


Installation Complete

At this point, Generic NQS is installed on your computer. Your next step is to configure Generic NQS to suit your setup. Chapter 5, `Configuring Generic NQS', below, has details on how to go about this, and also includes several example setups which may suit your needs.


Upgrading From An Older Version


Introduction

In this chapter, we look at how to upgrade your existing version of Monsanto-NQS/Generic NQS to the latest Generic NQS v3.4x.x.


Getting Started

You must first ask yourself the following questions :

  • Am I upgrading from Monsanto-NQS v3.35, Monsanto-NQS v3.36.x, or Generic NQS v3.4x.x?

  • Do I intend to install this new version of Generic NQS into exactly the same directories as my current version?

If you answered `no' to either of these, then please follow the instructions for `A Non-Staged Upgrade', below.

If you answered `yes' to both questions, then please follow the instructions for `A Staged Upgrade', below.

If you are upgrading from someone else's version of NQS (eg, CERN NQS), then you should consider this a new installation, and follow the instructions given in Chapter 3.


A Non-Staged Upgrade


Introduction

These are instructions on how to upgrade your NQS installation to use the software from this latest release. These instructions should be followed if any of the following are true :

  • You are upgrading from a version of Monsanto-NQS prior to v3.35.

  • You do not wish to install this latest release in the same place as your current NQS.


Step One : Compile The Software

Follow the instructions in Chapter 3, above, to compile this release of Generic NQS. Stop once you reach section 3.5, ``Installing''.


Step Two : Take A Copy Of Your Current NQS Configuration

If you want to keep your current NQS configuration, use the qmgr(1) command to create a `snap-file' :

>  qmgr 
>  #Mgr: snap file=<filename>
>  #Mgr: exit
This is a precaution, in case things go horribly wrong, and you need to rebuild your NQS configuration.


Step Three : Stop NQS

Make sure that there are no running NQS jobs, and shut down the running NQS daemon :

>  qmgr
>  #Mgr: shutdown
>  #Mgr: exit


Step Four : Install The New Version Of Generic NQS

Install the new version of Generic NQS by using the following commands :

>  cd Generic-NQS-3.40.0/proto
>  make install
>  make install.man


Step Five : Move The NQS Spool Area

This step only applies if :

  • You have compiled the new version of NQS to store its working files into a different directory.

Move the NQS spool area from its current place (typically, /usr/spool/nqs) to its new place.
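
A cautious way to do the move is sketched below. It refuses to run if the old spool area is missing or if something already exists at the new location. The function name `move_spool' and the example destination are hypothetical. Only run it after NQS has been shut down in Step Three, and check afterwards that file ownership and permissions are unchanged.

```shell
# move_spool OLD_DIR NEW_DIR
# Move the NQS spool area, without overwriting anything.
move_spool() {
    old=$1
    new=$2
    [ -d "$old" ] || { echo "no spool area at $old" >&2; return 1; }
    [ ! -e "$new" ] || { echo "$new already exists" >&2; return 1; }
    mv "$old" "$new"
}

# Example (as root; the destination is only an illustration):
#   move_spool /usr/spool/nqs /var/spool/nqs
```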


Step Six : Restart NQS

Finally, start up the new NQS daemon, by using qmgr :

>  qmgr start nqs
Your Non-Staged Upgrade should now be complete.


A Staged Upgrade


Introduction

These are instructions for how to upgrade your NQS installation to use the latest Generic NQS release. These instructions should be followed if all of the following are true :

  • You are upgrading from Monsanto-NQS v3.35 or later, or from Generic NQS v3.40.0 or later.

  • You intend to install this release of Generic NQS into the SAME directories as your current version of NQS.


Step One : Compile The Software

Follow the instructions in Chapter 3, above, to compile this release of Generic NQS. Stop when you get to section 3.5, ``Installing''.

Make sure that you edit the makefiles to ensure that the new NQS software will be installed into the same directories as your existing NQS software.


Step Two: Stage In The Software

To stage in the new NQS software, use the following commands :

>  cd Generic-NQS-3.40.0/proto
>  make stage
Your upgrade is now complete. Your existing NQS installation will automatically replace itself with the new software when it can.


Configuring Generic NQS


Introduction

This chapter explains how to configure Generic NQS once it has been installed. To do this, we will work through a collection of sample configurations which have been contributed by various Generic NQS users. Feel free to use one of these configurations for your own computer.

Comments, and contributed configurations, are always welcome.

I've broken these configurations up into two groups, which represent the two types of computer system NQS is typically used in.


Compute Servers


Introduction

One of the most popular uses of NQS is to impose some kind of order on the users of central compute servers. These are typically powerful UNIX machines (eg, SGI Challenge XL), possibly acting as servers for a number of clustered workstations. They have many large or CPU-intensive processes running concurrently.

NQS installations on this type of machine are typically stand-alone, and do not dispatch jobs out to lesser machines, such as workstations. Sometimes, workstations may forward jobs to the compute server.

The purpose of an NQS installation on such a machine is to prevent the over-allocation of system resources, so that a healthy throughput is maintained. The main system resources which are always in short supply are CPU time, and memory.


Sample Configuration - Controlling CPU Usage

This configuration is based upon the one used here at the University of Sheffield on our SGI Challenge XL computer. This configuration will probably be sufficient for most environments, because, if local experience is anything to go by, most users soon get a feel for how long their work will take to run, but they really haven't a clue as to how much of other resources (such as memory) it will make use of during that time.

Create four batch queues :

>  qmgr create batch_queue short
>  qmgr create batch_queue medium
>  qmgr create batch_queue long
>  qmgr create batch_queue extra_long
Next, for each queue, specify a maximum CPU time, the limit getting progressively larger for each queue.

>  qmgr set per_process cpu_limit = ( 2:0:0 ) short
>  qmgr set per_process cpu_limit = ( 8:0:0 ) medium
>  qmgr set per_process cpu_limit = ( 24:0:0 ) long
>  qmgr set per_process cpu_limit = ( 72:0:0 ) extra_long
Here, we have limits of 2 hours, 8 hours, 24 hours and 72 hours respectively for the four queues.

We now need to specify priorities and runlimits for each of these queues, to ensure a good working balance between the four queues. The runlimits depend entirely on what your machine can handle - those given here are for a Challenge XL with 12 CPUs and 512MB of real RAM. I recommend that you experiment with the runlimits in order to ensure that the running NQS requests don't put a strain on your memory resources.

>  qmgr set priority = 40 short
>  qmgr set priority = 30 medium
>  qmgr set priority = 20 long
>  qmgr set priority = 10 extra_long
>  qmgr set run_limit = 5 short
>  qmgr set run_limit = 4 medium
>  qmgr set run_limit = 2 long
>  qmgr set run_limit = 1 extra_long
Next, you need to decide, for each queue, how many requests each user is allowed to have actually running at the same time. If you compiled NQS with dynamic scheduling (enabled by default), then users who submit more jobs than they are allowed to run simultaneously will find that their jobs will have a lower priority, and therefore will be lower down in the queue.

>  qmgr set user_limit = 2 short
>  qmgr set user_limit = 1 medium
>  qmgr set user_limit = 1 long
>  qmgr set user_limit = 1 extra_long
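
When you come to experiment with these values, it is easier to keep them in one table than to retype each command. The sketch below prints the qmgr commands for the four queues from such a table, using the same command forms and values as the samples above; it does not run them. The function name `queue_commands' is my own.

```shell
# queue_commands
# Print the qmgr commands for the four-queue configuration.
queue_commands() {
    # Columns: queue name, CPU limit, priority, run limit, user limit.
    while read -r name cpu prio run user; do
        echo "qmgr create batch_queue $name"
        echo "qmgr set per_process cpu_limit = ( $cpu ) $name"
        echo "qmgr set priority = $prio $name"
        echo "qmgr set run_limit = $run $name"
        echo "qmgr set user_limit = $user $name"
    done <<EOF
short       2:0:0   40  5  2
medium      8:0:0   30  4  1
long        24:0:0  20  2  1
extra_long  72:0:0  10  1  1
EOF
}

# Example: review the commands first, then run them through sh.
#   queue_commands
#   queue_commands | sh
```
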
Finally, you need to decide when these queues may run, and then use root's crontab to start and stop the NQS queues as appropriate. In this configuration, the only queue which would not run all the time would be the `extra_long' queue; this queue would be started at 5pm on Fridays, and stopped sometime before 9am on Monday morning.
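
As an illustration, the entries in root's crontab might look like the fragment below. Several details are assumptions: the path `/usr/local/bin/qmgr' (cron usually runs with a very short PATH, so give the full path for your installation), the 8am stop time, and the use of `stop queue' as the counterpart of the `start queue' command used in the Quick Start Guide. Please check the qmgr(1) manual page before relying on it.

```crontab
# minute hour day-of-month month day-of-week  command
# Start the weekend queue at 5pm every Friday.
0 17 * * 5  /usr/local/bin/qmgr start queue extra_long
# Stop it again at 8am every Monday (any time before 9am would do).
0 8 * * 1   /usr/local/bin/qmgr stop queue extra_long
```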

I'm sure that there are ways in which this configuration could be improved; feel free to discuss this on the NQS-Support mailing list.


Clustered Workstations


Introduction

In recent times, there has been much interest in finding scheduling software which can make use of UNIX workstations sat on people's desks. These workstations are typically idle for long periods of time overnight, which represents a significant amount of wasted CPU time.

We will concern ourselves only with `clustered' workstations. These are workstations which typically mount software and/or user filestore via NFS (or equivalent) from a local server. This has the effect of ensuring that all the workstations in a cluster are the same architecture, run the same operating system, and have identical filestore layouts. When the local server fails, each workstation is unusable, because of the loss of services involved.


Sample Configuration - Clustered Workstations

This configuration demonstrates how to use a combination of pipe and batch queues to set up a load-balancing NQS queue for a cluster of workstations. You can then create more load-balancing queues, using the same principles, and vary the limits per load-balancing queue in order to provide a balanced service.

On each workstation which will run NQS requests, do the following :

>  qmgr create batch misc-dest pipeonly run_limit = 1 user_limit = 1
>    nice_level = 10
This creates a queue, misc-dest, which will run one NQS request at a time, and which runs all requests at a nice level of `10', just in case a user is sat at the console trying to work while the job is running.

Then, on each workstation which will run NQS requests, do :

>  qmgr create pipe misc-in pipeonly run_limit = 5 user_limit = 1
>    destination = misc-dest
>  qmgr set lb_in misc-in
This creates a pipe queue, misc-in, which will store up to five requests at a time. It will forward those requests to the queue misc-dest, and will only accept requests if there are fewer than five requests in the queue.

Finally, on each workstation which will run NQS requests, do :

>  qmgr set scheduler server-name
where `server-name' is the DNS name of the local server which the workstations mount filestore from. NOTE that you must set all the workstations in a cluster to point to the SAME server.

Now, on the local server, do :

>  qmgr create pipe misc-sched run_limit = 40 user_limit = 5
>    destination = misc-in@workstation1, misc-in@workstation2 ...
>  qmgr set lb_out misc-sched
where `workstation1', `workstation2' and so on are all of your workstations which will run NQS requests. NQS will only send new requests to your workstations when they have room for them in their `misc-in' queues, and based on the load information from each machine (a machine with a low load is favoured over a machine with a high load).
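
With more than a handful of workstations, the destination list becomes long and easy to mistype. The sketch below builds it from a list of hostnames, in the same `queue@host, queue@host' form shown above. The function name `lb_destinations' is hypothetical.

```shell
# lb_destinations QUEUE HOST...
# Print "QUEUE@HOST1, QUEUE@HOST2, ..." for the given hosts.
lb_destinations() {
    queue=$1
    shift
    list=
    for host in "$@"; do
        # Add ", " between entries, but not before the first one.
        list="${list:+$list, }${queue}@${host}"
    done
    echo "$list"
}

# Example:
#   lb_destinations misc-in workstation1 workstation2
```

For the two hosts in the example, this prints `misc-in@workstation1, misc-in@workstation2', which you can paste after `destination =' in the command above.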

Finally, on each of your workstations on which users can submit NQS requests, do :

>  qmgr create pipe misc destination = misc-sched@server
>  qmgr set lb_in misc
So, when a user submits a request locally to the queue `misc', it is sent to the queue `misc-sched' on the local server, which then sends it to the least loaded workstation in the cluster.


