Changes To Generic NQS (GNQS)
Academic Computing Services, University of Sheffield
Stuart Herbert (S.Herbert@sheffield.ac.uk)
Document copyright ©. All rights reserved.
Abstract
The University of Sheffield is supplying, and supporting, Generic
NQS to UK Higher Educational Sites as part of the New Technologies
Initiative of JISC, under grant NTI/48.2. This document contains
a summary of changes for each new release of NQS.
Introduction
This is a summary of changes to Generic NQS as released by the
University of Sheffield.
We are most grateful for the contributions made by other individuals
and organisations.
Version 3.40.0
About This Release
Purpose
This is the full public release of Generic NQS v3.40.0. Generic NQS
was previously known as Monsanto-NQS.
Supported Platforms
This product has been compiled, and used, on the following
platforms prior to full release :
> ----------------------------------------------------------------
> Platform      | Release Used | Compiler Warnings? | Tested?
> ----------------------------------------------------------------
> AIX v3        | v3.40.0 #3   | Unknown            | Yes
> AIX v4        | v3.40.0 #3   | Unknown            | Yes
> Fujitsu       | v3.40.0 #2   | Very few           | Yes
> HP-UX 8.x     | v3.40.0 #3   | Unknown            | Yes
> IRIX 5.3      | v3.40.0 #3   | Unknown            | Yes
> IRIX 6.0 (32) | v3.40.0 #3   | Unknown            | Yes
> Linux/ELF     | v3.40.0 #3   | No                 | Yes
> Solaris 2.3   | v3.40.0 #3   | No                 | Yes
> Solaris 2.4   | v3.40.0 #3   | No                 | Yes
> SunOS 4.1.3   | v3.40.0 #3   | Very few           | Doesn't work
> ULTRIX 4.3a   | v3.40.0 #1   | No                 | No
> UNICOS v8     | v3.40.0 #2   | Unknown            | Yes
> ----------------------------------------------------------------
The following platforms are also supported, but we have been unable
to test this product on them prior to full release.
> ----------------------------------------------------------------
> Platform    | Last Release Known To Work
> ----------------------------------------------------------------
> HPUX 8.x    | v3.37.1
> HPUX 9.x    | v3.37.1
> IRIX 4      | v3.36.0
> IRIX 5.2    | v3.37.1
> IRIX 6.1    | v3.37.1
> NCR         | v3.36.6
> OSF/1 v3.2  | v3.37.1
> SunOS v4.x  | v3.37.1
> ----------------------------------------------------------------
If you successfully compile this release on other platforms, please
send me details, so that I can expand this section further.
Testing
Prior to release, there were three public pre-releases of Generic
NQS v3.40.0. These releases were made to allow the NQS user
community time to test Generic NQS, and to report any problems with
the changes which have been made.
It has not been possible to test all of the following changes on all
of the supported platforms. Each change includes a section,
`Status', which details what testing has been carried out.
Changes List
The following sections list the changes between Monsanto-NQS v3.37.1
and Generic NQS v3.40.0, in the following order :
- New Platforms Supported
- Compilation Fixes For Supported Platforms
- New Features
- Bug Fixes
- Anything Else
New Name
Monsanto-NQS was originally maintained by John Roman, of the
Monsanto Company. Since October, 1994, it has been maintained by
The University of Sheffield. By agreement with John, there will be
no more releases under the Monsanto-NQS name.
AIX 4.1 Support Added
Description
Generic NQS has been ported to v4.1 of IBM's AIX operating system.
Status
Pre-releases #2 and #3 were tested.
Platforms Affected
> [ ] AIX 3 [x] AIX 4
> [ ] FUJITSU [ ] HPUX 8
> [ ] HPUX 9 [ ] HPUX 10
> [ ] IRIX 4 [ ] IRIX 5
> [ ] IRIX 6 [ ] LINUX
> [ ] NCR [ ] OSF/1
> [ ] SOLARIS 2 [ ] SUNOS 4
> [ ] ULTRIX [ ] UNICOS
Code Contributed By
Mark Loveridge (markl@gatwick.geco-prakla.slb.com)
Fujitsu VP2200/20 UXP/M Support Added
Description
Generic NQS now compiles and functions on the Fujitsu VP2200/20
running UXP/M V10L20, compiled with /usr/ccs/bin/cc (NOT /usr/ucb/cc).
Platforms Affected
> [ ] AIX 3 [ ] AIX 4
> [x] FUJITSU [ ] HPUX 8
> [ ] HPUX 9 [ ] HPUX 10
> [ ] IRIX 4 [ ] IRIX 5
> [ ] IRIX 6 [ ] LINUX
> [ ] NCR [ ] OSF/1
> [ ] SOLARIS 2 [ ] SUNOS 4
> [ ] ULTRIX [ ] UNICOS
Status
Tested.
Code Contributed By
Mark Loveridge (markl@gatwick.geco-prakla.slb.com)
Experimental HP-UX 10 Support Added
Description
After discussions with users who have access to HP-UX v10, it appears
that HP-UX 10 is System V, release 4 compatible, although I have not
been able to confirm this myself.
Based on this assumption, I've added a Makefile for HP-UX 10 which
I expect will compile Generic NQS on HP-UX 10. Because this port is
based entirely on this assumption, this support is experimental, and
feedback from HP-UX users would be most appreciated.
Status
Completely untested.
Platforms Affected
> [ ] AIX 3 [ ] AIX 4
> [ ] FUJITSU [ ] HPUX 8
> [ ] HPUX 9 [x] HPUX 10
> [ ] IRIX 4 [ ] IRIX 5
> [ ] IRIX 6 [ ] LINUX
> [ ] NCR [ ] OSF/1
> [ ] SOLARIS 2 [ ] SUNOS 4
> [ ] ULTRIX [ ] UNICOS
Code Contributed By
Stu
UNICOS 8 Support Added
Description
Generic NQS has been ported to v8 of Cray's UNICOS operating system.
Status
Pre-release #2, plus the UNICOS support patches, have been tested on
UNICOS v8.
Platforms Affected
> [ ] AIX 3 [ ] AIX 4
> [ ] FUJITSU [ ] HPUX 8
> [ ] HPUX 9 [ ] HPUX 10
> [ ] IRIX 4 [ ] IRIX 5
> [ ] IRIX 6 [ ] LINUX
> [ ] NCR [ ] OSF/1
> [ ] SOLARIS 2 [ ] SUNOS 4
> [ ] ULTRIX [x] UNICOS
Code Contributed By
Dave Safford (saff@tamu.edu)
Compilation Fixes For AIX 3
Description
Generic NQS now compiles without warnings on AIX 3.2.5.
Status
Tested.
Platforms Affected
> [x] AIX 3 [ ] AIX 4
> [ ] FUJITSU [ ] HPUX 8
> [ ] HPUX 9 [ ] HPUX 10
> [ ] IRIX 4 [ ] IRIX 5
> [ ] IRIX 6 [ ] LINUX
> [ ] NCR [ ] OSF/1
> [ ] SOLARIS 2 [ ] SUNOS 4
> [ ] ULTRIX [ ] UNICOS
Code Contributed By
Mark Loveridge
Compilation Fixes For Fujitsu
Description
The Fujitsu system libraries declare a global variable `Logfile',
which conflicts with an internal NQS variable of the same name. The
NQS variable `Logfile' has been renamed `NetLogfile'.
Status
Tested.
Platforms Affected
> [ ] AIX 3 [ ] AIX 4
> [x] FUJITSU [ ] HPUX 8
> [ ] HPUX 9 [ ] HPUX 10
> [ ] IRIX 4 [ ] IRIX 5
> [ ] IRIX 6 [ ] LINUX
> [ ] NCR [ ] OSF/1
> [ ] SOLARIS 2 [ ] SUNOS 4
> [ ] ULTRIX [ ] UNICOS
Code Contributed By
Mark Loveridge
Compilation Fixes For ULTRIX
Description
Substantial work has been done to allow Generic NQS to work on ULTRIX
4.3A. I have independent reports that this works well.
Platforms Affected
> [ ] AIX 3 [ ] AIX 4
> [ ] FUJITSU [ ] HPUX 8
> [ ] HPUX 9 [ ] HPUX 10
> [ ] IRIX 4 [ ] IRIX 5
> [ ] IRIX 6 [ ] LINUX
> [ ] NCR [ ] OSF/1
> [ ] SOLARIS 2 [ ] SUNOS 4
> [x] ULTRIX [ ] UNICOS
Status
Tested under ULTRIX. No further work apparently required.
Code Contributed By
David Billinghurst (billingd@crc.cra.com.au)
`Cost' Style Accounting Hook Added
Description
The empty function, nqs_checkbal(), can be modified to determine
whether a user has used up their budget, and if so, their jobs are
given a priority of zero.
Sites requiring `Larmouth'-type scheduling in particular may find
this to be of use in conjunction with the dynamic scheduling
support.
`Cost' Accounting is a compile-time optional extra.
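As a rough illustration of the kind of check such a hook might perform, here is a minimal sketch. The real signature of nqs_checkbal() is not given in this document, so the function example_checkbal(), the structure account_balance, and its fields are all hypothetical names chosen for this example only.

```c
#include <stddef.h>

/* Hypothetical per-user budget record; this is not an NQS data structure. */
struct account_balance {
    long budget;   /* units the user is allowed to spend */
    long used;     /* units the user has already consumed */
};

/*
 * Hypothetical hook in the spirit of nqs_checkbal(): returns the
 * priority a request should be given. A user who has used up their
 * budget gets priority zero; otherwise the requested priority is kept.
 */
int example_checkbal(const struct account_balance *acct, int requested_priority)
{
    if (acct == NULL)
        return requested_priority;   /* no accounting data: leave unchanged */
    if (acct->used >= acct->budget)
        return 0;                    /* budget exhausted */
    return requested_priority;
}
```

A site would replace the budget lookup with whatever accounting data it keeps locally.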
Status
Tested.
Platforms Affected
All.
Code Contributed By
Dave Safford (saff@tamu.edu)
Dynamic Scheduling Added
Description
The scheduler has been updated, so that queues are re-sorted just
before NQS tries to spawn a new request (i.e., when a request is
queued, or a running job exits).
The supplied default comparison routine, bsc_compare() (source code
in src/nqs_bsc.c) lowers the priorities of jobs if a user has more
jobs queued than the current user_limit for the given queue. This
routine can be tailored to perform more detailed comparisons.
Support for dynamic scheduling is a compile-time optional extra.
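The idea of a tailorable comparison routine can be sketched as follows. This is not the code in src/nqs_bsc.c: the structure example_request, the variable example_user_limit, and the rule for how far a priority is lowered are assumptions made purely for illustration.

```c
#include <stdlib.h>

/* Hypothetical view of a queued request; not the NQS request structure. */
struct example_request {
    int priority;       /* priority assigned to the request */
    int owner_queued;   /* jobs its owner has queued in this queue */
};

/* Stand-in for the queue's user_limit setting. */
static int example_user_limit = 2;

/* Lower the priority of requests whose owner is over the limit. */
static int effective_priority(const struct example_request *r)
{
    if (r->owner_queued > example_user_limit)
        return r->priority - (r->owner_queued - example_user_limit);
    return r->priority;
}

/* qsort() comparator: highest effective priority sorts first. */
int example_compare(const void *a, const void *b)
{
    int pa = effective_priority((const struct example_request *)a);
    int pb = effective_priority((const struct example_request *)b);
    return (pb > pa) - (pb < pa);
}

/* Re-sort a queue just before trying to spawn a request. */
void example_resort(struct example_request *queue, size_t n)
{
    qsort(queue, n, sizeof queue[0], example_compare);
}
```

With a limit of 2, a request of priority 10 whose owner has 5 jobs queued would sort behind requests of priority 9 and 8 from users within the limit.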
Status
Tested.
Platforms Affected
All.
Code Contributed By
Dave Safford (saff@tamu.edu)
New Documentation
Description
The documentation files `INSTALL', `README', and `PROBLEMS' have all
been substantially re-written.
Status
Done.
Contributed By
Stu
NIS Netgroup Support
Description
On platforms which use NIS, the file /etc/hosts.allow may contain
references to NIS netgroups. Generic NQS can now parse these
netgroup references correctly.
Support for NIS netgroups is a compile-time optional extra.
Status
Tested on many previous versions of Monsanto NQS.
Platforms Affected
All.
Code Contributed By
Thomas Richter (richter@chemie.fu-berlin.de)
Optional Features Now Available
Description
Generic NQS now includes a number of features which can be
considered optional - the file `FEATURES' provides full
documentation on these (this document is also available from our
WWW site).
The file `proto/Makefile' now provides a variable `FEATURES', to
which you can add switches to ensure that Generic NQS is compiled
with the optional features you require. By default, Generic NQS
currently ships with the `TAMU' featureset enabled, but this is easily
changed.
Status
Tested.
Platforms Affected
All.
Code Contributed By
Stu.
Portability Fixes
Description
Adding support for a new platform to NQS is often a tedious task of
going around, editing the many source files to add `-Dplatform' to
ensure that the correct code is compiled in.
Generic NQS now allows those porting to new platforms to indicate
what type of platform they are porting to. The existing NQS code
has been broken up into three sets :
- Code which is POSIX.1 compliant.
- Code which is System V, r4 or later, compliant.
- Code which is BSD 4.3 or later compliant.
Most modern platforms are POSIX.1 compliant, and also SYSVr4
compliant, and the changes make porting to such platforms
significantly easier than it was, although not yet perfect.
For platforms which still `do their own thing' in places, support for
individual platforms can still be added to the source tree, as
before, but this now only needs to be done for exceptions, rather
than for everything.
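The general approach of selecting code by platform class, rather than by individual platform, might look something like the sketch below. The macro names EXAMPLE_SYSV_R4 and EXAMPLE_BSD_43 are hypothetical; the names actually used in the NQS source tree are not given in this document.

```c
/* Hypothetical identifiers for the three sets of code. */
enum example_class {
    EXAMPLE_CLASS_POSIX,
    EXAMPLE_CLASS_SYSV,
    EXAMPLE_CLASS_BSD
};

/*
 * Report which set of code was compiled in. A port would define one
 * class macro once (for example in its Makefile) instead of adding a
 * test for its own platform name to every affected source file.
 */
enum example_class example_platform_class(void)
{
#if defined(EXAMPLE_SYSV_R4)
    return EXAMPLE_CLASS_SYSV;
#elif defined(EXAMPLE_BSD_43)
    return EXAMPLE_CLASS_BSD;
#else
    return EXAMPLE_CLASS_POSIX;   /* fall back to the POSIX.1 code */
#endif
}
```

Only genuine exceptions would then need a test on the platform's own name.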
While working on this modification, a number of inconsistencies came
to light, affecting support for ULTRIX, UNICOS, and OSF/1. I've
asked Dave about what's right for UNICOS, but for OSF/1, I've had to
guess (not having access to OSF/1 locally) and so OSF/1 users might
well find that this release is a complete no-go for them.
With a bit of luck, this will actually lead to the support for
existing platforms becoming more accurate, and hence more robust,
once the initial problems are ironed out. In the meantime, I've
probably been over-generous in marking code as POSIX, and so the
platforms which are neither SYSV nor BSD may well have problems.
Status
Tested on AIX 3 & 4, IRIX 5 & 6, Linux/ELF, SunOS 4, Solaris 2.
SunOS 4 support (generic BSD) is reportedly broken, and will be
fixed in the next release. OSF/1 support has NOT been tested, and
is of concern - the OSF/1 port was substantially re-written.
Platforms Affected
All.
Code Contributed By
Stu.
Processor Set Functionality Now Optional Extra
Description
The processor set support, currently for IRIX 5 & 6, was originally
added to Monsanto NQS v3.36.4 by Dave Safford.
This support is now a compile-time optional extra. It is not
enabled by default.
Status
Tested.
Platforms Affected
> [ ] AIX 3 [ ] AIX 4
> [ ] FUJITSU [ ] HPUX 8
> [ ] HPUX 9 [ ] HPUX 10
> [ ] IRIX 4 [x] IRIX 5
> [x] IRIX 6 [ ] LINUX
> [ ] NCR [ ] OSF/1
> [ ] SOLARIS 2 [ ] SUNOS 4
> [ ] ULTRIX [ ] UNICOS
Code Contributed By
Stu
Temporary Directory Functionality Now Optional Extra
Description
The TMPDIR temporary directory support was originally added to
Monsanto NQS v3.36.4 by Dave Safford.
This support is now a compile-time optional extra.
Platforms Affected
All.
Code Contributed By
Stu
ANSI C Cleanups
Description
More work has been done to ensure that NQS compiles cleanly under
ANSI C. If your native C compiler has trouble with these changes,
please try to compile NQS with the Free Software Foundation's GCC
before reporting any problems.
From now on, please report any compile-time warnings so that they
can be eliminated.
Platforms Affected
All.
Status
Tested on Linux/ELF and Solaris 2.3 by Stu.
Code Contributed By
David Billinghurst, Mark Loveridge, and Stu.
More ANSI C Cleanups
Description
Pre-release #2 produced compiler warnings when compiled as a native
64-bit binary for IRIX 6. Many of these warnings were related to
unused variables, and unreachable code.
Warnings generated at compile time should have been eliminated - I'd
like to know if I've missed any. Warnings generated at link time
remain - the SGI compiler is somewhat noisy in this area.
Please note that, even with these fixes, the 64-bit version of NQS
reportedly still does not compile on IRIX 6 at this time. It is
*POSSIBLE* that this is a problem with older versions of the IRIX
6.0x compiler.
In addition, Mark has done numerous cleanups for SunOS 4.1.4_U1, and
general cleaning up of code I'd previously added myself.
Platforms Affected
All.
Status
Not tested - I don't have access to IRIX 6. Mark has tested his
patches by compiling on Solaris, SunOS, AIX 3 & 4, IRIX 5.3, and
Fujitsu UXP/M.
Code Contributed By
Stu, with thanks to Dr Jaume Farras, University of Barcelona.
Mark Loveridge
More ANSI C Fixes
Description
The declaration of function prototypes in the NQS source code was
previously of the form :
> #ifndef __CEXTRACT__
> #if __STDC__
>
> ANSI C Prototypes
>
> #else
>
> K&R C Prototypes
>
> #endif
> #endif
This has been reduced to just ANSI C prototypes. This has the
following benefits :
- Removes a number of compile-time warnings on Solaris 2
- Enables compilation on Fujitsu machines
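For readers unfamiliar with the difference between the two styles, a small illustration follows. The function example_add() is hypothetical and does not appear in the NQS source.

```c
/*
 * K&R-style declaration (shown only in this comment): the parameter
 * list is empty, so the compiler cannot check the arguments.
 *
 *     int example_add();
 */

/* ANSI C prototype: argument count and types are checked at compile time. */
int example_add(int a, int b);

int example_add(int a, int b)
{
    return a + b;
}
```

With only the ANSI form present, a call such as example_add(1) is rejected by the compiler rather than failing at run time.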
Platforms Affected
All.
Status
Tested.
Code Contributed By
Stu
Email Messages Corrected
Description
If NQS_SPOOL was set to anything but the default value, email to
users once the output was spooled would contain an incorrect
directory name. Fixed.
Platforms Affected
All.
Status
Tested.
Code Contributed By
Mark Loveridge, and Stu.
Jobs No Longer Get Stuck In Routing State
Description
When a job is being moved from one pipe queue to another pipe queue,
the job is said to be in the `routing state'. While it is in this
state, the job cannot be deleted.
I consider this to be one of the three most important outstanding
problems reported for Generic NQS (the other two being unanticipated
transaction failures, fixed by David Billinghurst, and problems with
reading the nmap database, so far unfixed) because the only way to
recover from this is to delete the NQS spool space and reinstall.
This problem occurs when, for some reason, pipeclient fails to
establish communication with a remote host. If a pipe queue has
several destinations, and at least one of them successfully accepts
the transfer, there is no problem, but when all the destination(s)
fail, the request remains in the routing state, where it cannot be
deleted at all.
The problem was caused by pipeclient failing to tell the NQS daemon
to reschedule the request correctly. Now, once pipeclient has
determined that all remote hosts are unavailable, the request is
requeued in the waiting state (where it can be deleted, etc., as
desired).
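The corrected behaviour can be sketched roughly as below. This is not the pipeclient source: the state names, the callback type example_deliver_fn, and the two stand-in callbacks are hypothetical, and the sketch leaves out all communication with the NQS daemon.

```c
#include <stddef.h>

/* Hypothetical request states for this sketch. */
enum example_state { EXAMPLE_ROUTING, EXAMPLE_DELIVERED, EXAMPLE_WAITING };

/* Hypothetical callback: returns nonzero if the destination accepted. */
typedef int (*example_deliver_fn)(const char *destination);

/*
 * Try each destination in turn. If one accepts, the request is
 * delivered. If every destination fails, the request is put back in
 * the waiting state instead of being left in the routing state.
 */
enum example_state example_route(const char *const *dests, size_t n,
                                 example_deliver_fn deliver)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (deliver(dests[i]))
            return EXAMPLE_DELIVERED;
    }
    return EXAMPLE_WAITING;
}

/* Stand-in callback: every remote host is unavailable. */
int example_refuse_all(const char *destination)
{
    (void)destination;
    return 0;
}

/* Stand-in callback: only destinations starting with 'b' accept. */
int example_accept_b(const char *destination)
{
    return destination[0] == 'b';
}
```

The important point is the final return: there is no path on which the request stays in the routing state once all destinations have been tried.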
Status
Second implementation of solution. Tested on Solaris 2.3 and
Linux/ELF - no apparent side effects observed to date.
Platforms Affected
All.
Code Contributed By
Stu.
More Environment Variable Fixes
Description
When NQS builds an environment for processes, it now correctly
determines the length of all the strings in the environment.
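The details of the original fault are not recorded here. As an illustration only, one common way to size an environment correctly is to count each string plus its terminating NUL; the function example_env_bytes() below is a hypothetical sketch, not the NQS code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Total bytes needed to copy a NULL-terminated environment vector:
 * the length of each string plus one byte for its terminating NUL.
 */
size_t example_env_bytes(char *const envp[])
{
    size_t total = 0;
    size_t i;

    for (i = 0; envp[i] != NULL; i++)
        total += strlen(envp[i]) + 1;
    return total;
}
```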
Platforms Affected
All.
Status
Tested.
Code Contributed By
Mark Loveridge
NMAP Database Fixes
Description
Under pre-releases of Generic NQS v3.40.0, the `list' command of
nmapmgr failed to list any aliases defined for a machine. This was
caused by a previous fix to remove compile-time warnings, and has
been fixed.
Status
Tested.
Platforms Affected
All.
Code Contributed By
Stu.
qcat -u Fixed
Description
whompw was not initialised to NULL, and this prevented qcat -u
working on all platforms except Fujitsu. Fixed.
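The class of bug can be illustrated with the sketch below. Only the variable name whompw comes from the description above; the function example_lookup_user() and its logic are assumptions, not the qcat source.

```c
#include <stddef.h>
#include <pwd.h>

/* Returns 1 if a user was named and found, 0 otherwise. */
int example_lookup_user(const char *username)
{
    /* Explicit NULL: without it, the test below reads an
       indeterminate pointer when no user name was supplied. */
    struct passwd *whompw = NULL;

    if (username != NULL)
        whompw = getpwnam(username);

    return whompw != NULL;
}
```

An uninitialised automatic variable may happen to hold zero on one platform and garbage on another, which would be consistent with the fault showing up on some platforms only.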
Platforms Affected
All except Fujitsu.
Status
Tested.
Code Contributed By
Mark Loveridge
Quota Support Improved
Description
The code which enforces `quotas' (kernel-supported resource limits
in UNIX parlance) has been reworked. For all SYSV-like platforms, a
resource limit of `unlimited' should now be correctly enforced,
while for BSD-like platforms, resource limits greater than
`unlimited' should now be correctly set to unlimited.
The `quotas' code is now the same for all platforms.
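The BSD-like case amounts to clamping a requested limit to the platform's `unlimited' value. A minimal sketch follows, in which example_clamp_limit() is a hypothetical helper and the parameter unlimited stands in for a platform constant such as RLIM_INFINITY.

```c
/*
 * Clamp a requested resource limit to the platform's "unlimited"
 * value, so that anything larger is treated as unlimited.
 */
unsigned long example_clamp_limit(unsigned long requested,
                                  unsigned long unlimited)
{
    return requested > unlimited ? unlimited : requested;
}
```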
This change was introduced at this stage to fix problems encountered
on the IRIX 6 platform.
Status
Tested on Solaris 2.3 and Linux/ELF - a subset of the changes have
also been used at Exeter, on IRIX 6.
Platforms Affected
All.
Code Contributed By
Stu
Further Quota Fixes
Description
Code has been streamlined in Generic NQS, to ensure that the
implementation of BSD-like kernel-supported resources is correct.
This code may not work for platforms which do not support the ANSI C
CLK_TCK symbol.
Status
Tested.
Platforms Affected
All.
Code Contributed By
Stu, based on discussions with Phil Chambers
(P.A.Chambers@exeter.ac.uk).
Removal of BSDish Code
Description
The amount of code which uses the BSD API has been reduced. Solaris 2
supplies a BSD-compatibility library which is notorious for being
buggy, while Linux/ELF doesn't seem to provide such a library at
all ...
NOTE that these changes have not been extensively tested, and may
result in broken functionality.
Platforms Affected
> [ ] AIX 3 [ ] AIX 4
> [ ] FUJITSU [ ] HPUX 8
> [ ] HPUX 9 [ ] HPUX 10
> [ ] IRIX 4 [ ] IRIX 5
> [ ] IRIX 6 [x] LINUX
> [ ] NCR [ ] OSF/1
> [x] SOLARIS 2 [ ] SUNOS 4
> [ ] ULTRIX [ ] UNICOS
Status
Tested under Solaris 2.3 and Linux/ELF.
Code Contributed By
Stu
Scheduling Fix On Startup
Description
When restarting NQS, pending jobs were not started up in a timely
fashion. This has been fixed.
Platforms Affected
All.
Status
Tested on Solaris 2.3 and Linux/ELF. No apparent side effects
observed.
Code Contributed By
Rob Creecy (rcreecy@census.gov)
Unanticipated Transaction Failure Fix
Description
Generic NQS currently uses inodes themselves to store information
about transactions - this is the way it was originally done back in
COSMIC NQS.
Generic NQS assumes that inodes are 32-bit unsigned integers - this
is not the case for all versions of UNIX (POSIX.1 does not define
the size of an inode). On systems which use 32-bit signed integers
for the inode, NQS generates `unanticipated transaction failures',
because negative values are not permitted.
This affects the timestamp for a transaction, which can no longer
fit into the inode correctly. The solution is to reduce the
resolution of the timestamp (so that it is accurate only to within
four seconds) so that it will fit into the inode correctly.
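How NQS actually lays out the bits is not described here. As a sketch of the general technique only, dropping the low two bits of a count of seconds gives a value in four-second units which needs two fewer bits, and so stays non-negative in a signed 32-bit integer. The functions example_pack_time() and example_unpack_time() are hypothetical.

```c
#include <stdint.h>

/* Drop the low two bits: the result counts four-second units. */
int32_t example_pack_time(uint32_t seconds)
{
    return (int32_t)(seconds >> 2);
}

/* Recover the time, rounded down to a multiple of four seconds. */
uint32_t example_unpack_time(int32_t packed)
{
    return (uint32_t)packed << 2;
}
```

The price is that two timestamps less than four seconds apart may become indistinguishable.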
Platforms Affected
All.
Status
Tested on Solaris 2.3, Linux/ELF, and ULTRIX. No apparent side
effects observed.
This change may make upgrading from Monsanto-NQS difficult under
certain circumstances.
Change Contributed By
David Billinghurst (billingd@crc.cra.com.au)