July 15, 2014
This document describes common techniques for debugging problems when running the Software Testing Automation Framework (STAF).
To find more detailed information on using STAF, go to the main STAF web page.
1. General Debugging Information | |
1.1. | STAFProc console output |
When STAFProc starts on a machine, the initial output will contain the following information:
  Machine          : staf3a.austin.ibm.com
  Machine nickname : staf3a.austin.ibm.com
  Startup time     : 20080626-08:35:57
  STAFProc version 3.3.0 initialized
The first line, Machine, indicates the TCP/IP hostname (or the IP address if a hostname is not available) used to identify the machine.
The second line, Machine nickname, indicates the machine nickname that is used for the machine. This nickname is not used for any network communication; it is used only by STAF services (such as the Log and Monitor services) which store data based on the machine from which it came.
The third line, Startup time, indicates the time and date that STAFProc was started on the machine.
The fourth line indicates the version of STAF. You can find specific features and bug fixes that were added to a version of STAF by examining the STAF History file.
Note that if errors are encountered while STAFProc is starting, details about the errors will be displayed in the STAFProc console output. If you are starting STAFProc on Windows via the Start Menu, and errors occur during startup, the command prompt containing the console output will close and you will not be able to see the error information. If this occurs, open your own command prompt and run "STAFProc" to start STAF and see the errors in the console output.
1.2. | Redirecting STAFProc console output |
If errors occur with the STAFProc daemon, the error messages may be displayed in its console output. In order to ensure that this data is accessible, it is recommended that you redirect the STAFProc console output to a file, so that the information is available if the STAFProc console is closed. To redirect STAFProc's stdout and stderr to a file, you can execute the following when starting STAFProc:
On Windows:
  STAFProc >> STAFProc.out
On Unix:
  STAFProc >STAFProc.out 2>&1 &
On Unix (on systems where logging out of the terminal would cause the STAFProc process to be terminated):
  nohup STAFProc >STAFProc.out 2>&1 &
1.3. | Configuring STAF |
STAF is configured through a text file called the STAF Configuration File. This file may have any name you choose, but the default is STAF.cfg. By default, this file is located in c:\STAF\bin on Windows, /usr/local/staf/bin on UNIX, and /Library/staf/bin on Mac OS X. When you start STAFProc on a system, that system's STAF.cfg file is read to determine how STAF should be configured on the machine. If you change a machine's STAF.cfg file, you must restart STAFProc on that machine for the changes to take effect. Some configuration items, such as trust levels, can be changed dynamically (via an associated STAF service, such as the TRUST service) while STAFProc is running. However, these dynamic changes are lost when STAFProc is restarted. So after making a dynamic change on a machine, you will usually also want to update the machine's STAF.cfg file so that the change is still active the next time STAFProc starts.
2. STAF Installation Verification | |
2.1. | STAF Install location |
By default STAF will be installed to C:\STAF (on Windows), /Library/staf on Mac OS X, and /usr/local/staf on all other Unix platforms. During STAF installation, the user can select any directory as the target for the installation. | |
2.2. | STAF Install packages |
STAF provides 2 ways to install STAF: InstallAnywhere (for Windows and most Unix platforms), and a tar.gz STAFInst script (for all Unix platforms). Both installers will install the same files to the target install directory. The InstallAnywhere installer will perform additional system updates, such as automatically updating system/user environment variables. The STAF InstallAnywhere installers for most platforms are available as an executable file (.exe on Windows, .bin on Unix); on Mac OS X the InstallAnywhere installer is available as a .zip file. The "Bundled JVM" executable file includes a bundled JVM that will be used during the install and uninstall of STAF. The "NoJVM" executable file will require the system to have an existing JVM. | |
2.3. | STAF directories |
The following directories will be created when STAF is installed:
2.4. | Key STAF files |
The following are descriptions of some of the key STAF files that are installed in the root STAF directory:
2.5. | STAF Environment |
There are multiple environment settings required for STAF to function correctly. You can find more information about the STAF environment variables in the STAF User's Guide. To view the current environment variables on a system, you can run "set" on Windows or "export" on Unix. Note that on Windows the InstallAnywhere installer will update the appropriate system/user environment variables. These can be viewed in Control Panel -> System -> Advanced -> Environment Variables. On Unix, the InstallAnywhere installer will update the /etc/profile file with the appropriate environment variables. If you used a tar.gz installer, you must set the environment variables for STAF either by running STAFEnv.sh or by updating the /etc/profile file. | |
2.6. | Determining which version/architecture of STAF is installed |
After the STAF install is complete, an install.properties file will be created in the root STAF install directory. The file will contain key/value pairs that provide information about the version of STAF that has been installed. The install.properties file will contain the following information:
Here is a sample install.properties file from a Windows system (using the IA installer):
  version=3.3.0
  platform=win32
  architecture=32-bit
  installer=IA
  file=STAF330-setup-win32.exe
  osname=Windows
  osversion=*
  osarch=x86
Here is a sample install.properties file from a Mac OS X i386 system (using the STAFInst installer):
  version=3.3.0
  platform=macosx-i386
  architecture=32-bit
  installer=STAFInst
  file=STAF330-macosx-i386.tar
  osname=Mac OS X
  osversion=10.4+
  osarch=i386
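When you need to check the installed version or architecture from a script, the key/value layout shown above is simple to read. The following is a minimal Python sketch, not part of STAF; the function name parse_install_properties is hypothetical, and it assumes the file contains only plain key=value lines like the samples above.

```python
def parse_install_properties(text):
    """Parse key=value lines into a dict, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' separator
            props[key.strip()] = value.strip()
    return props


# Sample content copied from the Windows example above.
SAMPLE = """\
version=3.3.0
platform=win32
architecture=32-bit
installer=IA
file=STAF330-setup-win32.exe
osname=Windows
osversion=*
osarch=x86
"""

props = parse_install_properties(SAMPLE)
print(props["version"], props["architecture"], props["installer"])
```

In practice you would read the text from the install.properties file in the root STAF install directory instead of using the embedded sample.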
3. STAF Variables | |
3.1. | VAR LIST |
To view all currently set STAF variables, you can run the following command:
  STAF local VAR LIST
Here is an example of the output:
  STAF/Config/BootDrive           : C:
  STAF/Config/CodePage            : IBM-437
  STAF/Config/ConfigFile          : C:\STAF\bin\STAF.cfg
  STAF/Config/DefaultAuthenticator: none
  STAF/Config/DefaultInterface    : tcp
  STAF/Config/InstanceName        : STAF
  STAF/Config/Machine             : staf3a.austin.ibm.com
  STAF/Config/MachineNickname     : staf3a.austin.ibm.com
  STAF/Config/Mem/Physical/Bytes  : 2135666688
  STAF/Config/Mem/Physical/KB     : 2085612
  STAF/Config/Mem/Physical/MB     : 2036
  STAF/Config/OS/MajorVersion     : 5
  STAF/Config/OS/MinorVersion     : 1
  STAF/Config/OS/Name             : WinXP
  STAF/Config/OS/Revision         : 2600
  STAF/Config/Sep/Command         : &
  STAF/Config/Sep/File            : \
  STAF/Config/Sep/Line            :
  STAF/Config/Sep/Path            : ;
  STAF/Config/STAFRoot            : C:\STAF
  STAF/Config/StartupTime         : 20070731-19:22:32
  STAF/DataDir                    : C:\STAF\data\STAF
  STAF/Env/ALLUSERSPROFILE        : C:\Documents and Settings\All Users
  STAF/Env/ANT_HOME               : C:\apache-ant-1.6.5
  STAF/Env/APPDATA                : C:\Documents and Settings\Administrator\Application Data
  STAF/Env/CLASSPATH              : C:\STAF\bin\JSTAF.jar;C:\STAF\samples\demo\STAFDemo.jar;
  STAF/Env/CLIENTNAME             : Console
  STAF/Env/CommonProgramFiles     : C:\Program Files\Common Files
  STAF/Env/COMPUTERNAME           : STAF3A
  STAF/Env/ComSpec                : C:\WINDOWS\system32\cmd.exe
  STAF/Env/FP_NO_HOST_CHECK       : NO
  STAF/Env/HOMEDRIVE              : C:
  STAF/Env/HOMEPATH               : \Documents and Settings\Administrator
  STAF/Env/LOGONSERVER            : \\STAF3A
  STAF/Env/NUMBER_OF_PROCESSORS   : 2
  STAF/Env/OS                     : Windows_NT
  STAF/Env/Path                   : C:\STAF\bin;C:\ibmjava142\bin;C:\WINDOWS\system32;C:\WINDOWS;
  STAF/Env/PATHEXT                : .COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.pyo;.pyc;.py;.pyw
  STAF/Env/PROCESSOR_ARCHITECTURE : x86
  STAF/Env/PROCESSOR_IDENTIFIER   : x86 Family 15 Model 4 Stepping 4, GenuineIntel
  STAF/Env/PROCESSOR_LEVEL        : 15
  STAF/Env/PROCESSOR_REVISION     : 0404
  STAF/Env/ProgramFiles           : C:\Program Files
  STAF/Env/SESSIONNAME            : Console
  STAF/Env/STAFCONVDIR            : C:\STAF\codepage
  STAF/Env/SystemDrive            : C:
  STAF/Env/SystemRoot             : C:\WINDOWS
  STAF/Env/TCLLIBPATH             : C:\STAF\bin;C:\STAF\bin
  STAF/Env/TEMP                   : C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp
  STAF/Env/TMP                    : C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp
  STAF/Env/tvdebugflags           : 0x260
  STAF/Env/tvlogsessioncount      : 5000
  STAF/Env/USERDOMAIN             : STAF3A
  STAF/Env/USERNAME               : staf
  STAF/Env/USERPROFILE            : C:\Documents and Settings\Administrator
  STAF/Env/windir                 : C:\WINDOWS
  STAF/Version                    : 3.2.2
The following sections will describe some STAF variables that can be useful when debugging STAF.
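If you capture the VAR LIST output to a file, it can be convenient to load it into a dictionary so that individual variables are easy to inspect. The sketch below is only an illustration, not a STAF API: the function name parse_var_list is hypothetical, and it assumes each variable is printed on one line as "name : value" with names beginning with "STAF/". Wrapped continuation lines are simply ignored.

```python
def parse_var_list(output):
    """Turn 'STAF/Name : value' lines into a dict; other lines are ignored."""
    variables = {}
    for line in output.splitlines():
        # Variable names contain no ':' so the first ':' ends the name.
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name.startswith("STAF/"):
            variables[name] = value.strip()
    return variables


# A few lines taken from the example output above.
SAMPLE = r"""
STAF/Config/ConfigFile          : C:\STAF\bin\STAF.cfg
STAF/Config/InstanceName        : STAF
STAF/Config/Machine             : staf3a.austin.ibm.com
STAF/DataDir                    : C:\STAF\data\STAF
STAF/Version                    : 3.2.2
"""

variables = parse_var_list(SAMPLE)
# Show only the STAF/Config/* entries.
for name in sorted(variables):
    if name.startswith("STAF/Config/"):
        print(name, "=", variables[name])
```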
3.2. | STAF/Config/Machine |
This variable shows the TCP/IP hostname used to identify the machine. | |
3.3. | STAF/Config/MachineNickname |
This variable shows the machine nickname that is used for the machine. This nickname is not used for any network communication; it is used only by STAF services which store data based on the machine from which it came. | |
3.4. | STAF/Config/ConfigFile |
This variable shows the STAF configuration file that was used when STAFProc was started. Note that if you have made changes to your STAF configuration file, and restarted STAFProc, but the changes made to the STAF configuration file have not been used, then verify that this STAF variable is showing the expected configuration file. | |
3.5. | STAF/Config/InstanceName |
This variable shows the name of this STAF instance. STAF Instance Names are used when you want to run multiple instances of STAFProc at the same time on the same system. This STAF variable is set to the value (when STAFProc is started) of environment variable STAF_Instance_Name. If this environment variable is not set when STAFProc is started, the default instance name STAF will be used. Note that if the value of STAF variable STAF/Config/InstanceName is set to an empty string, that is not the same as having it set to the default instance name STAF. | |
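The difference between an unset variable and an empty one can be illustrated with a short Python sketch. This is not STAF code: the function effective_instance_name is hypothetical, and the upper-case spelling STAF_INSTANCE_NAME is an assumption (the text above writes STAF_Instance_Name, and environment variable names are case-sensitive on Unix), so pass the spelling your system uses.

```python
import os

DEFAULT_INSTANCE_NAME = "STAF"


def effective_instance_name(environ=None, var_name="STAF_INSTANCE_NAME"):
    """Return the instance name implied by the environment.

    An unset variable falls back to the default name, while a variable set
    to an empty string is returned as-is (and so differs from the default).
    """
    env = os.environ if environ is None else environ
    return env.get(var_name, DEFAULT_INSTANCE_NAME)


print(repr(effective_instance_name({})))                             # unset
print(repr(effective_instance_name({"STAF_INSTANCE_NAME": ""})))     # empty
print(repr(effective_instance_name({"STAF_INSTANCE_NAME": "TEST"})))  # named
```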
3.6. | STAF/Config/STAFRoot |
This variable shows the root STAF directory for the currently running instance of STAF. | |
3.7. | STAF/DataDir |
This variable shows the directory that STAF and its services use to write data (based on the DATADIR operational parameter). | |
3.8. | STAF/Env/* |
These STAF variables show all of the environment variables that were set when STAFProc started. For example, the value set for environment variable CLASSPATH will be used to set the value for STAF variable STAF/Env/CLASSPATH. | |
4. Service Help and Error Codes | |
4.1. | Obtaining STAF service syntax |
Every STAF service provides a HELP command which returns the commands that the service accepts along with the options that are available for each command. To determine which STAF services are available on the machine, you can run the following: STAF <machine> SERVICE LIST Here is an example of the output: Name Library Executable -------- ---------- ------------------------------------ DELAY <Internal> <None> DIAG <Internal> <None> ECHO <Internal> <None> EMAIL JSTAF C:\STAF/services/email/STAFEmail.jar EVENT JSTAF C:\STAF/services/stax/STAFEvent.jar FS <Internal> <None> HANDLE <Internal> <None> HELP <Internal> <None> LOG STAFLog <None> MISC <Internal> <None> PING <Internal> <None> PROCESS <Internal> <None> QUEUE <Internal> <None> SEM <Internal> <None> SERVICE <Internal> <None> SHUTDOWN <Internal> <None> STAX JSTAF C:\STAF/services/stax/STAX.jar TRACE <Internal> <None> TRUST <Internal> <None> VAR <Internal> <None> You can submit a <service> HELP request to each service to obtain its request syntax. Here is an example of getting the command syntax for the TRACE service: STAF <machine> TRACE HELP Here is an example of the output: Trace service help ENABLE ALL [ TRACEPOINTS | SERVICES ] ENABLE TRACEPOINTS <Trace point list> | SERVICES <Service list> ENABLE TRACEPOINT <Trace point> [TRACEPOINT <Trace point>]... ENABLE SERVICE <Service> [SERVICE <Service>]... DISABLE ALL [ TRACEPOINTS | SERVICES ] DISABLE TRACEPOINTS <Trace point list> | SERVICES <Service list> DISABLE TRACEPOINT <Trace point> [TRACEPOINT <Trace point>]... DISABLE SERVICE <Service> [SERVICE <Service>]... SET DESTINATION TO < STDOUT | STDERR | FILE <File name> > SET DEFAULTSERVICESTATE < Enabled | Disabled > LIST [SETTINGS] PURGE HELP You can find more information about the commands and options, including examples, in the User's Guide documentation for the service. All internal services, and the Log, Monitor, Respool, and Zip services, have their commands/options documented in the STAF User's Guide. 
All other external services have their commands/options documented in the service User's Guide (for example, the STAX User's Guide and the Email User's Guide). Service User's Guides are distributed with each service and are available via the Download Services page. When examining the syntax for each service, keep the following rules in mind:
More information on the option syntax is provided in the appropriate user's guide. | |
4.2. | STAF service syntax errors |
If you submit an invalid request to a STAF service, it will return an RC 7, which indicates that the request string was invalid. The result will contain details about why the request string was invalid. Here is an example of an invalid request for the TRACE service:
  STAF <machine> TRACE SET DESTINATION TO
Here is an example of the output:
  Error submitting request, RC: 7
  Additional info
  ---------------
  When specifying one of the options TO, you must also specify one of the options STDOUT STDERR FILE
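When STAF commands are run from a script, the return code can be picked out of the captured console text. Here is a minimal Python sketch, assuming the error output has the "Error submitting request, RC: N" form shown above; the function name extract_error_rc is hypothetical.

```python
import re

# Matches the first line of the error output shown above.
_RC_PATTERN = re.compile(r"Error submitting request, RC:\s*(\d+)")


def extract_error_rc(output):
    """Return the RC as an int, or None if no error line is present."""
    match = _RC_PATTERN.search(output)
    return int(match.group(1)) if match else None


SAMPLE = """\
Error submitting request, RC: 7
Additional info
---------------
When specifying one of the options TO, you must also specify one of the options STDOUT STDERR FILE
"""

rc = extract_error_rc(SAMPLE)
if rc == 7:
    print("Invalid request string; check the service's HELP output")
```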
4.3. | STAF error codes |
You can use the HELP service to obtain help about STAF error codes. For example, to get a brief overview of all STAF error codes, you can run: STAF <machine> HELP LIST ERRORS Here is an example of the output: Return Code Description ----------- ------------------------------ 0 No error 1 Invalid API 2 Unknown service 3 Invalid handle 4 Handle already exists 5 Handle does not exist 6 Unknown error 7 Invalid request string 8 Invalid service result 9 REXX Error 10 Base operating system error 11 Process already complete 12 Process not complete 13 Variable does not exist 14 Unresolvable string 15 Invalid resolve string 16 No path to endpoint 17 File open error 18 File read error 19 File write error 20 File delete error 21 STAF not running 22 Communication error 23 Trusteee does not exist 24 Invalid trust level 25 Insufficient trust level 26 Registration error 27 Service configuration error 28 Queue full 29 No queue element 30 Notifiee does not exist 31 Invalid API level 32 Service not unregisterable 33 Service not available 34 Semaphore does not exist 35 Not sempahore owner 36 Semaphore has pending requests 37 Timeout 38 Java error 39 Converter error 40 Not used 41 Invalid object 42 Invalid parm 43 Request number not found 44 Invalid asynchronous option 45 Request not complete 46 Process authentication denied 47 Invalid value 48 Does not exist 49 Already exists 50 Directory Not Empty 51 Directory Copy Error 52 Diagnostics Not Enabled 53 Handle Authentication Denied 54 Handle Already Authenticated 55 Invalid STAF Version 56 Request Cancelled 4000+ Service specific errors The STAF User's Guide has detailed information about each error code. You can also get detailed information for each error code via the HELP service. For example, you can run: STAF <machine> HELP ERROR 25 Here is an example of the output: Description: Insufficient trust level Details : You have submitted a request for which you do not have the required trust level to perform the request. 
Note: Additional information regarding the required trust level may be provided in the result passed back from the submit call. In addition to the standard STAF error codes, external STAF services can use error codes that are specific for the service. These error codes will always be in the range of 4000 and beyond. The service User's Guide will have more information about the service-specific error codes. You can also use the HELP service to get detailed information about these service-specific error codes. For example, you can run: STAF <machine> HELP SERVICE LOG ERROR 4004 Here is an example of the output: Description: Invalid level Details : An invalid level was specified | |
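The split between standard and service-specific codes can be sketched as a small lookup. This Python example is only an illustration: the table holds a handful of descriptions copied from the HELP LIST ERRORS output above, and the function name describe_rc is hypothetical.

```python
# A few standard codes copied from the HELP LIST ERRORS output above.
STANDARD_CODES = {
    0: "No error",
    7: "Invalid request string",
    16: "No path to endpoint",
    21: "STAF not running",
    22: "Communication error",
    25: "Insufficient trust level",
}

SERVICE_SPECIFIC_START = 4000  # codes from here on belong to a service


def describe_rc(rc):
    """Give a short hint for a STAF return code."""
    if rc >= SERVICE_SPECIFIC_START:
        return ("Service-specific error; run "
                "STAF <machine> HELP SERVICE <service> ERROR %d" % rc)
    if rc in STANDARD_CODES:
        return STANDARD_CODES[rc]
    return "Run STAF <machine> HELP ERROR %d for details" % rc


for code in (0, 25, 48, 4004):
    print(code, "->", describe_rc(code))
```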
5. MISC service | |
5.1. | MISC service |
The STAF MISC (Miscellaneous) service provides some useful debugging information. You can run the MISC WHOAMI request to determine information about who a system thinks you are. | |
5.2. | MISC WHOAMI |
For example, you can run the following command:
  STAF <remote-machine> MISC WHOAMI
Here is an example of the output:
  Instance Name   : STAF
  Instance UUID   : A5CA1346980800000903D3D661663361
  Request Number  : 106693
  Interface       : tcp
  Logical ID      : staf3a.austin.ibm.com
  Physical ID     : 9.3.211.214
  Endpoint        : tcp://staf3a.austin.ibm.com@6500
  Machine         : staf3a.austin.ibm.com
  Machine Nickname: staf3a.austin.ibm.com
  Local Request   : No
  Handle          : 26
  Handle Name     : STAF/Client
  User            : none://anonymous
  Trust Level     : 5
The Instance Name value contains the STAF instance name that identifies the instance of STAF to which the request is communicating (in case multiple instances of STAF are running). The default STAF instance name is "STAF".
The Logical ID value contains the hostname of your machine.
The Physical ID value contains the IP address of your machine.
The Trust Level value contains the trust level that the remote machine has granted your machine. If you are encountering trust-related problems, check this value and compare it to the trust definitions on the remote machine by running STAF <remote-machine> TRUST LIST.
5.3. | MISC WHOAREYOU |
The MISC WHOAREYOU request will display information about a system, such as the STAF instance name, instance UUID, machine name (the value of the STAF/Config/Machine system variable for the machine), machine nickname, (the value of the STAF/Config/MachineNickname variable for the machine) and if it's the same system as the machine who submitted the request. For example, you can run the following command: STAF <machine> MISC WHOAREYOU Here is an example of the output: Instance Name : STAF Instance UUID : 711E9E411B0A00000929359245636173 Machine : client2.austin.ibm.com Machine Nickname: client2.austin.ibm.com Local Request : Yes | |
5.4. | MISC LIST INTERFACES |
The MISC LIST INTERFACES request shows you information about the network interfaces that STAF is currently using. For example, you can run the following command:
  STAF local MISC LIST INTERFACES
Here is an example of the output:
  [
    {
      Interface Name: local
      Library       : STAFLIPC
      Options       : {
        IPCMethod: Shared memory
        IPCName  : STAF
      }
    }
    {
      Interface Name: tcp
      Library       : STAFTCP
      Options       : {
        ConnectTimeout: 5000
        Port          : 6500
        Protocol      : IPv4
        Secure        : No
      }
    }
  ]
Note that normally you would have a "local" interface, and one or more "tcp" interfaces (note that the default TCP/IP port is 6500).
5.5. | MISC LIST PROPERTIES |
The MISC LIST PROPERTIES request shows you the install properties for the version of STAF that is currently running. The output of this request will contain the following information:
For example, you can run the following command:
  STAF local MISC LIST PROPERTIES
Here is an example of the output:
  version     : 3.3.0
  platform    : win32
  architecture: 32-bit
  installer   : IA
  file        : STAF330-setup-win32.exe
  osname      : Windows
  osversion   : *
  osarch      : x86
6. Debugging STAF communication problems | |
6.1. | Debugging STAF communication problems |
If you are having problems getting two STAF machines to communicate, you should first verify that a non-STAF ping between the two machines is successful. If it is not, then there is a basic TCP/IP communication problem between the machines. If a non-STAF ping between the two machines is successful, then check the following:
7. Debugging STAF trust problems | |
7.1. | Debugging STAF trust problems |
If you are having trust-related problems when submitting requests to STAF services (such as RC 25, which indicates that you do not have the trust level required to perform the request), you can use the TRUST service to verify that the correct trust levels have been set. You can run the following command to see the current trust settings on a machine:
  STAF <machine> TRUST LIST
Here is an example of the output:
  Type    Entry                         Trust Level
  ------- ----------------------------- -----------
  Default <None>                        1
  Machine *://*.austin.ibm.com          2
  Machine *://9.31.73.14*               3
  Machine *://9.31.73.147               5
  Machine *://client1.austin.ibm.com    5
  Machine *://client3.austin.ibm.com    3
  Machine local://local                 5
  Machine tcp://client2.austin.ibm.com  0
You can use the TRUST service's GET request to determine the effective trust level of a specific machine. For example:
  STAF <machine> TRUST GET MACHINE client4.austin.ibm.com
Here is an example of the output:
  2
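To reason about which wildcard entry applies to a given machine, it can help to experiment with the matching outside of STAF. The Python sketch below is an approximation only: it assumes that the matching entry with the most literal (non-wildcard) characters wins, which is a simplification and not STAF's documented matching rules. The function name guess_trust_level is hypothetical, and the IP address entries from the example are left out because they would be compared against the requester's IP address rather than its hostname. Always confirm the real value with a TRUST GET request.

```python
from fnmatch import fnmatchcase

# Hostname entries copied from the TRUST LIST example above.
TRUST_ENTRIES = [
    ("*://*.austin.ibm.com", 2),
    ("*://client1.austin.ibm.com", 5),
    ("*://client3.austin.ibm.com", 3),
    ("local://local", 5),
    ("tcp://client2.austin.ibm.com", 0),
]
DEFAULT_TRUST = 1


def guess_trust_level(endpoint, entries=TRUST_ENTRIES, default=DEFAULT_TRUST):
    """Estimate the trust level for 'interface://hostname'.

    ASSUMPTION: the matching pattern with the most literal characters is
    treated as the most specific one. This is not STAF's actual algorithm.
    """
    endpoint = endpoint.lower()
    matches = [
        (len(pattern.replace("*", "")), level)
        for pattern, level in entries
        if fnmatchcase(endpoint, pattern.lower())
    ]
    if not matches:
        return default
    return max(matches)[1]


for name in ("tcp://client4.austin.ibm.com",
             "tcp://client2.austin.ibm.com",
             "tcp://host.example.com"):
    print(name, "->", guess_trust_level(name))
```

With the sample entries, client4.austin.ibm.com only matches the *://*.austin.ibm.com entry, which agrees with the level 2 returned by the GET request above.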
8. STAF Handles | |
8.1. | HANDLE LIST |
You can view all of the currently active STAF handles by running the following command:
  STAF local HANDLE LIST HANDLES PENDING STATIC REGISTERED INPROCESS LONG
Here is an example of the output:
  Handle Handle Name                     State      Last Used Date-Time PID
  ------ ------------------------------- ---------- ------------------- ----
  1      STAF_Process                    InProcess  20070712-10:36:42   1636
  2      STAF/Service/STAFServiceLoader1 InProcess  20070709-16:56:22   1636
  3      STAF/Service/STAX               Registered 20070709-16:56:22   2844
  4      STAF/Service/LOG                InProcess  20070709-16:56:22   1636
  5      STAF/SERVICE/Event              Registered 20070709-16:56:22   2844
  32     STAF/Client                     Registered 20070712-23:09:16   2900
Note that the "PID" value will contain the process id assigned by the operating system. This can be useful when debugging Java services, for example.
9. STAF Processes | |
9.1. | PROCESS LIST |
You can obtain information about all of the processes started via STAF by running the following command: STAF local PROCESS LIST LONG Here is an example of the output:
H# Workload Command PID Start Date-Time End Date-Time RC -- -------- ---------------- ---- ----------------- ----------------- --------- 17 <None> notepad.exe 1444 20070625-11:33:14 20070625-11:37:55 0 25 <None> java TestProcess 2836 20070625-11:53:18 20070625-11:53:18 1 5 5 0 29 My Test java TestA 3376 20070625-12:01:05 20070625-12:05:23 0 43 My Test java TestB 2776 20070625-12:32:38 <None> <None> 47 My Test C:/tests/MyTest. 2448 20070625-12:32:56 <None> <None> exe 56 TC1 C:/tests/tc1.exe 2840 20070625-12:33:24 20070625-12:35:32 3
9.2. | Debugging PROCESS START errors |
If you are having problems starting processes via STAF, you can try the following:
10. TRACE output | |
10.1. | TRACE output |
STAF provides tracing facilities that allow you to dynamically obtain more information about what is happening in your STAF environment. This can be done by enabling tracepoints for one or more (or all) currently registered STAF services. You can enable tracing during STAF startup by adding TRACE statements to the STAF.cfg file, or you can dynamically enable/disable tracing after STAFProc has started by sending requests to the TRACE service. In both cases, the syntax to enable/disable tracing is the same. The most common tracepoints that you will enable for debugging are SERVICEREQUEST, SERVICERESULT, and SERVICECOMPLETE:
By default the trace output will be in the STAFProc console. However, in most cases you will want to redirect the trace output to a file, using the TRACE SET DESTINATION TO FILE request. Enabling ServiceRequest/ServiceResult/ServiceComplete will result in trace output for all services, which can give you a lot of extra, unnecessary, trace information, so typically you would only enable these tracepoints for a few services. Here is an example of enabling the ServiceRequest and ServiceComplete tracepoints for only the FS and Process services (and redirecting the trace output to a file):
  STAF local TRACE ENABLE TRACEPOINTS "ServiceRequest ServiceComplete"
  STAF local TRACE SET DESTINATION TO FILE /usr/local/staf/STAFTrace.out
  STAF local TRACE DISABLE ALL SERVICES
  STAF local TRACE ENABLE SERVICES "FS Process"
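If you enable tracing for different sets of services often, the four requests above can be generated from a small helper. The following Python sketch only builds the command strings (it does not run them); the function name build_trace_commands is hypothetical, and the request syntax is taken from the example above.

```python
def build_trace_commands(tracepoints, services, trace_file, machine="local"):
    """Return the STAF command lines that limit tracing to the given services."""
    prefix = "STAF %s TRACE " % machine
    return [
        # Enable only the tracepoints of interest.
        prefix + 'ENABLE TRACEPOINTS "%s"' % " ".join(tracepoints),
        # Send trace output to a file instead of the STAFProc console.
        prefix + "SET DESTINATION TO FILE %s" % trace_file,
        # Turn tracing off for every service, then back on for a few.
        prefix + "DISABLE ALL SERVICES",
        prefix + 'ENABLE SERVICES "%s"' % " ".join(services),
    ]


commands = build_trace_commands(
    ["ServiceRequest", "ServiceComplete"],
    ["FS", "Process"],
    "/usr/local/staf/STAFTrace.out",
)
for command in commands:
    print(command)
```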
11. Debugging Java problems | |
11.1. | Determining Java version |
When using Java STAF services (which require a JVM on the machine where the service will be configured) or Java classes that call into the STAF Java APIs, it is often useful to determine the exact version of Java that will be used. You can use "java -version" to determine the exact version of Java. By default, STAF will use the first "java" executable that is found in the System PATH (unless you specify the full path to the Java executable). To find which version of Java STAF is using by default, you can run the following command:
  STAF local PROCESS START SHELL COMMAND "java -version" RETURNSTDOUT STDERRTOSTDOUT WAIT
Here is an example of the output:
  {
    Return Code: 0
    Key        : <None>
    Files      : [
      {
        Return Code: 0
        Data       : java version "1.6.0"
  Java(TM) SE Runtime Environment (build 1.6.0-b105)
  Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode)
      }
    ]
  }
Note that you can use a different Java version for a Java service by specifying OPTION JVM=<location of java executable> and OPTION JVMName=<JVM name> when registering it.
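When comparing Java versions across several machines, you may want to pull just the version string out of the captured output. This is a minimal Python sketch, assuming the output contains a line of the form version "x.y.z" as in the example above; the function name parse_java_version is hypothetical.

```python
import re


def parse_java_version(text):
    """Return the quoted version from 'java -version' output, or None."""
    match = re.search(r'version "([^"]+)"', text)
    return match.group(1) if match else None


# The Data value from the example output above.
SAMPLE = """\
java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode)
"""

print(parse_java_version(SAMPLE))
```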
11.2. | Debugging multiple STAF Java services |
When debugging multiple STAF Java services, it is recommended that you run each Java service in its own JVM. You can specify to run a Java service in its own JVM when registering it by specifying OPTION JVMName=<JVM name>. | |
11.3. | Testing STAF Java support |
The TestJSTAF class allows you to submit a command-line STAF request using STAF Java support. This class is useful if you want to verify that STAF Java support is working correctly, without requiring a GUI display or any modifications to the CLASSPATH. The syntax of this class is: Usage: java com.ibm.staf.TestJSTAF <Endpoint | LOCAL> <Service> <Request> Here is an example of using this class: C:\> java com.ibm.staf.TestJSTAF LOCAL MISC VERSION TestJSTAF using STAF handle 15 RC=0 Result=3.2.1 | |
12. JVM Logs | |
12.1. | JVM Logs |
Each Java service that is registered with STAF runs in a JVM (Java Virtual Machine). Each JVM created by STAF has a JVM Log file associated with it. Note that more than one Java service may use the same JVM (and thus the same JVM Log file) depending on the options used when registering the service. A JVM Log file contains JVM start information such as the date/time when the JVM was created, the JVM executable, and the J2 options used to start the JVM. It also contains any other information logged by the JVM. This includes any errors that may have occurred while the JVM was running, and any information written to standard output/error by the STAF Java services running in the JVM. STAF stores JVM Log files in the {STAF/DataDir}/lang/java/jvm/<JVMName> directory. STAF retains a configurable number of JVM Logs (5 by default) for each JVM. The current JVM log file is named JVMLog.1 and older saved JVM log files, if any, are named JVMLog.2 to JVMLog.<MAXLOGS>. When a JVM is started, if the size of the JVMLog.1 file exceeds the maximum configurable size (1M by default), the JVMLog.1 file is copied to JVMLog.2 and so on for any older JVM Logs, and a new JVMLog.1 file will be created. This JVM log will contain something similar to: ****************************************************************************** *** 20070718-09:18:01 - Start of Log for JVMName: STAFJVM1 *** JVM Executable: C:/jdk1.6.0/jre/bin/java *** JVM Options : none *** JVM PID : 4736 ****************************************************************************** Note that the JVM log includes the System PID for the JVM. This can be used to determine system information, such as CPU and memory utilization, for the JVM. | |
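The log rotation described above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not STAF's implementation: the function name rotate_jvm_logs is hypothetical, the 1M default is assumed to mean 1,048,576 bytes, and the sketch renames files where the text above says they are copied.

```python
import tempfile
from pathlib import Path


def rotate_jvm_logs(log_dir, max_logs=5, max_size=1024 * 1024):
    """Shift JVMLog.N to JVMLog.N+1 if JVMLog.1 exceeds max_size.

    Returns True if a rotation happened. ASSUMPTION: max_size is in bytes.
    """
    log_dir = Path(log_dir)
    current = log_dir / "JVMLog.1"
    if not current.exists() or current.stat().st_size <= max_size:
        return False
    # The oldest retained log is dropped to make room.
    oldest = log_dir / ("JVMLog.%d" % max_logs)
    if oldest.exists():
        oldest.unlink()
    # Shift the remaining logs up by one, starting with the oldest.
    for n in range(max_logs - 1, 0, -1):
        src = log_dir / ("JVMLog.%d" % n)
        if src.exists():
            src.rename(log_dir / ("JVMLog.%d" % (n + 1)))
    current.touch()  # start a new, empty JVMLog.1
    return True


# Demonstration in a throwaway directory with a small size limit.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "JVMLog.1").write_bytes(b"x" * 2048)
    rotated = rotate_jvm_logs(tmp, max_size=1024)
    print(rotated, sorted(p.name for p in Path(tmp).iterdir()))
```

When looking for an error from an earlier run of a Java service, this naming scheme means the relevant record may be in JVMLog.2 or a higher-numbered file rather than in JVMLog.1.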
12.2. | Viewing JVM Logs via the STAX Monitor |
To display the JVM Log for the STAX service or for any Java service on any machine, from the main STAX Job Monitor window's Display menu bar, select one of the following menu items:
12.3. | Viewing JVM Logs via the STAFJVMLogViewer class |
The STAFJVMLogViewer class provides a Java GUI that can display a JVM Log for any STAF Java service that is currently registered. For more information on how to use the STAFJVMLogViewer class, see section "3.6.2 Class STAFJVMLogViewer" in the STAF Java User's Guide. Here is an example of using the STAFJVMLogViewer class to display the current JVM Log for the Cron service on machine server1.company.com:
  java com.ibm.staf.STAFJVMLogViewer -serviceName Cron -machine server1.company.com
13. Service logs | |
13.1. | Service logs |
Many STAF services write information to a STAF log file. These services include:
These logs are machine logs. Here is an example of querying a service log: STAF local LOG QUERY MACHINE {STAF/Config/MachineNickname} LOGNAME EventManager LAST 5 Here is an example of the output: 20070717-15:44:52 Info [ID=3] [local://local, STAF/EventManager/UI] Registered a STAF command. Register request: REGISTER MACHINE :5: local SERVICE :4:misc REQUEST :7:version PREPARE :3:a=1 TYPE :3:abcSUBTYPE :3:abcDESCRIPTION :20:Get the STAF v ersion 20070717-15:56:54 Info [ID=4] [local://local, STAF/EventManager/UI] Registered a STAF command. Register request: REGISTER MACHINE :5: local SERVICE :4:stax REQUEST :35:execute file c:/tests /startregr.xml TYPE :7:prodXYZSUBTYPE :5:win32DESCRIPTI ON :26:Start the regression tests 20070717-15:57:02 Info [ID=4] [local://local, STAF/EventManager/UI] Triggering a STAF command. TRIGGER ID 4 20070717-15:57:02 Info [ID=4] [dave2268.austin.ibm.com:2884] Submitted a STAF command. Event information: N/A Submitted STAF command: STAF local stax execute file c:/tests/startregr.xml 20070717-15:57:02 Fail [ID=4] [dave2268.austin.ibm.com:2884] Completed a STAF command. RC=48, Result=Error getting XML file c:/tests/ startregr.xml from machine local://local c:/tests/star tregr.xml You can refer to the individual service user's guides for more information on the records that are written to the service log. | |
13.2. | Viewing STAF service logs |
In addition to using LOG QUERY requests from the command line to query logs for STAF services, you can also use the service's UI, if applicable. For example, the STAX Monitor allows you to view the STAX service logs, and the EventManagerUI/CronUI applications allow you to view the service logs for the EventManager and Cron services. | |
14. System CPU/memory utilization | |
14.1. | System CPU/memory utilization - Windows |
To determine CPU and memory utilization on Windows, use "Task Manager". "STAFProc.exe" should be listed as an "Image Name" in the "Process" tab. You can have additional data displayed by selecting View -> Select Columns... and selecting "Handle Count" and "Thread Count". If you have any Java STAF services configured, each "java.exe" will also be listed in the "Process" tab. You can find the PID for the Java executable used by your Java STAF services by examining the JVM log(s) for the services, or by submitting a STAF local HANDLE LIST HANDLES LONG request. Here is an example of the STAF-related utilization data (in this example there are 4 JVMs being used for Java STAF services): Image Name PID User Name CPU Mem Usage Peak Mem Usage Handles Threads ================================================================================ java.exe 4060 Administrator 00 29,438K 34,244K 364 18 java.exe 4040 Administrator 00 53,148K 99,608K 338 11 java.exe 3632 Administrator 00 103,612K 137,780K 575 18 java.exe 1756 Administrator 00 14,308K 21,992K 348 12 STAFProc.exe 440 Administrator 00 12,748K 55,708K 699 68 java.exe 340 Administrator 00 15,140K 19,520K 350 12
14.2. | System CPU/memory utilization - Unix |
Examining CPU/memory utilization varies depending on the Unix operating system. For example, on Linux you can use the "top" command to display the utilization data. You can run "ps -ea" to get the PID for STAFProc, and you can find the PID for the Java executable used by your Java STAF services by examining the JVM log(s) for the services. Here is an example of the STAF-related utilization data (in this example there is 1 JVM being used for Java STAF services): > top -p 28991 -p 28997 top - 14:39:07 up 135 days, 3:23, 3 users, load average: 0.00, 0.00, 0.00 Tasks: 2 total, 0 running, 2 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si Mem: 2066304k total, 2008560k used, 57744k free, 47216k buffers Swap: 2031608k total, 160k used, 2031448k free, 1699204k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 28991 root 16 0 271m 5636 3780 S 0.0 0.3 0:19.23 STAFProc 28997 root 16 0 635m 54m 4628 S 0.0 2.7 1:17.98 java
| |
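The PID lookup described above can be scripted. The following is a minimal sketch, assuming "ps -ea" prints a header row followed by rows whose first field is the PID and whose last field is the command name. The helper names pids_for_command and top_command are hypothetical, and the TTY and TIME values in the sample text are made up for illustration.

```python
def pids_for_command(ps_output, command):
    """Return the PIDs of rows in `ps -ea` style output whose command matches."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2 and fields[-1] == command and fields[0].isdigit():
            pids.append(int(fields[0]))
    return pids


def top_command(pids):
    """Build a `top` command line that watches only the given PIDs."""
    return "top " + " ".join("-p %d" % pid for pid in pids)


# Hypothetical sample text; on a real system, capture the output of `ps -ea`.
SAMPLE_PS = """  PID TTY          TIME CMD
28991 ?        00:00:19 STAFProc
28997 ?        00:01:17 java
"""

staf_pids = pids_for_command(SAMPLE_PS, "STAFProc")
java_pids = pids_for_command(SAMPLE_PS, "java")
print(top_command(staf_pids + java_pids))  # prints: top -p 28991 -p 28997
```

If several JVMs are running, the "java" lookup returns every Java PID, so the JVM log(s) are still needed to tell which ones belong to STAF services.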
15. Debugging STAX Jobs | |
15.1. | Testing STAX Jobs |
Whenever you make changes to a STAX XML file, including a file that is going to be imported by other STAX jobs, you should always test it to reveal any XML parsing or Python compile errors. You can test a STAX job by submitting a STAX EXECUTE FILE ... TEST request, or by clicking on the Test button on the STAX Monitor's "STAX Job Parameters" dialog. If there are any XML parsing errors or Python compile errors, details about the errors will be displayed. | |
15.2. | Debugging XML Parsing Errors |
The STAX DTD is a formal description, in XML Declaration Syntax, of what names are to be used for the different types of elements in your STAX job, where they may occur, and how they all fit together. Every STAX job must comply with the STAX DTD. When you test a STAX job, or submit a STAX job for execution, if it does not conform to the STAX DTD, you will receive a STAXXMLParseException, with details about syntax errors, including the line number where the error occurred. For example, if your STAX job contains a <stafcmd> element without the required <service> element:

<stafcmd>
  <location>machine</location>
  <request>'PING'</request>
</stafcmd>

you would get the following error:

Caught com.ibm.staf.service.stax.STAXXMLParseException:
Line 27: The content of element type "stafcmd" must match "(location,service,request)".
| |
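The content model quoted in the error message, "(location,service,request)", can be illustrated with a small pre-check. This is only a sketch in standard Python, not a replacement for validating against the STAX DTD: it checks nothing but the child order of <stafcmd> elements, and the function name find_bad_stafcmds is hypothetical.

```python
import xml.etree.ElementTree as ET

# Child elements that the error message above says <stafcmd> must contain, in order.
REQUIRED_CHILDREN = ["location", "service", "request"]


def find_bad_stafcmds(xml_text):
    """Return the child-tag lists of <stafcmd> elements that break the content model."""
    root = ET.fromstring(xml_text)
    problems = []
    for cmd in root.iter("stafcmd"):
        children = [child.tag for child in cmd]
        if children != REQUIRED_CHILDREN:
            problems.append(children)
    return problems


# Reduced sample job containing the faulty <stafcmd> from the text above.
SAMPLE_JOB = """<stax>
  <stafcmd>
    <location>machine</location>
    <request>'PING'</request>
  </stafcmd>
</stax>"""

print(find_bad_stafcmds(SAMPLE_JOB))  # prints: [['location', 'request']]
```

The printed list shows that <service> is missing between <location> and <request>, which is the same problem the STAXXMLParseException reports.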
15.3. | Using XML-aware Editors |
You can use XML-aware editors, along with the STAX DTD, to provide syntax checking while you are editing your STAX XML files. Some examples of XML-aware editors are XML Cooktop and JEdit. To use these types of editors to validate your STAX jobs, you will need to have a copy of the STAX DTD file. Since the STAX DTD is generated dynamically, you can retrieve the contents of the STAX DTD and save it on the local file system (in the directory where your STAX XML files are located) by running:

set STAF_QUIET_MODE=1
STAF local STAX GET DTD > stax.dtd
| |
15.4. | Debugging Python Compile-time Errors |
When you test a STAX job, or submit a STAX job for execution, any Python code contained within the STAX job will be compiled, and any syntax errors will be reported as a STAXPythonCompileException. For example, if your STAX job contains a <log> element which is missing a closing quote (') for the message:

<log level="'info'">'This is the start of the STAX job</log>

you would get the following error:

Caught com.ibm.staf.service.stax.STAXPythonCompileException:
Element: log

Python code compile failed for:
'This is the start of the STAX job

Traceback (innermost last):
  (no code object) at line 0
SyntaxError: ('Lexical error at line 1, column 35. Encountered: "\\n" (10), after : ""', ('<string>', 1, 35, "'This is the start of the STAX job"))

Note that the Python error message indicates the line (1) and column position (35) where the error occurred. As a second example, if your STAX job contains a <script> element in which a single Python statement is split across multiple lines without a line continuation:

<script>
output = '%d file(s) returned in STAXResult' %
         len(STAXResult)
</script>

you would get the following error:

Result=Caught com.ibm.staf.service.stax.STAXPythonCompileException:
Element: script

Python code compile failed for:
output = '%d file(s) returned in STAXResult' %
         len(STAXResult)

Traceback (innermost last):
  (no code object) at line 0
SyntaxError: ('invalid syntax', ('<string>', 1, 47, "output = '%d file(s) returned in STAXResult' %"))

Note that the Python error message indicates the line (1) and column position (47) where the error occurred, which is just after the trailing % at the end of the first line.
| |
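A similar compile-only check can be approximated outside STAX with Python's built-in compile() function. This is a rough sketch, assuming a standard Python interpreter rather than the Python implementation embedded in STAX, so the error messages and column numbers may differ from those shown above. The function name check_python_snippet is hypothetical.

```python
def check_python_snippet(code, mode="exec"):
    """Return None if `code` compiles, else a (line, column, message) tuple."""
    try:
        compile(code, "<string>", mode)
    except SyntaxError as err:
        return (err.lineno, err.offset, err.msg)
    return None


# The multi-line statement from the <script> example above, split after the '%'.
broken = "output = '%d file(s) returned in STAXResult' %\nlen(STAXResult)"

# One possible fix: wrap the expression in parentheses so it may span lines.
fixed = ("output = ('%d file(s) returned in STAXResult' %\n"
         "          len(STAXResult))")

print(check_python_snippet(broken))  # a tuple describing the syntax error
print(check_python_snippet(fixed))   # prints: None
```

Compiling does not evaluate names such as STAXResult, so this kind of check only catches syntax problems, not undefined variables.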
15.5. | Debugging Python Run-time Errors |
Some Python errors cannot be detected at compile time. Runtime Python errors that are encountered while your STAX job is executing will result in a STAXPythonEvaluationException signal being raised. The default signalhandler for this signal sends a message to the STAX Monitor, logs a message in the STAX Job Log with level 'error', and terminates the job. For example, if your STAX job contains a reference to a Python variable which has not been defined (in this case a variable named 'myService'):

<stafcmd>
  <location>'local'</location>
  <service>myService</service>
  <request>'DELAY 5000'</request>
</stafcmd>

you would get the following error:

===== Element Information =====
<stafcmd>
  <location>'local'</location>
  <service>myService</service>
  <request>'DELAY 5000'</request>
</stafcmd>
Stafcmd sub-element in error: <service>

===== Python Error Information =====
com.ibm.staf.service.stax.STAXPythonEvaluationException:
Python string evaluation failed for:
myService

Traceback (innermost last):
  File "<pyEval string>", line 1, in ?
NameError: myService

===== Call Stack for STAX Thread 1 =====
[
  Block: main
  Sequence: 25/25
  Function: main
  Sequence: 1/1
]
| |
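The difference between a compile-time and a run-time error can be seen in plain Python: an undefined name compiles without complaint and fails only when it is evaluated. The sketch below uses a standard Python interpreter, not STAX itself, and the helper evaluate_expression is hypothetical.

```python
def evaluate_expression(expr, variables):
    """Compile `expr`, then evaluate it against the given variables."""
    # Compilation succeeds for any syntactically valid expression,
    # even if it refers to names that do not exist yet.
    code = compile(expr, "<pyEval string>", "eval")
    try:
        return ("ok", eval(code, dict(variables)))
    except NameError as err:
        # The undefined name is only discovered at evaluation time.
        return ("NameError", str(err))


# myService is not defined, so evaluation fails at run time.
print(evaluate_expression("myService", {})[0])  # prints: NameError

# Quoting the value makes it a string literal, which evaluates fine.
print(evaluate_expression("'myService'", {}))  # prints: ('ok', 'myService')

# Alternatively, define the variable before it is referenced.
print(evaluate_expression("myService", {"myService": "DELAY"}))  # prints: ('ok', 'DELAY')
```

In a STAX job, the equivalent fixes are to quote the literal in the <service> element or to assign the variable in an earlier <script> element.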
15.6. | Displaying/logging data within your STAX jobs |
When debugging a STAX job, you may find it useful to add log and/or message elements to your STAX job, or Python print statements in script elements. It is recommended that when you add <message> elements to your STAX job, you include the log attribute (if the message is important) so that the message data is also written to the STAX Job User log. For example:

<message log="1">'Whatever text/variables you want to see'</message>

This data will be displayed in the STAX Monitor and will be written to the STAX Job User log. You can also use Python print statements in script elements to debug your STAX jobs. For example:

<script>
  if debug:
    print 'Debug info: ', machName, cmd
</script>

Note that output from a Python print statement will be written to the STAX Job User Log by default, but this can be changed via the PYTHONOUTPUT setting.
| |
15.7. | Holding STAX jobs for debugging |
When debugging a STAX job, you may also find it useful to hold a STAX job and then query the job. You can submit a HOLD request to the STAX service via the command line or via the STAX Monitor. You can also add the hold element at various points in your STAX job and then you can query information about the STAX job. | |
15.8. | Debugging hung STAX jobs |
If a STAX job appears to be hung (or you just want to see what it is currently executing), you can submit a LIST JOB <Job ID> THREADS request to the STAX service to get a list of the threads currently running in the specified STAX job. Then, for each thread, submit a QUERY JOB <Job ID> THREAD <Thread ID> request to the STAX service to get more information on the current state of the thread. Querying a thread provides a "Call Stack" and a "Condition Stack" for the thread, which can be useful for debugging a STAX job. The "Call Stack" shows which elements in the STAX job are currently being executed. For example, if you are debugging job 10 while it is running, you could submit the following requests:

C:\>STAF local STAX LIST JOB 10 THREADS
Response
--------
Thread ID Parent TID State
--------- ---------- -------
1         <None>     Blocked

C:\>STAF local STAX QUERY JOB 10 THREAD 1
Response
--------
{
  Thread ID      : 1
  Parent TID     : <None>
  Start Date-Time: 20070420-17:40:57
  Call Stack     : [
    Block: main
    Sequence: 24/24
    Function: Main
    Finally:
    Try:
    Iterate: 2 clientMachines
    Sequence: 2/3
    STAFCommand: Delay 5 seconds
  ]
  Condition Stack: [
    HoldThread: Source=STAFCommand, Priority=1000
  ]
}

This is the output from querying the following STAX job while it is running the <stafcmd> element that delays for 5 seconds:

<!DOCTYPE stax SYSTEM "stax.dtd">
<stax>
  <defaultcall function="Main"/>
  <script>
    clientMachines = ['client1.company.com', 'client2.company.com']
  </script>
  <function name="Main">
    <try>
      <iterate var="machine" in="clientMachines">
        <sequence>
          <log message="1">'Starting Try Block for machine %s' % (machine)</log>
          <stafcmd name="'Delay 5 seconds'">
            <location>'local'</location>
            <service>'DELAY'</service>
            <request>'DELAY 5000'</request>
          </stafcmd>
          <log message="1">'Ending Try Block for machine %s' % (machine)</log>
        </sequence>
      </iterate>
      <finally>
        <block name="'FinallyBlock'">
          <log message="1">'Starting Finally Block...'</log>
        </block>
      </finally>
    </try>
  </function>
</stax>
| |
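When a job has many threads, the per-thread QUERY requests can be generated from the LIST output. The following is a minimal sketch, assuming the LIST JOB <Job ID> THREADS response has the tabular layout shown above, with the thread ID as the first field of each data row. The helper names are hypothetical, and the sketch only builds request strings; it does not submit them to STAF.

```python
def thread_ids_from_list_output(list_output):
    """Extract thread IDs from a LIST JOB <Job ID> THREADS response."""
    thread_ids = []
    for line in list_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric thread ID; headers and rules do not.
        if fields and fields[0].isdigit():
            thread_ids.append(int(fields[0]))
    return thread_ids


def query_requests(job_id, list_output):
    """Build one QUERY JOB ... THREAD ... request per listed thread."""
    return ["STAX QUERY JOB %d THREAD %d" % (job_id, thread_id)
            for thread_id in thread_ids_from_list_output(list_output)]


# The LIST response from the example above.
SAMPLE_LIST = """Response
--------
Thread ID Parent TID State
--------- ---------- -------
1         <None>     Blocked
"""

for request in query_requests(10, SAMPLE_LIST):
    print("STAF local " + request)  # prints: STAF local STAX QUERY JOB 10 THREAD 1
```

Each printed line can then be run from a command prompt to obtain the "Call Stack" and "Condition Stack" for that thread.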
15.9. | STAX Requests return RC 6 |
If you are receiving an RC 6 when submitting requests to the STAX service, check its JVM log to see if any additional information about the problem is logged, such as a Java exception. | |
16. Reducing overhead in STAX jobs | |
16.1. | Retrieving large files |
Use caution when retrieving large files into a STAX job, because the retrieved contents are held in a job variable. This is particularly problematic if you don't "clear" the variable before starting a <parallel> or <paralleliterate>, as that will cause the variable to be replicated across all the threads. | |
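One way to read the advice above: keep only the data the job needs from a retrieved file, then clear the variable that holds the full contents before starting parallel work. This is a minimal sketch in plain Python, where the dictionary job_vars is a hypothetical stand-in for STAX job variables and summarize_and_clear is an illustrative helper, not part of STAX.

```python
def summarize_and_clear(job_vars, name):
    """Keep a small summary of a large value, then clear the variable itself."""
    contents = job_vars[name]
    summary = {
        "lines": len(contents.splitlines()),
        "chars": len(contents),
    }
    # Clear the large value so later copies of the variables stay small.
    job_vars[name] = None
    return summary


# Hypothetical job variables; fileContents stands in for a retrieved file.
job_vars = {"fileContents": "line one\nline two\nline three"}

job_vars["fileSummary"] = summarize_and_clear(job_vars, "fileContents")
print(job_vars["fileSummary"])   # prints: {'lines': 3, 'chars': 28}
print(job_vars["fileContents"])  # prints: None
```

In a STAX job the same idea would be expressed in a <script> element placed before the <parallel> or <paralleliterate> element.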
17. Getting additional support | |
17.1. | Getting additional support |
If you have read/searched this document and you still need support (problems/questions/etc.) for STAF or any of the STAF services, there are several ways to get help. First, we ask that you:
There are several ways you can get support:
|