FULLSTIMER SAS Option
The SAS System provides the FULLSTIMER option to collect performance statistics on each SAS step and on the job as a whole, and to place them in the SAS log. It is important to note that the FULLSTIMER measures give you only a snapshot view of performance at the step and job level. Each SAS port yields different FULLSTIMER statistics based on the host operating system; see the SAS host-specific documentation for the exact statistics offered. FULLSTIMER is invoked as a SAS system option (for example, with an OPTIONS FULLSTIMER; statement) and takes effect after the option invocation. If you would like to have the performance statistics written to a SAS data set, download the ZIP file that contains the experimental %LOGPARSE macro.
Why start with the FULLSTIMER option for monitoring? The best reason is that it tells you what is happening with the SAS System specifically. The statistics it provides are reported at the job-step level and can help pinpoint performance problems down to the step. This is extremely helpful in narrowing down troublesome activity and relating it to what your code is telling the system to do. (Note: If the test execution is long, expensive, high-impact on the environment, and not easily set up, the SAS session monitoring can be done simultaneously with server and system performance monitoring.) FULLSTIMER measures can be used to help determine whether more in-depth performance monitoring with host monitoring or third-party tools is warranted.
Sample FULLSTIMER output on UNIX for a SAS DATA step is listed below:
NOTE: DATA statement used:
real time 0.06 seconds
user cpu time 0.02 seconds
system cpu time 0.00 seconds
Memory 88k
Page Faults 10
Page Reclaims 0
Page Swaps 0
Voluntary Context Switches 22
Involuntary Context Switches 0
Block Input Operations 10
Block Output Operations 12
It is important to know how these numbers are defined and what can be derived from them.
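Because the statistics arrive as plain text in the SAS log, it can help to pull them into a structure before interpreting them. The following is a minimal Python sketch that parses the sample block shown earlier; it is not the %LOGPARSE macro. The function name `parse_fullstimer_block` and the regular expression are assumptions made for illustration, and the pattern only handles simple "label value unit" lines like those in the sample (it does not handle times printed in a minutes:seconds form or statistics that differ on other hosts).

```python
import re

# Sample step statistics copied from the UNIX log excerpt in this article.
SAMPLE_LOG = """\
NOTE: DATA statement used:
real time 0.06 seconds
user cpu time 0.02 seconds
system cpu time 0.00 seconds
Memory 88k
Page Faults 10
Page Reclaims 0
Page Swaps 0
Voluntary Context Switches 22
Involuntary Context Switches 0
Block Input Operations 10
Block Output Operations 12
"""

# One statistic per line: a label, a number, then an optional unit.
STAT_LINE = re.compile(
    r"^\s*(?P<label>[A-Za-z ]+?)\s+(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-z]*)\s*$"
)


def parse_fullstimer_block(text):
    """Return {lower-cased label: (numeric value, unit)} for one step."""
    stats = {}
    for line in text.splitlines():
        if line.lstrip().startswith("NOTE:"):
            continue  # header line naming the step, not a statistic
        match = STAT_LINE.match(line)
        if match:
            label = match.group("label").lower()
            stats[label] = (float(match.group("value")), match.group("unit"))
    return stats


stats = parse_fullstimer_block(SAMPLE_LOG)
print(stats["real time"])
print(stats["memory"])
```

Keeping the unit next to the value avoids silently mixing seconds, kilobytes, and plain counts when several steps are compared later.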
FULLSTIMER Statistics Definition and Interpretation
Real Time - the Real Time represents the elapsed time or "wall clock" time. This is the time spent to execute a job or step, and it is the time the user spends waiting for the job/step to complete. Note: When host system resources are heavily utilized, the Real Time can go up significantly, reflecting a wait for various system resources to become available for the SAS job/step's usage.
User CPU Time - the time spent by the processor to execute user-written code. This is user-written from the perspective of the operating system, not the customer's language statements. That is, it covers all SAS System code that is not operating system code.
System CPU Time - the time spent by the processor to execute operating system tasks that support user-written code (all CPU tasks that were not executing user-written code). The user CPU time and system CPU time are mutually exclusive.
Memory - Memory represents the amount of memory allocated to that job/step. This does not represent the entire amount of memory that the SAS session is consuming, as it does not reflect any SAS overhead activities (SAS manager, etc.).
Page Faults - Represents the number of virtual memory page faults that occurred during the job/step. Page Faults are pages that required an I/O to retrieve (a read was done to the I/O subsystem).
Page Reclaims - Represents the number of pages retrieved from the page list awaiting re-allocation (all done in memory). These pages did not require I/O activity to obtain.
Page Swaps - The number of times a process was swapped out of main memory.
Voluntary Context Switches - Represents the number of times a process releases its CPU time-slice voluntarily before its time-slice allocation has expired. This usually occurs when the process needs an external resource, such as when it makes an I/O call for more data.
Involuntary Context Switches - The number of times a process releases its CPU time-slice involuntarily. This usually happens when its CPU time-slice has expired before the task was finished, or a higher priority task takes its time-slice away.
Block Input Operations - The number of "bufsize" reads that occur. These are I/O operations to read the data into memory for usage. Not all reads have to utilize an I/O operation since the page being requested may still be cached in memory from previous reads.
Block Output Operations - This represents the number of "bufsize" writes that occur. These are the same as block input operations except that they pertain to the writes to files. As in the case of block input operations, not all block outputs will cause an I/O operation. Some files may still be cached in memory.
Performance problems usually involve one or more of the following physical areas:
- CPU activity
- Memory activity
- I/O subsystem activity (disk and file systems)
- Network activity (this will be discussed outside the context of the SAS system later).
By examining FULLSTIMER statistics, and interpreting what is happening with and between the factors producing the measures, we can get a quick idea of where the system is having problems. We can then resort to host-level and third-party measuring tools to obtain a very detailed picture of the problem. If the host-level and third-party tools give such detail, why not use them first? Very simply, there are many tools to use, and each is fairly good at one or more specific areas of investigation, such as CPU, memory, and I/O. Also, some require server root-level access to deploy. FULLSTIMER is quick and easy (it is incorporated in the SAS system), requires no special privileges, can be run by you yourself, and can help quickly narrow the field of things to test next.
The following is a general list of interpretations you can make using FULLSTIMER:
- Real Time/CPU Time. The most valuable way to use FULLSTIMER is to compare timing information. By comparing the Real Time (elapsed time), with the total CPU time (system CPU time plus user CPU time) you can quickly determine if the problem is CPU related.
- If the Real Time and total CPU time are within 15 percent of each other, this usually indicates that the system is moving data well (at least during the run time of that job/step). CPU processing accounts for nearly all of the elapsed time, so the system memory, disk system, and file system are getting data to the CPU quickly enough not to be a problem. If you are experiencing bad task performance and the real and CPU times are within 15 percent of each other, your task is most likely CPU bound. The only ways to improve performance are then to get a faster CPU, split the process over more CPUs (multi-threading or parallel processing), or re-engineer the code to be more efficient.
- If the Real Time and total CPU time are routinely very disparate (for example, if there is a 50 percent margin between them), then your system very likely has a problem getting information to the CPU fast enough. Make a closer examination of the memory and I/O subsystems using the host or third-party tools mentioned in the next section.
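The Real Time versus CPU time comparison can be sketched as a small Python function. This is only an illustration of the 15 percent and 50 percent rules of thumb stated here: the function name `classify_step`, the returned labels, and the choice to measure the gap as a fraction of Real Time are all assumptions, since the text does not define exactly how the percentage is computed. The timings in the first call are made-up illustrative values; the second call uses the sample DATA step from this article.

```python
def classify_step(real, user_cpu, system_cpu, close=0.15, far=0.50):
    """Compare elapsed time with total CPU time for one step.

    The gap is measured as a fraction of real time; that choice, and the
    labels returned, are assumptions made for this sketch.
    """
    total_cpu = user_cpu + system_cpu
    if real <= 0:
        return "too short to judge"
    gap = abs(real - total_cpu) / real
    if gap <= close:
        return "likely CPU bound"
    if gap >= far:
        return "CPU is waiting: examine memory and I/O"
    return "inconclusive"


# Illustrative timings (not from a real log): CPU accounts for 95% of elapsed time.
print(classify_step(real=10.0, user_cpu=8.5, system_cpu=1.0))

# The sample DATA step: 0.02 s of CPU against 0.06 s elapsed.
print(classify_step(real=0.06, user_cpu=0.02, system_cpu=0.00))
```

A single step, especially one as short as the 0.06-second sample, proves little on its own. The rule of thumb is meant for gaps that show up routinely, so in practice the classification would be applied across many steps and runs before drawing conclusions.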
Other valuable information from FULLSTIMER can be gained by looking at the other statistics:
- Memory. If a sizeable quantity of memory is used and your elapsed time differs greatly from your total CPU time, you may also want to take a close look at your memory using host or third-party tools that are mentioned in the next section.
- Involuntary Context Switches. If Involuntary Context Switches are consistently high across many steps and jobs over long time-periods, then your CPU system is under a heavy load, and you will want to examine that more closely with the tools mentioned in the next section.
- Page Swaps. If Page Swaps are consistently high then your memory system is being stressed, and needs more examination.
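The word "consistently" matters in the guidance on Involuntary Context Switches and Page Swaps: one busy step is not a finding. A rough way to express that in Python, once the counts from many steps have been collected, is sketched below. The text gives no numeric cut-offs, so the `threshold` and `min_fraction` parameters, the function name `consistently_high`, and the per-step counts are all hypothetical values chosen only to show the mechanics.

```python
def consistently_high(values, threshold, min_fraction=0.8):
    """True when at least min_fraction of the observed values exceed threshold.

    Both threshold and min_fraction are hypothetical tuning knobs; suitable
    numbers depend on the host and the workload.
    """
    if not values:
        return False
    high = sum(1 for v in values if v > threshold)
    return high / len(values) >= min_fraction


# Hypothetical per-step counts gathered from many FULLSTIMER blocks.
involuntary_switches = [950, 1200, 40, 1100, 1300]
page_swaps = [0, 0, 3, 0, 0]

print(consistently_high(involuntary_switches, threshold=500))
print(consistently_high(page_swaps, threshold=1))
```

Requiring a fraction of steps to exceed the threshold, rather than looking at the maximum, keeps a single outlier step from triggering a deeper investigation.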
Other statistics, such as Block Input and Output Operations, Page Faults and Reclaims, and Voluntary Context Switches, can hint at issues, but they require corroboration from the measures previously discussed before they can justify narrowing down an investigation. These measures could be high in and of themselves without being a symptom of performance problems.
Once FULLSTIMER statistics have been examined, they should help indicate which area(s) should be examined in more detail. It is often the case on overloaded systems that multiple areas present themselves for examination. The FULLSTIMER activity should help point to tools that could be used to get a more detailed picture of any hardware/file system issues. This comprises our next step, detecting performance issues at the host server system level.
_____________________________________________________________________
Dan Strickland
Inland Fisheries Division
Texas Parks and Wildlife
3407-A S.Chadbourne Street
San Angelo, TX 76903
Phone: 512-666-4546