3.0 Analyzing I/O-Bound Code
I/O optimization can offer significant savings in elapsed time. I/O optimization
focuses on the files on which a code spends the most processing time.
In the analysis phase, you determined that your code is I/O bound. After determining
which files consume the most wall-clock (elapsed) time for the code, you can begin
evaluating ways to make the code run more efficiently.
Procedure 8: Optimizing I/O performance
I/O optimization incorporates the following general steps:
- Identify files using large amounts of processing time.
- Evaluate possible I/O alternatives.
- Restructure I/O.
- Check your answers (see Procedure 10, Step 4).
- Return to the initial analysis phase to see if you are satisfied with the performance improvement brought about by applying a technique.
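The "check your answers" step can be approximated with a small comparison script. The following Python sketch is not part of the Cray tools. The names sum_in_order and answers_match, and the tolerance value, are assumptions chosen for illustration. It shows how changing the order of floating-point operations can change a result, and one way to compare results produced before and after an optimization.

```python
import math


def sum_in_order(values):
    """Add the values strictly left to right, as a simple loop would."""
    total = 0.0
    for value in values:
        total += value
    return total


def answers_match(baseline, optimized, rel_tol=1e-12):
    """Return True if both result lists have equal length and agree pairwise.

    rel_tol is an assumed tolerance; pick one that suits your application.
    """
    if len(baseline) != len(optimized):
        return False
    return all(
        math.isclose(b, o, rel_tol=rel_tol)
        for b, o in zip(baseline, optimized)
    )


if __name__ == "__main__":
    # The same three numbers, added in two different orders.
    original_order = [1e17, 1.0, -1e17]
    reordered = [1e17, -1e17, 1.0]
    before = [sum_in_order(original_order)]
    after = [sum_in_order(reordered)]
    print("before:", before, "after:", after)
    print("answers match:", answers_match(before, after))
```

If the comparison fails, the restructured code either changed the algorithm or exposed a numeric sensitivity. Both cases call for a closer look before you keep the optimization.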
3.1 Understanding I/O
I/O can be represented as a series of layers of data movement. Each layer involves
some processing. Figure 10 shows the typical output flow from a Cray Research system to disk.
Figure 10. I/O layers
During output, data moves from the user space to a library buffer, where small chunks of data are collected into larger, more efficient chunks. When the library buffer is full, a system request is made and the kernel moves the data to a system buffer. From there, the data is sent, perhaps through logical device cache (ldcache), to the device. During input, the path is reversed.
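The effect of the library buffer can be illustrated with a short Python sketch. It is an analogy only and does not model the Cray I/O library: the CountingRawFile class is a hypothetical stand-in for the system side that counts the write requests it receives, and Python's standard io.BufferedWriter plays the role of the library buffer. The record size, record count, and buffer size are arbitrary illustrative values.

```python
import io


class CountingRawFile(io.RawIOBase):
    """Hypothetical stand-in for the system layer: counts write requests."""

    def __init__(self):
        super().__init__()
        self.requests = 0
        self.bytes_written = 0

    def writable(self):
        return True

    def write(self, data):
        # Each call here represents one request passed down from the library.
        self.requests += 1
        self.bytes_written += len(data)
        return len(data)


def count_requests(record_size, n_records, buffer_size):
    """Write n_records small records through a buffer; return (requests, bytes)."""
    raw = CountingRawFile()
    buffered = io.BufferedWriter(raw, buffer_size=buffer_size)
    record = b"x" * record_size
    for _ in range(n_records):
        buffered.write(record)
    buffered.flush()  # push any remaining buffered data down to the raw layer
    return raw.requests, raw.bytes_written


if __name__ == "__main__":
    requests, total = count_requests(record_size=64, n_records=1024, buffer_size=4096)
    print(f"{total} bytes reached the raw layer in {requests} requests")
```

Without the buffer, each of the 1024 small writes would be a separate request. With it, the same bytes arrive in far fewer, larger requests.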
3.2 Using procview to identify heavy use of processing time
The procstat and procview utilities are the primary tools for I/O optimization. These utilities provide complete I/O information on a file-by-file basis. The procstat utility gathers process-level I/O and memory statistics on each file that is created within a process. A process is any executable entity to which a process identifier (PID) is assigned, such as a Fortran program or a UNICOS command.
Procedure 9: Identifying files that use large amounts of processing time
To identify files that use large amounts of processing time, perform the following steps:
- Create a procview report. For instructions, see Procedure 5.
- From the main procview window, create a long report by selecting the following menu choice: Reports, Files -> All User Files (Long Report). Then choose an option for sorting the file information, either Bytes processed or I/O wait time, to sort the files by the amount of processing time they use.
- Look at the Wait Time column for each file and determine the files in which a significant amount of
real time is spent on I/O statements. These files are using a large amount of processing time and
should be a focus for I/O optimization. Note that I/O wait time might be much higher for jobs in a
multiuser system than for those running on a dedicated system.
- To determine the optimization techniques to apply to the program's I/O, go to Section 3.3.
Figure 11 displays a sample procview long file report for one file. It is called a Procstat File Report because it shows statistics generated by the procstat utility.
Figure 11. Sample procview report
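If you copy the per-file figures out of the report by hand, the ranking in Procedure 9 can be sketched in a few lines of Python. This sketch does not read procstat or procview output. The FileStats record, its field names, and the 10 percent cutoff for "significant" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical per-file record, filled in by hand from the report."""
    name: str
    bytes_processed: int
    io_wait_seconds: float


def rank_by_wait_time(files, elapsed_seconds, min_share=0.10):
    """Return (name, share of elapsed time) pairs, largest wait time first.

    Files whose I/O wait time is below min_share of the elapsed time are
    dropped. The 10 percent default is an assumed cutoff, not a Cray rule.
    """
    ranked = sorted(files, key=lambda f: f.io_wait_seconds, reverse=True)
    return [
        (f.name, f.io_wait_seconds / elapsed_seconds)
        for f in ranked
        if f.io_wait_seconds / elapsed_seconds >= min_share
    ]
```

For example, calling rank_by_wait_time with three files that waited 5, 40, and 20 seconds in a 100-second run would return the 40-second and 20-second files, in that order. Because wait time can be inflated on a multiuser system, treat the ranking as a guide to where to look first.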
3.3 Evaluating I/O alternatives
Use the following procedure to determine why the files you identified use the largest amounts of processing time and to determine the optimization techniques to apply.
Procedure 10: Determining optimization techniques
- Use the procview long report you created in Procedure 9, and your knowledge of your program to answer the following questions about the I/O-bound code.
- Does the code use formatted I/O, including namelist and list-directed I/O? If so, the code might be using large amounts of user CPU time because the process must spend time in the library buffer translating between internal representation and ASCII characters. Use the formatted I/O techniques described in Section 4.1.
- Is the average request size of sequential unformatted I/O requests (shown in the Avg Bytes Per Call column) larger than 1 Mword (8 Mbyte)? If so, use the techniques described in Section 4.2.
- Is the average request size of sequential unformatted I/O requests (shown in the Avg Bytes Per Call column) 1 Mword (8 Mbyte) or smaller? If so, use the techniques described in Section 4.3.
- Does the code read or write nonsequentially? If so, use the techniques described in Section 4.4.
- Does the code read or write asynchronously? If so, use the techniques described in Section 4.5.
- Are alternative storage devices (SSDs, disk arrays, and so on) available for the code to use? If so, for information on optimal use of storage devices, see Section 4.6.
- Can you reduce the number of system calls related to the I/O process? Reducing system calls improves I/O efficiency. For ways to reduce the number of system calls in code, see Section 4.7.
- After applying any optimization technique, check your answers. Some optimization techniques may alter the order of operations in the code and might change the outcome. If the answers are not the same as those you received before applying the optimization technique, one of the following things is occurring:
- In applying the technique, you intended to change the code, but might have inadvertently changed the algorithm. If this is the case, revert to the code as it was written before applying the optimization technique. You might be able to either apply the technique differently so that it does not change the algorithm, or apply a different technique.
- The technique you applied might have exposed a numeric sensitivity in the code. If so, either revert to the code as it was written before applying the optimization technique, or try to remove the numeric instability.
- Return to the initial analysis phase to see if you are satisfied with the performance improvement brought about by applying a technique (see Section 1).
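The questions in Procedure 10 amount to a lookup from the characteristics of a file's I/O to the sections of this guide that apply. The Python sketch below restates that lookup. The IOProfile record and its field names are hypothetical. The threshold assumes 8-byte (64-bit) words and a binary megaword of 2**20 words, which is one reading of "1 Mword (8 Mbyte)".

```python
from dataclasses import dataclass

WORD_BYTES = 8                       # assumed 64-bit word
MWORD_BYTES = 2**20 * WORD_BYTES     # 1 Mword = 8 Mbyte under that assumption


@dataclass
class IOProfile:
    """Hypothetical summary of one file's I/O, answered by the programmer."""
    formatted: bool = False
    sequential_unformatted: bool = False
    avg_bytes_per_call: int = 0      # from the Avg Bytes Per Call column
    nonsequential: bool = False
    asynchronous: bool = False
    alternative_devices: bool = False
    can_reduce_system_calls: bool = False


def sections_to_read(profile):
    """Return the section numbers that Procedure 10 points to for this file."""
    sections = []
    if profile.formatted:
        sections.append("4.1")
    if profile.sequential_unformatted:
        # Larger than 1 Mword goes to 4.2; 1 Mword or smaller goes to 4.3.
        if profile.avg_bytes_per_call > MWORD_BYTES:
            sections.append("4.2")
        else:
            sections.append("4.3")
    if profile.nonsequential:
        sections.append("4.4")
    if profile.asynchronous:
        sections.append("4.5")
    if profile.alternative_devices:
        sections.append("4.6")
    if profile.can_reduce_system_calls:
        sections.append("4.7")
    return sections
```

More than one section can apply to the same file, so the function returns a list rather than a single answer.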