Read Oracle RMAN 11g Backup and Recovery Online
Authors: Robert Freeman
Chapter 18: Performance Tuning RMAN Backup and Recovery Operations
NOTE
If you have your data striped on a large number of disks, you will not need to multiplex your backups, and you can set maxopenfiles to a value of 1. If you are striped across a smaller set of disks, consider setting maxopenfiles to a value between 4 and 8. If you do not stripe your data at all, maxopenfiles generally should be set to some value greater than 8.
Multiplexing, and the establishment of the filesperset and maxopenfiles parameters, can have a significant impact (good and bad) on the performance of your backups. Tuning RMAN multiplexing can decrease the overall time of your backups, as long as your system is capable of the parallel operations that occur during multiplexing. As with most things, too much of a good thing is too much, and certainly you can overparallelize your backups such that the system is overworked. In this case, you will quickly see the performance of your system diminish and your backup times increase.
Multiplexing can also have an impact on tape operations. Since tape systems are streaming devices, it’s important to keep the flow of data streaming to the device at a rate that allows it to continue to write without needing to pause. Generally, once a tape has a delay in the output data stream, the tape device will have to stop and reposition the write head before the next write can occur. This can result in significant delays in the overall performance of your backups. By setting filesperset high and maxopenfiles low, you can tune your backup so that it streams to your tape device as efficiently as possible. Beware, of course, of overdoing it and bogging down your system so much that the I/O channels or CPU can’t keep up with the flow of data that RMAN is providing. As always, finding the proper balance takes some patient tuning and monitoring.
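One way to check whether a given combination of filesperset and maxopenfiles keeps the output device streaming is to look at the throughput that RMAN records for each backup. The query below is a minimal sketch, assuming the channels use asynchronous I/O so that rows appear in V$BACKUP_ASYNC_IO; the column alias mb_per_sec is illustrative.

```sql
-- Throughput recorded for output files and for the aggregate rows.
-- MAXOPENFILES is populated only on rows where TYPE = 'AGGREGATE'.
select type,
       status,
       maxopenfiles,
       buffer_size,
       buffer_count,
       round(effective_bytes_per_second / 1024 / 1024, 1) as mb_per_sec
from   v$backup_async_io
where  type in ('AGGREGATE', 'OUTPUT')
order  by open_time;
```

If mb_per_sec for the OUTPUT rows sits well below the rated speed of the tape drive, the drive is probably not streaming, and the multiplexing settings are worth revisiting.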
Controlling the Overall Impact of RMAN Operations
Sometimes you want to tune RMAN down rather than up. Prior to Oracle Database 10g, you would use the RMAN parameters rate and readrate to throttle RMAN down, freeing system resources for other operations. You set these parameters when you allocated channels for RMAN operations. They are still available, but they have been superseded.
Controlling RMAN backups is now much easier. You simply use the duration parameter in the backup command to control how long the backup runs. The duration parameter has an additional keyword, minimize load, that tells RMAN to minimize the I/O load required to back up the database over the given duration. For example, if the backup typically takes five hours and consumes 90 percent of the available I/O, you can tell RMAN to use a duration of ten hours. Combined with minimize load, you might well expect to see only 45 to 50 percent of available I/O consumed, rather than 90 percent, because the same amount of data is read over twice the time. The negative side of this is, of course, that your backup will take longer. Here is an example of using the duration parameter when starting a backup; in this case, we want the backup to run ten hours:
backup as copy duration 10:00 minimize load database;
Of course, one problem with the use of the duration parameter is that the backup might need more than ten hours, in which case RMAN stops the backup when the limit is reached. Any completed backup set can be used for recovery, even if the overall backup process fails due to duration issues. You can use the partial keyword to suppress RMAN errors in the event that the duration limit is exceeded and the backup fails.
Part III: Using RMAN Effectively
One final thing to note about the use of the duration parameter is that the database files with the oldest backups will be given priority over files that have newer RMAN backup dates. Thus, if the backup of a database with 20 datafiles fails after 10 are backed up, the next time the backup runs, the 10 that were not backed up would get first priority.
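To get a rough idea of which datafiles would be first in line, you can list each file with the completion time of its most recent backup. This is a sketch under stated assumptions: it reads V$BACKUP_DATAFILE, which covers backup sets only (image copies such as those made with backup as copy are recorded in V$DATAFILE_COPY instead), and it only approximates the ordering RMAN applies internally.

```sql
-- Datafiles ordered by their most recent backup-set completion time.
-- Files that have never been backed up to a backup set sort first.
select d.file#,
       d.name,
       max(b.completion_time) as last_backup
from   v$datafile d
       left join v$backup_datafile b
              on b.file# = d.file#
group  by d.file#, d.name
order  by last_backup nulls first;
```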
Tune the MML Layer
As you will recall from earlier chapters, the MML is an API that Oracle provides to interface with the software of various vendors who provide external backup solutions (such as tape devices). Each component of RMAN may require some tuning effort, including the MML layer.
You need to consider a number of things with regard to the MML backup devices. Most will be running in asynchronous mode; if yours are not, that may be a major cause of your problems.
Sometimes DBAs will set the rate parameter when they allocate a channel for backups. This is generally something you do not want to do, because it creates an artificial performance bottleneck.
Some of the MML vendors also provide tunable parameters, such as a configurable buffer size, that you set in vendor-specific parameter files. Look into the tuning possibilities that these parameter files offer you.
There are other factors related to the MML layer, such as the supported transfer rate of the backup device you are using, compression, streaming, and the block size. You must analyze all of these factors in an overall effort to tune the performance of your RMAN backups.
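A quick way to see whether your backup I/O is running asynchronously is to check which of the two backup I/O views is being populated. The following is a minimal sketch; the io_mode label and the other column aliases are illustrative, and the device type values you see depend on your configuration.

```sql
-- Rows in V$BACKUP_ASYNC_IO mean the file was handled with asynchronous I/O;
-- rows in V$BACKUP_SYNC_IO mean synchronous I/O was used.
select 'ASYNC' as io_mode,
       device_type,
       type,
       count(*) as files,
       round(avg(effective_bytes_per_second) / 1024 / 1024, 1) as avg_mb_per_sec
from   v$backup_async_io
group  by device_type, type
union all
select 'SYNC',
       device_type,
       type,
       count(*),
       round(avg(effective_bytes_per_second) / 1024 / 1024, 1)
from   v$backup_sync_io
group  by device_type, type;
```

If tape (SBT_TAPE) rows show up only under SYNC, the media manager channel is not using asynchronous I/O, which is worth investigating before tuning anything else.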
Identifying Database–Related RMAN Issues
In many ways, RMAN tuning is a lot like SQL tuning. RMAN uses the Oracle Database in much the same way that SQL statements do: it uses the buffer cache, issues dynamic SQL calls, and calls stored PL/SQL packages. The effects of these operations, such as timed wait events, show up in the statistics that the Oracle Database generates. As a result, several views are available to help give you some idea as to the kinds of problems you might be encountering and the source of those problems.
A number of views are useful for RMAN performance tuning. This book isn’t a tuning book, but we can provide a few RMAN-specific insights. Some views you might be interested in with respect to RMAN tuning would include
■ V$RMAN_BACKUP_JOB_DETAILS
■ V$ACTIVE_SESSION_HISTORY
■ V$SESSION
■ V$PROCESS
■ V$SESSION_LONGOPS
■ V$BACKUP_ASYNC_IO
■ V$BACKUP_SYNC_IO
There are a number of different potential sources for performance problems. When you are performing read operations with RMAN, such as reading the control file, these components can be involved in the performance issues:
■ Control file: RMAN frequently needs to read the control file for RMAN metadata. If the control file is experiencing slow I/O (perhaps due to slow disk response times), this can slow down RMAN operations. In the past, certain RMAN-related bugs have also caused performance problems related to the database control file, so it’s important to get Oracle involved if you experience unexplainable performance problems involving the control file.
■ Recovery catalog: If you are using the recovery catalog, RMAN will frequently access the catalog to read the RMAN metadata. If the recovery catalog is experiencing slow I/O, this can slow down RMAN operations. Keep in mind that the recovery catalog is often a separate database from the database you are backing up. Thus, performance problems can be a result of the recovery catalog database or of the database being backed up. When tuning RMAN performance problems, then, you will need to look at the statistics of both the database being backed up and the recovery catalog database. In the past, certain RMAN-related bugs have also caused performance problems related to the recovery catalog, so it’s important to get Oracle involved if you experience unexplainable performance problems involving the recovery catalog.
■ Reading memory buffers: As with any other database operation, memory is important. The SGA should be properly configured. Typically, memory issues on the target database will surface as problems beyond RMAN. Memory issues on the recovery catalog database can cause significant performance issues.
■ Reading database blocks: RMAN must read database blocks either from memory or from disk. If the disks are not sufficiently responsive, then I/O rates will suffer. This can cause performance impacts on RMAN operations. Likewise, if the SGA is too small, you will end up reading blocks from disk more frequently, so this can have an impact on RMAN performance.
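When slow disk reads are the suspect, the per-file wait counters in V$BACKUP_ASYNC_IO can help narrow down which datafile is holding the backup back. The query below is a sketch, assuming asynchronous I/O is in use; the long_wait_ratio alias is illustrative. A common rule of thumb is that the input file with the highest ratio of long waits to I/O requests is the likeliest bottleneck.

```sql
-- Input files ranked by the share of I/O requests that hit a long wait.
select filename,
       io_count,
       short_waits,
       long_waits,
       round(long_waits / nullif(io_count, 0), 3) as long_wait_ratio
from   v$backup_async_io
where  type = 'INPUT'
order  by long_wait_ratio desc nulls last;
```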
Write operations (such as a datafile restore) can also cause performance issues. Many of the same components can be involved in write operations as with read operations. Database components that can be involved in write-related performance issues include
■ Control file
■ Recovery catalog
■ Writing to tape/disk: As with reading from tape or disk, the I/O rates that you can achieve can make a difference with respect to performance.
■ Writing to memory buffers
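To see which of these components an RMAN session is actually waiting on, you can summarize recent wait events from V$ACTIVE_SESSION_HISTORY. This is a minimal sketch: the one-hour window and the match on the program name are assumptions you should adjust, and querying this view requires the Oracle Diagnostics Pack license.

```sql
-- Wait events sampled for RMAN sessions over the last hour, most frequent first.
select event,
       count(*) as samples
from   v$active_session_history
where  upper(program) like '%RMAN%'
and    session_state = 'WAITING'
and    sample_time > sysdate - 1/24
group  by event
order  by samples desc;
```

Events related to the control file, or to reading and writing backup pieces, that appear near the top of the list point to the corresponding component in the lists above.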
We mentioned several views in this section that can be used to monitor and tune RMAN performance. Using these views, you can determine how well the database, RMAN, and the MML are performing. You can also use these views to determine how long a backup or recovery process has taken and how much longer you can expect it to take. Let’s look at these views and how you can best use them.
V$RMAN_BACKUP_JOB_DETAILS
The V$RMAN_BACKUP_JOB_DETAILS view provides some insight into each backup that occurs in your database. The view provides details on the start time, stop time, elapsed time, and bytes associated with each RMAN backup operation. Here is an example of a query against V$RMAN_BACKUP_JOB_DETAILS:
-- Completed full database backups from the last 180 days
select /*+ RULE */ session_key, session_recid,
       start_time, end_time, output_bytes, elapsed_seconds
from v$rman_backup_job_details
where start_time > sysdate-180
and status = 'COMPLETED'
and input_type = 'DB FULL';
SESSION_KEY SESSION_RECID START_TIM END_TIME  OUTPUT_BYTES ELAPSED_SECONDS
----------- ------------- --------- --------- ------------ ---------------
        456           456 11-JAN-09 11-JAN-09    228353024             397
        461           461 12-JAN-09 12-JAN-09    229755904             422
In this example, we see that two backup operations have successfully executed. Over time, we would be very interested in how long the backups were taking and whether the trend is increasing. By looking at the trends with respect to backup execution times, we can address problems before they actually become problems.
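To make the trend easier to spot, the same view can report throughput and the change in elapsed time from one backup to the next. The following is a sketch only; the 180-day window and the aliases output_mb, mb_per_sec, and delta_seconds are illustrative choices.

```sql
-- Per-backup throughput and change in elapsed time versus the previous run.
select session_key,
       start_time,
       elapsed_seconds,
       round(output_bytes / 1024 / 1024) as output_mb,
       round(output_bytes / 1024 / 1024 / nullif(elapsed_seconds, 0), 2) as mb_per_sec,
       elapsed_seconds
         - lag(elapsed_seconds) over (order by start_time) as delta_seconds
from   v$rman_backup_job_details
where  start_time > sysdate - 180
and    status = 'COMPLETED'
and    input_type = 'DB FULL'
order  by start_time;
```

A steadily positive delta_seconds with a flat output_mb suggests that the backup is slowing down for reasons other than database growth.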
V$SESSION_LONGOPS and V$SESSION
The V$SESSION_LONGOPS view is useful during a backup or restore operation to determine how long the operation has taken and how much longer it is expected to take to complete. Join this view to the V$SESSION view for additional information about your RMAN backup or recovery sessions. Here is an example of a join between V$SESSION_LONGOPS and V$SESSION during a database backup:
SQL> select a.sid, a.serial#, a.program, a.username, b.opname, b.time_remaining
     from v$session a, v$session_longops b
     where a.sid = b.sid
     and a.serial# = b.serial#
     and a.program like '%RMAN%'
     and b.time_remaining > 0;
 SID SERIAL# PROGRAM  USERNAME  OPNAME                     TIME_REMAINING
---- ------- -------- --------- -------------------------- --------------
  10       8 RMAN.EXE SYS       RMAN: aggregate input                1438
  14       3 RMAN.EXE SYS       RMAN: full datafile backup           7390
In this example, we have an RMAN process running a backup. The datafile backup is running in the session with SID 14. Its TIME_REMAINING value, 7390, is the estimated number of seconds until the operation completes, so this report lets you estimate when your backup will finish. Note that we also joined to V$SESSION to get some additional information on our RMAN session, such as the username and the program name.
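If you prefer a percentage to a time estimate, V$SESSION_LONGOPS also exposes the work done so far and the total work expected. The query below is a minimal sketch; the pct_complete alias is illustrative, and the filter on aggregate rows is an assumption that you want one row per individual operation.

```sql
-- Progress of each running RMAN operation as a percentage of total work.
select sid,
       serial#,
       opname,
       sofar,
       totalwork,
       round(sofar / totalwork * 100, 2) as pct_complete
from   v$session_longops
where  opname like 'RMAN%'
and    opname not like '%aggregate%'
and    totalwork <> 0
and    sofar <> totalwork;
```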