DB2 – How to manage backup failures

09 October, 2012 by Tom Collins

Taking a DB2 backup is straightforward, but in a large DB2 environment where hundreds of mission-critical databases are backed up daily, what steps are in place to deal with failed backups?

A well-managed backup strategy is one of the DBA secrets to stress reduction.

Backups can fail for many reasons: software failure, corruption, hardware issues, unavailable tapes and connectivity problems are just a few. From an operational perspective, how do you deal with a failure?

At a basic level: 1) attempt a backup; 2) if it succeeds, fine; if it fails, what next?
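The attempt-then-decide step can be sketched as a small wrapper that runs the backup once and records the outcome, leaving the "what next" decision to a later step. This is only a sketch: `attempt_backup` and `BackupResult` are hypothetical names, the command passed in is a placeholder for whatever backup script the site uses, and it assumes that script exits with 0 on success.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class BackupResult:
    database: str
    succeeded: bool
    return_code: int
    output: str


def attempt_backup(database, command):
    """Run one backup command and capture the outcome; never retries."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return BackupResult(
        database=database,
        # Assumption: the backup script signals success with exit code 0.
        succeeded=(completed.returncode == 0),
        return_code=completed.returncode,
        output=(completed.stdout + completed.stderr).strip(),
    )


# Placeholder command standing in for the site's real backup script.
demo = attempt_backup("SAMPLE", [sys.executable, "-c", "print('backup ok')"])
print(demo.succeeded, demo.return_code)
```

Because the wrapper does not retry on its own, a failure is handed on for assessment instead of being re-run blindly.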

Backups normally run overnight. In most organisations, Operations staff manage the scheduling and follow up on DB2 backup failures. How a failure is handled depends on the cause. If it's a missing tape, Operations staff can replace it; if it's a database corruption issue, escalate to the DBA.
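One way to make that split explicit is a small routing table that maps phrases found in the backup log to an owner. The keywords below are purely illustrative assumptions, not real DB2 message text; a site would build its own list from the messages it actually sees. Anything unrecognised goes to the DBA by default.

```python
# Hypothetical keyword rules. DBA rules come first so that corruption
# always takes priority over media-related wording in the same log.
ROUTING_RULES = [
    ("corrupt", "dba"),
    ("bad page", "dba"),
    ("tape", "operations"),
    ("media", "operations"),
    ("connection", "operations"),
]


def route_failure(log_text, default_owner="dba"):
    """Return who should handle a failed backup, based on its log text."""
    lowered = log_text.lower()
    for keyword, owner in ROUTING_RULES:
        if keyword in lowered:
            return owner
    # Unknown causes are escalated instead of being guessed at.
    return default_owner


print(route_failure("Tape not mounted in drive 3"))
print(route_failure("Page checksum mismatch, tablespace may be CORRUPT"))
```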

One of the challenges for Operations staff is developing a standardised approach to backups. They might be managing a wide range of platforms, such as DB2, SQL Server and flat files. How can you manage the risk while keeping Operations staff effective?

Most large, scalable backup systems have a scheduling element. Backups are scheduled according to SLAs, RPO (recovery point objective) and RTO (recovery time objective). What you don't want is Operations taking a cowboy approach and logging on to the DB2 server at any time of day, potentially creating serious contention and possible downtime. At the same time, Operations staff have to deal with backup failures.

I prefer a rescheduling approach. In other words, don’t log onto the system and issue another backup.

First, assess the reason: look at the log files and report to the subject matter experts.

Second, agree on a scheduled time. It might be immediate or later on.
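The "immediate or later on" decision can be sketched as a check against an agreed backup window. The 22:00 to 05:00 window below is an invented example, not a recommendation, and the function assumes the window spans midnight. A failure inside the window can be rescheduled straight away; one outside it is pushed to the start of the next window.

```python
from datetime import datetime, time


def in_window(moment, start, end):
    """True if moment falls inside an overnight window that spans midnight."""
    t = moment.time()
    return t >= start or t < end


def next_backup_slot(failure_time, start=time(22, 0), end=time(5, 0)):
    """Return the earliest agreed time at which the backup may be re-run."""
    if in_window(failure_time, start, end):
        return failure_time  # still inside the window: reschedule immediately
    # Outside the window means between `end` and `start` on the same day,
    # so the next window opens later that same day.
    return datetime.combine(failure_time.date(), start)


print(next_backup_slot(datetime(2012, 10, 9, 9, 0)))
print(next_backup_slot(datetime(2012, 10, 9, 23, 30)))
```

In practice the returned time would be fed to the scheduler, so the re-run is logged and monitored like any other scheduled backup.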

One question that arises: should Operations staff log directly on to the server and run the backup, bypassing the reschedule?

The questions to consider are:

1) Scheduling offers a standardised approach across multiple platforms, such as SQL Server and DB2 environments; for example, SQL Server backup failures are also rescheduled. Would this be lost?

2) Are there other logging and monitoring benefits that come with scheduling?

For example, if Operations logged on and ran the local backup script, is there a standardised way they would assess success or failure?

3) Scheduling gives you more flexibility in planning, for example for runs outside business hours. If a backup failure was discovered at 9am Monday to Friday, is it suitable for someone to log on as root and issue the backup command?
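On the question of a standardised way to assess success or failure, one option is to have every backup, scheduled or manual, on any platform, emit an outcome record in the same shape. The field names, platform labels and database names below are all hypothetical, and the sketch assumes a return code of 0 means success.

```python
import json
from datetime import datetime, timezone


def make_record(platform, database, return_code, triggered_by):
    """Build one outcome record in the same shape for every platform."""
    return {
        "platform": platform,          # e.g. "db2" or "sqlserver"
        "database": database,
        # Assumption: return code 0 means the backup succeeded.
        "status": "success" if return_code == 0 else "failure",
        "return_code": return_code,
        "triggered_by": triggered_by,  # "scheduler" or "manual"
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def failures(records):
    """Return only the records that need follow-up."""
    return [r for r in records if r["status"] == "failure"]


# Invented sample outcomes for illustration only.
records = [
    make_record("db2", "SALES", 0, "scheduler"),
    make_record("sqlserver", "HR", 1, "scheduler"),
    make_record("db2", "SALES", 4, "manual"),
]
for record in failures(records):
    print(json.dumps(record))
```

With a shared record, a manual run is assessed in the same way as a scheduled one, and the `triggered_by` field shows how often the schedule is being bypassed.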

Author: Rambler (http://www.dba-db2.com)

