Oracle - Using archive log to roll forward a database in a snapshot
We are using Nutanix machines and want to use Application-Consistent (AC) / Crash-Consistent (CC) snapshots for DR.
We tested a CC snapshot; the database started up without any block corruption.
But compared with AC, which requires putting the database into backup mode, CC does not look as trustworthy.
So our plan is to take AC and CC snapshots alternately: AC once a day, and CC every few minutes.
If the CC snapshot cannot be brought up, we plan to bring up the latest AC snapshot, copy the archived logs from the latest CC snapshot to it, and use them to roll the database forward.
An AC snapshot is equivalent to a copy taken while the database is in hot backup mode.
We use it as the base for testing. When we try to open the database, an error is reported, so we use RMAN for recovery.
SQL> startup;
ORACLE instance started.

Total System Global Area 1073741824 bytes
Fixed Size                  8801008 bytes
Variable Size             373294352 bytes
Database Buffers          683671552 bytes
Redo Buffers                7974912 bytes
Database mounted.
ORA-10873: file 1 needs to be either taken out of backup mode or media recovered
ORA-01110: data file 1: '/home/oracle/oradata/O12201/datafile/o1_mf_system_g76xpjr1_.dbf'

RMAN> catalog recovery area;
RMAN> recover database;

Starting recover at 14-MAR-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=258 device type=DISK

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 14-MAR-19
We can see that RMAN does not apply any archived log.
When we try to use "recover database" in SQL*Plus:
SQL> recover database using backup controlfile until cancel;
ORA-00279: change 1914225 generated at 03/14/2019 21:49:02 needed for thread 1
ORA-00289: suggestion :
/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_14/o1_mf_1_6_g8p1gtxv_.arc
ORA-00280: change 1914225 for thread 1 is in sequence #6

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00328: archived log ends at change 1814349, need later change 1914225
ORA-00334: archived log: '/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_14/o1_mf_1_6_g8p1gtxv_.arc'
An error is reported and no archived log is applied. The recovery process needs the redo record at SCN 1914225.
But the highest SCN available in the archived logs is lower than that:
 SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ------------- ------------
         3       1814192      1814196
         4       1814196      1814200
         5       1814200      1814204
         6       1814204      1814350
         7       1814350      1814354
         8       1814354      1814358
         9       1814358      1814362
        10       1814362      1814369
Then we re-copied the database backup and ran "recover database using backup controlfile until cancel" in SQL*Plus directly.
This time there is no error and the database can be opened.
SQL> recover database using backup controlfile until cancel;
Let us go back to the previous situation.
The RMAN recover used the online redo logs to roll the copy forward. After rolling forward, we can see in the data file header that the checkpoint count changed from 299 to 300 and the checkpoint SCN was updated from 0x1baec1 to 0x1d3571:
DATA FILE #1:                                                     DATA FILE #1:
 name #6: /home/oracle/oradata/O12201/datafile/o1_mf_system_       name #6: /home/oracle/oradata/O12201/datafile/o1_mf_system_
creation size=0 block size=8192 status=0xe flg=0x1 head=6 tai   | creation size=0 block size=8192 status=0x2e flg=0x1 head=6 ta
 pdb_id 1, tablespace 0, index=2 krfil=1 prev_file_in_ts=0 pr      pdb_id 1, tablespace 0, index=2 krfil=1 prev_file_in_ts=0 pr
 unrecoverable scn: 0x0000000000000000 01/01/1988 00:00:00         unrecoverable scn: 0x0000000000000000 01/01/1988 00:00:00
 Checkpoint cnt:299 scn: 0x00000000001baec1 03/14/2019 21:49:    | Checkpoint cnt:300 scn: 0x00000000001d3571 03/14/2019 21:49:
 Stop scn: 0xffffffffffffffff 03/14/2019 17:19:19                | Stop scn: 0x00000000001d3571 03/14/2019 21:49:43
 Creation Checkpointed at scn:  0x0000000000000007 01/26/2017      Creation Checkpointed at scn:  0x0000000000000007 01/26/2017
 thread:0 rba:(0x0.0.0)                                            thread:0 rba:(0x0.0.0)
Checkpoint cnt:300 scn: 0x00000000001d3571 03/14/2019 21:49:
0x00000000001d3571 = 1914225 in decimal, which is exactly the SCN in the error reported above.
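To double-check the hex-to-decimal conversion, a quick SQL*Plus query with the 'X' format mask can be used (this is just a verification step I added, not part of the original test):

SQL> select to_number('1d3571', 'xxxxxx') as scn from dual;

       SCN
----------
   1914225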
After opening the database with RESETLOGS, we can see the SCN range of the archived logs jumps a lot:
 SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ------------- ------------
         2       1814188      1814192
         3       1814192      1814196
         4       1814196      1814200
         5       1814200      1814204
         6       1814204      2014227
In the normal situation, when we back up the database with OS cp after "begin backup", we do not copy the online redo logs, because they are useless for that kind of backup.
In this situation, however, because the whole machine is restored from a snapshot, the online redo logs are there.
It would be better to remove them before recovery.
[oracle@testenv oradata]$ cd O12201/onlinelog/
[oracle@testenv onlinelog]$ ls -ltr
total 614412
-rw-r-----. 1 oracle oinstall 209715712 Mar 15 00:06 o1_mf_1_g76xsrcz_.log
-rw-r-----. 1 oracle oinstall 209715712 Mar 15 00:06 o1_mf_2_g76xsrfx_.log
-rw-r-----. 1 oracle oinstall 209715712 Mar 15 00:06 o1_mf_3_g76xsrj8_.log
[oracle@testenv onlinelog]$ rm *

RMAN> catalog recovery area;
RMAN> recover database;

Starting recover at 15-MAR-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=237 device type=DISK

starting media recovery
media recovery failed
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 03/15/2019 00:08:49
ORA-00283: recovery session canceled due to errors
RMAN-11003: failure during parse/execution of SQL statement: alter database recover if needed start
ORA-00283: recovery session canceled due to errors
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/home/oracle/oradata/O12201/onlinelog/o1_mf_3_g76xsrj8_.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7

SQL> recover database using backup controlfile until cancel;
ORA-00279: change 1814209 generated at 03/14/2019 21:49:02 needed for thread 1
ORA-00289: suggestion :
/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_14/o1_mf_1_6_g8p1gtxv_.arc
ORA-00280: change 1814209 for thread 1 is in sequence #6

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: change 1814350 generated at 03/14/2019 21:54:34 needed for thread 1
ORA-00289: suggestion :
/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_14/o1_mf_1_7_g8p1gv1j_.arc
ORA-00280: change 1814350 for thread 1 is in sequence #7
ORA-00278: log file '/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_14/o1_mf_1_6_g8p1gtxv_.arc' no longer needed for this recovery
......
ORA-00308: cannot open archived log '/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_15/o1_mf_1_11_%u_.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
If the backup is a cold backup, this problem does not exist.
But the RMAN recover still does not apply the archived logs; we have to use SQL*Plus recovery:
SQL> startup mount;
ORACLE instance started.

Total System Global Area 1073741824 bytes
Fixed Size                  8801008 bytes
Variable Size             373294352 bytes
Database Buffers          683671552 bytes
Redo Buffers                7974912 bytes
Database mounted.

RMAN> catalog recovery area;
RMAN> recover database;

Starting recover at 14-MAR-19
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 14-MAR-19

SQL> recover database using backup controlfile until cancel;
ORA-00279: change 1813573 generated at 03/14/2019 17:18:06 needed for thread 1
ORA-00289: suggestion :
/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_14/o1_mf_1_1_g8okfwoq_.arc
ORA-00280: change 1813573 for thread 1 is in sequence #1

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: change 1813666 generated at 03/14/2019 17:21:00 needed for thread 1
ORA-00289: suggestion :
/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_14/o1_mf_1_2_g8okfwtc_.arc
ORA-00280: change 1813666 for thread 1 is in sequence #2
ORA-00278: log file '/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_14/o1_mf_1_1_g8okfwoq_.arc' no longer needed for this recovery
......
ORA-00308: cannot open archived log '/home/oracle/fast_recovery_area/O12201/O12201/archivelog/2019_03_14/o1_mf_1_6_%u_.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7

SQL> alter database open resetlogs;

Database altered.
Oracle - RMAN Backup Strategy with FRA and Tape
Now we are planning to move our backups to AWS. The old backup strategy does not fit this situation.
We cannot simply change DISK to SBT_TAPE, because we also want to keep a copy of the backups on disk.
I decided:
- Use the FRA to store backups locally for the recovery window period.
- Back up the FRA to AWS every day after the daily backup.
- For monthly/weekly backups, back up to AWS directly.
- Back up archived logs to AWS directly.
- Use the FRA to obsolete local backups and archived logs.
- Use a recovery window to obsolete the backups on disk.
- The recovery window is set to 7 days for disk, and the script uses 35 days for tape.
RMAN configuration:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 7 DAYS;
CONFIGURE BACKUP OPTIMIZATION ON;
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE ARCHIVELOG DELETION POLICY TO BACKED UP 1 TIMES TO 'SBT_TAPE';
Never configure any default channel attribute for SBT_TAPE, except for the archivelog deletion policy.
This helps protect you from the mistake of deleting everything, including the pieces on tape.
###########################################################
# backup_keep_to_tape.rman
###########################################################
# 1 DATE
# 2 MONTHLY/WEEKLY
# 3 KEEP PERIOD
run{
  allocate channel ch01 device type SBT_TAPE parms 'ENV=(ob_media_family=rman-testenv)';
  backup incremental level 0 as compressed backupset filesperset=10 tag='&1_LEV0_&2' database keep until time 'sysdate+&3';
  release channel ch01;
}

###########################################################
# backup_database_to_disk.rman
###########################################################
# 1 DATE
# 2 LEVEL 0/1/2
# 3 "CUMULATIVE"/""
run{
  allocate channel ch01 device type disk;
  backup incremental level &2 &3 as compressed backupset filesperset=10 tag='&1_LEV&2_DAILY' database include current controlfile;
  release channel ch01;
}

###########################################################
# backup_fra_to_tape.rman
###########################################################
run{
  allocate channel ch01 device type SBT_TAPE parms 'ENV=(ob_media_family=rman-testenv)';
  backup filesperset=10 recovery area;
  release channel ch01;
}

###########################################################
# backup_archive_to_tape.rman
###########################################################
run{
  allocate channel ch01 device type SBT_TAPE parms 'ENV=(ob_media_family=rman-testenv)';
  backup as compressed backupset filesperset=10 archivelog all not backed up 1 times;
  release channel ch01;
}

###########################################################
# backup_daily.rman
###########################################################
# 1 DATE
# 2 LEVEL 0/1/2
# 3 "CUMULATIVE"/""
@backup_archive_to_tape.rman
@backup_database_to_disk.rman &1 &2 "&3"
@backup_fra_to_tape.rman
@backup_archive_to_tape.rman

###########################################################
# backup_release_fra.rman
###########################################################
run{
  allocate channel ch01 device type SBT_TAPE parms 'ENV=(ob_media_family=rman-testenv)';
  backup filesperset=10 recovery area delete input;
  release channel ch01;
}

###########################################################
# obsolete_fra.rman
###########################################################
run{
  allocate channel ch01 device type DISK;
  crosscheck backup DEVICE TYPE DISK;
  crosscheck copy DEVICE TYPE DISK;
  delete noprompt obsolete DEVICE TYPE DISK;
  release channel ch01;
}

###########################################################
# obsolete_tape.rman
###########################################################
run{
  allocate channel ch01 device type SBT_TAPE parms 'ENV=(ob_media_family=rman-testenv)';
  crosscheck backup DEVICE TYPE SBT_TAPE;
  crosscheck copy DEVICE TYPE SBT_TAPE;
  delete noprompt obsolete recovery window of 60 days DEVICE TYPE SBT_TAPE;
  release channel ch01;
}
RMAN backs up to AWS as if to a tape library, through Oracle Secure Backup (OSB), so I am using OSB + mhvtl to verify my strategy.
Install mhvtl
# yum install mt-st mtx lsscsi sg3_utils lzo-minilzo # wget http://mirror.centos.org/centos/7/os/x86_64/Packages/lzo-devel-2.06-8.el7.x86_64.rpm # rpm -ivh lzo-devel-2.06-8.el7.x86_64.rpm # wget https://sites.google.com/site/linuxvtl2/mhvtl-2016-03-10.tgz?attredirects=0 --no-check-certificate # tar xvf mhvtl-2016-03-10.tgz # groupadd vtl # useradd -g vtl vtl # cd mhvtl-1.5 # make distclean # cd kernel # make # make install # cd .. # make # make install # chown -R vtl:vtl /opt/mhvtl/ # /etc/init.d/mhvtl start ## 2 libraries, 4 tape drive for each. # lsscsi -g [0:0:0:0] disk VMware, VMware Virtual S 1.0 /dev/sda /dev/sg0 [2:0:0:0] cd/dvd NECVMWar VMware IDE CDR10 1.00 /dev/sr0 /dev/sg1 [3:0:0:0] mediumx STK L700 0105 /dev/sch0 /dev/sg5 [3:0:1:0] tape IBM ULT3580-TD5 0105 /dev/st0 /dev/sg2 [3:0:2:0] tape IBM ULT3580-TD5 0105 /dev/st1 /dev/sg3 [3:0:3:0] tape IBM ULT3580-TD4 0105 /dev/st2 /dev/sg4 [3:0:4:0] tape IBM ULT3580-TD4 0105 /dev/st3 /dev/sg6 [3:0:8:0] mediumx STK L80 0105 /dev/sch1 /dev/sg11 [3:0:9:0] tape STK T10000B 0105 /dev/st4 /dev/sg7 [3:0:10:0] tape STK T10000B 0105 /dev/st5 /dev/sg8 [3:0:11:0] tape STK T10000B 0105 /dev/st6 /dev/sg9 [3:0:12:0] tape STK T10000B 0105 /dev/st7 /dev/sg10 ## mhvtl processes, 8 for tape drives and 2 for libraries. # ps -ef|grep vtl vtl 10813 1 0 22:59 ? 00:00:00 vtltape -q 11 -v vtl 10816 1 0 22:59 ? 00:00:00 vtltape -q 12 -v vtl 10832 1 0 22:59 ? 00:00:00 vtltape -q 13 -v vtl 10837 1 0 22:59 ? 00:00:00 vtltape -q 14 -v vtl 10840 1 0 22:59 ? 00:00:00 vtltape -q 31 -v vtl 10847 1 0 22:59 ? 00:00:00 vtltape -q 32 -v vtl 10850 1 0 22:59 ? 00:00:00 vtltape -q 33 -v vtl 10858 1 0 22:59 ? 00:00:00 vtltape -q 34 -v vtl 10863 1 0 22:59 ? 00:00:00 vtllibrary -q 10 -v vtl 10867 1 0 22:59 ? 00:00:00 vtllibrary -q 30 -v root 10965 8701 0 23:00 pts/4 00:00:00 grep --color=auto vtl
Configuration Files: /etc/mhvtl
Device Files: /opt/mhvtl
The default tape size is 500 MB. To change it, modify CAPACITY in /etc/mhvtl/mhvtl.conf, then delete the existing media folders under /opt/mhvtl and restart mhvtl.
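A sketch of the resize procedure, assuming the stock mhvtl.conf layout (CAPACITY is in MB; 4096 matches the dump_tape output below):

# grep ^CAPACITY /etc/mhvtl/mhvtl.conf
CAPACITY=500
# sed -i 's/^CAPACITY=.*/CAPACITY=4096/' /etc/mhvtl/mhvtl.conf
# /etc/init.d/mhvtl stop
# rm -rf /opt/mhvtl/*        # remove the old virtual media so they are recreated with the new size
# /etc/init.d/mhvtl start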
# pwd
/opt/mhvtl
# dump_tape -f G03037TA
Media density code: 0x4a
Media type code   : 0x26
Media description : STK T10KA media
Tape Capacity     : 4294967296 (4096 MBytes)
Media type        : Normal data
Media             : read-write
Remaining Tape Capacity : 4294967296 (4096 MBytes)
Total num of filemarks: 0
Segmentation fault (core dumped)
Install Oracle Secure Backup
# mkdir -p /usr/local/oracle/backup # cd /usr/local/oracle/backup # /tmp/osb_12.2.0.1.0_linux.x64_release/setup # obtool Oracle Secure Backup 12.2.0.1.0 login: admin Password: 12345678 # define the server also as mediaserver ob> lshost testenv admin,client (via OB) in service ob> chhost --addrole mediaserver testenv ob> lshost testenv admin,mediaserver,client (via OB) in service # create a user named oracle, which is the oracle user name, and allow rman and cmdline to login. ob> lsuser admin admin ob> mkuser -p 12345678 -c oracle -U oracle -G dba -N no -e westzq@gmail.com -h *:*:*+rman+cmdline oracle ob> lsuser admin admin oracle oracle # create a media group ob> mkmf -u -C -a rman-testenv ob> lsmf -l rman-testenv rman-testenv: Keep volume set: content manages reuse Appendable: yes Volume ID used: unique to this media family UUID: cc2f973a-23af-1037-ae7d-000c293757c5 # define library and driver # lsscsi -g [1:0:0:0] cd/dvd NECVMWar VMware IDE CDR10 1.00 /dev/sr0 /dev/sg1 [2:0:0:0] disk VMware, VMware Virtual S 1.0 /dev/sda /dev/sg0 [3:0:0:0] mediumx STK L700 0105 /dev/sch0 /dev/sg10 [3:0:1:0] tape IBM ULT3580-TD5 0105 /dev/st0 /dev/sg2 [3:0:2:0] tape IBM ULT3580-TD5 0105 /dev/st2 /dev/sg4 [3:0:3:0] tape IBM ULT3580-TD4 0105 /dev/st1 /dev/sg3 [3:0:4:0] tape IBM ULT3580-TD4 0105 /dev/st4 /dev/sg6 [3:0:8:0] mediumx STK L80 0105 /dev/sch1 /dev/sg11 [3:0:9:0] tape STK T10000B 0105 /dev/st3 /dev/sg5 [3:0:10:0] tape STK T10000B 0105 /dev/st5 /dev/sg7 [3:0:11:0] tape STK T10000B 0105 /dev/st6 /dev/sg8 [3:0:12:0] tape STK T10000B 0105 /dev/st7 /dev/sg9 ob> mkdev -t library -o -a testenv:/dev/sg10 lib01 ob> mkdev -t library -o -a testenv:/dev/sg11 lib02 ob> mkdev -t tape -o -a testenv:/dev/sg2 -l lib01 -d 1 tape01 ob> mkdev -t tape -o -a testenv:/dev/sg3 -l lib01 -d 2 tape02 ob> mkdev -t tape -o -a testenv:/dev/sg4 -l lib01 -d 3 tape03 ob> mkdev -t tape -o -a testenv:/dev/sg5 -l lib01 -d 4 tape04 ob> mkdev -t tape -o -a testenv:/dev/sg6 -l lib02 -d 1 tape05 ob> mkdev -t tape -o -a testenv:/dev/sg7 -l lib02 -d 2 tape06 ob> mkdev -t tape -o -a testenv:/dev/sg8 -l lib02 -d 3 tape07 ob> mkdev -t tape -o -a testenv:/dev/sg9 -l lib02 -d 4 tape08 # the device name may be changed, refer this article to change. 
Oracle Secure Backup 12.1.0.1.0 - All Backup jobs to tape failing with Error message: Pending resource availability (Doc ID 2144552.1) # after definition ob> lsdev library lib01 in service drive 1 tape01 in service drive 2 tape02 in service drive 3 tape03 in service drive 4 tape04 in service library lib02 in service drive 1 tape05 in service drive 2 tape06 in service drive 3 tape07 in service drive 4 tape08 in service ob> inventory -L lib01 ob> inventory -L lib02 # For testing run{ allocate channel ch01 device type sbt parms 'ENV=(ob_media_family=rman-testenv)'; backup spfile; } # after backup, we can see our backup jobs and pieces ob> lsvol -a VOID OOID Seq Volume ID Barcode Family Created Attributes 173 173 1 rman-testenv-000001 G03001TA rman-testenv 03/07.23:13 never closes; content manages reuse 175 175 1 rman-testenv-000002 G03002TA rman-testenv 03/07.23:14 never closes; content manages reuse ob> lspiece POID Database Content Copy Created Host Piece name 100 o12201 full 0 03/07.23:13 testenv g8trsfk9_1_1 101 o12201 full 0 03/07.23:13 testenv c-766629926-20190307-00 102 o12201 full 0 03/07.23:14 testenv gbtrsfm9_1_1 103 o12201 full 0 03/07.23:14 testenv gatrsfm9_1_1 104 o12201 full 0 03/07.23:14 testenv gdtrsfnd_1_1 105 o12201 full 0 03/07.23:14 testenv gctrsfnd_1_1 106 o12201 full 0 03/07.23:15 testenv getrsfo6_1_1 107 o12201 full 0 03/07.23:15 testenv gftrsfo6_1_1 108 o12201 full 0 03/07.23:15 testenv c-766629926-20190307-01
After the configuration, we can now do the testing:
RO="su - oracle -c " RMAN="cd /home/oracle/rman; export ORACLE_SID=o12201; export ORAENV_ASK=NO; . oraenv; rman target / " timedatectl set-ntp no for YEAR in {2018..2018} do for MONTH in {1..7} do for DAY in {1..28} do for HOUR in {0,12} do echo $$ n=$(($DAY%7)) # echo $YEAR" "$MONTH" "$DAY" "$HOUR" "$n; timedatectl set-time "$YEAR-$MONTH-$DAY $HOUR:00:00"; if [[ $n -eq 1 && $DAY -le 7 && HOUR -eq 0 ]]; then echo "lev0 monthly backup" $RO "$RMAN <<EOF @backup_keep_to_tape.rman `date +%Y%m%d%H` \"MONTHLY\" 180 EOF " elif [[ $n -eq 1 && $DAY -ge 8 && HOUR -eq 0 ]]; then echo "lev0 weekly backup" $RO "$RMAN <<EOF @backup_keep_to_tape.rman `date +%Y%m%d%H` \"WEEKLY\" 90 EOF " elif [[ $n -eq 2 && HOUR -eq 0 ]]; then echo "lev0 daily backup" $RO "$RMAN <<EOF @backup_daily.rman `date +%Y%m%d%H` 0 \"\" EOF " elif [[ $n -eq 3 && HOUR -eq 0 ]]; then echo "lev2 daily backup" $RO "$RMAN <<EOF @backup_daily.rman `date +%Y%m%d%H` 2 \"\" EOF " elif [[ $n -eq 4 && HOUR -eq 0 ]]; then echo "lev1 daily backup" $RO "$RMAN <<EOF @backup_daily.rman `date +%Y%m%d%H` 1 \"\" EOF " elif [[ $n -eq 5 && HOUR -eq 0 ]]; then echo "lev1 daily backup" $RO "$RMAN <<EOF @backup_daily.rman `date +%Y%m%d%H` 1 \"CUMULATIVE\" EOF " elif [[ $n -eq 6 && HOUR -eq 0 ]]; then echo "lev2 daily backup" $RO "$RMAN <<EOF @backup_daily.rman `date +%Y%m%d%H` 2 \"\" EOF " elif [[ $n -eq 7 && HOUR -eq 0 ]]; then echo "lev1 daily backup" $RO "$RMAN <<EOF @backup_daily.rman `date +%Y%m%d%H` 1 \"\" EOF " else echo "archive log backup" $RO "$RMAN <<EOF @backup_archive_to_tape.rman EOF " fi done done done done
With this configuration, when you try to delete all obsolete backups with the default recovery window:
RMAN> delete obsolete; RMAN retention policy will be applied to the command RMAN retention policy is set to recovery window of 7 days using channel ORA_DISK_1 Deleting the following obsolete backups and copies: Type Key Completion Time Filename/Handle -------------------- ------ ------------------ -------------------- Backup Set 4073 09-MAY-18 Backup Piece 4419 09-MAY-18 5it2dqhb_1_2 Backup Set 4074 09-MAY-18 Backup Piece 4420 09-MAY-18 5jt2dqk1_1_2 Backup Set 4078 09-MAY-18 Backup Piece 4424 09-MAY-18 5nt2dqp7_1_1 Backup Set 4080 09-MAY-18 Backup Piece 4426 09-MAY-18 5pt2f4m4_1_1 ... Archive Log 1582 01-JUN-18 /u01/app/oracle/fast_recovery_area/o12201/O12201/archivelog/2018_06_01/o1_mf_1_1583_fk1kf66j_.arc Archive Log 1583 01-JUN-18 /u01/app/oracle/fast_recovery_area/o12201/O12201/archivelog/2018_06_01/o1_mf_1_1584_fk2vd3n1_.arc Backup Set 4269 01-JUN-18 Backup Piece 4671 01-JUN-18 bnt4bpa4_1_1 Archive Log 1584 02-JUN-18 /u01/app/oracle/fast_recovery_area/o12201/O12201/archivelog/2018_06_02/o1_mf_1_1585_fk45l43g_.arc Backup Set 4271 02-JUN-18 Backup Piece 4673 02-JUN-18 bpt4d3g4_1_1 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of delete command at 06/09/2018 00:02:38 RMAN-06091: no channel allocated for maintenance (of an appropriate type)
There is no way to delete the backup pieces on tape by mistake.
When you try to "delete backup":
RMAN> delete backup; using channel ORA_DISK_1 List of Backup Pieces BP Key BS Key Pc# Cp# Status Device Type Piece Name ------- ------- --- --- ----------- ----------- ---------- 3694 3517 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_03_06/o1_mf_s_970012848_f9w8409t_.bkp 3721 3540 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_03_09/o1_mf_s_970272137_fb45bspw_.bkp 3736 3551 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_03_10/o1_mf_s_970358449_fb6sn1d8_.bkp 3751 3562 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_03_11/o1_mf_s_970444849_fb9g01h0_.bkp 3766 3573 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_03_12/o1_mf_s_970531247_fbcyvj2w_.bkp 3781 3584 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_03_13/o1_mf_s_970617648_fbgm7jv1_.bkp 3808 3607 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_03_16/o1_mf_s_970876935_fbpjg7l2_.bkp 3823 3618 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_03_17/o1_mf_s_970963242_fbs4qbt3_.bkp 4722 4308 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/backupset/2018_06_05/o1_mf_ncnn1_2018060500_LEV1_DAIL_fkd2qn0c_.bkp 4723 4309 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_06_05/o1_mf_s_977961655_fkd2qqk1_.bkp 4735 4317 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/backupset/2018_06_06/o1_mf_nnnd2_2018060600_LEV2_DAIL_fkgq3cxf_.bkp 4736 4318 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/7E60BE07B7D65699E053651010537F9B/backupset/2018_06_06/o1_mf_nnnd2_2018060600_LEV2_DAIL_fkgq3h4b_.bkp 4737 4319 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/backupset/2018_06_06/o1_mf_ncnn2_2018060600_LEV2_DAIL_fkgq3kcl_.bkp 4738 4320 1 1 AVAILABLE DISK /u01/app/oracle/fast_recovery_area/o12201/O12201/autobackup/2018_06_06/o1_mf_s_978048050_fkgq3lx3_.bkp Do you really want to delete the above objects (enter YES or NO)?
Only backup pieces on disk will be deleted. The backup pieces on tape are safe.
We do not back up archived logs to disk, so even if all the backup pieces on disk are deleted accidentally, we never lose any archived log.
But an accidental "delete copy" is unrecoverable. To guard against that, we need to archive the redo logs to a remote node through TNS as well.
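One common way to do that is a second archive destination pointing at a remote service (a sketch only; the TNS alias standby_arch and the destination number are assumptions, and the remote node must be able to receive redo, for example a standby database):

SQL> alter system set log_archive_dest_2='SERVICE=standby_arch ASYNC NOAFFIRM' scope=both;
SQL> alter system set log_archive_dest_state_2='ENABLE' scope=both;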
Without explicitly allocating a channel, restore and recover only consider the backup sets on disk.
This is what I want. If we also want to consider the backups on tape, allocate a channel to tape as well:
run{ allocate channel ch01 device type SBT_TAPE parms 'ENV=(ob_media_family=rman-testenv)'; allocate channel ch02 device type disk; restore database preview; }
There is also a disadvantage of using the FRA together with tape for READ ONLY tablespaces or pluggable databases.
The FRA considers all backup sets that have already been copied to tape as removable, so when it runs out of space, some backup pieces will be removed, including the backup pieces of READ ONLY tablespaces.
When OPTIMIZATION is set to ON, read-only tablespaces are never backed up again; however, backup optimization is tracked per DEVICE TYPE.
If the backup pieces in the FRA are deleted, RMAN will back up the READ ONLY tablespaces again on the next disk backup, and then copy them to tape.
And the backup pieces of READ ONLY tablespaces are never obsoleted by a recovery window retention policy.
So we may end up with many copies of READ ONLY tablespaces on tape.
To illustrate, I removed the WEEKLY and MONTHLY backup entries from the listing below.
RMAN> list backup of pluggable database 'PDB$SEED' summary;

List of Backups
===============
Key     TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag
------- -- -- - ----------- --------------- ------- ------- ---------- ---
2935    B  0  A SBT_TAPE    02-JAN-18       1       1       YES        2018010200_LEV0_DAILY
3740    B  0  A SBT_TAPE    02-APR-18       1       1       YES        2018040200_LEV0_DAILY
4472    B  0  A SBT_TAPE    23-JUN-18       1       1       YES        2018062300_LEV0_DAILY
I still have three backups of PDB$SEED left after DELETE OBSOLETE.
The solution is to switch the pluggable database / tablespace to read write periodically.
After switching to read write and back to read only, they have to be backed up again, and the old backups are no longer needed to restore the database once they fall outside the recovery window.
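A sketch of that cycling, with hypothetical names (ro_data for a read-only tablespace, ro_pdb for a read-only pluggable database); after the cycle, the next backup creates a fresh copy that can later be obsoleted normally:

SQL> alter tablespace ro_data read write;
SQL> alter tablespace ro_data read only;

SQL> alter pluggable database ro_pdb close immediate;
SQL> alter pluggable database ro_pdb open read write;
SQL> alter pluggable database ro_pdb close immediate;
SQL> alter pluggable database ro_pdb open read only;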
For PDB$SEED, if we apply CPU/PSU patches regularly, the patching brings it out of read-only mode internally anyway.
Oracle - RMAN Backup Policy with KEEP
Our RMAN obsolete process was not working; the utilization of the backup file system reached 90%. The former DBA used his own script to choose which backup sets to delete, and also used "find +mtime" and "rm" to delete backups.
This is very dangerous, especially when OPTIMIZATION is ON and there is a read-only tablespace or a read-only pluggable database like PDB$SEED: RMAN backs them up only once, and if those backup pieces are removed without a crosscheck, there is no backup for them anymore.
I redesigned the backup and obsolete policy based on our requirements:
- The backup retention window is 7 days.
- Every week, take one level 0 backup and six level 1 or 2 backups.
- The first level 0 backup of each month needs to be kept for 1 year.
- The weekly level 0 backups need to be kept for 5 weeks; the level 1 and 2 backups do not.
I am using KEEP UNTIL to achieve the long-term keep policy. But we need to know what KEEP actually does:

1. A database backup with KEEP automatically backs up the archived logs needed to recover the backup to a consistent state, the controlfile, the spfile, and even the read-only tablespaces. We do not need to back them up separately.
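A sketch of the kind of KEEP backup that produces the output below (essentially the monthly branch of the test script further down; the 90-day keep period is just the value used for testing):

run {
  backup incremental level 0 as compressed backupset
    tag '20180101010000_LEVEL0_MONTHLY'
    database keep until time 'sysdate+90'
    include current controlfile;
}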
using channel ORA_DISK_1 backup will be obsolete on date 01-APR-19 archived logs required to recover from this backup will be backed up channel ORA_DISK_1: starting compressed archived log backup set channel ORA_DISK_1: specifying archived log(s) in backup set input archived log thread=1 sequence=1202 RECID=1201 STAMP=996374548 channel ORA_DISK_1: starting piece 1 at 01-JAN-19 channel ORA_DISK_1: finished piece 1 at 01-JAN-19 piece handle=/u01/app/oracle/backup/O12201/ora_df996374548_s3121_s1 tag=20180101010000_LEVEL0_MONTHLY comment=NONE channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01 using channel ORA_DISK_1 backup will be obsolete on date 01-APR-19 archived logs required to recover from this backup will be backed up channel ORA_DISK_1: starting compressed full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set including current SPFILE in backup set channel ORA_DISK_1: starting piece 1 at 01-JAN-19 channel ORA_DISK_1: finished piece 1 at 01-JAN-19 piece handle=/u01/app/oracle/backup/O12201/ora_df996374550_s3122_s1 tag=20180101010000_LEVEL0_MONTHLY comment=NONE channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01 using channel ORA_DISK_1 backup will be obsolete on date 01-APR-19 archived logs required to recover from this backup will be backed up channel ORA_DISK_1: starting compressed full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set including current control file in backup set channel ORA_DISK_1: starting piece 1 at 01-JAN-19 channel ORA_DISK_1: finished piece 1 at 01-JAN-19 piece handle=/u01/app/oracle/backup/O12201/ora_df996374551_s3123_s1 tag=20180101010000_LEVEL0_MONTHLY comment=NONE channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01 Finished backup at 01-JAN-19
2. An incremental level 0 backup taken with KEEP cannot be used as the baseline for level 1 and level 2 backups, because it is outside the retention policy. You will see the message "no parent backup or copy of datafile 1 found", and RMAN will take the level 1 backup as a level 0 instead. That backup will then be used as the baseline for all subsequent level 1 and 2 backups, as long as there is no other backup without KEEP. So a separate level 0 backup without KEEP is needed.
Incremental Level 1 Backup Taking a Level 0 Backup Although Previous Level 0 exists (Doc ID 2023948.1)
using channel ORA_DISK_1
backup will be obsolete on date 08-JAN-19
archived logs required to recover from this backup will be backed up
no parent backup or copy of datafile 1 found
no parent backup or copy of datafile 3 found
no parent backup or copy of datafile 4 found
no parent backup or copy of datafile 7 found
no parent backup or copy of datafile 6 found
no parent backup or copy of datafile 5 found
no parent backup or copy of datafile 8 found
no parent backup or copy of datafile 10 found
no parent backup or copy of datafile 9 found
no parent backup or copy of datafile 11 found
channel ORA_DISK_1: starting compressed incremental level 0 datafile backup set
3. If the database has read-only tablespaces and OPTIMIZATION is not set to ON, the backup sets of the read-only tablespaces cannot be obsoleted with a recovery window policy. In that case we would need to use a redundancy policy.
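For reference, switching to a redundancy-based policy is a single configuration command (a sketch only; this post keeps the recovery window policy and relies on OPTIMIZATION ON instead):

RMAN> CONFIGURE RETENTION POLICY TO REDUNDANCY 2;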
The following is my test script on a VM. We have to disable automatic time synchronization on the VM in advance.
For testing, I only keep the monthly backups for 90 days and the weekly backups for 32 days.
RMAN> CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 7 DAYS; RMAN> CONFIGURE BACKUP OPTIMIZATION ON; RO="su - oracle -c " RMAN="export ORACLE_SID=o12201; export ORAENV_ASK=NO; . oraenv; rman target / " timedatectl set-ntp no for YEAR in {2018..2018} do for MONTH in {1..6} do for DAY in {1..28} do for HOUR in {0,6,12,18} do echo $$ n=$(($DAY%7)) # echo $YEAR" "$MONTH" "$DAY" "$HOUR" "$n; timedatectl set-time "$YEAR-$MONTH-$DAY $HOUR:00:00"; if [[ $n -eq 1 && $DAY -le 7 && HOUR -eq 0 ]]; then echo "lev0 monthly backup" $RO "$RMAN <<EOF backup incremental level 0 as compressed backupset tag '`date +%Y%m%d%H`_lev0_monthly' database keep until time 'sysdate+90' include current controlfile; backup incremental level 0 as compressed backupset tag '`date +%Y%m%d%H`_lev0_daily' database include current controlfile plus archivelog delete input not backed up 1 times; EOF " elif [[ $n -eq 1 && $DAY -ge 8 && HOUR -eq 0 ]]; then echo "lev0 weekly backup" $RO "$RMAN <<EOF backup incremental level 0 as compressed backupset tag '`date +%Y%m%d%H`_lev0_weekly' database keep until time 'sysdate+32' include current controlfile; backup incremental level 0 as compressed backupset tag '`date +%Y%m%d%H`_lev0_daily' database include current controlfile plus archivelog delete input not backed up 1 times; EOF " elif [[ $n -ne 1 && HOUR -eq 0 ]]; then echo "lev1 daily backup" $RO "$RMAN <<EOF backup incremental level 1 as compressed backupset tag '`date +%Y%m%d%H`_lev1_daily' database include current controlfile plus archivelog delete input not backed up 1 times; EOF " else echo "archive log backup" $RO "$RMAN <<EOF backup as compressed backupset archivelog all delete input not backed up 1 times; EOF " fi done done done done
Execute DELETE OBSOLETE after each backup finishes.
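A minimal sketch of that obsolete step (CROSSCHECK first so that any manually removed pieces are marked EXPIRED instead of blocking the delete; the retention policy is the 7-day recovery window configured above):

RMAN> crosscheck backup;
RMAN> crosscheck archivelog all;
RMAN> delete noprompt obsolete;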
For display purposes, I only show the backups of the HR pluggable database. We can see there are:
- 3 MONTHLY backups, for April, May, and June
- 3 WEEKLY backups (the backup of June 1st is the MONTHLY one)
- 1 level 0 daily backup plus 6 level 1 daily backups
RMAN> list backup of pluggable database 'HR' summary;

List of Backups
===============
Key     TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag
------- -- -- - ----------- --------------- ------- ------- ---------- ---
4009    B  0  A DISK        01-APR-18       1       1       YES        2018040100_LEV0_MONTHLY
4369    B  0  A DISK        01-MAY-18       1       1       YES        2018050100_LEV0_MONTHLY
4729    B  0  A DISK        01-JUN-18       1       1       YES        2018060100_LEV0_MONTHLY
4819    B  0  A DISK        08-JUN-18       1       1       YES        2018060800_LEV0_WEEKLY
4909    B  0  A DISK        15-JUN-18       1       1       YES        2018061500_LEV0_WEEKLY
4999    B  0  A DISK        22-JUN-18       1       1       YES        2018062200_LEV0_WEEKLY
5006    B  0  A DISK        22-JUN-18       1       1       YES        2018062200_LEV0_DAILY
5018    B  1  A DISK        23-JUN-18       1       1       YES        2018062300_LEV1_DAILY
5030    B  1  A DISK        24-JUN-18       1       1       YES        2018062400_LEV1_DAILY
5042    B  1  A DISK        25-JUN-18       1       1       YES        2018062500_LEV1_DAILY
5054    B  1  A DISK        26-JUN-18       1       1       YES        2018062600_LEV1_DAILY
5066    B  1  A DISK        27-JUN-18       1       1       YES        2018062700_LEV1_DAILY
5078    B  1  A DISK        28-JUN-18       1       1       YES        2018062800_LEV1_DAILY
Read-only tablespaces / pluggable databases are backed up by every backup with KEEP, but with OPTIMIZATION ON they get only one regular backup, at the very first backup (2018010100), and are never backed up again:
RMAN> list backup of pluggable database 'PDB$SEED' summary;

List of Backups
===============
Key     TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag
------- -- -- - ----------- --------------- ------- ------- ---------- ---
2936    B  0  A DISK        01-JAN-18       1       1       YES        2018010100_LEV0_DAILY
4010    B  0  A DISK        01-APR-18       1       1       YES        2018040100_LEV0_MONTHLY
4370    B  0  A DISK        01-MAY-18       1       1       YES        2018050100_LEV0_MONTHLY
4730    B  0  A DISK        01-JUN-18       1       1       YES        2018060100_LEV0_MONTHLY
4820    B  0  A DISK        08-JUN-18       1       1       YES        2018060800_LEV0_WEEKLY
4910    B  0  A DISK        15-JUN-18       1       1       YES        2018061500_LEV0_WEEKLY
5000    B  0  A DISK        22-JUN-18       1       1       YES        2018062200_LEV0_WEEKLY
Each backup with KEEP contains backups of the controlfile, the spfile, and the read-only tablespaces; everything needed to restore the whole database is included.
RMAN> list backup tag='2018040100_LEV0_MONTHLY'; List of Backup Sets =================== BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 4008 Incr 0 313.03M DISK 00:01:16 01-APR-18 BP Key: 4008 Status: AVAILABLE Compressed: YES Tag: 2018040100_LEV0_MONTHLY Piece Name: /u01/app/oracle/backup/O12201/ora_df972259203_s4205_s1 Keep: BACKUP_LOGS Until: 30-JUN-18 List of Datafiles in backup set 4008 File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 1 0 Incr 1536226 01-APR-18 NO /u01/app/oracle/oradata/o12201/system01.dbf 3 0 Incr 1536226 01-APR-18 NO /u01/app/oracle/oradata/o12201/sysaux01.dbf 4 0 Incr 1536226 01-APR-18 NO /u01/app/oracle/oradata/o12201/undotbs01.dbf 7 0 Incr 1536226 01-APR-18 NO /u01/app/oracle/oradata/o12201/users01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 4009 Incr 0 156.45M DISK 00:00:36 01-APR-18 BP Key: 4009 Status: AVAILABLE Compressed: YES Tag: 2018040100_LEV0_MONTHLY Piece Name: /u01/app/oracle/backup/O12201/ora_df972259289_s4206_s1 Keep: BACKUP_LOGS Until: 30-JUN-18 List of Datafiles in backup set 4009 Container ID: 3, PDB Name: HR File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 9 0 Incr 1536263 01-APR-18 NO /u01/app/oracle/oradata/o12201/hr/system01.dbf 10 0 Incr 1536263 01-APR-18 NO /u01/app/oracle/oradata/o12201/hr/sysaux01.dbf 11 0 Incr 1536263 01-APR-18 NO /u01/app/oracle/oradata/o12201/hr/undotbs01.dbf 12 0 Incr 1536263 01-APR-18 NO /u01/app/oracle/oradata/o12201/hr/users01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 4010 Incr 0 163.59M DISK 00:00:45 01-APR-18 BP Key: 4010 Status: AVAILABLE Compressed: YES Tag: 2018040100_LEV0_MONTHLY Piece Name: /u01/app/oracle/backup/O12201/ora_df972259334_s4207_s1 Keep: BACKUP_LOGS Until: 30-JUN-18 List of Datafiles in backup set 4010 Container ID: 2, PDB Name: PDB$SEED File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 5 0 Incr 1442466 01-JAN-19 NO /u01/app/oracle/oradata/o12201/pdbseed/system01.dbf 6 0 Incr 1442466 01-JAN-19 NO /u01/app/oracle/oradata/o12201/pdbseed/sysaux01.dbf 8 0 Incr 1442466 01-JAN-19 NO /u01/app/oracle/oradata/o12201/pdbseed/undotbs01.dbf BS Key Size Device Type Elapsed Time Completion Time ------- ---------- ----------- ------------ --------------- 4011 10.00K DISK 00:00:00 01-APR-18 BP Key: 4011 Status: AVAILABLE Compressed: YES Tag: 2018040100_LEV0_MONTHLY Piece Name: /u01/app/oracle/backup/O12201/ora_df972259380_s4208_s1 Keep: BACKUP_LOGS Until: 30-JUN-18 List of Archived Logs in backup set 4011 Thrd Seq Low SCN Low Time Next SCN Next Time ---- ------- ---------- --------- ---------- --------- 1 1635 1536177 28-MAR-18 1536308 01-APR-18 BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 4012 Full 96.00K DISK 00:00:00 01-APR-18 BP Key: 4012 Status: AVAILABLE Compressed: YES Tag: 2018040100_LEV0_MONTHLY Piece Name: /u01/app/oracle/backup/O12201/ora_df972259381_s4209_s1 Keep: BACKUP_LOGS Until: 30-JUN-18 SPFILE Included: Modification time: 02-JAN-18 SPFILE db_unique_name: O12201 BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- 
------------ --------------- 4013 Full 2.20M DISK 00:00:01 01-APR-18 BP Key: 4013 Status: AVAILABLE Compressed: YES Tag: 2018040100_LEV0_MONTHLY Piece Name: /u01/app/oracle/backup/O12201/ora_df972259383_s4210_s1 Keep: BACKUP_LOGS Until: 30-JUN-18 Control File Included: Ckp SCN: 1536315 Ckp time: 01-APR-18
How do we test whether the backup is recoverable? We can use RESTORE ... PREVIEW.
The target SEQUENCE / SCN is the sequence / SCN of the last archived log included in this backup.
For this example it is sequence 1635 / SCN 1536307, so the preview uses SET UNTIL SEQUENCE 1636 (or SET UNTIL SCN 1536308; UNTIL is exclusive):
RMAN> run{ 2> #set until scn 1536308; 3> set until sequence 1636; 4> restore database preview; 5> } executing command: SET until clause Starting restore at 29-JUN-18 using channel ORA_DISK_1 List of Backup Sets =================== BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 4008 Incr 0 313.03M DISK 00:01:16 01-APR-18 BP Key: 4008 Status: AVAILABLE Compressed: YES Tag: 2018040100_LEV0_MONTHLY Piece Name: /u01/app/oracle/backup/O12201/ora_df972259203_s4205_s1 Keep: BACKUP_LOGS Until: 30-JUN-18 List of Datafiles in backup set 4008 File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 1 0 Incr 1536226 01-APR-18 NO /u01/app/oracle/oradata/o12201/system01.dbf 3 0 Incr 1536226 01-APR-18 NO /u01/app/oracle/oradata/o12201/sysaux01.dbf 4 0 Incr 1536226 01-APR-18 NO /u01/app/oracle/oradata/o12201/undotbs01.dbf 7 0 Incr 1536226 01-APR-18 NO /u01/app/oracle/oradata/o12201/users01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 5000 Incr 0 163.59M DISK 00:00:30 22-JUN-18 BP Key: 5000 Status: AVAILABLE Compressed: YES Tag: 2018062200_LEV0_WEEKLY Piece Name: /u01/app/oracle/backup/O12201/ora_df979430494_s5197_s1 Keep: BACKUP_LOGS Until: 24-JUL-18 List of Datafiles in backup set 5000 Container ID: 2, PDB Name: PDB$SEED File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 5 0 Incr 1442466 01-JAN-19 NO /u01/app/oracle/oradata/o12201/pdbseed/system01.dbf 6 0 Incr 1442466 01-JAN-19 NO /u01/app/oracle/oradata/o12201/pdbseed/sysaux01.dbf 8 0 Incr 1442466 01-JAN-19 NO /u01/app/oracle/oradata/o12201/pdbseed/undotbs01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 4009 Incr 0 156.45M DISK 00:00:36 01-APR-18 BP Key: 4009 Status: AVAILABLE Compressed: YES Tag: 2018040100_LEV0_MONTHLY Piece Name: /u01/app/oracle/backup/O12201/ora_df972259289_s4206_s1 Keep: BACKUP_LOGS Until: 30-JUN-18 List of Datafiles in backup set 4009 Container ID: 3, PDB Name: HR File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 9 0 Incr 1536263 01-APR-18 NO /u01/app/oracle/oradata/o12201/hr/system01.dbf 10 0 Incr 1536263 01-APR-18 NO /u01/app/oracle/oradata/o12201/hr/sysaux01.dbf 11 0 Incr 1536263 01-APR-18 NO /u01/app/oracle/oradata/o12201/hr/undotbs01.dbf 12 0 Incr 1536263 01-APR-18 NO /u01/app/oracle/oradata/o12201/hr/users01.dbf List of Backup Sets =================== BS Key Size Device Type Elapsed Time Completion Time ------- ---------- ----------- ------------ --------------- 4011 10.00K DISK 00:00:00 01-APR-18 BP Key: 4011 Status: AVAILABLE Compressed: YES Tag: 2018040100_LEV0_MONTHLY Piece Name: /u01/app/oracle/backup/O12201/ora_df972259380_s4208_s1 List of Archived Logs in backup set 4011 Thrd Seq Low SCN Low Time Next SCN Next Time ---- ------- ---------- --------- ---------- --------- 1 1635 1536177 28-MAR-18 1536308 01-APR-18 recovery will be done up to SCN 1536308 Media recovery start SCN is 1536226 Recovery must be done beyond SCN 1536263 to clear datafile fuzziness Finished restore at 29-JUN-18
We can also use PREVIEW to check whether the daily backup is recoverable:
RMAN> restore pluggable database "HR" preview; Starting restore at 29-JUN-18 using channel ORA_DISK_1 List of Backup Sets =================== BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 5006 Incr 0 156.46M DISK 00:00:27 22-JUN-18 BP Key: 5006 Status: AVAILABLE Compressed: YES Tag: 2018062200_LEV0_DAILY Piece Name: /u01/app/oracle/backup/O12201/ora_df979430610_s5203_s1 List of Datafiles in backup set 5006 Container ID: 3, PDB Name: HR File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 9 0 Incr 1554221 22-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/system01.dbf 10 0 Incr 1554221 22-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/sysaux01.dbf 11 0 Incr 1554221 22-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/undotbs01.dbf 12 0 Incr 1554221 22-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/users01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 5018 Incr 1 64.00K DISK 00:00:00 23-JUN-18 BP Key: 5018 Status: AVAILABLE Compressed: YES Tag: 2018062300_LEV1_DAILY Piece Name: /u01/app/oracle/backup/O12201/ora_df979516807_s5215_s1 List of Datafiles in backup set 5018 Container ID: 3, PDB Name: HR File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 9 1 Incr 1554446 23-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/system01.dbf 10 1 Incr 1554446 23-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/sysaux01.dbf 11 1 Incr 1554446 23-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/undotbs01.dbf 12 1 Incr 1554446 23-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/users01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 5030 Incr 1 64.00K DISK 00:00:01 24-JUN-18 BP Key: 5030 Status: AVAILABLE Compressed: YES Tag: 2018062400_LEV1_DAILY Piece Name: /u01/app/oracle/backup/O12201/ora_df979603209_s5227_s1 List of Datafiles in backup set 5030 Container ID: 3, PDB Name: HR File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 9 1 Incr 1554665 24-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/system01.dbf 10 1 Incr 1554665 24-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/sysaux01.dbf 11 1 Incr 1554665 24-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/undotbs01.dbf 12 1 Incr 1554665 24-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/users01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 5042 Incr 1 64.00K DISK 00:00:00 25-JUN-18 BP Key: 5042 Status: AVAILABLE Compressed: YES Tag: 2018062500_LEV1_DAILY Piece Name: /u01/app/oracle/backup/O12201/ora_df979689608_s5239_s1 List of Datafiles in backup set 5042 Container ID: 3, PDB Name: HR File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 9 1 Incr 1554881 25-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/system01.dbf 10 1 Incr 1554881 25-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/sysaux01.dbf 11 1 Incr 1554881 25-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/undotbs01.dbf 12 1 Incr 1554881 25-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/users01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 5054 Incr 1 64.00K DISK 00:00:01 26-JUN-18 BP Key: 5054 Status: 
AVAILABLE Compressed: YES Tag: 2018062600_LEV1_DAILY Piece Name: /u01/app/oracle/backup/O12201/ora_df979776008_s5251_s1 List of Datafiles in backup set 5054 Container ID: 3, PDB Name: HR File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 9 1 Incr 1555091 26-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/system01.dbf 10 1 Incr 1555091 26-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/sysaux01.dbf 11 1 Incr 1555091 26-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/undotbs01.dbf 12 1 Incr 1555091 26-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/users01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 5066 Incr 1 64.00K DISK 00:00:01 27-JUN-18 BP Key: 5066 Status: AVAILABLE Compressed: YES Tag: 2018062700_LEV1_DAILY Piece Name: /u01/app/oracle/backup/O12201/ora_df979862407_s5263_s1 List of Datafiles in backup set 5066 Container ID: 3, PDB Name: HR File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 9 1 Incr 1555311 27-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/system01.dbf 10 1 Incr 1555311 27-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/sysaux01.dbf 11 1 Incr 1555311 27-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/undotbs01.dbf 12 1 Incr 1555311 27-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/users01.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 5078 Incr 1 64.00K DISK 00:00:01 28-JUN-18 BP Key: 5078 Status: AVAILABLE Compressed: YES Tag: 2018062800_LEV1_DAILY Piece Name: /u01/app/oracle/backup/O12201/ora_df979948807_s5275_s1 List of Datafiles in backup set 5078 Container ID: 3, PDB Name: HR File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name ---- -- ---- ---------- --------- ----------- ------ ---- 9 1 Incr 1555527 28-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/system01.dbf 10 1 Incr 1555527 28-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/sysaux01.dbf 11 1 Incr 1555527 28-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/undotbs01.dbf 12 1 Incr 1555527 28-JUN-18 NO /u01/app/oracle/oradata/o12201/hr/users01.dbf List of Backup Sets =================== BS Key Size Device Type Elapsed Time Completion Time ------- ---------- ----------- ------------ --------------- 5080 3.50K DISK 00:00:00 28-JUN-18 BP Key: 5080 Status: AVAILABLE Compressed: YES Tag: 2018062800_LEV1_DAILY Piece Name: /u01/app/oracle/backup/O12201/ora_df979948811_s5277_s1 List of Archived Logs in backup set 5080 Thrd Seq Low SCN Low Time Next SCN Next Time ---- ------- ---------- --------- ---------- --------- 1 2063 1555513 28-JUN-18 1555534 28-JUN-18 BS Key Size Device Type Elapsed Time Completion Time ------- ---------- ----------- ------------ --------------- 5082 8.50K DISK 00:00:00 28-JUN-18 BP Key: 5082 Status: AVAILABLE Compressed: YES Tag: TAG20180628T060003 Piece Name: /u01/app/oracle/backup/O12201/ora_df979970403_s5279_s1 List of Archived Logs in backup set 5082 Thrd Seq Low SCN Low Time Next SCN Next Time ---- ------- ---------- --------- ---------- --------- 1 2064 1555534 28-JUN-18 1555584 28-JUN-18 BS Key Size Device Type Elapsed Time Completion Time ------- ---------- ----------- ------------ --------------- 5084 3.00K DISK 00:00:00 28-JUN-18 BP Key: 5084 Status: AVAILABLE Compressed: YES Tag: TAG20180628T120003 Piece Name: /u01/app/oracle/backup/O12201/ora_df979992003_s5281_s1 List of Archived Logs in backup set 5084 Thrd Seq Low SCN Low Time 
Next SCN Next Time ---- ------- ---------- --------- ---------- --------- 1 2065 1555584 28-JUN-18 1555631 28-JUN-18 BS Key Size Device Type Elapsed Time Completion Time ------- ---------- ----------- ------------ --------------- 5086 3.50K DISK 00:00:00 28-JUN-18 BP Key: 5086 Status: AVAILABLE Compressed: YES Tag: TAG20180628T180006 Piece Name: /u01/app/oracle/backup/O12201/ora_df980013606_s5283_s1 List of Archived Logs in backup set 5086 Thrd Seq Low SCN Low Time Next SCN Next Time ---- ------- ---------- --------- ---------- --------- 1 2066 1555631 28-JUN-18 1555677 28-JUN-18 recovery will be done up to SCN 1555527 Media recovery start SCN is 1555527 Recovery must be done beyond SCN 1555527 to clear datafile fuzziness Finished restore at 29-JUN-18
Oracle - spawn extproc by listener or database
We can use EXTPROC to invoke external programs, for example C code.
There is a simple example of the setup in:
How to Dynamically Execute a SQL Script from a Stored Procedure? (Doc ID 341586.1)
Be aware that the IPC listener and extproc_connection_data are not mandatory; without an IPC listener, a local extproc process is started to communicate with the server process. The C code I am using can be found in:
How To Run Extproc Agent Under A Less Privileged Account (Doc ID 1204853.1)
SQL> create or replace directory libdir as '$ORACLE_HOME/lib';

Directory created.

SQL> CREATE or replace LIBRARY shell_lib is 'shell.so' in libdir;
  2  /

Library created.

SQL> CREATE OR REPLACE PACKAGE BD_UTIL as
  2    FUNCTION shell_ejecucion(comando in char) return pls_integer;
  3  END;
  4  /

Package created.

SQL> CREATE OR REPLACE PACKAGE BODY BD_UTIL is
  2    FUNCTION shell_ejecucion(comando in char) return pls_integer
  3    as language C
  4    name "sh"
  5    library shell_lib
  6    parameters (comando string, return int);
  7  END bd_util;
  8  /

Package body created.

SQL> declare
  2    retorno binary_integer;
  3  begin
  4    retorno := bd_util.shell_ejecucion('/bin/date > /tmp/test.txt');
  5    dbms_output.put_line(retorno);
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> host ls -l /tmp/test.txt
-rw-r--r--. 1 oracle oinstall 29 Feb 25 22:10 /tmp/test.txt

-- with the IPC listener and extproc_connection_data,
-- we can see an EXTPROC spawned by the listener, with LOCAL=NO
# ps -ef|grep extproc
oracle    2732     1  0 22:41 ?        00:00:00 /opt/oracle/product/12.1/dbhome_1/bin/extprocPLSExtProc (LOCAL=NO)
root      2741 26893  0 22:41 pts/2    00:00:00 grep --color=auto extproc

-- after removing the entry in tnsnames.ora,
-- with a little delay, the process is spawned by the database after a timeout, with LOCAL=YES
# ps -ef|grep extproc
oracle    2887     1  0 22:42 ?        00:00:00 /opt/oracle/product/12.1/dbhome_1/bin/extproc (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))
Usually, the EXTPROC is forked as "oracle", the owner of the database and the listener.
For security reasons, we normally want extproc to be spawned as a less privileged user.
For the two situations above, there are two different ways to configure this.
If the process is spawned by the database, we can use a credential.
SQL> begin
  2    dbms_credential.create_credential(
  3      credential_name => 'run_marc',
  4      username        => 'marc',
  5      password        => '123456'
  6    );
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> CREATE or replace LIBRARY shell_lib is 'shell.so' in libdir credential run_marc;
  2  /

Library created.

SQL> declare
  2    retorno binary_integer;
  3  begin
  4    retorno := bd_util.shell_ejecucion('/bin/date > /tmp/test.txt');
  5    dbms_output.put_line(retorno);
  6  end;
  7  /
0

PL/SQL procedure successfully completed.

SQL> host ls -l /tmp/test.txt
-rw-rw-rw-. 1 marc users 29 Feb 25 22:52 /tmp/test.txt
We then change the owner and permissions of "/tmp/test.txt" so that user marc cannot write to it.
# chown oracle.oinstall /tmp/test.txt
# chmod 644 /tmp/test.txt

SQL> set serveroutput on
SQL> declare
  2    retorno binary_integer;
  3  begin
  4    retorno := bd_util.shell_ejecucion('/bin/date > /tmp/test.txt');
  5    dbms_output.put_line(retorno);
  6  end;
  7  /
256

-- an error reported.
PL/SQL procedure successfully completed.
Unfortunately, the credential does not work if the extproc is spawned by the listener.
I add extproc_connection_data back into tnsnames.ora:
SQL> host rm -rf /tmp/test.txt

SQL> declare
  2    retorno binary_integer;
  3  begin
  4    retorno := bd_util.shell_ejecucion('/bin/date > /tmp/test.txt');
  5    dbms_output.put_line(retorno);
  6  end;
  7  /
0

PL/SQL procedure successfully completed.

-- the file is still generated by oracle
SQL> host ls -l /tmp/test.txt
-rw-r--r--. 1 oracle oinstall 29 Feb 25 23:13 /tmp/test.txt
How To Run Extproc Agent Under A Less Privileged Account (Doc ID 1204853.1)
describes a way, but it does not work here, even though the process is spawned as marc:
[root@testenv bin]# ps -ef|grep extproc
marc      5936     1  0 23:24 ?        00:00:00 /opt/oracle/product/12.1/dbhome_1/bin/extprocPLSExtProc (LOCAL=NO)

-- rerun the test code.
-- the file is still generated by oracle, even though the credential points to marc
SQL> host ls -l /tmp/test.txt
-rw-r--r--. 1 oracle oinstall 29 Feb 25 23:25 /tmp/test.txt
Because an extproc spawned by the listener inherits the privileges of the listener, we need to run the IPC listener as the different user.
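The post does not show /home/marc/admin/listener.ora; a minimal sketch of what such an extproc-only listener definition typically looks like (alias and paths assumed, matching the lsnrctl output below):

EXTPROC_LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1))
    )
  )

SID_LIST_EXTPROC_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /opt/oracle/product/12.1/dbhome_1)
      (PROGRAM = extproc)
      (ENVS = "EXTPROC_DLLS=ANY")
    )
  )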
$ whoami marc $ . oraenv ORACLE_SID = [o12102] ? o12102 The Oracle base remains unchanged with value /opt/oracle $ export TNS_ADMIN=/home/marc/admin $ lsnrctl start extproc_listener LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 25-FEB-2019 23:33:02 Copyright (c) 1991, 2014, Oracle. All rights reserved. Starting /opt/oracle/product/12.1/dbhome_1/bin/tnslsnr: please wait... TNSLSNR for Linux: Version 12.1.0.2.0 - Production System parameter file is /home/marc/admin/listener.ora Log messages written to /opt/oracle/diag/tnslsnr/testenv/extproc_listener/alert/log.xml Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1))) Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1))) STATUS of the LISTENER ------------------------ Alias extproc_listener Version TNSLSNR for Linux: Version 12.1.0.2.0 - Production Start Date 25-FEB-2019 23:33:02 Uptime 0 days 0 hr. 0 min. 0 sec Trace Level off Security ON: Local OS Authentication SNMP OFF Listener Parameter File /home/marc/admin/listener.ora Listener Log File /opt/oracle/diag/tnslsnr/testenv/extproc_listener/alert/log.xml Listening Endpoints Summary... (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1))) Services Summary... Service "PLSExtProc" has 1 instance(s). Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service... The command completed successfully $ ps -ef|grep tns root 22 2 0 Feb24 ? 00:00:00 [netns] oracle 2353 1 0 22:39 ? 00:00:00 /opt/oracle/product/12.1/dbhome_1/bin/tnslsnr LISTENER -inherit marc 6582 1 0 23:33 ? 00:00:00 /opt/oracle/product/12.1/dbhome_1/bin/tnslsnr extproc_listener -inherit -- change tnsnames.ora: extproc_connection_data = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1))) (CONNECT_DATA = (SID = PLSExtProc) (PRESENTATION = RO)) ) SQL> declare 2 retorno binary_integer; 3 begin 4 retorno := bd_util.shell_ejecucion('/bin/date > /tmp/test.txt'); 5 dbms_output.put_line(retorno); 6 end; 7 / PL/SQL procedure successfully completed. SQL> host ls -l /tmp/test.txt -rw-r--r--. 1 marc users 29 Feb 25 23:36 /tmp/test.txt
Be careful: from Oracle Database 12.2.0.1 onwards, the extproc agent no longer resolves the EXTPROC_CONNECTION_DATA entry in tnsnames.ora for security reasons, and the agent gets spawned by the database itself.
Configuring Listener To Spawn EXTPROC Agent Is Not Working in 12.2, EXTPROC_CONNECTION_DATA Is Not Getting Resolved. (Doc ID 2326592.1)
From 12.2 onwards, if we want the listener to spawn the EXTPROC agent, we have to use an agent (the library AGENT clause shown further below).
$ cat listener.ora
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC2))
      (ADDRESS = (PROTOCOL = TCP)(HOST = testenv)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCPS)(HOST = testenv)(PORT = 2484))
    )
  )

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /opt/oracle/product/12.2/dbhome_1)
      (PROGRAM = /opt/oracle/product/12.2/dbhome_1/bin/extproc)
      (ENVS = "EXTPROC_DLLS=ANY", "LD_LIBRARY_PATH=/home/oracle/lib")
    )
  )

$ cat tnsnames.ora
extproc_connection_data =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC2)))
    (CONNECT_DATA =
      (SID = PLSExtProc)
      (PRESENTATION = RO))
  )
This time, I put shell.so into /home/oracle/lib.
SQL> create or replace directory libdir as '/home/oracle/lib';
Directory created.

SQL> CREATE or replace LIBRARY shell_lib is 'shell.so' in libdir;
  2  /
Library created.

SQL> declare
  2    retorno binary_integer;
  3  begin
  4    retorno := bd_util.shell_ejecucion('/bin/date > /tmp/test.txt');
  5    dbms_output.put_line(retorno);
  6  end;
  7  /
declare
*
ERROR at line 1:
ORA-28595: Extproc agent : Invalid DLL Path
ORA-06512: at "SYS.BD_UTIL", line 2
ORA-06512: at line 4
Because /home/oracle/lib is not on the LD_LIBRARY_PATH set in .bash_profile, the EXTPROC spawned by the database cannot find the library in 12.2.
We can follow Doc ID 1109144.1 to set this up:
How to Execute Both 32-bit and 64-bit External Procedures Within The Same 11.2 and Above Database (Doc ID 1109144.1)
Be aware that the TNS alias used by the agent database link cannot be extproc_connection_data:
SQL> CREATE DATABASE LINK agent_link USING 'extproc_connection_data';
Database link created.

SQL> CREATE or replace LIBRARY shell_lib is 'shell.so' in libdir AGENT 'agent_link';
Library created.

SQL> declare
  2    retorno binary_integer;
  3  begin
  4    retorno := bd_util.shell_ejecucion('/bin/date > /tmp/test.txt');
  5    dbms_output.put_line(retorno);
  6  end;
  7  /
declare
*
ERROR at line 1:
ORA-28595: Extproc agent : Invalid DLL Path
ORA-06512: at "SYS.BD_UTIL", line 2
ORA-06512: at line 4

-- rename the TNS alias to extproc_connection_data2
SQL> DROP DATABASE LINK agent_link;
Database link dropped.

SQL> CREATE DATABASE LINK agent_link USING 'extproc_connection_data2';
Database link created.

SQL> CREATE or replace LIBRARY shell_lib is 'shell.so' in libdir AGENT 'agent_link';
Library created.

SQL> declare
  2    retorno binary_integer;
  3  begin
  4    retorno := bd_util.shell_ejecucion('/bin/date > /tmp/test.txt');
  5    dbms_output.put_line(retorno);
  6  end;
  7  /
0
PL/SQL procedure successfully completed.
It looks like it is impossible to prevent the database itself from spawning an EXTPROC. So, in order to control the behavior of EXTPROC:
- Always use the listener to spawn EXTPROC.
- Use a credential to restrict any EXTPROC spawned by the database to a less privileged user.
- Do not store the .so file on the LD_LIBRARY_PATH; let the listener locate it.
- Use ENVS="EXTPROC_DLLS=ONLY:/home/oracle/lib/shell.so" in the listener to restrict which .so files can be loaded (see the sketch after this list).
- Never grant CREATE LIBRARY or direct use of the library to end users; encapsulate it in a procedure and grant only EXECUTE on that procedure to the end user.
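As an illustration of the EXTPROC_DLLS=ONLY restriction, the SID_LIST entry from the earlier listener.ora could be tightened like this (a sketch; the paths and SID name are the ones used above):

# Restrict the agent to a single shared library with EXTPROC_DLLS=ONLY
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /opt/oracle/product/12.2/dbhome_1)
      (PROGRAM = /opt/oracle/product/12.2/dbhome_1/bin/extproc)
      (ENVS = "EXTPROC_DLLS=ONLY:/home/oracle/lib/shell.so")
    )
  )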
Oracle - Database Workload Replay
Recently, I have been working on load testing our new environment.
Since we do not have a dedicated load-testing team, I decided to use Database Replay.
Be aware that Database Replay requires an additional license (it is part of the Real Application Testing option).
First, we need to capture the activity on the production database. This is very simple; just make sure there is enough disk space to keep the capture files.
The space needed varies. For our application, from 9am to 5pm the system made 35,003,359 user calls and generated 11 GB of archived logs and 8.42 GB of capture files.
Also, turning capture on increases CPU usage by roughly 5%.
It is recommended to restart the database and put it into restricted mode before starting the capture, so that all activity is captured.
But that is not mandatory, and it is really hard to achieve in practice, so I ignored this suggestion; for reference, the recommended sequence would look roughly like the sketch below.
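A minimal sketch of that restart-and-restrict sequence (not what I actually ran; the capture name, directory, and duration match the command further below):

-- bounce the database and keep ordinary users out until the capture has started
SQL> shutdown immediate;
SQL> startup restrict;

-- start the capture while the database is still in restricted mode
SQL> exec dbms_workload_capture.start_capture(name=>'pprd_capture_1', dir=>'DB_REPLAY_CAPTURE_DIR', duration=>28800);

-- then let users back in; everything they do from now on is captured
SQL> alter system disable restricted session;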
I started the capture at 8:58 am and ran it for the next 8 hours (28,800 seconds).
We can also filter on the capture side, but it is hard to decide which users/modules to exclude at the beginning; it is better to do the filtering on the replay side (a capture-side filter sketch follows the capture command below).
create directory db_replay_capture_dir as '/mnt/old_air04/app/oracle/admin/AIR/db_replay_capture';

exec dbms_workload_capture.start_capture(name=>'pprd_capture_1', dir=>'DB_REPLAY_CAPTURE_DIR', duration=> 28800);
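For completeness, a capture-side filter would look roughly like this (a sketch; the filter name and the DBSNMP example are mine, not what I ran):

-- exclude a user from the capture; with default_action=>'INCLUDE' everything
-- else is still captured and the filters act as exclusions
exec dbms_workload_capture.add_filter(fname=>'NO_DBSNMP_CAP', fattribute=>'USER', fvalue=>'DBSNMP');

exec dbms_workload_capture.start_capture(name=>'pprd_capture_1', dir=>'DB_REPLAY_CAPTURE_DIR', duration=>28800, default_action=>'INCLUDE');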
After the capture has finished, generate the capture report:
set linesize 170 pagesize 999
set long 30000000 longchunksize 1000 serverout on

variable cap_id number;
exec :cap_id := dbms_workload_capture.get_capture_info(dir=>'DB_REPLAY_CAPTURE_DIR');

variable cap_rpt clob;
exec :cap_rpt := dbms_workload_capture.report(capture_id=>:cap_id, format=> dbms_workload_capture.type_html);

select :cap_rpt from dual;
This report is very similar to an ASH report, plus some capture metadata.
Next, export the AWR data; it is needed when we run the workload analyzer on the replay side. The dump file is written to the capture directory.
exec dbms_workload_capture.export_awr(capture_id=>1)
Copy the folder to the replay side.
Next, we need to prepare the target database. Using RMAN to duplicate it onto another machine is very simple.
Make sure FORCE_LOGGING is on; a quick check is sketched below.
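A minimal check (standard commands, not specific to Database Replay):

-- verify force logging; enable it if it is not already on
SQL> select force_logging from v$database;
SQL> alter database force logging;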
1. Get the restore target SCN

SQL> select START_SCN from dba_workload_captures;

 START_SCN
--------------------
145166523720
2. Restore (with catalog, without a connection to the source)

run {
  allocate auxiliary channel dupl1 device type DISK;
  allocate auxiliary channel dupl2 device type DISK;
  set dbid 451988051;
  set until scn 145166523720;
  duplicate database to PPRD3 pfile=?/dbs/initPPRD3.ora noresume;
}
3. Set up the FRA

*.DB_FLASHBACK_RETENTION_TARGET=4320   # not necessary to turn flashback on
*.db_recovery_file_dest=/t03/app/oracle/PPRD3_FRA
*.db_recovery_file_dest_size=100G
*.log_archive_dest_1='LOCATION=USE_DB_RECOVERY_FILE_DEST mandatory'
4. Bring the target database to mount mode and create a guaranteed restore point
RMAN> create restore point before_db_replay guarantee flashback database;
If you are doing upgrade testing, apply the patchset before creating the restore point.
Now we have the replay target database and the capture files on the target box. Before replaying, we need to preprocess the capture files:
exec dbms_workload_replay.process_capture('DB_REPLAY_CAPTURE_DIR');
It took about 2 hours for 8 GB of capture files.
Then run the workload analyzer:
java -classpath $ORACLE_HOME/jdbc/lib/ojdbc6.jar:$ORACLE_HOME/rdbms/jlib/dbrparser.jar:$ORACLE_HOME/rdbms/jlib/dbranalyzer.jar \
  oracle.dbreplay.workload.checker.CaptureChecker \
  /t02/app/oracle/oradata/PPRD3/db_replay_capture \
  jdbc:oracle:thin:@zoro.xxx.xxx.edu:1521:pprd3
The output is written to the capture folder; it shows factors that could increase the replay time and the divergence.
In my case:
- A significant part of our workload comes from PL/SQL; the commits inside it are very hard to synchronize if we replay in synchronized mode. We need to consider turning synchronization off.
- Many captured sessions were already connected when I turned the capture on; if those connections are not stateless, we may see many divergences.
After preprocessing, we can start the replay. It is recommended to run SQL Performance Analyzer (SPA) before Database Replay, but this time we are doing a physical migration without a database upgrade, so I skipped that step.
drop directory db_replay_capture_dir;
create directory db_replay_capture_dir as '/t02/app/oracle/oradata/PPRD3/db_replay_capture';

exec dbms_workload_replay.initialize_replay(replay_name => 'air_capture_1', replay_dir=>'DB_REPLAY_CAPTURE_DIR')

-- Set the execution timeout.
-- If the execution time is longer than 1 minute (min_delay), Oracle checks the execution time on the source side.
-- If it has been 5 times (delay_factor) longer, the query is aborted.
-- And if the query runs longer than 60 minutes (max_delay), it is also terminated.
exec dbms_workload_replay.set_replay_timeout(enabled=>true, min_delay=>1, max_delay=>60, delay_factor=>5);

-- create some filters to exclude unnecessary sessions
exec dbms_workload_replay.add_filter(fname=>'NO_ONBRDMGR', fattribute => 'USER', fvalue=>'ONBRDMGR')
exec dbms_workload_replay.add_filter(fname=>'NO_DBSNMP', fattribute => 'USER', fvalue=>'DBSNMP')
exec dbms_workload_replay.add_filter(fname=>'NO_SYSTEM', fattribute => 'USER', fvalue=>'SYSTEM')
exec dbms_workload_replay.add_filter(fname=>'NO_SYS', fattribute => 'USER', fvalue=>'SYS')

-- in order to use these filters, we need to create a filter set.
-- All filters created after the last filter set are included in the new filter set automatically.
-- The filter/filter set is created in the replay folder, not in the database.
-- Slow, roughly 5 minutes.
exec dbms_workload_replay.create_filter_set(replay_dir=>'DB_REPLAY_CAPTURE_DIR', filter_set=> 'FILTER', default_action=>'INCLUDE')

-- use the filter set we created
exec dbms_workload_replay.use_filter_set(filter_set=>'FILTER');

-- slow, about 3 minutes
exec dbms_workload_replay.prepare_replay(synchronization=>FALSE, CONNECT_TIME_SCALE=>100, THINK_TIME_SCALE=>100, scale_up_multiplier=>1, think_time_auto_correct=>true);
Then, at the OS level, run a calibration to find out how many replay clients we need:
$ wrc mode=calibrate replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture

Workload Replay Client: Release 12.1.0.2.0 - Production on Fri Feb 22 11:41:48 2019
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Report for Workload in: /t02/app/oracle/oradata/PPRD3/db_replay_capture
-----------------------
Recommendation:
Consider using at least 22 clients divided among 6 CPU(s)
You will need at least 127 MB of memory per client process.
If your machine(s) cannot match that number, consider using more clients.

Workload Characteristics:
- max concurrency: 748 sessions
- total number of sessions: 134805

Assumptions:
- 1 client process per 50 concurrent sessions
- 4 client processes per CPU
- 256 KB of memory cache per concurrent session
- think time scale = 100
- connect time scale = 100
- synchronization = TRUE
In my case, 22 clients are needed. It is recommended to run the clients on a separate machine.
export ORACLE_SID=PPRD3
export ADR_BASE=/l01/app/oracle/replay_client

nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc1.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc2.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc3.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc4.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc5.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc6.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc7.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc8.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc9.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc10.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc11.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc12.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc13.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc14.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc15.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc16.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc17.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc18.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc19.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc20.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc21.log &
nohup wrc system/xxxxxxx mode=replay dscn_off=true replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture > wrc22.log &
Then go back to the database and start the replay:
exec dbms_workload_replay.start_replay;
The following script can be used to monitor the approximate progress of the replay.
But it does not work when synchronization is off.
set serveroutput on
DECLARE
  my_next_ticker NUMBER;
  clock          NUMBER;
  wait_for_scn   NUMBER;
  counts         NUMBER;
  replay_id      NUMBER;
  thr_failure    NUMBER;
  start_time     DATE;
  num_tickers    NUMBER;
  min_scn        NUMBER;
  max_scn        NUMBER;
  done           NUMBER;
  total_time     INTERVAL DAY TO SECOND;
  CURSOR get_next_ticker(my_next_ticker NUMBER) IS
    SELECT spid, event, inst_id, wrc_id, client_pid
      FROM gv$workload_replay_thread
     WHERE file_id = my_next_ticker;
BEGIN
  dbms_output.put_line('********************************');
  dbms_output.put_line('*     Replay Status Report     *');
  dbms_output.put_line('********************************');

  -----------------------------------------
  -- Make sure that a replay is in progress
  -----------------------------------------
  SELECT count(*) INTO counts
    FROM dba_workload_replays
   WHERE status='IN PROGRESS';
  if (counts = 0) then
    dbms_output.put_line('No replay in progress!');
    return;
  end if;

  -------------------
  -- Get replay state
  -------------------
  SELECT id, start_time INTO replay_id, start_time
    FROM dba_workload_replays
   WHERE status='IN PROGRESS';

  SELECT count(*) INTO counts
    FROM gv$workload_replay_thread
   WHERE session_type = 'REPLAY';

  SELECT min(wait_for_scn), max(next_ticker), max(clock)
    INTO wait_for_scn, my_next_ticker, clock
    FROM v$workload_replay_thread
   WHERE wait_for_scn <> 0
     AND session_type = 'REPLAY';

  dbms_output.put_line('Replay has been running for: ' || to_char(systimestamp - start_time));
  dbms_output.put_line('Current clock is: ' || clock);
  dbms_output.put_line('Replay is waiting on clock: ' || wait_for_scn);
  dbms_output.put_line(counts || ' threads are currently being replayed.');

  ----------------------------------------
  -- Find info about the next clock ticker
  ----------------------------------------
  num_tickers := 0;
  for rec in get_next_ticker(my_next_ticker) loop
    -- We only want the next clock ticker
    num_tickers := num_tickers + 1;
    exit when num_tickers > 1;
    dbms_output.put_line('Next ticker is process ' || rec.spid || ' (' || rec.wrc_id || ',' || rec.client_pid ||
                         ') in instance ' || rec.inst_id || ' and is waiting on ');
    dbms_output.put_line(' ' || rec.event);
  end loop;

  ---------------------------------------------------------------------------------------
  -- Compute the replay progression and estimate the time left
  -- Note: This is an estimated time only, not an absolute value as it is based on SCN.
  ---------------------------------------------------------------------------------------
  SELECT min(post_commit_scn), max(post_commit_scn)
    INTO min_scn, max_scn
    FROM wrr$_replay_scn_order;

  done := (clock - min_scn) / (max_scn - min_scn);
  total_time := (systimestamp - start_time) / done;

  dbms_output.put_line('Estimated progression in replay: ' || to_char(100*done, '00') || '% done.');
  dbms_output.put_line('Estimated time before completion: ' || ((1 - done) * total_time));
  dbms_output.put_line('Estimated total time for replay: ' || total_time);
  dbms_output.put_line('Estimated final time for replay: ' || to_char(start_time + total_time, 'DD-MON-YY HH24:MI:SS'));
END;
/
After replay:
set linesize 170 pagesize 999
set long 30000000 longchunksize 1000 serverout on

select dbms_workload_replay.report(replay_id => 53, format => 'HTML') from dual;
We also need to export the AWR data of this replay; it is needed to compare this replay with other replays (a comparison sketch follows the export/import commands below).
exec DBMS_WORKLOAD_REPLAY.EXPORT_AWR (replay_id => 1);

-- Import:
CREATE USER capture_awr;
SELECT DBMS_WORKLOAD_REPLAY.IMPORT_AWR (replay_id => 1, staging_schema => 'capture_awr') FROM DUAL;
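Once the AWR data of two runs is available, the replays can be compared. A minimal sketch using DBMS_WORKLOAD_REPLAY.COMPARE_PERIOD_REPORT (the replay IDs 1 and 2 are placeholders):

var rpt clob
begin
  -- compare replay 1 against replay 2 and return an HTML report
  dbms_workload_replay.compare_period_report(replay_id1 => 1,
                                             replay_id2 => 2,
                                             format     => 'HTML',
                                             result     => :rpt);
end;
/
set long 30000000 longchunksize 1000 linesize 170 pagesize 999
print rpt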
Before starting a database replay, I suggest reading the following notes:
Troubleshooting Slow Application Testing Replay (Doc ID 2043847.1)
- Make sure there are enough clients; run calibrate before starting the replay.
- If too many sessions are waiting on 'WCR: replay lock order', add the parameter dscn_off=true to the WRC client to ignore SCN dependencies during replay.
- If too many sessions are waiting on 'WCR: replay clock', try setting sync=false to speed up the replay. The reason is that some user actions, such as calls to dbms_pipe or dbms_lock.sleep, cannot be synchronized during replay; likewise, when a significant part of the workload comes from PL/SQL and the blocks or functions contain complicated logic or multiple commits, they are hard to synchronize.
- Check the replay divergence.
- JAVA_XA is not supported; it is captured as normal PL/SQL workload. To avoid problems during workload replay, consider dropping the JAVA_XA package on the replay system so the replay can complete successfully.
- If the application uses a large number of sequences, increase _wcr_seq_cache_size to 131070. This will increase PGA consumption (see the sketch after this list).
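A sketch of how that parameter could be set. It is a hidden parameter mentioned in the note above, so I would only change it on advice of Oracle Support, and whether it can be changed dynamically may depend on the version:

-- set the hidden sequence-cache parameter for the replay instance
alter system set "_wcr_seq_cache_size"=131070 scope=spfile;
-- restart the instance before starting the replay so the new value is picked up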
Database Capture and Replay: Common Errors and Reasons (Doc ID 463263.1)
- Watch the performance of the file system used to store the capture/replay files.
- The folder used for the capture must be empty, without even a hidden file in it.
- The capture directory cannot be on an ASM disk.
- To replay on RAC, the capture/replay folder has to be on a shared file system.
- During capture, CAPTURE_STS=TRUE is not supported on RAC.
- STATISTICS_LEVEL cannot be BASIC; it has to be at least TYPICAL.
- In sqlnet.ora, DIAG_ADR_ENABLED should be set to ON, or set $ADR_BASE for the client.
- The version used to preprocess the capture files should be the same as the version of the database we plan to replay on.
- Increase the nproc limit for the user who runs the clients (see the sketch after this list).
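For the nproc limit, a sketch of the usual /etc/security/limits.conf entries (the value 16384 is an assumption; size it to the number of replay clients and their sessions):

# /etc/security/limits.conf on the machine running the wrc clients
oracle   soft   nproc   16384
oracle   hard   nproc   16384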
Database Real Application Testing Diagnostic Information (Doc ID 1287620.1)
1. Server-side tracing

alter system set "_wcr_control"=8;   -- turn server-side replay tracing on
alter system set "_wcr_control"=0;   -- turn it back off
2. Client-side tracing

wrc user/passwd@server replaydir=<capture-dir> workdir=<client-traces-dir> debug=ON
Scripts to Debug Slow Replay (Doc ID 760402.1)
If we want to re-run the replay, we need to flash back the database. To release space in the FRA, we also need to recreate the restore point and delete all archived logs:
RMAN> flashback database to restore point before_db_replay;
RMAN> alter database open resetlogs;
RMAN> drop restore point before_db_replay;    -- all flb files will be removed automatically
RMAN> shutdown immediate;
RMAN> startup mount;
RMAN> delete archivelog all;
RMAN> create restore point before_db_replay guarantee flashback database;
RMAN> alter database open;
Miscellaneous:
- Login Storm Test
Set CONNECT_TIME_SCALE to 1 and all threads will try to connect to the database at almost the same time.
- Pressure Test
Set THINK_TIME_SCALE to 1 and all threads will issue their SQL statements as fast as possible. Set SCALE_UP_MULTIPLIER to a value larger than 1 and the query statements will be run multiple times.
- Loading Test
Keep CONNECT_TIME_SCALE, THINK_TIME_SCALE, and SCALE_UP_MULTIPLIER at their default values and check the actual replay time; it should be close to the capture time. A sketch of these prepare_replay settings follows this list.
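A sketch of the corresponding dbms_workload_replay.prepare_replay calls, one per scenario (the scale_up_multiplier value of 2 and the defaults of 100/100/1 are my assumptions based on the parameter descriptions above):

-- login storm: compress connect times so sessions connect almost simultaneously
exec dbms_workload_replay.prepare_replay(synchronization=>FALSE, connect_time_scale=>1);

-- pressure test: remove think time and scale up the query part of the workload
exec dbms_workload_replay.prepare_replay(synchronization=>FALSE, think_time_scale=>1, scale_up_multiplier=>2);

-- loading test: leave connect_time_scale/think_time_scale/scale_up_multiplier at their defaults
exec dbms_workload_replay.prepare_replay(synchronization=>FALSE);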
- Query-only mode
Query-Only Mode Database Replay (Doc ID 1919760.1)
https://docs.oracle.com/database/121/RATUG/GUID-756772FF-1F7E-4EB5-AFB5-152CB3DF6704.htm#RATUG4204
Read-only replay is a good and easy way to find regressions in SELECT statements. Its advantage over testing the statements individually (for example with SPA) is that it executes the queries with all captured bind sets, and the captured workload is executed with the same concurrency as when it was captured.
-- create a subdirectory under the replay folder
create directory DB_REPLAY_CAPTURE_DIR_RO as '/t02/app/oracle/oradata/PPRD3/db_replay_capture/readonly';

-- create a subset of the capture
EXEC DBMS_WORKLOAD_REPLAY.GENERATE_CAPTURE_SUBSET ('DB_REPLAY_CAPTURE_DIR', 'DB_REPLAY_CAPTURE_DIR_RO', 'READONLY', 0, TRUE, 36000, FALSE, 1);

EXEC DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE ('DB_REPLAY_CAPTURE_DIR_RO');

exec dbms_workload_replay.set_replay_directory('DB_REPLAY_CAPTURE_DIR');
exec dbms_workload_replay.begin_replay_schedule('readonly-query-1');

select dbms_workload_replay.add_capture(capture_dir_name => 'DB_REPLAY_CAPTURE_DIR_RO', start_delay_seconds => 0, stop_replay => 'N', query_only => 'Y') from dual;

exec dbms_workload_replay.end_replay_schedule;

exec dbms_workload_replay.initialize_replay(replay_name => 'pprd_capture_4', replay_dir=>'DB_REPLAY_CAPTURE_DIR_RO')
exec dbms_workload_replay.prepare_replay(synchronization=>FALSE, CONNECT_TIME_SCALE=>1, THINK_TIME_SCALE=>1);

$ wrc mode=calibrate replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture/readonly
$ wrc system/xxxxxxx mode=replay replaydir=/t02/app/oracle/oradata/PPRD3/db_replay_capture/readonly

exec dbms_workload_replay.start_replay;