Some info on Oracle setup at AGLT2
When Oracle stops working...
Check the EM:
https://umors.grid.umich.edu:1158/em
After restarting umors, it is usually necessary to do "service dbora stop" followed by "service dbora start".
After kernel upgrades, it is necessary to update the Oracle ASM RPMS. They can be downloaded from here:
http://www.oracle.com/technology/software/tech/linux/asmlib/rhel5.html
Restarting/enabling streams propagation
If some interruption at the CERN end causes the propagation process to go into a disabled state, you may need to manually re-enable the job.
Note: Use "tnsping" to verify that the CERN Oracle machines are online. Hosts are listed in /opt/app/oracle/oracle/product/10.2.0/asm/network/admin/tnsnames.ora under INTR.CERN.CH. (UPDATE: tnsping is supposed to be a way to check, but it times out even when propagation is working, so treat the result with caution.)
The easiest way is to use the EM:
https://umors.grid.umich.edu:1158/em
- Go to the Maintenance tab
- Under Streams click on Management
- Click on the "1" next to Propagation Jobs to see the status of the job(s) and a link to the last error message
- To change the status, click on the link in the Status column. After enabling go back to the list and refresh a few times to see that the status remains "enabled". The link under the Error column will show any errors.
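If the EM interface happens to be unreachable, the propagation job can in principle also be bounced from SQL*Plus. This is a sketch only; it assumes the propagation name STREAMS_MUONCALIB_PROPAGATE from the Streams setup recorded further below, run as the Streams administrator (strmadmin):

```sql
-- Sketch: stop and restart a disabled propagation job by hand
EXEC DBMS_PROPAGATION_ADM.STOP_PROPAGATION('STREAMS_MUONCALIB_PROPAGATE');
EXEC DBMS_PROPAGATION_ADM.START_PROPAGATION('STREAMS_MUONCALIB_PROPAGATE');
```

As with the EM route, re-check the status a few times afterwards to confirm it stays enabled.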
Before beginning production we need to update our current Oracle installation on umors.grid.umich.edu to 10.2.0.4 via
PatchSet10204 (patch 6810189) as well as applying a patch to the agent (p5908032_10203_GENERIC).
Notes about procedure:
- We need to determine the notification procedure (who/how) when we have maintenance on our calibration DB muoncal.grid.umich.edu.
- Find patches, readmes and knowledgebase articles via Oracle Metalink at http://metalink.oracle.com
- Setup downtime interval in Oracle via the EM interface http://umors.grid.umich.edu:1158/em
- Download/prepare patches and printout instructions
- Stop db, crs and agent
- srvctl stop database -d muoncal.grid.umich.edu
- srvctl stop asm
- sudo crsctl stop crs
- sudo /etc/init.d/oma stop
We need to apply a set of patches (tracking what CERN has done) to our Oracle instance. The primary web page is
https://twiki.cern.ch/twiki/bin/view/PSSGroup/PatchesOnTop10204 but there are also a number of links to check off from this site. Specifically there are some Streams patches and some recommendations from Gancho about best practices for replicated schemas at
https://twiki.cern.ch/twiki/bin/view/Atlas/DatabaseSchemasUnderReplication .
Our plan is to clean up the Oracle space usage (see below) first, then to upgrade our RHEL 4 U7 installation (primary upgrade is the new kernel) which will necessitate building a new RAID disk module and updating the ASM RPMS from Oracle (details below), followed by application of the "PatchesOnTop10204" as per the CERN Twiki.
See below for "preparation" work done. Here is the sequence of steps we need:
Notes about procedure:
- We need to determine the notification procedure (who/how) when we have maintenance on our calibration DB muoncal.grid.umich.edu.
- Find patches, readmes and knowledgebase articles via Oracle Metalink at http://metalink.oracle.com
- Setup downtime interval in Oracle via the EM interface http://umors.grid.umich.edu:1158/em
- Download/prepare patches and printout instructions
- Stop db, crs and agent (as 'root')
- service dbora stop
- service oracleasm stop
- The relevant patches to apply (as user 'oracle')
- First update 'opatch': 6880880 (Just 'cd $ORACLE_HOME' and 'unzip -x p688...')
- Apply 4693355 ASM does not close open descriptors...
- Apply 7592346 CPU Jan09
- Apply 7612639 10.2.0.4 GENERIC RECOMMENDED BUNDLE #3
- Apply 7033630 STREAMS related patch for APPLY (Special post-install notes)
All patch zip files were copied into
/app/oracle/patchsets/EMStagedPatches/
and then unzipped which creates a sub-directory with the patch number.
Details of each patch's README follow.
I also applied the recommendations in Metalink Note 555579.1, 10.2.0.4 Patch Set - Availability and Known Issues:
SQL> alter system set "_enable_NUMA_optimization"=FALSE scope=spfile sid='*';
System altered.
SQL> alter system set "_db_block_numa"=1 scope=spfile sid='*';
System altered.
As of February 5, 2009 the STREAMS replication is still "hung". Recommendations from Eva DaFonte/CERN suggested reading the following MetaLink documents:
- Note 290143.1 Restarting the capture process
- Note 313279.1 Determining the status of the capture/archive log process
- Note 406479.1 Changed destination of the archive log file impacting STREAMS
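Based on those notes, a manual capture restart can be sketched as follows (assuming the capture name STREAMS_CAPTURE_MICH from the CERN monitoring report, run as strmadmin on muoncal):

```sql
-- Sketch (per Note 290143.1): stop and restart the capture process
EXEC DBMS_CAPTURE_ADM.STOP_CAPTURE('STREAMS_CAPTURE_MICH');
EXEC DBMS_CAPTURE_ADM.START_CAPTURE('STREAMS_CAPTURE_MICH');
-- Then confirm it shows as ENABLED
SELECT capture_name, status FROM DBA_CAPTURE;
```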
Patch 4693355
First 'cd' to directory 4693355. Next 'export OBJECT_MODE=32_64'. Then 'opatch apply'.
muoncal on umors.grid.umich.edu: opatch apply
Invoking OPatch 10.2.0.4.5
Oracle Interim Patch Installer version 10.2.0.4.5
Copyright (c) 2008, Oracle Corporation. All rights reserved.
Oracle Home : /app/oracle/oracle/product/10.2.0/asm
Central Inventory : /app/oracle/oraInventory
from : /etc/oraInst.loc
OPatch version : 10.2.0.4.5
OUI version : 10.2.0.4.0
OUI location : /app/oracle/oracle/product/10.2.0/asm/oui
Log file location : /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch2009-02-03_16-14-34PM.log
...
Verifying the update...
Inventory check OK: Patch ID 4693355 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 4693355 are present in Oracle Home.
The local system has been patched and can be restarted.
OPatch succeeded.
Patch 7592346
First 'cd' to directory 7592346. Next 'opatch napply -skip_subset -skip_duplicate':
muoncal on umors.grid.umich.edu: opatch napply -skip_subset -skip_duplicate
Invoking OPatch 10.2.0.4.5
Oracle Interim Patch Installer version 10.2.0.4.5
Copyright (c) 2008, Oracle Corporation. All rights reserved.
UTIL session
Oracle Home : /app/oracle/oracle/product/10.2.0/asm
Central Inventory : /app/oracle/oraInventory
from : /etc/oraInst.loc
OPatch version : 10.2.0.4.5
OUI version : 10.2.0.4.0
OUI location : /app/oracle/oracle/product/10.2.0/asm/oui
Log file location : /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch2009-02-03_16-22-32PM.log
Patch history file: /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch_history.txt
Invoking utility "napply"
Checking conflict among patches...
Checking if Oracle Home has components required by patches...
Checking skip_duplicate
These patches will be skipped because they are duplicates of some patch(es) in the Oracle Home: 7155248,7155249,7155250,7155251,7155252,7155253,7155254,7197583
OPatch continues with these patches: 7375611,7375613,7375617,7592346,7609057,7609058
Checking skip_subset
Checking conflicts against Oracle Home...
Conflicts/Supersets for each patch are:
Patch : 7592346
Bug Superset of 7150470
Super set bugs are:
7150470
OPatch found that the following patches are not required.
They are either subset of the patches in Oracle Home (or) subset of the patches in the given list
(or) duplicate :
7155248 7155249 7155250 7155251 7155252 7155253 7155254 7197583
Following patches will be rolled back from Oracle Home on application of the patches in the given list :
7150470
Do you want to proceed? [y|n]
...
Return Code = 0
The local system has been patched and can be restarted.
UtilSession: N-Apply done.
--------------------------------------------------------------------------------
The following warnings have occurred during OPatch execution:
1) OUI-67302:
OPatch found that the following patches are not required.
They are either subset of the patches in Oracle Home (or) subset of the patches in the given list
(or) duplicate :
7155248 7155249 7155250 7155251 7155252 7155253 7155254 7197583
2) OUI-67303:
Following patches will be rolled back from Oracle Home on application of the patches in the given list :
7150470
--------------------------------------------------------------------------------
OPatch Session completed with warnings.
OPatch completed with warnings.
NOTE: We have patch 7150470 being rolled back... this will need to be handled. OK, nothing to do there, since I checked after installing and found:
There are no patches to apply.
UtilSession: N-Apply done.
--------------------------------------------------------------------------------
The following warnings have occurred during OPatch execution:
1) OUI-67302:
OPatch found that the following patches are not required.
They are either subset of the patches in Oracle Home (or) subset of the patches in the given list
(or) duplicate :
7150470 7155248 7155249 7155250 7155251 7155252 7155253 7155254 7197583
There are no patches to apply.
--------------------------------------------------------------------------------
OPatch Session completed with warnings.
OPatch completed with warnings.
Next step is to run 'sh cpu_root.sh' (as 'root'). Just worked... no output.
POSTINSTALL STEPS:
NOTE: This has to be done LATER since the ASM and DB instances are shut down.
After installing the patch, perform the following actions:
1.
Load modified .sql files into the database, as explained in Section 3.3.2.1.
2.
Recompile views in the database, if necessary, as explained in Section 3.3.2.2.
3.3.2.1 Loading Modified .sql Files into the Database
For a new or upgraded database, check Section 3.3.5, "Post Installation Instructions for New and Upgraded Databases" to see whether you need to perform the steps in this section.
To load modified .sql files into the database, follow these steps:
1.
If there is a database in the Oracle home that you are patching, start all database instances running from this Oracle home.
2.
For each database instance running on the Oracle home being patched, connect to the database using SQL*Plus. Connect as SYSDBA and run the catbundle.sql script as follows:
cd $ORACLE_HOME/rdbms/admin
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> STARTUP
SQL> @catbundle.sql cpu apply
SQL> QUIT
For information about the catbundle.sql script, see My Oracle Support Note 605795.1 Introduction to Oracle Database catbundle.sql.
3.
Check the following log files in $ORACLE_HOME/cfgtoollogs/catbundle for any errors:
catbundle_CPU_<database SID>_APPLY_<TIMESTAMP>.log
catbundle_CPU_<database sid>_GENERATE_<TIMESTAMP>.log
where TIMESTAMP is of the form YYYYMMMDD_HH_MM_SS. If there are errors, refer to Section 4, "Known Issues".
3.3.2.2 Recompiling Views in the Database
You may skip this section if you have recompiled views for this database during the installation of a previous CPU or if the database was created with release 11.1.0.6 or later. For new databases, see Section 3.3.5, "Post Installation Instructions for New and Upgraded Databases".
The time required to recompile the views and related objects depends on the total number of objects and on your system configuration. In one internal Oracle test with approximately 2000 views and 4000 objects, the total execution time for view_recompile_jan2008cpu.sql and utlrp.sql was about 30 minutes.
Note:
Depending on these considerations and your downtime schedule, you can choose to schedule the recompilation of views independent of the rest of the CPU installation. If you do this, your system will continue to work; however, the CPU installation will not be complete until the view recompilation is completed.
If you want to check whether view recompilation has already been performed for the database, execute the following statement.
SELECT * FROM registry$history where ID = '6452863';
If the view recompilation has been performed, this statement returns one or more rows. If the view recompilation has not been performed, this statement returns no rows.
To recompile the views in the database, follow these steps:
1.
Run the pre-check script (so named because it was initially released in CPUJan2008), which reports the maximum number of views and objects that may be recompiled:
cd $ORACLE_HOME/cpu/view_recompile
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> @recompile_precheck_jan2008cpu.sql
SQL> QUIT
The purpose of this step is to help you determine whether view recompilation should be done at the same time as the CPU install, or scheduled later.
2.
If the database is not in a RAC environment, perform this step. (If the database is in a RAC environment, go to the next step.)
Run the view recompilation script. Note that this script is run with the database in upgrade mode, which restricts connections as SYSDBA.
cd $ORACLE_HOME/cpu/view_recompile
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP UPGRADE
SQL> @view_recompile_jan2008cpu.sql
SQL> SHUTDOWN;
SQL> STARTUP;
SQL> QUIT
3.
If the database is in a RAC environment, run the view recompilation script as follows. Note that this script is run with the database in upgrade mode, which restricts connections as SYSDBA.
cd $ORACLE_HOME/cpu/view_recompile
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> STARTUP NOMOUNT
SQL> ALTER SYSTEM SET CLUSTER_DATABASE=FALSE SCOPE=spfile;
SQL> SHUTDOWN
SQL> STARTUP UPGRADE
SQL> @view_recompile_jan2008cpu.sql
SQL> SHUTDOWN;
SQL> STARTUP NOMOUNT;
Set the CLUSTER_DATABASE initialization parameter to TRUE:
SQL> ALTER SYSTEM SET CLUSTER_DATABASE=TRUE SCOPE=spfile;
Restart the database:
SQL> SHUTDOWN;
SQL> STARTUP;
SQL> QUIT
4.
Check the log file for any errors. The log file is in the current directory and is named: vcomp_<sid>_<timestamp>.log
5.
If any invalid objects were reported, run the utlrp.sql script as follows:
cd $ORACLE_HOME/rdbms/admin
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> @utlrp.sql
Then, manually recompile any invalid objects. For example:
SQL> alter package schemaname.packagename compile;
Patch 7612639
First 'cd' to directory 7612639. Next 'opatch napply -skip_subset -skip_duplicate'.
muoncal on umors.grid.umich.edu: opatch napply -skip_subset -skip_duplicate
Invoking OPatch 10.2.0.4.5
Oracle Interim Patch Installer version 10.2.0.4.5
Copyright (c) 2008, Oracle Corporation. All rights reserved.
UTIL session
Oracle Home : /app/oracle/oracle/product/10.2.0/asm
Central Inventory : /app/oracle/oraInventory
from : /etc/oraInst.loc
OPatch version : 10.2.0.4.5
OUI version : 10.2.0.4.0
OUI location : /app/oracle/oracle/product/10.2.0/asm/oui
Log file location : /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch2009-02-03_16-31-45PM.log
...
Verifying the update...
Inventory check OK: Patch ID 7691766 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 7691766 are present in Oracle Home.
Running make for target ioracle
Running make for target client_sharedlib
Running make for target client_sharedlib
The local system has been patched and can be restarted.
UtilSession: N-Apply done.
OPatch succeeded.
Patch 7033630
First 'cd' to directory 7033630. Next 'export OBJECT_MODE=32_64'. Then 'opatch apply'.
muoncal on umors.grid.umich.edu: opatch apply
Invoking OPatch 10.2.0.4.5
Oracle Interim Patch Installer version 10.2.0.4.5
Copyright (c) 2008, Oracle Corporation. All rights reserved.
Oracle Home : /app/oracle/oracle/product/10.2.0/asm
Central Inventory : /app/oracle/oraInventory
from : /etc/oraInst.loc
OPatch version : 10.2.0.4.5
OUI version : 10.2.0.4.0
OUI location : /app/oracle/oracle/product/10.2.0/asm/oui
Log file location : /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch2009-02-03_16-37-17PM.log
...
Verifying the update...
Inventory check OK: Patch ID 7033630 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 7033630 are present in Oracle Home.
OPatch succeeded.
POSTINSTALL (Will have to be done after we startup the DB):
# Patch Special Instructions:
# ---------------------------
# Post-install steps:
# 1) Stop streams or logical standby if they are running.
# 2) Connect to the database as SYS and load prvtlmc.plb.
# 3) Restart streams or logical standby as needed.
# NOTE: On a RAC, users should ensure that neither streams nor
# SQL Apply is running on any instance at the time prvtlmc.plb
# is loaded. Further note that prvtlmc.plb need only be loaded
# into the database once (from any instance).
Blocking Problem for 10.2.0.4 Patchset Application
As of January 30th we have also run out of space on umors.grid.umich.edu. The specific message on
https://umors.grid.umich.edu:1158/em is "Details Archiver is unable to archive a redo log because the output device is full or unavailable. The instance is open." We received an email from the CERN Oracle Streams monitoring as well:
Streams Monitor Error Report
Report date: 2009-01-30 10:09:40
Affected Site: UMICH
Affected Database: MUONCAL.GRID.UMICH.EDU
Process Name: STREAMS_CAPTURE_MICH
Error Time: 29-01-2009 21:07:57
Error Message: Capture latency higher than default limit of 90 mins. Please check.
Current process status: PAUSED FOR FLOW CONTROL
See also: https://oms3d.cern.ch:1159/streams/streams
This had previously happened in October 2008 and at that time we were able to free up a significant amount of space by removing .trc files older than 120 days via
find . -mtime +120 -exec rm {} \;
in the
/app/oracle/oracle/product/10.2.0/asm/admin/muoncal/bdump
directory. This time the total space used was 3.2GB there but this command only freed up 200MB.
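A slightly safer variant of that cleanup restricts the match to .trc files and does a dry run first. This is a sketch using a placeholder directory (set BDUMP to the real bdump path above before using it):

```shell
# Cleanup sketch: only *.trc files older than 120 days, dry run first.
# BDUMP defaults to a demo path here; point it at the real bdump directory.
BDUMP="${BDUMP:-/tmp/bdump-demo}"
mkdir -p "$BDUMP"
# Dry run: list what would be deleted
find "$BDUMP" -name '*.trc' -mtime +120 -print
# Real deletion, restricted to trace files
find "$BDUMP" -name '*.trc' -mtime +120 -exec rm -f {} \;
```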
You can most clearly see the problem on umors.grid.umich.edu as follows:
[umors:bdump]# su - oracle
.profile executed
muoncal on umors.grid.umich.edu: sqlplus "SYS/xxxxxxxxx as SYSDBA"
SQL*Plus: Release 10.2.0.4.0 - Production on Fri Jan 30 09:44:36 2009
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> SELECT NAME,TYPE,TOTAL_MB,FREE_MB FROM V$ASM_DISKGROUP;
NAME TYPE TOTAL_MB FREE_MB
------------------------------ ------ ---------- ----------
DATAFLASH NORMAL 476944 40
SQL>
Only 40MB are free (actually less since the ASM area is mirrored)! This is what we need to address to get the system operational so we can apply patches. The problem is almost certainly that we have too many archive log sequences stored and the old ones need to be cleaned up.
We need to run 'asmcmd' to access the ASM area. To do this 'su - oracle' and be sure to set the correct ORACLE_SID:
export ORACLE_SID=+ASM
asmcmd -p
This works but I wanted to check some more things before using 'asmcmd'.
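Before deleting anything, asmcmd itself can report usage; a sketch of the relevant 10.2 commands:

```
export ORACLE_SID=+ASM
asmcmd lsdg                              # per-diskgroup Total_MB / Free_MB
asmcmd du DATAFLASH/MUONCAL/ARCHIVELOG   # space used under the archive logs
```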
Running RMAN
To run 'rman' I had to do this:
muoncal on umors.grid.umich.edu: /app/oracle/oracle/product/10.2.0/asm/bin/rman TARGET SYS/xxxxxxx@muoncal
Recovery Manager: Release 10.2.0.4.0 - Production on Fri Jan 30 13:14:14 2009
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: MUONCAL (DBID=3255750732)
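The RMAN route described in Doc ID 368844.1 looks roughly like the following sketch (which we did not ultimately use; the 30-day window is just an example value):

```
RMAN> crosscheck archivelog all;
RMAN> delete noprompt archivelog all completed before 'SYSDATE-30';
```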
Fixing the ASM ARCHIVELOG Management
[umors:~]# su - oracle
.profile executed
muoncal on umors.grid.umich.edu: sqlplus "SYS/xxxxxxx as SYSDBA"
SQL*Plus: Release 10.2.0.4.0 - Production on Fri Jan 30 12:23:17 2009
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size 2183624 bytes
Variable Size 966569528 bytes
Database Buffers 1174405120 bytes
Redo Buffers 4325376 bytes
Database mounted.
SQL> alter database noarchivelog;
alter database noarchivelog
*
ERROR at line 1:
ORA-38774: cannot disable media recovery - flashback database is enabled
SQL> alter database flashback off;
Database altered.
SQL> alter database noarchivelog;
Database altered.
SQL>
SQL> alter database open;
Database altered.
SQL> archive log list;
Database log mode No Archive Mode
Automatic archival Disabled
Archive destination +DATAFLASH/muoncal/
Oldest online log sequence 4907
Current log sequence 4909
Now we should be able to access things to allow us to clean up all the ARCHIVELOG content.
Reclaiming Space
On Oracle's Metalink I found a few good articles:
- Subject: How To Reclaim Asm Disk Space? Doc ID: 351866.1
- Subject: How To Use RMAN To Delete Archivelog Files (Including ASM) Doc ID: 368844.1
- Subject: How To Delete Archive Log Files Out Of +Asm? Doc ID: 300472.1
If you do
export ORACLE_SID=+ASM
and then go into sqlplus you can actually generate a list of files to delete:
select 'alter diskgroup DATAFLASH drop file "DATAFLASH/MUONCAL/ARCHIVELOG/'|| to_char(b.creation_date,'YYYY_MM_DD') ||'/'|| a.name||'";'
  from v$asm_alias a, v$asm_file b
 where a.group_number = b.group_number
   and a.file_number = b.file_number
   and b.type = 'ARCHIVELOG'
 order by a.name;
However this didn't seem to work (it created the SQL statements but didn't apply them).
What ended up cleaning up the bulk of the space was using 'asmcmd' and doing:
+ASM on umors.grid.umich.edu: asmcmd
ASMCMD> pwd
+
ASMCMD> cd DATAFLASH
ASMCMD> cd MUONCAL/ARCHIVELOG
ASMCMD> rm -r 2006*
You may delete multiple files and/or directories.
Are you sure? (y/n) y
ASMCMD> rm -r 2007*
You may delete multiple files and/or directories.
Are you sure? (y/n) y
ASMCMD> rm -r 2008*
You may delete multiple files and/or directories.
Are you sure? (y/n) y
After this we had only 23% used on the DATAFLASH +ASM area.
I then re-enabled the ARCHIVELOG and FLASHBACK settings that I had disabled above.
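For the record, the re-enable sequence is the reverse of the disable steps; a sketch, run in SQL*Plus as SYSDBA:

```sql
-- Sketch: turn ARCHIVELOG and FLASHBACK back on after the cleanup
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE FLASHBACK ON;
ALTER DATABASE OPEN;
```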
Upgrading UMORS Kernel
Part of our task is to get the most recent kernel in place on umors.grid.umich.edu. This is currently 2.6.9-78.0.13. There are two tricky parts to upgrading the kernel:
- The Oracle ASM rpms must be updated
- The mvSata driver must be rebuilt for the new kernel
For both of these we first want to install the new kernel and then worry about fixing things before we reboot.
I used 'up2date-nox -l' to see what was available. Then I downloaded the newest 'smp' kernels via 'up2date-nox -d kernel-smp*'.
The downloaded files show up in /var/spool/up2date/. I just did an RPM install:
[umors:~]# rpm -ivh /var/spool/up2date/kernel-smp-*.rpm
Preparing... ########################################### [100%]
1:kernel-smp-devel ########################################### [ 50%]
2:kernel-smp ########################################### [100%]
WARNING: No module mv_sata found for kernel 2.6.9-78.0.13.ELsmp, continuing anyway
WARNING: No module mv_sata found for kernel 2.6.9-78.0.13.ELsmp, continuing anyway
As you can see there is no mv_sata found. We need to build this for this new kernel and then re-create the initrd image.
To do this, first cd to /root/mvSata-Linux-3.6.1/LinuxIAL, save the existing build.sh and edit it to point to the new kernel. You must also fix the softlink for /usr/src/linux-2.6 to point to the new kernel area ('ln -sfn' forces replacement if the link already exists):
ln -sfn /usr/src/kernels/2.6.9-78.0.13.EL-smp-x86_64/ /usr/src/linux-2.6
After you run 'build.sh' you will have 3 variants of the needed mv_sata.ko module in:
[umors:mvSata-Linux-3.6.1]# ls build/Linux/*
build/Linux/DebugError:
mv_sata.ko
build/Linux/DebugFull:
mv_sata.ko
build/Linux/Free:
mv_sata.ko
I usually 'install' the DebugError version:
cp build/Linux/DebugError/mv_sata.ko /lib/modules/2.6.9-78.0.13.ELsmp/kernel/drivers/scsi/
Then recreate 'initrd':
[umors:mvSata-Linux-3.6.1]# depmod -a 2.6.9-78.0.13.ELsmp
[umors:mvSata-Linux-3.6.1]# mkinitrd -v -f /boot/initrd-2.6.9-78.0.13.ELsmp.img 2.6.9-78.0.13.ELsmp
Creating initramfs
Looking for deps of module scsi_mod
Looking for deps of module sd_mod scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module unknown
Looking for deps of module mv_sata scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module ide-disk
Looking for deps of module raid1
Looking for deps of module ext3 jbd
Looking for deps of module jbd
Using modules: ./kernel/drivers/scsi/scsi_mod.ko ./kernel/drivers/scsi/sd_mod.ko ./kernel/drivers/scsi/mv_sata.ko ./kernel/drivers/md/raid1.ko ./kernel/fs/jbd/jbd.ko ./kernel/fs/ext3/ext3.ko
/sbin/nash -> /tmp/initrd.P18997/bin/nash
/sbin/insmod.static -> /tmp/initrd.P18997/bin/insmod
/sbin/udev.static -> /tmp/initrd.P18997/sbin/udev
/etc/udev/udev.conf -> /tmp/initrd.P18997/etc/udev/udev.conf
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/drivers/scsi/scsi_mod.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/scsi_mod.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/drivers/scsi/sd_mod.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/sd_mod.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/drivers/scsi/mv_sata.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/mv_sata.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/drivers/md/raid1.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/raid1.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/fs/jbd/jbd.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/jbd.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/fs/ext3/ext3.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/ext3.ko(elf64-x86-64)
Loading module scsi_mod
Loading module sd_mod
Loading module mv_sata
Loading module raid1
Loading module jbd
Loading module ext3
Now you can also install the newest ASM rpm:
[umors:~]# rpm -ivh oracleasm-2.6.9-78.0.13.ELsmp-2.0.5-1.el4.x86_64.rpm
Preparing... ########################################### [100%]
1:oracleasm-2.6.9-78.0.13########################################### [100%]
Now we should be good to reboot into this new kernel!
Adding New Disks to UMORS
The last time we ran out of disk space we ordered five 1TB disks to be used to replace the four 250GB disks in UMORS (and one for a shelf-spare). We plan to migrate to these new disks as part of our January 30th maintenance but we need to determine how to do this. The first two disks (/dev/sda and /dev/sdb) are managed by software RAID on Linux. The other two disks (/dev/sdc and /dev/sdd) are managed by Oracle via ASM.
Dropping a disk, adding larger disk
(on umors)
su - oracle
export ORACLE_SID="+ASM"
sqlplus /NOLOG
CONNECT / AS SYSDBA
ASM information queries:
SELECT name, type, total_mb, free_mb, required_mirror_free_mb, usable_file_mb FROM V$ASM_DISKGROUP;
Drop the disk; you have to force it or it stays "HUNG" and doesn't drop:
SQL> ALTER DISKGROUP DATAFLASH DROP DISK DISK2 FORCE;
(this is maybe not needed; the disk never goes offline, and just pulling it worked OK)
Adding the disk - step 1 is to label it:
[umors:~]# /etc/init.d/oracleasm createdisk DISK3 /dev/sdd1
Then add it to the diskgroup:
ALTER DISKGROUP DATAFLASH ADD DISK 'ORCL:DISK3' NAME DISK3;
The above doesn't work right. Here is why. Oracle maintains "failure groups" of storage. For example, all disks on the same system/controller should really be in a single failure group, and to have redundancy you need at least 2 failure groups. Our original configuration had two failure groups:
DISK1 and DISK2, corresponding to disks DISK1 ('ORCL:DISK1') and DISK2 ('ORCL:DISK2'). What the above commands did is as follows:
- The 'DROP DISK' forced a removal of the only member of failure group DISK2, thereby hanging that failure group
- The 'ADD DISK' command created a new failure group named DISK3
To fix this we need to undo the 'ADD DISK' and then specifically RE-ADD this new disk to the original HUNG failure group DISK2. Here are the needed commands run via SQLPLUS as user 'oracle' with ORACLE_SID=+ASM:
SQL> alter diskgroup DATAFLASH drop disk DISK3;
Diskgroup altered.
SQL> ALTER diskgroup DATAFLASH ADD FAILGROUP DISK2 disk 'ORCL:DISK2' name DISK3;
Diskgroup altered.
This makes sure the disk is put into the correct failure group. Note it does force a rebalance (recopy), but once that finishes the MISSING/HUNG disk should disappear from ASM. See the detailed discussion at
http://www.oracloid.com/2006/05/vldb-with-asm/
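The rebalance triggered by the ADD FAILGROUP can be watched from the ASM instance; a sketch:

```sql
-- Any active rebalance shows up here with an estimated time to completion
SELECT group_number, operation, state, est_minutes FROM V$ASM_OPERATION;
```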
Restoration of STREAMS Replication to CERN
On February 10th, 2009 Gancho Dimitrov was able to re-create the STREAMS replication from muoncal.grid.umich.edu to intr.cern.ch by doing the following. Note that this was necessary because the "ARCHIVELOG" mode had been disabled (see above), which resulted in the loss of sequences 4906-4911.
-- 10.Feb.2009
1) -- issue a data dictionary dump, get the SCN and use it as FIRST and START SCN for the CAPTURE
set serveroutput on
DECLARE
scn NUMBER;
BEGIN
DBMS_CAPTURE_ADM.BUILD ( first_scn => scn );
DBMS_OUTPUT.PUT_LINE ('First SCN: ' || scn);
END;
/
First SCN: 6047632153953
2) -- export the data with the FLASHBACK_SCN = the SCN from the data dictionary build
-- drop the tables on the INTR
BEGIN
FOR j in (SELECT owner, table_name tt FROM dba_tables WHERE owner = 'ATLAS_MUONCALIB_MICH') LOOP
execute immediate 'DROP TABLE ' || j.owner ||'."'||j.tt || '" cascade constraints PURGE ';
END LOOP;
END;
/
impdp strmadmin@intr FLASHBACK_SCN=6047632153953 SCHEMAS=ATLAS_MUONCALIB_MICH NOLOGFILE=Y TRANSFORM=SEGMENT_ATTRIBUTES:N NETWORK_LINK=MUONCAL.GRID.UMICH.EDU
3) Streams setup
-- SOURCE DATABASE MICHIGAN--
-- direct Streams setup
-- check init.ora parameters and create streams administrator user!!!
-- tnsnames.ora configuration
-- use create_strmadmin_10g.sql on version 10g
-- check database links
-- check supplemental logging
-- CUSTOMIZE!!!
define var_queue_name=STRM_QUEUE_CA_MICH;
define var_capture_name=STREAMS_CAPTURE_MICH;
define var_source_name=MUONCAL.GRID.UMICH.EDU;
define var_schema_name=ATLAS_MUONCALIB_MICH;
-- connect as Streams administrator: strmadmin
CONNECT strmadmin
-- create the streams queue
BEGIN
DBMS_STREAMS_ADM.SET_UP_QUEUE(
queue_table => 'T_' || '&var_queue_name',
queue_name => '&var_queue_name',
queue_user => 'STRMADMIN');
END;
/
-- I have removed that param from the CREATE_CAPTURE below
-- use_database_link => true,
-- create capture proceses
-- with First SCN: 6047632153953
BEGIN
DBMS_CAPTURE_ADM.CREATE_CAPTURE(
queue_name => 'STRMADMIN.' || '&var_queue_name',
capture_name => '&var_capture_name',
source_database => '&var_source_name',
start_scn => 6047632153953,
first_scn => 6047632153953,
logfile_assignment => 'implicit'
);
END;
/
-- add capture rules (once per schema!!!)
BEGIN
DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
schema_name => '&var_schema_name',
streams_type => 'CAPTURE',
streams_name => '&var_capture_name',
queue_name => 'STRMADMIN.' || '&var_queue_name',
include_dml => true,
include_ddl => true,
include_tagged_lcr => false,
source_database => '&var_source_name',
inclusion_rule => true);
END;
/
---- Oracle Recommendations!
-- reduce the checkpoint retention time (the default is 60 days )
BEGIN
DBMS_CAPTURE_ADM.ALTER_CAPTURE
('&var_capture_name',
checkpoint_retention_time => 7);
END;
/
---- minimize the number of logminer checkpoints when the database is very active
BEGIN
DBMS_CAPTURE_ADM.SET_PARAMETER(
capture_name => '&var_capture_name',
parameter => '_checkpoint_frequency',
value => 500);
END;
/
============================= STREAMS_APPLY_MUONCALIB_MICH ====================================
-- DESTINATION DATABASE --
-- direct Streams setup
DEFINE var_queue_name = Q_AP_MUONCALIB_MICH;
DEFINE var_apply_name = STREAMS_APPLY_MUONCALIB_MICH;
DEFINE var_source_name = MUONCAL.GRID.UMICH.EDU;
-- connect as Streams administrator: strmadmin
-- create the streams queue
BEGIN
DBMS_STREAMS_ADM.SET_UP_QUEUE(
queue_table => 'T' || '&var_queue_name',
queue_name => '&var_queue_name',
queue_user => 'STRMADMIN');
END;
/
-- queue affinity (only RAC database)
begin
dbms_aqadm.alter_queue_table(
queue_table => 'T' || '&var_queue_name',
primary_instance => 2,
secondary_instance => 1);
end;
/
-- create the apply process
BEGIN
DBMS_APPLY_ADM.CREATE_APPLY(
queue_name => 'STRMADMIN.' || '&var_queue_name',
apply_name => '&var_apply_name',
apply_user => 'STRMADMIN',
source_database => '&var_source_name',
apply_captured => TRUE);
END;
/
-- specify an apply user
BEGIN
DBMS_APPLY_ADM.ALTER_APPLY(
apply_name => '&var_apply_name',
apply_user => 'STRMADMIN');
END;
/
/*
-- not needed for the
-- add APPLY schema rules
-- avoid bug when dropping tables after adding PVSS replication between same source and destination
define var_schema_name= ATLAS_MUONCALIB_MICH
BEGIN
DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
schema_name => '&var_schema_name',
streams_type => 'APPLY',
streams_name => '&var_apply_name',
queue_name => 'STRMADMIN.' || '&var_queue_name',
include_dml => TRUE,
include_ddl => TRUE,
include_tagged_lcr => TRUE, -- special case for the APPLY on the ATLR
source_database => '&var_source_name',
inclusion_rule => true);
END;
/
*/
---- Oracle Recommendations!
---- For a workload mix of I/U/D activity, the following apply parameters should be set:
BEGIN
DBMS_APPLY_ADM.SET_PARAMETER(
apply_name => '&var_apply_name',
parameter => 'commit_serialization',
value => 'full');
DBMS_APPLY_ADM.SET_PARAMETER(
apply_name => '&var_apply_name',
parameter => '_hash_table_size',
value => 10000000);
DBMS_APPLY_ADM.SET_PARAMETER(
apply_name => '&var_apply_name',
parameter => '_DYNAMIC_STMTS',
value => 'y');
DBMS_APPLY_ADM.SET_PARAMETER(
apply_name => '&var_apply_name',
parameter => 'parallelism',
value => 1);
DBMS_APPLY_ADM.SET_PARAMETER(
apply_name => '&var_apply_name',
parameter => '_txn_buffer_size',
value => 14);
DBMS_APPLY_ADM.SET_PARAMETER(
apply_name => '&var_apply_name',
parameter => 'DISABLE_ON_ERROR',
value => 'Y');
END;
/
-- start the apply process
BEGIN
DBMS_APPLY_ADM.START_APPLY(
apply_name => '&var_apply_name');
END;
/
===================== Propagation ===================================
define var_source_queue=STRM_QUEUE_CA_MICH;
define var_destination_queue= Q_AP_MUONCALIB_MICH;
define var_source_name=MUONCAL.GRID.UMICH.EDU;
define var_propagation_name=STREAMS_MUONCALIB_PROPAGATE;
define var_destination_name=INTR.CERN.CH;
define var_schema_name=ATLAS_MUONCALIB_MICH;
-- connect as Streams administrator: strmadmin
-- connect strmadmin
-- configure propagation from source to destination database
BEGIN
DBMS_PROPAGATION_ADM.CREATE_PROPAGATION(
propagation_name => '&var_propagation_name',
source_queue => 'STRMADMIN.' || '&var_source_queue',
destination_queue => 'STRMADMIN.' || '&var_destination_queue',
destination_dblink => '&var_destination_name',
queue_to_queue => TRUE );
END;
/
-- change propagation latency - Oracle recommendation
BEGIN
DBMS_AQADM.ALTER_PROPAGATION_SCHEDULE(
queue_name => 'STRMADMIN.' || '&var_source_queue',
destination => '"STRMADMIN"."&var_destination_queue"@&var_destination_name',
latency => 1);
END;
/
-- instantiation
DECLARE
iscn NUMBER; -- Variable to hold instantiation SCN value
BEGIN
-- iscn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER();
iscn:= 6047632153953;
DBMS_APPLY_ADM.SET_SCHEMA_INSTANTIATION_SCN@&var_destination_name(
source_schema_name => '&var_schema_name',
source_database_name => '&var_source_name',
instantiation_scn => iscn,
recursive => true);
END;
/
============ source DB - INTR, start the CAPTURE =================
define var_capture_name=STREAMS_CAPTURE_PVSSCONF;
--prepare instantiation for the DB objects
BEGIN
DBMS_CAPTURE_ADM.prepare_schema_instantiation('&var_schema_name');
END;
/
BEGIN
DBMS_CAPTURE_ADM.start_capture('&var_capture_name');
END;
/
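Once capture, propagation and apply are all running, a quick sanity check on each end is possible via the Streams dictionary views; a sketch (capture and propagation on muoncal, apply on INTR):

```sql
-- On the source (muoncal): the capture process should be ENABLED
SELECT capture_name, status FROM DBA_CAPTURE;
-- On the destination (INTR): the apply process should be ENABLED
SELECT apply_name, status FROM DBA_APPLY;
```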
--
ShawnMcKee - 03 Sep 2008