Some info on Oracle setup at AGLT2

When Oracle stops working...

Check the EM: https://umors.grid.umich.edu:1158/em

After restarting umors, it is usually necessary to do "service dbora stop" followed by "service dbora start".
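
For reference, a minimal sketch of that sequence (run as root; dbora is the local init script that wraps the Oracle services):

# restart the Oracle stack after a reboot of umors
service dbora stop
service dbora start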

After kernel upgrades, it is necessary to update the Oracle ASM RPMS. They can be downloaded from here:

http://www.oracle.com/technology/software/tech/linux/asmlib/rhel5.html
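
A quick way to check whether the installed ASM kernel module matches the running kernel (a sketch; the exact RPM file name depends on the kernel version, as in the install example later on this page):

# confirm the oracleasm kernel-module RPM matches the running kernel
uname -r                  # e.g. 2.6.9-78.0.13.ELsmp
rpm -qa | grep oracleasm  # should include a package built for that kernel
# after downloading the matching RPM from the Oracle page above:
rpm -ivh oracleasm-`uname -r`-*.x86_64.rpm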

Restarting/enabling streams propagation

If some interruption at the CERN end causes the propagation process to go into a disabled state, you may need to manually re-enable the job.

Note: Use "tnsping" to verify that the CERN oracle machines are online. Hosts are in /opt/app/oracle/oracle/product/10.2.0/asm/network/admin/tnsnames.ora under INTR.CERN.CH (UPDATE: tnsping times out even when propagation works, but that's supposed to be a way to check)

The easiest way is to use the EM: https://umors.grid.umich.edu:1158/em
  • Go to the Maintenance tab
  • Under Streams click on Management
  • Click on the "1" next to Propogation Jobs to see a list of the job(s) status and a link to the last error message
  • To change the status, click on the link in the Status column. After enabling go back to the list and refresh a few times to see that the status remains "enabled". The link under the Error column will show any errors.
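
If the EM is unavailable, the same thing can be done from SQL*Plus as the Streams administrator (a sketch, assuming the propagation name created later on this page, STREAMS_MUONCALIB_PROPAGATE):

-- connect as strmadmin on umors
BEGIN
  DBMS_PROPAGATION_ADM.STOP_PROPAGATION(propagation_name => 'STREAMS_MUONCALIB_PROPAGATE');
  DBMS_PROPAGATION_ADM.START_PROPAGATION(propagation_name => 'STREAMS_MUONCALIB_PROPAGATE');
END;
/
-- verify the status stays ENABLED and check for errors
SELECT propagation_name, status, error_message FROM dba_propagation;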

Patch for 10.2.0.3 (PatchSet10204)

Before beginning production we need to update our current Oracle installation on umors.grid.umich.edu to 10.2.0.4 via PatchSet10204 (patch 6810189) as well as applying a patch to the agent (p5908032_10203_GENERIC).

Notes about procedure:
  • We need to determine the notification procedure (who/how) when we have maintenance on our calibration DB muoncal.grid.umich.edu.
  • Find patches, readmes and knowledgebase articles via Oracle Metalink at http://metalink.oracle.com
  • Setup downtime interval in Oracle via the EM interface http://umors.grid.umich.edu:1158/em
  • Download/prepare patches and print out the instructions
  • Stop db, crs and agent
    • srvctl stop database -d muoncal.grid.umich.edu
    • srvctl stop asm
    • sudo crsctl stop crs
    • sudo /etc/init.d/oma stop

Patches for 10.2.0.4 for January 2009 (PatchesJan2009on10204)

We need to apply a set of patches (tracking what CERN has done) to our Oracle instance. The primary web page is https://twiki.cern.ch/twiki/bin/view/PSSGroup/PatchesOnTop10204 but there are also a number of links from that page to check. Specifically there are some Streams patches and some recommendations from Gancho about replicated-schema best practices at https://twiki.cern.ch/twiki/bin/view/Atlas/DatabaseSchemasUnderReplication .

Our plan is to clean up the Oracle space usage (see below) first, then to upgrade our RHEL 4 U7 installation (the primary upgrade is the new kernel), which will necessitate building a new RAID disk module and updating the ASM RPMs from Oracle (details below), followed by application of the "PatchesOnTop10204" patches as per the CERN Twiki.

See below for "preparation" work done. Here is the sequence of steps we need:

Notes about procedure:
  • We need to determine the notification procedure (who/how) when we have maintenance on our calibration DB muoncal.grid.umich.edu.
  • Find patches, readmes and knowledgebase articles via Oracle Metalink at http://metalink.oracle.com
  • Setup downtime interval in Oracle via the EM interface http://umors.grid.umich.edu:1158/em
  • Download/prepare patches and print out the instructions
  • Stop db, crs and agent (as 'root')
    • service dbora stop
    • service oracleasm stop
  • The relevant patches to apply (as user 'oracle')
    • First update 'opatch': 6880880 (Just 'cd $ORACLE_HOME' and 'unzip -x p688...')
    • Apply 4693355 ASM does not close open descriptors...
    • Apply 7592346 CPU Jan09
    • Apply 7612639 10.2.0.4 GENERIC RECOMMENDED BUNDLE #3
    • Apply 7033630 STREAMS related patch for APPLY (Special post-install notes)
All patch zip files were copied into /app/oracle/patchsets/EMStagedPatches/ and then unzipped, which creates a sub-directory with the patch number.
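
The per-patch pattern looks like this (a sketch; the exact zip file name differs per patch, 4693355 is just used as the illustration):

# as user 'oracle'
cd /app/oracle/patchsets/EMStagedPatches
unzip -x p4693355_10204_Linux-x86-64.zip   # zip name is illustrative; creates the sub-directory 4693355
cd 4693355
export OBJECT_MODE=32_64                   # only needed for some patches (see below)
opatch apply                               # or 'opatch napply ...' per the patch README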

Details from each patch's README follow.

I also applied the recommendations in Metalink Note 555579.1, 10.2.0.4 Patch Set - Availability and Known Issues:

SQL> alter system set "_enable_NUMA_optimization"=FALSE scope=spfile sid='*';

System altered.

SQL> alter system set "_db_block_numa"=1 scope=spfile sid='*';

System altered.

As of February 5, 2009 the STREAMS replication is still "hung". Recommendations from Eva DaFonte/CERN suggested reading the following MetaLink documents:

  • Note 290143.1 Restarting the capture process
  • Note 313279.1 Determining the status of the capture/archive log process
  • Note 406479.1 Changed destination of the archive log file impacting STREAMS

Patch 4693355

First 'cd' to directory 4693355. Next 'export OBJECT_MODE=32_64'. Then 'opatch apply'.

muoncal on umors.grid.umich.edu: opatch apply
Invoking OPatch 10.2.0.4.5

Oracle Interim Patch Installer version 10.2.0.4.5
Copyright (c) 2008, Oracle Corporation.  All rights reserved.


Oracle Home       : /app/oracle/oracle/product/10.2.0/asm
Central Inventory : /app/oracle/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 10.2.0.4.5
OUI version       : 10.2.0.4.0
OUI location      : /app/oracle/oracle/product/10.2.0/asm/oui
Log file location : /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch2009-02-03_16-14-34PM.log

...

Verifying the update...
Inventory check OK: Patch ID 4693355 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 4693355 are present in Oracle Home.

The local system has been patched and can be restarted.


OPatch succeeded.

Patch 7592346

First 'cd' to directory 7592346. Next 'opatch napply -skip_subset -skip_duplicate':
muoncal on umors.grid.umich.edu: opatch napply -skip_subset -skip_duplicate
Invoking OPatch 10.2.0.4.5

Oracle Interim Patch Installer version 10.2.0.4.5
Copyright (c) 2008, Oracle Corporation.  All rights reserved.

UTIL session

Oracle Home       : /app/oracle/oracle/product/10.2.0/asm
Central Inventory : /app/oracle/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 10.2.0.4.5
OUI version       : 10.2.0.4.0
OUI location      : /app/oracle/oracle/product/10.2.0/asm/oui
Log file location : /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch2009-02-03_16-22-32PM.log

Patch history file: /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch_history.txt

Invoking utility "napply"
Checking conflict among patches...
Checking if Oracle Home has components required by patches...
Checking skip_duplicate
These patches will be skipped because they are duplicates of some patch(es) in the Oracle Home: 7155248,7155249,7155250,7155251,7155252,7155253,7155254,7197583
OPatch continues with these patches: 7375611,7375613,7375617,7592346,7609057,7609058
Checking skip_subset
Checking conflicts against Oracle Home...

Conflicts/Supersets for each patch are:

Patch : 7592346

        Bug Superset of 7150470
        Super set bugs are:
        7150470

OPatch found that the following patches are not required.
They are either subset of the patches in Oracle Home (or) subset of the patches in the given list
(or) duplicate :
   7155248   7155249   7155250   7155251   7155252   7155253   7155254   7197583

Following patches will be rolled back from Oracle Home on application of the patches in the given list :
   7150470

Do you want to proceed? [y|n]

...


Return Code = 0

The local system has been patched and can be restarted.

UtilSession: N-Apply done.
--------------------------------------------------------------------------------
The following warnings have occurred during OPatch execution:
1) OUI-67302:
OPatch found that the following patches are not required.
They are either subset of the patches in Oracle Home (or) subset of the patches in the given list
(or) duplicate :
   7155248   7155249   7155250   7155251   7155252   7155253   7155254   7197583
2) OUI-67303:
Following patches will be rolled back from Oracle Home on application of the patches in the given list :
   7150470
--------------------------------------------------------------------------------
OPatch Session completed with warnings.

OPatch completed with warnings.

NOTE: We have patch 7150470 being rolled back...this will need to be handled. OK, nothing to do there: checking again after installing showed:
There are no patches to apply.
UtilSession: N-Apply done.
--------------------------------------------------------------------------------
The following warnings have occurred during OPatch execution:
1) OUI-67302:
OPatch found that the following patches are not required.
They are either subset of the patches in Oracle Home (or) subset of the patches in the given list
(or) duplicate :
   7150470   7155248   7155249   7155250   7155251   7155252   7155253   7155254   7197583
There are no patches to apply.
--------------------------------------------------------------------------------
OPatch Session completed with warnings.

OPatch completed with warnings.

Next step is to run 'sh cpu_root.sh' (as 'root'). It just worked; no output.

POSTINSTALL STEPS:

NOTE: This has to be done LATER since the ASM and DB instances are shut down.

After installing the patch, perform the following actions:

   1. Load modified .sql files into the database, as explained in Section 3.3.2.1.
   2. Recompile views in the database, if necessary, as explained in Section 3.3.2.2.

3.3.2.1 Loading Modified .sql Files into the Database

For a new or upgraded database, check Section 3.3.5, "Post Installation Instructions for New and Upgraded Databases" to see whether you need to perform the steps in this section.

To load modified .sql files into the database, follow these steps:

   1. If there is a database in the Oracle home that you are patching, start all database instances running from this Oracle home.
   2. For each database instance running on the Oracle home being patched, connect to the database using SQL*Plus. Connect as SYSDBA and run the catbundle.sql script as follows:

      cd $ORACLE_HOME/rdbms/admin
      sqlplus /nolog
      SQL> CONNECT / AS SYSDBA
      SQL> STARTUP
      SQL> @catbundle.sql cpu apply
      SQL> QUIT

      For information about the catbundle.sql script, see My Oracle Support Note 605795.1 Introduction to Oracle Database catbundle.sql.
   3. Check the following log files in $ORACLE_HOME/cfgtoollogs/catbundle for any errors:

      catbundle_CPU_<database SID>_APPLY_<TIMESTAMP>.log
      catbundle_CPU_<database sid>_GENERATE_<TIMESTAMP>.log

      where TIMESTAMP is of the form YYYYMMMDD_HH_MM_SS. If there are errors, refer to Section 4, "Known Issues".

3.3.2.2 Recompiling Views in the Database

You may skip this section if you have recompiled views for this database during the installation of a previous CPU or if the database was created with release 11.1.0.6 or later. For new databases, see Section 3.3.5, "Post Installation Instructions for New and Upgraded Databases".

The time required to recompile the views and related objects depends on the total number of objects and on your system configuration. In one internal Oracle test with approximately 2000 views and 4000 objects, the total execution time for view_recompile_jan2008cpu.sql and utlrp.sql was about 30 minutes.

Note:
Depending on these considerations and your downtime schedule, you can choose to schedule the recompilation of views independent of the rest of the CPU installation. If you do this, your system will continue to work; however, the CPU installation will not be complete until the view recompilation is completed.

If you want to check whether view recompilation has already been performed for the database, execute the following statement.

SELECT * FROM registry$history where ID = '6452863';

If the view recompilation has been performed, this statement returns one or more rows. If the view recompilation has not been performed, this statement returns no rows.

To recompile the views in the database, follow these steps:

   1. Run the pre-check script (so named because it was initially released in CPUJan2008), which reports the maximum number of views and objects that may be recompiled:

      cd $ORACLE_HOME/cpu/view_recompile
      sqlplus /nolog
      SQL> CONNECT / AS SYSDBA
      SQL> @recompile_precheck_jan2008cpu.sql
      SQL> QUIT

      The purpose of this step is to help you determine whether view recompilation should be done at the same time as the CPU install, or scheduled later.
   2. If the database is not in a RAC environment, perform this step. (If the database is in a RAC environment, go to the next step.)

      Run the view recompilation script. Note that this script is run with the database in upgrade mode, which restricts connections as SYSDBA.

      cd $ORACLE_HOME/cpu/view_recompile
      sqlplus /nolog
      SQL> CONNECT / AS SYSDBA
      SQL> SHUTDOWN IMMEDIATE
      SQL> STARTUP UPGRADE
      SQL> @view_recompile_jan2008cpu.sql
      SQL> SHUTDOWN;
      SQL> STARTUP;
      SQL> QUIT

   3. If the database is in a RAC environment, run the view recompilation script as follows. Note that this script is run with the database in upgrade mode, which restricts connections as SYSDBA.

      cd $ORACLE_HOME/cpu/view_recompile
      sqlplus /nolog
      SQL> CONNECT / AS SYSDBA
      SQL> STARTUP NOMOUNT
      SQL> ALTER SYSTEM SET CLUSTER_DATABASE=FALSE SCOPE=spfile;
      SQL> SHUTDOWN
      SQL> STARTUP UPGRADE
      SQL> @view_recompile_jan2008cpu.sql
      SQL> SHUTDOWN;
      SQL> STARTUP NOMOUNT;

      Set the CLUSTER_DATABASE initialization parameter to TRUE:

      SQL> ALTER SYSTEM SET CLUSTER_DATABASE=TRUE SCOPE=spfile;

      Restart the database:

      SQL> SHUTDOWN;
      SQL> STARTUP;
      SQL> QUIT

   4. Check the log file for any errors. The log file is in the current directory and is named: vcomp_<sid>_<timestamp>.log
   5. If any invalid objects were reported, run the utlrp.sql script as follows:

      cd $ORACLE_HOME/rdbms/admin
      sqlplus /nolog
      SQL> CONNECT / AS SYSDBA
      SQL> @utlrp.sql

      Then, manually recompile any invalid objects. For example:

      SQL> alter package schemaname.packagename compile;

Patch 7612639

First 'cd' to directory 7612639. Next 'opatch napply -skip_subset -skip_duplicate'.

muoncal on umors.grid.umich.edu: opatch napply -skip_subset -skip_duplicate
Invoking OPatch 10.2.0.4.5

Oracle Interim Patch Installer version 10.2.0.4.5
Copyright (c) 2008, Oracle Corporation.  All rights reserved.

UTIL session

Oracle Home       : /app/oracle/oracle/product/10.2.0/asm
Central Inventory : /app/oracle/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 10.2.0.4.5
OUI version       : 10.2.0.4.0
OUI location      : /app/oracle/oracle/product/10.2.0/asm/oui
Log file location : /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch2009-02-03_16-31-45PM.log

...

Verifying the update...
Inventory check OK: Patch ID 7691766 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 7691766 are present in Oracle Home.
Running make for target ioracle
Running make for target client_sharedlib
Running make for target client_sharedlib

The local system has been patched and can be restarted.

UtilSession: N-Apply done.

OPatch succeeded.

Patch 7033630

First 'cd' to directory 7033630. Next 'export OBJECT_MODE=32_64'. Then 'opatch apply'.

muoncal on umors.grid.umich.edu: opatch apply
Invoking OPatch 10.2.0.4.5

Oracle Interim Patch Installer version 10.2.0.4.5
Copyright (c) 2008, Oracle Corporation.  All rights reserved.


Oracle Home       : /app/oracle/oracle/product/10.2.0/asm
Central Inventory : /app/oracle/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 10.2.0.4.5
OUI version       : 10.2.0.4.0
OUI location      : /app/oracle/oracle/product/10.2.0/asm/oui
Log file location : /app/oracle/oracle/product/10.2.0/asm/cfgtoollogs/opatch/opatch2009-02-03_16-37-17PM.log

...


Verifying the update...
Inventory check OK: Patch ID 7033630 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 7033630 are present in Oracle Home.

OPatch succeeded.

POSTINSTALL (Will have to be done after we startup the DB):
#  Patch Special Instructions:
#  ---------------------------
#  Post-install steps:
#  1) Stop streams or logical standby if they are running.
#  2) Connect to the database as SYS and load prvtlmc.plb.
#  3) Restart streams or logical standby as needed.
#  NOTE: On a RAC, users should ensure that neither streams nor
#  SQL Apply is running on any instance at the time prvtlmc.plb
#  is loaded.  Further note that prvtlmc.plb need only be loaded
#  into the database once (from any instance).
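
When the database is back up, this step would look roughly as follows (a sketch; it assumes the patch delivered prvtlmc.plb into $ORACLE_HOME/rdbms/admin, so check the patch README for the actual location). Stop the Streams capture/apply first, then as user 'oracle':

sqlplus / as sysdba
SQL> @?/rdbms/admin/prvtlmc.plb
SQL> QUIT

Then restart the Streams capture/apply processes.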

Blocking Problem for 10.2.0.4 Patchset Application

As of January 30th we have also run out of space on umors.grid.umich.edu. The specific message on https://umors.grid.umich.edu:1158/em is "Details Archiver is unable to archive a redo log because the output device is full or unavailable. The instance is open." We received an email from the CERN Oracle Streams monitoring as well:

Streams Monitor Error Report
Report date: 2009-01-30 10:09:40

Affected Site: UMICH
Affected Database: MUONCAL.GRID.UMICH.EDU
Process Name: STREAMS_CAPTURE_MICH
Error Time: 29-01-2009 21:07:57
Error Message: Capture latency higher than default limit of 90 mins. Please check.
Current process status: PAUSED FOR FLOW CONTROL

See also: https://oms3d.cern.ch:1159/streams/streams

This had previously happened in October 2008, and at that time we were able to free up a significant amount of space by removing .trc files older than 120 days via find . -mtime +120 -exec rm {} \; in the /app/oracle/oracle/product/10.2.0/asm/admin/muoncal/bdump directory. This time the total space used there was 3.2GB, but this command only freed up 200MB.
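
A slightly safer form of that cleanup (a sketch; it restricts the match to .trc files and reports what would be freed before deleting anything):

cd /app/oracle/oracle/product/10.2.0/asm/admin/muoncal/bdump
find . -name '*.trc' -mtime +120 | xargs -r du -ch | tail -1   # total that would be freed
find . -name '*.trc' -mtime +120 -exec rm {} \;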

You can most clearly see the problem on umors.grid.umich.edu as follows:

[umors:bdump]# su - oracle
.profile executed
muoncal on umors.grid.umich.edu: sqlplus "SYS/xxxxxxxxx as SYSDBA"

SQL*Plus: Release 10.2.0.4.0 - Production on Fri Jan 30 09:44:36 2009

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> SELECT NAME,TYPE,TOTAL_MB,FREE_MB FROM V$ASM_DISKGROUP;

NAME                           TYPE     TOTAL_MB    FREE_MB
------------------------------ ------ ---------- ----------
DATAFLASH                      NORMAL     476944         40

SQL>

Only 40MB are free (actually less since the ASM area is mirrored)! This is what we need to address to get the system operational so we can apply patches. The problem is almost certainly that we have too many archive log sequences stored and the old ones need to be cleaned up.

We need to run asmcmd to access the ASM area. To do this, 'su - oracle' and be sure to set the correct ORACLE_SID:

export ORACLE_SID=+ASM

asmcmd -p

This works but I wanted to check some more things before using asmcmd.
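
A couple of read-only asmcmd commands that are useful for sizing things up before deleting anything (a sketch):

export ORACLE_SID=+ASM
asmcmd -p
ASMCMD [+] > lsdg                              # Total_MB / Free_MB per diskgroup
ASMCMD [+] > du DATAFLASH/MUONCAL/ARCHIVELOG   # space consumed by the archive logs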

Running RMAN

To run 'rman' I had to do this:

muoncal on umors.grid.umich.edu: /app/oracle/oracle/product/10.2.0/asm/bin/rman TARGET SYS/xxxxxxx@muoncal

Recovery Manager: Release 10.2.0.4.0 - Production on Fri Jan 30 13:14:14 2009

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: MUONCAL (DBID=3255750732)
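
From here, the approach in the Metalink notes listed under "Reclaiming Space" below would be roughly the following (a sketch only; this is not what was ultimately run here, the asmcmd cleanup described later was used instead):

RMAN> CROSSCHECK ARCHIVELOG ALL;
RMAN> DELETE NOPROMPT EXPIRED ARCHIVELOG ALL;
RMAN> DELETE NOPROMPT ARCHIVELOG ALL COMPLETED BEFORE 'SYSDATE-7';

Deleting through RMAN keeps the controlfile's record of the archive logs consistent; removing the files directly with asmcmd leaves stale records behind until a CROSSCHECK is done.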

Fixing the ASM ARCHIVELOG Management

[umors:~]# su - oracle
.profile executed
muoncal on umors.grid.umich.edu: sqlplus "SYS/xxxxxxx as SYSDBA"

SQL*Plus: Release 10.2.0.4.0 - Production on Fri Jan 30 12:23:17 2009

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> shutdown immediate
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size                  2183624 bytes
Variable Size             966569528 bytes
Database Buffers         1174405120 bytes
Redo Buffers                4325376 bytes
Database mounted.
SQL> alter database noarchivelog;
alter database noarchivelog
*
ERROR at line 1:
ORA-38774: cannot disable media recovery - flashback database is enabled


SQL> alter database flashback off;

Database altered.

SQL> alter database noarchivelog;

Database altered.

SQL>
SQL> alter database open;

Database altered.

SQL> archive log list;
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            +DATAFLASH/muoncal/
Oldest online log sequence     4907
Current log sequence           4909

Now we should be able to access things to allow us to clean up all the ARCHIVELOG content.

Reclaiming Space

On Oracle's Metalink I found a few good articles:

  • Subject: How To Reclaim Asm Disk Space? Doc ID: 351866.1
  • Subject: How To Use RMAN To Delete Archivelog Files (Including ASM) Doc ID: 368844.1
  • Subject: How To Delete Archive Log Files Out Of +Asm? Doc ID: 300472.1
If you do export ORACLE_SID=+ASM and then go into sqlplus you can actually generate a list of files to delete:
select 'alter diskgroup DATAFLASH drop file "DATAFLASH/MUONCAL/ARCHIVELOG/'|| to_char(b.creation_date,'YYYY_MM_DD') ||'/'|| a.name||'";' from v$asm_alias a, v$asm_file b where a.group_number = b.group_number and a.file_number = b.file_number and b.type = 'ARCHIVELOG' order by a.name;

However this didn't seem to work (it created the SQL statements but did not actually apply them).

What ended up cleaning up the bulk of the space was using asmcmd and doing:

+ASM on umors.grid.umich.edu: asmcmd 
ASMCMD> pwd
+
ASMCMD> cd DATAFLASH
ASMCMD> cd MUONCAL/ARCHIVELOG
ASMCMD> rm -r 2006*
You may delete multiple files and/or directories.
Are you sure? (y/n) y
ASMCMD> rm -r 2007*
You may delete multiple files and/or directories.
Are you sure? (y/n) y
ASMCMD> rm -r 2008*
You may delete multiple files and/or directories.
Are you sure? (y/n) y

After this we had only 23% used in the DATAFLASH +ASM area.

I then re-enabled the ARCHIVELOG mode and Flashback that I had disabled above.
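
For reference, re-enabling is just the reverse of the earlier sequence (as SYS via sqlplus; a sketch):

SQL> shutdown immediate
SQL> startup mount
SQL> alter database archivelog;
SQL> alter database flashback on;
SQL> alter database open;
SQL> archive log list;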

Upgrading UMORS Kernel

Part of our task is to get the most recent kernel in place on umors.grid.umich.edu; that is currently 2.6.9-78.0.13. There are two tricky parts to upgrading the kernel:

  • The Oracle ASM rpms must be updated
  • The mvSata driver must be rebuilt for the new kernel
For both of these we first want to install the new kernel and then worry about fixing things before we reboot.

I used up2date-nox -l to see what was available. Then I downloaded the newest 'smp' kernels via up2date-nox -d kernel-smp*.

The downloaded files show up in /var/spool/up2date/. I just did an RPM install:

[umors:~]# rpm -ivh /var/spool/up2date/kernel-smp-*.rpm
Preparing...                ########################################### [100%]
   1:kernel-smp-devel       ########################################### [ 50%]
   2:kernel-smp             ########################################### [100%]
WARNING: No module mv_sata found for kernel 2.6.9-78.0.13.ELsmp, continuing anyway
WARNING: No module mv_sata found for kernel 2.6.9-78.0.13.ELsmp, continuing anyway

As you can see, no mv_sata module was found. We need to build it for this new kernel and then re-create the initrd image.

To do this, first cd to /root/mvSata-Linux-3.6.1/LinuxIAL, save the existing build.sh and edit it to point to the new kernel. You must also fix the softlink /usr/src/linux-2.6 to point to the new kernel area:

ln -s /usr/src/kernels/2.6.9-78.0.13.EL-smp-x86_64/ /usr/src/linux-2.6

After you run build.sh you will have 3 variants of the needed mv_sata.ko module in:

[umors:mvSata-Linux-3.6.1]# ls build/Linux/*
build/Linux/DebugError:
mv_sata.ko

build/Linux/DebugFull:
mv_sata.ko

build/Linux/Free:
mv_sata.ko

I usually 'install' the DebugError version: cp build/Linux/DebugError/mv_sata.ko /lib/modules/2.6.9-78.0.13.ELsmp/kernel/drivers/scsi/

Then recreate 'initrd':

[umors:mvSata-Linux-3.6.1]# depmod -a 2.6.9-78.0.13.ELsmp
[umors:mvSata-Linux-3.6.1]# mkinitrd -v -f /boot/initrd-2.6.9-78.0.13.ELsmp.img  2.6.9-78.0.13.ELsmp
Creating initramfs
Looking for deps of module scsi_mod
Looking for deps of module sd_mod        scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module unknown
Looking for deps of module mv_sata       scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module ide-disk
Looking for deps of module raid1
Looking for deps of module ext3  jbd
Looking for deps of module jbd
Using modules:  ./kernel/drivers/scsi/scsi_mod.ko ./kernel/drivers/scsi/sd_mod.ko ./kernel/drivers/scsi/mv_sata.ko ./kernel/drivers/md/raid1.ko ./kernel/fs/jbd/jbd.ko ./kernel/fs/ext3/ext3.ko
/sbin/nash -> /tmp/initrd.P18997/bin/nash
/sbin/insmod.static -> /tmp/initrd.P18997/bin/insmod
/sbin/udev.static -> /tmp/initrd.P18997/sbin/udev
/etc/udev/udev.conf -> /tmp/initrd.P18997/etc/udev/udev.conf
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/drivers/scsi/scsi_mod.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/scsi_mod.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/drivers/scsi/sd_mod.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/sd_mod.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/drivers/scsi/mv_sata.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/mv_sata.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/drivers/md/raid1.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/raid1.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/fs/jbd/jbd.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/jbd.ko(elf64-x86-64)
copy from /lib/modules/2.6.9-78.0.13.ELsmp/./kernel/fs/ext3/ext3.ko(elf64-x86-64) to /tmp/initrd.P18997/lib/ext3.ko(elf64-x86-64)
Loading module scsi_mod
Loading module sd_mod
Loading module mv_sata
Loading module raid1
Loading module jbd
Loading module ext3

Now you can also install the newest ASM rpm:

[umors:~]# rpm -ivh oracleasm-2.6.9-78.0.13.ELsmp-2.0.5-1.el4.x86_64.rpm
Preparing...                ########################################### [100%]
   1:oracleasm-2.6.9-78.0.13########################################### [100%]

Now we should be good to reboot into this new kernel!

Adding New Disks to UMORS

The last time we ran out of disk space we ordered five 1TB disks to replace the four 250GB disks in UMORS (plus one as a shelf spare). We plan to migrate to these new disks as part of our January 30th maintenance, but we need to determine how to do this. The first two disks (/dev/sda and /dev/sdb) are managed by software RAID on Linux. The other two disks (/dev/sdc and /dev/sdd) are managed by Oracle via ASM.

Dropping a disk, adding larger disk

(on umors)
su - oracle
export ORACLE_SID="+ASM"
sqlplus /NOLOG
CONNECT / AS SYSDBA

ASM information queries:
SELECT name, type, total_mb, free_mb, required_mirror_free_mb, usable_file_mb FROM V$ASM_DISKGROUP;

Drop the disk; you have to force it or it stays "HUNG" and doesn't drop:
SQL> ALTER DISKGROUP DATAFLASH DROP DISK DISK2 FORCE;             

(This is maybe not needed; the disk never goes offline. Just pulling it worked OK.)
Adding the disk - step 1 is to label it:
[umors:~]# /etc/init.d/oracleasm createdisk DISK3 /dev/sdd1

Then add it to the diskgroup:
ALTER DISKGROUP DATAFLASH ADD DISK 'ORCL:DISK3' name DISK3

The above doesn't work right. Here is why. Oracle maintains "failure groups" of storage. For example, all disks on the same system/controller should really be in a single failure group, and to have redundancy you need at least 2 failure groups. Our original configuration had two failure groups, DISK1 and DISK2, corresponding to disks DISK1 ('ORCL:DISK1') and DISK2 ('ORCL:DISK2'). What the above commands did is as follows:

  • The 'DROP DISK' forced a removal of the only member of failure group DISK2 thereby hanging that failure group
  • The 'ADD DISK' command created a new failure group named DISK3
To fix this we need to undo the 'ADD DISK' and then specifically RE-ADD this new disk to the original HUNG failure group DISK2. Here are the needed commands run via SQLPLUS as user 'oracle' with ORACLE_SID=+ASM:

SQL> alter diskgroup DATAFLASH drop disk DISK3;

Diskgroup altered.

SQL> ALTER diskgroup DATAFLASH ADD FAILGROUP DISK2 disk 'ORCL:DISK2' name DISK3;

Diskgroup altered.

This makes sure the disk is put into the correct failure group. Note it does force a rebalance (recopy), but once that finishes the MISSING/HUNG disk should disappear from ASM. See the detailed discussion at http://www.oracloid.com/2006/05/vldb-with-asm/
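
The rebalance can be watched from the +ASM instance (a sketch; V$ASM_OPERATION returns no rows once the rebalance has finished):

SELECT group_number, operation, state, power, sofar, est_work, est_minutes FROM V$ASM_OPERATION;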

Restoration of STREAMS Replication to CERN

On February 10th, 2009 Gancho Dimitrov was able to re-create the STREAMS replication from muoncal.grid.umich.edu to intr.cern.ch by doing the following. Note that this was necessary because "ARCHIVELOG" mode had been disabled (see above), which resulted in the loss of sequences 4906-4911.

-- 10.Feb.2009 

1) -- issue a data dictionary dump, get the SCN and use it as FIRST and START SCN for the CAPTURE 

set serveroutput on 

DECLARE
scn NUMBER;
BEGIN
DBMS_CAPTURE_ADM.BUILD ( first_scn => scn );
DBMS_OUTPUT.PUT_LINE ('First SCN: ' || scn);
END;
/


First SCN: 6047632153953


2) -- export the data with the FLASHBACK_SCN = the SCN from the data dictionary build 

-- drop the tables on the INTR

BEGIN
        FOR j in (SELECT  owner, table_name tt FROM dba_tables WHERE owner = 'ATLAS_MUONCALIB_MICH') LOOP
                execute immediate 'DROP TABLE ' || j.owner ||'."'||j.tt || '" cascade constraints PURGE ';
        END LOOP;
END;
/


impdp strmadmin@intr FLASHBACK_SCN=6047632153953 SCHEMAS=ATLAS_MUONCALIB_MICH NOLOGFILE=Y TRANSFORM=SEGMENT_ATTRIBUTES:N NETWORK_LINK=MUONCAL.GRID.UMICH.EDU


3) Streams setup 


-- SOURCE DATABASE MICHIGAN--
-- direct Streams setup

-- check init.ora parameters and create streams administrator user!!!
-- tnsnames.ora configuration
-- use create_strmadmin_10g.sql on version 10g
-- check database links
-- check supplemental logging

-- CUSTOMIZE!!!
define var_queue_name=STRM_QUEUE_CA_MICH;
define var_capture_name=STREAMS_CAPTURE_MICH;
define var_source_name=MUONCAL.GRID.UMICH.EDU;
define var_schema_name=ATLAS_MUONCALIB_MICH;


-- connect as Streams administrator: strmadmin
CONNECT strmadmin

-- create the streams queue
BEGIN
  DBMS_STREAMS_ADM.SET_UP_QUEUE(
   queue_table => 'T_' || '&var_queue_name',
   queue_name => '&var_queue_name',
   queue_user => 'STRMADMIN');
END;
/


-- I have removed that param from the CREATE_CAPTURE below
    -- use_database_link  => true,


-- create capture proceses
-- with First SCN: 6047632153953
BEGIN
  DBMS_CAPTURE_ADM.CREATE_CAPTURE(
    queue_name         => 'STRMADMIN.' || '&var_queue_name',
    capture_name       => '&var_capture_name',
    source_database    => '&var_source_name',
    start_scn          => 6047632153953,
    first_scn          => 6047632153953,
    logfile_assignment => 'implicit'
   );
END;
/


-- add capture rules (once per schema!!!)
BEGIN
 DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
    schema_name => '&var_schema_name',
    streams_type => 'CAPTURE',
    streams_name => '&var_capture_name',
    queue_name => 'STRMADMIN.' || '&var_queue_name',
    include_dml => true,
    include_ddl => true,
    include_tagged_lcr => false,
    source_database => '&var_source_name',
    inclusion_rule => true);
 END;
 /

---- Oracle Recommendations!
-- reduce the checkpoint retention time (the default is 60 days )
BEGIN
   DBMS_CAPTURE_ADM.ALTER_CAPTURE
      ('&var_capture_name',
      checkpoint_retention_time => 7); 
END;
/
---- minimize the number of logminer checkpoints when the database is very active
BEGIN
  DBMS_CAPTURE_ADM.SET_PARAMETER(
    capture_name => '&var_capture_name',
    parameter    => '_checkpoint_frequency',
    value        => 500);
END;
/


============================= STREAMS_APPLY_MUONCALIB_MICH ====================================

-- DESTINATION DATABASE --
-- direct Streams setup

DEFINE var_queue_name = Q_AP_MUONCALIB_MICH;
DEFINE var_apply_name = STREAMS_APPLY_MUONCALIB_MICH;
DEFINE var_source_name = MUONCAL.GRID.UMICH.EDU;

-- connect as Streams administrator: strmadmin

-- create the streams queue
BEGIN
  DBMS_STREAMS_ADM.SET_UP_QUEUE(
   queue_table => 'T' || '&var_queue_name',
   queue_name => '&var_queue_name',
   queue_user => 'STRMADMIN');
END;
/

-- queue affinity (only RAC database)
 begin
  dbms_aqadm.alter_queue_table(
     queue_table => 'T' || '&var_queue_name',
     primary_instance => 2,
     secondary_instance => 1);
 end;
 /

-- create the apply process
BEGIN
  DBMS_APPLY_ADM.CREATE_APPLY(
     queue_name => 'STRMADMIN.' || '&var_queue_name',  
     apply_name => '&var_apply_name',
     apply_user => 'STRMADMIN',
     source_database => '&var_source_name',
     apply_captured => TRUE);
END;
/
     
-- specify an apply user 
BEGIN
  DBMS_APPLY_ADM.ALTER_APPLY(
    apply_name => '&var_apply_name',
    apply_user => 'STRMADMIN');
END;
/


/*
-- not needed for the 
-- add APPLY schema rules 
-- avoid bug when dropping tables after adding PVSS replication between same source and destination

define var_schema_name= ATLAS_MUONCALIB_MICH

BEGIN
  DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
    schema_name        => '&var_schema_name',
    streams_type       => 'APPLY',
    streams_name       => '&var_apply_name',
    queue_name         => 'STRMADMIN.' || '&var_queue_name',
    include_dml        => TRUE,
    include_ddl        => TRUE,
    include_tagged_lcr => TRUE, -- special case for the APPLY on the ATLR 
    source_database    => '&var_source_name',
    inclusion_rule     => true);
END;
/

*/

---- Oracle Recommendations!
---- for a workload mix of I/U/D activity, the following apply parameters should be set:

BEGIN
  DBMS_APPLY_ADM.SET_PARAMETER(
    apply_name   => '&var_apply_name',
    parameter    => 'commit_serialization',
    value        => 'full');
  DBMS_APPLY_ADM.SET_PARAMETER(
    apply_name   => '&var_apply_name',
    parameter    => '_hash_table_size',
    value        => 10000000);
  DBMS_APPLY_ADM.SET_PARAMETER(
    apply_name   => '&var_apply_name',
    parameter    => '_DYNAMIC_STMTS',
    value        => 'y');
  DBMS_APPLY_ADM.SET_PARAMETER(
    apply_name   => '&var_apply_name',
    parameter    => 'parallelism',
    value        => 1);
  DBMS_APPLY_ADM.SET_PARAMETER(
    apply_name   => '&var_apply_name',
    parameter    => '_txn_buffer_size',
    value        => 14);
  DBMS_APPLY_ADM.SET_PARAMETER(
     apply_name => '&var_apply_name',
     parameter => 'DISABLE_ON_ERROR',
     value => 'Y');
END;
/

-- start the apply process       
BEGIN
  DBMS_APPLY_ADM.START_APPLY(
    apply_name => '&var_apply_name');
END;
/

===================== Propagation ===================================

define var_source_queue=STRM_QUEUE_CA_MICH;
define var_destination_queue= Q_AP_MUONCALIB_MICH;
define var_source_name=MUONCAL.GRID.UMICH.EDU;
define var_propagation_name=STREAMS_MUONCALIB_PROPAGATE;
define var_destination_name=INTR.CERN.CH;
define var_schema_name=ATLAS_MUONCALIB_MICH;

-- connect as Streams administrator: strmadmin
-- connect strmadmin

-- configure propagation from source to destination database 

BEGIN
 DBMS_PROPAGATION_ADM.CREATE_PROPAGATION(
     propagation_name        => '&var_propagation_name',
     source_queue            => 'STRMADMIN.' || '&var_source_queue',
     destination_queue       => 'STRMADMIN.' || '&var_destination_queue',
     destination_dblink      => '&var_destination_name',
     queue_to_queue => TRUE );
END;
/


-- change propagation latency - Oracle recommendation
BEGIN
 DBMS_AQADM.ALTER_PROPAGATION_SCHEDULE(
    queue_name => 'STRMADMIN.' || '&var_source_queue',
    destination => '"STRMADMIN"."&var_destination_queue"@&var_destination_name',
    latency => 1);
END;
/

-- instantiation

DECLARE
  iscn  NUMBER;    -- Variable to hold instantiation SCN value
BEGIN
 -- iscn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER();
   iscn:= 6047632153953;
DBMS_APPLY_ADM.SET_SCHEMA_INSTANTIATION_SCN@&var_destination_name(
    source_schema_name    => '&var_schema_name',
    source_database_name  => '&var_source_name',
    instantiation_scn     => iscn,
    recursive => true);
END;
/



============ source DB - INTR, start the CAPTURE =================
define var_capture_name=STREAMS_CAPTURE_PVSSCONF;

--prepare instantiation for the DB objects  
BEGIN
   DBMS_CAPTURE_ADM.prepare_schema_instantiation('&var_schema_name');
END;
/

BEGIN
   DBMS_CAPTURE_ADM.start_capture('&var_capture_name');
END;
/


-- ShawnMcKee - 03 Sep 2008