Voting Disk (VD) Administration


When the voting disk is stored in an ASM diskgroup, the number of voting files is determined by the redundancy level of that diskgroup.

Ø  A diskgroup with external redundancy stores only one voting disk.

Ø  A normal redundancy diskgroup holds three voting disks. A normal redundancy diskgroup ordinarily needs at least two failure groups, but when it stores voting disks it must contain at least three.

Ø  A high redundancy diskgroup holds five voting disks. A high redundancy diskgroup ordinarily needs at least three failure groups, but when it stores voting disks it must contain at least five.
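The rules above can be written down as a small lookup table. This is only an illustrative sketch: `VotingLayout`, `VOTING_LAYOUT` and `voting_layout` are hypothetical names, not part of any Oracle tool, and the keys are the redundancy TYPE values as `v$asm_diskgroup` reports them.

```python
from typing import NamedTuple, Optional


class VotingLayout(NamedTuple):
    voting_files: int                   # voting files created in the diskgroup
    min_failure_groups: Optional[int]   # None: the text above gives no figure


# Hypothetical lookup table built from the three rules listed above.
VOTING_LAYOUT = {
    "EXTERN": VotingLayout(voting_files=1, min_failure_groups=None),
    "NORMAL": VotingLayout(voting_files=3, min_failure_groups=3),
    "HIGH": VotingLayout(voting_files=5, min_failure_groups=5),
}


def voting_layout(redundancy: str) -> VotingLayout:
    """Return the voting-file layout for a redundancy TYPE value."""
    key = redundancy.strip().upper()
    if key not in VOTING_LAYOUT:
        raise ValueError(f"unsupported redundancy type: {redundancy!r}")
    return VOTING_LAYOUT[key]


if __name__ == "__main__":
    for name in ("EXTERN", "NORMAL", "HIGH"):
        print(name, voting_layout(name))
```

A lookup like this is handy as a sanity check before moving the voting disk: it tells you how many voting files to expect in the target diskgroup.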

From 11gR2 onward we are no longer required to back up the VD manually. It is backed up automatically in the OCR.

NOTE: The current setup has the following diskgroups and redundancy levels.

SQL> select name,type from v$asm_diskgroup;

NAME                           TYPE
------------------------------ ------
CRS                            NORMAL
FRA                            EXTERN
DATA                           EXTERN


How to check VD details

crsctl query css votedisk

[root@rac1 bin]# ./crsctl query css votedisk

##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   fc675115419f4febbfa67c9c38e9ac47 (/dev/asm-disk4) [DATA]
Located 1 voting disk(s).
[root@rac1 bin]#
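If you want to check this output from a script, the rows are regular enough to parse. The sketch below is a minimal example under that assumption; `VotingDisk`, `ROW` and `parse_votedisks` are hypothetical names, and the row format is inferred only from the transcripts in this post.

```python
import re
from typing import List, NamedTuple


class VotingDisk(NamedTuple):
    index: int
    state: str
    file_id: str
    path: str
    diskgroup: str


# A data row looks like:
#  1. ONLINE   fc675115419f4febbfa67c9c38e9ac47 (/dev/asm-disk4) [DATA]
# Path and diskgroup may be empty, as in "() []".
ROW = re.compile(
    r"^\s*(\d+)\.\s+(\S+)\s+([0-9a-fA-F]+)\s+\(([^)]*)\)\s+\[([^\]]*)\]\s*$"
)


def parse_votedisks(output: str) -> List[VotingDisk]:
    """Extract one VotingDisk per data row; header and footer lines are skipped."""
    disks = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            idx, state, file_id, path, diskgroup = match.groups()
            disks.append(VotingDisk(int(idx), state, file_id, path, diskgroup))
    return disks


SAMPLE = """\
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   fc675115419f4febbfa67c9c38e9ac47 (/dev/asm-disk4) [DATA]
Located 1 voting disk(s).
"""

if __name__ == "__main__":
    for disk in parse_votedisks(SAMPLE):
        print(disk.index, disk.state, disk.path, disk.diskgroup)
```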

How to move/replace VD from one diskgroup to another diskgroup

Currently the VD is on the DATA diskgroup, which uses external redundancy, so I have only one voting disk.

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   fc675115419f4febbfa67c9c38e9ac47 (/dev/asm-disk4) [DATA]
Located 1 voting disk(s).

Using the replace option I can move the VD from one diskgroup to another. In this example I am moving it from DATA to CRS. The CRS diskgroup uses normal redundancy, so three voting disks are created.

[root@rac1 bin]# ./crsctl replace votedisk +CRS
Successful addition of voting disk 8191661ee0a64fd7bf6cb15f26e12b45.
Successful addition of voting disk e8ce378aff654f79bf328360862db46a.
Successful addition of voting disk 7a4b690932524fb7bfef90a03ff4620a.
Successful deletion of voting disk fc675115419f4febbfa67c9c38e9ac47.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   8191661ee0a64fd7bf6cb15f26e12b45 (/dev/asm-disk8) [CRS]
 2. ONLINE   e8ce378aff654f79bf328360862db46a (/dev/asm-disk9) [CRS]
 3. ONLINE   7a4b690932524fb7bfef90a03ff4620a (/dev/asm-disk10) [CRS]
Located 3 voting disk(s).

Restore VD when at least one copy of OCR is available

Ø  Check the status of the voting disks.

Ø  Note that more than half of the voting disks must be ONLINE for CRS to keep running.

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   8191661ee0a64fd7bf6cb15f26e12b45 (/dev/asm-disk8) [CRS]
 2. ONLINE   e8ce378aff654f79bf328360862db46a (/dev/asm-disk9) [CRS]
 3. ONLINE   7a4b690932524fb7bfef90a03ff4620a (/dev/asm-disk10) [CRS]
Located 3 voting disk(s).


Ø  To simulate corruption of a voting disk, I am zeroing out one of the voting disks with the dd command.


[root@rac1 ~]# dd if=/dev/zero of=/dev/asm-disk8 bs=4096 count=1000000
dd: writing `/dev/asm-disk8': No space left on device
261049+0 records in
261048+0 records out
1069254144 bytes (1.1 GB) copied, 7.54359 s, 142 MB/s
[root@rac1 ~]#

Ø  The STATE of the disk changes to PENDOFFL (pending offline) and soon afterwards to OFFLINE.

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. PENDOFFL 8191661ee0a64fd7bf6cb15f26e12b45 (/dev/asm-disk8) [CRS]
2. ONLINE   e8ce378aff654f79bf328360862db46a (/dev/asm-disk9) [CRS]
3. ONLINE   7a4b690932524fb7bfef90a03ff4620a (/dev/asm-disk10) [CRS]

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. OFFLINE  8191661ee0a64fd7bf6cb15f26e12b45 (/dev/asm-disk8) [CRS]
 2. ONLINE   e8ce378aff654f79bf328360862db46a (/dev/asm-disk9) [CRS]
 3. ONLINE   7a4b690932524fb7bfef90a03ff4620a (/dev/asm-disk10) [CRS]

Ø  Out of three voting disks, one is OFFLINE. The other two are ONLINE, which is more than half, so the cluster keeps running.

[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@rac1 bin]#
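The "more than half" rule is simple arithmetic, sketched below. The function names `has_quorum` and `tolerated_failures` are hypothetical and only illustrate the rule as stated in this post.

```python
def has_quorum(online: int, configured: int) -> bool:
    """True when strictly more than half of the configured voting disks are online."""
    if configured <= 0 or not 0 <= online <= configured:
        raise ValueError("expected configured > 0 and 0 <= online <= configured")
    return 2 * online > configured


def tolerated_failures(configured: int) -> int:
    """Number of voting disks that can be lost while a strict majority remains."""
    if configured <= 0:
        raise ValueError("expected configured > 0")
    return (configured - 1) // 2


if __name__ == "__main__":
    for total in (1, 3, 5):
        print(f"{total} voting disk(s): can lose {tolerated_failures(total)}")
    print("2 of 3 online ->", has_quorum(2, 3))
    print("1 of 3 online ->", has_quorum(1, 3))
```

With three voting disks the cluster survives the loss of one, which is the state shown above; losing a second one leaves 1 of 3 online and the majority is gone.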


Ø  I am now zeroing out one more voting disk.

[root@rac1 ~]# dd if=/dev/zero of=/dev/asm-disk9 bs=4096 count=1000000
dd: writing `/dev/asm-disk9': No space left on device
261049+0 records in
261048+0 records out
1069254144 bytes (1.1 GB) copied, 14.0521 s, 76.1 MB/s
[root@rac1 ~]#

Ø  Keep checking the status with the command below.

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. OFFLINE  8191661ee0a64fd7bf6cb15f26e12b45 (/dev/asm-disk8) [CRS]
 2. PENDOFFL e8ce378aff654f79bf328360862db46a (/dev/asm-disk9) [CRS]
 3. ONLINE   7a4b690932524fb7bfef90a03ff4620a (/dev/asm-disk10) [CRS]
Located 3 voting disk(s).
[root@rac1 bin]#

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. OFFLINE  8191661ee0a64fd7bf6cb15f26e12b45 (/dev/asm-disk8) [CRS]
 2. PENDOFFL e8ce378aff654f79bf328360862db46a (/dev/asm-disk9) [CRS]
 3. ONLINE   7a4b690932524fb7bfef90a03ff4620a (/dev/asm-disk10) [CRS]
Located 3 voting disk(s).
[root@rac1 bin]#

[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Ø  CRS now goes down, and this brings the cluster services down on all the nodes.

Node 1 :

[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4533: Event Manager is online

[root@rac1 bin]# ./crsctl query css votedisk
Unable to communicate with the Cluster Synchronization Services daemon.
[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
[root@rac1 bin]#

Node 2 :

[root@rac2 ~]# ps -ef|grep smon
root     29140     1  3 00:52 ?        00:00:01 /u01/app/12.2.0.1/grid/bin/osysmond.bin
root     30062 30035  0 00:52 pts/0    00:00:00 grep smon

[root@rac2 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
[root@rac2 bin]#
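The message codes in the `crsctl check crs` output make it easy to summarise which layers are up. The sketch below handles only the seven codes that appear in this post; the component labels (`OHAS`, `CRS`, `CSS`, `EVM`) and the function name `stack_status` are my own shorthand, not Oracle terminology from the tool.

```python
import re
from typing import Dict

# Codes taken from the transcripts in this post; other codes are ignored.
ONLINE_CODES = {"CRS-4638": "OHAS", "CRS-4537": "CRS", "CRS-4529": "CSS", "CRS-4533": "EVM"}
DOWN_CODES = {"CRS-4535": "CRS", "CRS-4530": "CSS", "CRS-4534": "EVM"}


def stack_status(output: str) -> Dict[str, bool]:
    """Map each component seen in the output to True (online) or False (unreachable)."""
    status = {}
    for code in re.findall(r"^(CRS-\d+):", output, flags=re.MULTILINE):
        if code in ONLINE_CODES:
            status[ONLINE_CODES[code]] = True
        elif code in DOWN_CODES:
            status[DOWN_CODES[code]] = False
    return status


SAMPLE = """\
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
"""

if __name__ == "__main__":
    print(stack_status(SAMPLE))
```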

Ø  We cannot stop CRS gracefully at this stage.

[root@rac2 bin]# ./crsctl stop crs
CRS-2796: The command may not proceed when Cluster Ready Services is not running
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has failed
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.

Ø  Stop CRS with the force option (-f) on all the nodes.

Node 2 :

[root@rac2 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac2'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac2'
CRS-2673: Attempting to stop 'ora.crf' on 'rac2'
CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac2'
CRS-2677: Stop of 'ora.evmd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac2 bin]#

Node 1 :

[root@rac1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac1 bin]#

Ø  Start CRS in exclusive mode on one node only.


[root@rac1 bin]# ./crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.evmd' on 'rac1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.evmd' on 'rac1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded

[root@rac1 bin]# ps -ef|grep smon
oracle   24835     1  0 00:56 ?        00:00:00 asm_smon_+ASM1
root     25233  5055  0 00:57 pts/1    00:00:00 grep smon

[root@rac1 bin]# ps -ef|grep crsd
root     25266  5055  0 00:57 pts/1    00:00:00 grep crsd

 [root@rac1 bin]# ./crsctl stat res -init -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac1                     STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       rac1                     STABLE
ora.crf
      1        OFFLINE OFFLINE                               STABLE
ora.crsd
      1        OFFLINE OFFLINE                               STABLE
ora.cssd
      1        ONLINE  ONLINE       rac1                     STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       rac1                     STABLE
ora.ctssd
      1        ONLINE  ONLINE       rac1                     OBSERVER,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       rac1                     STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE rac1                     STABLE
ora.gipcd
      1        ONLINE  ONLINE       rac1                     STABLE
ora.gpnpd
      1        ONLINE  ONLINE       rac1                     STABLE
ora.mdnsd
      1        ONLINE  ONLINE       rac1                     STABLE
ora.storage
      1        OFFLINE OFFLINE                               STABLE

--------------------------------------------------------------------------------

Ø  I am relocating the VD to the DATA diskgroup, which uses external redundancy, so only one VD will be created in DATA.

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. OFFLINE  8191661ee0a64fd7bf6cb15f26e12b45 () []
 2. OFFLINE  e8ce378aff654f79bf328360862db46a () []
 3. ONLINE   7a4b690932524fb7bfef90a03ff4620a (/dev/asm-disk10) [CRS]
Located 3 voting disk(s).
[root@rac1 bin]#

[root@rac1 bin]# ./crsctl replace votedisk +DATA
Successful addition of voting disk e4c2be6b1c114f7ebf4e1689ab0ac3dd.
Successful deletion of voting disk 8191661ee0a64fd7bf6cb15f26e12b45.
Successful deletion of voting disk e8ce378aff654f79bf328360862db46a.
Successful deletion of voting disk 7a4b690932524fb7bfef90a03ff4620a.
Successfully replaced voting disk group with +DATA.
CLSU-00100: operating system function: kgfnStmtExecute failed with error data: 0
CLSU-00101: operating system error message: Error 0
CLSU-00103: error location: kgfdvfDel01
CLSU-00104: additional error information: ORA-15001: diskgroup "CRS" does not exist or is not mounted
CLSU-00104: additional error information: ORA-06512: at line 4
CLSU-00104: additional error information: ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 548
CLSU-00104: additional error information: ORA-06512: at line 2
CRS-4266: Voting file(s) successfully replaced
[root@rac1 bin]#


Ø  The VD has now been moved to the DATA diskgroup. DATA uses external redundancy, so only one VD is required, and I have one VD ONLINE to bring up CRS.

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   e4c2be6b1c114f7ebf4e1689ab0ac3dd (/dev/asm-disk4) [DATA]
Located 1 voting disk(s).

Ø  Stop CRS and start it in normal mode.

[root@rac1 bin]# ./crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

[root@rac1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@rac1 bin]#

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   e4c2be6b1c114f7ebf4e1689ab0ac3dd (/dev/asm-disk4) [DATA]
Located 1 voting disk(s).

[root@rac1 bin]# ./ocrcheck -details
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       2284
         Available space (kbytes) :     407284
         ID                       :  766264505
         Device/File Name         : +DATA/rac-cluster/OCRFILE/registry.255.999481825
                                   Device/File integrity check succeeded
                                   Device/File not configured
                                   Device/File not configured
                                   Device/File not configured
                                   Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
[root@rac1 bin]#

Ø  CRS should now be back in its normal state.

[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@rac1 bin]#
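The sequence used in this section can be summarised as an ordered checklist. The sketch below only builds and prints the list of commands per node; it does not run anything. The function name `votedisk_recovery_plan` and its parameters are hypothetical, the commands are the ones used above, and starting CRS on every node at the end is my assumption for a multi-node cluster.

```python
from typing import List, Tuple


def votedisk_recovery_plan(
    nodes: List[str], recovery_node: str, target_dg: str
) -> List[Tuple[str, str]]:
    """Return (node, command) pairs in the order followed in this section."""
    if recovery_node not in nodes:
        raise ValueError(f"{recovery_node!r} is not one of the cluster nodes")
    dg = target_dg if target_dg.startswith("+") else "+" + target_dg

    # 1. Force-stop the broken stack everywhere.
    plan = [(node, "crsctl stop crs -f") for node in nodes]
    # 2. Exclusive mode without CRSD, on a single node only.
    plan.append((recovery_node, "crsctl start crs -excl -nocrs"))
    # 3. Recreate the voting files in a healthy diskgroup and verify.
    plan.append((recovery_node, f"crsctl replace votedisk {dg}"))
    plan.append((recovery_node, "crsctl query css votedisk"))
    # 4. Leave exclusive mode and restart normally (all nodes: assumption).
    plan.append((recovery_node, "crsctl stop crs"))
    plan.extend((node, "crsctl start crs") for node in nodes)
    plan.append((recovery_node, "crsctl check crs"))
    return plan


if __name__ == "__main__":
    for node, command in votedisk_recovery_plan(["rac1", "rac2"], "rac1", "DATA"):
        print(f"[{node}] {command}")
```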




Restore VD when all copies of OCR and VD are lost/corrupted



Ø  Earlier we corrupted the disks in CRS, so I am now going to recreate the diskgroup and move the VD, the OCR and my ASM SPFILE to the CRS diskgroup.


SQL> drop diskgroup CRS force including contents;

Diskgroup dropped.


SQL> create diskgroup CRS normal redundancy disk '/dev/asm-disk8','/dev/asm-disk9','/dev/asm-disk10';

Diskgroup created.


Ø  Mount the diskgroup on the second node.

SQL> alter diskgroup CRS mount;

Diskgroup altered.


Ø  Move the ASM SPFILE to the CRS diskgroup.


SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DATA/rac-cluster/ASMPARAMETER
                                                FILE/registry.253.982502029

SQL> create pfile='/home/oracle/asmpfile.ora' from spfile;

File created.

SQL> create spfile='+CRS' from pfile='/home/oracle/asmpfile.ora';

File created.

SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DATA/rac-cluster/ASMPARAMETER
                                                FILE/registry.253.982502029


[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   e4c2be6b1c114f7ebf4e1689ab0ac3dd (/dev/asm-disk4) [DATA]
Located 1 voting disk(s).


[root@rac1 bin]# ./ocrcheck -details
Status of Oracle Cluster Registry is as follows :
        Version                  :          4
        Total space (kbytes)     :     409568
        Used space (kbytes)      :       2284
        Available space (kbytes) :     407284
        ID                       :  766264505
        Device/File Name         : +DATA/rac-cluster/OCRFILE/registry.255.999481825
                                   Device/File integrity check succeeded
                                   Device/File not configured
                                   Device/File not configured 
                                   Device/File not configured
                                   Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded

[root@rac1 bin]# ./ocrconfig -add +CRS
PROT-30: The Oracle Cluster Registry location to be added is not usable.
PROC-50: The Oracle Cluster Registry location to be added is inaccessible on nodes rac2.


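Ø  The attempt above failed because +CRS was not accessible from rac2 (PROC-50). Once the diskgroup is mounted on every node, move the OCR to the CRS diskgroup by adding a new copy in CRS and deleting the copy in DATA.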


[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   e4c2be6b1c114f7ebf4e1689ab0ac3dd (/dev/asm-disk4) [DATA]
Located 1 voting disk(s).


[root@rac1 bin]# ./ocrconfig -add +CRS
[root@rac1 bin]# ./ocrconfig -delete +DATA

[root@rac1 bin]# ./ocrcheck -details
Status of Oracle Cluster Registry is as follows :
        Version                  :          4
        Total space (kbytes)     :     409568
        Used space (kbytes)      :       2284
        Available space (kbytes) :     407284
        ID                       :  766264505
        Device/File Name         : +CRS/rac-cluster/OCRFILE/registry.255.999652799
                                   Device/File integrity check succeeded
                                   Device/File not configured
                                   Device/File not configured
                                   Device/File not configured
                                   Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded


Ø  Move the voting disk to the CRS diskgroup.

[root@rac1 bin]# ./crsctl replace votedisk +CRS
Successful addition of voting disk 53ba18777b624f80bf98a6a496092c65.
Successful addition of voting disk 30d439cafa1d4f64bfe89bdb6856b86a.
Successful addition of voting disk 810896260d384f5dbf485fb026029101.
Successful deletion of voting disk e4c2be6b1c114f7ebf4e1689ab0ac3dd.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced


[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id               File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   53ba18777b624f80bf98a6a496092c65 (/dev/asm-disk8) [CRS]
2. ONLINE   30d439cafa1d4f64bfe89bdb6856b86a (/dev/asm-disk9) [CRS]
3. ONLINE   810896260d384f5dbf485fb026029101 (/dev/asm-disk10) [CRS]
Located 3 voting disk(s).


Ø  At this stage I have the OCR, the VD and the ASM SPFILE on the CRS diskgroup. Losing this diskgroup will bring down the entire stack.

Ø  Let us corrupt one VD using the dd command.


[root@rac1 ~]# dd if=/dev/zero of=/dev/asm-disk8 bs=4096 count=1000000
dd: writing `/dev/asm-disk8': No space left on device
261049+0 records in
261048+0 records out
1069254144 bytes (1.1 GB) copied, 9.50628 s, 112 MB/s


 [root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   30d439cafa1d4f64bfe89bdb6856b86a (/dev/asm-disk9) [CRS]
2. ONLINE   810896260d384f5dbf485fb026029101 (/dev/asm-disk10) [CRS]
Located 2 voting disk(s).

Ø  Let us corrupt one more voting disk.

[root@rac1 ~]# dd if=/dev/zero of=/dev/asm-disk9 bs=4096 count=1000000
dd: writing `/dev/asm-disk9': No space left on device
261049+0 records in
261048+0 records out
1069254144 bytes (1.1 GB) copied, 6.91994 s, 155 MB/s
[root@rac1 ~]#

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. PENDOFFL 30d439cafa1d4f64bfe89bdb6856b86a (/dev/asm-disk9) [CRS]
2. ONLINE   810896260d384f5dbf485fb026029101 (/dev/asm-disk10) [CRS]
Located 2 voting disk(s).
[root@rac1 bin]#

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. PENDOFFL 30d439cafa1d4f64bfe89bdb6856b86a (/dev/asm-disk9) [CRS]
2. ONLINE   810896260d384f5dbf485fb026029101 (/dev/asm-disk10) [CRS]
Located 2 voting disk(s).

Ø  In the alert log we can see the CSS daemon terminating.

2019-02-08 01:30:57.707 [OCSSD(11083)]CRS-1705: Found 1 configured voting files but 2 voting files are required, terminating to ensure data integrity; details at (:CSSNM00021:) in /u01/app/oracle/diag/crs/rac1/crs/trace/ocssd.trc

2019-02-08 01:30:58.708 [OCSSD(11083)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/oracle/diag/crs/rac1/crs/trace/ocssd.trc

2019-02-08 01:31:00.170 [ORAROOTAGENT(27253)]CRS-5019: All OCR locations are on ASM disk groups [CRS], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/u01/app/oracle/diag/crs/rac1/crs/trace/ohasd_orarootagent_root.trc".
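The CRS-1705 line states the shortfall directly: 1 voting file found, 2 required, which is the majority of the 3 that were configured. A minimal sketch for pulling those two numbers out of an alert log line follows; the pattern is based only on the message text shown above, and `voting_file_shortfall` is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

# Pattern taken from the CRS-1705 message text as it appears in this post.
CRS_1705 = re.compile(
    r"CRS-1705: Found (\d+) configured voting files but (\d+) voting files are required"
)


def voting_file_shortfall(line: str) -> Optional[Tuple[int, int]]:
    """Return (found, required) for a CRS-1705 line, or None for any other line."""
    match = CRS_1705.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


SAMPLE = (
    "2019-02-08 01:30:57.707 [OCSSD(11083)]CRS-1705: Found 1 configured voting files "
    "but 2 voting files are required, terminating to ensure data integrity"
)

if __name__ == "__main__":
    print(voting_file_shortfall(SAMPLE))
```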

Ø  Corrupt the last remaining copy of the VD to make the entire diskgroup unusable.

[root@rac1 ~]# dd if=/dev/zero of=/dev/asm-disk10 bs=4096 count=1000000
dd: writing `/dev/asm-disk10': No space left on device
261049+0 records in
261048+0 records out
1069254144 bytes (1.1 GB) copied, 5.01601 s, 213 MB/s

[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4533: Event Manager is online

[root@rac1 bin]# ./crsctl query css votedisk
Unable to communicate with the Cluster Synchronization Services daemon.
[root@rac1 bin]#

Ø  Stop CRS with the force option to bring down the entire stack on all the nodes.


[root@rac1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac1 bin]#


Ø  Start the CRS stack in exclusive mode with the -nocrs option on any one node.

 [root@rac1 bin]# ./crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.evmd' on 'rac1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.evmd' on 'rac1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'rac1'
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded

[root@rac1 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage Storage layer error [Insufficient quorum to open OCR devices] [0]

[root@rac1 bin]# ./ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy
PROT-24: Auto backups for the Oracle Cluster Registry are not available
PROT-25: Manual backups for the Oracle Cluster Registry are not available


Ø  We can see that a dummy ASM instance has started, with no SPFILE and no disk string set.

[root@rac1 bin]#
[oracle@rac1 trace]$ sqlplus "/as sysasm"
SQL*Plus: Release 12.2.0.1.0 Production on Fri Feb 8 01:56:14 2019
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> show parameter spfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string

SQL> show parameter disk_string
SQL> show parameter disk

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string
asm_diskstring                       string


Ø  Recreate the CRS diskgroup, then restore the OCR and the voting disk.

SQL> create diskgroup CRS normal redundancy disk '/dev/asm-disk8','/dev/asm-disk9','/dev/asm-disk10';

Diskgroup created.


SQL> show parameter asm
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string      CRS
asm_diskstring                       string
asm_power_limit                      integer     1
asm_preferred_read_failure_groups    string
SQL>

Ø  Restore the OCR from a backup copy.

[root@rac1 bin]# ./ocrconfig -restore +FRA:/rac-cluster/OCRBACKUP/backup_20190208_004246.ocr.360.999650573


Ø  Restore the voting disk.

[root@rac1 bin]# ./crsctl replace votedisk +CRS
CRS-4602: Failed 27 to add voting file 3556b6ddc04e4f47bfb36062aba9759f.
CRS-4602: Failed 27 to add voting file ce44f7e7b5f94fd0bff68d9aae83652f.
CRS-4602: Failed 27 to add voting file f11db05955894feabfb5810ea0a631bf.
Failed to replace voting disk group with +CRS.
CRS-4000: Command Replace failed, or completed with errors.
[root@rac1 bin]#

Ø  Restoring the voting disk failed with the errors above. Set the ASM_DISKSTRING parameter and try restoring the voting disk again.

SQL> alter system set asm_diskstring='/dev/asm*';

System altered.

SQL> show parameter asm
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string      FRA, CRS
asm_diskstring                       string      /dev/asm*
asm_power_limit                      integer     1
asm_preferred_read_failure_groups    string
SQL>
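To see which devices a discovery string such as `/dev/asm*` would cover, shell-style globbing is a reasonable approximation. The sketch below uses Python's `fnmatch` for that; it is not ASM's own matching logic, and `discovered` and `SAMPLE_DEVICES` are hypothetical names. Note that an empty string matches nothing here, whereas a real ASM instance with an unset ASM_DISKSTRING falls back to a platform-specific default search path.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def discovered(paths: Iterable[str], diskstring: str) -> List[str]:
    """Return the paths matched by a comma-separated list of glob patterns."""
    patterns = [p.strip() for p in diskstring.split(",") if p.strip()]
    return [
        path for path in paths
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


# Device names from this post, plus one unrelated device for contrast.
SAMPLE_DEVICES = ["/dev/asm-disk8", "/dev/asm-disk9", "/dev/asm-disk10", "/dev/sda1"]

if __name__ == "__main__":
    print(discovered(SAMPLE_DEVICES, "/dev/asm*"))
```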

Ø  Retry restoring the VD.

[root@rac1 bin]# ./crsctl replace votedisk +CRS
Successful addition of voting disk 750dfbfa59b64f90bfc843635d9cba7e.
Successful addition of voting disk 19d15e4614594f01bfadb92c5a55a55d.
Successful addition of voting disk 47f89ee448134f77bfd19b2b5e008d97.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced
[root@rac1 bin]#

[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   750dfbfa59b64f90bfc843635d9cba7e (/dev/asm-disk8) [CRS]
2. ONLINE   19d15e4614594f01bfadb92c5a55a55d (/dev/asm-disk9) [CRS]
3. ONLINE   47f89ee448134f77bfd19b2b5e008d97 (/dev/asm-disk10) [CRS]
Located 3 voting disk(s).
[root@rac1 bin]#

Ø  Recreate the ASM SPFILE in the CRS diskgroup.

SQL>  show parameter asm
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string      FRA, CRS, DATA
asm_diskstring                       string      /dev/asm*
asm_power_limit                      integer     1
asm_preferred_read_failure_groups    string


SQL> show parameter spfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string

SQL> create spfile='+CRS' from memory;
File created.


Ø  We have now restored the OCR, the VD and the ASM SPFILE to the CRS diskgroup. Stop CRS gracefully and start it in normal mode.


[root@rac1 bin]# ./crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.storage' on 'rac1'
CRS-2677: Stop of 'ora.storage' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.


Ø  Start CRS on all the nodes and run the cluster verification checks for the OCR and the voting disk.

[root@rac1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

[root@rac2 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.


[oracle@rac1 ~]$ cluvfy comp ocr -n all -verbose
Verifying OCR Integrity ...
PASSED
Verification of OCR integrity was successful.
CVU operation performed:      OCR integrity
Date:                         Feb 8, 2019 1:51:24 PM
CVU home:                     /u01/app/12.2.0.1/grid/
User:                         oracle




[oracle@rac1 ~]$ cluvfy comp vdisk -n all -verbose
Verifying Voting Disk ...PASSED
Verification of Voting Disk was successful.
CVU operation performed:      Voting Disk
Date:                         Feb 8, 2019 2:04:08 PM
CVU home:                     /u01/app/12.2.0.1/grid/
User:                         oracle
[oracle@rac1 ~]$






