Oracle node error. When trying to add a node to Oracle RAC
CRS-0215 / ONS Failed to Start. Pingwait Exited With Exit Status 2
Extract from Oracle KB Article
Applies to:
Oracle Server - Enterprise Edition - Version: 10.1.0.2 to 11.1.0.7 - Release: 10.1 to 11.1
Information in this document applies to any platform.
Symptoms
The problem can occur during an installation of CRS or while adding a new node.
* ONS fails to start on one or both nodes during a new install.
* When adding nodes, the ONS on the new node fails to start.
srvctl start nodeapps -n node2
CRS-0215: Could not start resource 'ora.node2.ons'.
$RDBMS_HOME/opmn/logs/ons.log does not have any updates.
Cause
The remote port configured for ONS is already in use or otherwise not available.
Solution
Set up debugging :
srvctl stop nodeapps -n node2
crsctl debug log res 'ora.node2.ons:5'
srvctl start nodeapps -n node2
- $ORA_CRS_HOME/log//racg/ora.node2.ons
Oracle Database 11g CRS Release 11.1.0.6.0 - Production
Copyright 1996, 2007 Oracle. All rights reserved.
2008-12-05 13:13:11.296: [RACG][3184][3144][3184][ora.node2.ons]: ons failed to start. pingwait exited with exit status 2
Test:
ons.config :
localport=6150
useocr=on
allowgroup=true
usesharedinstall=true
onsctl ping on node2 :
onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
{node = node1, port = 6251}
Adding remote host node1:6251
onscfg[1]
{node = node2, port = 6251}
Adding remote host node2:6251
ons is NOT running . . .
onsctl start on node2 :
onsctl start
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
{node = node1, port = 6251}
Adding remote host node1:6251
onscfg[1]
{node = node2, port = 6251}
Adding remote host node2:6251
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
{node = node1, port = 6251}
Adding remote host node1:6251
onscfg[1]
{node = node2, port = 6251}
Adding remote host node2:6251
ons failed to start. pingwait exited with exit status 2
OCRDUMP
[DATABASE.ONS_HOSTS.node1]
ORATEXT : node1
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS,
GROUP_PERMISSION : PROCR_READ,
OTHER_PERMISSION : PROCR_READ, USER_NAME : administrator,
GROUP_NAME : }
[DATABASE.ONS_HOSTS.node1.PORT]
ORATEXT : 6251
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS,
GROUP_PERMISSION : PROCR_READ,
OTHER_PERMISSION : PROCR_READ, USER_NAME : administrator,
GROUP_NAME : }
[DATABASE.ONS_HOSTS.node2]
ORATEXT : node2
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS,
GROUP_PERMISSION : PROCR_READ,
OTHER_PERMISSION : PROCR_READ, USER_NAME : administrator,
GROUP_NAME : }
[DATABASE.ONS_HOSTS.node2.PORT]
ORATEXT : 6251
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS,
GROUP_PERMISSION : PROCR_READ,
OTHER_PERMISSION : PROCR_READ, USER_NAME : administrator
Note: The remote port is registered in the OCR, as the ocrdump shows. You do not need to set the remote port in the ons.config file.
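To see at a glance which remote port each node has registered, the ocrdump text can be scanned for the DATABASE.ONS_HOSTS.<node>.PORT keys. The following is only a minimal sketch: it assumes the dump has the same layout as the extract above (a bracketed key line followed by an ORATEXT line), and the function name ons_ports_from_ocrdump is made up for this illustration, not part of any Oracle tool.

```python
import re

# Matches key headers such as [DATABASE.ONS_HOSTS.node2.PORT]
PORT_KEY = re.compile(r"^\[DATABASE\.ONS_HOSTS\.(?P<node>[^\]]+)\.PORT\]$")
# Matches the numeric value line that follows a PORT key, e.g. "ORATEXT : 6251"
ORATEXT = re.compile(r"^ORATEXT\s*:\s*(?P<value>\d+)$")


def ons_ports_from_ocrdump(text):
    """Return {node: port} for every DATABASE.ONS_HOSTS.<node>.PORT key.

    Hypothetical helper: assumes the layout shown in the ocrdump extract.
    """
    ports = {}
    current_node = None
    for raw in text.splitlines():
        line = raw.strip()
        key = PORT_KEY.match(line)
        if key:
            current_node = key.group("node")
            continue
        if line.startswith("["):
            # Any other key header ends the PORT key we were tracking.
            current_node = None
            continue
        value = ORATEXT.match(line)
        if value and current_node is not None:
            ports[current_node] = int(value.group("value"))
            current_node = None
    return ports
```

Feeding it the text of an ocrdump output file gives a small dictionary of node names and ports, which can then be compared against what is actually free on each node.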
Solution :
The key issue is that the remote port shown in the ocrdump (6251) is already in use or unavailable on the node.
Running netstat confirmed that port 6251 was not free on the node.
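As an alternative to reading netstat output by eye, a quick bind test can tell whether a candidate port is free. This is a rough sketch, not what the KB article prescribes: port_is_free is a hypothetical helper, it checks only IPv4 TCP at the moment it runs, and a port below 1024 may report as not free simply because binding it requires elevated privileges.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind to (host, port) right now.

    Hypothetical helper: a point-in-time check for IPv4 TCP only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Bind failed: the port is in use or otherwise unavailable.
            return False
    return True


if __name__ == "__main__":
    # 6251 is the port from the ocrdump; 2200 is the replacement tried here.
    for candidate in (6251, 2200):
        state = "free" if port_is_free(candidate) else "not free"
        print(f"port {candidate}: {state}")
```

Run it on the node where ONS fails to start; a port reported as free here can still be taken by another process before ONS is restarted.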
We reconfigured ONS to use a different, free port. In this case we used 2200:
srvctl stop nodeapps -n node2
racgons remove_config node2:6251
racgons add_config node2:2200
The two nodes can be configured with different ports. In this case, however, we made the same change on both nodes.
ocrdump will now reflect the new port.
Finally, we ran srvctl start nodeapps -n node2, and it started successfully.
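When the same change has to be repeated on several nodes, it can help to generate the command sequence rather than retype it. The sketch below only builds and prints the four commands used above (stop nodeapps, remove the old registration, add the new one, start nodeapps); it does not run them. The function name ons_port_change_commands is an assumption made for this example.

```python
def ons_port_change_commands(node, old_port, new_port):
    """Build the ONS port-change command sequence without running it.

    Hypothetical helper: mirrors the srvctl/racgons steps shown above.
    """
    return [
        ["srvctl", "stop", "nodeapps", "-n", node],
        ["racgons", "remove_config", f"{node}:{old_port}"],
        ["racgons", "add_config", f"{node}:{new_port}"],
        ["srvctl", "start", "nodeapps", "-n", node],
    ]


if __name__ == "__main__":
    # Dry run: print the commands for review instead of executing them.
    for cmd in ons_port_change_commands("node2", 6251, 2200):
        print(" ".join(cmd))
```

Printing first and pasting into the shell keeps a human in the loop, which seems prudent for commands that modify the OCR.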
Variation of this problem :
The problem can also manifest when adding a node on Windows, and perhaps on other platforms as well.
From our documentation:
Oracle® Database Release Notes
11g Release 1 (11.1) for Microsoft Windows
Part Number B32005-06
4 Installation, Configuration, and Upgrade Issues
4.7 Incorrect Port Number Registered for the New Node
When you run the crssetup.add.bat batch file to add another node, incorrect port number is registered for the new node.
Workaround: Complete the following procedure to resolve this issue:
After running the crssetup.add.bat batch file, ignore the error messages similar to the following error message:
Starting ONS application resource on (*) nodes1:CRS-0215: Could not start resource 'ora.*.ons'
Use the following command to stop the nodeapps service on all the newly added nodes:
srvctl stop nodeapps -n node
Use the following command to delete the existing ONS port number registration:
racgons remove_config node:4948
Use the following command to add an ONS port number:
racgons add_config node:remote_port
Use the following command to start the nodeapps service on all the newly added nodes:
srvctl start nodeapps -n node
*********************SOLUTION FOR WINDOWS ********************
On Windows, the issue in my case wasn't that the port was in use. The port had been registered against the hostname in the wrong letter case.
Get the hostname by running 'hostname' at a Command Prompt. In my case the node had been registered as node003, and ONS kept failing to start on it.
Looking up the hostname showed NODE003 instead. Re-registering the port with the hostname in the correct case, as below, resolved the issue.
D:\oracle\product\11.1.0\crs\BIN>srvctl stop nodeapps -n node003
D:\oracle\product\11.1.0\crs\BIN>racgons remove_config node003:2300
racgons: Existing key value on node003 = 2300.
racgons: node003:2300 removed from OCR.
D:\oracle\product\11.1.0\crs\BIN>racgons add_config NODE003:6251
D:\oracle\product\11.1.0\crs\BIN>srvctl start nodeapps -n NODE003
D:\oracle\product\11.1.0\crs\BIN>srvctl status nodeapps -n node003
VIP is running on node: node003
GSD is running on node: node003
Listener is running on node: node003
ONS daemon is running on node: node003