14 May 2013

When Oracle Configuration Assistants, sqlplus, emctl, opmn and even oraenv Start Failing

Ok, this is one of those weird things that hopefully I won't ever experience again - now that I'm blogging about it...

I am currently building virtual machine templates and images using Oracle VM VirtualBox on a laptop, in preparation for transferring the same images to VMware ESXi nodes at a customer site, in order to reduce the lead time. As some of the images have dependencies (namely, the network), I started using host-only networking instead of NAT networking in VirtualBox. The downside of host-only networking is that it is exactly that: by default the guests have no access to the Internet, and so no way to download and update software.

So what I did was install the Squid proxy server on the host and configure it to accept HTTP traffic on the host-only interface and forward it to the Internet. The only prerequisite on the guests was to set the http_proxy and ftp_proxy environment variables, which I did in an /etc/profile.d/profile.sh script so that they would be inherited by all (Bourne-derived) shells. I realise that I could potentially have used Microsoft's Internet Connection Sharing mechanism, but with countless interfaces and wonderful and strange configurations I wasn't sure how much that was going to break (ironically... Do read on.)
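
A minimal sketch of such a profile script follows. The address and port are assumptions for illustration: 192.168.56.1 is VirtualBox's default host-only adapter address and 3128 is Squid's default listening port, so substitute whatever your host actually uses.

```shell
# /etc/profile.d/profile.sh (illustrative sketch, not the original file)
# ASSUMPTION: the Squid proxy on the host listens on 192.168.56.1:3128.
PROXY_URL="http://192.168.56.1:3128"

# Export the variables so that every login shell, and every process
# started from it, inherits them.
http_proxy="$PROXY_URL"
ftp_proxy="$PROXY_URL"
export http_proxy ftp_proxy
```

Note that "every process" includes anything started from those shells, not only wget and yum.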

This worked really well and all seemed to function nicely in terms of updating and downloading software.

Then, when I restarted the VMs a few days later, things suddenly started breaking. My OEM VM would refuse to start up the database for no apparent reason. Further investigation showed that manually running oraenv would fail, and $ORACLE_BASE would be populated with the string "Failed to create XML context, error 54." It turns out that the orabase command, which oraenv uses to determine ORACLE_BASE, would return this string and fail. An strace indicated that it was indeed loading XML file(s) such as oraclehomeproperties.xml.

Also, I would see errors when trying to stop the OEM middle-tier because OPMN suddenly refused to parse its opmn.xml file (it's still a wonder how it could start the services in the first place):

Output messages of the command :
opmnctl stopall: error parsing /u01/app/oracle/Middleware/gc_inst/WebTierIH1/config/OPMN/opmn/opmn.xml

2013-05-14 09:08:17,278 [main] INFO  util.EmctlUtil logp.251 - error messages of the command :
[2013-05-14T09:08:17+01:00] [opmn] [ERROR:1] [107] [internal] XML parser init: error 54.

But hello - same error code (54). So there's a correlation between the two.

Separately, I had another VM being configured for Oracle ASM, where SQL*Plus would fail (although ASMCA itself carried on) during ASM installation:

INFO: Read:
INFO: Read: Configuring ASM failed with the following message:
INFO: Read: There is an error in creating the following process:
INFO: Read: /u01/app/grid/product/11.2.0/grid/bin/sqlplus -S -N /NOLOG
INFO: Read: The error is:
INFO: Read: Error 46 initializing SQL*Plus
INFO: Read: HTTP proxy setting has incorrect value
INFO: Read: SP2-1502: The HTTP proxy server specified by http_proxy is not accessible
INFO: Read:
INFO: Read:
INFO: Completed Plugin named: Automatic Storage Management Configuration Assistant
INFO: Automatic Storage Management Configuration Assistant failed.

Because a few days had passed between setting the variables and restarting the VMs, it took me some time to connect the dots between all these failures. SQL*Plus can load scripts from URLs, and it would seem that setting http_proxy can make it fail at start-up (hence the SP2-1502 message above). Likewise, the XML parsers suddenly started failing on files like opmn.xml and oraclehomeproperties.xml, possibly because they too try to access URL resources while parsing.

So at this stage I removed the http_proxy and ftp_proxy environment variables so that they are no longer set by default - this solved all my problems.
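
If the variables cannot be removed everywhere at once, a quick check and a per-command workaround can help. The sketch below is an illustration only: list_proxy_vars and without_proxy are hypothetical helper names of my own choosing, and it relies on the -u option of env as found in GNU coreutils.

```shell
# Report any proxy variables still present in the current environment.
list_proxy_vars() {
    env | grep -E '^(http|ftp)_proxy=' || echo "no proxy variables set"
}

# Run a single command with the proxy variables stripped from its
# environment, leaving the calling shell untouched.
without_proxy() {
    env -u http_proxy -u ftp_proxy "$@"
}
```

For example, `without_proxy sqlplus -S /NOLOG` would start SQL*Plus with neither variable set, even in a shell where a profile script still exports them.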

So when I need to wget files or yum update I will just need to remember to run a script to set the proxies.
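
One way to make that opt-in explicit is a small wrapper that sets the proxies for a single command only. This is a sketch under assumptions: with_proxy is a hypothetical name, and the default address (VirtualBox's default host-only address with Squid's default port) should be replaced with your own.

```shell
# ASSUMPTION: Squid on the host listens on 192.168.56.1:3128.
# Override by setting PROXY_URL before sourcing this file.
PROXY_URL="${PROXY_URL:-http://192.168.56.1:3128}"

# Run one command with the proxy variables set in its environment only;
# nothing is exported into the calling shell.
with_proxy() {
    env http_proxy="$PROXY_URL" ftp_proxy="$PROXY_URL" "$@"
}
```

Usage would be along the lines of `with_proxy wget <url>`. Because the variables exist only for the lifetime of that one command, any Oracle tools started from the same shell never see them.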

Lesson learnt: Never ever set http_proxy and ftp_proxy when running Oracle software. Whether this is a bug or a feature (by design) is unknown but now at least I know what the problem is...


Thomas Brotherton said...

Thank you for the help. I was beating my head against this at work for a couple of days, getting OPMN to work right for a Forms install.

Gábor Szilos said...

Thanks a lot man! You saved my day. I would emphasize that the environment variable must be removed completely from every start-up script, e.g. if it was set in /etc/profile.

I put a link to your post on my blog.

David said...

I had the exact same issue and removing the http_proxy environment variable (that I had set for Perl) solved the problem. Thank you. I already went crazy searching for the reason.