Hello All,

I've been a silent member for a while, and now I have a question/problem I hope someone can answer.

I have an Oracle 9i RAC setup consisting of 3 database servers, two apps servers and an EMC CLARiiON CX700 SAN. The servers are all running Red Hat 2.1; the db servers are at kernel .34 and the apps servers are at .49. I have 5 nfs mount points shared out by the first database server. One mount point is for logs from the database servers (only they connect to it), and the others are for miscellaneous files and storage.

We have been doing failover testing: shutting down a database server and seeing if the other two keep processing requests. We have a problem when we shut down the first database server, the one sharing the nfs volumes. The other two servers just hang trying to connect to the shares. I have tried every option in my fstab file to make the servers stop trying to reach the nfs shares if the server is not available, but nothing has worked. Here is one of the lines from the fstab file where I mount a share:

shoprod1:/applcsf /applcsf nfs soft,rsize=8192,wsize=8192,retrans=6,timeo=14,intr

Can you see anything wrong with this line, or can anyone suggest something else to try? I can't even forcibly unmount the shares with an umount -f. And if we try to kill any processes accessing the shares, the kill command just hangs (even with a kill -9).

Any suggestions?

Thanks,
Don Peterson
Network/Telecomm Mgr.
Sterilite Corporation
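For readers trying to work out what the options in that fstab line imply, here is a minimal Python sketch. It assumes the behaviour described in the Linux nfs(5) man page for NFS over UDP: timeo is in tenths of a second, and the wait doubles after each retransmission up to a 60 second cap. The function names are hypothetical, and whether retrans counts the first attempt differs between kernel versions, so treat the result as a rough estimate and not as a diagnosis of the hang described above.

```python
def parse_mount_options(options):
    """Split a comma-separated fstab options field into a dict."""
    parsed = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        # Flags such as "soft" or "intr" carry no value; store True for them.
        parsed[key] = value if sep else True
    return parsed


def soft_timeout_deciseconds(timeo, retrans, cap=600):
    """Estimate how long one request waits before a soft mount gives up.

    Assumption: the wait starts at `timeo` deciseconds and doubles after
    each retransmission, never exceeding `cap` (60 seconds).
    """
    total = 0
    wait = timeo
    for _ in range(retrans):
        total += min(wait, cap)
        wait *= 2
    return total


# Options field copied from the fstab line in the post.
opts = parse_mount_options("soft,rsize=8192,wsize=8192,retrans=6,timeo=14,intr")
estimate = soft_timeout_deciseconds(int(opts["timeo"]), int(opts["retrans"]))
print(estimate / 10.0, "seconds per request before a soft mount returns an error")
```

Under these assumptions each blocked request waits roughly 88 seconds before failing, so a process that issues many requests in sequence could appear hung for far longer than any single timeout.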
On Tue, Oct 12, 2004 at 11:15:11AM -0400, Don Peterson wrote:
I mount a share:

shoprod1:/applcsf /applcsf nfs soft,rsize=8192,wsize=8192,retrans=6,timeo=14,intr

Can you see anything wrong with this line, or can anyone suggest something else to try. I can't even forcibly unmount the shares with an umount -f. And if we try to kill any processes accessing the shares the kill command just hangs (even with a kill -9). Any suggestions?
I don't think there is any good solution. Using the "soft" and "intr" mount options is an extremely bad idea. Do it only if you don't care about losing/corrupting your data. Important data should always be mounted "hard,nointr".

Personally, I avoid NFS like the plague. I only use it for read-only mounts (where soft,intr is less of a problem). Instead, I've standardized on rsync over ssh with RSA keys to get data between servers that need to be synchronized.
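As a rough illustration of the rsync-over-ssh approach mentioned above, the following Python sketch builds the command line for a one-way copy using key-based login. The destination host, the key file path and the function name are hypothetical and not taken from the thread; only the /applcsf path comes from the original post. The sketch prints the command instead of running it.

```python
import shlex


def build_rsync_command(source, destination, identity_file):
    """Return an argv list for a one-way rsync copy over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which suits unattended (cron) runs with key-based login.
    ssh_command = "ssh -i {} -o BatchMode=yes".format(shlex.quote(identity_file))
    # -a preserves permissions and timestamps, -z compresses in transit,
    # -e selects the remote shell rsync should use.
    return ["rsync", "-az", "-e", ssh_command, source, destination]


# Hypothetical host and key file, for illustration only.
command = build_rsync_command(
    "/applcsf/",
    "backuphost:/applcsf/",
    "/root/.ssh/id_rsa_sync",
)
print(" ".join(shlex.quote(part) for part in command))
```

The returned list could be handed to subprocess.run from a scheduled job. Note that this copies data in one direction at intervals, so it suits synchronized copies and log collection, not files that several servers must write to at the same time.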
Don> I have an Oracle 9i RAC setup consisting of 3 database servers,
Don> two apps servers and an EMC CLARiiON CX700 SAN. The servers are
Don> all running Red Hat 2.1, the db servers are at kernel .34 and the
Don> apps servers are at .49. I have 5 nfs mount points shared out by
Don> the first database server.

This is the problem, the fact that you're using NFS in a clustered environment. Have you looked into using a Cluster Filesystem for your needs? Some options would be:

http://www.redhat.com/software/rha/gfs/

It even mentions Oracle 9i in the blurb.

Don> One mount point is for logs from the database servers (only they
Don> connect to it),

Why do they all need to log to one filesystem? Can't they log locally? If you want to aggregate the logs, you could rsync them hourly from the servers to a central log host.

Don> and the others are for miscellaneous files and storage. We have
Don> been doing failover testing, shutting down a database server and
Don> seeing if the other two keep processing requests. We have a
Don> problem when we shut down the first database server, the one
Don> sharing the nfs volumes. The other two servers just hang trying
Don> to connect to the shares. I have tried every option in my fstab
Don> file to make the servers stop trying to reach the nfs shares if
Don> the server is not available, but nothing has worked. Here is one
Don> of the lines from the fstab file where I mount a share:

Yeah, what you're trying to do isn't going to work without either:

1. moving to a Cluster Filesystem so that all hosts can read/write to the same filesystem on the CLARiiON SAN concurrently.

2. moving to a Cluster setup where you have TWO servers in a cluster, and the cluster provides the NFS file service to other servers.

3. Getting a NAS box which provides NFS service to the DB servers and which has the required reliability you need.

I personally like NetApps, but they can be pricey. But even with a single head, they're reliable and run well. I had one box with an uptime of almost 500 days. Oh yeah, UPS and Generator backup helps as well. *grin*

You could get away with using another cheaper NAS, but since you spent the money on the SAN, why not just use that and GFS to provide the storage you need? Especially if most of the filesystems you provide are read (mostly), then you shouldn't have many problems.

John

John Stoffel - Senior Unix Systems Administrator - Lucent Technologies
stoffel@lucent.com - http://www.lucent.com - 978-952-7548
==> Regarding Re: [Wlug] NFS question; "John Stoffel" <stoffel@lucent.com> adds:

Don> I have an Oracle 9i RAC setup consisting of 3 database servers, two
Don> apps servers and an EMC CLARiiON CX700 SAN. The servers are all
Don> running Red Hat 2.1, the db servers are at kernel .34 and the apps
Don> servers are at .49. I have 5 nfs mount points shared out by the first
Don> database server.

stoffel> This is the problem, the fact that you're using NFS in a clustered
stoffel> environment. Have you looked into using a Cluster Filesystem for
stoffel> your needs? Some options would be:

stoffel> http://www.redhat.com/software/rha/gfs/

stoffel> It even mentions Oracle 9i in the blurb.

Don, are you able to use OCFS? If not, you could at least cluster the NFS server with 2 RHEL boxes. I believe this will be cheaper than buying a redundant NAS (and serves precisely the same purpose). In fact, I believe that the cluster software was included in the AS2.1 license.

Don> One mount point is for logs from the database servers (only they
Don> connect to it),

stoffel> Why do they all need to log to one filesystem? Can't they log
stoffel> locally and if you want to aggregate the logs, you could rsync
stoffel> them hourly from the servers to a central log host.

Don> and the others are for miscellaneous files and storage. We have been
Don> doing failover testing, shutting down a database server and seeing if
Don> the other two keep processing requests. We have a problem when we
Don> shut down the first database server, the one sharing the nfs volumes.
Don> The other two servers just hang trying to connect to the shares. I
Don> have tried every option in my fstab file to make the servers stop
Don> trying to reach the nfs shares if the server is not available, but
Don> nothing has worked. Here is one of the lines from the fstab file
Don> where I mount a share:

stoffel> Yeah, what you're trying to do isn't going to work without either:

stoffel> 1. moving to a Cluster Filesystem so that all hosts can read/write
stoffel> to the same filesystem on the CLARiiON SAN concurrently.

stoffel> 2. moving to a Cluster setup where you have TWO servers in a
stoffel> cluster, and the cluster provides the NFS file service to other
stoffel> servers.

Right. This is an easy option, I think, if you can afford 1 or 2 more boxes (depending on whether you want to have a dedicated file server not running database apps).

stoffel> 3. Getting a NAS box which provides NFS service to the DB servers
stoffel> and which has the required reliability you need.

stoffel> I personally like NetApps, but they can be pricey. But even with
stoffel> a single head, they're reliable and run well. I had one box with
stoffel> an uptime of almost 500 days. Oh yeah, UPS and Generator backup
stoffel> helps as well. *grin*

stoffel> You could get away with using another cheaper NAS, but since you
stoffel> spent the money on the SAN, why not just use that and GFS to
stoffel> provide the storage you need. Esp if most of the filesystems you
stoffel> provide are read (mostly) then you shouldn't have many problems.

I do believe RHEL 3 has been certified with Oracle 9i. I'm not sure about the particular apps you are using, but the db itself should work just fine. I would ask your contact at Red Hat.

-Jeff
"Jeff" == Jeff Moyer <jmoyer@redhat.com> writes:
Jeff> Don, are you able to use OCFS? If not, you could at least
Jeff> cluster the NFS server with 2 RHEL boxes. I believe this will
Jeff> be cheaper than buying a redundant NAS (and serves precisely the
Jeff> same purpose). In fact, I believe that the cluster software was
Jeff> included in the AS2.1 license.

This is too funny, I didn't even see this reply before I wrote the other one suggesting just this type of idea. Heh!

Since you can buy 2U boxes with a bunch of mirrored storage for around 3k each, you should be able to set up a cluster providing NFS services to the other hosts pretty cheaply and quickly.

John
participants (4)
- Charles R. Anderson
- Don Peterson
- Jeff Moyer
- John Stoffel