Meant to send this to the whole list.

---------- Forwarded message ----------
From: Chuck Haines <chaines@gmail.com>
Date: Thu, 9 Sep 2004 13:44:53 -0400
Subject: Re: [Wlug] NFS Trouble
To: jmoyer@redhat.com

Server:
Kernel = 2.4.21-15.ELsmp
nfs-utils = 1.0.6-21EL

Client:
Kernel = 2.4.21-15.ELsmp
nfs-utils = 1.0.6-21EL

Both machines are running Red Hat Advanced Server 3.0 (Taroon Update 2).

When performing the copy, top on the server looks like this after about
75% of the transfer:

 13:43:19  up 26 days,  6:05,  2 users,  load average: 7.19, 3.30, 1.62
125 processes: 124 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    0.2%    0.0%    1.6%   0.0%     0.4%   48.3%   49.3%
           cpu00    0.0%    0.0%    1.6%   0.0%     0.0%   66.6%   31.6%
           cpu01    0.4%    0.0%    1.6%   0.0%     0.8%   30.0%   67.0%
Mem:  2061576k av, 2044428k used,   17148k free,       0k shrd,  351044k buff
                   1425940k actv,  270704k in_d,   31124k in_c
Swap: 1052248k av,   19780k used, 1032468k free                 1371100k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
24242 bkmcd     15   0  4648 4220  3012 S     0.4  0.2   0:03   1 smbd
  165 root      17   0     0    0     0 SW    0.2  0.0  26:10   0 raid5d
 4783 root      15   0     0    0     0 DW    0.2  0.0  80:48   0 nfsd
 4790 root      15   0     0    0     0 DW    0.2  0.0  80:29   1 nfsd
 4788 root      15   0     0    0     0 DW    0.2  0.0  82:35   0 nfsd
 4785 root      15   0     0    0     0 DW    0.2  0.0  80:40   0 nfsd
 4784 root      15   0     0    0     0 DW    0.2  0.0  81:18   1 nfsd
24515 chaines   15   0  1256 1256   900 R     0.2  0.0   0:00   0 top
    1 root      15   0   508  468   448 S     0.0  0.0   0:24   0 init
    2 root      RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0

As you can see, about half of the CPU time is spent in I/O wait (48.3%
overall, 66.6% on cpu00), and the nfsd threads shown are all in
uninterruptible sleep (state D).

Hope this helps,
Chuck

On Thu, 9 Sep 2004 12:54:28 -0400, Jeff Moyer <jmoyer@redhat.com> wrote:
==> Regarding [Wlug] NFS Trouble; Chuck Haines <chaines@gmail.com> adds:
chaines> I have an interesting problem that is occurring. I have a /home
chaines> directory on a file server and I have several login servers that
chaines> mount this /home directory via NFS. Now this /home directory has
chaines> a ton of files and directories (about 2000 user accounts). I
chaines> noticed that when I perform a copy of a large file (I tested with
chaines> a 50 MB file) on a login server from something like /tmp to the
chaines> NFS mounted /home, the load average on the file server jumps up to
chaines> over 7, and the copy seems to take forever (around 3-4 minutes).
chaines> If I copy a large file (once again I tested with 50 MB) from the
chaines> NFS mounted /home to something like /tmp on the login server,
chaines> there is no noticeable load average increase and the copy takes
chaines> approximately 10-15 seconds. So it seems that writes to the NFS
chaines> mounted /home are taking longer than they should and are spiking
chaines> the load average. Both the file server and the login servers are
chaines> dual P3s with 2 GB of RAM. Has anyone ever seen anything like
chaines> this, or does anyone know of any solutions to this problem?
Kernel version (client and server)? Version of nfs-utils? What does top show as taking up CPU when you do the copy and your load average spikes?
-Jeff
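
For anyone who wants to reproduce the write/read comparison described above with comparable numbers, here is a minimal sketch of a timing helper. It is not part of the original test (which was a plain copy); the function name `timed_copy` and the explicit fsync are my own assumptions, added so that the client's page cache does not hide the cost of the write.

```python
import os
import shutil
import time


def timed_copy(src, dst):
    """Copy src to dst, flush it to stable storage, return (seconds, MB/s).

    Hypothetical helper for illustration only. The fsync is an assumption:
    it makes the timing include the point where the data has actually been
    written out rather than just buffered on the client.
    """
    size_mb = os.path.getsize(src) / (1024 * 1024)
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, 1024 * 1024)  # copy in 1 MB chunks
        fout.flush()                # push Python's buffer to the OS
        os.fsync(fout.fileno())     # ask the OS to commit the data
    elapsed = time.monotonic() - start
    rate = size_mb / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate
```

Calling it once with the source in /tmp and the destination under the NFS-mounted /home, and once in the other direction, gives the two figures to compare. Note that the read direction may be served from the client's cache if the file was read recently, so that number can look better than the server really is.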
--
Chuck Haines
chaines@gmail.com
-------------------------------------------
Tau Kappa Epsilon Fraternity
WPI Class of 2005
-------------------------------------------
AIM: CyberGrex
YIM: CyberGrex_27
ICQ: 3707881
-------------------------------------------