Meant to send this to the whole list.
---------- Forwarded message ----------
From: Chuck Haines <chaines(a)gmail.com>
Date: Thu, 9 Sep 2004 13:44:53 -0400
Subject: Re: [Wlug] NFS Trouble
To: jmoyer(a)redhat.com
Server:
Kernel = 2.4.21-15.ELsmp
nfs-utils = 1.0.6-21EL
Client:
Kernel = 2.4.21-15.ELsmp
nfs-utils = 1.0.6-21EL
Both machines are running Red Hat Enterprise Linux AS 3 (Taroon Update 2).
When performing the copy, top on the server looks like this after
about 75% of the transfer:
 13:43:19  up 26 days,  6:05,  2 users,  load average: 7.19, 3.30, 1.62
125 processes: 124 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    0.2%    0.0%    1.6%   0.0%     0.4%   48.3%   49.3%
           cpu00    0.0%    0.0%    1.6%   0.0%     0.0%   66.6%   31.6%
           cpu01    0.4%    0.0%    1.6%   0.0%     0.8%   30.0%   67.0%
Mem:  2061576k av, 2044428k used,   17148k free,       0k shrd,  351044k buff
                   1425940k actv,  270704k in_d,   31124k in_c
Swap: 1052248k av,   19780k used, 1032468k free                 1371100k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
24242 bkmcd     15   0  4648 4220  3012 S     0.4  0.2   0:03   1 smbd
  165 root      17   0     0    0     0 SW    0.2  0.0  26:10   0 raid5d
 4783 root      15   0     0    0     0 DW    0.2  0.0  80:48   0 nfsd
 4790 root      15   0     0    0     0 DW    0.2  0.0  80:29   1 nfsd
 4788 root      15   0     0    0     0 DW    0.2  0.0  82:35   0 nfsd
 4785 root      15   0     0    0     0 DW    0.2  0.0  80:40   0 nfsd
 4784 root      15   0     0    0     0 DW    0.2  0.0  81:18   1 nfsd
24515 chaines   15   0  1256 1256   900 R     0.2  0.0   0:00   0 top
    1 root      15   0   508  468   448 S     0.0  0.0   0:24   0 init
    2 root      RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0

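As you can see, almost none of the CPU time is spent doing actual work:
about half of it (48.3% overall, 66.6% on cpu00) is I/O wait, and every
nfsd thread listed is in uninterruptible sleep (state D).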
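
In case it is useful for comparing later runs, here is a rough Python
sketch that pulls the percentages out of the "CPU states" rows above and
adds up how much time is real work versus I/O wait. The function names
(parse_cpu_row, busy_percent) are made up for this example, and it assumes
the column order shown by this version of top; other versions of top lay
the line out differently.

```python
# Column order as printed in the "CPU states" header of the top output above.
# Assumption: this layout is specific to the top shipped with this release.
FIELDS = ["user", "nice", "system", "irq", "softirq", "iowait", "idle"]


def parse_cpu_row(row):
    """Turn a row such as 'total 0.2% 0.0% ...' into (label, {field: percent})."""
    parts = row.split()
    label, values = parts[0], parts[1:]
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of columns: %r" % row)
    return label, dict((name, float(v.rstrip("%"))) for name, v in zip(FIELDS, values))


def busy_percent(stats):
    """Time spent executing code: everything except iowait and idle."""
    return sum(stats[name] for name in ("user", "nice", "system", "irq", "softirq"))


# The three CPU rows copied from the top output above.
rows = [
    "total 0.2% 0.0% 1.6% 0.0% 0.4% 48.3% 49.3%",
    "cpu00 0.0% 0.0% 1.6% 0.0% 0.0% 66.6% 31.6%",
    "cpu01 0.4% 0.0% 1.6% 0.0% 0.8% 30.0% 67.0%",
]
stats = dict(parse_cpu_row(r) for r in rows)

for label in ("total", "cpu00", "cpu01"):
    s = stats[label]
    print("%s: busy %.1f%%, iowait %.1f%%, idle %.1f%%"
          % (label, busy_percent(s), s["iowait"], s["idle"]))
```

The point is only that the busy figure is tiny compared with iowait, which
suggests the server is waiting on the disks rather than running out of CPU.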
Hope this helps,
Chuck
On Thu, 9 Sep 2004 12:54:28 -0400, Jeff Moyer <jmoyer(a)redhat.com> wrote:
> ==> Regarding [Wlug] NFS Trouble; Chuck Haines <chaines(a)gmail.com> adds:
>
> chaines> I have an interesting problem that is occurring. I have a /home
> chaines> directory on a file server and I have several login servers that
> chaines> mount this /home directory via NFS. Now this /home directory has
> chaines> a ton of files and directories (about 2000 user accounts). I
> chaines> noticed that when I perform a copy of a large file ( I tested with
> chaines> a 50 MB file) on a login server from something like /tmp to the
> chaines> NFS mounted /home, the load average on the file server jumps up to
> chaines> over 7, and seems to take forever (around 3-4 minutes). If I copy
> chaines> a large file (once again I tested with 50 MB) from the NFS mounted
> chaines> /home to like /tmp on the login server, there is no noticeable load
> chaines> average increase and it occurs in approximately 10-15 seconds. So
> chaines> it seems that writes to the NFS mounted /home are taking longer
> chaines> than they should and are spiking the load average. Both the file
> chaines> server and the login servers are dual P3's with 2 gigs of ram.
> chaines> Has anyone ever seen anything like this, or knows of any solutions
> chaines> to this problem?
>
> Kernel version (client and server)? Version of nfs-utils? What does top
> show as taking up CPU when you do the copy and your load average spikes?
>
> -Jeff
>
--
Chuck Haines
chaines(a)gmail.com
-------------------------------------------
Tau Kappa Epsilon Fraternity
WPI Class of 2005
-------------------------------------------
AIM: CyberGrex
YIM: CyberGrex_27
ICQ: 3707881
-------------------------------------------