Thursday, April 19, 2007

NFS server - file stats

Have you ever wondered which files are accessed most often on your NFS server? How well are those files cached? You've got many NFS clients...

We've put a new NFS server into service: Solaris 10, an Opteron server, Sun Cluster 3.2, ZFS, etc.
So far only part of the production data is served from it, and we see somewhat surprising numbers.


bash-3.00# /usr/local/sbin/nicstat.pl 10 3
[omitting first output]
Time     Int       rKb/s     wKb/s   rPk/s    wPk/s    rAvs     wAvs  %Util   Sat
03:04:20 nge1       0.07      0.05    1.20     1.20   61.50    46.67   0.00  0.00
03:04:20 nge0       0.07      0.05    1.20     1.20   61.50    46.67   0.00  0.00
03:04:20 e1000g1   71.87      0.13  446.22     1.20  164.92   114.83   0.06  0.00
03:04:20 e1000g0    0.34  10117.91    5.40  7120.07   64.00  1455.15   8.29  0.00
Time     Int       rKb/s     wKb/s   rPk/s    wPk/s    rAvs     wAvs  %Util   Sat
03:04:30 nge1       0.08      0.06    1.30     1.30   62.77    47.54   0.00  0.00
03:04:30 nge0       0.08      0.06    1.30     1.30   62.77    47.54   0.00  0.00
03:04:30 e1000g1   69.13      0.14  430.27     1.30  164.53   110.92   0.06  0.00
03:04:30 e1000g0    0.43   9827.54    6.79  6914.19   64.29  1455.47   8.05  0.00
bash-3.00#


So we have 9-10MB/s being served.
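
As a quick sanity check of that figure, here is a minimal Python sketch that converts the two e1000g0 wKb/s samples from the nicstat output into MB/s. It assumes 1 MB = 1024 KB (my assumption about the units, not something nicstat printed), and the variable and function names are made up for illustration.

```python
# wKb/s values for e1000g0, copied from the two nicstat samples above.
write_kb_per_s = [10117.91, 9827.54]


def kb_to_mb(kb_per_s):
    """Convert KB/s to MB/s, assuming 1 MB = 1024 KB."""
    return kb_per_s / 1024.0


# One converted value per nicstat sample.
rates_mb = [kb_to_mb(v) for v in write_kb_per_s]

for kb, mb in zip(write_kb_per_s, rates_mb):
    print(f"{kb:10.2f} KB/s = {mb:5.2f} MB/s")
```

Both samples land between 9 and 10 MB/s, which is where the number above comes from.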


bash-3.00# iostat -xnz 1
[omitting first output]
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
^C


But we are not touching the disks at all: iostat shows no device activity.
'zpool iostat 1' confirms that as well.

Now I wonder which files we are actually serving right now.


bash-3.00# ./rfileio.d 10s
Tracing...

Read IOPS, top 20 (count)
/media/d001/a/nfs.wsx logical 101
/media/d001/a/0410_komentarz_walutowy.wmv logical 712
/media/d001/a/0410_komentarz_gieldowy.wmv logical 3654

Read Bandwidth, top 20 (bytes)
/media/d001/a/nfs.wsx logical 188264
/media/d001/a/0410_komentarz_walutowy.wmv logical 1016832
/media/d001/a/0410_komentarz_gieldowy.wmv logical 96774144

Total File System miss-rate: 0%
^C


In 10 seconds we read ~95MB, which agrees with the 9-10MB/s that nicstat reported. Every read is reported as "logical", which agrees with iostat showing no disk activity.
And most important - we now know which files are being served!
So it's time to tune the NFS clients... :)
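
The cross-check between rfileio.d and nicstat can be redone by hand. The sketch below is only illustrative arithmetic in Python on the byte counts printed above, not part of rfileio.d. The variable names are hypothetical and 1 MB = 1024 * 1024 bytes is my assumption.

```python
# Byte counts from the "Read Bandwidth" section of the rfileio.d output above.
read_bytes = {
    "/media/d001/a/nfs.wsx": 188264,
    "/media/d001/a/0410_komentarz_walutowy.wmv": 1016832,
    "/media/d001/a/0410_komentarz_gieldowy.wmv": 96774144,
}
interval_s = 10  # rfileio.d was run with "10s"

# Total logical reads over the interval, and the resulting average rate.
total_bytes = sum(read_bytes.values())
total_mb = total_bytes / (1024 * 1024)  # assumes 1 MB = 1024 * 1024 bytes
rate_mb_s = total_mb / interval_s

# Which single file dominates the traffic, and by how much.
top_file, top_bytes = max(read_bytes.items(), key=lambda kv: kv[1])
top_share = 100.0 * top_bytes / total_bytes

print(f"total read: {total_mb:.1f} MB in {interval_s}s = {rate_mb_s:.2f} MB/s")
print(f"top file:   {top_file} ({top_share:.1f}% of bytes)")
```

The rate comes out a little below what nicstat shows on the wire, which is what I would expect, since the network counters also include protocol headers. It also makes it obvious that a single file accounts for nearly all of the traffic.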

You can find the rfileio.d script in the DTraceToolkit (although I modified it slightly).

Now imagine what you can do with such capabilities on busier servers. You don't have to guess which files are served most often and how well they cache. Using another script, 'rfsio.d', you can break down the statistics by file system. And if you want to customize them, you can do so easily and safely, as those scripts are written in DTrace.

Of course all of the above is safe to run in production - that's the most important thing.

Additionally, to put it clearly: I ran this on the NFS server, not on the NFS clients, so it doesn't matter whether your clients are *BSD, Linux, Windows, Solaris, ... as long as your NFS server is running Solaris.

1 comment:

cdmackay said...

hi Mike, have you tried the new fsstat(1M) at all? cheers, calum.