Corey Kovacs
2015-05-06 05:55:27 UTC
Hello,
I recently migrated approximately 100 RHEL 6 machines from NFSv3 to NFSv4
(server is RHEL 6.6, clients are a mix from 6.3 to 6.6). Things went pretty
smoothly until several hours into the new configuration, when everything
started running very slowly. Restarting the NFS server process clears the
issue, which seems to indicate the server is the problem.
The file system itself can sustain very high I/O throughput, on the order of
1GB/sec, and the "th" values in /proc/net/rpc/nfsd never increase, which
suggests I/Os are completing on time with no thread starvation. The server
is set for 384 threads. Under the previous NFSv3 config the thread count was
much higher and we had no problems.
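
For anyone wanting to watch the same counters, here is a minimal sketch of how the "th" line can be read. It assumes the layout used by kernels of this era (thread count, then the number of times all threads were busy, then a ten-bucket busy-time histogram) and Python 3.6 or later; the function name is mine, not part of any tool.

```python
import os


def parse_th_line(text):
    """Return the fields of the 'th' line from /proc/net/rpc/nfsd, or None.

    Assumed layout: th <threads> <times-all-busy> <histogram values...>
    """
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "th":
            return {
                "threads": int(fields[1]),
                "all_busy_count": int(fields[2]),
                "histogram": [float(x) for x in fields[3:]],
            }
    return None


if __name__ == "__main__":
    path = "/proc/net/rpc/nfsd"  # only present on a running NFS server
    if os.path.exists(path):
        with open(path) as f:
            print(parse_th_line(f.read()))
    else:
        print("no nfsd statistics found at", path)
```

Sampling this every few minutes and comparing all_busy_count between samples would show whether the threads ever saturate as the slowdown sets in.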
I suspect file locking, as the primary application in use (an in-house app)
uses a lot of little startup scripts which call other scripts to set up the
environment, etc. Under normal circumstances this startup takes about 6
seconds. Over time that duration increases to 30 and even 70 seconds in
some cases.
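
To put numbers on that drift, something like the following could be left running on a client. It is only a sketch, assuming Python 3.6 or later: the command being timed is a hypothetical stand-in passed on the command line, not the actual application, and the run count and pause are arbitrary.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return its wall-clock duration in seconds."""
    start = time.monotonic()
    subprocess.run(argv, check=False,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.monotonic() - start


def sample(argv, runs, pause):
    """Time argv `runs` times, sleeping `pause` seconds between runs."""
    durations = []
    for i in range(runs):
        durations.append(time_command(argv))
        if i + 1 < runs:
            time.sleep(pause)
    return durations


if __name__ == "__main__":
    # Placeholder command: substitute the real startup script here.
    argv = sys.argv[1:] or [sys.executable, "-c", "pass"]
    for duration in sample(argv, runs=3, pause=1.0):
        print(f"{time.strftime('%H:%M:%S')} {duration:.2f}s")
```

Logging the timestamps alongside the durations makes it possible to line the slowdown up against server-side counters taken at the same time.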
I've scoured every reference to NFSv4 performance degradation I could find,
but nothing seems to call out what we are experiencing. A few retrans exist
in nfsstat but nothing that stands out. Generally, everything "looks" OK but
clearly is not.
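
For reference, the retrans figure that nfsstat reports on a client comes from the "rpc" line of /proc/net/rpc/nfs. A rough sketch for turning it into a ratio, assuming the usual client-side layout of calls, retransmissions, and auth refreshes (the function name is my own):

```python
import os


def parse_rpc_line(text):
    """Return (calls, retrans, ratio) from the client 'rpc' line, or None.

    Assumed layout: rpc <calls> <retrans> <authrefresh>
    """
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "rpc":
            calls = int(fields[1])
            retrans = int(fields[2])
            ratio = retrans / calls if calls else 0.0
            return calls, retrans, ratio
    return None


if __name__ == "__main__":
    path = "/proc/net/rpc/nfs"  # client-side statistics
    if os.path.exists(path):
        with open(path) as f:
            print(parse_rpc_line(f.read()))
    else:
        print("no NFS client statistics found at", path)
```

The counters are cumulative since boot, so the difference between two samples says more than the absolute value.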
Oh, and this is all being run over 10G Ethernet.
If memory serves, I believe the kernel is 2.6.32-504.8.1 on the server.
Any ideas about what else to check would be greatly appreciated.
Thanks
-C
--
redhat-list mailing list
unsubscribe mailto:redhat-list-***@redhat.com?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/redhat-list