Tuesday, December 10, 2013

NFS VAAI Statistics for NetApp Storage

Welcome: To stay updated with all my Blog posts follow me on Twitter @arunpande !!
 
In this blog I will discuss the NFS VAAI statistics that can be used on NetApp storage to measure performance and troubleshoot VAAI related issues. These statistics will help you determine whether Copy Offload is being used by the storage array. I will cover both 7-Mode and Clustered Data ONTAP.


On the NetApp storage, use the following commands to monitor the NFS VAAI statistics. I have highlighted the important stats in red throughout the blog. Note that I have deliberately removed some metrics from the output to make it more readable.


In general, irrespective of the ONTAP version, you can use sysstat -x 1 to monitor the CPU, memory, disk, network and other parameters. When VAAI primitives are used, network utilization should be low relative to disk usage, because clone and snapshot operations are offloaded to the storage array, which reduces network traffic between the ESXi hosts and the NetApp storage array. This command can give you some indication that Copy Offload and other primitives are in use. However, it may not be conclusive, because other workloads can drive high network usage even when VAAI is being used. To precisely monitor copy successes and errors, use the following commands.


  1. Data ONTAP 7-Mode: In 7-Mode, two commands are available to view the NFS VAAI statistics.


fas2040> nfs vstorage stats
NFS COL counters are :
                    Copy Reqs: 0
                   Abort Reqs: 0
                  Status Reqs: 0
                  Notify Reqs: 0
                  Revoke Reqs: 0
                Invalid Parms: 0
       Authorization Failures: 0
      Authentication Failures: 0
              Copy Fail ISDIR: 0
            Copy Fail OFFLINE: 0
              Copy Fail STALE: 0
                 Copy Fail IO: 0
            Copy Fail NOSPACE: 0
          Copy Fail DISKQUOTA: 0
           Copy Fail READONLY: 0
               Copy Fail PERM: 0
            Copy Fail EXPIRED: 0
           Copy Fail RESOURCE: 0
           Copy Fail TOOSMALL: 0
        Copy Fail BAD STATEID: 0
              Copy Fail OTHER: 0
               Intravol Moves: 0
               Intervol Moves: 0
               Fail Space RES: 0
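
If you capture this output to a file or over SSH, a small script can pick out the counters that matter. The following is only a rough sketch in Python, not a NetApp tool: the function names parse_vstorage_stats and copy_offload_summary are my own, and the sample values are made up for illustration. It assumes every counter line has the "Name: number" shape shown above.

```python
import re

# Illustrative sample only: same layout as 'nfs vstorage stats',
# but the values are invented for this example.
SAMPLE = """\
NFS COL counters are :
                    Copy Reqs: 3
            Copy Fail NOSPACE: 1
              Copy Fail OTHER: 0
               Intravol Moves: 2
"""

def parse_vstorage_stats(text):
    """Return {counter name: int} for every 'Name: <number>' line."""
    counters = {}
    for line in text.splitlines():
        # The header line has no number after the colon, so it is skipped.
        match = re.match(r"^\s*(.+?):\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters

def copy_offload_summary(counters):
    """Report the copy request count and any non-zero 'Copy Fail' counters."""
    failures = {
        name: value
        for name, value in counters.items()
        if name.startswith("Copy Fail") and value > 0
    }
    return {"copy_requests": counters.get("Copy Reqs", 0), "failures": failures}

if __name__ == "__main__":
    print(copy_offload_summary(parse_vstorage_stats(SAMPLE)))
```

A non-zero Copy Reqs with an empty failures dictionary suggests that Copy Offload requests are reaching the array and completing; any entry in failures points at the specific Copy Fail counter to investigate.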



fas2040> nfs stat


Server rpc:
TCP:
calls       badcalls    nullrecv    badlen      xdrcall
2           0           0           0           0


UDP:
calls       badcalls    nullrecv    badlen      xdrcall
0           0           0           0           0


IPv4:
calls       badcalls    nullrecv    badlen      xdrcall
2           0           0           0           0


IPv6:
calls       badcalls    nullrecv    badlen      xdrcall
0           0           0           0           0


Server nfs:
calls       badcalls
2           0


Server nfs V3: (2 calls)
null       getattr    setattr    lookup     access     readlink   read
2 100%     0 0%       0 0%       0 0%       0 0%       0 0%       0 0%
write      create     mkdir      symlink    mknod      remove     rmdir
0 0%       0 0%       0 0%       0 0%       0 0%       0 0%       0 0%
rename     link       readdir    readdir+   fsstat     fsinfo     pathconf
0 0%       0 0%       0 0%       0 0%       0 0%       0 0%       0 0%
commit
0 0%


Read request stats (version 3)
0-511      512-1023   1K-2047    2K-4095    4K-8191    8K-16383   16K-32767  32K-65535  64K-131071 > 131071
0          0          0          0          0          0          0          0          0          0
Write request stats (version 3)
0-511      512-1023   1K-2047    2K-4095    4K-8191    8K-16383   16K-32767  32K-65535  64K-131071 > 131071
0          0          0          0          0          0          0          0          0          0



  2. Clustered Data ONTAP 8.x


NOTE: In Clustered Data ONTAP 8.2 you have to run this command from diagnostic mode, using the statistics-v1 command to get the copy_manager statistics.


To enter diagnostic mode use the following:
cluster1::> set diag
Warning: These diagnostic commands are for use by NetApp personnel only.
Do you want to continue? {y|n}: y
cluster1::*>
cluster1::*> statistics-v1 show -node cluster1-01 -object copy_manager


For previous versions of Clustered Data ONTAP use the following:


cluster1::> statistics show -node cluster1-01 -object copy_manager


Node: cluster1-01
   Object.Instance.Counter                                 Value         Delta
   ----------------------------------------------- ------------- -------------
   copy_manager.copy_stats.instance_name             copy_stats             -
   copy_manager.copy_stats.node_name                           -             -
   copy_manager.copy_stats.instance_uuid                       -             -
   copy_manager.copy_stats.copy_success                        1             -
   copy_manager.copy_stats.copy_failure                        0             -
   copy_manager.copy_stats.copyStatus_success                  0             -
   copy_manager.copy_stats.copyStatus_failure                  0             -
   copy_manager.copy_stats.copyAbort_success                   0             -
   copy_manager.copy_stats.copyAbort_failure                   0             -
   copy_manager.copy_stats.copyCallback_success                0             -
   copy_manager.copy_stats.copyCallback_failure                0             -
   copy_manager.copy_stats.copyNotify_success                  1             -
   copy_manager.copy_stats.copyNotify_failure                  0             -
   copy_manager.copy_stats.copyRevoke_success                  1             -
   copy_manager.copy_stats.copyRevoke_failure                  0             -
   copy_manager.copy_stats.copyAuthCheck_success               0             -
   copy_manager.copy_stats.copyAuthCheck_failure               0             -
   copy_manager.copy_stats.bytes_copied                        0             -
Node: cluster1-01
   Object.Instance.Counter                                 Value         Delta
   ----------------------------------------------- ------------- -------------
   copy_manager.copy_stats.intra_vol_copy_cnt                  1             -
   copy_manager.copy_stats.inter_vol_copy_cnt                  0             -
   copy_manager.copy_stats.inter_node_copy_cnt                 0             -
   copy_manager.copy_stats.inter_clust_copy_cnt                0             -
   copy_manager.copy_stats.fail_mem_alloc                      0             -
   copy_manager.copy_stats.fail_isdir                          0             -
   copy_manager.copy_stats.fail_offline                        0             -
   copy_manager.copy_stats.fail_stale                          0             -
   copy_manager.copy_stats.fail_io                             0             -
   copy_manager.copy_stats.fail_nospace                        0             -
   copy_manager.copy_stats.fail_readonly                       0             -
   copy_manager.copy_stats.fail_authcheck                      0             -
   copy_manager.copy_stats.fail_no_resource                    0             -
   copy_manager.copy_stats.fail_other                          0             -
   copy_manager.copy_stats.intra_volume_copy_success           1             -
   copy_manager.copy_stats.intra_volume_copy_failure           0             -
   copy_manager.copy_stats.intra_volume_copyStatus_success     0             -
   copy_manager.copy_stats.intra_volume_copyStatus_failure     0             -
   copy_manager.copy_stats.intra_volume_copyAbort_success      0             -


Node: cluster1-01
   Object.Instance.Counter                                 Value         Delta
   ----------------------------------------------- ------------- -------------
   copy_manager.copy_stats.intra_volume_copyAbort_failure      0             -
   copy_manager.copy_stats.inter_volume_copy_success           0             -
   copy_manager.copy_stats.inter_volume_copy_failure           0             -
   copy_manager.copy_stats.inter_volume_copyStatus_success     0             -
   copy_manager.copy_stats.inter_volume_copyStatus_failure     0             -
   copy_manager.copy_stats.inter_volume_copyAbort_success      0             -
   copy_manager.copy_stats.inter_volume_copyAbort_failure      0             -
   copy_manager.copy_stats.inter_volume_copyCallback_success   0             -
   copy_manager.copy_stats.inter_volume_copyCallback_failure   0             -
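
The copy_manager table is long, so it can help to reduce it to the counters that are non-zero. Here is a minimal Python sketch under the assumption that each counter row is "name value delta" separated by whitespace, as in the output above. The helper names parse_copy_manager and failed_counters are hypothetical, and the sample rows are copied from the output shown in this post.

```python
# A few rows taken from the output above; header and ruler lines are
# included to show that they are ignored by the parser.
SAMPLE = """\
Node: cluster1-01
   Object.Instance.Counter                                 Value         Delta
   ----------------------------------------------- ------------- -------------
   copy_manager.copy_stats.node_name                           -             -
   copy_manager.copy_stats.copy_success                        1             -
   copy_manager.copy_stats.copy_failure                        0             -
   copy_manager.copy_stats.intra_vol_copy_cnt                  1             -
   copy_manager.copy_stats.fail_nospace                        0             -
"""

def parse_copy_manager(text, prefix="copy_manager.copy_stats."):
    """Return {short counter name: value}; numeric values become ints."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, rulers, blank lines and wrapped fragments.
        if len(fields) < 2 or not fields[0].startswith(prefix):
            continue
        name = fields[0][len(prefix):]
        value = fields[1]
        stats[name] = int(value) if value.isdigit() else value
    return stats

def failed_counters(stats):
    """Return only the failure-related counters that are greater than zero."""
    return {
        name: value
        for name, value in stats.items()
        if isinstance(value, int) and value > 0 and "fail" in name
    }

if __name__ == "__main__":
    stats = parse_copy_manager(SAMPLE)
    print("copy_success:", stats["copy_success"])
    print("failures:", failed_counters(stats))
```

Because the rows repeat the same "Node:" header on every page of output, the parser simply ignores any line that does not start with the copy_manager.copy_stats. prefix.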


In addition to the above command, you can also check the nps1 statistics to troubleshoot NFS VAAI related issues.

cluster1::> system node run -node cluster1-01 -command stats show nps1
nps1:nps1:instance_name:nps1
nps1:nps1:node_name:
nps1:nps1:instance_uuid:
nps1:nps1:null_success:0
nps1:nps1:null_error:0
nps1:nps1:compound_success:0
nps1:nps1:compound_error:0
nps1:nps1:access_success:0
nps1:nps1:access_error:0
nps1:nps1:verify_success:0
nps1:nps1:verify_error:0
nps1:nps1:write_success:0
nps1:nps1:write_error:0
nps1:nps1:set_ssv_error:0
nps1:nps1:test_stateid_success:0
nps1:nps1:test_stateid_error:0
nps1:nps1:want_delegation_success:0
nps1:nps1:want_delegation_error:0
nps1:nps1:destroy_clientid_success:0
nps1:nps1:destroy_clientid_error:0
nps1:nps1:reclaim_complete_success:0
nps1:nps1:reclaim_complete_error:0
nps1:nps1:copy_notify_success:1
nps1:nps1:copy_notify_error:0
nps1:nps1:copy_revoke_success:1
nps1:nps1:copy_revoke_error:0
nps1:nps1:copy_success:1
nps1:nps1:copy_error:0
nps1:nps1:copy_abort_success:0
nps1:nps1:copy_abort_error:0
nps1:nps1:copy_status_success:0
nps1:nps1:copy_status_error:0
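
The nps1 counters come in success/error pairs, which makes them easy to summarize per operation. The sketch below is one possible way to do that in Python, assuming every line has the "object:instance:counter:value" layout shown above. The function names parse_nps1 and copy_ops are my own, and the sample lines are taken from the output in this post.

```python
# Lines copied from the nps1 output above.
SAMPLE = """\
nps1:nps1:instance_name:nps1
nps1:nps1:node_name:
nps1:nps1:compound_success:0
nps1:nps1:copy_notify_success:1
nps1:nps1:copy_notify_error:0
nps1:nps1:copy_success:1
nps1:nps1:copy_error:0
"""

def parse_nps1(text):
    """Return {counter: int} for lines shaped 'object:instance:counter:value'."""
    counters = {}
    for line in text.splitlines():
        parts = line.strip().split(":", 3)
        # Keep numeric counters only; name and uuid rows are skipped.
        if len(parts) == 4 and parts[3].isdigit():
            counters[parts[2]] = int(parts[3])
    return counters

def copy_ops(counters):
    """Group the copy_* counters into {operation: {'success': n, 'error': n}}."""
    ops = {}
    for name, value in counters.items():
        if not name.startswith("copy"):
            continue
        operation, _, outcome = name.rpartition("_")
        if outcome in ("success", "error"):
            ops.setdefault(operation, {"success": 0, "error": 0})[outcome] = value
    return ops

if __name__ == "__main__":
    for operation, result in copy_ops(parse_nps1(SAMPLE)).items():
        print(operation, result)
```

An operation whose error count grows while its success count stays flat is the first place to look when troubleshooting.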