WMI Bug with Scale Out File Server

During the build-out of our Windows Azure Pack infrastructure, I uncovered what I believe is a bug with WMI and Scale Out File Server (SOFS). For us, the issue surfaced in Virtual Machine Manager (VMM), where deployments of VM templates from a library on a SOFS share would randomly fail with the following error:

Error (12710)

VMM does not have appropriate permissions to access the Windows Remote Management resources on the server ( CLOUD-LIBRARY01.domain.com).

Unknown error (0x80338105)

This issue was intermittent, and rebooting the SOFS nodes always seemed to clear up the problem. Upon tracing the process, I found BITS was getting an Access Denied error when attempting to create the URL in wsman. Furthermore, VMM was effectively saying the path specified did not exist. From the VMM trace:

ConvertUNCPathToPhysicalPath (catch CarmineException) [[(CarmineException#f0912a) { Microsoft.VirtualManager.Utils.CarmineException: The specified path is not a valid share path on CLOUD-LIBRARY01.domain.com.  Specify a valid share path on CLOUD-LIBRARY01.domain.com to the virtual machine to be saved, and then try the operation again.

On further testing, I got mixed results when querying cluster share properties via WMI. The same query would sometimes return nothing and sometimes return the shares:

PS C:\Users\jeff> gwmi Win32_ClusterShare -ComputerName CLOUD-LIBRARY01

(No results returned.)

PS C:\Users\jeff> gwmi Win32_ClusterShare -ComputerName CLOUD-LIBRARY01

Name                                    Path                                    Description
----                                    ----                                    -----------
\\CLOUD-VMMLIB\ClusterStorage$          C:\ClusterStorage                       Cluster Shared Volumes Default Share
\\CLOUD-LIBRARY\ClusterStorage$         C:\ClusterStorage                       Cluster Shared Volumes Default Share
\\CLOUD-VMMLIB\MSSCVMMLibrary           C:\ClusterStorage\Volume1\Shares\MSS…

Finally, here is what procmon showed while I ran the WMI queries:

A success:

Date & Time:  6/3/2014 3:56:20 PM
Event Class:   File System
Operation:     CreateFile
Result: SUCCESS
Path:   \\CLOUD-VMMLIB\PIPE\srvsvc
TID:    996
Duration:       0.0006634
Desired Access:        Generic Read/Write
Disposition:    Open
Options:        Non-Directory File, Open No Recall
Attributes:     n/a
ShareMode:   Read, Write
AllocationSize: n/a
Impersonating:         S-1-5-21-xxxx
OpenResult:   Opened

A failure:

Date & Time:  6/3/2014 3:56:57 PM
Event Class:   File System
Operation:     CreateFile
Result: ACCESS DENIED
Path:   \\CLOUD-VMMLIB\PIPE\srvsvc
TID:    996
Duration:       0.0032664
Desired Access:        Generic Read/Write
Disposition:    Open
Options:        Non-Directory File, Open No Recall
Attributes:     n/a
ShareMode:   Read, Write
AllocationSize: n/a
Impersonating:         S-1-5-21-xxx

What’s happening here is that WMI is attempting to access the named pipe of the Server service (srvsvc) on the SOFS cluster object. Because we’re using SOFS, the DNS entry for the SOFS cluster object contains the IP addresses of every server in the cluster. The WMI call connects using the cluster object name, but because of DNS round robin, that name may or may not resolve to the local node. The call has appropriate access to the named pipe on the local server, but not on the other servers in the cluster.
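To make the round-robin behavior concrete, here is a minimal Python sketch that lists every IPv4 address a name resolves to and separates the local node's addresses from the rest. The function names and the example addresses are assumptions for illustration only; they are not part of VMM or WMI.

```python
import socket


def resolve_ipv4(name):
    """Return the sorted, de-duplicated IPv4 addresses that a name resolves to."""
    infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def split_local_remote(addresses, local_addresses):
    """Split resolved addresses into those owned by this node and those owned by other nodes."""
    local = [a for a in addresses if a in local_addresses]
    remote = [a for a in addresses if a not in local_addresses]
    return local, remote


# Hypothetical example: the SOFS name resolves to both nodes, and this node owns 1.1.1.1.
local, remote = split_local_remote(["1.1.1.1", "2.2.2.2"], {"1.1.1.1"})
print("local:", local, "remote:", remote)
```

On a cluster node you would pass the SOFS name to resolve_ipv4 and compare the result against that node's own addresses. Any address that lands in the remote list is a connection that could hit another node's srvsvc pipe and be denied.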

There are two workarounds for this issue. First, you can add an entry to the local hosts file on each cluster node that points the SOFS cluster object name back to that same node. Second, you can add the computer account of each cluster node to the local Administrators group on all other cluster nodes. We chose to implement the first workaround until Microsoft corrects the issue.
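As a rough illustration of the first workaround, the Python sketch below generates the hosts-file lines for each node, assuming each node maps the SOFS name (short and fully qualified) to its own IP address. The node names, addresses, SOFS name and DNS suffix are placeholders, and hosts_entries is a hypothetical helper, not an existing tool.

```python
def hosts_entries(node_ip, sofs_name, dns_suffix):
    """Return hosts-file lines that pin the SOFS name to one node's own address."""
    return [
        f"{node_ip} {sofs_name}",
        f"{node_ip} {sofs_name}.{dns_suffix}",
    ]


# Placeholder two-node cluster; substitute your own node names and addresses.
nodes = {"SOFSA": "1.1.1.1", "SOFSB": "2.2.2.2"}

for node, ip in nodes.items():
    print(f"On {node}:")
    for line in hosts_entries(ip, "SOFS01", "domain.com"):
        print("  " + line)
```

The generated lines would go into C:\Windows\System32\drivers\etc\hosts on the matching node. Each node gets different entries, because each one must resolve the SOFS name to itself.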

3 thoughts on “WMI Bug with Scale Out File Server”

  1. Very interesting find. I am currently working with MS Support on an issue where I am using a SOFS for my Library server and Deploy from Template fails when trying to copy the Unattend.xml, even though it can copy the VHDX from the same share. It also clears up for a time after rebooting the SOFS cluster nodes.
    In your first workaround, I understand you are adding an entry to the local hosts file on each of the cluster nodes, but I am not clear on the entry to add. Can you post an example?
    Let's say my cluster network name is SOFS01 and I have two cluster nodes, SOFSA and SOFSB. Would my HOSTS file on each node be the same and look like:
    SOFS01 localhost

    • Tim,

      You need to add the SOFS resource name, not the primary cluster name, to the hosts file. So assuming the cluster name is SOFS, and the nodes are SOFSA (1.1.1.1) and SOFSB (2.2.2.2), and the file share name is SOFS01, then you would add these entries:

      On SOFSA:
      1.1.1.1 SOFS01
      1.1.1.1 SOFS01.domain.com

      On SOFSB:
      2.2.2.2 SOFS01
      2.2.2.2 SOFS01.domain.com

      If you do an nslookup on the SOFS01 name, you should see it return two entries: one for 1.1.1.1 and one for 2.2.2.2. These entries in the local hosts files ensure that requests from the local host for the SOFS01 share always go to the local host.
