On a freshly built environment (a FlexPod) running vSphere 5.5 Update 1, I ran into an issue with Citrix XenDesktop 7.1. Provisioning desktops using MCS worked fine for machine catalogs without Personal vDisk (PvD), but not for PvD catalogs. Once the VMs were built, they would start, but VMware Tools would crash, the machines wouldn’t get an IP address, and logging into the console produced a black screen and high CPU usage until the VM was shut down.
Working with Citrix Support, I determined that the PvD disk was corrupted. According to Citrix, this usually happens because of an improper PvD inventory update on the gold image. Regardless of the actual cause, Citrix gave me two options: (1) build a new gold image, or (2) downgrade VMware Tools from the 5.5 Update 1 version to the 5.5 GA version. While another team member started the new gold-image build process, I continued with the VMware Tools downgrade.
It took a bit of digging, but I found that VMware has good support for older versions of VMware Tools, documented in a VMware KB article.
They maintain a repository of older versions of VMware Tools going way, way back in time. The downgrade was as simple as pulling down the VMware Tools ISO for ESXi 5.5 GA (about 77 MB), mounting it on the VM, and installing Tools. The only difference is that the launch process isn’t the usual VM > Guest > Install/Upgrade VMware Tools, but rather just mounting the .ISO image and running the Tools installer from it like any other piece of software.
In case you’re wondering: The VMware Tools downgrade solved the original problem with PvD provisioning.
Trend Micro Deep Security Manager (DSM) is finicky when it comes to SSL. It uses the Tomcat web server, and its associated Java certificate keystore methodology (which I don’t understand very well), so it’s nontrivial to work with compared to most servers when it comes to adding or replacing a signed certificate.
With Deep Security Manager, to give domain users administration rights, you must replace the default self-signed cert with a CA-signed certificate. There’s not a lot of good documentation on the web that covers this process, so I decided to write this up after about the fifth time that I’ve grunted my way through the process. As always, I hope you find this helpful.
Assumption: Deep Security Manager is installed on a Windows server VM.
Step By Step
- Take a snapshot of the Deep Security Manager VM and name it “Pre-SSL” or something similar.
- RDP into the Deep Security Manager VM. Open an elevated command prompt (Start > CMD > right-click and select Run as Administrator).
- Create a folder called c:\certs.
- Stop the Deep Security Manager service using the command: net stop "Trend Micro Deep Security Manager" (type straight quotes; curly quotes pasted from a web page will fail). Check to make sure that the service has stopped. It can take a full minute or more to stop.
- Change to the Deep Security Manager directory. The default is c:\Program Files\Trend Micro\Deep Security Manager.
- Determine the Java keystore password by issuing the command: installfiles\genkey.bat. The keystore password is the string immediately following the ‘-storepass’ parameter. Its format will be something like this: bEwzWtCe.
- In the following steps, replace “bEwzWtCe” with the password you captured in the previous step.
- Issue the command: keytool -delete -storepass bEwzWtCe -alias tomcat -keystore .keystore
- Issue the command: keytool -genkey -storepass bEwzWtCe -alias tomcat -keyalg RSA -keystore .keystore
- In the dialog that follows, the first prompt is for your name. Don’t use your name—use the FQDN of the Deep Security Manager machine. For example: trend01.acme.local. Complete the other fields with the appropriate customer information. For OU, you can use “IT,” the customer’s full company name, or something else, if they have a preference.
- Issue the command: keytool -certreq -storepass bEwzWtCe -keyalg RSA -alias tomcat -file c:\certs\certreq.txt -keystore .keystore
- On a domain-joined machine, export the root certificate using the Certificates snap-in (for the Local Computer) in MMC. Export the cert using the default settings. Name it root.cer and copy it to the c:\certs directory on the Deep Security Manager machine.
- Issue the command: keytool -import -alias root -storepass bEwzWtCe -trustcacerts -file c:\certs\root.cer -keystore .keystore and, when prompted, type “yes” to accept the certificate into the keystore.
- Generate the certificate for the Deep Security Manager, either using the customer’s Windows domain CA (preferred), or a trusted certificate authority. Use a web server template.
- Download the certificate chain (not just the cert) as a DER-encoded PKCS #7 (.p7b) file. Save the file as dsmcertnew.p7b. Copy it to the c:\certs directory on the Deep Security Manager machine.
- Issue the command: keytool -import -alias tomcat -storepass bEwzWtCe -file c:\certs\dsmcertnew.p7b -keystore .keystore and, when prompted, type “yes” to accept the certificate into the keystore.
- Start the Deep Security Manager service using the command: net start "Trend Micro Deep Security Manager"
- Check to make sure the service has started.
- Log into Deep Security Manager and verify that the signed certificate is in use. Use the FQDN of the Deep Security Manager when connecting to it with a browser. You shouldn’t receive a certificate error, and if you check the certificate chain (use the lock icon in the browser bar to get to it), you should see the certificate chain with the correct CA and the FQDN of the DSM.
- If all goes well, delete the snapshot you took in the first step. If not, revert to the snapshot and come back to it another time.
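If you run this procedure often, the keytool portion can be sketched as a small Python helper. This is only a sketch under a few assumptions: the function names are mine, the sample genkey.bat text in the usage section is hypothetical (the only thing taken from the steps above is that the password follows the -storepass parameter), and the helper just prints the commands for you to review and run by hand from the Deep Security Manager directory.

```python
import re


def extract_storepass(genkey_text):
    """Return the string that follows -storepass in the genkey.bat text."""
    match = re.search(r"-storepass\s+(\S+)", genkey_text)
    if match is None:
        raise ValueError("no -storepass parameter found")
    return match.group(1)


def keytool_commands(storepass, cert_dir=r"c:\certs", keystore=".keystore"):
    """Build the keytool command lines from the steps above, in order."""
    sp = f"-storepass {storepass}"
    ks = f"-keystore {keystore}"
    return [
        # Remove the default self-signed entry, then create a new key pair.
        f"keytool -delete {sp} -alias tomcat {ks}",
        f"keytool -genkey {sp} -alias tomcat -keyalg RSA {ks}",
        # Write the certificate signing request for the CA.
        f"keytool -certreq {sp} -keyalg RSA -alias tomcat -file {cert_dir}\\certreq.txt {ks}",
        # Import the root certificate, then the signed certificate chain.
        f"keytool -import -alias root {sp} -trustcacerts -file {cert_dir}\\root.cer {ks}",
        f"keytool -import -alias tomcat {sp} -file {cert_dir}\\dsmcertnew.p7b {ks}",
    ]


if __name__ == "__main__":
    # Hypothetical text; paste what installfiles\genkey.bat shows on your server.
    sample = "keytool -genkey -storepass bEwzWtCe -alias tomcat -keystore .keystore"
    for command in keytool_commands(extract_storepass(sample)):
        print(command)
```

The commands are printed rather than executed on purpose: the CA signing step happens outside the server, so the last two imports can’t run until the root certificate and the .p7b file are in c:\certs.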
You can now go about the process of connecting the Deep Security Manager to your domain to import users under Administration > Users > Synchronize with Directory. Use TLS (port 636), unless it’s not enabled.
Configuring Fibre Channel boot-from-SAN is tricky. iSCSI boot-from-SAN is similar, but has additional considerations.
Boot-from-SAN with iSCSI storage is more involved than FC boot-from-SAN, and much more involved than local storage ESXi installations. There’s some front-end work to configure boot-from-SAN on the UCS side, and then some work on a per-host basis to configure the boot LUNs, test boot behavior, troubleshoot (don’t discount this!), and so forth. It can take a lot of debugging before you can move into a production mode of deploying each host’s settings. If there’s a legitimate reason to boot from SAN, such as a large number of hosts and a desire to have truly stateless servers with service profile portability, then boot-from-SAN makes sense. With fewer than about 10 hosts in a given workload type (ESXi for servers, ESXi for VDI, Linux, Windows, etc.), boot-from-SAN is less advantageous because of the extra effort to configure it.
iSCSI boot-from-SAN has some configuration details that you need to keep in mind:
- In any boot-from-SAN scenario, each host needs its own boot LUN, which means explicitly mapping each host’s initiator to its own LUN. Typically I configure 5-GB boot LUNs (FC or iSCSI), which is about 4 GB more than needed, but it’s better to allow a good bit of overhead.
- ESXi cannot write its core dumps to the iSCSI boot LUN. You can mitigate this by installing the ESXi Dump Collector (part of the vCenter installer) on a VM in the environment, perhaps the vCenter server or a utility box, and pointing all the hosts to it.
- ESXi needs a logging location. Boot-from-SAN (and USB and SD as well) requires you to create a log folder on a datastore and point the host to it. After ESXi installation, the host will complain that it has no persistent storage for logging; a quick Google search turns up the KB article.
- iSCSI boot-from-SAN with ESXi automatically creates a vSwitch and a port group with a single NIC for iSCSI boot. You should not modify this switch or the port group. Its settings may change on a subsequent reboot if the primary boot NIC isn’t available. This gives rise to some troubles later with respect to the iSCSI software initiator, which UCS also automatically creates on the ESXi host when you configure boot-from-iSCSI. This is the strongest argument against booting from iSCSI.
- Don’t configure fabric failover for the UCS boot NICs. Configure one NIC per fabric, with no failover. Use one for the primary boot NIC and the other for the secondary boot NIC.
- iSCSI boot-from-SAN creates a software iSCSI initiator on the host as part of the boot process. This initiator comes from the pool(s) you create in UCS Manager. As the keen-eyed reader will recognize, you cannot create more than one iSCSI software initiator on an ESXi host. Therefore, all iSCSI volumes must be mounted through this initiator.
- If you have other IP storage (NFS, for example), configure it to use separate NICs if possible. Put NFS on its own vSwitch (one NIC on fabric A, one on fabric B), and in a different subnet from iSCSI. On NetApp arrays, this is easy to do because you can create different VIFs for iSCSI and NFS storage.
- This one really warrants its own subject–but for the purposes of this post, I’ll keep it to this: If you have separate uplinks from the fabric interconnects to separate upstream switches, a common case for which is when you have dedicated storage switches, you must configure Disjoint Layer 2 to keep the storage traffic on only the correct uplinks, or you will have nightmares making SAN boot work consistently.
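The logging and core dump bullets above can be sketched as a per-host command generator. Treat this as a sketch under assumptions: the function name, host name, datastore name, and collector address are all hypothetical, the core dump target is assumed to be an ESXi Dump Collector listening on its default port 6500, and the management vmkernel port is assumed to be vmk0. The output is a list of commands to review and run in the ESXi shell, not something to run blind.

```python
def logging_commands(host, datastore, collector_ip, vmk="vmk0", port=6500):
    """Build ESXi shell commands that give a boot-from-SAN host a
    persistent log folder and a network core dump target."""
    # One folder per host keeps logs from different hosts apart.
    log_dir = f"/vmfs/volumes/{datastore}/logs/{host}"
    return [
        f"mkdir -p {log_dir}",
        # Point the host's syslog at the datastore folder and reload.
        f"esxcli system syslog config set --logdir={log_dir}",
        "esxcli system syslog reload",
        # Send core dumps over the network instead of to the boot LUN.
        "esxcli system coredump network set"
        f" --interface-name {vmk} --server-ipv4 {collector_ip} --server-port {port}",
        "esxcli system coredump network set --enable true",
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for command in logging_commands("esx01", "datastore1", "10.0.0.5"):
        print(command)
```

Generating the commands per host matters more as the host count grows, which is also when boot-from-SAN starts to pay off.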
A bit more on why you may not want to boot from iSCSI storage. When you configure iSCSI storage on an ESXi host that isn’t configured to boot from SAN, you manually create the iSCSI initiator on the host, then bind your storage NICs to it so that you can configure and take advantage of ESXi iSCSI storage multipathing. With boot-from-iSCSI, multipathing is impossible. Good thing UCS hosts have 10-Gbps NICs, right?
When you configure boot-from-iSCSI in UCS, UCS creates the iSCSI software initiator on the ESXi host, as mentioned earlier. It uses an IQN based on your initiator pool for the fabric from which the host boots. Let’s say that you create a second iSCSI vSwitch for storage, and configure its NICs in accordance with VMware’s recommended practices: two vmkernel ports, two NICs, with one NIC Active and one Unused per vmkernel port. When you then bind these NICs to the iSCSI initiator, all connectivity to the iSCSI storage drops. The only way I’ve found to keep it connected, so far, is to unbind the separate storage NICs from the software iSCSI initiator. Thus the single-link constraint.
A few additional tips:
- Watch the host boot through the KVM and make sure that the software initiators are logging into the storage. You should see this occur during the boot sequence, and you should see the correct host drivers start when ESXi is booting.
- Check the SAN to make sure you are seeing only the desired initiators logging into the boot LUNs.
- Perform several reboots after installing ESXi and watch the KVM to see that the hosts boot consistently before continuing with any other host configuration.
- This one also deserves its own topic, but I’ll give it a bullet here. Never route storage traffic if you can avoid it; keep it in the same VLAN and do it all at layer 2. Routing it is suboptimal at best. If you must route storage traffic, configure QoS and make it the highest priority on the network, end to end. Block storage is designed to operate in a lossless environment with low, consistent latency. Be sure you configure the network to provide that level of performance.
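A quick way to catch accidentally routed storage traffic is to compare each iSCSI vmkernel address with the target addresses before you build anything. Here is a minimal sketch using Python’s standard ipaddress module; the function name and all of the addresses are made up for illustration.

```python
import ipaddress


def needs_routing(vmk_cidr, target_ip):
    """Return True if target_ip is outside the subnet of the vmkernel
    interface, meaning the storage traffic would have to be routed."""
    interface = ipaddress.ip_interface(vmk_cidr)
    return ipaddress.ip_address(target_ip) not in interface.network


if __name__ == "__main__":
    # Hypothetical iSCSI vmkernel port and array target addresses.
    vmk = "192.168.10.21/24"
    for target in ("192.168.10.50", "192.168.20.50"):
        verdict = "routed" if needs_routing(vmk, target) else "layer 2"
        print(f"{vmk} -> {target}: {verdict}")
```

This only checks addressing. It says nothing about whether the VLAN is actually trunked end to end, so it complements the KVM and SAN checks above rather than replacing them.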
Last week I had the dubious pleasure of configuring boot-from-iSCSI in a Cisco UCS environment with four B200M3 blade servers and NetApp storage. The Cisco documentation is not exactly crystal clear, but there’s enough information on the Web to make it possible.
One issue I encountered in this effort was the inability to install the UCS eNIC driver through the CLI, as is my usual practice. Normally I use WinSCP to copy the VIB offline-bundle component to the /tmp folder on the host, and then issue this command (specific to this eNIC VIB version) from the CLI:
esxcli software vib install -d /tmp/enic_driver_184.108.40.206-offline_bundle-1023014.zip
The installation was failing with this message:
The transaction is not supported: VIB Cisco_bootbank_net-enic_220.127.116.111OEM.500.0.0.472560 cannot be live installed. VIB VMware_bootbank_net-enic_18.104.22.168a-1vmw.510.0.0.799733 cannot be removed live.
Please refer to the log file for more details.
I had a chicken-egg problem: The host was booting over iSCSI and apparently had locked the default ESXi net-enic VIB from removal as a result. In all other networking scenarios, where ESXi is not booting from iSCSI, I have not seen a failure using the CLI method–which I have used on dozens of hosts.
The solution was easy enough: I updated the VIB through Update Manager instead of through the CLI. Patching the host using VUM installed this driver with no issues, and boot-from-iSCSI continues to work flawlessly.
Another method that I didn’t attempt, but would have tried next, would have been an ESXi 5.1 custom installer ISO with the UCS eNIC driver embedded in it. I’ve done this before and it’s a great method of building or upgrading larger environments with many hosts.
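If you script driver installs across many hosts, it may help to detect this particular failure and fall back to Update Manager for the affected hosts. The sketch below is only an illustration: the function name is mine, it looks for the two phrases from the error message above, and how you capture the esxcli output is up to you.

```python
def needs_offline_install(esxcli_output):
    """Return True if esxcli refused a live install or live removal,
    which is the failure described above on iSCSI-booted hosts."""
    text = esxcli_output.lower()
    return (
        "cannot be live installed" in text
        or "cannot be removed live" in text
    )


if __name__ == "__main__":
    # Abbreviated form of the message above; <version> is a placeholder.
    message = (
        "The transaction is not supported: "
        "VIB Cisco_bootbank_net-enic_<version> cannot be live installed."
    )
    if needs_offline_install(message):
        print("Live install refused; patch this host through Update Manager.")
```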
Hope this helps you.
While installing vShield Endpoint and Trend Micro Deep Security virtual appliances on another 20 hosts this week, I ran into a situation I hadn’t seen before–and this one didn’t have the usual breadcrumbs on the Web to a solution.
This configuration has a vCenter Server (5.1, latest build) at one data center, managing clusters of hosts at other data centers. These data centers are connected by 20-Gbps trunks. My efforts to install vShield Endpoint from the main data center to the hosts at one of the remote data centers failed. I saw two variations of error messages. One was a “page cannot be displayed” error in the vShield Manager console. The other was a Java exception message.
I went on and patched the hosts to the latest build of ESXi, 1157734. No change. Finally, I completed the last step of the networking configuration, which is what yielded the missing clue. I had been able to install vShield Endpoint successfully on hosts that were configured identically, except that they were already connected to their distributed vSwitches. When vShield Endpoint installation is attempted before the host has any VM network port groups, the installation fails. Adding the appropriate host NICs to a distributed vSwitch, then rebooting the vShield Manager, allowed vShield Endpoint to install successfully.
Prior to installing vShield Endpoint, I had configured the usual base configuration items on the host: management networking, NTP, DNS, storage and vMotion switching, and so forth. But the only networking I had configured was VMkernel networking; I had not created any port groups for VMs. Apparently vShield Endpoint requires at least one VM networking port group on the host before it will install.
As always, I hope this helps you.
A while back I deployed vShield Manager 5.0.1 and vShield Endpoint for a customer, to support Trend Micro Deep Security as part of a VMware View project. Just recently I added four new Cisco UCS B200M3 servers to their View environment and needed to install vShield Endpoint on them. (By the way, I really like installing ESXi on the internal, optional USB stick that the B200M3 supports, rather than on spinning disks.)
After configuring the new hosts and joining them to the View cluster, I couldn’t get to the vShield Manager web interface to deploy vShield Endpoint on them. The vCenter plug-in was also down (natch) and wouldn’t respond to my efforts to enable it. I could log into the VM console and it was responsive. I tried restarting the Web services on vShield Manager, using these commands (thanks to this blog):
No dice. So, I did the honorable thing and rebooted the VM. Still no web interface.
Digging around turned up references to the vShield Manager’s log files filling the volume where they live. This turned out to be the key. Apparently it’s a known issue with an internal VMware KB article, but I was not about to call VMware Support on this, working from home in the evening. The trick is to simply clear the logs:
manager#purge log manager
manager#purge log system
Once this was done, the web service started and I was able to complete the work. Not a difficult solution, but a bit time-consuming. I hope this saves you some time when you run into this issue.–Rus
Last week I had the privilege of upgrading a customer’s sizable vSphere 5.0 and View 5.1 environment to 5.1 and 5.2, respectively. The experience was valuable, as it provided the chance to provision and test, for the first time (for me), HTML5 virtual desktop access. The benefit of clientless access to virtual desktops is that it opens many use cases that are either difficult or impossible to address with the client-based model of desktop delivery. Many of our customers are doing trials on Chromebooks, and considering where they fit in compared to thin clients and thick clients, especially in the education market. Desktop performance on a Chromebook using View 5.2’s HTML5 access is impressive and definitely suitable for a lot of academic uses, including standardized testing, ad-hoc computer labs, remote access, and quick deployments of end-user machines.
I’ve found the View 5.2 evaluation guide a very good place to get a handle on how to deploy View 5.2, including its HTML5 access feature. If you’re new to View, and face installing it in a lab or production environment, I think you’ll find this useful. If you’ve deployed View 5 or 5.1 a bunch of times previously, as I have, you’ll quickly get the differences you need to know before scoping or performing an installation.
The article: http://www.vmware.com/files/pdf/view/VMware-View-Evaluators-Guide.pdf
Of the upgrade project, which included nearly 20 hosts and two vCenter servers, the View infrastructure pieces took only an hour to upgrade. The rest of the work took most of a week, with limited outage windows and rolling ESXi upgrades using a custom ISO and Update Manager, but that’s a subject for another post.