Showing posts with label VMware. Show all posts

Wednesday, December 5, 2012

You Can't Triple Stamp a Double Stamp - Running an Application Cluster Inside an ESX Cluster

You Can't Triple Stamp a Double Stamp -
Using vMotion To Move An Application Cluster Inside an ESX Cluster 


This may be one of the more complex things you can be asked to do in vSphere.  Why would you want to cluster two VMs together when you already have the protection of clustered ESX servers using vMotion, High Availability (HA), Distributed Resource Scheduler (DRS) and Fault Tolerance (FT)?  A common answer is that at the application layer you can't easily vMotion an active cluster node, or fail over, without significant configuration and testing.  Let's look at the parts involved in successfully vMotioning application cluster nodes.  We'll also consider firmware and compatibility.  Finally, we'll talk a little about automating all of this so it happens seamlessly.

Configure Anti-Affinity Rules
Remember a VM is a server.  It looks and feels like a physical server.  Keep that in mind when working with application clusters.  If you have an active node in a cluster that you need to vMotion to another host in the ESX cluster, then you'd likely fail over to the passive node first.  Once the active node has been successfully failed over (automate this as much as possible), it becomes the passive node in the application cluster.  Consider adjusting the heartbeat timeout period before vMotioning the new passive node to a new ESX host in the cluster.  What you want to do is use Anti-Affinity rules in ESX to keep cluster node A on one specific host and cluster node B on a different host at all times.  Otherwise DRS will decide placement for you, and both nodes could end up on the same host.

Right-click the ESX cluster, choose Edit Settings, then under VMware DRS:
  • Click Rules > Add.
  • Click the DRS Groups Manager tab.
  • Click Add under Host DRS Groups to create a new Host DRS Group
    containing half of the ESX/ESXi hosts in the cluster. Create a 
    second Host DRS Group containing the remaining hosts.
  • Click Add under Virtual Machine DRS Groups to create a Virtual
    Machine DRS Group for each anti-affinity virtual machine.
  • Click the Rule tab, from the Type drop-down menu, click Virtual
    Machines to Hosts.
  • Select the first virtual machine from Cluster VM Group > Must
    run on hosts in group.
  • Select a host from the Cluster Host Group.
  • Click Must run on hosts in group.
  • Select the second virtual machine from Cluster VM Group.
  • Select the second host from the Cluster Host Group and click OK.
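Once the rules are in place it is worth checking that they actually hold.  Here is a minimal Python sketch of that check, assuming you already have an inventory export that maps each VM name to the ESX host it is running on.  The function name, VM names and host names are all made up for illustration; nothing here talks to vCenter.

```python
from typing import Dict, Iterable, List, Tuple


def find_colocated_nodes(
    placement: Dict[str, str],
    cluster_pairs: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (vm_a, vm_b, host) for every cluster pair sharing one ESX host."""
    violations = []
    for vm_a, vm_b in cluster_pairs:
        host_a = placement.get(vm_a)
        host_b = placement.get(vm_b)
        # A VM missing from the inventory is skipped rather than guessed at.
        if host_a is not None and host_a == host_b:
            violations.append((vm_a, vm_b, host_a))
    return violations


if __name__ == "__main__":
    # Hypothetical inventory: VM name -> ESX host it currently runs on.
    placement = {
        "sql-node-a": "esx01",
        "sql-node-b": "esx02",
        "app-node-a": "esx03",
        "app-node-b": "esx03",
    }
    pairs = [("sql-node-a", "sql-node-b"), ("app-node-a", "app-node-b")]
    for vm_a, vm_b, host in find_colocated_nodes(placement, pairs):
        print(f"{vm_a} and {vm_b} are both on {host}")
```

Run something like this on a schedule and alert on any violation, since a rule that was edited or disabled by mistake is easy to miss until both nodes go down with the same host.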

Upgrade Firmware and BIOS
Here are the things to consider before attempting to vMotion an application cluster:  BIOS, firmware (HBA, NIC), ESX and Windows Server 2008.  I recently built this sort of cluster within an ESX cluster, and these are the items you need to verify or upgrade before you start.  This applies to an ESX 4.1u2 system using the Service Console.  The process to upgrade Emulex firmware on ESXi is very different.

BIOS - Upgrade BIOS especially if you are doing this on a blade server!
HBA - Upgrade your Emulex Firmware.

Emulex requires that an RPM be installed on the host first.

Download the following RPM from the Emulex site directly (no link):
  • cat /proc/scsi/lpfc820/1 (for example) to find the model of Emulex card installed.
  • rpm -ivh elxvmwarecorekit-esx40-4.0a46-1.x86_64.rpm
  • /usr/sbin/hbanyware/hbacmd ListHBAs | grep Port (for WWPN info)
  • /usr/sbin/hbanyware/hbacmd download 50:06:0b:00:00:c2:7a:01 zf282a4.all
You must do the “download” for each WWPN in the output of ListHBAs.
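Repeating the download by hand for every port gets tedious on a host with several HBAs.  Below is a small Python sketch that pulls the port WWPNs out of saved ListHBAs output and builds one download command per port.  It assumes the output has lines containing the word "Port" followed by a colon-separated WWN, which is an assumption about the format, so check it against your own output.  The helper names are hypothetical, and the sketch only builds the commands without running them.

```python
import re
from typing import List

HBACMD = "/usr/sbin/hbanyware/hbacmd"
# Eight colon-separated hex bytes, e.g. 50:06:0b:00:00:c2:7a:01
WWN_PATTERN = re.compile(r"(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}")


def parse_port_wwpns(listhbas_output: str) -> List[str]:
    """Collect unique WWNs from lines that mention 'Port', in order seen."""
    wwpns: List[str] = []
    for line in listhbas_output.splitlines():
        if "Port" not in line:
            continue  # skips Node WWN and other lines
        match = WWN_PATTERN.search(line)
        if match and match.group(0) not in wwpns:
            wwpns.append(match.group(0))
    return wwpns


def build_download_commands(wwpns: List[str], firmware_file: str) -> List[List[str]]:
    """Build one 'hbacmd download <wwpn> <file>' argument list per port."""
    return [[HBACMD, "download", wwpn, firmware_file] for wwpn in wwpns]


if __name__ == "__main__":
    # Sample text in the assumed format; the second port is made up.
    sample = (
        "Port WWN   : 50:06:0b:00:00:c2:7a:01\n"
        "Node WWN   : 50:06:0b:00:00:c2:7a:00\n"
        "Port WWN   : 50:06:0b:00:00:c2:7a:03\n"
    )
    for argv in build_download_commands(parse_port_wwpns(sample), "zf282a4.all"):
        print(" ".join(argv))
```

Printing the commands first and reviewing them is deliberate: flashing firmware is not something you want a parsing mistake to decide for you.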

Download Emulex firmware here:

Broadcom drivers must be installed in maintenance mode, followed by a reboot.  Again, mount the ISO using Virtual Media.
  • mount /mnt/cdrom
  • vimsh -n -e /hostsvc/maintenance_mode_enter
  • esxupdate --bundle=/mnt/cdrom/offline-bundle/BCM-bnx2x-1.70.34.v41.1-offline_bundle-547965.zip update
  • reboot
  • ethtool -i vmnic0 (to verify)
     and
    vimsh -n -e /hostsvc/maintenance_mode_exit
Here is the current (at the time of this post) Broadcom fix from VMware:
vmware-esx-drivers-bnx2-bnx2x-cnic-bnx2i.547965.iso
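If you have more than a couple of hosts to patch, the driver update is a good candidate for a script.  The Python sketch below lays the steps out in order and stops at the first command that fails, so the host never reboots after a failed update.  It assumes the update tool is esxupdate, as on an ESX 4.x Service Console.  The function names and the injectable runner are my own invention for illustration, and the steps after the reboot have to be started separately once the host is back up.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], int]


def pre_reboot_steps(bundle_path: str) -> List[List[str]]:
    """Commands to run, in order, before the host restarts."""
    return [
        ["mount", "/mnt/cdrom"],
        ["vimsh", "-n", "-e", "/hostsvc/maintenance_mode_enter"],
        ["esxupdate", "--bundle=" + bundle_path, "update"],
        ["reboot"],
    ]


def post_reboot_steps(nic: str = "vmnic0") -> List[List[str]]:
    """Commands to run after the host is back up."""
    return [
        ["ethtool", "-i", nic],  # verify the driver version
        ["vimsh", "-n", "-e", "/hostsvc/maintenance_mode_exit"],
    ]


def run_steps(steps: List[List[str]], runner: Runner) -> List[List[str]]:
    """Run steps in order; stop at the first non-zero exit code.

    Returns the steps that completed successfully.
    """
    completed = []
    for step in steps:
        if runner(step) != 0:
            break
        completed.append(step)
    return completed


def shell_runner(argv: Sequence[str]) -> int:
    """Real runner: execute the command and return its exit code."""
    return subprocess.call(list(argv))
```

On a real host you would call run_steps(pre_reboot_steps(bundle), shell_runner).  Passing a fake runner instead lets you rehearse the sequence on your workstation before touching production.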

Automate, Automate, Automate
Use tools like PowerShell to manage the failover.  Microsoft's Failover Cluster Manager is the GUI way to do the same thing.  You want to script failover so it happens automatically based on real conditions and thresholds.  In a Disaster Recovery situation you will WANT to automate the failover.  Consider this.  Node A resides on an ESX host at your local data center.  Node B resides in your disaster recovery hot site many miles away.  If a condition is met on Node A, fail over automatically to Node B.  Scripting this will take a lot of time and, most importantly, lots of testing!  It's worth your effort.  I prefer Linux clusters in this scenario for the ease of scripting and the reliability of these systems.
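To make "conditions and thresholds" concrete, here is a minimal Python sketch of the decision logic only.  It assumes a separate health check reports whether Node A is healthy, and it triggers a failover only after several consecutive failed checks so that one missed heartbeat does not bounce the application.  The class name, method name and default threshold are hypothetical; the actual failover call is whatever your cluster software provides.

```python
class FailoverDecider:
    """Decide when to fail over from Node A to Node B."""

    def __init__(self, failure_threshold: int = 3) -> None:
        if failure_threshold < 1:
            raise ValueError("failure_threshold must be at least 1")
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.failed_over = False

    def observe(self, node_a_healthy: bool) -> bool:
        """Record one health check.

        Returns True exactly once, on the check that crosses the threshold.
        """
        if self.failed_over:
            return False  # already failed over; nothing more to trigger
        if node_a_healthy:
            self.consecutive_failures = 0  # a good check resets the count
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.failed_over = True
            return True
        return False


if __name__ == "__main__":
    decider = FailoverDecider(failure_threshold=3)
    for healthy in [True, False, False, False]:
        if decider.observe(healthy):
            print("Threshold crossed: fail over to Node B")
```

Keeping the decision separate from the action makes the testing part easier, because you can replay recorded health-check histories through the decider without failing anything over.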

Clusters are complex environments with lots of inter-dependencies.  You may ask yourself why you would cluster within a cluster.  The answer is that the cluster can be built just like a physical cluster, but within VMware.  All the configurations and tools exist to make it so.  Pay close attention to detail during your configuration.  Work from an existing standard, template or profile.  Lastly, automate every step that can be automated.  Your boss will thank you when disaster strikes.

Good luck!




Wednesday, June 6, 2012

vSphere Security - Architecture

In vSphere ESXi 5.0 there are three major components of the security architecture.  Security can and should be applied to all three areas.

  1. Virtualization Layer (VMkernel)
  2. Virtual Networking Layer
  3. Virtual Machines
In this blog we're talking about security at the Virtualization layer. What can we do at the VMkernel layer that would help us secure the system?  There are three things we can do to protect the system at the kernel level.

  1. Memory Hardening
  2. Kernel Module Integrity
  3. Trusted Platform Module (TPM)

The cool part of memory addresses in ESXi 5 is randomness.  User applications, drivers and libraries are located randomly in non-predictable memory.  Does that make you wonder about the algorithm that assigns memory in a non-predictable way?  Me too.  Even chaos shows patterns.  That's why VMware also uses non-executable memory protections, made possible by advances in microprocessors.  Now if a memory exploit is discovered and malicious code is deployed, chances are the code will fail when it runs into this randomness or the protected memory.  Security loves randomness.

Kernel Module Integrity is a fancy term for digital signatures.  Drivers, modules and applications are digitally signed, and the signatures are checked as they are loaded into the VMkernel.  This allows the kernel to identify the providers of drivers, modules and applications and ensures they are VMware-certified.  I really like this compared with, say, Android and other open source solutions.  Apple does this kind of "certification" for the third-party apps it distributes.  It's just smart business.  Control the development, certification process and deployment and you'll have higher quality code.  Many vendors use this model for apps developed for their products.

Trusted Platform Module (TPM) is a measuring tool used each time ESXi boots.  Think of it as an approved configuration that boots to the same place every time.  It's enabled by default and can't be disabled.  It measures the VMkernel and a subset of VIBs - vSphere Installation Bundles (not to be mistaken for a MIB - the Management Information Base used with SNMP).  VIBs allow you to include certain modules in the host image for deployment when building or recovering hosts.  The measurement taken at boot time is stored in a register called Platform Configuration Register (PCR) 20.  This value can be (and should be) monitored.  You want to watch for any changes to the image and for image corruption.  TPM is largely a configuration and change management tool.  Alerts should be set up to notify the appropriate people when unauthorized changes are made.
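Monitoring the PCR value boils down to comparing what each host reports against a known-good baseline.  The Python sketch below shows only that comparison.  It assumes you already collect the PCR 20 value for each host as a hex string by some other means; how you fetch it is not covered here, and the function names and sample values are made up for illustration.

```python
from typing import Dict, List


def normalize_digest(value: str) -> str:
    """Lower-case a hex digest and drop colons and spaces so formats compare equal."""
    return value.strip().lower().replace(":", "").replace(" ", "")


def hosts_with_changed_pcr(
    baseline: Dict[str, str],
    current: Dict[str, str],
) -> List[str]:
    """Return hosts whose reported value is missing or differs from the baseline."""
    changed = []
    for host, expected in sorted(baseline.items()):
        observed = current.get(host)
        # A host that reports nothing is treated as changed, not as fine.
        if observed is None or normalize_digest(observed) != normalize_digest(expected):
            changed.append(host)
    return changed


if __name__ == "__main__":
    # Hypothetical, shortened values purely for illustration.
    baseline = {"esx01": "AB:CD:EF", "esx02": "12 34 56"}
    current = {"esx01": "abcdef", "esx02": "123457"}
    for host in hosts_with_changed_pcr(baseline, current):
        print(f"ALERT: PCR 20 on {host} does not match the baseline")
```

Remember to update the baseline as part of your change process whenever you patch a host on purpose, otherwise every approved change shows up as an alert.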

In this blog we discussed the features for securing the VMkernel at the Virtualization layer.  There are three main components: Memory Hardening, Kernel Module Integrity and Trusted Platform Module (TPM).  All three play important and distinct roles in the security architecture of vSphere 5.0.  Monitoring and/or logging the activity of these features and alerting on thresholds is a critical step in ensuring your vSphere environment remains secure.

Friday, April 27, 2012

VMadmin - The One IT Employee You Don't Want to Upset!

As IT convergence continues to gain popularity with everyone from CIOs to System Administrators the role of one (or more) person(s) is quickly becoming THE job you want to have in IT.  For many reasons the VMware Administrator is the one IT job you want if you are a Systems person (Sys Admin, Network Admin, Storage Admin, etc.).  He or she is the Neo in The Matrix who can do it all.  Virtualization is so popular because of how much money it can save the company.  Everyone wins.  From the Doctor carrying a tablet around, visiting patients, updating information with his or her Virtual Desktop Infrastructure (VDI) to the person automatically uploading photos to some storage space in the Cloud for sharing on Facebook to the lonely system or application administrator preparing a clone of a VM for a big application upgrade.  Virtualization allows this all to happen while saving money, time, space, and energy. 

A bit of history.  Computers were supposed to make everything easier, more productive and fun!  Then humans came along and ruined it.  Today computer systems can be very complex and expensive to manage. Open Systems took away the sleek beauty of one mainframe and exploded it into a massive array of little pieces at a much lower price.  Now everyone is into computers.  From their smartphone to their fancy MacBook Air or iPad tablet sitting at Panera Bread trying to appear more intellectual than they might be.

Virtualization is the next logical step in a constantly evolving electronic ecosystem.  Which brings me back to our ultimate warrior called the VM Admin.  Thanks to this push for convergence driven by budget issues and new energy initiatives comes someone who sits squarely in the middle of the data center spectrum.

There is the Network, which has traditionally been managed by the Network Administrator.  Then Cisco comes along with the Unified Computing System - an awesome blade technology platform for hosting virtual machines.  Wait a second.  Isn't Cisco a networking company?  The lines between Network and Server have faded into one device.  When Sun Microsystems used to advertise "The Network is the Computer" they were spot on.  That was back in the 1990s!  The VM Admin is a key player in setting up virtual switches (vSwitches, distributed or not) and virtual port groups (distributed or not), and mapping those to virtual local area networks (VLANs).  Of course the VM Admin needs to work with the traditional Network Administrator to ensure settings (static and dynamic) are compatible.

There is the Storage Administrator.  At one time this was the second highest paying job in IT (next to DBA).  Then along comes Compellent with their vSphere client plugin to help with migrating VMs to their storage array.  Now the VM Admin can do all these storage activities:
  • Provision, resize or expand volumes as business needs demand
  • Configure storage and VM hosts
  • Replicate VM datastore volumes for disaster recovery
  • Clone volumes to accelerate test and development
  • View virtual storage statistics
Last but not least, the traditional System Administrator falls to the mighty VM Admin.  What, you used to be called a Windows Admin or a Unix Systems Administrator?  That's SO 2000 and late.  Need to set up a cluster of three nodes for a new application?  Don't call the Unix admin.  It'll take him or her weeks to set it up.  Call the VM admin - done in an hour.  Now, let's go to lunch.  Need to clone a production host to test out some patches or a software upgrade?  Windows admin says I'll need a few days to work out licensing, installation, hardening, etc.  VM admin says it's already done.  Have you ever lost hardware on a critical server?  VM admin yawns as he watches all the VMs automatically migrate to another host in the cluster.  Need to add more memory to that server?  Let me put in a request for change, explain myself at the change meeting, prepare for the change, create a backout plan, schedule it for Sunday at 2am, drive in, make the change, lose much needed beauty sleep, then cautiously ask the application owner to test out the application.  The VM admin was done dynamically adding 2 GB of memory before the original question was asked.  Nuff said.

Here is the list of traditional IT jobs that are slowly being consumed by one super admin - the VM Admin:

Backup Administrator
Storage Administrator
Unix Administrator
Linux Administrator
Windows Administrator
Network Administrator
Security Administrator
Business Continuity Professional
Disaster Recovery Planner
<Your Job Title Here>

It's not over yet.  You still need that expertise when you need to rebuild a Linux kernel or troubleshoot a core dump.  The VM admin builds the house that IT lives in.  It's still up to the traditional admin to manage the details.  The convergence of roles in the data center isn't over yet.  Not by a long shot.  So forget all that HP-UX training you went to out in Fullerton, CA.  You need to upgrade your skills today.  So download that Mastering VMware vSphere electronic book to your tablet, register for the exam online and tweet a friend about it.





Friday, April 20, 2012

Doing Cool Stuff with VMsafe - Wait Not Yet!

VMware just keeps getting better and better all the time.  If your company or organization isn't already using virtualization in some form or another then you must consider it.  Everything is being virtualized.  Literally.  And VMware is THE standard to which all others aspire.  Not to knock Oracle VM Server, which is awesome on an enterprise level.  Xen, Red Hat's KVM and Oracle's VirtualBox are great products too.  For small to medium and even enterprise level businesses VMware is the leader in the virtualization industry - aka the Cloud.

VMware VMsafe gives you three ways to better protect your virtual machines through Application Programming Interfaces (APIs).  You or some company has to create the code (test it, quality check it, etc.).

1.  vCompute is an API that provides CPU and Memory inspection before code is executed.  Vendors are scrambling to create code for this.  I'm sure Symantec, Norton, Sophos will be big players in this market.  It'll add an important layer of security as it will be the place where code is inspected for many things BEFORE it is executed.

2.  vNetwork Appliance has a DVFilter API that will sit between the vNIC and the vSwitch.  It will allow you and security vendors the ability to create network packet filters to insert into the virtual packet stream. VMware vCenter Lab Manager (a cool way to automate transient and cloned VMs) was the first product built to use DVFilter.  

3.  And for disk block inspection you can write code with the VDDK API.  This is a Software Development Kit (SDK) including all the necessary libraries.  Who's checking for slack space besides hackers and forensics specialists?  There should be software on your systems checking for that.  Why not do this at the source storage array (EMC, Compellent, Hitachi, Oracle, IBM, etc.)?  I've always preached the idea of pushing functions down to the lowest point in a data flow.  Perhaps a VMware product is the place where this inspection takes place. 

The problem is the code has yet to be developed, tested, quality checked and made available in beta.  While at the same time we DO have access to software that will do similar functions for us today.  I still like where this is headed though.  Security is all about position, timing and diligence.  That's why we architect our network to hide, inspect, filter, and block traffic.  VMware is THE place for consolidation today.  The VMware admin IS the storage, network and system administrator today.  More on that topic in another blog.  Timing is all about patching, auditing, hardening before and after the device is put on the network.  Diligence is about staying focused on security even when it becomes tedious, expensive and resource depleting.

VMware VMsafe is yet another product/layer/utility poised to reinforce VMware as the leader in virtualization.  What will they think of next?