VMFAQ.com - FAQing about virtualization since 2008


I need more performance out of my VMware environment

Steps 1, 2 and 6 apply only to the hosted products (Workstation, Server, Player, Fusion), not to ESX Server. The other steps concern the hardware or the guest environment and also apply to ESX Server.

The settings in step 2 make each VM use more host memory, so don't assign more memory to your guests than is available on the host. Overcommitting memory is possible, but not normally recommended: once the host starts swapping, everything becomes very slow.
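
As a rough sanity check of the rule above, here is a minimal sketch that adds up the memory assigned to the guests and compares it with what the host can spare. The function name and the 1 GB kept back for the host OS are illustrative assumptions, not VMware recommendations.

```python
def memory_overcommit_mb(host_ram_mb, host_reserved_mb, guest_mem_mb):
    """Return by how many MB the guests exceed the RAM left for them (0 if they fit)."""
    available = host_ram_mb - host_reserved_mb
    assigned = sum(guest_mem_mb)
    return max(0, assigned - available)


# Example: 8 GB host, 1 GB kept for the host OS (an assumed figure).
print(memory_overcommit_mb(8192, 1024, [2048, 2048, 2048]))  # 0, the guests fit
print(memory_overcommit_mb(8192, 1024, [4096, 2048, 2048]))  # 1024 MB overcommitted
```

Anything above 0 means the host would have to swap if all guests touched their full memory at once.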

1. Disable any power saving functions in the host OS and the BIOS

  • This includes SpeedStep, PowerNow!, Cool'n'Quiet and similar
  • In the BIOS, choose Maximum performance
  • After these steps your CPU speed should no longer change
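
On a Linux host, one way to verify that the CPU speed is pinned is to look at the cpufreq governor of each CPU. The sketch below assumes the host exposes the usual cpufreq files under /sys/devices/system/cpu and that "performance" is the governor you want; the function names are made up for this example.

```python
import glob
import os


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each CPU name (e.g. 'cpu0') to its current cpufreq governor."""
    governors = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        cpu = path.split(os.sep)[-3]  # .../cpuN/cpufreq/scaling_governor
        with open(path) as handle:
            governors[cpu] = handle.read().strip()
    return governors


def cpus_not_pinned(governors, wanted="performance"):
    """Return the CPUs whose governor may still change the clock speed."""
    return sorted(cpu for cpu, gov in governors.items() if gov != wanted)


if __name__ == "__main__":
    found = read_governors()
    if not found:
        print("No cpufreq information found (scaling may be disabled or unsupported).")
    for cpu in cpus_not_pinned(found):
        print(f"{cpu}: governor is {found[cpu]}, not performance")
```

No output for any CPU means every governor already reads "performance". This only covers the host OS side; the BIOS setting still has to be checked by hand.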

2. A few settings in the global config file or in an individual VM's config file (.vmx) will improve performance.

  • The global config file is usually found in /etc/vmware/config or c:\Documents and Settings\All Users\Application Data\VMware\VMware <product name>\config.ini
  • The important lines are these:
    MemTrimRate=0
    sched.mem.pshare.enable = "FALSE"
    mainMem.useNamedFile = "FALSE"
  • By setting MemTrimRate to 0, memory allocation inside the guest is faster because VMware no longer gives memory back to the host OS and takes it again on every request.
  • By disabling memory sharing (sched.mem.pshare.enable), your guests will no longer share common memory blocks. Your VMware product will also stop comparing memory blocks.
  • When allocating memory, VMware stores part of the guest memory in a file. This file is as large as the memory allocated to the guest VM, and it exists because the RAM allocation method used is mmap. Setting mainMem.useNamedFile to FALSE moves this file from the VM's default location to /tmp on Linux, or into the swap file on Windows. This helps a bit, especially if the new location is on a different disk than the VM. On Linux it helps if /tmp uses the tmpfs file system (or ramfs if you can afford it) (details here)
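
If you have many VMs to adjust, the three lines from step 2 can be merged into a config file with a small script. This is only a sketch: the function name is invented, keys are matched exactly as written above (an assumption about how your files spell them), and you should shut the VM down and keep a backup of the file before writing anything back.

```python
import re

# The three settings from step 2, written exactly as they appear there.
TUNING = {
    "MemTrimRate": 'MemTrimRate=0',
    "sched.mem.pshare.enable": 'sched.mem.pshare.enable = "FALSE"',
    "mainMem.useNamedFile": 'mainMem.useNamedFile = "FALSE"',
}

_KEY = re.compile(r'^\s*([A-Za-z0-9_.:]+)\s*=')


def apply_tuning(config_text, settings=TUNING):
    """Return config_text with each tuned key replaced in place or appended."""
    seen = set()
    out = []
    for line in config_text.splitlines():
        match = _KEY.match(line)
        key = match.group(1) if match else None
        if key in settings:
            if key not in seen:
                out.append(settings[key])
                seen.add(key)
            # Later duplicates of a tuned key are dropped.
        else:
            out.append(line)
    # Append the settings that were not present in the file at all.
    out.extend(line for key, line in settings.items() if key not in seen)
    return "\n".join(out) + "\n"


print(apply_tuning('memsize = "512"\nMemTrimRate=30\n'), end="")
```

Running the function a second time on its own output changes nothing, so it is safe to apply to a file that has already been tuned.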

3. A fast physical disk subsystem is important.

  • After memory, the disk is the most common bottleneck. On a physical server the disk serves one machine only; on a virtualization host the same disk system carries the load of several servers.
  • Avoid using virtual IDE drives inside your guest VMs. They are slower and put more CPU load on the system than virtual SCSI drives.
  • Using separate disks for the host OS and the VMs will do you good.
  • A good controller will do you even better. If you plan to run more than a few VMs on your server, a disk controller with a battery-backed write cache will help performance a great deal. If your system is connected to a UPS you might also benefit from enabling caching on the individual disks (this makes little difference in performance, but lowers CPU usage).
  • Avoid software RAID. Even though software RAID performs acceptably on normal servers today, it puts extra load on the system, and you need as much performance from your storage as possible because the virtualization overhead comes on top.
  • Use native drivers from the hardware vendor in the host OS (and VMware Tools inside the guest). Firmware on controllers and disks can sometimes also have an effect.
  • Use preallocated disks to avoid fragmentation.
  • Snapshots have a negative effect on your performance.
  • ESX: Plan your SAN setup carefully. Make sure you don't have too many VMs per LUN. Also make sure to load balance your traffic between your SAN controllers. If performance is bad, use your SAN tools to check the cache hit ratio. VMware has a SAN design guide with good recommendations.
  • ESX: Do not connect too many ESX servers to the same SAN controller. It will affect the latency.
  • ESX: Fibre Channel normally gives much better I/O performance than iSCSI/NFS (details)

4. Most Linux kernels are not tuned to run as a guest (details here)

  • In 2.4 kernels the system timer normally ran at 100 Hz, while in 2.6 the default is 1000 Hz (some distros do not follow these "rules", and USER_HZ is still 100). 1000 Hz is definitely a good thing for physical desktop computers, but it has bad side effects when virtualized. You will typically see that the load of an idle VM is higher than expected and that the clock inside the guest does not keep correct time.
  • The solution is to recompile the kernel for 100 Hz or (on recent RHEL/CentOS) to use the tick divider boot option "divider=10".
  • Using the kernel parameters "nosmp noapic nolapic" also has a positive effect.
  • For Linux guests that have paravirt_ops in the kernel (2.6.22 and newer) you should enable VMI (paravirtualization).
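
To make the tick divider arithmetic concrete, here is a small sketch that works out the tick rate a guest ends up with for a given kernel command line. The function name is invented, and the default of 1000 Hz is the 2.6 value mentioned above; pass a different compiled_hz if your kernel was built differently.

```python
import re


def effective_tick_hz(cmdline, compiled_hz=1000):
    """Return the tick rate after applying a 'divider=N' boot option, if any."""
    match = re.search(r"(?:^|\s)divider=(\d+)(?:\s|$)", cmdline)
    divider = int(match.group(1)) if match else 1
    if divider < 1:
        raise ValueError("divider must be at least 1")
    return compiled_hz // divider


print(effective_tick_hz("ro root=/dev/sda1 divider=10"))  # 100
print(effective_tick_hz("ro root=/dev/sda1"))             # 1000
```

On a running guest the command line can be read from /proc/cmdline and passed to the function as a string.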

5. Single-CPU VMs might give the best performance

  • Multi-CPU guests add extra virtualization overhead and should only be used after your testing shows that a single CPU is not enough. If your host has only two cores in total, never give any of your guests more than a single CPU. (ESX2, Workstation & Server -> details here , ESX3.x -> details here)
  • Always start by giving your guest VMs a single CPU.
  • If you install a Windows guest with two vCPUs allocated, you'll have a harder time reverting the HAL to a single-CPU HAL than if you had started with a single vCPU. (details here)

6. Swap

  • Since hosted VMware products will always use swap if present, you might benefit from disabling it. On Linux, VMware uses separate files (so disabling swap won't help), while on Windows it uses the system's page file if you set the mainMem.useNamedFile option as suggested in step 2 above. If you are 100% sure you have more than enough RAM in your system and you are a brave fellow, you might benefit from disabling the Windows page file. This is, however, a very unsupported solution and I wouldn't recommend it for anything other than experimental use.

7. Virtual SCSI controller & disks

  • Using an LSI Logic disk controller will normally give better performance than using the BusLogic controller (details here). A virtual IDE controller causes higher CPU load in the VM and is slower than both virtual SCSI controllers.
  • Using preallocated disks is faster than using growable disks.
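
To find VMs that still use the slower options, you can scan their .vmx files. The sketch below assumes the usual .vmx spelling of the relevant keys (scsiN.virtualDev for the controller type, ideN:M.fileName for attached IDE devices); the function names are made up, and it only looks at the text you hand it.

```python
import re

_SETTING = re.compile(r'^\s*([^#\s=]+)\s*=\s*"?([^"]*)"?\s*$')


def parse_vmx(text):
    """Parse 'key = "value"' lines into a dict with lowercased keys."""
    settings = {}
    for line in text.splitlines():
        match = _SETTING.match(line)
        if match:
            settings[match.group(1).lower()] = match.group(2).strip()
    return settings


def storage_warnings(settings):
    """Return warnings for virtual disks on IDE and for BusLogic controllers."""
    warnings = []
    for key, value in sorted(settings.items()):
        # Only .vmdk files count as disks; ISO images on IDE are left alone.
        if re.fullmatch(r"ide\d+:\d+\.filename", key) and value.lower().endswith(".vmdk"):
            warnings.append(f"{key}: virtual disk {value} is attached to IDE")
        if re.fullmatch(r"scsi\d+\.virtualdev", key) and value.lower() == "buslogic":
            warnings.append(f"{key}: BusLogic controller, consider lsilogic")
    return warnings


sample = 'scsi0.virtualDev = "buslogic"\nide0:0.fileName = "old.vmdk"\n'
for warning in storage_warnings(parse_vmx(sample)):
    print(warning)
```

Changing the controller type of an installed guest usually needs a matching driver inside the guest first, so treat the warnings as a to-do list rather than something to fix by editing the file blindly.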

8. Install VMware Tools

  • ...even if it's a text-console-only VM.

9. Use a CPU that supports MMU virtualization

  • CPUs that support MMU virtualization give better performance for workloads that are MMU intensive (details here). Currently, this is supported on AMD CPUs codenamed Barcelona and Shanghai. AMD has named this feature RVI, but also refers to it as nested paging. This feature is supported in Workstation 6.5, Server 2.0 and ESX 3.5.

An old VMware KB article, perhaps still useful: http://kb.vmware.com/kb/844

 VMware also has a very good paper on performance tuning for Workstation 6 here.

 


