organized flames

VMware Server and Xen

Posted on May 20, 2008 by Michael

In the past, I’ve played with VMware (as a “workstation” “server”, not the bare-metal one we have access to now) but was never quite happy with it. Some of the problems I had might have been because I ran it on Windows Vista, not Linux. However, from a VM point of view, the VM itself should be more or less identical.

Recently I tried out Xen on the same hardware, but using NetBSD/amd64 as the “host” OS.

Hardware

The machine is a Gateway GM5446E with a dual-core Intel Core 2 Duo and 3 GB of RAM. The machine has three SATA hard drives, connected via an Intel AHCI controller running in native AHCI mode.

Bare Hardware Baseline

I booted a standard NetBSD-current/amd64 kernel and ran some speed tests, which gives a baseline for dom0 and guest OS disk I/O speed tests. See below.

VMware

In my VMware install, I used the machine running “Windows Vista Media Center” – it was what came pre-installed on the machine.

Host OS: Windows Vista Media Center

Guest OSs tried:

  • NetBSD-4.0/i386
  • NetBSD-4.0/amd64
  • NetBSD-current/i386
  • NetBSD-current/amd64
  • Linux Ubuntu (server, then-current version)
  • Linux Debian (then-current version)

Linux booted in 64-bit mode, but would crap out at some later point with issues similar to those NetBSD had.

One machine, named “nfsd”, was dedicated to serving out home directories and source trees of NetBSD. The other guest OSs mounted /home or /netbsd-src from nfsd.

Each machine had a “local” disk to store object files from pkgsrc, src, and other OS-related builds.

General VMware Problems

I could not get NetBSD/amd64 (current or 4.0 release) to “self host” – build /usr/src and kernels – reliably. It would either silently lock up without reason, or crash with an odd CPU exception.

Timekeeping was wacky. Without running the VMware-supplied (closed source) tools in a guest, time was off, and apparently in uncorrectable ways. Running ntp made things seriously wacky, as the time would drift wildly while ntp tried to correct a guest.

The VMware closed-source tools are only available for a small number of OS types, and then only for specific versions of many of them. They do supply a .so that is, in theory, linkable on many versions of Linux, but the install procedure warns loudly about compatibility.

VMware running on anything but Intel chips with synchronized cycle counters (which most OSs use for high-res timekeeping these days) was a disaster.

VMware Strengths

If the problems above are solved, VMware is a true virtual machine architecture that will run any OS without modification. VMware could run Windows guests, right along with unmodified NetBSD, FreeBSD, Linux, and Solaris/x86 guests.

Xen

I tried Xen 3.1.3.

Xen has a different architecture from VMware in that it prefers to use “paravirtualization” rather than a full virtual machine. It has a host machine (called “domain 0” or “dom0”) which attaches to the physical hardware and acts as a conduit between the Xen hypervisor and hardware.

The boot process is that the Xen kernel is booted first, which then boots the dom0 host. Multiple domains can be created, serving different hardware, but in practice this is rarely done.

Each guest OS has a config file, and is started with “xm create /path/to/file.conf”. This boots the guest OS and connects a serial console, which can be used with “xm console” followed by the guest’s name.

Since the “dom0” is a fully functional OS in its own right, I have it serve NFS to the guest OSs.
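As a rough illustration of what such a config file can look like: xm config files are written in Python syntax, so a minimal sketch for a paravirtualized NetBSD guest might read as follows. This is not my actual file. The guest name, kernel path, disk image path, memory size, and bridge name are all placeholders, and the exact disk device number depends on the setup.

```python
# Hypothetical xm domain config for a paravirtualized NetBSD guest.
# Every name and path below is a placeholder, not a value from this setup.

# Guest kernel that the hypervisor loads directly (no BIOS, no boot loader).
kernel = "/path/to/netbsd-XEN3_DOMU"

# Name used with "xm console" and shown by "xm list".
name = "example-guest"

# Memory dedicated to this guest, in megabytes.
memory = 256

# Number of virtual CPUs; 1 makes the guest uniprocessor.
vcpus = 1

# A file on dom0 exported to the guest as a writable virtual disk.
# The middle field is the device number seen by the guest (assumed here).
disk = ["file:/path/to/example-guest.img,0x1,w"]

# One virtual network interface attached to a dom0 bridge (name assumed).
vif = ["bridge=bridge0"]
```

Saved as a file, this would be passed to “xm create”.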

I created the following guest OSs:

  • NetBSD-current/i386
  • NetBSD-current/amd64
  • Windows XP Pro (32-bit)
  • Windows Server 2003 (32-bit)

Yes, I managed to install Windows XP Pro and Server 2003. They run in a “vnc” console, and for all practical purposes look like Windows. This is using Xen’s “hvm” – which is a full hardware emulated virtual machine, and allows running unmodified guest OSs. People have Vista running in a virtual machine under Xen, but I do not have “real” Vista install media or licenses, just the copy that came with, and is tied to, my hardware.

Xen also supports both realtime and offline “migration.” As I have only one machine of this type, I have not yet read up on the details of how this works. The basics: A realtime copy is made of the guest’s RAM, device state, and other data. It is transmitted to the new destination, and kept in sync until only a very small switchover time is needed to swap where that guest is running. Xen claims 100 ms switchover time is possible, but there are restrictions: The disks are NOT migrated, so they must reside on a shared volume. The physical network each dom0 is on must also be shared, in order to avoid disruption of TCP connections. I also believe nearly identical CPUs and dom0 operating systems should be used.

Offline migration involves shutting the guest down, copying the disks over, and restarting it on a new dom0. This will, of course, interrupt service.

Xen Problems

It is difficult to configure for the first time. The documentation is… lacking. It is also only as solid as the host OS is, but VMware has the same issue in its “server” or “workstation” incarnations.

Xen is also very, very “Linux” specific in documentation and examples. Most of these can be translated – I certainly did so easily enough – and this is being corrected in their documentation as more OSs are able to boot as dom0.

Xen Strengths

Timekeeping in Xen, since it is paravirtualized, is almost perfect. Small drifts will occur without running ntp, but all guests (and the host) can run ntp and obtain sanity.

It also appears that all guests and the dom0 “drift” identically, so this is probably related to hardware timekeeping issues. The measured drift of an uncorrected NetBSD guest was 4 seconds in two weeks. ntp correction kept the others in perfect real-world sync.
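To put that drift figure in more familiar units, here is a small back-of-the-envelope calculation. The only inputs are the 4 seconds and two weeks quoted above; the conversion to parts per million is just arithmetic.

```python
# Convert the measured drift (4 seconds over two weeks) into seconds per day
# and parts per million (ppm), the unit clock accuracy is usually quoted in.
SECONDS_PER_DAY = 86_400

drift_seconds = 4.0
elapsed_days = 14
elapsed_seconds = elapsed_days * SECONDS_PER_DAY

drift_per_day = drift_seconds / elapsed_days
drift_ppm = drift_seconds / elapsed_seconds * 1_000_000

print(f"{drift_per_day:.3f} s/day, {drift_ppm:.2f} ppm")
```

That works out to roughly a quarter of a second per day, or a little over 3 ppm.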

It is as free as you want it to be. Support and commercial versions exist, but the free stuff works amazingly well.

Performance

On Xen, all tests were performed with the domUs running but idle, and no hvm guests running (Windows is just too unpredictable). On VMware, only one VM was active at a time, and the Vista host was as idle as it could be made.

I measured three main things here:

  1. Boot speed: How fast a kernel gets from loading to the first /etc/rc message.
  2. Disk speed: read/write speed.
  3. CPU performance: Dhrystone score.

Boot Speed

The dom0 boots as fast as any other kernel boots; it must probe the hardware, wait for hardware to change state, etc. No measured difference between a standard NetBSD-current/amd64 kernel and the dom0 kernel.

The domUs (guests) boot so fast it is nearly impossible to measure. This is because the devices they have access to are known – all are on a virtual bus, and are directly enumerable, so there is no need to probe for devices, wait for them to change state, or time out when not present. As best I can measure, just under 2 seconds is a fair estimate.

The hvm (Windows) guest seems to be about as fast as Windows is. I did not analyze this one much.

On VMware, the guest OSs boot at about the same speed as a “real” machine boots unless a custom kernel is built with just “known present” devices. Even then, boot times are 15-20 seconds.

Disk Speed

In all host/dom0 tests, “iozone” version 3.263 was used, with a 1 GB file on the same disk. Each test was performed only once; for real comparison data we’d want to run it more than once, but this is just a first-pass test.

For native NetBSD/amd64, I had to increase the file size to 4 GB to avoid the cache, as the machine has 3 GB of ram.

  • wd0 is a 500 GB SATA 3.0Gb/sec disk.
  • wd1 is present but unused.
  • wd2 is a 320 GB SATA 1.5Gb/sec disk.

All are on different channels of an Intel AHCI controller running in native SATA mode.

For the Xen tests, all disks were mounted as files on the dom0 host. From dom0’s point of view, the file is mounted on a “vnd” virtual disk, and that virtual disk is exported to the guest.

For the VMware test, all disks were mounted as files in the Windows filesystem.

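OS                           Disk  Block Size  Read   Write
netbsd-current/amd64 native  wd0   8192        60762  59949
netbsd-current/amd64 native  wd0   16384       60545  60141
netbsd-current/amd64 native  wd2   8192        78342  76342
netbsd-current/amd64 native  wd2   16384       78311  75252
netbsd-current/amd64 dom0    wd0   8192        60641  60109
netbsd-current/amd64 dom0    wd0   16384       60459  61919
netbsd-current/amd64 dom0    wd2   8192        80258  79102
netbsd-current/amd64 dom0    wd2   16384       80187  80295
netbsd-current/amd64 domu    wd0   8192        51205  24004
netbsd-current/amd64 domu    wd0   16384       51714  27971
netbsd-current/amd64 domu    wd2   8192        77990  23997
netbsd-current/amd64 domu    wd2   16384       77282  22496
netbsd-current/i386 domu     wd0   8192        41730  25012
netbsd-current/i386 domu     wd0   16384       42008  24543
netbsd-current/i386 domu     wd2   8192        66401  26048
netbsd-current/i386 domu     wd2   16384       66201  28910
netbsd-current/i386 vmware   wd0   8192        25014  13912
netbsd-current/i386 vmware   wd0   16384       25417  13771
netbsd-current/i386 vmware   wd2   8192        38831  16100
netbsd-current/i386 vmware   wd2   16384       38994  16332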

I also repeated one test with a raw, physical partition mounted in the netbsd-current/amd64 domU, which bypasses the “double filesystem” issue:

OS                                          Disk  Block Size  Read   Write
netbsd-current/amd64 domu (physical mount)  wd2   8192        79915  76992
netbsd-current/amd64 domu (physical mount)  wd2   16384       79744  77102
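To make the numbers easier to compare, here is a small sketch that expresses each configuration as a percentage of the bare-hardware result. It uses only the wd2 figures at the 8192 block size from the tables above; the labels are my own shorthand, and since each test was run once the percentages are only indicative.

```python
# Throughput on wd2 with 8192-byte blocks, copied from the tables above.
# Each entry is (read, write) in whatever units iozone reported.
NATIVE = (78342, 76342)

RESULTS = {
    "amd64 dom0": (80258, 79102),
    "amd64 domU, file-backed disk": (77990, 23997),
    "amd64 domU, physical partition": (79915, 76992),
    "i386 domU, file-backed disk": (66401, 26048),
    "i386 VMware, file-backed disk": (38831, 16100),
}


def percent_of_native(value, native_value):
    """Express one measurement as a percentage of the bare-hardware figure."""
    return 100.0 * value / native_value


for label, (read, write) in RESULTS.items():
    read_pct = percent_of_native(read, NATIVE[0])
    write_pct = percent_of_native(write, NATIVE[1])
    print(f"{label:<32} read {read_pct:6.1f}%   write {write_pct:6.1f}%")
```

The file-backed domU writes land at roughly a third of native speed, while the physical partition is back at native speed, which is consistent with the “double filesystem” explanation.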

CPU Performance

Each CPU speed test was run with: Dhrystone Benchmark, Version 2.1 (Language: C) Program compiled without ‘register’ attribute.

I used an iteration count of 1,000,000,000 for each test.

Operating System             Dhrystones per second
netbsd-current/amd64 native  11,013,216
netbsd-current/amd64 dom0    10,365,917
netbsd-current/amd64 domu    11,130,899
netbsd-current/i386 domu     4,935,347
netbsd-current/i386 vmware   5,012,123

Just for grins, I ran the following tests, one dhrystone on one guest and another on a different one. Since each guest is uniprocessor in my configuration, I did not run two benchmarks on the same host.

Operating Systems                       Speed 1     Speed 2
Running both domu/i386 and domu/amd64   4916421.0   11135857.0
Running both dom0/amd64 and domu/amd64  10298661.0  11135857.0
Running both dom0/amd64 and domu/i386   10373444.0  4921260.0

Conclusions

Xen is production ready.

When the guest OS can be modified, much higher performance numbers are obtained vs. the low-end VMware server I ran.

While it might be extremely tempting to build a separate guest for each very specific function, this probably does not scale:

  • Memory is pre-allocated and dedicated to a guest, and while some swapping is allowed, it will slow the guest at seemingly random times.
  • Disk can be overcommitted, but the OS sees failure to allocate a block as a hardware failure.
  • The more guests there are, the higher the maintenance costs: maintaining packages on each guest, upgrading, etc.

VMware “hmx”, or whatever the run-on-bare-metal product is called, should be tested.

I’d love to install Xen on a huge machine with lots of ram and many, many CPUs as a test. Would someone like to ship me a 4 CPU quad core with 64 GB?

Fun With Apache and Virtual Hosts

Posted on October 31, 2007 by Michael

Specifically, name based virtual hosts.

I recently tried to add IPv6 support to my web server. I used to have it, I remember having it, so this should not be all that hard.

After an hour of hacking, I ended up finding two gotchas:

  • Make certain, and I mean certain, that all virtual hosts for name-based servers have a unique ServerName line.
  • Make certain, and I mean certain, to save your original configuration files.

Yeah, I know, I should have known better. But this is a simple thing to change, right?

A very useful tool is apachectl -S, which lists all virtual hosts. Even better is to run that output through sort.
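As a rough illustration of the first gotcha, here is a short Python sketch that scans configuration text for repeated ServerName values. It is not a replacement for apachectl -S. The sample configuration and host names are made up, and the check is deliberately naive: it ignores which address each virtual host is bound to and does not follow Include directives.

```python
import re
from collections import Counter

# Matches "ServerName <value>" at the start of a line; commented-out lines
# do not match because the "#" comes before the directive name.
SERVER_NAME = re.compile(r"^\s*ServerName\s+(\S+)", re.IGNORECASE | re.MULTILINE)


def duplicate_server_names(config_text):
    """Return the ServerName values that appear more than once, sorted."""
    names = [name.lower() for name in SERVER_NAME.findall(config_text)]
    counts = Counter(names)
    return sorted(name for name, seen in counts.items() if seen > 1)


# Made-up configuration with one accidental duplicate.
SAMPLE = """
<VirtualHost *:80>
    ServerName blog.example.org
</VirtualHost>
<VirtualHost *:80>
    ServerName www.example.org
</VirtualHost>
<VirtualHost *:80>
    # ServerName old.example.org
    ServerName blog.example.org
</VirtualHost>
"""

print(duplicate_server_names(SAMPLE))
```

On the sample text this reports blog.example.org as the one repeated name.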

Mongrel, Apache, and Rails

Posted on October 28, 2007 by Michael

When I first started running Rails applications on my web server, I chose to use FastCGI. Specifically, the mod_fcgid module, which had some features I wanted. It also has the unfortunate by-product of corrupting Apache’s memory. Bad news.

I’ve since removed FastCGI entirely and moved to a proxy to mongrel_cluster setup. And I’ve started deploying with Capistrano.

Capistrano

I have a certain amount of concern about moving to a deployment system I know very little about. Just like a new backup system, I feel like I’m handing the keys to my data over to something not written by me. And, while it is fairly simple to set up, Capistrano is somewhat complicated internally.

I already push out my operating system upgrades in an automated way. I compile NetBSD on one machine here at home, and push the binaries out to all the machines I have which run NetBSD. This means about 7 machines rsync from the build box with one command. This can be scary, but I’ve been doing it for 5 years now, and it just works. How can a web site be scary compared to kernels and system binaries?

The answer is, it’s not. If something breaks it is fairly easy to manually reconfigure if I need to. So, I’ve relaxed a bit. My concerns are still there, and I’m keeping a careful watch on how Capistrano runs each time I deploy. I have yet to do a real deployment after all! So far, I’ve not done a single migration, and have not had to roll back. And I’m pushing to a single machine, which runs the database as well as the site.

I suspect that, as I become comfortable with this new method to update my web sites, I’ll start thinking of it as rsync++. It really is that simple.

mongrel_cluster

Mongrel is a very amazing little widget. Sure, it’s slower than Apache, but that’s ok. Mongrel is still far, far faster than restarting Rails for each web hit, and far more reliable than mod_fcgid.

In my configuration, I run each site on ports 10000, 10010, 10020, etc. with up to 3 servers per. This means application #1 is on 10000 through 10002, with room to grow should I need to run more. If I find myself running more than 10 servers for a site it needs a new machine anyway, or more machines. And if that happens, I hope I’ll have a budget.
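The numbering scheme is simple enough to write down as a few lines of Python. This is only a sketch of the arithmetic described above; the function name and the idea of a zero-based site index are my own for illustration.

```python
BASE_PORT = 10000       # first port of the first site
PORTS_PER_SITE = 10     # each site owns a block of ten consecutive ports


def mongrel_ports(site_index, servers=3):
    """Return the ports used by one site's Mongrel processes.

    site_index is 0 for the first site, 1 for the second, and so on.
    """
    if site_index < 0:
        raise ValueError("site_index must not be negative")
    if not 1 <= servers <= PORTS_PER_SITE:
        raise ValueError("a site's block only has room for 1 to 10 servers")
    first = BASE_PORT + site_index * PORTS_PER_SITE
    return list(range(first, first + servers))


print(mongrel_ports(0))   # first site
print(mongrel_ports(1))   # second site
```

With three servers each, the first site gets 10000 through 10002 and the second gets 10010 through 10012, leaving seven spare ports in every block.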

Apache load balancing

This is a new feature in Apache 2.1, and apparently is very reliable with Apache 2.2. This is currently my favorite way to run a web site.

My configuration, which happens to be for this site:

  <Proxy balancer://blog>
    BalancerMember http://localhost:10010
    BalancerMember http://localhost:10011
    BalancerMember http://localhost:10012
  </Proxy>
  <VirtualHost blog.flame.org:80>
    DocumentRoot /www/blog/flame-blog/current/public
    <Directory "/www/blog/flame-blog/current/public">
      Options FollowSymLinks
      AllowOverride None
      Order allow,deny
      Allow from all
    </Directory>
    ProxyRequests Off
    <Proxy *>
      Order deny,allow
      Allow from all
    </Proxy>
    RewriteEngine On
    # Check for maintenance file. Let Apache serve it if it exists.
    RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
    RewriteRule . /system/maintenance.html [L]
    # Send the index page to the balancer rather than looking for a static file.
    RewriteRule ^/$ balancer://blog%{REQUEST_URI} [L,P,QSA]
    # Let Apache serve static files; send everything that is not an
    # existing file (!-f) through mod_proxy.
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
    RewriteRule .* balancer://blog%{REQUEST_URI} [L,P,QSA]
  </VirtualHost>

It is important, at least on my host, to use localhost in the balancer destinations. This is due to mongrel suddenly running on IPv6 loopback (::1) rather than the usual IPv4 loopback (127.0.0.1). I don’t know why this happened, but the localhost trick makes Apache try both addresses, and whichever works it will use.
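The idea behind the localhost trick can be sketched in Python. This is an illustration of “try every address the name resolves to and use the first that answers”, not a description of how Apache’s proxy code is actually written; the function names are made up for the example.

```python
import socket


def candidate_addresses(host, port):
    """Return the distinct IP addresses the resolver gives for host, in order."""
    addresses = []
    for _family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def first_reachable(addresses, port, connect=socket.create_connection, timeout=1.0):
    """Return the first address that accepts a TCP connection, or None."""
    for address in addresses:
        try:
            connection = connect((address, port), timeout)
        except OSError:
            continue  # nothing listening on this address; try the next one
        connection.close()
        return address
    return None
```

Calling first_reachable(candidate_addresses("localhost", 10010), 10010) would return ::1 if Mongrel is only listening on the IPv6 loopback, and 127.0.0.1 if it is only on the IPv4 one, assuming localhost resolves to both on that machine.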

This configuration makes Apache serve static content, and sends all other requests off to one of the Mongrel processes.