2010-10-04

Stuxnet & Virtualization-Targeted DoS

Stuxnet, for those of you who have not been following one of the most talked-about worms since Conficker, is an industrial rootkit targeted at Siemens software.

Now conspiracy theories aside:

  • Who was this targeted at?
  • Who created it?

I started to think about how this could reflect on a virtual infrastructure.

What would happen if you had a worm or virus targeted at virtual machines, or even worse, at the ESX hosts themselves?

Let me give the following scenario.

You have 30 Windows VMs, each using 1 vCPU, on your ESX host. The host has 2 quad-core processors (8 cores), which gives a pretty sensible vCPU:core ratio of 3.75. All of a sudden every single VM on the host jumps to 100% CPU utilization, which in turn brings your host to a stall.
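As a rough sanity check on those numbers, here is a minimal sketch of the arithmetic. It assumes every vCPU demands a full core at the same time and ignores hypervisor overhead, hyper-threading and scheduler behaviour; the function names are made up for illustration.

```python
def vcpu_to_core_ratio(vm_count, vcpus_per_vm, sockets, cores_per_socket):
    """Return the number of vCPUs scheduled per physical core."""
    physical_cores = sockets * cores_per_socket
    return (vm_count * vcpus_per_vm) / physical_cores


def max_core_share_per_vcpu(vm_count, vcpus_per_vm, sockets, cores_per_socket):
    """Upper bound on the fraction of a core each vCPU gets if all are busy."""
    ratio = vcpu_to_core_ratio(vm_count, vcpus_per_vm, sockets, cores_per_socket)
    return min(1.0, 1.0 / ratio)


# The scenario from the post: 30 single-vCPU VMs on 2 quad-core sockets.
ratio = vcpu_to_core_ratio(30, 1, 2, 4)
share = max_core_share_per_vcpu(30, 1, 2, 4)
print(f"vCPU:core ratio = {ratio:.2f}")                        # 3.75
print(f"best-case share per busy vCPU = {share:.1%} of a core")  # 26.7%
```

In other words, once all 30 VMs are pegged, each one can get at best roughly a quarter of a core, and that is before the host needs any CPU for itself.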

This is a perfect DoS attack, but in this case it is directed at the ESX host by causing load on the VMs.

This was also an issue before virtualization, but with all of the VMs spiking at the same time on shared hardware, the issue is greatly amplified.

The thing is, though, it does not have to be a virus with malicious code that causes the load. You could just as easily do it with calc.exe (which is on EVERY SINGLE WINDOWS MACHINE):

  1. Run calc.exe and switch to scientific mode
  2. Type a large number (e.g. 12345678901234567890) and press the 'n!' button.
  3. Calc will warn that this will take a very long time and ask you to confirm
  4. 100% CPU utilization will now occur (essentially forever)

For a Linux machine:

  1. cd /usr/src/linux
  2. make -j8 &
  3. ping -l 100000 -q -s 10 -f localhost &
  4. 100% CPU utilization will now occur (the flood ping runs until it is killed; the kernel build stops when compilation completes)

I am sure that this can be scripted in some way.
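As an illustration of how little scripting it takes, here is a minimal sketch of a CPU load generator. It is deliberately time-limited and only keeps one core busy, so it is suitable for checking how your alarms react in a lab VM rather than anything else. The function name `burn_cpu` is my own and not part of any tool.

```python
import time


def burn_cpu(seconds):
    """Busy-loop on one core for roughly `seconds`; return the loop count."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1  # pointless work, just to keep the core busy
    return iterations


if __name__ == "__main__":
    # Keep one core at ~100% for one second, then stop on its own.
    count = burn_cpu(1.0)
    print(f"completed {count} loop iterations")
```

Nothing here looks malicious to a scanner: it is a loop and a clock, which is exactly the point.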

So there you have it: a DoS attack not only on the VM, but also on the host itself.

I am not sure that the AV vendors will recognize this as malicious activity, because you are running a legitimate Windows application.

Security experts are wary of the dreaded Blue Pill, which compromises the hypervisor. How ready are the vendors for this kind of attack? How ready are the hypervisors?

You will get alerts about the abnormal CPU usage fairly quickly, but it will require manual intervention to solve the issue.

I would hope that in the future there will be an option to respond automatically to an attack like this. It would require some additional intelligence built into the product.

An example would be:

  • Virtual machine A's CPU usage spikes to 100% for more than 5 minutes.
  • The CPU usage of virtual machines B, C, D, E, F, G, H, I, J, K, L, M… also spikes to 100% for more than 5 minutes.
  • Evidently there is something wrong!!!
  • Create a resource pool with a CPU limit (to ensure proper operation of the host)
  • Move all these VMs into the above resource pool, thereby reducing the load on the host
  • Send out appropriate alerts.
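The steps above could be sketched roughly as follows. This is only the decision logic, not an implementation against any real management API. The sample format, the 95% threshold (real samples rarely read exactly 100), the minimum of two spiking VMs and the 50% pool limit are all assumptions of mine; only the 5-minute window comes from the list.

```python
SPIKE_PCT = 95.0        # assumption: treat >= 95% as "pegged"
SPIKE_SECONDS = 5 * 60  # "more than 5 minutes", from the list above
MIN_SPIKING_VMS = 2     # assumption: one busy VM alone is not an attack


def spike_duration(samples, threshold=SPIKE_PCT):
    """Seconds covered by the trailing run of samples at or above threshold.

    `samples` is a list of (timestamp_seconds, cpu_percent), oldest first.
    """
    run_start = None
    for timestamp, cpu in reversed(samples):
        if cpu < threshold:
            break
        run_start = timestamp
    if run_start is None:
        return 0.0
    return samples[-1][0] - run_start


def spiking_vms(samples_by_vm, threshold=SPIKE_PCT, min_seconds=SPIKE_SECONDS):
    """Names of VMs whose CPU has been pegged for longer than min_seconds."""
    return sorted(
        name
        for name, samples in samples_by_vm.items()
        if spike_duration(samples, threshold) > min_seconds
    )


def plan_response(samples_by_vm, host_mhz, limit_fraction=0.5):
    """Return a response plan, or None if too few VMs are spiking."""
    vms = spiking_vms(samples_by_vm)
    if len(vms) < MIN_SPIKING_VMS:
        return None
    return {
        "pool_cpu_limit_mhz": int(host_mhz * limit_fraction),
        "vms_to_move": vms,
        "alert": f"{len(vms)} VMs above {SPIKE_PCT:.0f}% CPU "
                 f"for over {SPIKE_SECONDS // 60} minutes",
    }


# One sample per minute for 7 minutes; host assumed to be 8 cores x 2400 MHz.
busy = [(t, 100.0) for t in range(0, 421, 60)]
idle = [(t, 20.0) for t in range(0, 421, 60)]
print(plan_response({"A": busy, "B": busy, "C": idle}, host_mhz=8 * 2400))
```

Actually creating the resource pool, moving the VMs and sending the alert would go through whatever management interface your platform offers; that part is left out here on purpose.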

Of course, the automation possibilities are more or less limitless once you have the proper technology in place.

I would be interested in hearing your ideas and comments on this one.