Fun with VMWare and the Blue Screen of Death

Yesterday, I had an unbelievable amount of “fun” with VMWare – just what was needed for a Sunday…

This week I’m starting a project which will see me working with a group of colleagues on the other side of the Atlantic. It’s all Message Broker-related, and I just happen to have a Windows XP VMWare image with WMB v6 installed. I’d recently installed VMWare 5.5.2 on my host system, and I wanted to make sure that I had all the right levels of the middleware installed in the image.

When I first started the VM, there were a number of warnings, including one about the processor speed not being detected correctly, and a cryptic one about my laptop’s BIOS not reporting the correct NUMA settings. I went ahead and booted the virtual machine, only to find that a minute or so after logging on, it would throw a blue screen and reboot itself.

I remembered that I’d recently changed a BIOS setting to do with Intel processor virtualisation, so I went back and changed that again. No luck. I tried updating the BIOS via Lenovo System Update. Again, no change.

The VMWare knowledge base was a mine of information, although sadly the two KB articles referred to in the original warning popup didn’t seem to clarify what the problem actually was… eventually, I updated the config.ini file for VMWare as follows:

host.cpukHz = 2160000
host.noTSC = TRUE
ptsc.noTSC = TRUE
host.TSC.noForceSync = TRUE
processor0.use = TRUE
processor1.use = FALSE

If I understand the 30 or 40 Google and VMWare knowledge base hits I went through, these settings should fix the CPU speed to match the host machine; get the host and guest CPU clocks to be synchronised; and force the VM to only use the first core of the CPU.
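As a rough illustration of the arithmetic behind host.cpukHz and the single-core toggle, here is a minimal Python sketch. The function names and the use_second_core flag are my own inventions for illustration, not anything VMWare ships, and it only covers the clock-speed and processorN.use lines shown above, not the TSC-related ones.

```python
def ghz_to_khz(ghz: float) -> int:
    """Convert a clock speed in GHz to the kHz figure that host.cpukHz expects."""
    return round(ghz * 1_000_000)


def render_host_settings(cpu_ghz: float, use_second_core: bool = False) -> str:
    """Build config.ini lines for a dual-core host (hypothetical helper).

    Only emits the clock-speed line and the two processorN.use lines
    from the post; everything else in config.ini is left alone.
    """
    lines = [
        f"host.cpukHz = {ghz_to_khz(cpu_ghz)}",
        "processor0.use = TRUE",
        f"processor1.use = {'TRUE' if use_second_core else 'FALSE'}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # A 2.16 GHz T60p pinned to its first core, as in the post.
    print(render_host_settings(2.16))
```

Pasting the printed lines into config.ini by hand is probably safer than having a script rewrite the file, since a mistake there affects every virtual machine on the host.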

Unfortunately, even after incrementally adding these settings, the VM would continue to blue screen and reboot itself. I kept trying to change the guest's setting that makes it restart automatically on a Windows STOP error, but I never seemed to be able to apply the change before the next reboot. In the end, I booted the VM into safe mode and changed the setting there.

Even when I did manage to read the STOP error, I couldn’t find anything definitive which explained what was causing it, but I did find something about Windows XP SP2 and Data Execution Prevention (DEP) causing problems with some drivers. After another reboot into safe mode, I disabled DEP for the guest, and that seems to have resolved the issue – no more repeated STOP errors.

I’d love to know what recent change caused this. It could have been one of several, because I’m still in the process of getting the T60p set up and things change rapidly! My best guess is that it was the upgrade from VMWare 5.5.1 to 5.5.2, because I’m reasonably sure that the image was working fine before that (and even after resetting the multicore settings in the BIOS, the problem persisted). Now I’m going to try rolling back a few of those hard-coded settings in the config.ini file, because I don’t think I always want to restrict my virtual machines to a single core.

Well, at least it is working, and now I know the BIOS is up-to-date!


6 responses to “Fun with VMWare and the Blue Screen of Death”

  1. There’s a Wiki note about this internally. For the benefit of folks who cannot get to it, here are the relevant bits (I hope the HTML tags work):

    Note for T60 VMWare Users:

    I came across an issue using VMWare WorkStation (5.5) and thought I would share my resolution with you, since it is related specifically to the T60 and/or Duo processor. I have various images that I created on my older T40 laptop, but when attempting to start them on the T60 I would receive a few errors from VMWare. After reviewing the VMWare knowledgebase and following the instructions I ended up with this resolution: (all of these steps are for the host OS, no changes are needed for any of the virtual machines or their files)

    1. Edit the C:\Documents and Settings\All Users\Application Data\VMware\VMware Workstation\config.ini file:

    * Add the following line at the end of this file:
          host.TSC.noForceSync = TRUE
    * Edit the following line to match your CPU speed (my T60 was 2.16 GHz):
          host.cpukHz = 2160000

    2. Update the T60 BIOS to the latest version ( I used 79ET62WW (1.07) )

    3. Leave the guest OS settings configured with the number of virtual processors set to 1. According to VMWare, using more than one virtual processor does not increase performance with hyperthreaded uniprocessors, and it will cause issues when attempting to start the guest OS on another host with just one processor without hyperthreading.

    Steven Koehler

    Additional note for T60 users

    I was having problems with certain images which run fine on T40 machines but rebooted after two or three minutes on my T60. I was able to get one of these images to run by turning off a BIOS feature called Memory Protection. To make this setting, cold-boot the T60 and press the blue ThinkVantage button to access the BIOS. Select Security, then Memory Protection. Set to “disabled” and reboot.

    Memory Protection prevents certain types of vulnerabilities by not allowing code to execute in areas of memory marked as data. With the correct hardware and software support this feature eliminates buffer overrun and similar vulnerabilities at the hardware level and so is a Good Thing. You may want to consider turning this feature back on when not using affected VMWare images.

    — T.Rob

  2. OK, interesting, wish I’d seen that before. Looks like I’ve basically done all of those things, but instead of turning off Memory Protection in the BIOS, I turned it off within the VM by disabling Windows DEP.

  3. Aviv Greenberg

    Thanks for the info.
    Did you set the VT CPU feature in the T60 BIOS? The default is “disabled” and I wonder if VMWare can use VT if it is actually enabled…

  4. Aviv, I’m having to remember back 6 months here, but yes, I think I did. In fact, I am reasonably sure that was what started all the issues. I can’t tell you whether I have it set right now.

  5. I was getting blue screen issues in 6.0 (but not 5.5) but after upgrading to 6.0.1 they are all gone.

  6. This is really all you need

    This line is only for notebooks with variable CPU speed.
    host.cpukHz = 2160000

    For the unsynchronized TSCs error for CPUs, skip all of the above ’cause THIS LINE is all you need.
    processor0.use = TRUE
    processor1.use = FALSE

    You can go through many sites and see this same configuration, with or without the host.cpukHz variable; all of it is copied and pasted from what you see on one site and posted on another. The thing about it is that the copy-and-paste artist is not a VMWare technician and has no clue what any of those settings mean, when all you need is those two lines above.
