All of our service nodes (about 40 hosts) are hosted on a VMware cluster system.
Three high-spec physical machines host all of our VMs at UM, and a similar configuration runs at the MSU site. Even with one physical machine offline, we can still run safely on the other two. Shared iSCSI backend storage enables this redundancy as well as dynamic live VM migration.
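As a rough sketch of the failover arithmetic this implies (the host capacities and VM load below are hypothetical placeholders, not our actual specifications), the check is simply that the remaining hosts can absorb the full VM load when any one machine goes offline:

    def survives_host_loss(host_capacities, total_vm_load):
        """Check that the cluster can carry its full VM load with any
        one physical host offline.

        host_capacities: per-host capacity in arbitrary units (e.g. GB RAM).
        total_vm_load:   summed demand of all VMs, in the same units.
        """
        for down in range(len(host_capacities)):
            remaining = sum(c for i, c in enumerate(host_capacities) if i != down)
            if remaining < total_vm_load:
                return False
        return True

    # Hypothetical numbers: three hosts of 256 units each, VMs using 400 total.
    # Any two hosts provide 512 units, so losing one machine is survivable.
    print(survives_host_loss([256, 256, 256], 400))  # True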
All VM infrastructure is redundantly connected with multiple 10Gbps links through two separate network switches. This setup has proven very reliable, and in combination with our provisioning and configuration management systems it lets us bring up new services quickly and easily as needed.
More details about the hardware in this system are in our Storage and Networking sections.

Work on compute node virtualization is ongoing. Currently AGLT2 does not virtualize any compute nodes or offer virtual-machine-capable job slots. The best information on compute node virtualization in ATLAS can be found at the CernVM project.
All ATLAS software kits are currently distributed via CernVM-FS, an HTTP-based network filesystem with aggressive local caching. It further benefits from Squid caching at compute sites and geographically distributed mirrors.
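For illustration, the read-through caching pattern CernVM-FS relies on can be sketched in a few lines of Python. The cache directory and URL handling here are simplified stand-ins, not the actual CVMFS client internals:

    import hashlib
    import os
    import urllib.request

    CACHE_DIR = "/var/cache/demo-cache"  # illustrative path, not the real CVMFS cache

    def cached_fetch(url):
        """Fetch a file over HTTP, serving repeat reads from local disk.

        This mirrors the read-through pattern CernVM-FS uses: because its
        content is immutable and content-addressed, a cache hit never
        needs revalidation against the server.
        """
        key = hashlib.sha1(url.encode()).hexdigest()
        path = os.path.join(CACHE_DIR, key)
        if os.path.exists(path):                   # local cache hit: no network I/O
            with open(path, "rb") as f:
                return f.read()
        with urllib.request.urlopen(url) as resp:  # miss: fetch once over HTTP
            data = resp.read()
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(path, "wb") as f:                # populate the cache for next time
            f.write(data)
        return data

In the real deployment the same idea is stacked: the client's local disk cache sits in front of a site Squid proxy, which in turn sits in front of the geographically distributed mirror servers.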