This article was written for the StarWind blog and focuses on the possible security threats in a virtual environment. See also the original post.
Security is typically a hot topic, partly because of the many regulations, compliance rules, and laws that apply. More importantly, a security breach can have huge collateral effects even if no data has been stolen or compromised. For example, a “simple” DoS attack that makes a service unavailable can damage the reputation of a B2C company.
This post tries to give an idea of some possible security threats in a virtual environment based on VMware vSphere (although several concepts also apply to other virtualization platforms) and of some possible approaches to prevent the attacks or minimize their effect.
Virtual environment structure
A virtual environment is built with different layers that basically can be summarized and simplified in the following schema:
Some parts can have multiple layers as well: for example, a VM is itself structured in a hardware layer, an Operating System (OS) layer, and an application layer. The layers could also be more granular, for example by showing the shared library layer.
Note: In the following examples I’m considering a system virtualization solution. With other solutions (for example containers) the layers can be different, but a layered structure still exists.
A security attack can target any of those layers. For this reason, security must be applied in depth on each layer, considering all the possible threats and trying to minimize or mitigate the risks. You have to protect each layer, but also consider possible threats and attack patterns that move from one layer to the adjacent layers.
Theoretically speaking, virtualization can improve security because one of the main pillars of system virtualization is the VM isolation property, which protects VMs from other VMs and also protects the host layer from possible VM attacks.
Following VMware’s vision, the five pillars of cyber hygiene are:
From: https://twitter.com/vmware/status/902215509604577280
- Least privilege: this is the common and most reasonable approach, and it applies to user accounts, service accounts, and services in general (for example, the ports they use).
- Micro-segmentation: with NSX (or other network virtualization solutions) it is possible to implement security policies at the VM layer with the desired granularity. Inside the VM there are other specific solutions, like AppDefense.
- Encryption: data must be protected at each layer, both at rest and in motion.
- Multi-Factor Authentication: authentication is usually the weakest part, mostly because of passwords that are too simple (or that are not changed periodically).
- Patching: keeping your software components up to date is crucial for security, but it’s also very important for getting new features.
For more information see also: https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/cloud/vmware-core-principles-cyber-hygiene-white-paper.pdf
Practically speaking, we can use the general term hardening to group several types of security approaches.
Hardening is the process of securing a system, a service, or an entire infrastructure, by reducing the attack surface and minimizing the possible vulnerabilities.
VMware has built some Security Hardening Guides (https://www.vmware.com/security/hardening-guides.html) to provide prescriptive guidance for customers on how to deploy and operate VMware products in a secure manner.
Note: the vSphere Security Configuration Guide isn’t a “compliance” tool: it can be used to reach compliance, but it’s not automatically enforced. It’s mostly a set of guidelines that attempt to explain security risks, and there can be more than one solution to mitigate them. Also, those guidelines may or may not apply to specific customer cases.
Host layer
There are different possible techniques, depending of course on the virtualization layer:
- Limit (administrative) user access and manage a proper authentication, authorization, and accounting policy. For example, vSphere has Lockdown mode, which is useful to limit direct access to an ESXi host.
- Limit services and network connections in order to minimize the attack surface. Usually, this is done by running only the essential services with the fewest privileges and by also enforcing local firewall rules. ESXi is well designed for this task, but a Windows Server Core (or a Nano Server) with the Hyper-V role could also match this approach.
- Use only secure connections and avoid weak SSL ciphers. Clear-text network communications are no longer acceptable, both for data integrity and confidentiality and because they offer little ability to authenticate the endpoints of the communication itself.
- Segregate management functions from service functions. For example, in ESXi all the management interfaces can be segregated in a limited and protected network.
- Keep your host properly patched in all its components, including hardware BIOS and firmware. In ESXi, for the software part, all packages (VIBs) can be signed to enforce strict verification.
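As a minimal sketch of the “secure connections only” point, this is how a management script could build a TLS client context that refuses legacy protocol versions and always authenticates the remote endpoint. The TLS 1.2 floor is an assumed policy for the example, not a value taken from the hardening guides.

```python
import ssl


def make_strict_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses legacy protocols."""
    # create_default_context() turns on certificate validation and
    # hostname checking, so the remote endpoint is authenticated.
    ctx = ssl.create_default_context()
    # Assumed policy for this sketch: nothing older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_strict_tls_context()
print(ctx.minimum_version, ctx.verify_mode, ctx.check_hostname)
```

The same context can then be passed to any standard library client that accepts an `ssl.SSLContext`.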
Host integrity
Guaranteeing the integrity of each host is not easy: for network connections, certificates and SSL can be used, but what about the software part?
In vSphere, ESXi can verify VIBs through their signature: an unsigned VIB can contain untested code and be a possible threat to an ESXi host.
You can change the acceptance level for each host in the Configure, System, Security Profile menu, under the “Host Image Profile Acceptance Level” entry.
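The policy behind the acceptance level can be sketched in a few lines. This models only the ordering rule (a VIB must be at least as trusted as the host requires); the real host also verifies the VIB signature itself, which is not shown here. The enum and function names are illustrative and are not part of any VMware API.

```python
from enum import IntEnum


class AcceptanceLevel(IntEnum):
    """ESXi acceptance levels, ordered from least to most restrictive."""
    COMMUNITY_SUPPORTED = 0
    PARTNER_SUPPORTED = 1
    VMWARE_ACCEPTED = 2
    VMWARE_CERTIFIED = 3


def can_install(host_level: AcceptanceLevel, vib_level: AcceptanceLevel) -> bool:
    """A VIB is accepted only if its level is at least the host's level."""
    return vib_level >= host_level


# A host set to PartnerSupported refuses community (unsigned) VIBs.
print(can_install(AcceptanceLevel.PARTNER_SUPPORTED,
                  AcceptanceLevel.COMMUNITY_SUPPORTED))
```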
But this is not enough to enforce the integrity of the software part. Starting with vSphere 6.5 it’s possible to use UEFI, TPM 2.0 and Secure Boot to validate the digital signature of the ESXi components and the boot loader, ensuring that only a properly signed system will boot.
For more information see: https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.security.doc/GUID-5D5EE0D1-2596-43D7-95C8-0B29733191D9.html
Host threats
Let’s assume that all the infrastructure layers have been “secured” with proper hardening. Are there still possible threats to the host layer? Is the isolation principle respected?
Reality demonstrates that there are possible attack patterns from the VM layer to the host layer, exploiting bugs or vulnerabilities at the software or hardware level.
Meltdown and Spectre (https://vinfrastructure.it/2018/01/meltdown-spectre-critical-vulnerabilities-several-processors/) are good examples of hardware vulnerabilities that can reduce the security level of a virtual environment. And it’s only the beginning: new threats and new variants have been discovered (https://vinfrastructure.it/2018/02/new-meltdown-spectre-variants/), making the hardware level quite critical if it is not kept updated.
But at the hardware level we also have out-of-band management interfaces like iLO or iDRAC (or, generically, IPMI interfaces) that give highly privileged access to the hardware layer. Of course, those interfaces must be confined in a secure and limited network perimeter and protected with good authentication and authorization rules.
And there are new types of hardware, for example Persistent Memory, that can bring new potential threats that must be considered in a security analysis (for example, see this SNIA document: Carlson%2C_Mark_Persistent_Memory_Security.pdf).
As for the software part, a hypervisor should isolate VMs from other VMs and protect itself from VM attacks, but again reality is a little different (https://vinfrastructure.it/2017/04/vm-really-isolated/) and there are documented exploits for several types of hypervisors. Anyway, most of those exploits have been fixed by software patches or are very hard to use in the real world and on server hypervisors.
Network layer
The isolation property doesn’t extend to the network layer, which remains potentially weak from the security point of view. For this reason, virtual switches usually have some specific security features. In vSphere the standard and distributed virtual switches have different capabilities, but to best protect the network layer NSX could be a better security option, with its micro-segmentation approach.
The physical part of the network layer (the physical switches, routers, firewalls, …) must also be properly protected. Usually, all the management interfaces are segregated in a limited and protected network.
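To make “limited and protected network” a bit more concrete, here is a minimal sketch of the kind of source-address check a firewall rule or a management service could apply. The subnet is a hypothetical value chosen for the example, and in practice this check belongs in the network devices rather than in a script.

```python
import ipaddress

# Hypothetical management subnet; replace with the real one.
MGMT_NETWORK = ipaddress.ip_network("10.10.0.0/24")


def is_allowed_mgmt_client(client_ip: str) -> bool:
    """Return True only for clients inside the management subnet."""
    return ipaddress.ip_address(client_ip) in MGMT_NETWORK


print(is_allowed_mgmt_client("10.10.0.25"))
print(is_allowed_mgmt_client("192.168.1.5"))
```

Keep in mind that a source address alone can be spoofed, so a check like this complements authentication and does not replace it.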
Also, some specific protocols must be blocked or properly authorized to avoid networking issues: for example, if Spanning Tree Protocol is used, it could be really important to have a BPDU filter at the physical switch level or virtual switch level to avoid network topology changes. Hyper-V has a native BPDU block policy; on ESXi you must follow KB 2047822: Understanding the BPDU Filter feature in vSphere (https://kb.vmware.com/kb/2047822).
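Conceptually, a BPDU filter just recognizes spanning tree frames and drops them before they reach the uplink. The sketch below shows that idea on raw Ethernet frames, matching only the standard IEEE 802.1D destination address; vendor-specific variants that use other multicast addresses are not handled, and the function names are illustrative.

```python
# Destination MAC used by IEEE 802.1D spanning tree BPDUs.
STP_BPDU_DST = bytes.fromhex("0180c2000000")


def is_bpdu(frame: bytes) -> bool:
    """Return True if the frame is addressed to the STP multicast MAC."""
    return len(frame) >= 6 and frame[:6] == STP_BPDU_DST


def filter_frames(frames):
    """Drop BPDUs and pass every other frame through unchanged."""
    return [f for f in frames if not is_bpdu(f)]


# Two synthetic frames: destination MAC, source MAC, 2-byte type/length.
bpdu = STP_BPDU_DST + bytes(6) + b"\x00\x26"
broadcast = bytes.fromhex("ffffffffffff") + bytes(6) + b"\x08\x00"
print(len(filter_frames([bpdu, broadcast])))
```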
Another protocol example is DHCP, where you want to avoid DHCP spoofing. Again, Hyper-V has native protection; on vSphere you need to add NSX to have this kind of protection.
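The usual defense against a rogue DHCP server is to accept server replies only from ports where a legitimate server is expected. The following sketch shows that rule in isolation; the packet model and port names are hypothetical and do not describe how Hyper-V or NSX implement it.

```python
from dataclasses import dataclass

DHCP_SERVER_PORT = 67  # UDP source port used by DHCP server replies


@dataclass(frozen=True)
class UdpPacket:
    ingress_port: str  # switch port the packet arrived on (hypothetical name)
    src_udp_port: int


def allow_packet(pkt: UdpPacket, trusted_ports: set) -> bool:
    """Block DHCP server replies unless they arrive on a trusted port."""
    if pkt.src_udp_port == DHCP_SERVER_PORT:
        return pkt.ingress_port in trusted_ports
    return True


trusted = {"uplink1"}
print(allow_packet(UdpPacket("vm-port-7", 67), trusted))
```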
And what about new hardware and new protocols? For example, an RDMA card permits direct memory-to-memory access… could it be used to carry out an attack? Some studies suggest that there are possible threats using RDMA (https://www.cs.vu.nl/~herbertb/download/papers/throwhammer_atc18.pdf), and again keeping the firmware and the drivers updated will be quite critical for security. It also helps to limit this kind of traffic to isolated and segregated networks.
Also note that VLANs are commonly considered a secure way to protect and isolate different logical networks, but there are some possible threats to VLANs (http://www.pages02.net/coastams-networkbox/newsletter/newsletter_oct2013_article1.html).
Storage layer
The storage layer could be quite critical because data resides on it; for this reason, integrity and confidentiality must be considered.
Data confidentiality must be guaranteed, and data access must be protected and authorized. Usually, this is implemented with storage area networks (SAN) and/or with ACLs, but that’s not always enough: sometimes the SAN is not an isolated network (for example, iSCSI or FCoE can use shared physical switches) and ACLs may be weak (for example, based only on addresses that can be spoofed). Strong authentication could be a way to improve the security, but there are also other options.
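To show why a shared secret is harder to spoof than an address, here is a sketch of a challenge-response exchange. It assumes CHAP (RFC 1994), which is one authentication option commonly offered for iSCSI; the text above does not name a specific mechanism, so take this as one possible illustration.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """CHAP response per RFC 1994: MD5(identifier || secret || challenge)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def verify(identifier: int, secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Target side: recompute the response and compare in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


# The target sends a fresh random challenge; the initiator answers with a
# digest that proves knowledge of the secret without sending the secret.
challenge = os.urandom(16)
response = chap_response(1, b"example-secret", challenge)
print(verify(1, b"example-secret", challenge, response))
```

Because the challenge changes on every login, a captured response cannot simply be replayed, unlike a spoofed initiator address.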
Data integrity could seem more obvious: you might think that it’s only a matter of good storage design with enough data redundancy… but reality demonstrates that the software level can be critical, and some storage vendors have had issues in ensuring data integrity. More recently, in vSphere there is an interesting VVOLs integrity issue described by KB 55800 (https://kb.vmware.com/s/article/55800) where, after a vMotion, you can have corrupted incremental backups (CBT), performance degradation (VFRC and Cache IOFilters), corrupted replicas (Replication IOFilters) and disk corruption (VM Encryption and Cache IOFilters configured in write-back mode). Not nice at all. The fix will be available in future patches for all VMware vSphere 6.0.x, 6.5.x and 6.7.x releases.
And what about HCI? This is an interesting trend that converges at least the compute and the storage parts. The storage becomes a VSA (Virtual Storage Appliance) or can be integrated into the hypervisor kernel. But of course, it opens new and more specific potential threats.
Finally, all storage systems have some kind of management interface with highly privileged access to the storage and its data. Most are based on web services that can be weak (sometimes implemented with old and buggy versions of Apache, Tomcat, …); for this reason, using segregated, limited and protected networks is mandatory.
Management layer
The management layer could be the most critical, because from the management interfaces you can access your entire infrastructure. There can be GUI interfaces, but also CLI, API, and REST API interfaces, and all of them must be managed properly. And it’s not always easy to simply segregate those interfaces, because in some cases (for example in a cloud platform) you need to expose some services to other applications, services, or directly to some users.
Role-Based Access Control (RBAC) is a common approach to managing permissions and authorizations, based on specific roles. VMware vSphere provides different categories of permissions with an RBAC model, and other management platforms can have a similar approach.
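The core of any RBAC model is small: users get roles, roles carry privileges, and a request is authorized only if one of the user's roles grants the privilege. The sketch below uses made-up role and privilege names; they are not the vSphere ones.

```python
# Hypothetical roles and privilege names, for illustration only.
ROLE_PRIVILEGES = {
    "read_only": {"vm.view"},
    "vm_operator": {"vm.view", "vm.power"},
    "administrator": {"vm.view", "vm.power", "vm.delete", "host.configure"},
}


def is_authorized(user_roles, privilege: str) -> bool:
    """A user holds a privilege if any assigned role grants it."""
    return any(privilege in ROLE_PRIVILEGES.get(role, set()) for role in user_roles)


print(is_authorized(["vm_operator"], "vm.power"))
print(is_authorized(["vm_operator"], "vm.delete"))
```

Unknown roles grant nothing, which keeps the check aligned with the least privilege principle.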
Providing strong authentication is another part of the puzzle. Usually, authentication factors fall into different categories: knowledge (something the users know), possession (something they have), and inherence (something they are). Two-factor authentication (2FA) is a type of multi-factor authentication where just two factors are used.
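As an illustration of a “possession” factor, here is a sketch of a time-based one-time password as defined in RFC 6238, the scheme used by many authenticator apps. It is shown only to make the idea concrete and is not a description of any specific vendor's token.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238) using HMAC-SHA1."""
    counter = unix_time // step  # number of time steps since the epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Shared secret and timestamp taken from the RFC 6238 test vectors.
print(totp(b"12345678901234567890", 59, digits=8))
```

The server and the token share the secret and the clock, so the code proves possession of the token and is only valid for one time step.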
Starting with vSphere 6.0 Update 2, it is possible to have two-factor authentication with different approaches. For example, for RSA SecurID, the configuration is well explained in this blog post: https://blogs.vmware.com/vsphere/2017/07/using-vcenter-login-banner-rsa-securid-support.html
Then there is the hardening part of the management layer. In vSphere, by using the virtual appliance version of vCenter Server (VCSA), as VMware also suggests, you can apply the same VM hardening suggestions and also benefit from a hardened operating system.
Note that in most cases the management interfaces could be based on weak or old services (try to scan your VCSA with a security tool and look, for example, at the AutoDeploy service, if it’s enabled). This requires constant software updating (which is really easy with VCSA) and also, where possible and applicable, segregating the management interfaces in a limited and protected network.
Other aspects to consider in the management layer are log management, the alerting system, SSL certificate generation, distribution and validation, and so on.
VM layer
A VM can attack the other layers, as already discussed. Note that it can also directly attack the virtualized resources, for example by over-consuming them in order to exhaust them rapidly. This usually requires good resource management, but you also have to consider that each VM has some implicit limits in its configured resources.
VMs must also be protected from other VMs, but in this case the isolation pillar and specific network and security solutions can minimize those threats. As already written, VM isolation holds in theory, but the real world presents some possible attacks: for example, academic research has demonstrated that it is theoretically possible to leverage Transparent Page Sharing (TPS) to gain unauthorized access to data under certain highly controlled conditions.
For more information see: https://blogs.vmware.com/security/2014/10/transparent-page-sharing-additional-management-capabilities-new-default-settings.html
For this reason, in vSphere 6.x, TPS is disabled across VMs by default, but it still works inside an individual VM. It is still possible to enable it across the entire ESXi host by following KB 2097593: Additional Transparent Page Sharing management capabilities and new default settings (https://kb.vmware.com/kb/2097593).
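The difference between intra-VM and inter-VM sharing can be sketched by treating each page as shareable only with pages that carry the same salt. This is a simplified model that assumes one salt per VM; the SHA-256 hash stands in for whatever ESXi uses internally, and all names are illustrative.

```python
import hashlib
from collections import defaultdict


def shared_page_groups(pages):
    """Group identical pages that are allowed to be shared.

    `pages` is an iterable of (vm_name, salt, page_bytes). Two pages are
    sharing candidates only when both the salt and the content match.
    """
    groups = defaultdict(list)
    for vm_name, salt, content in pages:
        key = (salt, hashlib.sha256(content).hexdigest())
        groups[key].append(vm_name)
    # Keep only the keys that are backed by more than one page.
    return [vms for vms in groups.values() if len(vms) > 1]


zero_page = bytes(4096)
per_vm_salts = [("vm1", "salt-a", zero_page),
                ("vm1", "salt-a", zero_page),
                ("vm2", "salt-b", zero_page)]
print(shared_page_groups(per_vm_salts))
```

With a different salt per VM, identical pages are collapsed only inside the same VM; giving two VMs the same salt would make their pages shareable again.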
But a new trend is now also to protect VMs from the underlying infrastructure: for example, in the case of a public cloud service, the consumer may have some concerns about how the provider manages the security and privacy of the data.
VM hardening is the first step. For vSphere, the hardening guide describes a lot of specific VM options, but starting with ESXi 6.0 Patch 5, many of the VM advanced settings are now “Secure By Default”. This means that the desired values in the Security Configuration Guide are the default values for all new VMs and you don’t have to set them manually anymore. For more information see this blog post: https://blogs.vmware.com/vsphere/2017/06/secure-default-vm-disable-unexposed-features.html.
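Even with secure defaults, it can be useful to check that nobody has explicitly overridden a setting. A minimal audit sketch follows, assuming the VM's advanced settings have already been exported to a dictionary; the two settings listed are only examples, and the real list should come from the Security Configuration Guide for your version.

```python
# Illustrative desired values; take the real list from the Security
# Configuration Guide for your vSphere version.
DESIRED = {
    "isolation.tools.copy.disable": "TRUE",
    "isolation.tools.paste.disable": "TRUE",
}


def audit(vm_settings: dict) -> dict:
    """Return {setting: (actual, desired)} for every deviating setting.

    A missing key is treated as compliant, on the assumption that the
    secure value is the default for that setting.
    """
    return {
        key: (vm_settings[key], want)
        for key, want in DESIRED.items()
        if key in vm_settings and vm_settings[key].upper() != want
    }


print(audit({"isolation.tools.copy.disable": "false"}))
```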
For virtual networking, NSX can provide micro-segmentation to enforce network security directly at the VM virtual NIC level.
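The idea behind micro-segmentation is a default-deny policy evaluated per VM, where only explicitly listed flows are allowed. Here is a sketch with hypothetical security groups for a three-tier application; it does not reflect the NSX rule format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    src_group: str
    dst_group: str
    port: int


# Hypothetical security groups and rules for a three-tier application.
ALLOW_RULES = {
    Rule("web", "app", 8443),
    Rule("app", "db", 5432),
}


def is_allowed(src_group: str, dst_group: str, port: int) -> bool:
    """Default deny: traffic passes only if an explicit allow rule matches."""
    return Rule(src_group, dst_group, port) in ALLOW_RULES


print(is_allowed("web", "app", 8443))
print(is_allowed("web", "db", 5432))
```

In this model a compromised web VM cannot reach the database directly, even if both sit on the same VLAN.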
Also, at VMworld 2017 a new product was announced: VMware AppDefense, a data center endpoint security product that protects applications running in virtualized environments. AppDefense works “inside” the VM (compared to NSX, which works only at the network level): it understands how applications are supposed to work “normally” and monitors for changes to that behavior that indicate a threat.
But there is more to do to protect your VMs from the other layers: possible threats can come from the management layer (or from other layers), where someone can access the VM data and undermine its confidentiality.
Protect the data at rest
There are different possible options to store your data securely in a virtual environment:
- Encryption at the physical storage level using self-encrypting drives (SED), also known as hardware-based full-disk encryption (FDE). OPAL is a set of specifications for self-encrypting drives developed by the Trusted Computing Group. But those types of disks are quite costly and also require controllers or storage systems that support this feature.
- Encryption at the logical storage level: for example, using vSAN encryption, which uses an AES-256 cipher and eliminates the extra cost, limitations, and complexity associated with purchasing and maintaining self-encrypting drives. Note that vSAN datastore encryption is enabled and configured at the datastore level. In other words, every object on the vSAN datastore is encrypted when this feature is enabled.
- Encryption at the VM level: this depends on the hypervisor or on 3rd party products. In vSphere 6.5 (or later) Enterprise Plus edition it is possible to encrypt VM files (including VMDK virtual disks), making the stored data unreadable without the proper privileges or roles.
- Encryption inside the VM: for example, using Microsoft BitLocker, or using a Linux encrypted filesystem (with losetup or other tools).
Protect the data in motion
Protecting the stored data is only one part: you also need to encrypt or otherwise secure the network connections. For the infrastructure part, all communications between vCenter and hosts are usually encrypted.
But some other infrastructure network traffic is usually not encrypted: for example, iSCSI or NFS traffic. The same was true for vMotion until vSphere 6.5, where the encrypted vMotion feature was added. vMotion encryption isn’t simply an encryption of the entire network channel used for vMotion traffic: the encryption happens at a per-VM level, without the need for certificates or external key management (which VM encryption does require).
For more information see: https://blogs.vmware.com/vsphere/2016/10/whats-new-in-vsphere-6-5-security.html
Virtual TPM 2.0 for VMs
A TPM can be an essential tool to store specific security information, as with the secure boot option discussed in the host part.
In vSphere 6.5 it’s already possible to have VM secure boot, but that version does not provide a virtualized TPM device, and this can limit the scope of secure boot or the implementation of other security features. Finally, in vSphere 6.7 it’s also possible to present to the guest OS a virtualized TPM 2.0 device that can be used to perform crypto operations and store credentials. A virtual TPM is also possible in Hyper-V (with Windows Server 2016), but it works in a different way.
In vSphere 6.7 the vTPM data is stored in the VM’s .nvram file and secured with VM Encryption, which must be enabled in order to use vTPM.
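A pre-flight check for these prerequisites can be sketched as follows. The `VmConfig` class is a hypothetical model and not a vSphere API object, and the EFI firmware requirement is an assumption added here that is not stated in the text above.

```python
from dataclasses import dataclass


@dataclass
class VmConfig:  # hypothetical model, not a vSphere API object
    firmware: str    # "efi" or "bios"
    encrypted: bool  # VM Encryption applied to the VM home files


def vtpm_blockers(vm: VmConfig) -> list:
    """List the reasons why a vTPM could not be added to this VM."""
    problems = []
    if vm.firmware != "efi":
        # Assumption of this sketch: vTPM needs EFI firmware.
        problems.append("vTPM requires EFI firmware")
    if not vm.encrypted:
        problems.append("vTPM requires VM Encryption")
    return problems


print(vtpm_blockers(VmConfig(firmware="bios", encrypted=False)))
```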
For more information see: https://blogs.vmware.com/vsphere/2018/04/introducing-vsphere-6-7-security.html