LibFuzzer Workshop Learning Path (Advanced)

LibFuzzer Workshop Learning Path (Part 2)

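The previous post covered the basic principles and usage of libFuzzer. This one moves on to more advanced topics: using dictionaries, minimizing the corpus, generating error reports, and choosing some key compiler options, with the goal of understanding libFuzzer in more depth.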

lesson 08(dictionaries are so effective)

We fuzz libxml2.
First unpack it and build it with clang.

tar xzf libxml2.tgz
cd libxml2

./autogen.sh
export FUZZ_CXXFLAGS="-O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link"
CXX="clang++ $FUZZ_CXXFLAGS" CC="clang $FUZZ_CXXFLAGS" \
    CCLD="clang++ $FUZZ_CXXFLAGS"  ./configure
make -j$(nproc)

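A note on the new compiler option:
-gline-tables-only: emits only line-table debug information (file names and line numbers) instead of full debug info. It is the option the Clang manual asks for when building a binary that will be run under a sampling profiler.
The Clang manual describes sampling profilers as follows: Sampling profilers are used to collect runtime information, such as hardware counters, while your application executes. They are typically very efficient and do not incur a large runtime overhead. The sample data collected by the profiler can be used during compilation to determine what the most executed areas of the code are.
In other words, a sampling profiler collects information such as hardware counters while the program runs, and the collected samples can later be fed back to the compiler to show which parts of the code are executed most often. Using that data requires running the program under a profiler first and then rebuilding. The option by itself does not run a profiler and does not make fuzzing faster; for fuzzing, its value is that sanitizer reports and libFuzzer's NEW_FUNC lines can be symbolized down to file and line at a small cost in binary size.
The harness provided: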

// Copyright 2015 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#include <stddef.h>
#include <stdint.h>

#include "libxml/parser.h"

void ignore (void* ctx, const char* msg, ...) {
  // Error handler to avoid spam of error messages from libxml parser.
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  xmlSetGenericErrorFunc(NULL, &ignore);

  if (auto doc = xmlReadMemory(reinterpret_cast<const char*>(data),
                               static_cast<int>(size), "noname.xml", NULL, 0)) {
    xmlFreeDoc(doc);
  }

  return 0;
}

The harness casts the input buffer to the types xmlReadMemory expects and hands it over. Build it as follows:
clang++ -O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -std=c++11 xml_read_memory_fuzzer.cc -I libxml2/include libxml2/.libs/libxml2.a -fsanitize=fuzzer -lz -o libxml2-v2.9.2-fsanitize_fuzzer1

With this build the fuzzer reaches a respectable execution rate and coverage:

#2481433    NEW    cov: 2018 ft: 9895 corp: 3523/671Kb lim: 1470 exec/s: 2038 rss: 553Mb L: 484/1470 MS: 1 CopyPart-
#2481939    REDUCE cov: 2018 ft: 9895 corp: 3523/671Kb lim: 1470 exec/s: 2037 rss: 553Mb L: 390/1470 MS: 4 InsertByte-ChangeBit-ShuffleBytes-EraseBytes-
#2482177    REDUCE cov: 2018 ft: 9895 corp: 3523/671Kb lim: 1470 exec/s: 2037 rss: 553Mb L: 816/1470 MS: 3 ChangeBit-ShuffleBytes-EraseBytes-
#2482341    REDUCE cov: 2018 ft: 9895 corp: 3523/671Kb lim: 1470 exec/s: 2038 rss: 553Mb L: 41/1470 MS: 2 CopyPart-EraseBytes-
#2482513    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1470 exec/s: 2038 rss: 553Mb L: 604/1470 MS: 3 ChangeASCIIInt-ChangeASCIIInt-CopyPart-
#2482756    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1470 exec/s: 2038 rss: 553Mb L: 342/1470 MS: 2 InsertRepeatedBytes-EraseBytes-
#2483073    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1470 exec/s: 2038 rss: 553Mb L: 1188/1470 MS: 3 InsertByte-ShuffleBytes-EraseBytes-
#2483808    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1470 exec/s: 2037 rss: 553Mb L: 102/1470 MS: 2 InsertRepeatedBytes-EraseBytes-
#2483824    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1470 exec/s: 2037 rss: 553Mb L: 477/1470 MS: 1 EraseBytes-
#2483875    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1470 exec/s: 2037 rss: 553Mb L: 70/1470 MS: 3 CopyPart-ChangeByte-EraseBytes-
#2483999    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1470 exec/s: 2037 rss: 553Mb L: 604/1470 MS: 1 EraseBytes-
#2485065    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1480 exec/s: 2036 rss: 553Mb L: 32/1470 MS: 1 EraseBytes-
#2485100    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1480 exec/s: 2036 rss: 553Mb L: 139/1470 MS: 2 ChangeByte-EraseBytes-
#2485127    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1480 exec/s: 2036 rss: 553Mb L: 622/1470 MS: 1 EraseBytes-
#2485277    REDUCE cov: 2018 ft: 9896 corp: 3524/671Kb lim: 1480 exec/s: 2037 rss: 553Mb L: 93/1470 MS: 1 EraseBytes-
#2485465    REDUCE cov: 2019 ft: 9897 corp: 3525/671Kb lim: 1480 exec/s: 2037 rss: 553Mb L: 40/1470 MS: 1 PersAutoDict- DE: "\x00\x00\x00\x00\x00\x00\x00\x05"-
#2485715    NEW    cov: 2019 ft: 9899 corp: 3526/672Kb lim: 1480 exec/s: 2037 rss: 553Mb L: 1092/1470 MS: 3 ChangeBit-CopyPart-CopyPart-
#2485805    REDUCE cov: 2019 ft: 9899 corp: 3526/672Kb lim: 1480 exec/s: 2037 rss: 553Mb L: 25/1470 MS: 2 ShuffleBytes-EraseBytes-
#2486420    REDUCE cov: 2019 ft: 9899 corp: 3526/672Kb lim: 1480 exec/s: 2036 rss: 553Mb L: 336/1470 MS: 2 InsertByte-EraseBytes-
#2486677    REDUCE cov: 2019 ft: 9899 corp: 3526/672Kb lim: 1480 exec/s: 2036 rss: 553Mb L: 33/1470 MS: 2 ChangeBit-EraseBytes-
#2486836    REDUCE cov: 2019 ft: 9899 corp: 3526/672Kb lim: 1480 exec/s: 2036 rss: 553Mb L: 142/1470 MS: 1 EraseBytes-
#2487217    REDUCE cov: 2019 ft: 9899 corp: 3526/672Kb lim: 1480 exec/s: 2037 rss: 553Mb L: 555/1470 MS: 1 EraseBytes-
#2487243    REDUCE cov: 2019 ft: 9901 corp: 3527/673Kb lim: 1480 exec/s: 2037 rss: 553Mb L: 1464/1470 MS: 1 CopyPart-
#2487595    NEW    cov: 2019 ft: 9902 corp: 3528/675Kb lim: 1480 exec/s: 2035 rss: 553Mb L: 1430/1470 MS: 4 ShuffleBytes-ChangeByte-ChangeBinInt-CopyPart-
#2487978    REDUCE cov: 2019 ft: 9902 corp: 3528/675Kb lim: 1480 exec/s: 2035 rss: 553Mb L: 34/1470 MS: 2 ChangeBit-EraseBytes-
#2487997    REDUCE cov: 2019 ft: 9902 corp: 3528/675Kb lim: 1480 exec/s: 2036 rss: 553Mb L: 534/1470 MS: 1 EraseBytes-
#2488103    REDUCE cov: 2019 ft: 9902 corp: 3528/675Kb lim: 1480 exec/s: 2036 rss: 553Mb L: 62/1470 MS: 4 ChangeBit-PersAutoDict-ShuffleBytes-EraseBytes- DE: "UT"-

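But no crash shows up for a long time. There are several possible reasons: (1) the program is robust; (2) the entry function we chose is not a good fit; (3) the error detection is not configured properly.
Of these three, we cannot tell whether the program is robust. As for the entry function, the coverage numbers show that using xmlReadMemory as the entry point reaches a fairly large amount of code, although the bug may simply live outside the code reachable from it. As for the third, an error may actually occur but go unnoticed because the configured sanitizer does not detect that class of bug. Looking back at our settings, -fsanitize=address enables AddressSanitizer, a general-purpose memory error detector (stack and heap buffer overflows, use-after-free and similar), but there are also some more targeted options: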

 -fsanitize-address-field-padding=<value>
                          Level of field padding for AddressSanitizer
  -fsanitize-address-globals-dead-stripping
                          Enable linker dead stripping of globals in AddressSanitizer
  -fsanitize-address-poison-custom-array-cookie
                          Enable poisoning array cookies when using custom operator new[] in AddressSanitizer
  -fsanitize-address-use-after-scope
                          Enable use-after-scope detection in AddressSanitizer
  -fsanitize-address-use-odr-indicator
                          Enable ODR indicator globals to avoid false ODR violation reports in partially sanitized programs at the cost of an increase in binary size

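One of them, -fsanitize-address-use-after-scope, is described as enabling use-after-scope detection. Add it to the compiler flags and rebuild (recent Clang releases may already enable it by default together with -fsanitize=address, in which case the flag changes nothing).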

export FUZZ_CXXFLAGS="-O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -fsanitize-address-use-after-scope"

CXX="clang++ $FUZZ_CXXFLAGS" CC="clang $FUZZ_CXXFLAGS" \
    CCLD="clang++ $FUZZ_CXXFLAGS"  ./configure

make -j$(nproc)
clang++ -O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -fsanitize-address-use-after-scope -std=c++11 xml_read_memory_fuzzer.cc -I libxml2/include libxml2/.libs/libxml2.a -fsanitize=fuzzer -lz -o libxml2-v2.9.2-fsanitize_fuzzer1

After running for a while there is still nothing to show for it; this is apparently going to take a long time.

#1823774    REDUCE cov: 2019 ft: 9428 corp: 3417/499Kb lim: 1160 exec/s: 2867 rss: 546Mb L: 229/1150 MS: 4 ChangeBinInt-InsertByte-InsertByte-EraseBytes-
#1823804    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1160 exec/s: 2867 rss: 546Mb L: 508/1150 MS: 3 CopyPart-EraseBytes-CopyPart-
#1824507    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1160 exec/s: 2868 rss: 546Mb L: 24/1150 MS: 1 EraseBytes-
#1824608    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1160 exec/s: 2864 rss: 546Mb L: 474/1150 MS: 4 InsertRepeatedBytes-ChangeASCIIInt-PersAutoDict-CrossOver- DE: "\xff\xff\xffN"-
#1824748    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1160 exec/s: 2864 rss: 546Mb L: 1066/1143 MS: 5 ChangeASCIIInt-CMP-PersAutoDict-ChangeBit-EraseBytes- DE: "ISO-8859-1"-"\xfe\xff\xff"-
#1825344    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1160 exec/s: 2865 rss: 546Mb L: 25/1143 MS: 1 EraseBytes-
#1825716    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1160 exec/s: 2866 rss: 546Mb L: 437/1143 MS: 3 InsertRepeatedBytes-InsertRepeatedBytes-EraseBytes-
#1825879    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1160 exec/s: 2866 rss: 546Mb L: 73/1143 MS: 4 CMP-ChangeASCIIInt-ChangeBit-EraseBytes- DE: "\x01\x00\x00P"-
#1826898    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1170 exec/s: 2863 rss: 546Mb L: 453/1143 MS: 3 ChangeByte-ChangeASCIIInt-EraseBytes-
#1827221    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1170 exec/s: 2863 rss: 546Mb L: 404/1143 MS: 1 EraseBytes-
#1827788    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1170 exec/s: 2864 rss: 546Mb L: 47/1143 MS: 1 EraseBytes-
#1828282    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1170 exec/s: 2861 rss: 546Mb L: 112/1143 MS: 4 CMP-ChangeBit-ChangeByte-EraseBytes- DE: "O>/<"-
#1828714    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1170 exec/s: 2861 rss: 546Mb L: 7/1143 MS: 1 EraseBytes-
#1828728    REDUCE cov: 2019 ft: 9429 corp: 3418/500Kb lim: 1170 exec/s: 2861 rss: 546Mb L: 163/1143 MS: 1 EraseBytes-
#1828756    NEW    cov: 2020 ft: 9430 corp: 3419/501Kb lim: 1170 exec/s: 2861 rss: 546Mb L: 1155/1155 MS: 1 CopyPart-
#1828812    REDUCE cov: 2020 ft: 9430 corp: 3419/501Kb lim: 1170 exec/s: 2861 rss: 546Mb L: 42/1155 MS: 2 ChangeBit-EraseBytes-
#1828952    REDUCE cov: 2020 ft: 9430 corp: 3419/501Kb lim: 1170 exec/s: 2862 rss: 546Mb L: 380/1155 MS: 1 EraseBytes-
#1829111    REDUCE cov: 2020 ft: 9430 corp: 3419/501Kb lim: 1170 exec/s: 2862 rss: 546Mb L: 542/1155 MS: 3 InsertByte-ChangeASCIIInt-EraseBytes-

But we should not just leave the fuzzer running unattended. We need ways to make it more efficient, and one of them is to use a dictionary.

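Almost every program processes data in some specific format, for example XML documents or PNG images. Such data contains special byte sequences (keywords): XML has CDATA, <!ATTLIST and so on, and a PNG file starts with the PNG header. If we list these sequences up front, the fuzzer can combine them directly, which avoids a lot of meaningless attempts and may also reach deeper branches of the program.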
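To make the idea concrete, here is a minimal sketch of what a dictionary-based mutation does: it splices a known keyword into an existing input instead of waiting for random byte flips to produce it. The helper name InsertDictToken is hypothetical and this is not libFuzzer's actual mutator, only an illustration of the principle.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: splice a dictionary token into an input at `pos`.
// It only illustrates the idea behind dictionary-based mutation; it is not
// how libFuzzer implements its dictionary mutators.
std::string InsertDictToken(const std::string& input, const std::string& token,
                            std::size_t pos) {
  pos = std::min(pos, input.size());  // clamp so the splice is always valid
  std::string out;
  out.reserve(input.size() + token.size());
  out.append(input, 0, pos);                   // bytes before the splice point
  out += token;                                // the keyword itself
  out.append(input, pos, std::string::npos);   // remaining bytes
  return out;
}
```

A single mutation of this kind can turn an input like <a></a> into one that already contains a complete <![CDATA[ marker, something a byte-level mutator would need many lucky steps to build.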
The workshop provides the dictionary used by AFL:

//xml.dict
➜  08 git:(master) ✗ cat xml.dict 
# Copyright 2016 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
################################################################################
#
# AFL dictionary for XML
# ----------------------
#
# Several basic syntax elements and attributes, modeled on libxml2.
#
# Created by Michal Zalewski <lcamtuf@google.com>
#

attr_encoding=" encoding=\"1\""
attr_generic=" a=\"1\""
attr_href=" href=\"1\""
attr_standalone=" standalone=\"no\""
attr_version=" version=\"1\""
attr_xml_base=" xml:base=\"1\""
attr_xml_id=" xml:id=\"1\""
attr_xml_lang=" xml:lang=\"1\""
attr_xml_space=" xml:space=\"1\""
attr_xmlns=" xmlns=\"1\""

entity_builtin="&lt;"
entity_decimal="&#1;"
entity_external="&a;"
entity_hex="&#x1;"

string_any="ANY"
string_brackets="[]"
string_cdata="CDATA"
string_col_fallback=":fallback"
string_col_generic=":a"
string_col_include=":include"
string_dashes="--"
string_empty="EMPTY"
string_empty_dblquotes="\"\""
string_empty_quotes="''"
string_entities="ENTITIES"
string_entity="ENTITY"
string_fixed="#FIXED"
string_id="ID"
string_idref="IDREF"
string_idrefs="IDREFS"
string_implied="#IMPLIED"
string_nmtoken="NMTOKEN"
string_nmtokens="NMTOKENS"
string_notation="NOTATION"
string_parentheses="()"
string_pcdata="#PCDATA"
string_percent="%a"
string_public="PUBLIC"
string_required="#REQUIRED"
string_schema=":schema"
string_system="SYSTEM"
string_ucs4="UCS-4"
string_utf16="UTF-16"
string_utf8="UTF-8"
string_xmlns="xmlns:"

tag_attlist="<!ATTLIST"
tag_cdata="<![CDATA["
tag_close="</a>"
tag_doctype="<!DOCTYPE"
tag_element="<!ELEMENT"
tag_entity="<!ENTITY"
tag_ignore="<![IGNORE["
tag_include="<![INCLUDE["
tag_notation="<!NOTATION"
tag_open="<a>"
tag_open_close="<a />"
tag_open_exclamation="<!"
tag_open_q="<?"
tag_sq2_close="]]>"
tag_xml_q="<?xml?>"
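Each entry in the dictionary above has the form name="value", with # starting a comment and \\, \" and \xNN available as escapes inside the quotes. The following sketch shows how one such line could be decoded into the raw token bytes. ParseDictLine is a hypothetical helper written for illustration; it is a simplified reading of the format, not libFuzzer's own parser.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: extract the token from one dictionary line such as
//   tag_cdata="<![CDATA["
// Returns false for blank lines, comment lines, and malformed entries.
// Supports the \\, \" and \xNN escapes only (a simplification).
bool ParseDictLine(const std::string& line, std::string* token) {
  std::size_t first = line.find_first_not_of(" \t");
  if (first == std::string::npos || line[first] == '#') return false;
  std::size_t open = line.find('"', first);
  std::size_t close = line.rfind('"');
  if (open == std::string::npos || close <= open) return false;

  token->clear();
  for (std::size_t i = open + 1; i < close; ++i) {
    char c = line[i];
    if (c != '\\') {
      token->push_back(c);
      continue;
    }
    if (i + 1 >= close) return false;  // dangling backslash
    char next = line[i + 1];
    if (next == '\\' || next == '"') {
      token->push_back(next);
      i += 1;
    } else if (next == 'x' && i + 3 < close &&
               std::isxdigit(static_cast<unsigned char>(line[i + 2])) &&
               std::isxdigit(static_cast<unsigned char>(line[i + 3]))) {
      // Two hex digits encode one raw byte.
      token->push_back(
          static_cast<char>(std::stoi(line.substr(i + 2, 2), nullptr, 16)));
      i += 3;
    } else {
      return false;  // escape not handled by this sketch
    }
  }
  return !token->empty();
}
```

The name before the = sign is ignored here because only the quoted bytes are used when mutating inputs.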

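The keywords are the strings inside the double quotes; libFuzzer combines them when generating inputs. A dictionary is passed with the -dict flag:

./libxml2-v2.9.2-fsanitize_fuzzer1 -max_total_time=60 -print_final_stats=1 -dict=./xml.dict corpus1

Result: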

#468074    REDUCE cov: 2521 ft: 8272 corp: 2493/82Kb lim: 135 exec/s: 7801 rss: 452Mb L: 105/135 MS: 4 InsertRepeatedBytes-CMP-CopyPart-CrossOver- DE: "\x01\x00\x00\x00"-
#468322    REDUCE cov: 2521 ft: 8272 corp: 2493/82Kb lim: 135 exec/s: 7805 rss: 452Mb L: 85/135 MS: 3 CopyPart-CopyPart-EraseBytes-
#468381    REDUCE cov: 2521 ft: 8273 corp: 2494/82Kb lim: 135 exec/s: 7806 rss: 452Mb L: 74/135 MS: 1 InsertByte-
#468390    REDUCE cov: 2521 ft: 8273 corp: 2494/82Kb lim: 135 exec/s: 7806 rss: 452Mb L: 66/135 MS: 2 ChangeASCIIInt-EraseBytes-
#468391    REDUCE cov: 2521 ft: 8273 corp: 2494/82Kb lim: 135 exec/s: 7806 rss: 452Mb L: 89/135 MS: 1 EraseBytes-
#468575    DONE   cov: 2521 ft: 8273 corp: 2494/82Kb lim: 135 exec/s: 7681 rss: 452Mb
###### Recommended dictionary. ######
"\x08\x00" # Uses: 366
"Q\x00" # Uses: 370
"\x00:" # Uses: 325
"\x97\x00" # Uses: 301
"\x0d\x00" # Uses: 335
"\xfe\xff\xff\xff" # Uses: 273
"UCS-" # Uses: 294
"\x15\x00" # Uses: 277
"\x00\x00" # Uses: 289
"\xff\xff\xff\x1c" # Uses: 258
"\xff\xff\xff!" # Uses: 257
"\xff\xff\xff\x01" # Uses: 250
"UTF-1" # Uses: 250
"\xff\xff\xffN" # Uses: 236
"UTF-16LE" # Uses: 223
"ISO-10" # Uses: 228
"ISO-1064" # Uses: 256
"\x0a\x00\x00\x00" # Uses: 246
"Q\x00\x00\x00" # Uses: 247
"\xf1\x1f\x00\x00\x00\x00\x00\x00" # Uses: 212
"$\x00\x00\x00\x00\x00\x00\x00" # Uses: 194
"\xff\xff\xff\x0e" # Uses: 211
"\x09\x00" # Uses: 226
"\x01\x00\x00\xfa" # Uses: 212
"\x01\x00\x00\x02" # Uses: 239
"\xac\x0f\x00\x00\x00\x00\x00\x00" # Uses: 206
"\xffO" # Uses: 263
"\xff\x03" # Uses: 235
"\xff\xff\xff\xff\xff\xff\xff\x10" # Uses: 200
"\xf4\x01\x00\x00\x00\x00\x00\x00" # Uses: 203
"UTF-16BE" # Uses: 188
"\x00\x00\x00P" # Uses: 207
"\x0a\x00" # Uses: 196
"\xff\xff" # Uses: 203
"\xff\xff\xff\xff\xff\x97\x96\x80" # Uses: 186
"\x01 \x00\x00\x00\x00\x00\x00" # Uses: 187
"\x00\x00\x00\x00\x00\x00\x00$" # Uses: 156
"P\x00" # Uses: 186
"\xff\xff\xff\xff" # Uses: 197
"\xff\xff\xff\x09" # Uses: 202
"\x12\x00\x00\x00\x00\x00\x00\x00" # Uses: 204
"\x01\x01" # Uses: 170
"\x01\x00\x00\x00\x00\x00\x00\x10" # Uses: 197
"\xff\xff\xff\xff\xff\xff\xff\x03" # Uses: 183
"\x00\x00\x00\x00\x00\x00\x00\x00" # Uses: 174
"\xff\x05" # Uses: 198
"US-ASCII" # Uses: 214
"\x01\x00" # Uses: 201
"xlmns" # Uses: 189
"\xff\xff\xff\x14" # Uses: 191
"xmlsn" # Uses: 179
"\x00\x00\x00\x03" # Uses: 201
"xmlns" # Uses: 182
"\xaf\x0f\x00\x00\x00\x00\x00\x00" # Uses: 186
"\xff\xff\xff\xff\xff\xff\x0e\xb9" # Uses: 176
"\xff\x09" # Uses: 178
"ISO-1" # Uses: 191
"la" # Uses: 157
"\x01\x00\x00\x00" # Uses: 173
"\x01\x00\x00\x00\x00\x00\x00\x14" # Uses: 172
"\xff\xff\xff\x7f\x00\x00\x00\x00" # Uses: 164
"\x00\x00\x00\x04" # Uses: 154
"\x01\x00\x00\x00\x00\x00\x00\x00" # Uses: 153
"\x00\x00\x00\x02" # Uses: 136
"\x04\x00\x00\x00\x00\x00\x00\x00" # Uses: 140
"ISO-10646-" # Uses: 146
"id" # Uses: 155
"\x00\x01" # Uses: 145
"\x00\x02" # Uses: 140
"\x01\x00\x00\x08" # Uses: 165
"\x00\x00\x00\x00\x00\x00\x00\x1e" # Uses: 136
"\xff\xff\xff\xff~\xff\xff\xff" # Uses: 130
"\x81\x96\x98\x00\x00\x00\x00\x00" # Uses: 150
"\x03\x00\x00\x00" # Uses: 116
"\x18\x00\x00\x00" # Uses: 176
"\xff\xff\xff\xff\xff\xff\xff\xf9" # Uses: 124
"%\x17\x8f[" # Uses: 130
"\x0e\x00\x00\x00" # Uses: 142
"\x01\x00\x00\x00\x00\x00\x00\xfa" # Uses: 96
"\x06\x00\x00\x00\x00\x00\x00\x00" # Uses: 116
"\x00\x04" # Uses: 161
"\x00\x00\x00\x0b" # Uses: 119
"\x00\x00\x00\x06" # Uses: 141
"annnn\xd4nnnnnn" # Uses: 100
"\x1f\x00\x00\x00\x00\x00\x00\x00" # Uses: 106
"\x00\x00\x00\x00\x00\x00\x00\x17" # Uses: 114
"\x16\x00\x00\x00\x00\x00\x00\x00" # Uses: 122
"\x0f\x00" # Uses: 121
"inlc0a" # Uses: 128
"\x01\x00\x00O" # Uses: 101
"\x01\x04" # Uses: 117
"\x01P" # Uses: 122
"\xfb\x00\x00\x00\x00\x00\x00\x00" # Uses: 95
"\x03\x00\x00\x00\x00\x00\x00\x00" # Uses: 107
"\x00\x00\x00\x00\x00\x00\x00\x03" # Uses: 98
"\x00#" # Uses: 102
"\x00\x00\x00\x0d" # Uses: 92
"ISO-8859-1" # Uses: 100
"\xff\xf9" # Uses: 77
"\xf7\x0f\x00\x00\x00\x00\x00\x00" # Uses: 79
"\xff\xff\xff\xff\xff\xff\xff\xfb" # Uses: 82
"\x01\x0b" # Uses: 101
"\xff\xff\xff\xff\xff\xff\x0e\xff" # Uses: 78
"><>\xb7" # Uses: 81
"<b" # Uses: 81
"UTF-8\x00" # Uses: 76
"\xff\xff\xff\xff\xff\xff\xff\x09" # Uses: 63
"#\x00\x00\x00" # Uses: 75
"S\x00\x00\x00\x00\x00\x00\x00" # Uses: 73
"a\xff" # Uses: 70
"TIONb" # Uses: 46
"\x01\x00\x00?" # Uses: 69
"!\x00\x00\x00" # Uses: 67
"\x00\x00\x00\x01" # Uses: 74
"\xff\xff\xff\xff\xff\xff\x1f\x02" # Uses: 57
"\x01\x00\x00\x00\x00\x00\x00\x16" # Uses: 57
"-\x00\x00\x00\x00\x00\x00\x00" # Uses: 53
"\x01\x00\x00\x05" # Uses: 65
":b" # Uses: 67
"\x17\x1c_>" # Uses: 63
"\xff\xff\xff\xff\xff\xff\xff\x18" # Uses: 64
"\x00\x00\x00'" # Uses: 45
"\x00\x00\x00\x05" # Uses: 52
"\xff\xff\xff\x0d" # Uses: 51
"US-AS" # Uses: 48
"a>" # Uses: 53
"C\x00\x00\x00\x00\x00\x00\x00" # Uses: 44
"\xff\xff\x00\x00" # Uses: 36
"\x01\x07" # Uses: 49
"@\x00\x00\x00" # Uses: 46
"\x02\x00" # Uses: 32
"+\x00" # Uses: 37
"\x00\x00\x00\x00\x00\x00 \x02" # Uses: 42
"\x00\x0f" # Uses: 37
"\xff\xff\xff\xff\xff\xff\xff$" # Uses: 49
"ASCII" # Uses: 40
"\x00\x00\x00\x00\x00\x00\x01\x00" # Uses: 30
"a\xff:-\xec" # Uses: 27
"\xff\x1a" # Uses: 30
"'''''''''&''" # Uses: 23
"\x01\x00\x00\x00\x00\x00\x01\x1d" # Uses: 34
"TIOIb" # Uses: 19
"J\x00\x00\x00\x00\x00\x00\x00" # Uses: 15
"N\x00\x00\x00" # Uses: 10
"\x01O" # Uses: 8
"\xff\xff\xff\x02" # Uses: 6
"HTML" # Uses: 8
"\x00P" # Uses: 9
"\xff\xff\xff\x00" # Uses: 9
"\xff\x06" # Uses: 9
"\x7f\x96\x98\x00\x00\x00\x00\x00" # Uses: 4
"^><b>" # Uses: 5
"\x01\x0a" # Uses: 5
"\x13\x00" # Uses: 1
###### End of recommended dictionary. ######
Done 479244 runs in 61 second(s)
stat::number_of_executed_units: 479244
stat::average_exec_per_sec:     7856
stat::new_units_added:          5007
stat::slowest_unit_time_sec:    0
stat::peak_rss_mb:              467

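At the end libFuzzer also prints a Recommended dictionary, whose entries can be added to our .dict file.
stat::new_units_added: 5007 means that 5007 new units (inputs that triggered new coverage) were added to the corpus during the run.
Without the dictionary: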

Done 402774 runs in 61 second(s)
stat::number_of_executed_units: 402774
stat::average_exec_per_sec:     6602
stat::new_units_added:          3761
stat::slowest_unit_time_sec:    0
stat::peak_rss_mb:              453

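So the dictionary does help: within the same 60 seconds about 33% more new units were added (5007 vs. 3761), and the average execution rate was about 19% higher (7856 vs. 6602 exec/s).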
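Folding the recommended entries back into the dictionary can be automated. The sketch below assumes the fuzzer output has been captured as text and pulls out the quoted entries between the two "Recommended dictionary" marker lines, dropping the trailing "# Uses: N" comments. ExtractRecommendedDict is a hypothetical helper name, and the marker strings are taken from the output shown above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect the quoted entries printed between the
// "Recommended dictionary" markers in captured libFuzzer output.
// Each returned string keeps its surrounding quotes and escapes.
std::vector<std::string> ExtractRecommendedDict(const std::string& log) {
  std::vector<std::string> entries;
  std::istringstream in(log);
  std::string line;
  bool inside = false;
  while (std::getline(in, line)) {
    if (line.find("###### Recommended dictionary.") == 0) {
      inside = true;
      continue;
    }
    if (line.find("###### End of recommended dictionary.") == 0) break;
    if (!inside || line.empty() || line[0] != '"') continue;
    std::size_t close = line.rfind('"');
    if (close == 0) continue;  // no closing quote on this line
    entries.push_back(line.substr(0, close + 1));  // drop "# Uses: N"
  }
  return entries;
}
```

The returned lines can be appended to xml.dict as they are, because a dictionary entry does not need a name in front of the quoted value.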

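In addition, a long fuzzing run generates and mutates a large number of samples, which are stored in the corpus. For example, the run above left this many files:

➜  08 git:(master) ✗ ls -lR| grep "^-" | wc -l
7217

Many of these 7217 samples are redundant, and we can minimize the corpus with the -merge=1 flag: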

mkdir corpus1_min
corpus1_min: directory that receives the minimized corpus
corpus1: directory holding the original corpus

➜  08 git:(master) ✗ ./libxml2-v2.9.2-fsanitize_fuzzer1 -merge=1 corpus1_min corpus1
INFO: Seed: 1264856731
INFO: Loaded 1 modules   (53343 inline 8-bit counters): 53343 [0xd27740, 0xd3479f), 
INFO: Loaded 1 PC tables (53343 PCs): 53343 [0x9b3650,0xa83c40), 
MERGE-OUTER: 2724 files, 0 in the initial corpus
MERGE-OUTER: attempt 1
INFO: Seed: 1264900516
INFO: Loaded 1 modules   (53343 inline 8-bit counters): 53343 [0xd27740, 0xd3479f), 
INFO: Loaded 1 PC tables (53343 PCs): 53343 [0x9b3650,0xa83c40), 
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048576 bytes
MERGE-INNER: using the control file '/tmp/libFuzzerTemp.8187.txt'
MERGE-INNER: 2724 total files; 0 processed earlier; will process 2724 files now
#1    pulse  cov: 464 exec/s: 0 rss: 32Mb
#2    pulse  cov: 470 exec/s: 0 rss: 33Mb
#4    pulse  cov: 502 exec/s: 0 rss: 33Mb
#8    pulse  cov: 522 exec/s: 0 rss: 34Mb
#16    pulse  cov: 533 exec/s: 0 rss: 34Mb
#32    pulse  cov: 681 exec/s: 0 rss: 35Mb
#64    pulse  cov: 756 exec/s: 0 rss: 36Mb
#128    pulse  cov: 1077 exec/s: 0 rss: 39Mb
#256    pulse  cov: 1247 exec/s: 0 rss: 45Mb
#512    pulse  cov: 1553 exec/s: 0 rss: 55Mb
#1024    pulse  cov: 2166 exec/s: 0 rss: 77Mb
#2048    pulse  cov: 2550 exec/s: 2048 rss: 120Mb
#2724    DONE  cov: 2666 exec/s: 2724 rss: 155Mb
MERGE-OUTER: succesfull in 1 attempt(s)
MERGE-OUTER: the control file has 287194 bytes
MERGE-OUTER: consumed 0Mb (38Mb rss) to parse the control file
MERGE-OUTER: 2313 new files with 8750 new features added; 2666 new coverage edges

The 2724 files in corpus1 were reduced to 2313 samples.
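The idea behind this kind of minimization can be sketched as a greedy pass: visit the files from smallest to largest and keep a file only if it contributes coverage that no earlier file provided. The code below is a simplified illustration under that assumption. CorpusFile, MinimizeCorpus and the integer feature IDs are hypothetical; libFuzzer's real merge works on its own coverage features and differs in detail.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <set>
#include <vector>

// Hypothetical record: one corpus file with its size in bytes and the
// coverage features it triggers (feature IDs are made-up integers).
struct CorpusFile {
  std::size_t size;
  std::set<int> features;
};

// Greedy minimization: process files in order of increasing size and keep
// a file only if it adds at least one feature not seen so far.
// Returns the indices (into `files`) of the kept entries, in visiting order.
std::vector<std::size_t> MinimizeCorpus(const std::vector<CorpusFile>& files) {
  std::vector<std::size_t> order(files.size());
  std::iota(order.begin(), order.end(), 0);
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return files[a].size < files[b].size;
                   });

  std::set<int> seen;
  std::vector<std::size_t> kept;
  for (std::size_t idx : order) {
    bool adds_new = false;
    for (int f : files[idx].features) {
      if (seen.insert(f).second) adds_new = true;  // first time we see f
    }
    if (adds_new) kept.push_back(idx);
  }
  return kept;
}
```

Preferring small files first means that when two inputs cover the same code, the shorter one survives, which keeps later fuzzing runs fast.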

The workshop also provides another fuzz target:

➜  08 git:(master) ✗ cat xml_compile_regexp_fuzzer.cc 
// Copyright 2016 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#include <stddef.h>
#include <stdint.h>

#include <algorithm>
#include <string>
#include <vector>

#include "libxml/parser.h"
#include "libxml/tree.h"
#include "libxml/xmlversion.h"

void ignore (void * ctx, const char * msg, ...) {
  // Error handler to avoid spam of error messages from libxml parser.
}

// Entry point for LibFuzzer.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  xmlSetGenericErrorFunc(NULL, &ignore);

  std::vector<uint8_t> buffer(size + 1, 0);
  std::copy(data, data + size, buffer.data());

  xmlRegexpPtr x = xmlRegexpCompile(buffer.data());
  if (x)
    xmlRegFreeRegexp(x);

  return 0;
}

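Unlike the previous harness, this one copies the input into a buffer with one extra zero byte, so that the string is NUL-terminated, and then passes it to xmlRegexpCompile. Build and run it as follows: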

➜  08 git:(master) ✗ clang++ -O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -fsanitize-address-use-after-scope -std=c++11 xml_compile_regexp_fuzzer.cc -I libxml2/include libxml2/.libs/libxml2.a -fsanitize=fuzzer -lz -o libxml2-v2.9.2-fsanitize_fuzzer1 
➜  08 git:(master) ✗ ./libxml2-v2.9.2-fsanitize_fuzzer1 -dict=./xml.dict
Dictionary: 60 entries
INFO: Seed: 2400921417
INFO: Loaded 1 modules   (53352 inline 8-bit counters): 53352 [0xd27700, 0xd34768), 
INFO: Loaded 1 PC tables (53352 PCs): 53352 [0x9b36f0,0xa83d70), 
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2    INITED cov: 114 ft: 115 corp: 1/1b exec/s: 0 rss: 30Mb
    NEW_FUNC[1/5]: 0x551cf0 in ignore(void*, char const*, ...) /home/admin/libfuzzer-workshop/lessons/08/xml_compile_regexp_fuzzer.cc:16
    NEW_FUNC[2/5]: 0x552d00 in __xmlRaiseError /home/admin/libfuzzer-workshop/lessons/08/libxml2/error.c:461
#6    NEW    cov: 150 ft: 169 corp: 2/2b lim: 4 exec/s: 0 rss: 31Mb L: 1/1 MS: 3 ShuffleBytes-ShuffleBytes-ChangeByte-
#10    NEW    cov: 155 ft: 223 corp: 3/4b lim: 4 exec/s: 0 rss: 31Mb L: 2/2 MS: 4 ChangeBit-ShuffleBytes-ShuffleBytes-InsertByte-
#12    NEW    cov: 156 ft: 277 corp: 4/8b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 2 ShuffleBytes-CopyPart-
#13    NEW    cov: 161 ft: 282 corp: 5/12b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 1 CrossOver-
#20    NEW    cov: 175 ft: 302 corp: 6/14b lim: 4 exec/s: 0 rss: 31Mb L: 2/4 MS: 2 ChangeByte-ChangeBinInt-
#24    NEW    cov: 177 ft: 305 corp: 7/16b lim: 4 exec/s: 0 rss: 31Mb L: 2/4 MS: 4 EraseBytes-ChangeBinInt-ChangeBit-InsertByte-
    NEW_FUNC[1/1]: 0x604f00 in xmlFAReduceEpsilonTransitions /home/admin/libfuzzer-workshop/lessons/08/libxml2/xmlregexp.c:1777
#28    NEW    cov: 206 ft: 336 corp: 8/19b lim: 4 exec/s: 0 rss: 31Mb L: 3/4 MS: 4 ShuffleBytes-ChangeByte-ChangeBit-CMP- DE: "\x01?"-
#32    NEW    cov: 209 ft: 343 corp: 9/21b lim: 4 exec/s: 0 rss: 31Mb L: 2/4 MS: 4 ManualDict-ShuffleBytes-ShuffleBytes-ChangeBit- DE: "<!"-
#38    NEW    cov: 210 ft: 344 corp: 10/25b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 1 ChangeByte-
#45    NEW    cov: 210 ft: 347 corp: 11/29b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 2 CopyPart-ChangeByte-
#47    NEW    cov: 211 ft: 366 corp: 12/33b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 2 CopyPart-ChangeBinInt-
#50    NEW    cov: 211 ft: 367 corp: 13/37b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 3 ChangeBinInt-CopyPart-ChangeBinInt-
#57    NEW    cov: 211 ft: 388 corp: 14/40b lim: 4 exec/s: 0 rss: 31Mb L: 3/4 MS: 2 ManualDict-CrossOver- DE: "\"\""-
    NEW_FUNC[1/1]: 0x606d20 in xmlFARecurseDeterminism /home/admin/libfuzzer-workshop/lessons/08/libxml2/xmlregexp.c:2589
#64    NEW    cov: 233 ft: 421 corp: 15/43b lim: 4 exec/s: 0 rss: 31Mb L: 3/4 MS: 2 ChangeBit-PersAutoDict- DE: "\x01?"-
#66    NEW    cov: 235 ft: 426 corp: 16/47b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 2 EraseBytes-PersAutoDict- DE: "\x01?"-
#68    REDUCE cov: 235 ft: 426 corp: 16/46b lim: 4 exec/s: 0 rss: 31Mb L: 3/4 MS: 2 ChangeBit-EraseBytes-
#72    NEW    cov: 236 ft: 427 corp: 17/50b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 4 ShuffleBytes-PersAutoDict-ShuffleBytes-ShuffleBytes- DE: "\x01?"-
#86    REDUCE cov: 236 ft: 427 corp: 17/48b lim: 4 exec/s: 0 rss: 31Mb L: 2/4 MS: 4 ChangeByte-ChangeBinInt-CopyPart-EraseBytes-
#92    NEW    cov: 237 ft: 431 corp: 18/49b lim: 4 exec/s: 0 rss: 31Mb L: 1/4 MS: 1 EraseBytes-
#103    NEW    cov: 237 ft: 433 corp: 19/53b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 1 CopyPart-
#104    REDUCE cov: 237 ft: 433 corp: 19/50b lim: 4 exec/s: 0 rss: 31Mb L: 1/4 MS: 1 CrossOver-
    NEW_FUNC[1/1]: 0x600e30 in xmlFAParseCharClassEsc /home/admin/libfuzzer-workshop/lessons/08/libxml2/xmlregexp.c:4843
#115    NEW    cov: 243 ft: 439 corp: 20/54b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 1 ChangeByte-
#118    NEW    cov: 246 ft: 447 corp: 21/58b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 3 CrossOver-PersAutoDict-ChangeBinInt- DE: "\"\""-
#134    REDUCE cov: 249 ft: 457 corp: 22/62b lim: 4 exec/s: 0 rss: 31Mb L: 4/4 MS: 1 CrossOver-
#137    REDUCE cov: 249 ft: 460 corp: 23/64b lim: 4 exec/s: 0 rss: 31Mb L: 2/4 MS: 3 InsertByte-EraseBytes-PersAutoDict- DE: "\x01?"-
#145    NEW    cov: 250 ft: 461 corp: 24/66b lim: 4 exec/s: 0 rss: 31Mb L: 2/4 MS: 3 CopyPart-EraseBytes-ChangeBit-
#153    NEW    cov: 250 ft: 507 corp: 25/69b lim: 4 exec/s: 0 rss: 31Mb L: 3/4 MS: 3 CrossOver-ShuffleBytes-CopyPart-
    NEW_FUNC[1/1]: 0x600890 in xmlFAParseCharGroup /home/admin/libfuzzer-workshop/lessons/08/libxml2/xmlregexp.c:5100
#165    NEW    cov: 254 ft: 511 corp: 26/71b lim: 4 exec/s: 0 rss: 32Mb L: 2/4 MS: 2 InsertByte-ManualDict- DE: "[]"-
=================================================================
==8434==ERROR: AddressSanitizer: allocator is out of memory trying to allocate 0x18 bytes
==8434==ERROR: AddressSanitizer failed to allocate 0x2000 (8192) bytes of InternalMmapVector (error code: 12)
ERROR: Failed to mmap
MS: 4 ChangeBit-EraseBytes-PersAutoDict-ChangeBit- DE: "[]"-; base unit: 85f707600c5524d8497fd94066e422258633e02f
0x7f,0x5b,0xdd,
\x7f[\xdd
artifact_prefix='./'; Test unit written to ./crash-dffb37701985bd6539dcbcfe2a04661627b040ff
Base64: f1vd

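Wow, this harness produced a crash within a few seconds, which shows how much the choice of entry function matters. But the error is a bit odd:

==8434==ERROR: AddressSanitizer: allocator is out of memory trying to allocate 0x18 bytes

It says the allocator ran out of memory, and there is no SUMMARY line to locate the bug.
At first this looks like a harness problem: xml_compile_regexp_fuzzer.cc copies data with std::vector<uint8_t> buffer(size + 1, 0);, so one might suspect that the vector outgrew the memory limit as the samples grew, in which case the crash would not be a bug in xmlRegexpCompile itself. However, the crashing input shown above is only 3 bytes (0x7f,0x5b,0xdd), so the buffer holds just 4 bytes and the copy cannot account for the exhaustion. The more likely explanation is that the memory was consumed inside xmlRegexpCompile while it processed this input, so the artifact is worth re-running and inspecting rather than dismissing.
Meanwhile the other fuzzing run against xmlReadMemory is still going; a senior classmate said it took him more than ten hours of fuzzing this function to get a crash.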

lesson 09(the importance of seed corpus)

This time the target is the open-source library libpng. First build the source:

tar xzf libpng.tgz
cd libpng

# Disable logging via library build configuration control.
cat scripts/pnglibconf.dfa | sed -e "s/option STDIO/option STDIO disabled/" \
> scripts/pnglibconf.dfa.temp
mv scripts/pnglibconf.dfa.temp scripts/pnglibconf.dfa   # error messages are disabled here

# build the library.
autoreconf -f -i
#1
export FUZZ_CXXFLAGS="-O2 -fno-omit-frame-pointer -g -fsanitize=address \
    -fsanitize-coverage=trace-pc-guard,trace-cmp,trace-gep,trace-div"

./configure CC="clang" CFLAGS="$FUZZ_CXXFLAGS"
make -j2     

#2
export FUZZ_CXXFLAGS="-O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link"
CXX="clang++ $FUZZ_CXXFLAGS" CC="clang $FUZZ_CXXFLAGS" \
    CCLD="clang++ $FUZZ_CXXFLAGS"  ./configure
make -j$(nproc)

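The workshop uses build strategy #1, whose -fsanitize-coverage=trace-pc-guard setting is meant for older versions of libFuzzer. I therefore used strategy #2, with the same -fsanitize=address,fuzzer-no-link instrumentation as in the previous lesson; -gline-tables-only replaces -g and only reduces the amount of debug information emitted.
The harness provided: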

// Copyright 2015 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#include <stddef.h>
#include <stdint.h>
#include <string.h>

#include <vector>

#define PNG_INTERNAL
#include "png.h"

struct BufState {
  const uint8_t* data;
  size_t bytes_left;
};

struct PngObjectHandler {
  png_infop info_ptr = nullptr;
  png_structp png_ptr = nullptr;
  png_voidp row_ptr = nullptr;
  BufState* buf_state = nullptr;

  ~PngObjectHandler() {
    if (row_ptr && png_ptr) {
      png_free(png_ptr, row_ptr);
    }
    if (png_ptr && info_ptr) {
      png_destroy_read_struct(&png_ptr, &info_ptr, nullptr);
    }
    delete buf_state;
  }
};

void user_read_data(png_structp png_ptr, png_bytep data, png_size_t length) {
  BufState* buf_state = static_cast<BufState*>(png_get_io_ptr(png_ptr));
  if (length > buf_state->bytes_left) {
    png_error(png_ptr, "read error");
  }
  memcpy(data, buf_state->data, length);
  buf_state->bytes_left -= length;
  buf_state->data += length;
}

static const int kPngHeaderSize = 8;

// Entry point for LibFuzzer.
// Roughly follows the libpng book example:
// http://www.libpng.org/pub/png/book/chapter13.html
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  if (size < kPngHeaderSize) {
    return 0;
  }

  std::vector<unsigned char> v(data, data + size);
  if (png_sig_cmp(v.data(), 0, kPngHeaderSize)) {
    // not a PNG.
    return 0;
  }

  PngObjectHandler png_handler;
  png_handler.png_ptr = png_create_read_struct
    (PNG_LIBPNG_VER_STRING, nullptr, nullptr, nullptr);
  if (!png_handler.png_ptr) {
    return 0;
  }

  png_set_user_limits(png_handler.png_ptr, 2048, 2048);

  png_set_crc_action(png_handler.png_ptr, PNG_CRC_QUIET_USE, PNG_CRC_QUIET_USE);

  png_handler.info_ptr = png_create_info_struct(png_handler.png_ptr);
  if (!png_handler.info_ptr) {
    return 0;
  }

  // Setting up reading from buffer.
  png_handler.buf_state = new BufState();
  png_handler.buf_state->data = data + kPngHeaderSize;
  png_handler.buf_state->bytes_left = size - kPngHeaderSize;
  png_set_read_fn(png_handler.png_ptr, png_handler.buf_state, user_read_data);
  png_set_sig_bytes(png_handler.png_ptr, kPngHeaderSize);

  // libpng error handling.
  if (setjmp(png_jmpbuf(png_handler.png_ptr))) {
    return 0;
  }

  // Reading.
  png_read_info(png_handler.png_ptr, png_handler.info_ptr);
  png_handler.row_ptr = png_malloc(
      png_handler.png_ptr, png_get_rowbytes(png_handler.png_ptr,
                                               png_handler.info_ptr));

  // reset error handler to put png_deleter into scope.
  if (setjmp(png_jmpbuf(png_handler.png_ptr))) {
    return 0;
  }

  png_uint_32 width, height;
  int bit_depth, color_type, interlace_type, compression_type;
  int filter_type;

  if (!png_get_IHDR(png_handler.png_ptr, png_handler.info_ptr, &width,
                    &height, &bit_depth, &color_type, &interlace_type,
                    &compression_type, &filter_type)) {
    return 0;
  }

  // This is going to be too slow.
  if (width && height > 100000000 / width)
    return 0;

  if (width > 2048 || height > 2048)
    return 0;

  int passes = png_set_interlace_handling(png_handler.png_ptr);
  png_start_read_image(png_handler.png_ptr);

  for (int pass = 0; pass < passes; ++pass) {
    for (png_uint_32 y = 0; y < height; ++y) {
      png_read_row(png_handler.png_ptr,
                   static_cast<png_bytep>(png_handler.row_ptr), NULL);
    }
  }

  return 0;
}

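In fuzzing, whether the harness is well written largely determines the final result. We usually pick functions involved in memory management, data processing and the like as the entry points to fuzz.
In the harness above it is easy to see that it first calls png_sig_cmp to check whether the input starts with a valid PNG signature, and only then continues into the rest of the logic. This ensures that the data is valid and, because invalid inputs are rejected early, also keeps the fuzzing loop fast.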
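The effect of that gate can be shown in isolation. The sketch below compares the first 8 bytes of an input against the standard PNG signature and rejects everything else. HasPngSignature is a hypothetical stand-in written for illustration; it is not libpng's png_sig_cmp, which the real harness calls.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// The standard 8-byte PNG signature: \x89 P N G \r \n \x1a \n.
static const std::uint8_t kPngSignature[8] = {0x89, 'P',  'N',  'G',
                                              0x0d, 0x0a, 0x1a, 0x0a};

// Hypothetical stand-in for the signature gate at the top of the harness:
// returns true only if the input is long enough and starts with the
// PNG signature.
bool HasPngSignature(const std::uint8_t* data, std::size_t size) {
  return size >= sizeof(kPngSignature) &&
         std::memcmp(data, kPngSignature, sizeof(kPngSignature)) == 0;
}
```

Until the fuzzer produces these exact 8 bytes, every input returns at the gate and none of the PNG parsing code is reached, which is why a seed or a dictionary entry containing the signature matters so much for this target.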
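Since the input must be in PNG format, it is natural to think of using a dictionary to supply the keywords. That idea is correct; let's compare the two cases.
First build:
clang++ -O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -std=c++11 libpng_read_fuzzer.cc -I libpng libpng/.libs/libpng16.a -fsanitize=fuzzer -lz -o libpng_read_fuzzer
The dictionary is again the png.dict shipped with AFL: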

# Copyright 2016 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
################################################################################
#
# AFL dictionary for PNG images
# -----------------------------
#
# Just the basic, standard-originating sections; does not include vendor
# extensions.
#
# Created by Michal Zalewski <lcamtuf@google.com>
#

header_png="\x89PNG\x0d\x0a\x1a\x0a"

section_IDAT="IDAT"
section_IEND="IEND"
section_IHDR="IHDR"
section_PLTE="PLTE"
section_bKGD="bKGD"
section_cHRM="cHRM"
section_fRAc="fRAc"
section_gAMA="gAMA"
section_gIFg="gIFg"
section_gIFt="gIFt"
section_gIFx="gIFx"
section_hIST="hIST"
section_iCCP="iCCP"
section_iTXt="iTXt"
section_oFFs="oFFs"
section_pCAL="pCAL"
section_pHYs="pHYs"
section_sBIT="sBIT"
section_sCAL="sCAL"
section_sPLT="sPLT"
section_sRGB="sRGB"
section_sTER="sTER"
section_tEXt="tEXt"
section_tIME="tIME"
section_tRNS="tRNS"
section_zTXt="zTXt"
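Each entry in the dictionary above has the form name="value", where \xNN escapes stand for raw bytes. The following is a rough sketch of how one such line could be decoded into the bytes the fuzzer splices in. DecodeDictEntry and HexValue are hypothetical helpers written for this note, not libFuzzer's parser, and only the \xNN, \\ and \" escapes are handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Converts one hex digit to its value; the caller guarantees isxdigit(c).
static int HexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  return std::tolower(static_cast<unsigned char>(c)) - 'a' + 10;
}

// Hypothetical helper: decodes the quoted value of one dictionary line such as
//   header_png="\x89PNG\x0d\x0a\x1a\x0a"
// into raw bytes. Returns false for comments, blank lines, or malformed input.
bool DecodeDictEntry(const std::string& line, std::string* out) {
  assert(out != nullptr);
  out->clear();
  if (line.empty() || line[0] == '#') return false;
  size_t first = line.find('"');
  size_t last = line.rfind('"');
  if (first == std::string::npos || last <= first) return false;
  for (size_t i = first + 1; i < last; ++i) {
    char c = line[i];
    if (c != '\\') {
      out->push_back(c);
      continue;
    }
    if (i + 1 >= last) return false;  // dangling backslash
    char next = line[i + 1];
    if (next == '\\' || next == '"') {
      out->push_back(next);
      i += 1;
    } else if (next == 'x' && i + 3 < last &&
               std::isxdigit(static_cast<unsigned char>(line[i + 2])) &&
               std::isxdigit(static_cast<unsigned char>(line[i + 3]))) {
      out->push_back(static_cast<char>(HexValue(line[i + 2]) * 16 +
                                       HexValue(line[i + 3])));
      i += 3;
    } else {
      return false;  // escape not supported by this sketch
    }
  }
  return !out->empty();
}
```

Decoding the header_png line this way yields the 8-byte PNG signature, while a line such as section_IHDR="IHDR" yields the four ASCII bytes of the chunk type.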

First, without the dictionary:

./libpng_read_fuzzer -max_total_time=60 -print_final_stats=1
Done 5454409 runs in 61 second(s)
stat::number_of_executed_units: 5454409
stat::average_exec_per_sec:     89416
stat::new_units_added:          512
stat::slowest_unit_time_sec:    0
stat::peak_rss_mb:              822

512 new units were added to the corpus (stat::new_units_added).
Next, with the dictionary:

./libpng_read_fuzzer -max_total_time=60 -print_final_stats=1 -dict=./png.dict
#2849333    REDUCE cov: 287 ft: 511 corp: 111/19Kb lim: 4096 exec/s: 105530 rss: 682Mb L: 43/3088 MS: 1 EraseBytes-
#2871709    REDUCE cov: 291 ft: 515 corp: 112/19Kb lim: 4096 exec/s: 106359 rss: 682Mb L: 47/3088 MS: 1 ManualDict- DE: "bKGD"-
#2883416    NEW    cov: 293 ft: 520 corp: 113/19Kb lim: 4096 exec/s: 106793 rss: 682Mb L: 48/3088 MS: 2 PersAutoDict-EraseBytes- DE: "sPLT"-
=================================================================
==26551==ERROR: AddressSanitizer: allocator is out of memory trying to allocate 0x62474b42 bytes
    #0 0x51f69d in malloc /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:145:3
    #1 0x5a98a3 in png_read_buffer /home/admin/libfuzzer-workshop/lessons/09/libpng/pngrutil.c:310:16
    #2 0x5a98a3 in png_handle_sPLT /home/admin/libfuzzer-workshop/lessons/09/libpng/pngrutil.c:1683:13
    #3 0x571b3c in png_read_info /home/admin/libfuzzer-workshop/lessons/09/libpng/pngread.c:225:10
    #4 0x551b3a in LLVMFuzzerTestOneInput /home/admin/libfuzzer-workshop/lessons/09/libpng_read_fuzzer.cc:91:3
    #5 0x459a21 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:553:15
    #6 0x459265 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:469:3
    #7 0x45b507 in fuzzer::Fuzzer::MutateAndTestOne() /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:695:19
    #8 0x45c225 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:831:5
    #9 0x449fe8 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:825:6
    #10 0x473452 in main /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerMain.cpp:19:10
    #11 0x7f5d84c2fbf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)

==26551==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: out-of-memory /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:145:3 in malloc
==26551==ABORTING
MS: 1 ShuffleBytes-; base unit: 1c223175724dec61e3adf94affb1cceec27d30ae
0x89,0x50,0x4e,0x47,0xd,0xa,0x1a,0xa,0x0,0x0,0x0,0xd,0x49,0x48,0x44,0x52,0x0,0x0,0x0,0x10,0x0,0x0,0x0,0x63,0x1,0x0,0x0,0x0,0x0,0x4,0x0,0x41,0x41,0x62,0x47,0x4b,0x41,0x73,0x50,0x4c,0x54,0x44,0x41,0x41,0x41,0x41,0x41,0xb9,
\x89PNG\x0d\x0a\x1a\x0a\x00\x00\x00\x0dIHDR\x00\x00\x00\x10\x00\x00\x00c\x01\x00\x00\x00\x00\x04\x00AAbGKAsPLTDAAAAA\xb9
artifact_prefix='./'; Test unit written to ./crash-ac5ad67d43ac829fd5148d6930a33c17c2ac7143
Base64: iVBORw0KGgoAAAANSUhEUgAAABAAAABjAQAAAAAEAEFBYkdLQXNQTFREQUFBQUG5
stat::number_of_executed_units: 2888597
stat::average_exec_per_sec:     103164
stat::new_units_added:          533
stat::slowest_unit_time_sec:    0
stat::peak_rss_mb:              682

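Well, a crash straight away; that is impressive. It shows once more that a good dictionary makes the fuzzing input more targeted, which raises the chance of reaching more code and of getting a crash.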
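Log entries such as MS: 1 ManualDict- DE: "bKGD"- in the run above indicate that a dictionary entry was applied to the input. As a deliberately simplified sketch of that idea, the function below inserts a token at a fixed offset. InsertToken is a hypothetical name; libFuzzer's own mutator chooses the entry and the position at random and can also overwrite bytes instead of inserting them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified dictionary mutation: returns a copy of `input` with
// `token` inserted at byte offset `pos`. Returns the input unchanged when the
// result would exceed `max_size`.
std::vector<uint8_t> InsertToken(const std::vector<uint8_t>& input,
                                 const std::string& token, size_t pos,
                                 size_t max_size) {
  assert(pos <= input.size());
  if (input.size() + token.size() > max_size) return input;
  const auto split = input.begin() + static_cast<std::ptrdiff_t>(pos);
  std::vector<uint8_t> mutated(input.begin(), split);
  mutated.insert(mutated.end(), token.begin(), token.end());
  mutated.insert(mutated.end(), split, input.end());
  return mutated;
}
```

Dropping a whole chunk type such as "sPLT" into the input in one step is what lets the fuzzer reach chunk handlers that byte-level mutations would rarely hit by chance.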
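When I built with the workshop's #1 method and used the dict, cov only reached 40-odd and no crash was found, so the crash above also owes something to our instrumentation strategy.
Getting a crash without any corpus was unexpected. If a dictionary alone does not produce a crash for a while, another approach is to look for a valid input corpus. libfuzzer is an evolutionary fuzzer that combines generation and mutation. If we supply some good seeds, then even though they cannot crash the program themselves, libfuzzer will mutate them and may produce better inputs, which increases the probability of a crash. For the specific mutation strategies, we need to read the libfuzzer source code or related papers.
The workshop provides some seeds: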

➜  09 git:(master) ✗ ls seed_corpus 
anti_aliasing_perspective.png             blue_yellow_alpha.png                green.png                                  offset_background_filter_1x.png
anti_aliasing.png                         blue_yellow_alpha_translate.png      green_small.png                            offset_background_filter_2x.png
axis_aligned.png                          blue_yellow_anti_aliasing.png        green_small_with_blue_corner.png           rotated_drop_shadow_filter_gl.png
background_filter_blur_off_axis.png       blue_yellow_filter_chain.png         green_with_blue_corner.png                 rotated_drop_shadow_filter_sw.png
background_filter_blur_outsets.png        blue_yellow_flipped.png              image_mask_of_layer.png                    rotated_filter_gl.png
background_filter_blur.png                blue_yellow_partial_flipped.png      intersecting_blue_green.png                rotated_filter_sw.png
background_filter_on_scaled_layer_gl.png  blue_yellow.png                      intersecting_blue_green_squares.png        scaled_render_surface_layer_gl.png
background_filter_on_scaled_layer_sw.png  blur_filter_with_clip_gl.png         intersecting_blue_green_squares_video.png  scaled_render_surface_layer_sw.png
background_filter.png                     blur_filter_with_clip_sw.png         intersecting_light_dark_squares_video.png  spiral_64_scale.png
background_filter_rotated_gl.png          checkers_big.png                     mask_bottom_right.png                      spiral_double_scale.png
background_filter_rotated_sw.png          checkers.png                         mask_middle.png                            spiral.png
black.png                                 dark_grey.png                        mask_of_background_filter.png              white.png
blending_and_filter.png                   enlarged_texture_on_crop_offset.png  mask_of_clipped_layer.png                  wrap_mode_repeat.png
blending_render_pass_cm.png               enlarged_texture_on_threshold.png    mask_of_layer.png                          yuv_stripes_alpha.png
blending_render_pass_mask_cm.png          filter_with_giant_crop_rect.png      mask_of_layer_with_blend.png               yuv_stripes_clipped.png
blending_render_pass_mask.png             force_anti_aliasing_off.png          mask_of_replica_of_clipped_layer.png       yuv_stripes_offset.png
blending_render_pass.png                  four_blue_green_checkers_linear.png  mask_of_replica.png                        yuv_stripes.png
blending_transparent.png                  four_blue_green_checkers.png         mask_with_replica_of_clipped_layer.png     zoom_filter_gl.png
blending_with_root.png                    green_alpha.png                      mask_with_replica.png                      zoom_filter_sw.png

Fuzz with seed_corpus:

➜  09 git:(master) ✗ ./libpng_read_fuzzer seed_corpus
#502095    REDUCE cov: 626 ft: 2025 corp: 450/631Kb lim: 19944 exec/s: 4219 rss: 457Mb L: 821/19555 MS: 2 CMP-EraseBytes- DE: "JDAT"-
#502951    REDUCE cov: 626 ft: 2025 corp: 450/630Kb lim: 19944 exec/s: 4226 rss: 457Mb L: 2710/19555 MS: 1 EraseBytes-
#503447    REDUCE cov: 626 ft: 2025 corp: 450/630Kb lim: 19944 exec/s: 4230 rss: 457Mb L: 467/19555 MS: 1 EraseBytes-
=================================================================
==26681==ERROR: AddressSanitizer: allocator is out of memory trying to allocate 0x60000008 bytes
    #0 0x51f69d in malloc /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:145:3
    #1 0x5ad493 in png_read_buffer /home/admin/libfuzzer-workshop/lessons/09/libpng/pngrutil.c:310:16
    #2 0x5ad493 in png_handle_sCAL /home/admin/libfuzzer-workshop/lessons/09/libpng/pngrutil.c:2323:13
    #3 0x571a4c in png_read_info /home/admin/libfuzzer-workshop/lessons/09/libpng/pngread.c:200:10
    #4 0x551b3a in LLVMFuzzerTestOneInput /home/admin/libfuzzer-workshop/lessons/09/libpng_read_fuzzer.cc:91:3
    #5 0x459a21 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:553:15
    #6 0x459265 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:469:3
    #7 0x45b507 in fuzzer::Fuzzer::MutateAndTestOne() /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:695:19
    #8 0x45c225 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:831:5
    #9 0x449fe8 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:825:6
    #10 0x473452 in main /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerMain.cpp:19:10
    #11 0x7fc8e2ee1bf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)

==26681==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: out-of-memory /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:145:3 in malloc
==26681==ABORTING
MS: 1 ChangeByte-; base unit: 7221de698a693628dcbac00aa34b38a2aca2a905
0x89,0x50,0x4e,0x47,0xd,0xa,0x1a,0xa,0x0,0x0,0x0,0xd,0x49,0x48,0x44,0x52,0x0,0x0,0x0,0x27,0x0,0x0,0x0,0xc8,0x8,0x2,0x0,0x0,0x0,0x22,0x3a,0x39,0xc9,0x0,0x0,0x0,0x1,0x73,0x52,0x47,0x42,0x0,0xae,0xce,0x1c,0xe9,0x0,0x0,0x0,0x9,0x70,0x48,0x59,0x73,0x0,0x0,0xb,0x13,0x0,0x0,0xb,0x13,0x1,0x0,0x9a,0x9c,0x18,0x60,0x0,0x0,0x7,0x73,0x43,0x41,0x4c,0x7,0xdd,0xed,0x4,0x14,0x33,0x74,0x49,0x0,0x0,0x0,0x0,0xb7,0xba,0x47,0x42,0x60,0x82,
\x89PNG\x0d\x0a\x1a\x0a\x00\x00\x00\x0dIHDR\x00\x00\x00'\x00\x00\x00\xc8\x08\x02\x00\x00\x00\":9\xc9\x00\x00\x00\x01sRGB\x00\xae\xce\x1c\xe9\x00\x00\x00\x09pHYs\x00\x00\x0b\x13\x00\x00\x0b\x13\x01\x00\x9a\x9c\x18`\x00\x00\x07sCAL\x07\xdd\xed\x04\x143tI\x00\x00\x00\x00\xb7\xbaGB`\x82
artifact_prefix='./'; Test unit written to ./crash-110b2ad7102489b24efc4899bf7d9e55904eb83b
Base64: iVBORw0KGgoAAAANSUhEUgAAACcAAADICAIAAAAiOjnJAAAAAXNSR0IArs4c6QAAAAlwSFlzAAALEwAACxMBAJqcGGAAAAdzQ0FMB93tBBQzdEkAAAAAt7pHQmCC

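This run also produced a crash, but it differs from the one above. There, cov was only 293 at the time of the crash and the crashing input was Base64: iVBORw0KGgoAAAANSUhEUgAAABAAAABjAQAAAAAEAEFBYkdLQXNQTFREQUFBQUG5, whereas with the seeds cov reached 626 and the crashing input was Base64: iVBORw0KGgoAAAANSUhEUgAAACcAAADICAIAAAAiOjnJAAAAAXNSR0IArs4c6QAAAAlwSFlzAAALEwAACxMBAJqcGGAAAAdzQ0FMB93tBBQzdEkAAAAAt7pHQmCC, which is much longer.
In most cases we use a dictionary and a corpus together, strengthening the test cases from both the generation side and the mutation side.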

Next we analyze the cause of the crash: ERROR: AddressSanitizer: allocator is out of memory trying to allocate 0x60000008 bytes. This looks familiar, much like the error reported in lesson 09, but there are differences. ASan locates the error at in malloc /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:145:3, which is the underlying malloc, and it also prints a hint: if you don't care about these errors you may set allocator_may_return_null=1. This tells us the crash was caused by a failed malloc request, namely the allocation at /home/admin/libfuzzer-workshop/lessons/09/libpng/pngrutil.c:310:16:

if (buffer == NULL)
   {
      buffer = png_voidcast(png_bytep, png_malloc_base(png_ptr, new_size));  // the png_malloc_base call in question

      if (buffer != NULL)
      {
         png_ptr->read_buffer = buffer;
         png_ptr->read_buffer_size = new_size;
      }

      else if (warn < 2) /* else silent */
      {
         if (warn != 0)
             png_chunk_warning(png_ptr, "insufficient memory to read chunk");

         else
             png_chunk_error(png_ptr, "insufficient memory to read chunk");
      }
   }

The problem lies in png_malloc_base(png_ptr, new_size): new_size is not strictly bounded, so malloc ends up trying to allocate 0x60000008 bytes, which causes the abnormal crash.
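To see where such a large new_size can come from, one can list the chunk headers of the crashing input. The sketch below is a stand-alone walker, not libpng code. ListChunkHeaders, ChunkHeader and kSeedCrash are names introduced here, the bytes are copied from the crash dump of the seed_corpus run, and the code assumes the standard PNG layout of a 4-byte big-endian length, a 4-byte type, the payload, and a 4-byte CRC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct ChunkHeader {
  std::string type;  // four-character chunk type, e.g. "IHDR"
  uint32_t length;   // declared payload length
};

// Crashing input from the seed_corpus run, copied from the ASan dump.
static const uint8_t kSeedCrash[] = {
    0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a,  // PNG signature
    0x00, 0x00, 0x00, 0x0d, 0x49, 0x48, 0x44, 0x52,  // length 13, IHDR
    0x00, 0x00, 0x00, 0x27, 0x00, 0x00, 0x00, 0xc8, 0x08, 0x02,
    0x00, 0x00, 0x00, 0x22, 0x3a, 0x39, 0xc9,
    0x00, 0x00, 0x00, 0x01, 0x73, 0x52, 0x47, 0x42,  // length 1, sRGB
    0x00, 0xae, 0xce, 0x1c, 0xe9,
    0x00, 0x00, 0x00, 0x09, 0x70, 0x48, 0x59, 0x73,  // length 9, pHYs
    0x00, 0x00, 0x0b, 0x13, 0x00, 0x00, 0x0b, 0x13, 0x01,
    0x00, 0x9a, 0x9c, 0x18,
    0x60, 0x00, 0x00, 0x07, 0x73, 0x43, 0x41, 0x4c,  // sCAL header
    0x07, 0xdd, 0xed, 0x04, 0x14, 0x33, 0x74, 0x49, 0x00, 0x00,
    0x00, 0x00, 0xb7, 0xba, 0x47, 0x42, 0x60, 0x82};

// Walks the chunk headers that follow the 8-byte signature. Stops at the first
// chunk whose declared length runs past the end of the input, but still
// reports that chunk's header.
std::vector<ChunkHeader> ListChunkHeaders(const uint8_t* data, size_t size) {
  assert(data != nullptr || size == 0);
  std::vector<ChunkHeader> headers;
  size_t pos = 8;  // skip the signature
  while (pos + 8 <= size) {
    uint32_t length = (uint32_t(data[pos]) << 24) |
                      (uint32_t(data[pos + 1]) << 16) |
                      (uint32_t(data[pos + 2]) << 8) | uint32_t(data[pos + 3]);
    std::string type(reinterpret_cast<const char*>(data + pos + 4), 4);
    headers.push_back({type, length});
    pos += 8;
    // The payload and its 4-byte CRC must fit in what is left of the input.
    if (size - pos < uint64_t(length) + 4) break;
    pos += size_t(length) + 4;
  }
  return headers;
}
```

For this input the walker reports IHDR, sRGB, pHYs and finally sCAL with a declared length of 0x60000007. The stack trace shows png_handle_sCAL calling png_read_buffer, and the failed request for 0x60000008 bytes is consistent with that declared length plus one byte: the size is taken from attacker-controlled data.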

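Summary

After working through this part, I have a clearer understanding of how to improve libfuzzer's efficiency, including compile-time instrumentation, dictionary use, and corpus selection. Fuzzing has been around almost as long as software itself. After such a long period of development, research on it keeps deepening, and many specialized fuzzing tools have been built for different needs. Theory has to be combined with practice: to understand libfuzzer more deeply, we still need to analyze its source code and consult research papers.

I am new to libfuzzer; if there are mistakes or oversights, corrections are welcome.

(End)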