Böttinger, Konstantin; Godefroid, Patrice; Singh, Rishabh (2018)
Languages: English
Types: Preprint
Subjects: Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security

Fuzzing is the process of finding security vulnerabilities in input-processing code by repeatedly testing the code with modified inputs. In this paper, we formalize fuzzing as a reinforcement learning problem using the concept of Markov decision processes. This in turn allows us to apply state-of-the-art deep Q-learning algorithms that optimize rewards, which we define from runtime properties of the program under test. By observing the rewards that result from applying a specific set of mutation actions to an initial program input, the fuzzing agent learns a policy that can then generate new, higher-reward inputs. We have implemented this new approach, and preliminary empirical evidence shows that reinforcement fuzzing can outperform baseline random fuzzing.
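To make the formulation above concrete, the following is a minimal sketch rather than the authors' implementation. It swaps deep Q-learning for plain tabular Q-learning and replaces the program under test with a hypothetical `toy_target` function, whose return value stands in for a runtime property such as code coverage. The names `MAGIC`, `ALPHABET`, `ACTIONS`, `mutate`, `q_update`, `train`, and `greedy_rollout`, as well as all hyperparameter values, are assumptions chosen for illustration. The state is the current input, an action overwrites one byte, and the reward is the coverage proxy of the mutated input.

```python
import random
from collections import defaultdict

MAGIC = b"FUZZ"      # hypothetical header checked by the toy target
ALPHABET = b"AFUZ"   # byte values the mutation actions may write

# Each action overwrites one position with one byte value.
ACTIONS = [(pos, val) for pos in range(len(MAGIC)) for val in ALPHABET]


def toy_target(data: bytes) -> int:
    """Stand-in for the program under test. Returns how many leading
    bytes match MAGIC, a crude proxy for code coverage."""
    depth = 0
    for got, want in zip(data, MAGIC):
        if got != want:
            break
        depth += 1
    return depth


def mutate(data: bytes, action) -> bytes:
    """Apply one mutation action (position, byte value) to the input."""
    pos, val = action
    out = bytearray(data)
    out[pos] = val
    return bytes(out)


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def train(seed_input: bytes, episodes=2000, steps=8, epsilon=0.2, rng=None):
    """Learn Q-values by repeatedly mutating the seed input (epsilon-greedy)."""
    rng = rng or random.Random(0)
    q = defaultdict(float)
    for _ in range(episodes):
        state = seed_input
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state = mutate(state, action)
            reward = toy_target(next_state)  # reward from a runtime property
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q


def greedy_rollout(q, seed_input: bytes, steps=8) -> bytes:
    """Follow the learned policy and return the highest-reward input seen."""
    state = best = seed_input
    for _ in range(steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state = mutate(state, action)
        if toy_target(state) > toy_target(best):
            best = state
    return best
```

A possible usage is `q = train(b"AAAA")` followed by `greedy_rollout(q, b"AAAA")`, which replays the learned policy from the seed input. On a real target, the reward would come from instrumentation of the program under test rather than from a hand-written function, and the table would be replaced by a neural network approximating the Q-function.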