Project List

Page history last edited by dmolnar@... 11 years, 9 months ago


This is a list of projects for catchconv. If you are interested in helping out, these projects are a place to start. Contact: dmolnar@eecs.berkeley.edu



*** Testing: automation. 
Our work in the spring convinced me that we 
need to dramatically simplify the process for setting up catchconv test cases. 
For example, I've created a script "spawnit.pl" in /valgrind-catchconv/catchconv/tests . 
The script takes as input a filename, creates Torque batch scripts for running Catchconv and 
for running zzuf, then submits those batch scripts via qsub. That is, "perl spawnit.pl snippet3.mp3" 
starts a zzuf run and a catchconv run for mplayer on snippet3.mp3 . We have ten visiting 
undergraduates here this summer who all want to do fuzz testing. I'm going to make more 
scripts like spawnit.pl for them, but I will need help. 
What you would do: 
- Work with the 10 undergrads to show them how to use the test scripts I write in the next 
few weeks. Help them when things break. 
- Refine existing test scripts to be more foolproof. 
- Write new test scripts for use by the 10 undergrads. The current ones are written in perl, 
but you can use whatever scripting language you like. If you don't know one, it isn't hard to learn. 
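
As a rough illustration of what a spawnit.pl-style script does, here is a minimal Python sketch that writes one Torque batch script per tool for a seed file and optionally submits it with qsub. This is not the actual spawnit.pl. The function names, the walltime default, and the `TOOLS` table are assumptions, and `run-catchconv` is a placeholder because this page does not give the real Catchconv command line.

```python
import shlex
import subprocess
from pathlib import Path


def render_torque_script(job_name, command, walltime="01:00:00"):
    """Return the text of a Torque batch script that runs one command."""
    return "\n".join([
        "#!/bin/sh",
        "#PBS -N " + job_name,
        "#PBS -l walltime=" + walltime,
        'cd "$PBS_O_WORKDIR"',  # Torque sets this to the directory qsub ran in
        shlex.join(command),    # quote the arguments safely for the shell
        "",
    ])


def spawn(seed_file, tools, out_dir=".", submit=False):
    """Write one batch script per tool for seed_file; submit each if asked."""
    stem = Path(seed_file).stem
    written = []
    for tool, command_prefix in tools.items():
        script = Path(out_dir) / "{}-{}.pbs".format(tool, stem)
        text = render_torque_script(tool + "-" + stem, command_prefix + [seed_file])
        script.write_text(text)
        if submit:
            # Requires a Torque installation with qsub on the PATH.
            subprocess.run(["qsub", str(script)], check=True)
        written.append(script)
    return written


# Hypothetical command prefixes; substitute the real invocations.
TOOLS = {
    "zzuf": ["zzuf", "mplayer"],
    "catchconv": ["run-catchconv", "mplayer"],  # placeholder, not a real command
}
```

Calling `spawn("snippet3.mp3", TOOLS)` writes `zzuf-snippet3.pbs` and `catchconv-snippet3.pbs` in the current directory; passing `submit=True` also hands each one to qsub.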

*** Testing: spidering for test cases 
Crawl the web. Look for test case inputs for programs 
we'd like to test, such as .GIF files or .mp3 files for mplayer. Grab the test cases and edit them 
(e.g. cut audio files down to 1s each). 
What you would do:
- Write scripts to obtain test cases, 
by direct crawl, by leveraging Google, or by some other clever method. Don't do anything which 
would get the RIAA/MPAA to come after us. 
- Write scripts to process test cases before starting Catchconv 
- Write a script to start a Catchconv run and record the results 
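
One possible starting point for the crawl step is a link filter: given the HTML of a page that has already been fetched, pull out the links whose path ends in an extension we care about. The sketch below uses only the Python standard library. The names `LinkCollector`, `candidate_test_cases`, and `WANTED` are made up for illustration, and fetching pages, rate limiting, and checking that a file is legal to download are all left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Extensions of interest; extend this tuple per program under test.
WANTED = (".gif", ".mp3")


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href=...> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def candidate_test_cases(html, base_url, extensions=WANTED):
    """Return links whose URL path ends with one of the wanted extensions."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [
        url for url in collector.links
        # Compare only the path, so query strings do not hide the extension.
        if urlparse(url).path.lower().endswith(extensions)
    ]
```

The extension check is case-insensitive, so `a.GIF` and `a.gif` both match.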

*** Testing: Elastic Compute Cloud backend 

Currently, we use the Berkeley PSI and Millennium clusters to host 
catchconv and zzuf jobs. While we love our clusters and the people who run them, 
depending on them makes it more difficult for people outside Berkeley to use our work 
on a large scale. In addition, we have to share these clusters, which will become inconvenient
 when we want to run fuzzing on 10,000 seed files at the same time. 

This project consists of building a fuzzing backend which runs on Amazon's Elastic Compute Cloud (EC2).
 The long-term vision: third parties easily set up EC2 runs for their own projects. Each run 
contributes data to our project and contributes bug reports to their project. Each run can have 
thousands of seed files and run a different catchconv or zzuf fuzzing search on each seed file. 

A first step is to take a test program we already understand from runs on the cluster, then make
it run on EC2 at larger scale. For example, we currently test mplayer with 5-10 test files, 
so the first milestone would be testing mplayer with 100-1000 test files. 

What you would do: 
- Obtain an Amazon EC2 web service account; you will be the point of contact for Amazon EC2 
use on this project. 
- Understand the EC2 architecture. 
- Define a method for extracting generated test cases from EC2 runs. 
- Write scripts to automatically create Amazon Machine Images (AMIs) pre-loaded with catchconv 
or zzuf, plus a seed file. 
- Write scripts to submit fuzz AMIs to EC2, monitor execution, and generate summary reports. 
- Generate documentation for third parties which shows them how to create AMIs with their own 
software under test. 
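
Before any EC2 calls are made, the submit script needs a plan of which runs to launch. A minimal sketch of that bookkeeping, assuming one run per (seed file, tool) pair and a cap on how many instances run at once; both function names and the cap are assumptions, and nothing here talks to EC2.

```python
from itertools import product


def plan_runs(seed_files, tools=("catchconv", "zzuf")):
    """Return one run per (seed file, tool) pair, in a stable order."""
    return [
        {"seed": seed, "tool": tool}
        for seed, tool in product(sorted(seed_files), tools)
    ]


def batches(runs, max_parallel):
    """Split runs into waves of at most max_parallel runs each."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    return [runs[i:i + max_parallel] for i in range(0, len(runs), max_parallel)]
```

With 1000 seed files and two tools this yields 2000 runs; `batches(runs, 100)` would then give 20 waves to submit and monitor one after another.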

*** Reporting: web front end. 

I have added features to gensearch.pl and zzuf.pl 
which report information about bugs to a SQL database. You can see summary statistics 
at http://www.metafuzz.com . We need more in-depth reports. For example, we might want 
reports on how many bugs have been found with catchconv vs. how many have been found with zzuf. 
The site could be prettier, too. 
What you would do: 
- Identify interesting reports to generate. 
- Write server-side code to display such reports. 
- Write server-side code to export data in comma-separated format, so our colleagues can look at 
the data in Excel or R or Matlab. 
- Take control of the http://www.metafuzz.com presentation. 
- Improve the reporting mechanism in the current client-side script and server-side code. 
Currently the server-side code is in PHP5, but if you feel like replacing it with something else, 
that's fine with me. If you want to use .NET, let me know; I'll have to find another hosting provider. 
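
To make the catchconv-vs-zzuf report and the CSV export concrete, here is a small Python sketch against an in-memory SQLite database. The `bugs` table and its columns are a hypothetical schema, not the real metafuzz.com one, and the rows in `demo_connection` are placeholders that exist only so the example runs.

```python
import csv
import io
import sqlite3


def demo_connection():
    """Build an in-memory database with a hypothetical bugs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bugs (id INTEGER PRIMARY KEY, tool TEXT, program TEXT, kind TEXT)"
    )
    # Placeholder rows for illustration only; not real results.
    conn.executemany(
        "INSERT INTO bugs (tool, program, kind) VALUES (?, ?, ?)",
        [
            ("catchconv", "mplayer", "invalid write"),
            ("catchconv", "mplayer", "invalid read"),
            ("zzuf", "mplayer", "invalid read"),
        ],
    )
    return conn


def bugs_per_tool(conn):
    """Return (tool, bug_count) rows, one per tool, sorted by tool name."""
    query = "SELECT tool, COUNT(*) FROM bugs GROUP BY tool ORDER BY tool"
    return conn.execute(query).fetchall()


def to_csv(header, rows):
    """Render a header and rows as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()
```

`to_csv(["tool", "bug_count"], bugs_per_tool(conn))` produces text that Excel, R, or Matlab can import directly. The same query would carry over to PHP with little change.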

*** Reporting: bug reporting to developers. 
Currently, we drop test cases in a directory created
as part of the test run. Then we have to look at the test cases manually and submit them to the 
package maintainer. This is a pain in the neck. Instead, this project would automatically identify 
"new" bugs found via zzuf or catchconv, probably using the SQL database we now have of all bugs 
found. Then it would provide a way for us to easily review such bugs and automatically report these 
new bugs to the software developers. For a concrete test case, we could start with reporting bugs to
the mplayer bugzilla. mplayer is a good test case because they have explicit guidelines on what they
want in their bug reports.
What you would do: 
- Write code to identify "new" bugs 
- Write code to automatically generate bug reports for each new bug, including such things as 
stack trace, type of bug, software version, etc. 
- Build a mechanism for easy human review of these "new" bugs. 
- Write code to file reviewed "new" bugs in developer bugzilla. 
- Work with users and software developers to refine and improve the process. You would be 
the main point of contact for bugs filed this way. 
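
One common way to decide whether a bug is "new" is to reduce each crash to a signature and compare it against the signatures already on file. The sketch below hashes the program name, the bug type, and the top few stack frames. That choice of signature, the depth of 3, and the dictionary keys are assumptions for illustration, not how the existing database identifies bugs.

```python
import hashlib


def bug_signature(program, kind, stack_frames, depth=3):
    """Hash the program, bug type, and top `depth` stack frames."""
    key = "\n".join([program, kind] + list(stack_frames[:depth]))
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def new_bugs(candidates, known_signatures):
    """Return candidates whose signature is not known, dropping duplicates."""
    seen = set(known_signatures)
    fresh = []
    for bug in candidates:
        signature = bug_signature(bug["program"], bug["kind"], bug["stack"])
        if signature not in seen:
            seen.add(signature)  # later candidates with this signature are repeats
            fresh.append(bug)
    return fresh
```

Using only the top frames means two crashes that differ deep in the call chain count as one bug. That is usually what a human reviewer wants, but it can merge distinct bugs, which is one reason to keep the human review step.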

*** Core: scaling catchconv to Firefox
Catchconv runs successfully on Flash Player on Linux. Unfortunately, Catchconv runs 
out of memory when trying to run with Firefox. This project: figure out why, then fix it.
It will require a lot of hacking on catchconv, which is a large-ish code base written in C. 
There may be another student interested in this project; I'll let you know if that firms up. 
What you would do: 
- Add code to catchconv to track memory usage of various data structures. 
Identify places where catchconv could re-use memory. David Wagner and I have some thoughts here. 
- Implement optimizations in catchconv to reduce memory usage. 

*** Core: change catchconv output format to SMT-LIB 
Currently Catchconv outputs formulas in the STP format. There's a different format called SMT-LIB 
which is an emerging standard for constraint solvers. If we changed the output format to SMT-LIB, 
then we could easily compare the performance of different constraint solvers. In particular, 
Prof. Sanjit Seshia and his students have a new solver they could plug into our testing. This 
project is conceptually straightforward, but it will require changing the entire catchconv code
base. A key issue will be avoiding regressions. 
What you would do: 
- Learn the SMT-LIB format 
- Write many regression tests 
- Change the catchconv code base 
- Test changes with several different constraint solvers
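
For a feel of the target format, here is a small Python sketch that emits a bit-vector query in SMT-LIB syntax. It uses SMT-LIB version 2 notation, which is an assumption; this page may have had the older version 1 syntax in mind. The helper names are made up, and catchconv itself is written in C, so this only illustrates the output, not the code change.

```python
def bv_literal(value, width):
    """Render an integer as a hexadecimal bit-vector literal of `width` bits."""
    if width % 4 != 0:
        raise ValueError("hex literals need a width that is a multiple of 4")
    # Reduce modulo 2**width so negative or oversized values wrap around.
    return "#x" + format(value % (1 << width), "0{}x".format(width // 4))


def to_smtlib(variables, assertions):
    """Build a QF_BV script from {name: width} and a list of assertion terms."""
    lines = ["(set-logic QF_BV)"]
    for name, width in sorted(variables.items()):
        lines.append("(declare-fun {} () (_ BitVec {}))".format(name, width))
    for term in assertions:
        lines.append("(assert {})".format(term))
    lines.append("(check-sat)")
    return "\n".join(lines) + "\n"


# Integer overflow query: is there a 32-bit x with x + 1 < x (unsigned)?
overflow_query = to_smtlib(
    {"x": 32},
    ["(bvult (bvadd x {}) x)".format(bv_literal(1, 32))],
)
```

The query is satisfiable only when x is all ones, the one value for which adding 1 wraps around, so it is a handy first regression test when comparing solvers.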


*** Core: finding more bugs
Extend catchconv to generate more test cases for a broader variety
of security bugs.  Right now catchconv is primarily targeted at finding
integer overflow vulnerabilities (though it happens to find many other
kinds of security vulnerabilities as it is used).  There is information
in the literature about how to use the same infrastructure to target
other kinds of security bugs as well.  This partially comes down to
generating extra queries to automatically generate test cases that
are likely to expose these other kinds of vulnerabilities.  Conceptually
this work should not be too difficult, but it would require hacking on
the catchconv core code.  This project could significantly increase the
effectiveness of catchconv at finding vulnerabilities and so might make
catchconv a more exciting and useful tool, perhaps enough to increase
adoption of catchconv.

*** Fuzzing wireshark

Download wireshark, a packet sniffer, and use catchconv to fuzz test
the protocol decoders.  There have been a number of security holes in
the protocol decoders in the past and it is a known risk for wireshark,
so it might be a useful thing to test.  It's possible to run wireshark
from the command line, reading from a file that contains a packet dump,
so this should be well-suited to the file-based fuzzing that catchconv
supports.  The wireshark distribution already contains scripts for a
rudimentary form of fuzzing, which might help in making it work with
