MARBLE

Mining API Repositories for Boilerplate Lessening Effort



Overview

Designing usable APIs is critical to developer productivity and software quality, but it is difficult. Discovering usability issues by analyzing, at scale, how client code actually uses an API is harder still. Prior work has applied methods such as lab studies and API design reviews, but API designers still struggle to anticipate how programmers will use their APIs, so these early-stage methods can miss usability issues.

In this project, we focus on one particularly annoying API usability issue: boilerplate code. For example, Java’s javax.xml.transform API is infamous for the amount of boilerplate it requires to write an XML document to an output stream (see the example below, taken from Joshua Bloch's OOPSLA 2006 talk).

            
import org.w3c.dom.*;
import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;

static final void writeDoc(Document doc, OutputStream out) throws IOException{
  try {
    Transformer t = TransformerFactory.newInstance().newTransformer();
    t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, doc.getDoctype().getSystemId());
    t.transform(new DOMSource(doc), new StreamResult(out));
  } catch(TransformerException e) {
    throw new AssertionError(e);  // Can't happen!
  }
}
            
          

Boilerplate can arise for several reasons. If the API designers did not anticipate a usage scenario, the API may lack a method that programmers need, and clients must assemble the behavior themselves from lower-level calls. Alternatively, the designers may have chosen fine-grained operations to increase flexibility; if most clients never use that flexibility and instead call the same methods in the same way, the result is again boilerplate. Either way, boilerplate signals a gap between the API designers’ intentions and the clients’ real usage scenarios, and it is a usability issue.
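As an illustration of the first cause, the sketch below wraps the repeated transformer calls in a single convenience method. The class XmlWriters and its write method are hypothetical names introduced here; they are not part of javax.xml.transform, and the sketch assumes default transformer settings are acceptable to the client.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper (an assumption for illustration, not a JDK class).
public final class XmlWriters {
    private XmlWriters() {}

    /** Serializes doc to out with default transformer settings. */
    public static void write(Document doc, OutputStream out) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.transform(new DOMSource(doc), new StreamResult(out));
        } catch (TransformerException e) {
            // Surface the failure as an unchecked exception so callers stay short.
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny in-memory document to write.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("greeting"));

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        XmlWriters.write(doc, out); // the client-side code shrinks to one call
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Clients that need non-default output properties would still fall back to the fine-grained calls, which is the trade-off between flexibility and brevity described above.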

Boilerplate Survey

Although developers generally agree that having to write boilerplate code is undesirable, and although boilerplate is an interesting dimension of API usability, the concept remains largely undefined and understudied. We have started collecting boilerplate code examples from different sources (e.g., GitHub, Stack Overflow, publications) to understand their common properties and how boilerplate is defined.

We are also asking developers to share boilerplate code examples with us. If you know of any boilerplate example, preferably involving APIs, in Java or otherwise, please share it with us and help us understand it better!

Boilerplate Mining

We are exploring the feasibility of automatically mining boilerplate code from large codebases, with the goal of building tools that API designers can use to find usability issues with their APIs in the wild, at scale. We operationalize the common properties of boilerplate extracted from the examples above and from the survey results, and search for patterns in client code that satisfy those properties. Once we finish developing our mining algorithm, we will publicly share the tool and the mined boilerplate candidates for various APIs for further study.
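To make the idea of searching for recurring patterns concrete, here is a minimal sketch of one possible heuristic. It is not the MARBLE mining algorithm, which is still under development. The sketch assumes each client method has already been reduced to an ordered list of API call names, and it reports contiguous call sequences that appear in many client methods. The class name BoilerplateCandidates, the method frequentSequences, and the parameters n and minSupport are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative heuristic only; all names here are assumptions.
public final class BoilerplateCandidates {
    private BoilerplateCandidates() {}

    /**
     * Returns every contiguous call sequence of length n that occurs in at
     * least minSupport client methods, mapped to the number of client
     * methods containing it.
     */
    public static Map<List<String>, Integer> frequentSequences(
            List<List<String>> clientMethods, int n, int minSupport) {
        Map<List<String>, Integer> support = new HashMap<>();
        for (List<String> calls : clientMethods) {
            // Count each sequence at most once per client method.
            Set<List<String>> seen = new HashSet<>();
            for (int i = 0; i + n <= calls.size(); i++) {
                seen.add(new ArrayList<>(calls.subList(i, i + n)));
            }
            for (List<String> seq : seen) {
                support.merge(seq, 1, Integer::sum);
            }
        }
        // Keep only sequences shared by enough client methods.
        support.values().removeIf(count -> count < minSupport);
        return support;
    }

    public static void main(String[] args) {
        // Made-up call traces for three client methods.
        List<List<String>> clients = Arrays.asList(
            Arrays.asList("TransformerFactory.newInstance", "TransformerFactory.newTransformer",
                          "Transformer.setOutputProperty", "Transformer.transform"),
            Arrays.asList("TransformerFactory.newInstance", "TransformerFactory.newTransformer",
                          "Transformer.transform"),
            Arrays.asList("TransformerFactory.newInstance", "TransformerFactory.newTransformer",
                          "Transformer.setOutputProperty", "Transformer.transform"));
        System.out.println(frequentSequences(clients, 2, 3));
    }
}
```

With these made-up traces, only the factory pair (newInstance followed by newTransformer) is shared by all three client methods. Frequency alone is a weak signal, so a real miner would likely combine it with other boilerplate properties drawn from the collected examples and survey results.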

Team

  • Daye Nam
  • Brad Myers
  • Bogdan Vasilescu
  • Andrew Macvean