Hieu Nguyen

Paper review - How not to structure your database-backed web applications: a study of performance bugs in the wild

Aug 10 2018


I recently started reading The Morning Paper. The site’s author reviews one CS paper every weekday morning, and the content is super interesting. I think this is a great way to learn, and something I can try in order to improve myself. So I am starting with a paper I found there: How not to structure your database-backed web applications: a study of performance bugs in the wild.

The author’s post about the paper is here.

As the name says, the paper is about anti-patterns that cause performance issues in applications that use ORMs to talk to a database server.

Short summary

This is a simple paper about anti-patterns in the use of ORMs in popular web frameworks.

The methods to approach this research are pretty interesting:

Table 1

Table 2

From those methods, they find 64 performance issues and generalize a few anti-patterns:

After that, they proceed to fix the performance issues manually.

Table 3

From that result, they create a simple regex-based static checker that automatically detects ORM API misuses. The tool detects 428 cases of API misuse, 3 of which coincide with the initial 64 performance issues.
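To get a feel for what a regex-based checker of this kind might look like, here is a minimal sketch in Python. It is not the authors' tool. The three rules, the names `RULES`, `Finding` and `check_source`, and the sample Ruby lines are all my own assumptions, chosen only to illustrate the idea of scanning source lines for suspicious ORM call chains.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    line_no: int      # 1-based line number in the scanned source
    rule: str         # short rule identifier
    suggestion: str   # hint shown to the developer
    source: str       # the offending line, stripped


# Hypothetical rules for illustration only; the paper's checker has its own list.
RULES = [
    ("where-first", re.compile(r"\.where\(.*\)\.first\b"),
     "consider find_by(...) instead of where(...).first"),
    ("map-pluck", re.compile(r"\.map\(&:\w+\)"),
     "consider pluck(:column) to avoid loading full records"),
    ("all-each", re.compile(r"\.all\.each\b"),
     "consider find_each to load records in batches"),
]


def check_source(source: str) -> List[Finding]:
    """Scan Ruby source text line by line and report every rule that matches."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):  # skip whole-line Ruby comments
            continue
        for name, pattern, suggestion in RULES:
            if pattern.search(line):
                findings.append(Finding(line_no, name, suggestion, stripped))
    return findings


# Made-up Ruby snippet used as input for the demo below.
SAMPLE = """\
user = User.where(email: email).first
titles = Post.all.map(&:title)
# Post.where(id: 1).first is only a comment
User.all.each do |u|
"""

if __name__ == "__main__":
    for f in check_source(SAMPLE):
        print(f"line {f.line_no}: [{f.rule}] {f.suggestion}")
```

A checker like this is cheap to write and run, but it only sees text: it cannot tell whether the receiver is really an ORM relation, and it misses calls that span several lines. That keeps the sketch honest about why such a tool can both over-report and under-report.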

To conclude, they propose several ways to help developers avoid these anti-patterns:

My thoughts

This is a very simple and comprehensible paper, which developers without a research background (like me) can understand easily. The research team publishes their tools and methods as open source and as Docker images, which lets us replicate and improve on their research. This paper inspires me to try to generalize and implement tools that detect similar anti-patterns, so that performance issues can be avoided right from the development phase.

Their ideas about how to improve the current process are interesting and useful, especially the idea of using static analysis to check database design and code and to suggest improvements. A tool that gives us guidelines for improving our applications would be a big help to novice and mid-level developers (and maybe experienced developers as well).

However, I think there are some problems in the paper that the research team runs into. I list them below:

Conclusion

Overall, this is a good paper: it is simple to read and understand, and it gave me a good idea to think about. The authors’ approach, although still flawed, is good enough to find simple performance issues that experienced developers in OSS projects have overlooked. The paper generalizes a list of anti-patterns that we should avoid, and the authors suggest several ideas for checking and fixing them automatically.

This is my first post, so I struggled terribly with it. The review is obviously not very good, but I learned a lot in the process. I hope it gives you a few useful things, too.