Characterising Logging Practices in Open-Source Software

Ding Yuan, Soyeon Park and Yuanyuan Zhou.
[PDF | PS | BibTex]

Software logging is a conventional programming practice. While its efficacy is often important for users and developers to understand what have happened in the production run, yet software logging is often done in an arbitrary manner. So far, there have been little study for understanding logging practices in real world software. This paper makes the first attempt (to the best of our knowledge) to provide a quantitative characteristic study of the current log messages within four pieces of large open-source software. First, we quantitatively show that software logging is pervasive. By examining developers' own modifications to the logging code in the revision history, we find that they often do not make the log messages right in their first attempts, and thus need to spend a significant amount of efforts to modify the log messages as after-thoughts. Our study further provides several interesting findings on where developers spend most of their efforts in modifying the log messages, which can give insights for programmers, tool developers, and language and compiler designers to improve the current logging practice. To demonstrate the benefit of our study, we built a simple checker based on one of our findings and effectively detected 138 pieces of new problematic logging code from studied software (24 of them are already confirmed and fixed by developers).