He is a lecturer (assistant professor) in the School of Computer Science at the University of Adelaide (founded in 1874, ranked in the top 125 of the QS World University Rankings, and a member of the Group of Eight (Go8) leading Australian universities). He was a postdoctoral research fellow at Queen’s University, Canada. He holds one of the most prestigious and selective sources of national funding in Japan: a JSPS Research Fellowship for Young Researchers and a Grant-in-Aid for JSPS Fellows. He won the "Best Ph.D. Student Award" for his Ph.D. study at Nara Institute of Science and Technology, Japan. During his Ph.D. study, he also spent two years as a visiting researcher at Queen’s University, Canada. His research has been published at top-tier software engineering venues, such as IEEE Transactions on Software Engineering (TSE), Empirical Software Engineering (EMSE), and the International Conference on Software Engineering (ICSE). His Ph.D. thesis aims to improve the fundamentals of predictive modelling for software engineering (e.g., analytics systems for software quality assurance) in order to produce more accurate predictions and more reliable insights.

Interests: Big Data Analytics, Data Science, Predictive Modelling in Software Engineering, Empirical Software Engineering, Mining Software Repositories, Modern Statistical Analysis

June 23, 2017
Our paper on "Studying the Dialogue Between Users and Developers of Free Apps in the Google Play Store" has been accepted by the International Journal of Empirical Software Engineering! (Impact Factor 2016: 3.275)
June 06, 2017
I will be giving a talk on "Using HPC Resources to Improve the Experimental Design of Software Quality Analytics" at HPCS 2017, Kingston, Ontario, Canada.
May 19, 2017
I attended MSR and ICSE 2017 in Buenos Aires, Argentina, from 20 to 27 May 2017.

Today, the software development process depends on a variety of development tools (e.g., issue tracking systems, version control systems, code review, continuous integration, continuous deployment, and Q&A websites). For example, GitHub, the largest source code hosting service in the world, currently hosts over 35 million software repositories, and the most recent million were created within two months. Millions of software projects also generate large quantities of unstructured software artifacts at a high frequency (so-called Big Data) in many forms, such as issue reports, source code, test cases, code reviews, execution logs, app reviews, developer mailing lists, and discussion threads.

Software analytics is a field that focuses on uncovering interesting and actionable knowledge from the unprecedented amount of data in such repositories in order to improve software development, maintenance, evolution, productivity, quality, and user experience. Indeed, many software organizations are eager to make data-driven engineering decisions rather than rely on gut feeling. They also use analytics to identify new opportunities, leading to smarter business moves, more efficient operations, higher profits, and happier customers. For example, Microsoft’s data scientists uncovered the most frequently used commands of Microsoft Windows, which led to an important redesign of its user interfaces. Such insights enable software organizations to work faster and stay agile, giving them a competitive edge they did not have before.

Our mission is to discover the most effective analytics methods and actionable insights for various stakeholders in the software industry. To achieve this, I co-direct the "Software Analytics Group", working with many wonderful students and collaborators, together with industrial partners, to make an immediate impact on our society. At the Software Analytics Group, we use cutting-edge statistical analysis, machine learning, text mining, and data science to develop analytics technologies that turn software engineering data into actionable insights.

  1. Innovations for Software Analytics | Improving quality and efficiency throughout the software-development process
  2. With an increasing amount of data on every aspect of our daily software development activities (from which features we develop and where bugs are fixed to who contributes most to the project, and beyond), we are able to measure code and process characteristics and software developers' behaviour, and to investigate interesting correlations that can be used to predict software quality, maintenance cost, and effort.

    Broadly speaking, I'm interested in applying statistical analysis and machine learning to build predictive models, recommendation and analytics systems in order to (but not limited to):

  3. Technologies for Software Analytics Methods | Exploring new technologies to ensure highly accurate prediction models
  4. While the adoption of software analytics enables software organizations to distill actionable insights, there are still many barriers to the broad and successful adoption of analytics systems. Indeed, even if software organizations can access such invaluable software artifacts and toolkits for data analytics, researchers and practitioners often lack the knowledge needed to develop analytics systems properly. Thus, the accuracy of the predictions and insights that are derived from analytics systems is one of the most important challenges of big data in software engineering. For example, failing to manage and monitor analytics systems effectively can be catastrophic, as it allows them to become outdated, leading to potentially erroneous and costly business decisions. Moreover, the use of inappropriate statistical methods could produce incomplete findings, given the veracity issues of big data in software engineering. Important decisions that are based on misleading insights can then quickly translate into lost revenue.

    Recently, my colleagues and I found that (1) noise in defect datasets, (2) the choice of parameter settings of classification techniques, and (3) the choice of model validation technique have a large impact on the performance and interpretation of defect prediction models. Moreover, my recent work also shows that collinearity and multicollinearity have a large impact on the stability of the insights derived from prediction models.

    Broadly speaking, I'm interested in exploring the following topics (among others) to ensure highly accurate and reliable predictions and insights from software analytics:

    • Improve algorithms for software analytics
    • Characterize bias in data preparation and data preprocessing
    • Identify the most appropriate classification, model validation, and model interpretation techniques
    • Provide practical guidelines to ensure the most accurate and reliable predictions and insights
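The collinearity concern raised above can be illustrated with a minimal sketch that flags pairs of highly correlated metrics before a prediction model is built. This is only an illustration, not the method used in the cited work: the metric names, the made-up values, the helper names `pearson` and `correlated_pairs`, and the 0.7 threshold are all assumptions, and a pairwise check like this one does not detect multicollinearity among three or more metrics.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant metric is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlated_pairs(metrics, threshold=0.7):
    """Return (name_a, name_b, r) for metric pairs with |r| >= threshold.

    The threshold is an assumed cut-off chosen for illustration.
    """
    flagged = []
    for a, b in combinations(sorted(metrics), 2):
        r = pearson(metrics[a], metrics[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical per-module metrics (made-up numbers, for illustration only).
metrics = {
    "lines_of_code": [120, 340, 560, 80, 910],
    "num_statements": [100, 300, 500, 70, 800],
    "num_authors": [3, 1, 4, 2, 2],
}

flagged = correlated_pairs(metrics)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are highly correlated (r = {r})")
```

With these made-up values, only the two size metrics are flagged, which suggests keeping one of them when interpreting a model. A rank-based correlation may be preferable when metrics are heavily skewed.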

  5. Visualization and Infrastructure Support for Big Data Analytics in Software Engineering
  6. Because of the way the human brain processes information, using charts or graphs to visualize large amounts of complex data is easier than poring over spreadsheets or reports. Data visualization is a quick, easy way to convey concepts in a universal manner – and you can experiment with different scenarios by making slight adjustments.

    I'm interested in developing visualization and infrastructure to support big data management for software analytics.

  7. Academic Software for Big Data Analytics in Software Engineering
  8. Academic software is a critical component of academic research in software engineering: we use it to produce research results. However, academic software in software engineering is rarely made available. With the growing emphasis on open-source software, open access, reproducibility, and replicability, I'm interested in developing academic software to support research in empirical software engineering. For example, I actively develop and maintain the ScottKnott ESD test, a statistical test for multiple comparisons of treatments.

    1. Studying the Dialogue Between Users and Developers of Free Apps in the Google Play Store

      Safwat Hassan, Chakkrit Tantithamthavorn, Cor-Paul Bezemer, and Ahmed E. Hassan
      International Journal of Empirical Software Engineering (EMSE)
      2017
    2. (PhD Thesis) Towards a Better Understanding of the Impact of Experimental Components on Defect Prediction Models


      Chakkrit Tantithamthavorn
      Nara Institute of Science and Technology
      2016
    3. A Study of Redundant Metrics in Defect Prediction Datasets

      Jirayus Jiarpakdee, Chakkrit Tantithamthavorn, Akinori Ihara, Kenichi Matsumoto
      Proceedings of the International Symposium on Software Reliability Engineering (ISSRE)
      2016
    4. An Empirical Comparison of Model Validation Techniques for Defect Prediction Models


      Chakkrit Tantithamthavorn, Shane McIntosh, Ahmed E. Hassan, and Kenichi Matsumoto
      IEEE Transactions on Software Engineering (TSE)
      2017
    5. Comments on "Researcher Bias: The Use of Machine Learning in Software Defect Prediction"

      Chakkrit Tantithamthavorn, Shane McIntosh, Ahmed E. Hassan, and Kenichi Matsumoto
      IEEE Transactions on Software Engineering (TSE)
      2016
    6. Automated Parameter Optimization of Classification Techniques for Defect Prediction Models

      Chakkrit Tantithamthavorn, Shane McIntosh, Ahmed E. Hassan, and Kenichi Matsumoto
      The International Conference on Software Engineering (ICSE)
      2016
      Acceptance rate: 19% (101/530)
    7. Towards a Better Understanding of the Impact of Experimental Components on Defect Prediction Modelling

      Chakkrit Tantithamthavorn
      The International Conference on Software Engineering (ICSE) - Doctoral Symposium
      2016
      Acceptance rate: 22% (8/36) for paper presentation

    Google Scholar released the 2016 version of Scholar Metrics, with the h5-index and h5-median for the top 20 conferences and journals in each area of research! In the area of Software Systems, the International Conference on Software Engineering (ICSE) and IEEE Transactions on Software Engineering (TSE) hold the top two positions.

    August 25, 2017
    ICSE 2018, Gothenburg, Sweden
    October 12, 2017
    SANER 2018, Campobasso, Italy
    December 04, 2017
    APSEC 2017, Nanjing, China
    February 03, 2018
    MSR 2018, Gothenburg, Sweden
    March 09, 2018
    FSE 2018, Lake Buena Vista, Florida, USA
    March 30, 2018
    ICSME 2018, Madrid, Spain
    April 22, 2018
    ASE 2018, Montpellier, France

    July 07, 2017
    APSEC 2017, Nanjing, China
    July 25, 2017
    QRS 2017, Prague, Czech Republic
    September 04, 2017
    ESEC/FSE 2017, Paderborn, Germany
    September 04, 2017
    SSBSE 2017, Paderborn, Germany
    September 17, 2017
    ICSME 2017, Shanghai, China
    September 17, 2017
    VISSOFT 2017, Shanghai, China
    September 17, 2017
    SCAM 2017, Shanghai, China
    October 30, 2017
    ASE 2017, Urbana-Champaign, Illinois, USA
    November 06, 2017
    ESEM 2017, Toronto, Canada
    February 21, 2018
    SANER 2018, Campobasso, Italy
    May 27, 2018
    ICSE 2018, Gothenburg, Sweden
    May 27, 2018
    MSR 2018, Gothenburg, Sweden
    September 03, 2018
    ASE 2018, Montpellier, France
    September 17, 2018
    ICSME 2018, Madrid, Spain
    November 04, 2018
    FSE 2018, Lake Buena Vista, Florida, USA
    1. Journal Referee

    2. Program Committee (PC)

      • The International Conference on Software Maintenance and Evolution (ICSME), 2017
    3. Additional Reviewer

      • The Working Conference on Mining Software Repositories (MSR), 2015
      • The India Software Engineering Conference (ISEC), 2015
      • The IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM), 2014
    4. Web (Co)-Chair

      • The International Workshop on Empirical Software Engineering in Practice (IWESEP), 2017
      • The Japan Summer School in Mining Software Repositories (MSR Asia Summit), 2015
      • The Thailand-Japan International Academic Conference (TJIA), 2013
    5. Student Volunteer

      • The International Conference on Software Analysis, Evolution, and Reengineering (SANER), 2016
      • The International Workshop on Empirical Software Engineering in Practice (IWESEP), 2012