Scientific Keynote: Chunyang Chen
Keynote: “Towards Human-like Software Testing”
Software testing has long benefited from automation such as program analysis, GUI testing, and fuzzing, yet much of today’s automated testing remains misaligned with how software is actually used and assessed by humans. In practice, teams face “so many bugs” alongside “so many false positives”: automated approaches can generate infeasible or unrealistic interaction sequences that real-world users would rarely trigger. Such generated test cases and traces seldom reflect human behavior, and they often yield false positives and low-value defects, creating substantial workload for developers who must reproduce and repair issues that do not matter, which in turn erodes developers’ trust in testing tools. Moreover, metric-driven testing, which optimizes for coverage, defect counts, or execution throughput, can incentivize over-testing and give an inflated sense of progress without proportional gains in actionable quality. This keynote motivates a shift from tool-centric automation towards on-demand, human-centric testing, whose goal is to explore software using strategies, constraints, and intentions closer to those of end users.
The keynote presents a line of studies on human-like testing with large language model (LLM)-driven testing agents, especially for GUI testing. First, we present LLM-powered automated mobile GUI testing that frames the interaction between the LLM and a mobile app as a functionality-aware exploration process, demonstrating strong activity coverage and real-world bug discovery at scale. Second, to emulate how human testers accumulate experience, we introduce dynamic memory for GUI testing agents, comprising interaction-level episodic memory, function-level reflective memory, and app-level strategic memory, invoked on demand. This “experience layer” can be integrated as a plugin to improve both coverage and bug yield across different GUI testing tools and settings. Third, we extend human-like testing beyond single-user interactions to multi-user interactive features, which are common in real apps yet difficult to test with record-and-replay scripts because of device-independence and action-coordination requirements.
Finally, we introduce how to inject personas into testing agents to mimic different human testers, enabling a form of automated crowd testing. In addition, I will outline key challenges and research directions in this area, including scalability, efficiency, perception, missing oracles, and testing thoroughness.
Program
- Scientific Program, Friday, 09:00 – 10:00 (see Program)
Bio
Chunyang Chen is a full professor in the School of Computation, Information and Technology at the Technical University of Munich, Germany. His main research interest lies in automated software engineering, especially data-driven mobile app development. He is also interested in Human-Computer Interaction and software security. He has published many research papers in top venues such as ICSE, FSE, ASE, CHI, and CSCW, with extensive industry collaboration, including with Google, Microsoft, and Meta. His research has won awards including the ACM SIGSOFT Early Career Researcher Award, a Facebook Research Award, four ACM SIGSOFT Distinguished Paper Awards (ICSE'23/21/20, ASE'18), and multiple best paper/demo awards.