1st International Workshop on Comparative Empirical Evaluation of Reasoning Systems
Benchmark libraries and competitions are two popular approaches to comparative empirical evaluation of reasoning systems. It has become accepted wisdom that regular comparative evaluation of reasoning systems helps focus research, identify relevant problems, bolster development, and advance the field in general. The number of competitions has been rapidly increasing lately. At the moment, we are aware of about a dozen benchmark collections and two dozen competitions for reasoning systems of different kinds. We feel that it is time to compare notes.
What are the proper empirical approaches and criteria for effective comparative evaluation of reasoning systems? What are the appropriate hardware and software environments? How can the usability of reasoning systems be assessed? How should benchmarks and problem collections be designed, acquired, structured, published, and used?
The issue has gained a new dimension recently, as systems with a strong interactive aspect have entered the arena. Three competitions dedicated to deductive software verification within the last two years show a dire need for comparative empirical evaluation in this area. At the same time, it remains largely unclear how to systematically evaluate the performance and usability of such systems, and how to separate technical aspects from the human factor.
The workshop aims to advance comparative empirical evaluation by bringing together current and future competition organizers and participants, maintainers of benchmark collections, as well as practitioners and the general scientific public interested in the topic.
Furthermore, the workshop intends to reach out to researchers specializing in empirical studies in computer science outside of automated reasoning. We expect such connections to be fruitful but unlikely to come about on their own.