Benchmarking & Sample Datasets – WG5

Mission and Workplan

WG5 aims to identify common BioImage Analysis problems and benchmark existing solutions. WG5 will create a framework for comparing different aspects of existing solutions by running benchmarks on datasets. To guide the Analysts’ choice, the solutions referenced in WG4 will be tagged with the benchmark results. Users will be able to use the same framework to run the different Image Analysis solutions on a subset of their own image data and to explore the impact of different parameter values. This tool has the potential to boost the development of new and better solutions, and to trigger exchanges between open source and commercial solution providers. It can also serve as a reference that helps to reproduce the Image Analysis used in scientific publications. WG5 will define standards for the interoperability of Image Analysis software so that the benchmark tests can be run in different software packages. A standard way to specify the expected results of tests (ground truth) must be established. WG5 will also define standards for benchmarking different aspects of a solution, such as correctness, robustness, efficiency, flexibility and usability.
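As an illustration of what benchmarking correctness against ground truth could look like, the sketch below scores a binary segmentation result against a reference mask using the Jaccard index (intersection over union). This is only one possible metric and is not a standard defined by WG5; the function name `jaccard_index` and the nested-list mask representation are assumptions made for this example.

```python
def jaccard_index(predicted, ground_truth):
    """Return intersection over union of two binary masks (nested lists).

    Hypothetical helper for illustration only; not part of any WG5 standard.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for row_p, row_g in zip(predicted, ground_truth):
        if len(row_p) != len(row_g):
            raise ValueError("mask rows must have the same length")
        for p, g in zip(row_p, row_g):
            p, g = bool(p), bool(g)
            intersection += p and g  # pixel is foreground in both masks
            union += p or g          # pixel is foreground in at least one mask
    # Two empty masks agree perfectly, so score them as 1.0.
    return 1.0 if union == 0 else intersection / union


# Toy 3x3 example: the masks share 3 foreground pixels out of 5 in total.
ground_truth = [[0, 1, 1],
                [0, 1, 1],
                [0, 0, 0]]
predicted = [[0, 1, 1],
             [0, 1, 0],
             [0, 1, 0]]
score = jaccard_index(predicted, ground_truth)
print(f"Jaccard index: {score:.2f}")
```

A score of 1.0 means the result matches the ground truth exactly, and 0.0 means no overlap. A benchmarking framework would likely combine such a correctness score with separate measures for the other aspects listed above, such as robustness and efficiency.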

Another important step is the identification of BIAS problem classes and their association with solutions and compatible annotated images. The results from existing software competitions, for instance the ones organized within, can be used as input. To obtain these results, meetings of the workgroup members will be organized.
A web-based platform that runs benchmark tests and reports the results will be created, building on existing solutions and similar initiatives. Cloud computing and storage solutions, adapted to the issues raised by big data formats, will be reviewed for implementing the benchmarking platform, in collaboration with companies. This tool will be interfaced with the webtool from WG4 and populated with sample data gathered by WG5. The creation, implementation and maintenance of the software infrastructure will be supported during the WG5 meetings, in close coordination with WG4. Comparisons of Image Analysis solutions will be published as scientific review articles.
WG5 also aims to gather supporting sample datasets. Real data will be sourced from existing sample data collections, as well as from new datasets submitted in response to open calls for data. Since one of the major applications of this collection will be benchmarking, synthetic datasets will be included, and new synthetic datasets will be generated as necessary. Compatible licensing models for data sharing will be devised. Annotation of all data will be carried out collaboratively.
The final sample dataset collection will be made publicly available as part of the benchmarking webtool and as a standalone WG5 sample dataset repository. This resource will provide public access to a set of standard, annotated datasets reflecting common BIAS tasks. These datasets will be critical for fair benchmarking of solutions and will bring additional benefits, including use for illustrative purposes, e.g. to support teaching and knowledge dissemination (WG1 and WG6 in particular).