
Benchmark to evaluate models and tools


This page contains the guidelines and first elements for the languages and tools evaluation activity (subtasks T7.1 and T7.2 of WP7).

Planning

T7.1 : primary toolchain, schedule for the coming weeks:

  • 08-04-2013 : First version of templates of criteria (O7.1.3 and O7.1.7) Done
  • 15-04-2013 : Meeting in Munich : all approaches to the benchmark were presented
    (slides) Done
  • 03-05-2013 : Final version of templates of criteria (O7.1.3 and O7.1.7) Done
  • 17-05-2013 : For each approach and tool, the list of criteria is filled in by the person in charge of the tool and approach
  • 31-05-2013 : For each approach and tool, verification and completion of the criteria by the assessors, at the latest
  • 14-06-2013 : O7.1.3 and O7.1.7 released for review, with a synthesis of the results for each approach and tool
  • 28-06-2013 : Final version of O7.1.3 and O7.1.7
  • 04-07-2013 : Decision meeting in Paris (IUC)
  • 31-07-2013 : Final version of D7.1 "Report on the final choice(s) for the primary tool chain (means of description, tool and platform)"

T7.2 : secondary tools, schedule for the coming weeks:

Secondary tools : model transformation, code generation, V&V, safety, requirements management, ...

See This page for details.

Issues

  • Missing assessors :

    • 1 for GNATprove
    • 1 for SysML Papyrus
  • Tool platform evaluation : only a few proposals received so far

Tool selection process

No single tool will be able to satisfy the needs of the project. Therefore, the result of the tool selection process will be a "suite of tools" which will subsequently be integrated by T7.3. We will get there as follows:

  1. The evaluations of the tools by their owners and by their assessors will be completed. The resulting report is the foundation for the selection process.
  2. Evaluations will be categorized with respect to where the tools can be used (e.g. modeling, validation, code generation).
  3. Tools are eliminated if they do not meet certain firm requirements (e.g. not open source, out of scope, ...). This will result in a short list for each category (see the sketch after this list).
  4. In a second round, the chosen candidates are discussed for all development phases and activities: this step may take more time and require further discussion.
  5. The result of these discussions will be a small number (three or fewer) of "suites of tools". We do not anticipate a "clear winner" at this point. We hope that this approach will avoid any deadlocks.
  6. The decision on the final "suite of tools" will be postponed by up to 6 months. While this may not feel satisfying, it prevents giving an advantage to the business interests of one partner without having seen any concrete results, and it provides us with a plan "B", should we encounter unexpected problems with one of the chosen approaches.
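The categorization and elimination in steps 2 and 3 amount to a simple filter over the evaluation records. Below is a minimal, purely illustrative sketch, assuming hypothetical record fields, category names, and tool names; none of these identifiers come from the evaluation reports.

```python
from dataclasses import dataclass

@dataclass
class Evaluation:
    tool: str
    categories: set   # activities the tool covers, e.g. {"modeling", "validation"}
    open_source: bool  # example of a "firm requirement" flag
    in_scope: bool     # another firm requirement flag

def shortlist(evaluations, category):
    """Tools in one category that pass the firm requirements (steps 2-3)."""
    return [
        e.tool
        for e in evaluations
        if category in e.categories and e.open_source and e.in_scope
    ]

# Illustrative usage with made-up records:
evaluations = [
    Evaluation("Tool A", {"modeling"}, open_source=True, in_scope=True),
    Evaluation("Tool B", {"modeling", "validation"}, open_source=False, in_scope=True),
]
print(shortlist(evaluations, "modeling"))  # -> ['Tool A']
```

The actual shortlisting will of course be done by the evaluators on the basis of the O7.1.3 and O7.1.7 reports; the sketch only illustrates the intended two-step filtering.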

Languages and tools evaluation

Because languages and tools are closely linked, the evaluation of languages and tools is carried out within the same benchmark.

The benchmark process is composed of two phases :

  • Initial benchmark (see Initial proposed benchmarks), during which each participant can propose a methodology and tools and develop a case study.
  • Assessment benchmark (see Second round benchmarks for assessment), during which independent participants are invited to assess the initial proposals. Assessors are invited :
    • to analyse the proposed means and/or tools
    • to complete the pages on tools and criteria
    • to propose other models

High priority items

A list of high priority items has been defined in WP2. The paragraph numbers refer to UNISIG SUBSET-026 v3.3.0.

A list of open questions and comments on these examples is available here.

Initial proposed benchmarks

This benchmark contains the initial proposals of languages and tools.

| Contact | Formalism / Tool | Subset used | Started? | Notes |
|---|---|---|---|---|
| Uwe Steinke (Siemens) | SCADE 6.3.1 | SUBSET-026-3, issue 3.3.0, "3.5 Management of Radio Communication" | OK | Results presented at WP2 meeting, Nov. 2012 |
| Stan Pinte (ERTMS Solutions) | ERTMSFormalSpecs Version 0.9 | Subset-026 v3.2.0 | OK | 44% of Subset-026 formalized as of June 2013. |
| Cyril Cornu (All4Tec) | CORE Workstation 5.1 | SUBSET-026, issue 3.3.0 | Stopped | Subset-026 chapter 5.9 done; complete chapter 5 to be modelled by the beginning of March. |
| Marielle Petit-Doche and Matthias Güdemann (Systerel) | Event-B / Rodin 2.7 with ProR, ProB or AnimB and toolkit of provers (AtelierB, SMT, ...) | Subset-026 v3.3.0 | OK | Will combine with ProR for requirements traceability. |
| Marielle Petit-Doche and Matthias Güdemann (Systerel) | ClassicalB / AtelierB 4.0.2 | Subset-026 v3.3.0 | OK | |
| David Mentré (MERCE) | Why3 0.80 | Subset-026 v3.3.0 | Stopped | Model dropped, reason to be provided |
| David Mentré (MERCE) | GNATprove | Subset-026 v3.3.0 | OK | Trying various small parts of Subset-026 |
| Cecile Braunstein (Uni. Bremen) | SysML/EA, RT-Tester | SUBSET-026-3, issue 3.0.0, "3.5 Management of Radio Communication" | OK | |
| Johannes Feuser (Uni. Bremen) | GOPRR (GOPPRR) meta meta model / MetaEdit+ | Subset-026 (issue 2.3.0), Levels 0 and 1, ETCS modes NP/SB/SF/IS/TR/PT/UN/SR/FS | OK | Not all functionality modelled. To be published in a Ph.D. thesis approximately February 2013 and in J. Feuser and J. Peleska, "Dependability in Open Proof Software with Hardware Virtualization – The Railway Control Systems Perspective", 2012, under revision for a special issue of Science of Computer Programming. To be maintained for the benchmark? |
| Alexander Stante (Fraunhofer) | SysML / Papyrus | SUBSET-026-3, 3.5 Management of Radio Communication | OK | |
| Stefan Rieger (TWT) / Frank Golatowski (Uni Rostock) | SystemC | | OK | |
| Stefan Rieger (TWT) | UPPAAL | | Stand-by | Put on the secondary tools benchmark |
| Jan Welte (TU-BS) | Petri Nets / Design/CPN | System Requirements Specification, Version 4.00, status from 20.12.1996 | OK | This work was done for DB in 1997. It includes a full CPN model of the SRS at that time. |

Second round benchmarks for assessment

This benchmark lists the volunteers who will assess the first proposals. The volunteers shall be independent of the company of the initial contact of the proposal and of the company or academic institution which provides the language or tool. We propose at least two volunteers for each language and tool.

| Formalism / Tool | Assessor 1 | Subset used A1 | Assessor 2 | Subset used A2 | Started? | Notes |
|---|---|---|---|---|---|---|
| SCADE 6.3.1 | David Mentré (MERCE) | | Cecile Braunstein / Uni. Bremen | | Yes | |
| ERTMSFormalSpecs | Renaud De Landtsheer / ALSTOM BE | | Marielle Petit-Doche / Systerel | | Yes | |
| CORE Workstation 5.1 | Marielle Petit-Doche / Systerel | | | | Removed | |
| Event-B / Rodin 2.7 with ProR, ProB or AnimB and toolkit of provers (AtelierB, SMT, ...) | David Mentré (MERCE) | | ERTMS Solutions | | Yes | |
| ClassicalB / AtelierB 4.0.2 | Peter Mahlmann (DB) | | Jan Welte (TU-BS) | | Yes | |
| GNATprove | Matthias Güdemann / Systerel | | | | | Assessor missing |
| SysML/EA, RT-Tester | Uwe Steinke / Siemens | | Roberto Kretschmer / TWT | | Yes | I (David Mentré) would also be interested in evaluating this approach |
| GOPRR (GOPPRR) meta meta model / MetaEdit+ | Alexandre Ginisty (All4Tec) | | Matthias Güdemann / Systerel | | | |
| SysML / Papyrus | Renaud De Landtsheer / ALSTOM BE | | (Marielle Petit-Doche / Systerel if no volunteer) | | Yes | |
| SystemC | Cecile Braunstein (Uni. Bremen) | | Silvano DalZilio / LAAS | | Yes | |
| UPPAAL | Cecile Braunstein (Uni. Bremen) | | Silvano DalZilio / LAAS | | | Secondary toolchain benchmark |
| Petri Nets / Design/CPN | Marc Antoni (SNCF) | | Cyril Cornu / All4Tec | | Yes | |

"Undecided" assessor:

  • Alexander Stante (Fraunhofer)

Proposed guidelines to describe languages and tools

Please see the [Tool Description Template](Tool Description Template).

Description page for each tool :