Report (Research Report) Year: 2022

Decentralized in-order execution of a sequential task-based code for shared-memory architectures

Exécution ordonnée décentralisée d'un code séquentiel à base de tâches sur une architecture à mémoire partagée

Abstract

Charly Castes, Emmanuel Agullo, Olivier Aumage, Emmanuelle Saillard
Project-Teams HiePACS and STORM
Research Report n° 9450, January 2022, 30 pages

Abstract: The hardware complexity of modern machines makes the design of adequate programming models crucial for jointly ensuring performance, portability, and productivity in high-performance computing (HPC). Sequential task-based programming models paired with advanced runtime systems allow the programmer to write a sequential algorithm independently of the hardware architecture in a productive and portable manner, and let a third-party software layer, the runtime system, deal with the burden of scheduling a correct, parallel execution of that algorithm to ensure performance. Many HPC algorithms have been successfully implemented following this paradigm, which testifies to its effectiveness. Developing algorithms that specifically require fine-grained tasks along this model is still considered prohibitive, however, due to per-task management overhead [1], forcing the programmer to resort to a less abstract, and hence more complex, “task+X” model. We thus investigate the possibility of offering a tailored execution model, trading dynamic mapping for efficiency by using a decentralized, conservative in-order execution of the task flow, while preserving the benefits of relying on the sequential task-based programming model. We propose a formal specification of the execution model as well as a prototype implementation, which we assess on a shared-memory multicore architecture with several synthetic workloads. The results show that, provided the programmer supplies a proper task mapping, the pressure on the runtime system is significantly reduced and the execution of fine-grained task flows is much more efficient.
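To make the idea of a decentralized, in-order execution of a task flow more concrete, here is a minimal Python sketch. It is not taken from the report and does not reproduce its formal specification or its prototype. The names `run_in_order` and `demo`, the per-datum counters, and the tuple layout `(owner, func, reads, writes)` are all assumptions chosen for illustration. Every worker walks the same sequential task flow in program order, runs only the tasks statically mapped to it, and waits on per-datum counters before touching data. A single condition variable stands in for the atomic operations a real runtime would likely use.

```python
import threading


def run_in_order(task_flow, num_workers):
    """Run task_flow, a list of (owner, func, reads, writes) tuples.

    Hypothetical sketch: every worker walks the whole flow in program
    order but only executes tasks whose owner equals its own id.
    Every owner must be in range(num_workers), or the run never ends.
    """
    cond = threading.Condition()
    writes_done = {}  # shared: number of completed writes per datum
    reads_done = {}   # shared: number of completed reads per datum

    def worker(me):
        writes_seen = {}  # private: writes met so far in the flow
        reads_seen = {}   # private: reads met so far in the flow

        for owner, func, reads, writes in task_flow:
            if owner == me:
                def ready():
                    # A read needs every earlier write to be complete.
                    for d in reads:
                        if writes_done.get(d, 0) != writes_seen.get(d, 0):
                            return False
                    # A write also needs every earlier read to be complete.
                    for d in writes:
                        if writes_done.get(d, 0) != writes_seen.get(d, 0):
                            return False
                        if reads_done.get(d, 0) != reads_seen.get(d, 0):
                            return False
                    return True

                with cond:
                    cond.wait_for(ready)
                func()  # run the task body outside the lock
                with cond:
                    for d in reads:
                        reads_done[d] = reads_done.get(d, 0) + 1
                    for d in writes:
                        writes_done[d] = writes_done.get(d, 0) + 1
                    cond.notify_all()
            # Owner or not, each worker records the accesses locally.
            for d in reads:
                reads_seen[d] = reads_seen.get(d, 0) + 1
            for d in writes:
                writes_seen[d] = writes_seen.get(d, 0) + 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def demo():
    """Four tasks mapped onto two workers by the programmer."""
    store = {}
    flow = [
        (0, lambda: store.update(x=1), [], ["x"]),
        (1, lambda: store.update(y=store["x"] * 2), ["x"], ["y"]),
        (0, lambda: store.update(z=store["x"] + 1), ["x"], ["z"]),
        (1, lambda: store.update(x=store["y"] + store["z"]), ["y", "z"], ["x"]),
    ]
    run_in_order(flow, num_workers=2)
    return store
```

Calling `demo()` returns `{"x": 4, "y": 2, "z": 2}` regardless of thread timing, because the counters enforce the same dependencies a sequential run would. In this sketch there is no central scheduler and no task queue: the mapping is fixed by the `owner` field, which mirrors the abstract's point that dynamic mapping is traded for lower per-task overhead.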
Main file
RR-9450.pdf (962.73 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03547334, version 1 (28-01-2022)

Identifiers

  • HAL Id: hal-03547334, version 1

Cite

Charly Castes, Emmanuel Agullo, Olivier Aumage, Emmanuelle Saillard. Decentralized in-order execution of a sequential task-based code for shared-memory architectures. [Research Report] RR-9450, Inria Bordeaux - Sud Ouest. 2022, pp.30. ⟨hal-03547334⟩