{"id":971,"date":"2021-05-03T12:18:01","date_gmt":"2021-05-03T12:18:01","guid":{"rendered":"https:\/\/panda.dei.polimi.it\/?page_id=971"},"modified":"2021-06-14T13:23:43","modified_gmt":"2021-06-14T13:23:43","slug":"ics-2021-tutorial","status":"publish","type":"page","link":"https:\/\/panda.deib.polimi.it\/?page_id=971","title":{"rendered":"ICS 2021 tutorial"},"content":{"rendered":"\r\n<h2 class=\"wp-block-heading\">Bambu: High-level synthesis for parallel programming<\/h2>\r\n\r\n\r\n\r\n<p><a href=\"https:\/\/ics21.github.io\/\">ICS21<\/a> &#8211; International Conference on Supercomputing<br \/>June 14 &#8211; 17, 2021. Virtual event<\/p>\r\n\r\n\r\n\r\n<h4 class=\"wp-block-heading\">Abstract<\/h4>\r\n\r\n<p>Applications operating on very large datasets present unique behaviors, such as fine-grained, unpredictable memory accesses, and highly unbalanced task-level parallelism, that make existing high-performance general-purpose processors or accelerators (e.g., GPUs) suboptimal. To address these issues, research and industry are developing a variety of custom accelerator designs for this application area, including solutions based on reconfigurable devices (Field Programmable Gate Arrays). These new approaches often employ High-Level Synthesis (HLS) to accelerate the development of the accelerators. This tutorial will discuss the impact of FPGAs on High-Performance Computing, focusing on applications in the areas of data analytics and machine learning. The tutorial will dive into approaches for the High-Level Synthesis (HLS) of parallel applications, highlighting key methodologies, trends, advantages, benefits, but also gaps that still need to be closed. The tutorial will provide a hands-on experience of Bambu, one of the most advanced HLS tools available. 
Supporting the majority of C constructs, Bambu integrates with many logic synthesis and simulation tools and generates accelerators for FPGAs from a variety of vendors, starting from parallel code annotated with OpenMP. It also optimizes the memory architectures of the generated accelerators.<\/p>\r\n\r\n<h4 class=\"wp-block-heading\">Tutorial topics covered<\/h4>\r\n\r\n<p>The tutorial will first provide an overview of the current state of tools and platforms for FPGA acceleration, discussing, in particular, the relevant applications and workloads for HPC. In the introductory part, we will make the case for the use of FPGAs, in particular for memory-bound and memory-intensive applications from the data analytics and machine learning areas, but also highlight where they can be relevant for more conventional scientific simulation workloads. We will then discuss where Bambu is positioned in the spectrum of the available tools, introducing its unique characteristics and features. Focusing on Bambu, we will present the synthesis approaches and architectural templates adapted to extract and manage task-level parallelism, and the techniques to support complex memory patterns and parallel memory subsystems. The hands-on part of the tutorial will teach the audience how to install the tool and use it with a set of relevant application kernels (in particular, graph kernels). The audience will learn how to generate and optimize accelerators starting from parallel specifications, how to optimize the memory architecture, how to verify the generated accelerators and the designs obtained by integrating third-party IP, and how to configure the flow to target different brands of FPGAs, different boards, and different simulation infrastructures. 
The last part of the tutorial will provide an overview of current trends and future research directions for reconfigurable hardware in High-Performance Computing.<\/p>\r\n<p>Tutorial exercises are available at: <a href=\"https:\/\/github.com\/ferrandi\/PandA-bambu\/tree\/tutorial_2021\/documentation\/tutorial_ics_2021\">https:\/\/github.com\/ferrandi\/PandA-bambu\/tree\/tutorial_2021\/documentation\/tutorial_ics_2021<\/a><\/p>\r\n<p>Tutorial slides:<\/p>\r\n<ul>\r\n<li><a href=\"http:\/\/panda.dei.polimi.it\/wp-content\/uploads\/Introduction.pdf\">Introduction<\/a><\/li>\r\n<li><a href=\"http:\/\/panda.dei.polimi.it\/wp-content\/uploads\/Target-Customization.pdf\">Target Customization<\/a><\/li>\r\n<li><a href=\"http:\/\/panda.dei.polimi.it\/wp-content\/uploads\/Optimizations.pdf\">Optimizations<\/a><\/li>\r\n<li><a href=\"http:\/\/panda.dei.polimi.it\/wp-content\/uploads\/Simd.pdf\">SIMD<\/a><\/li>\r\n<li><a href=\"http:\/\/panda.dei.polimi.it\/wp-content\/uploads\/context-switch.pdf\">Context switching<\/a><\/li>\r\n<\/ul>\r\n\r\n<h4 class=\"wp-block-heading\">Organizers and Short Bios<\/h4>\r\n\r\n\r\n\r\n<p><em><strong>Fabrizio Ferrandi, Associate Professor, Politecnico di Milano, Italy<\/strong><\/em><br \/>Fabrizio Ferrandi (Member, IEEE) received the Laurea (cum laude) degree in electronic engineering and the Ph.D. degree in information and automation engineering (computer engineering) from the Politecnico di Milano, Milan, Italy, in 1992 and 1997, respectively. He was an assistant professor with the Politecnico di Milano until 2002. Currently, he is an associate professor with the Dipartimento di Elettronica, Informazione e Bioingegneria of the Politecnico di Milano. His research interests include synthesis, verification, simulation, and testing of digital circuits and systems. He has been a member of the IEEE Computer Society since 1995 and is also a member of the Test Technology Technical Committee and the European Design and Automation Association. 
\r\n\r\n<\/p>\r\n<p><em><strong>Serena Curzel, Ph.D. student, Politecnico di Milano, Italy<\/strong><\/em><br \/>Serena Curzel received the B.S. degree in Electronics and Telecommunication Engineering from Universit\u00e0 degli studi di Trento, Italy, in 2016 and the M.S. degree in Electronics Engineering from Politecnico di Milano, Italy, in 2019, where she is currently pursuing the Ph.D. degree in Information Technology. Her main research interests are FPGA acceleration of domain-specific applications (including Deep Neural Networks) and High-Level Synthesis. Since 2019, she has also been in charge of software development for the HERMES-SP CubeSat project. \r\n\r\n<\/p>\r\n<p><em><strong>Michele Fiorito, research assistant, Politecnico di Milano, Italy<\/strong><\/em><br \/>Michele Fiorito received the M.S. degree in Computer Science Engineering from Politecnico di Milano, Italy, in 2020, where he is currently working as a research assistant to support software development for the HERMES-SP CubeSat project. His main research interests are the design of High-Level Synthesis tools and approximate computing. \r\n\r\n<\/p>\r\n<p><em><strong>Vito Giovanni Castellana, Senior Research Scientist, Pacific Northwest National Laboratory, United States of America<\/strong><\/em><br \/>Dr. Vito Giovanni Castellana received the M.S. degree in Informatic Engineering in 2010 and the Ph.D. degree in Computer Engineering in 2014, both from Politecnico di Milano, Italy. Since February 2014, he has been a research scientist in PNNL\u2019s High-Performance Computing group. He joined PNNL in 2012 as a post-master research associate. His research interests are embedded systems and computer architectures, design automation, and HPC. \r\n\r\n<\/p>\r\n<p><em><strong>Marco Minutoli, Research Scientist, Pacific Northwest National Laboratory, United States of America<\/strong><\/em><br \/>Marco Minutoli received the M.S. degree in Informatic Engineering in 2014 from Politecnico di Milano, Italy. 
Since February 2016, he has been a research scientist in PNNL\u2019s High-Performance Computing group. He joined PNNL in 2014 as a post-master research associate. Since 2016, he has also been a Ph.D. candidate in Computer Science at Washington State University. His research interests are focused on the design and analysis of data structures and graph algorithms for high-performance and big data applications. \r\n\r\n<\/p>\r\n<p><em><strong>Antonino Tumeo, Senior Research Scientist, Pacific Northwest National Laboratory, United States of America<\/strong><\/em><br \/>Dr. Antonino Tumeo received the M.S. degree in Informatic Engineering in 2005 and the Ph.D. degree in Computer Engineering in 2009, both from Politecnico di Milano, Italy. Since February 2011, he has been a research scientist in PNNL\u2019s High-Performance Computing group. He joined PNNL in 2009 as a post-doctoral research associate. Previously, he was a post-doctoral researcher at Politecnico di Milano. His research interests are modeling and simulation of high-performance architectures, hardware-software codesign, FPGA prototyping, and GPGPU computing. <\/p>","protected":false},"excerpt":{"rendered":"<p>Bambu: High-level synthesis for parallel programming ICS21 &#8211; International Conference on Supercomputing June 14 &#8211; 17, 2021. Virtual event Abstract Applications operating on very large datasets present unique behaviors, such as fine-grained, unpredictable memory accesses, and highly unbalanced task-level parallelism, that make existing high-performance general-purpose processors or accelerators (e.g., GPUs) suboptimal. 
To address these issues, research &hellip; <a href=\"https:\/\/panda.deib.polimi.it\/?page_id=971\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">ICS 2021 tutorial<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":649,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-971","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>ICS 2021 tutorial - panda.deib.polimi.it<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/panda.deib.polimi.it\/?page_id=971\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"ICS 2021 tutorial - panda.deib.polimi.it\" \/>\n<meta property=\"og:description\" content=\"Bambu: High-level synthesis for parallel programming ICS21 &#8211; International Conference on SupercomputingJune 14 &#8211; 17, 2021. Virtual event Abstract Applications operating on very large datasets present unique behaviors, such as fine-grained, unpredictable memory accesses, and highly unbalanced task-level parallelism, that make existing high-performance general-purpose processors or accelerators (e.g., GPUs) suboptimal. 
To address these issues, research &hellip; Continue reading ICS 2021 tutorial &rarr;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/panda.deib.polimi.it\/?page_id=971\" \/>\n<meta property=\"og:site_name\" content=\"panda.deib.polimi.it\" \/>\n<meta property=\"article:modified_time\" content=\"2021-06-14T13:23:43+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@PandA4Design\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/panda.deib.polimi.it\\\/?page_id=971\",\"url\":\"https:\\\/\\\/panda.deib.polimi.it\\\/?page_id=971\",\"name\":\"ICS 2021 tutorial - panda.deib.polimi.it\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/panda.deib.polimi.it\\\/#website\"},\"datePublished\":\"2021-05-03T12:18:01+00:00\",\"dateModified\":\"2021-06-14T13:23:43+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/panda.deib.polimi.it\\\/?page_id=971#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/panda.deib.polimi.it\\\/?page_id=971\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/panda.deib.polimi.it\\\/?page_id=971#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/panda.deib.polimi.it\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"TUTORIALS\",\"item\":\"https:\\\/\\\/panda.deib.polimi.it\\\/?page_id=649\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"ICS 2021 tutorial\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/panda.deib.polimi.it\\\/#website\",\"url\":\"https:\\\/\\\/panda.deib.polimi.it\\\/\",\"name\":\"panda.deib.polimi.it\",\"description\":\"A framework for Hardware-Software Co-Design of Embedded 
Systems\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/panda.deib.polimi.it\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"ICS 2021 tutorial - panda.deib.polimi.it","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/panda.deib.polimi.it\/?page_id=971","og_locale":"en_US","og_type":"article","og_title":"ICS 2021 tutorial - panda.deib.polimi.it","og_description":"Bambu: High-level synthesis for parallel programming ICS21 &#8211; International Conference on SupercomputingJune 14 &#8211; 17, 2021. Virtual event Abstract Applications operating on very large datasets present unique behaviors, such as fine-grained, unpredictable memory accesses, and highly unbalanced task-level parallelism, that make existing high-performance general-purpose processors or accelerators (e.g., GPUs) suboptimal. To address these issues, research &hellip; Continue reading ICS 2021 tutorial &rarr;","og_url":"https:\/\/panda.deib.polimi.it\/?page_id=971","og_site_name":"panda.deib.polimi.it","article_modified_time":"2021-06-14T13:23:43+00:00","twitter_card":"summary_large_image","twitter_site":"@PandA4Design","twitter_misc":{"Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/panda.deib.polimi.it\/?page_id=971","url":"https:\/\/panda.deib.polimi.it\/?page_id=971","name":"ICS 2021 tutorial - panda.deib.polimi.it","isPartOf":{"@id":"https:\/\/panda.deib.polimi.it\/#website"},"datePublished":"2021-05-03T12:18:01+00:00","dateModified":"2021-06-14T13:23:43+00:00","breadcrumb":{"@id":"https:\/\/panda.deib.polimi.it\/?page_id=971#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/panda.deib.polimi.it\/?page_id=971"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/panda.deib.polimi.it\/?page_id=971#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/panda.deib.polimi.it\/"},{"@type":"ListItem","position":2,"name":"TUTORIALS","item":"https:\/\/panda.deib.polimi.it\/?page_id=649"},{"@type":"ListItem","position":3,"name":"ICS 2021 tutorial"}]},{"@type":"WebSite","@id":"https:\/\/panda.deib.polimi.it\/#website","url":"https:\/\/panda.deib.polimi.it\/","name":"panda.deib.polimi.it","description":"A framework for Hardware-Software Co-Design of Embedded 
Systems","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/panda.deib.polimi.it\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/panda.deib.polimi.it\/index.php?rest_route=\/wp\/v2\/pages\/971","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/panda.deib.polimi.it\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/panda.deib.polimi.it\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/panda.deib.polimi.it\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/panda.deib.polimi.it\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=971"}],"version-history":[{"count":9,"href":"https:\/\/panda.deib.polimi.it\/index.php?rest_route=\/wp\/v2\/pages\/971\/revisions"}],"predecessor-version":[{"id":995,"href":"https:\/\/panda.deib.polimi.it\/index.php?rest_route=\/wp\/v2\/pages\/971\/revisions\/995"}],"up":[{"embeddable":true,"href":"https:\/\/panda.deib.polimi.it\/index.php?rest_route=\/wp\/v2\/pages\/649"}],"wp:attachment":[{"href":"https:\/\/panda.deib.polimi.it\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=971"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}