Vito Castellana - SHAD: the Scalable High-performance Algorithms and Data-structures Library for Big Data Analytics


9/14/2017 at 3:30PM


9/14/2017 at 4:45PM


140 DeBartolo


Peter Kogge

Email: kogge@nd.edu
Phone: 574-631-6763
Website: http://www.nd.edu/~kogge/
Office: 326A Cushing


Dr. Kogge received the IEEE Computer Society 2012 Seymour Cray Award. His current research areas include massively parallel processing architectures, advanced VLSI technology and architectures, non-von Neumann models of programming and execution, parallel algorithms and ...

The unprecedented amount of data that emerging data analytics applications must process poses novel challenges to industry and academia. Given the scale and nature of these problems, scalability and high performance are more than desirable features: they draw the line between what is achievable and what is infeasible. SHAD, the Scalable High-performance Algorithms and Data-structures library, aims to address the scalability and performance issues typical of data analytics applications while giving users flexibility and high productivity.

SHAD adopts a modular design that confines low-level details and promotes reuse. SHAD’s core is built on an Abstract Runtime Interface, which enhances portability and identifies the minimal set of features that the framework requires of the underlying system. The core library includes the four most common data structures: Array, Vector, Map and Set. These are designed to hold large amounts of data that can be accessed in massively parallel environments, and they serve as building blocks for SHAD extensions, i.e., higher-level software libraries. Among these, SHAD currently offers a Graph Library extension, which implements two graph data structures: Compressed Sparse Row and Indexed Neighbours Lists.
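For readers unfamiliar with the Compressed Sparse Row layout mentioned above, the following is a minimal single-node sketch in C++. It is not SHAD's API or implementation: the names `CsrGraph` and `build_csr` are hypothetical, and the distributed, parallel aspects that SHAD targets are left out entirely. It only illustrates the general idea of storing all adjacency lists in one contiguous array indexed by per-vertex offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative, single-node CSR graph (hypothetical type, not part of SHAD).
struct CsrGraph {
  // row_offsets[v] is the index in `neighbors` where v's adjacency list starts;
  // row_offsets[v + 1] is where it ends. Size is num_vertices + 1.
  std::vector<std::size_t> row_offsets{0};
  // Concatenation of all adjacency lists, ordered by source vertex.
  std::vector<std::size_t> neighbors;

  std::size_t num_vertices() const { return row_offsets.size() - 1; }

  // Out-degree of vertex v, read directly from two adjacent offsets.
  std::size_t degree(std::size_t v) const {
    assert(v < num_vertices());
    return row_offsets[v + 1] - row_offsets[v];
  }
};

// Builds a CSR graph from a directed edge list (hypothetical helper).
CsrGraph build_csr(
    std::size_t num_vertices,
    const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
  CsrGraph g;
  g.row_offsets.assign(num_vertices + 1, 0);

  // Pass 1: count the out-degree of each source vertex.
  for (const auto& e : edges) {
    assert(e.first < num_vertices && e.second < num_vertices);
    ++g.row_offsets[e.first + 1];
  }

  // Prefix sum turns the per-vertex counts into starting offsets.
  for (std::size_t v = 0; v < num_vertices; ++v) {
    g.row_offsets[v + 1] += g.row_offsets[v];
  }

  // Pass 2: place each destination into its source vertex's slot range.
  g.neighbors.resize(edges.size());
  std::vector<std::size_t> cursor(g.row_offsets.begin(),
                                  g.row_offsets.end() - 1);
  for (const auto& e : edges) {
    g.neighbors[cursor[e.first]++] = e.second;
  }
  return g;
}
```

Calling `build_csr(4, {{0, 1}, {0, 2}, {1, 2}, {3, 0}})` stores the neighbors of vertex `v` in the half-open range `row_offsets[v]` to `row_offsets[v + 1]` of `neighbors`. The layout is compact and cache-friendly for traversal, at the cost of making edge insertion after construction expensive.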

Experimental results show that the approach is promising in terms of both performance and scalability. SHAD’s general-purpose data structures sustain a throughput of millions of operations per second on a distributed system with 320 cores, and the Graph extension is comparable to a custom solution in performance and scalability while requiring much less development effort.

Seminar Speaker:

Vito Castellana

High Performance Computing Group at Pacific Northwest National Laboratory (PNNL)

Vito Giovanni Castellana is a research scientist in the High Performance Computing Group at Pacific Northwest National Laboratory (PNNL), which he joined in 2012.
His research interests include embedded system design and electronic design automation, code transformation, compilation, and optimization, in particular in the domain of data analytics and irregular applications.

In addition to the SHAD library, he has been one of the main designers and developers of GEMS, Graph Engine for Multithreaded Systems, and the GraQL query language.
Since 2010, he has been a major contributor to the Bambu High Level Synthesis tool, investigating new hardware solutions and methodologies for the synthesis of custom hardware accelerators.

Castellana received a PhD in computer science and engineering from Politecnico di Milano, Italy.