AHPCRC Projects
Project 4-2: Massive Scale Data Analysis on the Flexible Architecture Research Machine (FARM)
Principal Investigators: Kunle Olukotun, Christos Kozyrakis (Stanford University)
Figures: one configuration of FARM; large skewed graph.
Graphics on this page courtesy of Kunle Olukotun and Christos Kozyrakis (Stanford University).
If heterogeneous systems and other non-traditional coprocessors are to gain widespread acceptance for visualization and other data-intensive tasks, collaborative work is needed in architecture, algorithms, and system software tools such as compilers and schedulers. Improving performance and programmability in high performance scientific computing requires insights from experiments that combine innovative computer systems with demanding scientific applications running on large data sets at full speed and at large scale.

The Flexible Architecture Research Machine (FARM) is a flexible, high performance prototyping platform that allows researchers to demonstrate full-system prototypes running full-sized HPC applications, which is necessary for fast technology transfer to the Army or industry. The FARM closely couples commodity processor and field-programmable gate array (FPGA) technologies, making it well suited to experimenting with application-specific accelerators and novel memory system designs. The lessons learned from the FARM will be useful for the design of future large-scale computing systems for scientific applications.

AHPCRC researchers have developed a hybrid software–hardware transactional memory system for the FARM to simplify the task of parallel programming. As part of this effort, Transactional Memory Acceleration using Commodity Cores (TMACC) has been implemented; it is the only hardware implementation to date that handles large transactions. A simulation environment developed for this system has demonstrated that excellent parallel performance can be achieved using a hybrid transactional memory system accelerated by an external, coherently connected FPGA.
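To give a sense of the programming model that transactional memory offers, the following is a minimal, software-only sketch of optimistic transactions in Python. It is not TMACC and does not reflect the FARM implementation: `TVar`, `Transaction`, `Conflict`, and `atomically` are hypothetical names introduced for illustration, and the single global commit lock is an assumption chosen for brevity. Each transaction buffers its writes, records the version of every variable it reads, and commits only if none of those versions has changed.

```python
import threading

class TVar:
    """A transactional variable: a value plus a version counter (hypothetical)."""
    def __init__(self, value):
        self.value = value
        self.version = 0

class Conflict(Exception):
    """Raised when a transaction observes data changed by another commit."""

# Assumption: one global lock serializes reads of (value, version) and commits.
_commit_lock = threading.Lock()

class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version observed at first read
        self.writes = {}  # TVar -> buffered new value

    def read(self, tvar):
        if tvar in self.writes:           # read-your-own-writes
            return self.writes[tvar]
        with _commit_lock:
            value, version = tvar.value, tvar.version
        if self.reads.setdefault(tvar, version) != version:
            raise Conflict()              # variable changed since first read
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value         # buffered until commit

    def commit(self):
        with _commit_lock:
            # Validate the read set, then publish the buffered writes.
            for tvar, seen in self.reads.items():
                if tvar.version != seen:
                    raise Conflict()
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1

def atomically(fn):
    """Run fn(tx) as a transaction, retrying until it commits."""
    while True:
        tx = Transaction()
        try:
            result = fn(tx)
            tx.commit()
            return result
        except Conflict:
            continue

# Usage: four threads increment a shared counter without explicit locking.
counter = TVar(0)

def increment(tx):
    tx.write(counter, tx.read(counter) + 1)

def worker():
    for _ in range(500):
        atomically(increment)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The application code in `increment` contains no locks; conflict detection and retry are handled by the runtime. In a hybrid system, some of that bookkeeping would be offloaded to hardware, but how TMACC divides the work is not described here.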
During the next stage of the project, new architectures optimized for irregular applications, along with programming environments for graph algorithms, will be developed using a hardware–software codesign methodology, the FARM, realistic problem sizes, and innovative ideas for global shared address space architectures.

Graphs are a fundamental abstraction for modeling and analyzing data, and are used extensively in counter-intelligence, business intelligence, data discovery, web mining, simulation data analysis, social network analysis, and bioinformatics. Graphs can represent transportation networks, communication networks, and socio-economic interactions. Graph-based models have also been used in machine learning for machine translation and robotics, and many scientific computing problems can be formulated using graphs.

Because of the explosion in the amount of data and the complexity of the phenomena being modeled, recent graph representations have grown to hundreds of millions or billions of nodes. The size of these graphs, combined with high-complexity graph analysis algorithms, makes large-scale graph analysis a challenging problem, and current architectures and programming environments are not well suited to it. Current SMP architectures are too expensive and do not scale to the number of processors required for analyzing graphs associated with massive data sets. Large-scale graphs are not easy to partition and may require frequent updates to the graph structure during analysis. This makes them a poor fit for current commodity clusters, which provide neither global random memory access nor enough communication bandwidth for the irregular data access of graph algorithms. Furthermore, the separate address spaces of clusters require a message passing programming model that is not well suited to graph algorithms.
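The irregular data access mentioned above can be seen in even a simple traversal. The following is a minimal sketch, assuming a compressed sparse row (CSR) layout and a small made-up skewed graph; `build_csr` and `bfs_levels` are hypothetical helper names, not part of any FARM software. The index into `level` is read from the `neighbors` array at run time, so the memory locations touched depend on the graph data rather than on a predictable stride.

```python
from collections import deque

def build_csr(num_nodes, edges):
    """Build CSR arrays (offsets, neighbors) from a list of directed edges."""
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1             # count out-edges per source node
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]      # prefix sum gives row starts
    neighbors = [0] * len(edges)
    cursor = offsets[:-1].copy()          # next free slot for each source
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors

def bfs_levels(offsets, neighbors, source):
    """Return the BFS depth of every node from source (-1 if unreachable)."""
    level = [-1] * (len(offsets) - 1)
    level[source] = 0
    frontier = deque([source])
    while frontier:
        u = frontier.popleft()
        for k in range(offsets[u], offsets[u + 1]):
            v = neighbors[k]              # data-dependent index
            if level[v] == -1:            # effectively random access into level
                level[v] = level[u] + 1
                frontier.append(v)
    return level

# Made-up skewed graph: node 0 is a hub, node 6 is isolated.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5)]
offsets, neighbors = build_csr(7, edges)
levels = bfs_levels(offsets, neighbors, 0)
print(levels)  # [0, 1, 1, 1, 1, 2, -1]
```

On a cluster with separate address spaces, each access to `level[v]` for a remotely stored `v` would become a message, which is one way to read the claim that message passing fits these algorithms poorly.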
During the coming year, this project will focus on developing new parallel architectures, programming models, and domain-specific languages for massive-scale graph-based data analysis.



