LLAMA: Efficient graph analytics using Large Multiversioned Arrays

Citation:

Macko, Peter, Virendra Marathe, Daniel Margo, and Margo Seltzer. 2015. “LLAMA: Efficient graph analytics using Large Multiversioned Arrays.” 2015 IEEE 31st International Conference on Data Engineering (ICDE). Seoul, South Korea: IEEE. Copy at http://www.tinyurl.com/y3w9pcow

Abstract:

We present LLAMA, a graph storage and analysis system that supports mutability and out-of-memory execution. LLAMA performs comparably to immutable main-memory analysis systems for graphs that fit in memory and significantly outperforms existing out-of-memory analysis systems for graphs that exceed main memory. LLAMA bases its implementation on the compressed sparse row (CSR) representation, which is a read-only representation commonly used for graph analytics. We augment this representation to support mutability and persistence using a novel implementation of multi-versioned array snapshots, making it ideal for applications that receive a steady stream of new data, but need to perform whole-graph analysis on consistent views of the data. We compare LLAMA to state-of-the-art systems on representative graph analysis workloads, showing that LLAMA scales well both out-of-memory and across parallel cores. Our evaluation shows that LLAMA’s mutability introduces modest overheads of 3-18% relative to immutable CSR for in-memory execution and that it outperforms state-of-the-art out-of-memory systems in most cases, with a best case improvement of 5x on breadth-first-search.