Cheney, James, Stephen Chong, Nate Foster, Margo Seltzer, and Stijn Vansummeren. 2009. “Provenance: a future history.” Proceedings of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications, 957–964. ACM.
Digital provenance describes the ancestry or history of a digital object. Most existing provenance systems, however, operate at only one level of abstraction: the system call layer, a workflow specification, or the high-level constructs of a particular application. The provenance collectable in each of these layers is different, and all of it can be important. Single-layer systems fail to account for the different levels of abstraction at which users need to reason about their data and processes. These systems cannot integrate data provenance across layers and cannot answer questions that require an integrated view of the provenance. We have designed a provenance collection structure facilitating the integration of provenance across multiple levels of abstraction, including a workflow engine, a web browser, and an initial runtime Python provenance tracking wrapper. We layer these components atop provenance-aware network storage (NFS) that builds upon a Provenance-Aware Storage System (PASS). We discuss the challenges of building systems that integrate provenance across multiple layers of abstraction, present how we augmented systems in each layer to integrate provenance, and present use cases that demonstrate how provenance spanning multiple layers provides functionality not available in existing systems. Our evaluation shows that the overheads imposed by layering provenance systems are reasonable.
Moreau, Luc, Bertram Ludäscher, Ilkay Altintas, Roger S Barga, Shawn Bowers, Steven Callahan, George Chin, et al. 2008. “Special issue: The first provenance challenge.” Concurrency and computation: practice and experience 20 (5). Wiley Online Library: 409–418.