To aid the process of creating reproducible experiments, we have
developed ReproZip, a tool that automatically captures the provenance
of existing experiments and packs all the components necessary to
reproduce the results in different environments. The system will be
demonstrated at SIGMOD 2013 in New York City and at the Beyond the PDF 2
Conference in Amsterdam. The first public release of the tool is
planned for Spring 2013.
If you want to learn more about reproducibility in science, check
out http://www.reproduciblescience.org.
Check out the September issue of the IEEE Data Engineering Bulletin: Data
Management beyond Database Systems. It contains a collection
of articles which highlight the importance of cross-domain
synergies and the need to go beyond traditional database systems and
to make database technology more accessible---both easier to use for
end-users and easier to integrate with other systems.
If you'd like to learn how workflows and provenance can be used
to automate the creation of customized applications/mashups, check out
our paper in IEEE Vis 2009: VisMashup: Streamlining the Creation of
Custom Visualization Applications, by Emanuele Santos, Lauro Lins, James Ahrens, Juliana Freire and Claudio Silva.
Reproducibility in Science. We are building infrastructure to simplify the creation, review and sharing of computational experiments.
VisTrails. VisTrails
is an open-source data analysis and visualization system.
It captures detailed provenance for the data exploration process
and uses this information to streamline the creation, execution,
and sharing of computational processes (aka workflows, dataflows,
pipelines), which are widely used to construct visualizations and
to perform data analysis and mining.
Provenance Analytics
BirdVis: Visualizing Geo-Temporal
Data. BirdVis is an interactive visualization system that
supports the analysis of spatio-temporal bird distribution models.
Finding and Querying Structured Data on the Web. In this
project, we have addressed the problem of large-scale information
integration to enable on-the-fly queries over structured Web data.
Uncovering Hidden Web data. Our goal in this project is
to develop a scalable infrastructure that automates, to a large
extent, the process of discovering, organizing, and extracting
data from hidden-Web sources. We have built DeepPeep, a new search engine
specialized in Web forms. For more details about this project, see
http://fleixeiras.cs.utah.edu/webdb.
NSF CAREER. This award has supported my group in the development of
new algorithms and infrastructure for efficiently managing workflows and
their provenance and for enabling casual users (who do not
necessarily have programming expertise) to perform exploratory
tasks and solve problems through workflows. In the context of this project, we are
collaborating closely with scientists in different domains and are actively
contributing to the open-source VisTrails system.
Towards an Infrastructure to Create Reproducible Papers. Beyond the PDF Workshop. San Diego. January 19th, 2011.
Publishing Reproducible Results with VisTrails. Juliana Freire and Claudio Silva. SIAM Mini-Symposium on Reproducible Research. Las Vegas. March 4, 2011.
Provenance-Rich Science. Juliana Freire. FORTH. Crete, Greece. June 22, 2011.
A Provenance-Based Infrastructure for Creating Reproducible Papers. AMP 2011 Workshop on Reproducible Research. Vancouver, Canada. July 14, 2011.
Provenance-Rich Science. Keynote at the DB/IR Day, AT&T Shannon Labs, Florham Park, NJ, October 22, 2010.
Provenance Management for Data Exploration. Keynote at the International Conference on Data Integration in the Life Sciences (DILS), Sweden, August 2010.
Infrastructure for Understanding Human Knowledge. ICiS Workshop on Integrating, Representing, and Reasoning over Human Knowledge: A Computational Grand Challenge for the 21st Century. Snowbird, August 8, 2010.
Supporting Provenance-Rich Science with VisTrails. CScADS Scientific Data and Analytics for Petascale Computing Workshop. Snowbird, July 26, 2010.
The WebDB Group: Research Overview. Federal University of Amazonas, Manaus, Brazil, June 29th, 2010.