We believe that data should be managed like source code (with packaging, versioning, compilation, linting, and more). Our pilot customers are Fortune 500 companies that find Quilt makes their data discoverable, reproducible, and auditable.
Quilt Data was founded by Kevin Moore and Aneesh Karve. They have been fast friends ever since they met in 2005 as graduate students in Computer Science at UW-Madison.
Move data with one command. Discover packages from the community. Share your packages (or keep them private).
Just finished some heroic data collection? Package it for the benefit of others.
$ pip install quilt
$ quilt install uciml/iris
>>> from quilt.data.uciml import iris
# you've got data
Quilt stores immutable versions for every piece of data. Reproduce analyses from any point in time. Lose something? Roll back and start over.
Organize scattered files
Combine numerous files and folders into simple, reusable packages. Quilt deduplicates repeated files, minimizing network and storage bottlenecks.
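To make the deduplication idea concrete, here is a minimal sketch of one common approach: content-addressed storage, where files are keyed by a hash of their bytes so identical files are stored only once. This is an illustration under our own assumptions, not Quilt's actual implementation; the `ContentStore` class, its methods, and the file paths are hypothetical names invented for this example.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical, not Quilt's API)."""

    def __init__(self):
        self.blobs = {}    # SHA-256 hex digest -> file bytes, stored once
        self.package = {}  # logical path in the package -> digest

    def add(self, path, data):
        # Identical bytes hash to the same digest, so a repeated file
        # adds a new path entry but no new blob.
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.package[path] = digest
        return digest

    def read(self, path):
        # Resolve the logical path to its digest, then fetch the bytes.
        return self.blobs[self.package[path]]


store = ContentStore()
store.add("2017/iris.csv", b"5.1,3.5,1.4,0.2,setosa\n")
store.add("backup/iris.csv", b"5.1,3.5,1.4,0.2,setosa\n")  # same bytes
store.add("notes.txt", b"collected in the field\n")

print(len(store.package), "paths,", len(store.blobs), "stored blobs")
```

In this sketch, three paths resolve to only two stored blobs, so the duplicate file costs no extra storage and would not need to be transferred again.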
Collaborate in teams
Quilt integrates data sources so that everyone is on the same page. Quilt Team Edition offers a high-security, dedicated data registry where colleagues can discover and share data.
Audit every access
Quilt admins can audit every read and every write to the registry.
Import clean data with one line of code. Start working. No more scripting to download, clean, and load data.
Quilt invisibly converts your data to Apache Parquet, a columnar storage format, for faster I/O and faster analysis with Presto and Hadoop tools.
from quilt.data.uciml import iris
df = iris.tables.iris() # done - you've got data