PureML is an open-source version control system for machine learning. It is a Python library that uploads metadata to S3 and provides a UI for viewing data lineage. It aims to address the reproducibility crisis in machine learning by giving machine learning objects an intuitive versioning system. It is also built to handle large files, store key/value metadata, and record information automatically from inside a training script. To contribute, see the Contributing Guide; for updates, follow PureML on social media.

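To make the versioning idea concrete, here is a minimal sketch of content-based versioning with key/value metadata, using only the Python standard library. It is not PureML's API: the helper names `file_digest` and `register_version`, the local JSON registry, and the `v1`, `v2` labels are all assumptions made for illustration. PureML itself stores metadata in S3, as described above.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so a large file is never fully in memory."""
    sha = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def register_version(registry_path, name, artifact_path, metadata):
    """Record a version of `name` and return its label.

    Hypothetical helper: the registry is a plain JSON file on local disk,
    not PureML's storage format.
    """
    registry_path = Path(registry_path)
    registry = json.loads(registry_path.read_text()) if registry_path.exists() else {}
    versions = registry.setdefault(name, [])
    digest = file_digest(artifact_path)

    # Unchanged content maps to the label it already has.
    for record in versions:
        if record["sha256"] == digest:
            return record["version"]

    label = f"v{len(versions) + 1}"
    versions.append({"version": label, "sha256": digest, "metadata": dict(metadata)})
    registry_path.write_text(json.dumps(registry, indent=2))
    return label
```

Calling `register_version("registry.json", "classifier", "model.bin", {"lr": 0.01})` at the end of a training script would record the artifact's hash together with its hyperparameters. In this sketch, a new label is issued only when the file contents change.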