Recce is a data change management toolkit that helps data teams evaluate, validate, and share the impact of data modifications before they are merged into production. It integrates into existing workflows to support collaboration, shorten review times, and improve confidence in the accuracy of data deployments.
Key Features and Functionality:
- Column-Level Impact Analysis: Identify downstream models and columns affected by changes, providing clear visibility into potential impacts.
- One-Click Data Validation Tests: Compare production and development data using value, schema, profile, and histogram differences to detect discrepancies efficiently.
- Custom Query Comparisons: Execute SQL queries across environments to pinpoint specific differences and validate changes.
- Automated CI Validation: Integrate with continuous integration pipelines to run data validation tests automatically on every pull request.
- PR Comment Automation: Receive validation summaries directly within pull request threads, streamlining the review process.
- LLM-Powered Validation Insights: Use AI to analyze changes and suggest relevant data tests, helping reviewers focus validation where it matters.
- Preset Validation Checks: Establish standardized checks to run across all pull requests, ensuring consistency and reliability.
- PR Blocking Until Validation Passes: Prevent the merging of pull requests until all data validation checks are successfully completed, safeguarding data integrity.
- Shared Team Checklists: Standardize validation workflows across teams, promoting collaboration and accountability.
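To make the schema and value comparisons listed above more concrete, here is a minimal Python sketch of what such checks might compute. It is not Recce's implementation or API. The function names (schema_diff, value_diff, is_mergeable), the in-memory dictionaries standing in for production and development tables, and the merge policy are all hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List

Schema = Dict[str, str]   # column name -> data type
Row = Dict[str, Any]      # column name -> value


def schema_diff(prod: Schema, dev: Schema) -> Dict[str, list]:
    """Report columns added, removed, or retyped between two schemas."""
    added = sorted(col for col in dev if col not in prod)
    removed = sorted(col for col in prod if col not in dev)
    changed = sorted(
        (col, prod[col], dev[col])
        for col in prod
        if col in dev and prod[col] != dev[col]
    )
    return {"added": added, "removed": removed, "changed": changed}


def value_diff(prod_rows: List[Row], dev_rows: List[Row], key: str) -> Dict[str, int]:
    """Count mismatched values per column for rows whose key is in both tables."""
    prod_by_key = {row[key]: row for row in prod_rows}
    dev_by_key = {row[key]: row for row in dev_rows}
    mismatches: Dict[str, int] = {}
    for k in prod_by_key.keys() & dev_by_key.keys():
        prod_row, dev_row = prod_by_key[k], dev_by_key[k]
        # Only compare columns that exist in both environments.
        for col in prod_row.keys() & dev_row.keys():
            if prod_row[col] != dev_row[col]:
                mismatches[col] = mismatches.get(col, 0) + 1
    return mismatches


def is_mergeable(schema_changes: Dict[str, list], mismatches: Dict[str, int]) -> bool:
    """Hypothetical gate: allow added columns, block anything else that differs."""
    return (
        not schema_changes["removed"]
        and not schema_changes["changed"]
        and not mismatches
    )


if __name__ == "__main__":
    prod_schema = {"id": "INTEGER", "amount": "FLOAT", "status": "TEXT"}
    dev_schema = {"id": "INTEGER", "amount": "DECIMAL", "region": "TEXT"}
    prod_rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 20.0}]
    dev_rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 25.0}]

    changes = schema_diff(prod_schema, dev_schema)
    diffs = value_diff(prod_rows, dev_rows, key="id")
    print(changes)
    print(diffs)
    print("mergeable:", is_mergeable(changes, diffs))
```

In a CI setting, a gate like is_mergeable would typically decide the exit status of a validation job, which is one way a pull request could be blocked until checks pass. What counts as a blocking difference is a team decision; the policy above is only one possible choice.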
Primary Value and Problem Solved:
Recce addresses the challenges data teams face in managing and validating data changes by providing tools that offer visibility, verifiability, and velocity. By detecting changes, verifying their impact, and automating best practices, Recce transforms the data deployment process from a potential bottleneck into a competitive advantage. Teams can catch errors early, automate validation steps, and reduce manual review time, ultimately shipping accurate data faster and with greater confidence.