Data Validation in Snowflake
Why and How?
Benefits of using Datagaps to test data movement into Snowflake
We recently sat down with one of our clients that uses our DataFlow product to test its data migration from on-premises SQL Server to Snowflake running on AWS.
Implementation
They started the initial migration by performing bulk loads from 400 SQL Server tables into Snowflake with minimal transformations. This was stage 0, where they could compare source and target data for over 500 million rows per table. Using the Data Migration wizard, the client generated comparison tests for all 400 tables in just a few hours. Even though there were few changes at this stage, they still encountered errors, which our DataFlow product surfaced.
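DataFlow generates these comparison tests automatically, but the underlying idea can be illustrated with a short sketch. The Python snippet below is a minimal, hedged example, not DataFlow's actual mechanism: the connection details, table names (dbo.orders / ORDERS), and column names are all hypothetical. It compares row counts and simple column aggregates between the SQL Server source and the Snowflake target; any mismatch flags the table for a deeper, row-level comparison.

```python
# Minimal sketch of bulk source/target reconciliation between SQL Server
# and Snowflake. All connection details, table names, and columns below
# are hypothetical placeholders.
import pyodbc                 # SQL Server driver
import snowflake.connector    # Snowflake Python connector

def profile(cursor, table, numeric_cols):
    # Row count plus SUM/MIN/MAX per numeric column: cheap aggregates
    # that are comparable across engines, unlike engine-specific checksums.
    aggs = ", ".join(f"SUM({c}), MIN({c}), MAX({c})" for c in numeric_cols)
    cursor.execute(f"SELECT COUNT(*), {aggs} FROM {table}")
    return tuple(cursor.fetchone())

# Hypothetical connections to the on-prem source and the Snowflake target.
src = pyodbc.connect("DSN=sqlserver_prod").cursor()
tgt = snowflake.connector.connect(
    account="acme", user="tester", password="***",
    warehouse="TEST_WH", database="MIGRATED", schema="DBO",
).cursor()

cols = ["quantity", "order_total"]
if profile(src, "dbo.orders", cols) != profile(tgt, "ORDERS", cols):
    print("Mismatch in ORDERS: drill down column by column")
```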
Once the new processes were in place, DataFlow was used again to check accuracy between the two systems. Over several iterations, the validations exposed issues in the new ELT process until the transformations were in sync. After a short period of testing, the client was able to cut over to the new system and deprecate the SQL Server environment. DataFlow now continues to validate the incremental data as it is moved into Snowflake, finding issues earlier in the cycle, where they are less costly to fix in both time and lost credibility.
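Incremental validation follows the same pattern on a smaller slice. The hedged sketch below assumes each table carries an updated_at change-tracking column and that the upstream source (shown here as SQL Server, standing in for whatever system feeds the ELT) stays queryable; only rows loaded since the last validated watermark are re-checked each cycle. Again, every name and credential is a hypothetical placeholder.

```python
# Hedged sketch of post-cutover incremental validation: re-check only the
# rows loaded since the last validated watermark. Names are hypothetical.
from datetime import datetime
import pyodbc
import snowflake.connector

src = pyodbc.connect("DSN=upstream_source").cursor()
tgt = snowflake.connector.connect(
    account="acme", user="tester", password="***",
    warehouse="TEST_WH", database="MIGRATED", schema="DBO",
).cursor()

def slice_profile(cursor, table, watermark, marker):
    # Aggregate only the incremental slice; MAX(updated_at) doubles as the
    # next watermark. pyodbc uses '?' placeholders, Snowflake uses '%s'.
    cursor.execute(
        f"SELECT COUNT(*), MAX(updated_at) FROM {table} "
        f"WHERE updated_at > {marker}",
        (watermark,),
    )
    return tuple(cursor.fetchone())

watermark = datetime(2021, 6, 1)  # hypothetical last validated point in time
src_slice = slice_profile(src, "dbo.orders", watermark, "?")
tgt_slice = slice_profile(tgt, "ORDERS", watermark, "%s")
if src_slice == tgt_slice:
    print("Incremental slice matches; advance the watermark")
else:
    print("Drift detected in this cycle's load; fix before it compounds")
```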
How does it make a difference?
- Reduction in testing time: an estimated 50% shorter test cycle (detailed in the Conclusion below).
- Improved testing ROI: the amount of data tested increased from manual sampling of 10,000 records to complete testing of 500 million rows.
Conclusion
In conclusion, the migration project's goals of agility, cost savings, and performance improvement were achieved.
They also realized these benefits months earlier, thanks to DataFlow's contribution to the migration process: an estimated 50% reduction in the test cycle.
One of our clients reports comparing a file against a Snowflake instance with