Data Validation in Snowflake
This article discusses why and how to use Datagaps and Snowflake together, and examines the challenges of bulk data migration to Snowflake.
Why and How?
This is where Datagaps comes into play.
Benefits of using Datagaps to test data movement into Snowflake
We recently sat down with one of our clients that uses our DataFlow product to test a data migration from on-premises SQL Server to Snowflake running on AWS.
Implementation
The client started the initial migration by bulk loading 400 SQL Server tables into Snowflake with minimal transformations. This was stage 0, in which they compared source and target data for over 500 million rows per table. Using the Data Migration wizard, the client generated comparison tests for the 400 tables in just a few hours. Even though few changes were made at this stage, our DataFlow product still surfaced errors.
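To make the idea of a source and target comparison concrete, here is a minimal sketch of one common approach: fingerprint every row by its key on both sides, then report keys that are missing, extra, or different. This is not how DataFlow is implemented. The in-memory sqlite3 databases stand in for SQL Server and Snowflake, and the orders table, its columns, and the function names are hypothetical. At hundreds of millions of rows per table, a real comparison would likely push the hashing into each database rather than pulling rows into Python.

```python
import hashlib
import sqlite3

NULL_MARKER = "<NULL>"  # keeps NULL distinct from an empty string when hashing


def row_fingerprints(conn, table, key_column):
    """Map each key value to a SHA-256 hash of the full row.

    Table and column names are assumed to be trusted identifiers.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    key_index = columns.index(key_column)
    fingerprints = {}
    for row in cursor:
        # Normalize every value to text so both sides hash the same way.
        payload = "|".join(NULL_MARKER if v is None else str(v) for v in row)
        fingerprints[row[key_index]] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return fingerprints


def compare_tables(source_conn, target_conn, table, key_column):
    """Return keys missing from the target, extra in the target, and mismatched."""
    source = row_fingerprints(source_conn, table, key_column)
    target = row_fingerprints(target_conn, table, key_column)
    common = source.keys() & target.keys()
    return {
        "missing": sorted(source.keys() - target.keys()),
        "extra": sorted(target.keys() - source.keys()),
        "mismatched": sorted(k for k in common if source[k] != target[k]),
    }


def make_db(rows):
    """Build a hypothetical in-memory table standing in for a real system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


# Source has three rows; the target lost row 3 and altered the amount on row 2.
source = make_db([(1, "acme", 10.5), (2, "globex", 20.0), (3, "initech", 30.0)])
target = make_db([(1, "acme", 10.5), (2, "globex", 25.0)])

result = compare_tables(source, target, "orders", "id")
print(result)  # {'missing': [3], 'extra': [], 'mismatched': [2]}
```

Even a load with minimal transformations can produce all three kinds of difference, which is why a full comparison is worthwhile at stage 0.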
Next, they began incremental migrations of new data, where they continued to find similar issues that had to be corrected. As this continued, they wanted to stop migrating new data incrementally from SQL Server and instead load it directly into Snowflake to reap the benefits stated earlier. To accomplish this, their original ETL processes needed to be migrated to an ELT process aimed at Snowflake.
Once the new processes were in place, DataFlow was used again to check the accuracy between the two systems. Over several iterations, the validations exposed issues in the new ELT process until the transformations were in sync. After a short period of testing, the client could cut over to the new system and deprecate the SQL Server environment. DataFlow then continued to validate the incremental data as it moved into Snowflake, finding issues earlier in the cycle, where they are less costly to fix in both time and lost credibility.
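Incremental validation of this kind can be sketched with a watermark: each run compares only the rows loaded since the last run that passed. The example below is an illustrative assumption, not a description of DataFlow. It uses sqlite3 in place of the real systems, and the sales table, the loaded_at and amount_cents columns, and the function names are hypothetical. Amounts are stored as integer cents so that the totals can be compared exactly.

```python
import sqlite3


def increment_summary(conn, table, watermark_column, measure_column, last_watermark):
    """Summarize rows newer than the watermark: row count, measure total, newest value.

    Table and column names are assumed to be trusted identifiers.
    """
    query = (
        f"SELECT COUNT(*), COALESCE(SUM({measure_column}), 0), MAX({watermark_column}) "
        f"FROM {table} WHERE {watermark_column} > ?"
    )
    count, total, newest = conn.execute(query, (last_watermark,)).fetchone()
    return {"count": count, "total": total, "newest": newest}


def validate_increment(source_conn, target_conn, table,
                       watermark_column, measure_column, last_watermark):
    """Compare the newest slice of data on both sides.

    The watermark advances only when the slice passes, so a failed
    increment is checked again on the next run.
    """
    source = increment_summary(source_conn, table, watermark_column, measure_column, last_watermark)
    target = increment_summary(target_conn, table, watermark_column, measure_column, last_watermark)
    counts_match = source["count"] == target["count"]
    totals_match = source["total"] == target["total"]
    passed = counts_match and totals_match
    next_watermark = source["newest"] if passed and source["newest"] else last_watermark
    return {
        "counts_match": counts_match,
        "totals_match": totals_match,
        "next_watermark": next_watermark,
    }


def make_db(rows):
    """Build a hypothetical in-memory table standing in for a real system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount_cents INTEGER, loaded_at TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


source = make_db([(1, 1000, "2024-01-01"), (2, 2000, "2024-01-02"), (3, 3000, "2024-01-03")])
# The target holds the same rows, but row 3 was transformed incorrectly.
target = make_db([(1, 1000, "2024-01-01"), (2, 2000, "2024-01-02"), (3, 3500, "2024-01-03")])

report = validate_increment(source, target, "sales", "loaded_at", "amount_cents", "2024-01-01")
print(report)
```

In this run the row counts agree but the totals do not, so the watermark stays at its previous value and the same slice is validated again after the load is corrected.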
How does it make a difference?
Testing time was reduced and testing ROI improved.
The amount of data tested increased from manual testing of 10,000 sample records to complete testing of 500 million records.
Conclusion
In conclusion, the migration project achieved its goals of agility, cost savings, and performance improvements.
The client also realized these benefits months earlier because DataFlow improved the migration process, contributing to an estimated 50% reduction in the test cycle.
One of our clients reports comparing a file against a Snowflake instance with




