This article discusses how a migration from Oracle to Azure SQL Server can be validated from an end-to-end perspective.
Why migrate from Oracle to Azure SQL Server?
Many organizations continue to use Oracle Database to store transactional and warehouse data in their on-premises data centers. However, as data grows exponentially, these environments struggle to handle the increasing load, which requires more processing power and storage and drives up costs. Migrating Oracle data to the cloud is therefore a viable option for optimizing cost and performance, and moving the Oracle database to Azure SQL Server (or Azure Synapse) is a popular choice.
Migration from Oracle to Azure SQL Server is justified mainly by the following:
Help save money: Oracle Database is a great platform, but it involves sizeable licensing costs.
Moving data and compute from an on-premises data center with limited resources to the cloud opens up the possibility of scaling on demand.
Raw data can be stored in a data lake using ADLS, reducing the need to store every dataset in the Oracle database.
Reduce on-premises database and data center management costs by making use of cloud-based PaaS.
Primary Challenges in Data Migration
While both Oracle and SQL Server are RDBMSs with SQL as the primary language, a few differences can cause issues during or after migration.
- Data type conversion issues – Because the two systems interpret the ANSI SQL standard slightly differently, data type conversions vary, especially for numeric and date data types and formats.
- Metadata migration issues – Oracle database objects such as indexes, sequences, packages, procedures, triggers, grants and constraints need to be migrated and validated in the SQL Server database.
- Performance for data-intensive workloads in the new database needs to be validated and tuned after migration.
- Data corruption and loss: Migration carries the risk of data corruption, data loss, and extended downtime. If the target system encounters a defect while the transfer is underway, the migration can fail and corrupt data; data load failures may also result in duplicate or missing records.
- Prolonged downtime: Extended downtime becomes likely when the migration takes longer than anticipated. The source system is not functional during the migration process, which presents a real risk for organizations.
- Incremental data syncs: Migration projects usually involve syncing incremental data between the Oracle and SQL Server databases while the new environment is being tested and validated. This requires data comparison and validation to ensure that both systems hold the same data and that the new SQL Server database operates as expected.
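The data type conversion issue above can be made concrete with a small sketch. Below is a minimal, hypothetical Oracle-to-SQL-Server type mapping used during assessment to flag columns that need manual review; the mapping covers only a few common cases and is purely illustrative, not an authoritative conversion table.

```python
# Hypothetical type-mapping table used during pre-migration assessment.
# Real conversions must carry precision/scale and are more nuanced.
TYPE_MAP = {
    "NUMBER": "NUMERIC",      # precision/scale must be specified explicitly
    "VARCHAR2": "VARCHAR",
    "DATE": "DATETIME2",      # Oracle DATE also stores a time component
    "CLOB": "VARCHAR(MAX)",
    "RAW": "VARBINARY",
}

def map_column_type(oracle_type: str) -> str:
    """Return a suggested SQL Server type, or flag the column for review."""
    return TYPE_MAP.get(oracle_type.upper(), f"REVIEW:{oracle_type}")

# Example: column types pulled from Oracle's data dictionary (made-up names)
columns = [("ORDER_ID", "NUMBER"), ("ORDER_DATE", "DATE"), ("GEO", "SDO_GEOMETRY")]
for name, typ in columns:
    print(name, "->", map_column_type(typ))
```

Types without a known mapping (such as spatial types) are flagged for manual review rather than converted blindly.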
How to Overcome the Risks Associated with Data Migration?
While migrating data, here are some of the best practices to keep in mind:
- Data backup: Back up the data before proceeding with implementation. If something goes wrong during execution, you can revert to the backup.
- Plan efficiently: Manage your resources well in advance to effectively carry out the migration process. Data migration can be complex, so stick to the plan and work accordingly.
- Automated Testing: To ensure favorable outcomes, testing is crucial. Therefore, throughout the planning and implementation phase, test the data migration process.
The end-to-end migration can be divided into 3 phases and each phase consists of one or more activities.
- Pre-migration – Discover, assess, and convert. These activities prepare the plan for a successful migration. The conversion activity is required in heterogeneous migration scenarios to convert the source schema to work in the target database.
- Migration – Schema migration, data migration, and data sync. All three activities are important for a successful migration and for deciding the cutover from the source to the target environment (the new source after a successful migration).
- Post-migration – Validation, application remediation, and performance validation and optimization. These tasks verify the migration status, remediate the application, and validate and optimize performance.
For any successful technology migration, a defined approach is required which helps to recover from unknown situations. The migration phases consist of multiple activities and each activity has its own set of tasks to accomplish.
To do the database migration, the following activities need to be completed.
- Develop validation tests: To test the database migration, we need to use SQL queries. Create the validation queries to run against both the source and target databases. Your validation queries should cover the scope that you’ve defined.
- Set up a test environment: The test environment should contain a copy of the source database and target database.
- Run validation tests: Run the validation tests against the source and target, and then view the results.
- Run performance tests: Run performance tests against the source and target, analyze and then compare the results.
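The "run validation tests" step can be sketched as running the same query against both databases and comparing the results. In this minimal sketch, SQLite stands in for both Oracle and SQL Server (in practice you would open connections with an Oracle and a SQL Server driver instead); the table and data are made up.

```python
import sqlite3

# SQLite stands in for the source (Oracle) and target (SQL Server) databases.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
target.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])

def run_check(query: str) -> bool:
    """A validation test passes when source and target return identical results."""
    src = source.execute(query).fetchall()
    tgt = target.execute(query).fetchall()
    return src == tgt

print(run_check("SELECT COUNT(*) FROM orders"))     # row-count check
print(run_check("SELECT SUM(amount) FROM orders"))  # aggregate check
```

Each validation query from the defined scope becomes one `run_check` call, which keeps the test suite easy to automate and rerun after every sync.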
Pitfalls and Drawbacks
The scope of data migration projects is often underestimated, resulting in delays. Data validation is a key component of these projects, and a lack of effort to automate validation results in data issues being identified only after go-live. Delays in data migration may impact other strategic initiatives within the enterprise, since migrating to the cloud is one of the first steps toward digital transformation.
- Cost of migration
- Time to market
- Poor data quality
Migrating data is not just moving it from the source system to the destination system; it is also the complex task of retaining data quality, which requires a dedicated validation team with quality assurance skills and trusted validation tools.
From a validation point of view, this means the data must be validated thoroughly end-to-end as it is migrated from the existing system to the new destination system. Validation of migrated data can be achieved by following some standard steps.
Migration Validation Scope
One must have a clear understanding and accurate analysis of the scope of migration testing: what needs to be tested, and when.
Understanding the data volumes and the source/target network connectivity available for validation is key. Depending on the data volumes, the data may be migrated in phases, such as an initial migration followed by incremental migrations.
Comparing metadata such as tables, columns, and data types between Oracle and SQL Server is the first validation to perform, before executing the data migration jobs or validating the migrated data.
Converting database objects such as procedures, packages, functions and triggers will require custom development and testing.
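The metadata comparison described above can be sketched as diffing (table, column, type) triples from each side. Here SQLite's `PRAGMA table_info` stands in for Oracle's `ALL_TAB_COLUMNS` and SQL Server's `INFORMATION_SCHEMA.COLUMNS`; the tables are made-up examples.

```python
import sqlite3

def schema_of(conn):
    """Collect (table, column, declared type) triples from a database."""
    meta = set()
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'").fetchall()
    for (table,) in tables:
        for row in conn.execute(f"PRAGMA table_info({table})"):
            meta.add((table, row[1], row[2]))  # (table, column, type)
    return meta

src = sqlite3.connect(":memory:")
tgt = sqlite3.connect(":memory:")
src.execute("CREATE TABLE emp (id INTEGER, name TEXT, hired TEXT)")
tgt.execute("CREATE TABLE emp (id INTEGER, name TEXT)")  # 'hired' was dropped

missing_in_target = schema_of(src) - schema_of(tgt)
print(missing_in_target)  # columns that did not make it across
```

The same set difference, run in the other direction, reveals columns that exist only in the target (for example, columns added by the conversion tooling).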
Data Transformation check
This process validates that the data transformation works in accordance with the mapping-logic document and business rules, and checks that new tables and columns are created with the proper data types during the data migration.
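A transformation check can be sketched by re-applying the rule from the mapping-logic document to the source values and comparing against the migrated values. The rule below (full name = upper-cased first + last) is a made-up example, not from the article.

```python
# Hypothetical mapping rule from the mapping-logic document.
def expected_full_name(first: str, last: str) -> str:
    return f"{first.strip()} {last.strip()}".upper()

# Source columns and the corresponding migrated (target) values.
source_rows = [("Ada", "Lovelace"), ("Alan", "Turing")]
target_rows = ["ADA LOVELACE", "ALAN TURING"]

mismatches = [
    (src, tgt)
    for src, tgt in zip(source_rows, target_rows)
    if expected_full_name(*src) != tgt
]
print(len(mismatches))  # 0 when the transformation matches the mapping logic
```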
Data Integrity and Uniqueness check
- Data Integrity Check: Key attributes are validated to confirm that the referential integrity between parent and child tables is mapped correctly and that any invalid data is removed rather than loaded.
- Data Uniqueness Check: This is essentially a duplicate check, which verifies the uniqueness of the data in a particular table according to the data model rules.
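Both checks above reduce to simple SQL against the target database. In this sketch SQLite stands in for SQL Server, and the `dept`/`emp` tables are illustrative: a LEFT JOIN finds orphaned child rows (integrity), and a GROUP BY ... HAVING finds duplicated business keys (uniqueness).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY);
    CREATE TABLE emp  (emp_id INTEGER, dept_id INTEGER);
    INSERT INTO dept VALUES (10), (20);
    INSERT INTO emp  VALUES (1, 10), (2, 99), (2, 99);  -- orphan + duplicate
""")

# Integrity: child rows whose parent key does not exist
orphans = db.execute("""
    SELECT e.emp_id FROM emp e
    LEFT JOIN dept d ON d.dept_id = e.dept_id
    WHERE d.dept_id IS NULL
""").fetchall()

# Uniqueness: business keys that occur more than once
dupes = db.execute("""
    SELECT emp_id, COUNT(*) FROM emp
    GROUP BY emp_id HAVING COUNT(*) > 1
""").fetchall()

print(orphans, dupes)  # both should be empty after a clean migration
```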
Data Completeness check
Data completeness is evaluated throughout ETL testing using aggregate functions such as SUM, MAX, MIN, and COUNT to profile a column or record. Throughout the evaluation, verify that the data in the target system matches expectations after loading; this helps achieve data completeness. Typical checks include:
- Record count verification.
- Comparing source and target data sets.
- Records validation.
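The completeness checks above can be sketched as comparing an aggregate "profile" (COUNT, SUM, MIN, MAX) of each column between source and target. SQLite again stands in for both databases, and the `sales` table is a made-up example.

```python
import sqlite3

src = sqlite3.connect(":memory:")
tgt = sqlite3.connect(":memory:")
for db in (src, tgt):
    db.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
src.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 5.0), (2, 7.5)])
tgt.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 5.0), (2, 7.5)])

# One query profiles row count plus per-column aggregates.
AGG = "SELECT COUNT(*), SUM(amount), MIN(amount), MAX(amount) FROM sales"
src_profile = src.execute(AGG).fetchone()
tgt_profile = tgt.execute(AGG).fetchone()
print(src_profile == tgt_profile)  # True when the column profiles match
```

Matching profiles do not guarantee row-level equality, which is why the minus-query comparison described later is still needed; but profiles are cheap to compute on large tables and catch most load failures early.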
Data Quality Check
It is critical to ensure that the data is loaded correctly into the destination tables: the application must correctly reject bad records, substitute default values, ignore unwanted characters, and report any invalid records.
By following the above steps, the data migration can be validated end to end.
Validation of ETL Changes
Since the target initially contains no data, the first ETL run performs a full/initial load; when the ETL process runs a second, third, or fourth time, data is loaded incrementally, and changes are appended to the data warehouse to capture modified records.
A record is inserted only if it does not already exist; otherwise, an update is applied only to the modified records.
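The insert-or-update rule above is an upsert. In this sketch, SQLite's `INSERT ... ON CONFLICT` (available in SQLite 3.24+) stands in for a SQL Server MERGE statement; the `dim_customer` table and its rows are made up for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO dim_customer VALUES (1, 'Acme')")

# One changed record and one brand-new record arrive in the incremental feed.
changes = [(1, "Acme Corp"), (2, "Globex")]
db.executemany("""
    INSERT INTO dim_customer (id, name) VALUES (?, ?)
    ON CONFLICT(id) DO UPDATE SET name = excluded.name
""", changes)

rows = db.execute("SELECT id, name FROM dim_customer ORDER BY id").fetchall()
print(rows)  # existing row updated, new row inserted
```

Validating incremental syncs then means confirming that exactly these semantics held: no duplicates from re-inserted keys, and no stale values for modified keys.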
Validate data sources – Perform a record count check and verify that table and column data types meet the data model's specifications. Make sure check keys are in place and duplicate data is removed; otherwise, aggregate reports could be inaccurate or misleading.
Validate constraints – This involves validating the constraints and making sure they are applied on the expected tables. This involves checking the misuse of integrity constraints like Foreign Key.
Validate nulls – This involves validating the full data set in the source and target tables using a minus query. Perform both source-minus-target and target-minus-source; any rows returned by the minus query should be treated as mismatched rows.
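The minus query above maps directly to set operators: Oracle's MINUS corresponds to EXCEPT in SQL Server (and in SQLite, used here as a stand-in). This sketch runs the comparison in both directions over made-up `src`/`tgt` tables.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE src (id INTEGER, val TEXT);
    CREATE TABLE tgt (id INTEGER, val TEXT);
    INSERT INTO src VALUES (1, 'a'), (2, 'b');
    INSERT INTO tgt VALUES (1, 'a'), (2, 'B');  -- value drifted during load
""")

# Run the minus query in both directions, as the text recommends.
src_minus_tgt = db.execute(
    "SELECT id, val FROM src EXCEPT SELECT id, val FROM tgt").fetchall()
tgt_minus_src = db.execute(
    "SELECT id, val FROM tgt EXCEPT SELECT id, val FROM src").fetchall()
print(src_minus_tgt, tgt_minus_src)  # non-empty results are mismatched rows
```

Running both directions matters: source-minus-target catches rows lost or altered during the load, while target-minus-source catches rows the load fabricated or duplicated with different values.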
Data migration validation is essential to ensure that the data has been migrated into the cloud database without corruption or loss. The post-migration testing phase should include comparing the record counts in source versus target, and it must be confirmed that records were transformed per the business requirements. Then data reconciliation, duplicate checks, NULL checks, and referential integrity checks must be performed to ensure the quality of the data is retained. Data is a key asset for any enterprise today, and it should be treated as such during the data migration.
Established in 2010 with the mission of building trust in enterprise data and reports, Datagaps provides software for ETL data automation, data synchronization, data quality, data transformation, test data generation, and BI test automation. We are an innovative company focused on customer satisfaction and passionate about data-driven test automation. Our flagship solutions, ETL Validator, Data Flow, and BI Validator, are designed to help customers automate the testing of ETL, BI, database, data lake, flat file, and XML data sources. Our tools support Snowflake, Tableau, Amazon Redshift, Oracle Analytics, Salesforce, Microsoft Power BI, Azure Synapse, SAP BusinessObjects, IBM Cognos, and other data warehousing projects and BI platforms. www.datagaps.com