Challenges in Database Testing
Databases are key component of most software applications. Application testing is well documented and important for the success of the project but often there is little or no testing done for the development done in databases. Database testing is the process of validating that the metadata (structure) and data stored in the database meets the requirement and design. Database Testing is important because it helps identify data quality and application performance issues that might other wise get detected only after the application has been live for some time.
Database Testing Categories
- Metadata Testing
- Data Quality Testing
- Reference Data Testing
- Database Procedure Testing
- Database Regression Testing
- Database Integration Testing
- Database Performance Testing
The purpose of Metadata Testing is to verify that the table definitions conform to the data model and application design specifications.
Data Type Check
Verify that the table and column data type definitions are as per the data model design specifications.
Example: Data Model column data type is NUMBER but the database column data type is STRING (or VARCHAR).
Data Length Check
Verify that the length of database columns are as per the data model design specifications.
Example: Data Model specification for the ‘first_name’ column is of length 100 but the corresponding database table column is only 80 characters long.
Index / Constraint Check
Verify that proper constraints and indexes are defined on the database tables as per the design specifications.
- Verify that the columns that cannot be null have the ‘NOT NULL’ constraint.
- Verify that the unique key and foreign key columns are indexed as per the requirement.
- Verify that the table was named according to the table naming convention.
Example 1: A column was defined as ‘NOT NULL’ but it can be optional as per the design.
Example 2: Foreign key constraints were not defined on the database table resulting in orphan records in the child table.
Metadata Naming Standards Check
Verify that the names of the database metadata such as tables, columns, indexes are as per the naming standards.
Example: The naming standard for Fact tables is to end with an ‘_F’ but some of the fact tables names end with ‘_FACT’.
Metadata Check Across Environments
Compare table and column metadata across environments to ensure that changes have been migrated appropriately.
Example: A new column added to the SALES fact table was not migrated from the Development to the Test environment resulting in ETL failures.
Automate Metadata Testing with ETL Validator
ETL Validator comes with Metadata Compare Wizard for automatically capturing and comparing Table Metadata.
- Track changes to Table metadata over a period of time. This helps ensure that the QA and development teams are aware of the changes to table metadata in both Source and Target systems.
- Compare table metadata across environments to ensure that metadata changes have been migrated properly to the test and production environments.
- Compare column data types between source and target environments.
- Validate Reference data between spreadsheet and database or across environments.
The purpose of Data Quality tests is to verify the accuracy and quality of the data. Data profiling is generally used to identify data quality issues in production systems once the application has been live for some time. However, the goal of database testing is to automate the data quality checks in the testing phase.
Duplicate Data Checks
Look for duplicate rows with same unique key column or a unique combination of columns as per business requirement.
Example: Business requirement says that a combination of First Name, Last Name, Middle Name and Data of Birth should be unique.
Sample query to identify duplicates
SELECT fst_name, lst_name, mid_name, date_of_birth, count(*) FROM Customer GROUP BY fst_name, lst_name, mid_name HAVING count(*)>1
Data Validation Rules
Many database fields can contain a range of values that cannot be enumerated. However, there are reasonable constraints or rules that can be applied to detect situations where the data is clearly wrong. Instances of fields containing values violating the validation rules defined represent a quality gap that can impact ETL processing.
Example: Date of birth (DOB). This is defined as the DATE datatype and can assume any valid date. However, a DOB in the future, or more than 100 years in the past are probably invalid. Also, the date of birth of the child is should not be greater than that of their parents.
Data Integrity Checks
This measurement addresses “keyed” relationships of entities within a domain. The goal of these checks is to identify orphan records in the child entity with a foreign key to the parent entity.
- Count of records with null foreign key values in the child table.
- Count of invalid foreign key values in the child table that do not have a corresponding primary key in the parent table.
Example: In a ERP scenario, Order Line table has a foreign key for the Order Header table. Check for Orphan Order line records without the corresponding Order header record.
1. Count of nulls in the Order Header foreign key column in the Order Line table:
SELECT count(order_header_id) FROM order_lines where order_header_id is null
2. Count of invalid foreign key values in the Order Line table:
SELECT order_header_id FROM order_lines
SELECT h.order_header_id FROM order_header h, order_lines l where h.order_header_id=l.order_header_id
Automate Data Quality Testing using ETL Validator
ETL Validator comes with Data Rules Test Plan and Foreign Key Test Plan for automating the data quality testing.
- Data Rules Test Plan: Define data rules and execute them on a periodic basis to check for data that violates them.
- Foreign Key Test Plan: Define data joins and identify data integrity issues without writing any SQL queries.
Many database fields can only contain limited set of enumerated values. Instances of fields containing values not found in the valid set represent a quality gap that can impact processing.
Verify that Data confirms to reference Data Standards
Data model standards dictate that the values in certain columns should adhere to a values in a domain.
Example: Values in the country_code column should have a valid country code from a Country Code domain.
select distinct country_code from address
select country_code from country
Compare Domain Values across Environments
One of the challenge in maintaining reference data is to verify that all the reference data values from the development environments has been migrated properly to the test and production environments.
Example: Compare Country Codes between development, test and production environments.
Track Reference Data Changes
Baseline reference data and compare it with the latest reference data so that the changes can be validated.
Example: A new country code has been added and an existing country code has been marked as deleted in the development environment without the approval or notification to the data stewart.
Automate Reference Data Testing using ETL Validator
ETL Validator comes with Baseline & Compare Wizard and Data Rules test plan for automatically capturing and comparing Table Metadata.
- Baseline reference data and compare with the latest copy to track changes to reference data.
- Define data rules to verify that the data conform to the domain values.
It is quite common to have database procedures with business logic in an application. As part of white box testing, examine the database procedure structure and derive test data from the program logic / code.
Database Procedure Unit Testing
Unit testing of the database procedures is similar to the unit testing process followed by development / QA teams for testing of code written in other languages such as Java and C#.
The steps to be followed are listed below:
- Review Design: Review the database procedures to understand the code and design specifications to come up with the Test Cases.
- Setup Test Data: Insert test data into the database tables for the test.
- Execute Test: Run the database procedure passing the appropriate input parameters. Compare data in tables and procedure output with the expected results.
- Teardown Test Data: Delete or modify the test data in the tables to their original state.
Example: In a financial company, the interest earned on the savings account is dependent the daily balance in the account for the month. A database procedure was written in the application to calculate the interest earned as part of the month end process.
- Review the requirement and design for calculating the interest and come up with the test cases.
- Setup test accounts and the corresponding daily balance records.
- Execute procedure for calculating the interest passing the corresponding account and month details. Verify that the interest output from the procedure is the expected value.
- Cleanup test account and daily balance data from database tables.
The goal of Database Regression testing is to identify any issues that might occur due to changes in the database metadata, procedures or system upgrades.
Automated Database Testing
Automating the Database testing is the key for regression testing of the Database particularly more so in an agile development environment. Organizing test cases into test plans (or test suites) and executing them automatically as and when needed can reduce the time and effort needed to perform the regression testing. Automating Database testing can also eliminate any human errors while performing manual checks.
Changes to Metadata
Track changes to table metadata such as adding or dropping of columns, new constraints and tables. Often database metadata changes are not communicated to the QA and Development teams resulting in Application failures. Maintaining a history of DDL changes will help narrow down the tests that need to be run.
Example 1: The length of a comments column was increased in the application UI but the corresponding increase was not made in the database table.
Example 2: One of the index in the database was dropped accidentally which resulted in application performance issues.
Automate Database Regression Testing using ETL Validator
ETL Validator comes with a Metadata Compare Wizard that helps track changes to Table metadata over a period of time. This helps ensure that the QA and development teams are aware of the changes to table metadata.
ETL Validator also has the capability to Baseline and Compare the output of database procedures so that any changes in the output can be validated.
When an application is tested from the UI, the database procedures and tables are accessed and tested as well. The goal of integration testing is to validate that the database tables are getting populated with the expected data based on input provided in the application.
End-to-End Data Testing
Integration testing of the database and the related applications involves the following steps:
- Review the application UI to database attribute mapping document. Prepare one if missing.
- Run tests in the application UI tracking the test data input.
- Verify that the data loaded into the database tables matches the input test data.
Example: One of the optional postal address field in the UI was not getting saved to the database because of an application defect. The issue was only identified after the application went live when some of the mails were returned.
Max. Data Length Testing
The focus of this test is to validate that the data is not getting truncated while being stored in the database:
- Review the application UI to database attribute mapping document. Prepare one if missing.
- Run tests in the application UI by entering the maximum allowed test data length from the UI.
- Verify that the data loaded into the database tables matches the input test data without any truncation.
Example: The application UI allowed the user to enter the comments larger than 2000 characters while the corresponding database column was defined as VARCHAR(2000). Any user comments longer than 2000 characters were getting truncated by the database procedure during the insert.
Database performance is an important aspect of the application performance testing. Often development environments do not have enough data for performance testing of the database. This could be because the project has just started and the database only has small amount of test data or production data has PII information which cannot be loaded into the test database without scrubbing. The application can behave differently with different volumes of data. Setting up realistic test data using one of the following strategies is a key to performance testing :
- Mask and refresh test data in the test environments from production environment.
- Generate larger volume of test data using data generation tools.
Example 1: An update might perform well when the data is small but it can run very slow with larger data volume because of a missing index.
Example 2: The application runs fine with a single user but performs badly when multiple users are using it due to a connection pool issue.
Download ETL Validator
Get a Competitive Advantage with ETL Validator. Download your 14 day free trial now.