Databases are key component of most software applications. Application testing is well documented and important for the success of the project but often there is little or no testing done for the development done in databases. Database testing is the process of validating that the metadata (structure) and data stored in the database meets the requirement and design. Database testing is important because it helps identify data quality and application performance issues that might other wise get detected only after the application has been live for some time.
The purpose of Metadata Testing is to verify that the table definitions conform to the data model and application design specifications.
Verify that the table and column data type definitions are as per the data model design specifications.
Example: Data Model column data type is NUMBER but the database column data type is STRING (or VARCHAR).
Verify that the length of database columns are as per the data model design specifications.
Example: Data Model specification for the ‘first_name’ column is of length 100 but the corresponding database table column is only 80 characters long.
Verify that proper constraints and indexes are defined on the database tables as per the design specifications.
Example 1: A column was defined as ‘NOT NULL’ but it can be optional as per the design.
Example 2: Foreign key constraints were not defined on the database table resulting in orphan records in the child table.
Verify that the names of the database metadata such as tables, columns, indexes are as per the naming standards.
Example: The naming standard for Fact tables is to end with an ‘_F’ but some of the fact tables names end with ‘_FACT’.
Compare table and column metadata across environments to ensure that changes have been migrated appropriately.
Example: A new column added to the SALES fact table was not migrated from the Development to the Test environment resulting in ETL failures.
ETL Validator comes with Metadata Compare Wizard for automatically capturing and comparing Table Metadata.
The purpose of Data Quality tests is to verify the accuracy and quality of the data. Data profiling is generally used to identify data quality issues in production systems once the application has been live for some time. However, the goal of database testing is to automate the data quality checks in the testing phase.
Look for duplicate rows with same unique key column or a unique combination of columns as per business requirement.
Example: Business requirement says that a combination of First Name, Last Name, Middle Name and Data of Birth should be unique.
Sample query to identify duplicates
SELECT fst_name, lst_name, mid_name, date_of_birth, count(*) FROM Customer GROUP BY fst_name, lst_name, mid_name HAVING count(*)>1
Many database fields can contain a range of values that cannot be enumerated. However, there are reasonable constraints or rules that can be applied to detect situations where the data is clearly wrong. Instances of fields containing values violating the validation rules defined represent a quality gap that can impact ETL processing.
Example: Date of birth (DOB). This is defined as the DATE datatype and can assume any valid date. However, a DOB in the future, or more than 100 years in the past are probably invalid. Also, the date of birth of the child is should not be greater than that of their parents.
This measurement addresses “keyed” relationships of entities within a domain. The goal of these checks is to identify orphan records in the child entity with a foreign key to the parent entity.
Example: In a ERP scenario, Order Line table has a foreign key for the Order Header table. Check for Orphan Order line records without the corresponding Order header record.
1. Count of nulls in the Order Header foreign key column in the Order Line table:
SELECT count(order_header_id) FROM order_lines where order_header_id is null
2. Count of invalid foreign key values in the Order Line table:
SELECT order_header_id FROM order_lines
SELECT h.order_header_id FROM order_header h, order_lines l where h.order_header_id=l.order_header_id
ETL Validator comes with Data Rules Test Plan and Foreign Key Test Plan for automating the data quality testing.
Many database fields can only contain limited set of enumerated values. Instances of fields containing values not found in the valid set represent a quality gap that can impact processing.
Data model standards dictate that the values in certain columns should adhere to a values in a domain.
Example: Values in the country_code column should have a valid country code from a Country Code domain.
select distinct country_code from address
select country_code from country
One of the challenge in maintaining reference data is to verify that all the reference data values from the development environments has been migrated properly to the test and production environments.
Example: Compare Country Codes between development, test and production environments.
Baseline reference data and compare it with the latest reference data so that the changes can be validated.
Example: A new country code has been added and an existing country code has been marked as deleted in the development environment without the approval or notification to the data stewart.
ETL Validator comes with Baseline & Compare Wizard and Data Rules test plan for automatically capturing and comparing Table Metadata.
It is quite common to have database procedures with business logic in an application. As part of white box testing, examine the database procedure structure and derive test data from the program logic / code.
Unit testing of the database procedures is similar to the unit testing process followed by development / QA teams for testing of code written in other languages such as Java and C#.
The steps to be followed are listed below:
Example: In a financial company, the interest earned on the savings account is dependent the daily balance in the account for the month. A database procedure was written in the application to calculate the interest earned as part of the month end process.
- Review the requirement and design for calculating the interest and come up with the test cases.
- Setup test accounts and the corresponding daily balance records.
- Execute procedure for calculating the interest passing the corresponding account and month details. Verify that the interest output from the procedure is the expected value.
- Cleanup test account and daily balance data from database tables.
The goal of Database Regression testing is to identify any issues that might occur due to changes in the database metadata, procedures or system upgrades.
Automating the Database testing is the key for regression testing of the Database particularly more so in an agile development environment. Organizing test cases into test plans (or test suites) and executing them automatically as and when needed can reduce the time and effort needed to perform the regression testing. Automating Database testing can also eliminate any human errors while performing manual checks.
Track changes to table metadata such as adding or dropping of columns, new constraints and tables. Often database metadata changes are not communicated to the QA and Development teams resulting in Application failures. Maintaining a history of DDL changes will help narrow down the tests that need to be run.
Example 1: The length of a comments column was increased in the application UI but the corresponding increase was not made in the database table.
Example 2: One of the index in the database was dropped accidentally which resulted in application performance issues.
ETL Validator comes with a Metadata Compare Wizard that helps track changes to Table metadata over a period of time. This helps ensure that the QA and development teams are aware of the changes to table metadata.
ETL Validator also has the capability to Baseline and Compare the output of database procedures so that any changes in the output can be validated.
Example: One of the optional postal address field in the UI was not getting saved to the database because of an application defect. The issue was only identified after the application went live when some of the mails were returned.
Example: The application UI allowed the user to enter the comments larger than 2000 characters while the corresponding database column was defined as VARCHAR(2000). Any user comments longer than 2000 characters were getting truncated by the database procedure during the insert.
Example 1: An update might perform well when the data is small but it can run very slow with larger data volume because of a missing index.
Example 2: The application runs fine with a single user but performs badly when multiple users are using it due to a connection pool issue.