DataOps Suite Top Feature Updates – Version 2022.5.0
REST APIs are used extensively for application development and as a means of integration between systems. DataOps Suite now supports APIs as a data source, and the corresponding API component in Dataflow can call a REST API and automatically convert the response into datasets for further validation and processing.
Since REST API responses can be deeply hierarchical JSON documents, DataOps Suite automatically splits the data into separate, related datasets so that anyone with SQL knowledge can query the REST API output and validate it by comparing it with data from another data source.
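To illustrate the idea of splitting a hierarchical response into related datasets, here is a minimal Python sketch. It is not the DataOps Suite implementation: the sample response, the `split_orders` helper, and the table names are all hypothetical, and SQLite stands in for whatever engine runs the queries.

```python
import sqlite3

# Hypothetical API response: a list of orders, each with nested line items.
response = [
    {"order_id": 1, "customer": "Acme", "items": [
        {"sku": "A-1", "qty": 2}, {"sku": "B-7", "qty": 1}]},
    {"order_id": 2, "customer": "Globex", "items": [
        {"sku": "A-1", "qty": 5}]},
]

def split_orders(orders):
    """Split nested orders into a parent dataset and a child dataset.

    Each child row carries the parent's order_id so the two datasets
    can be joined back together.
    """
    order_rows, item_rows = [], []
    for order in orders:
        order_rows.append((order["order_id"], order["customer"]))
        for item in order["items"]:
            item_rows.append((order["order_id"], item["sku"], item["qty"]))
    return order_rows, item_rows

order_rows, item_rows = split_orders(response)

# Load both datasets into an in-memory database so they can be queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", order_rows)
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?)", item_rows)

# Total quantity per customer, joining the parent and child datasets.
totals = conn.execute(
    "SELECT o.customer, SUM(i.qty) FROM orders o "
    "JOIN order_items i ON i.order_id = o.order_id "
    "GROUP BY o.customer ORDER BY o.customer"
).fetchall()
print(totals)
```

Once the nested items sit in their own table with a key back to the parent, the validation query is ordinary SQL.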
The Plugin component adds extensibility to DataOps Suite. The plugin system lets users define methods with user-defined input parameters; once defined, these methods can be called by the end user with a few clicks. Plugins can be written in Python or Scala. Once configured, they can be used within a dataflow without copying and pasting code, which promotes code reuse.
One example plugin we recently created reads multiple sheets of an Excel workbook (usually reports received by email) as separate datasets to be used for comparisons.
Metrics Compare Component
The Metrics component makes it easy to compare metrics such as count and sum across different systems. In production data monitoring, it is a common requirement to compare the counts of source and target records to validate that the ETL job processed all the records as expected.
The Metrics component can also compare multiple metrics in a single component, and these metrics can be executed in parallel for better performance.
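The following Python sketch shows the general pattern of computing several metrics on a source and a target in parallel and comparing them. It is only an illustration under stated assumptions: the sample rows, the `METRICS` dictionary, and the `compare_metrics` function are hypothetical names, and in-memory lists stand in for real systems.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical source and target data; the target is missing one record.
source_rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 20.0},
               {"id": 3, "amount": 30.0}]
target_rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 20.0}]

# Each metric is a function from a list of rows to a single value.
METRICS = {
    "row_count": lambda rows: len(rows),
    "amount_sum": lambda rows: sum(r["amount"] for r in rows),
}

def compare_metrics(source, target, metrics):
    """Compute every metric on both sides concurrently and compare the results."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: (pool.submit(fn, source), pool.submit(fn, target))
            for name, fn in metrics.items()
        }
        results = {}
        for name, (src_future, tgt_future) in futures.items():
            src, tgt = src_future.result(), tgt_future.result()
            results[name] = {"source": src, "target": tgt, "match": src == tgt}
    return results

report = compare_metrics(source_rows, target_rows, METRICS)
for name, outcome in report.items():
    print(name, outcome)
```

With real data sources, each metric would typically be a query sent to the respective system, which is where running them in parallel pays off.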
Data Profile Component
The Data Profile component has long been available in DataOps Suite. It takes a dataset as input and automatically computes aggregates such as maximum value, minimum value, minimum length, maximum length, null (%), distinct (%), distinct count, null count, mean, sum, row count, constancy, kurtosis, skewness, and standard deviation.
In this release, the Data Profile component can automatically identify anomalies in data based on historical data profile values. For example, if an ETL job ingests 1000 records daily and one day processes only 100 records, the Data Profile component will automatically flag that as an anomaly.
This observability feature can also automatically detect anomalies in other data profile values, such as the percentage of nulls in a column or the mean value.
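As a rough illustration of what such a profile contains, the Python sketch below computes a subset of these aggregates for a single column. The `profile_column` function is hypothetical, and some definitions (for example, excluding nulls from the distinct count) are assumptions rather than the product's documented behavior.

```python
import statistics

def profile_column(values):
    """Compute a few profile aggregates for one column; None represents a null."""
    non_null = [v for v in values if v is not None]
    row_count = len(values)
    null_count = row_count - len(non_null)
    return {
        "row_count": row_count,
        "null_count": null_count,
        "null_pct": 100.0 * null_count / row_count if row_count else 0.0,
        # Assumption: nulls are not counted as a distinct value.
        "distinct_count": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "sum": sum(non_null),
        "mean": statistics.mean(non_null) if non_null else None,
        # Sample standard deviation needs at least two values.
        "std_dev": statistics.stdev(non_null) if len(non_null) > 1 else None,
    }

profile = profile_column([10, 20, None, 20, 50])
for name, value in profile.items():
    print(name, value)
```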
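One simple way to flag a value like the 100-record day is a standard-deviation rule over historical profile values. The sketch below is an assumption about how such a check could look, not the component's actual logic; the `is_anomaly` function, the threshold of 3, and the sample history are hypothetical.

```python
import statistics

def is_anomaly(history, latest, threshold=3.0):
    """Flag `latest` if it lies more than `threshold` standard deviations
    from the mean of the historical values."""
    mean = statistics.mean(history)
    std = statistics.stdev(history)
    if std == 0:
        # A perfectly constant history: any change is a deviation.
        return latest != mean
    return abs(latest - mean) / std > threshold

# Hypothetical daily row counts for an ETL job that ingests about 1000 records.
daily_row_counts = [1000, 1010, 990, 1005, 995, 1002, 998]

print(is_anomaly(daily_row_counts, 100))   # a day with only 100 records
print(is_anomaly(daily_row_counts, 1004))  # a day within the usual range
```

The same function could be applied to any other profile value tracked over time, such as a column's null percentage.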
Data Observability Improvements
The Data Analysis component automatically learns from historical data and identifies anomalies in the latest data. In data validation, anomaly detection plays a crucial role because anomalies are defined not by strict logic or rules but by patterns and aggregates. The supported prediction methods include time-series-based machine learning models as well as statistical methods such as IQR and standard deviation.
Version 2022.5.0 simplifies anomaly detection by providing the option to do inline prediction. The results screen has also been improved to show the anomalies identified during the run.
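For reference, the IQR method mentioned above can be sketched in a few lines of Python. This is a generic textbook version using the conventional 1.5 multiplier, not the component's implementation; the function names and sample history are hypothetical.

```python
import statistics

def iqr_bounds(history, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(history, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_anomalies(history, latest_values, k=1.5):
    """Return the latest values that fall outside the fences built from history."""
    lower, upper = iqr_bounds(history, k)
    return [v for v in latest_values if v < lower or v > upper]

# Hypothetical historical values of a metric, followed by new observations.
history = [1000, 1010, 990, 1005, 995, 1002, 998]
anomalies = find_anomalies(history, [1004, 100, 1500])
print(anomalies)
```

Unlike the standard-deviation rule, the IQR fences are based on quartiles, so a few extreme values in the history have less influence on the thresholds.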
SQL Query Builder
The SQL Query Builder option generates SQL queries from the data model so that users do not have to write them manually. It has three sections: Entities, Columns, and Conditions.
- The Entities section displays the tables and their columns corresponding to the schema of the selected data model.
- The Columns section displays the columns that will be used to generate the SQL query, with or without an aggregate function.
- The Conditions section allows the user to build the WHERE clause conditions they require.
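The three sections map naturally onto the parts of a SELECT statement. The Python sketch below shows one way such a query could be assembled from an entity, a list of columns with optional aggregates, and a list of conditions. The `build_query` function and its inputs are hypothetical and do not reflect how the Query Builder works internally.

```python
def build_query(entity, columns, conditions=None):
    """Assemble a SELECT statement.

    entity:     table name (Entities section)
    columns:    list of (column_name, aggregate_or_None) pairs (Columns section)
    conditions: list of WHERE predicates, combined with AND (Conditions section)

    Note: this concatenates strings for illustration only and must not be
    used with untrusted input.
    """
    select_parts, group_by = [], []
    for name, aggregate in columns:
        if aggregate:
            select_parts.append(
                f"{aggregate.upper()}({name}) AS {aggregate.lower()}_{name}")
        else:
            select_parts.append(name)
            group_by.append(name)

    sql = f"SELECT {', '.join(select_parts)} FROM {entity}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    # Group by the plain columns only when at least one aggregate is present.
    has_aggregate = len(group_by) < len(columns)
    if has_aggregate and group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql

query = build_query(
    "orders",
    [("region", None), ("amount", "sum")],
    ["order_date >= '2022-01-01'"],
)
print(query)
```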
In addition to the above-mentioned updates, many additions and enhancements were made to Data Rules, Test Cases, Test Data Manager (TDM), and the CLI. Version 2022.5.1 also adds support for Azure AD SSO integration.
DataOps Suite – Free Trial
Datagaps' DataOps Suite now comes with new components that add extensibility and connectivity with other applications. It also makes tests easier to create by automatically generating SQL queries and identifying anomalies based on the data profile.
Try DataOps Suite Free for 14 days…
Datagaps was established in 2010 with the mission of building trust in enterprise data and reports. The company provides software for ETL Data Automation, Data Synchronization, Data Quality, Data Transformation, Test Data Generation, and BI Test Automation. We are an innovative company focused on providing the highest customer satisfaction, and we are passionate about data-driven test automation. Our flagship solutions, ETL Validator, Dataflow, DQ Monitor, and BI Validator, are designed to help customers automate the testing of ETL, BI, Database, Data Lake, Flat File, and XML data sources. Our tools support data warehousing projects and BI platforms such as Snowflake, Tableau, Amazon Redshift, Oracle Analytics, Salesforce, Microsoft Power BI, and Azure Synapse. www.datagaps.com