COBOL and Financial Institutions
COBOL (Common Business-Oriented Language) is a programming language that was developed in 1959 and remains in widespread use today, particularly in the finance and banking sectors. Many financial institutions still rely on COBOL systems to manage their data and processes, even though newer technologies such as Java and Python have largely replaced COBOL in other industries.
One of the reasons that COBOL systems are still in use is that they are extremely stable and reliable. These systems have been in operation for decades and have proven to be effective at handling large volumes of data and transactions. In addition, COBOL systems often include technologies such as VSAM (Virtual Storage Access Method) and copybooks, which help to manage and organize the data in these systems.
However, the hierarchical nature of COBOL and finance datasets can make it difficult to migrate this data to a modern system like Snowflake. In these systems, data is often organized in a tree-like structure with multiple levels of nested records. For example, a financial transaction record might contain multiple account details, each of which might contain multiple transaction details. This hierarchical structure can make it challenging to map the data to the flatter, more normalized structure used by Snowflake.

To address this challenge, organizations may consider using partial flattening when migrating their data. Partial flattening involves keeping some of the hierarchy in the data while still flattening out other parts. This can be done using the DataOps Suite's Python functions, which allow for more granular control over the data conversion process.
A quick note: the Suite also works with binary EBCDIC files. This blog post focuses on a COBOL system, but the same approach can be applied to binary files.
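For readers unfamiliar with EBCDIC: mainframe files typically use EBCDIC rather than ASCII encoding. As a minimal sketch outside the Suite, Python's built-in cp037 codec (EBCDIC US) illustrates the difference for a simple text field; packed-decimal and binary fields would need additional, copybook-driven handling.

```python
# Minimal sketch: decoding an EBCDIC (code page 037) text field in Python.
# Illustration only; packed-decimal (COMP-3) and binary fields require
# additional, copybook-driven conversion.
ebcdic_bytes = "ACME BANK".encode("cp037")   # simulate a field read from a mainframe file
print(ebcdic_bytes)                           # raw EBCDIC bytes, unreadable as ASCII
print(ebcdic_bytes.decode("cp037"))           # -> 'ACME BANK'
```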
Partial Flattening vs Complete Flattening
Let’s say that an organization is migrating a financial transaction record from a COBOL system to Snowflake. The transaction record in the COBOL system might have the following structure:
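The field names below are hypothetical, but they illustrate the kind of nesting involved: one transaction holds multiple accounts, and each account holds multiple transaction details, mirroring OCCURS clauses in the copybook.

```python
# Hypothetical, simplified transaction record after parsing the copybook.
# One transaction contains multiple accounts; each account contains
# multiple transaction details (mirroring OCCURS clauses in COBOL).
transaction_record = {
    "transaction_id": "TXN0001",
    "transaction_date": "2023-01-15",
    "accounts": [
        {
            "account_number": "ACC-100",
            "account_type": "SAVINGS",
            "details": [
                {"detail_id": "D1", "amount": 250.00, "currency": "USD"},
                {"detail_id": "D2", "amount": 75.50, "currency": "USD"},
            ],
        },
        {
            "account_number": "ACC-200",
            "account_type": "CHECKING",
            "details": [
                {"detail_id": "D3", "amount": 1200.00, "currency": "USD"},
            ],
        },
    ],
}
```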



Partial Flattening allows the organization to preserve some of the hierarchical structure of the original data (e.g., the relationship between an account and its transaction details), while still flattening out other parts to make it easier to work with in Snowflake.
Alternatively, the organization could use complete flattening to convert the transaction record. In this case, the entire hierarchical structure of the original data is flattened out, resulting in a more normalized and flat structure. However, this approach may make it more difficult to understand the relationships between different parts of the data, particularly if the data contains multiple levels of hierarchy.
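As a minimal sketch of the two approaches (plain pandas, not the Suite's own flattening functions, using the hypothetical record shown above):

```python
# Sketch: complete vs. partial flattening of a nested record with pandas.
# Illustration only; the DataOps Suite provides its own flattening functions.
import json
import pandas as pd

record = {
    "transaction_id": "TXN0001",
    "accounts": [
        {"account_number": "ACC-100",
         "details": [{"detail_id": "D1", "amount": 250.00},
                     {"detail_id": "D2", "amount": 75.50}]},
        {"account_number": "ACC-200",
         "details": [{"detail_id": "D3", "amount": 1200.00}]},
    ],
}

# Complete flattening: every detail becomes its own row, and the
# transaction- and account-level fields are repeated on each row.
complete = pd.json_normalize(
    record,
    record_path=["accounts", "details"],
    meta=["transaction_id", ["accounts", "account_number"]],
)
print(len(complete))  # 3 rows, one per detail

# Partial flattening: flatten only to the account level and keep the
# details nested (e.g., to load into a Snowflake VARIANT column).
partial = pd.json_normalize(record, record_path=["accounts"],
                            meta=["transaction_id"])
partial["details"] = partial["details"].apply(json.dumps)
print(len(partial))   # 2 rows, one per account
```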

Fig. DataOps Suite Partial Flattening Function

Fig. A Complex JSON Flattened

Fig. Completely Flat File – The complex columns are broken out into new rows

Fig. Partially Flat File
One of the key differences between partial flattening and complete flattening is the volume of data that is produced. Complete flattening expands every level of hierarchy, so each nested record becomes its own row and the parent-level fields are repeated on every one of those rows; for example, a single transaction with three accounts, each holding five detail records, becomes fifteen fully flattened rows. Partial flattening, by contrast, keeps some of the hierarchy as nested columns while flattening out other parts, which typically produces fewer rows and less repetition of parent-level data.
Data Quality Post Flattening
Once a COBOL file has been converted into the Related Datasets, a multitude of traditional test cases can be applied that would be difficult to implement against the original complex structure.
These take the form of duplicate checks on IDs within a substructure, list-of-values (domain) checks, null checks, character checks, and other data quality checks. Many of these are available in Datagaps' DataOps Suite.
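As a plain-pandas illustration of the kinds of rules involved (the Suite applies these as zero-code DQ nodes; the data and column names here are hypothetical):

```python
# Sketch of typical post-flattening data quality checks with pandas.
# Hypothetical flattened data; the DataOps Suite runs equivalent rules
# as zero-code DQ nodes.
import pandas as pd

flat = pd.DataFrame({
    "detail_id":      ["D1", "D2", "D2", "D4"],
    "account_number": ["ACC-100", "ACC-100", "ACC-200", None],
    "currency":       ["USD", "USD", "usd", "EUR"],
    "amount":         [250.00, 75.50, 1200.00, -10.00],
})

# Duplicate check: detail IDs should be unique within the dataset.
duplicates = flat[flat.duplicated(subset=["detail_id"], keep=False)]

# Null check: account_number must always be populated.
nulls = flat[flat["account_number"].isna()]

# List-of-values (domain) check: currency must come from an approved set.
valid_currencies = {"USD", "EUR", "GBP"}
bad_domain = flat[~flat["currency"].isin(valid_currencies)]

# Character/format check: detail_id must match the expected pattern.
bad_format = flat[~flat["detail_id"].str.match(r"^D\d+$", na=False)]

print(len(duplicates), len(nulls), len(bad_domain), len(bad_format))
```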

Fig. List of available data rules in this use case

Fig. How a Zero-Code DQ node looks in the Suite

Fig. Results of the Data Rules Check
Conclusion
Overall, the migration of COBOL and finance datasets to a modern system like Snowflake can be a complex and time-consuming process. By using partial flattening and the DataOps Suite's Python functions, organizations can migrate their data accurately and effectively while still preserving the hierarchical structure that is so important in these systems. This helps to validate the mainframe datasets, confirm that the data is correctly migrated to the new system, and maintain the integrity of the data so that it is properly understood by users.
Get a Free POC scheduled today!
