Understanding Validation of Salesforce Objects, Uploads, Updates and How Datagaps DataOps Can Help

Intro to Salesforce 

Salesforce is a cloud-based CRM platform that helps businesses manage and analyze customer interactions and data throughout the customer lifecycle. It is used to store and organize information about customers, such as their contact details, communication history, and preferences. This data is used to provide a complete view of the customer, which can help businesses understand their needs and personalize their interactions with them.

Salesforce also provides tools for managing customer relationships, including customer segmentation, lead management, and customer service. These tools help businesses to identify and prioritize their most valuable customers, track and follow up on leads, and provide timely and efficient customer service.

In addition to its CRM capabilities, Salesforce also provides a range of tools and features for data management and integration, including data import and export, data modeling, and data governance. This makes it possible for businesses to manage and analyze their customer data in a centralized location and to integrate it with other systems and applications.

Production vs Development – Metadata Validation

In a Salesforce deployment, it is common for there to be differences in metadata between the development environment (also known as a sandbox or dev environment) and the production environment (also known as prod). This is because the development environment is often used for testing and experimentation, which can result in changes to the metadata that are not intended for the production environment.

Some examples of metadata changes that may occur in the development environment but not be intended for the production environment include:

Modifying the structure of objects or fields: This could involve adding or deleting fields, or changing field data types. For example, a developer may be testing a new feature that requires adding a new field to the Account object to store additional data. If this field is not needed in the production environment, it would be important to remove it before deploying the changes to prod.

Changing page layouts or field-level security settings: This could involve modifying the layout of a page to display new fields or rearranging existing ones, or changing the visibility of fields based on user roles or profiles. For example, a business user may be testing a new page layout for the Account object in the dev environment, but this layout may not be ready for production.

Modifying workflow rules or approval processes: This could involve adding or modifying rules that trigger actions based on certain conditions, or changing the steps or participants in an approval process. For example, a developer may be testing a new workflow rule in the dev environment that sends an email notification when an Account is created, but this rule may not be ready for production.


Adding or modifying custom objects or custom fields: This could involve creating new objects to store custom data or adding new fields to existing objects. For example, a business user may be testing a new custom object in the dev environment to track project tasks, but this object may not be ready for production.

These changes may occur in the development environment for a variety of reasons. For example, a developer may be testing a new feature or functionality and need to make changes to the metadata to support it. Or, a business user may be exploring different options for customizing the Salesforce instance and may make a series of changes as they iterate on their design.

Here, the Metadata Validation Node of the DataOps Suite can be used to ensure that the metadata of the Dev and Prod objects is in sync.

A Metadata Validation Node against Dev and Prod Salesforce Schemas
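To make the idea of a dev-versus-prod metadata check concrete, here is a minimal sketch in plain Python. It is not how the Metadata Validation Node is implemented; it only illustrates the kind of comparison involved. The field names and types, and the `diff_schemas` helper, are hypothetical and assume each schema has already been exported as a mapping of field name to data type.

```python
from typing import Dict, List, NamedTuple, Tuple


class SchemaDiff(NamedTuple):
    only_in_dev: List[str]
    only_in_prod: List[str]
    type_mismatches: List[Tuple[str, str, str]]  # (field, dev_type, prod_type)


def diff_schemas(dev: Dict[str, str], prod: Dict[str, str]) -> SchemaDiff:
    """Compare two {field_name: data_type} mappings for the same object."""
    only_in_dev = sorted(set(dev) - set(prod))
    only_in_prod = sorted(set(prod) - set(dev))
    type_mismatches = [
        (name, dev[name], prod[name])
        for name in sorted(set(dev) & set(prod))
        if dev[name] != prod[name]
    ]
    return SchemaDiff(only_in_dev, only_in_prod, type_mismatches)


# Hypothetical field listings for the Account object in each environment.
dev_account = {
    "Name": "string",
    "Phone": "phone",
    "AnnualRevenue": "double",
    "Loyalty_Tier__c": "picklist",  # test field added in dev only
}
prod_account = {
    "Name": "string",
    "Phone": "string",
    "AnnualRevenue": "double",
}

diff = diff_schemas(dev_account, prod_account)
print("Only in dev:", diff.only_in_dev)
print("Only in prod:", diff.only_in_prod)
print("Type mismatches:", diff.type_mismatches)
```

An empty result in all three lists would indicate that the two schemas agree on field names and types for that object.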

Upload Validation

There are several issues that can arise while uploading data to Salesforce, and the specific issues you may encounter can depend on the type of data you are uploading, the source of the data, and the type of Salesforce object you are uploading to. Here are a few examples of issues that can arise while uploading data to Salesforce:

If the data you are uploading is not properly formatted, it may not be accepted by Salesforce. For example, if you are uploading a CSV file and the data in the file is not properly structured, Salesforce may not be able to parse the data correctly.

If the data you are uploading contains errors or inconsistencies, it may cause issues with the integrity of your Salesforce data. For example, if you are uploading a list of leads and some of the leads are missing required fields, the upload may fail.

Each Salesforce object has its own set of fields and requirements, and if the data you are uploading does not meet these requirements, the upload may fail. For example, if you are uploading data to the Account object and the data does not contain a value for the required “Name” field, the upload may fail.

If you do not have the correct permissions in Salesforce, you may not be able to upload data to certain objects or fields. For example, if you are trying to upload data to a custom object that you do not have permission to access, the upload may fail.
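Some of the issues above, such as missing required fields, can be caught before the upload is attempted. The following is a rough sketch of such a pre-upload check using only the Python standard library. The sample CSV content and the `find_invalid_rows` helper are illustrative assumptions; the only requirement taken from the text is that the Account object needs a value for "Name".

```python
import csv
import io
from typing import List, Tuple


def find_invalid_rows(csv_text: str, required: List[str]) -> List[Tuple[int, List[str]]]:
    """Return (line_number, missing_fields) for each row lacking a required value.

    Line numbers assume one record per line (no multi-line quoted fields).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    for line_number, row in enumerate(reader, start=2):  # the header is line 1
        # A missing column yields None; an empty cell yields "".
        missing = [name for name in required if not (row.get(name) or "").strip()]
        if missing:
            problems.append((line_number, missing))
    return problems


# Hypothetical Account upload file: the second data row has no Name.
sample = "Name,Phone\nAcme Corp,555-0100\n,555-0101\nGlobex,\n"

invalid = find_invalid_rows(sample, required=["Name"])
for line_number, missing in invalid:
    print(f"Line {line_number}: missing {', '.join(missing)}")
```

Rows flagged this way can be corrected or set aside before the file is sent to Salesforce, so that one bad record does not cause the upload to fail.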

The Data Compare Node can be used to pull data directly from the Salesforce object and compare it against the on-premises file or dataset, as seen below.

A basic Data Compare checking Salesforce against a dataset

Upsert vs Update

It is possible to encounter issues when uploading a large set of records that contain duplicates or need to be updated rather than inserted as new records.

If you are using the upsert function and the records you are uploading contain duplicates based on the external ID field, the upsert function will treat these as updates rather than inserts and will update the existing records with the new data. This can be problematic if you want to insert the records as new records rather than updating the existing ones.

In this case, upsert is not the right choice. If you want the records created as new records, use the insert function. If you want to modify existing records based on criteria other than the external ID, use the update function, which allows you to specify a query to select the records you want to modify rather than relying on the external ID field to identify matching records.

However, it’s important to keep in mind that the update function can only modify existing records, and will not create new records. If you are using the update function and some of the records you are uploading do not already exist in Salesforce, the update function will not create these records. In this case, you may need to use the insert function to create new records or consider using the upsert function with a different external ID field.

It’s also worth noting that Salesforce has limits on the number of records you can upsert or update in a single operation, and you may need to perform the operation in smaller batches if you are working with a large number of records.
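Splitting a large record set into smaller batches can be sketched as follows. The batch size of 200 is an assumed value chosen for illustration, as are the record contents and the `External_Id__c` field name; check the actual limit for the API or tool you are using.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batches(records: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` records."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(records), size):
        yield records[start:start + size]


BATCH_SIZE = 200  # assumption for illustration, not a documented limit

# Hypothetical records keyed by an external ID field.
records = [{"External_Id__c": f"EXT-{i}"} for i in range(450)]

sizes = [len(batch) for batch in batches(records, BATCH_SIZE)]
print(sizes)
```

Each batch can then be passed to a separate upsert or update operation, which also makes it easier to retry only the batch that failed.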


The “upsert” function is used to either update existing records or create new records in an object, depending on whether a matching record already exists. When using the upsert function, you specify a field in the object that will be used as the unique identifier, called the “external ID”. If a record with a matching external ID already exists, the upsert function will update that record with the new data. If no matching record is found, the upsert function will create a new record with the provided data.


The “update” function is used to modify existing records in an object. When using the update function, you specify the records that you want to update using a query, and then specify the new field values that you want to set for those records. The update function will only modify records that already exist in the object, and will not create new records.

Both the upsert and update functions can be used to modify a single record or multiple records at once. They can be useful for updating or creating records in bulk, or for keeping data in Salesforce synchronized with data from other sources.
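The difference in behavior between the two functions can be illustrated with a toy in-memory model. This is not the Salesforce API; the `upsert` and `update` functions below are hypothetical stand-ins that operate on a Python dictionary keyed by an assumed `Ext` field, and they only mimic the create-or-modify semantics described above.

```python
from typing import Dict

Record = Dict[str, str]


def upsert(store: Dict[str, Record], incoming: Record, key_field: str) -> str:
    """Update the matching record if one exists, otherwise create it."""
    key = incoming[key_field]
    if key in store:
        store[key].update(incoming)
        return "updated"
    store[key] = dict(incoming)
    return "created"


def update(store: Dict[str, Record], incoming: Record, key_field: str) -> str:
    """Modify an existing record only; never create a new one."""
    key = incoming[key_field]
    if key not in store:
        return "skipped"
    store[key].update(incoming)
    return "updated"


# Toy stand-in for an object that already holds one record.
store = {"A-1": {"Ext": "A-1", "Name": "Acme"}}

results = [
    upsert(store, {"Ext": "A-1", "Name": "Acme Corp"}, "Ext"),  # match exists
    upsert(store, {"Ext": "B-2", "Name": "Globex"}, "Ext"),     # no match
    update(store, {"Ext": "C-3", "Name": "Initech"}, "Ext"),    # no match
]
print(results)
print(sorted(store))
```

In this model the upsert of `B-2` adds a record, while the update of `C-3` leaves the store unchanged, which is the distinction that matters when deciding where each incoming record should go.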

To solve the complex job of figuring out which records to send to upsert and which to send to update, the Data Compare Node comes in handy once again. Every node in the DataOps Suite has its results and comparisons saved as views that can be called upon internally. This allows results to be referenced and looped over within the system, so that complex solutions can be easily defined and solved in the Suite.

In this case, after the Data Compare between the Salesforce object and the CSV file in question is defined, the node creates a set of related views, such as records present only in the Salesforce object, records present only in the CSV file, records marked as differences, and more. Our focus will be on two of these: the differences dataset and the dataset that houses records present only in the CSV file. The records present only in the CSV file do not exist in the Salesforce object, so they need the upsert function to create them as new records there. The differences dataset marks the records that are present in the Salesforce object but need specific values updated. Here, the update function is used, preceded by Python code that identifies the exact values to be updated.

The different views created by the Data Compare Node and the corresponding upsert and update nodes.
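The split into views described above can be approximated in a few lines of Python. This is a simplified sketch and not the Data Compare Node's actual logic; the `compare` helper, the `Ext` key field, and the sample rows are all hypothetical.

```python
from typing import Dict, List, NamedTuple

Row = Dict[str, str]


class CompareResult(NamedTuple):
    only_in_file: List[Row]             # candidates for upsert (new records)
    only_in_salesforce: List[Row]
    differences: List[Dict[str, object]]  # candidates for update


def compare(file_rows: List[Row], sf_rows: List[Row], key: str) -> CompareResult:
    """Partition two row sets by a shared key and list changed field values."""
    file_by_key = {row[key]: row for row in file_rows}
    sf_by_key = {row[key]: row for row in sf_rows}

    only_in_file = [row for k, row in file_by_key.items() if k not in sf_by_key]
    only_in_sf = [row for k, row in sf_by_key.items() if k not in file_by_key]

    differences = []
    for k, file_row in file_by_key.items():
        sf_row = sf_by_key.get(k)
        if sf_row is None:
            continue
        # Keep only the fields whose value in the file differs from Salesforce.
        changed = {f: v for f, v in file_row.items() if sf_row.get(f) != v}
        if changed:
            differences.append({"key": k, "changed": changed})

    return CompareResult(only_in_file, only_in_sf, differences)


csv_rows = [
    {"Ext": "A-1", "Name": "Acme Corp"},
    {"Ext": "B-2", "Name": "Globex"},
    {"Ext": "D-4", "Name": "Umbrella"},
]
salesforce_rows = [
    {"Ext": "A-1", "Name": "Acme"},
    {"Ext": "C-3", "Name": "Initech"},
    {"Ext": "D-4", "Name": "Umbrella"},
]

result = compare(csv_rows, salesforce_rows, key="Ext")
print("Upsert:", result.only_in_file)
print("Update:", result.differences)
```

Listing only the changed fields per record keeps the update step narrow, so values that already match are not rewritten.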

Note that distinguishing between adding a record with a new external ID and updating an existing record is critical, especially when the objects are consumed by CRM and reporting tools, where the difference between a new record and an updated record matters.

The image below shows all views created by the Data Compare Node and the corresponding upsert and update nodes.


In conclusion, it is important to carefully validate Salesforce objects, uploads, and updates to ensure that your data is accurate and consistent. By following best practices and using the appropriate tools and techniques, such as the DataOps Suite, you can avoid common issues such as data formatting errors, data integrity problems, and object-specific issues. Whether you are working with Veeva objects, pre-sales CRM objects, or any other type of object in Salesforce, taking the time to validate your data will help you maintain the quality and reliability of your Salesforce data.


Established in 2010 with the mission of building trust in enterprise data and reports, Datagaps provides software for ETL Data Automation, Data Synchronization, Data Quality, Data Transformation, Test Data Generation, and BI Test Automation. We are an innovative company focused on providing the highest customer satisfaction, and we are passionate about data-driven test automation. Our flagship solutions, ETL Validator, Data Flow, and BI Validator, are designed to help customers automate the testing of ETL, BI, Database, Data Lake, Flat File, and XML data sources. Our tools support data warehousing projects and BI platforms including Snowflake, Tableau, Amazon Redshift, Oracle Analytics, Salesforce, Microsoft Power BI, Azure Synapse, SAP BusinessObjects, and IBM Cognos. www.datagaps.com

Queries: contact@datagaps.com