As a strategic consultant, I regularly hear something to this effect from my customers:

“My data looks wrong. I can't report on it.”

“I can't show my data the way I want to.”

That’s never good to hear as a consultant, nor to experience as you're creating reports.

The purpose of data visualization is to show progress towards your business goals and metrics, as well as answer pertinent questions around growing your business. We use data to answer these questions, but what do you do when your data is the problem?

This is where data integrity comes into play. As defined by DialSource, data integrity “is the accurate and consistent entry of data throughout its lifecycle.” Maintaining your business’ data integrity translates to you policing your data’s structure, ensuring that it is collected and formatted in the best way.

Maintaining data integrity is an entire industry today, and it is intertwined with the CRM you choose to manage your business. People who help with data integrity include data scientists, business/data analysts, and others who work in a business’ operations department. When using HubSpot, understanding your data is no less vital, and data that is managed properly can help you grow better.

Data Integrity involves ensuring your data is entered in the correct format.

The Purpose of Data Integrity in HubSpot

When thinking about data integrity with any system, it’s important to know how your system is structured. As I wrote here, HubSpot provides four default objects and classifies its data into properties, which can take various forms depending on the property type. Properties are broadly divided into default properties, which HubSpot provides, and custom properties, which you create to map your business nuances to HubSpot.

Beyond just collecting data into your HubSpot instance, you also have to consider the fields that live in your other systems, and whether you want to bring that information into HubSpot. This is where integrations come into play.

Integrations, as described in this post, are connections to other systems that enable data sharing. Because data flows between the systems right away, you can use features in both. Broadly speaking, there are three types of integrations with HubSpot when thinking about how to connect your systems:

  • Native integrations: Direct connections existing between HubSpot and the other system. You can find pages on all of these integrations in the App Marketplace.
  • iPaaS: Short for integration platform as a service, these integrations use middleware to connect the external system with HubSpot. Popular iPaaS platforms include Automate.io, Zapier, and PieSync (the last of which HubSpot recently acquired).
  • Custom integrations: Integrations built on the API endpoints we have available. You can find our full documentation on this here.

When looking at which integration works best for your data’s integrity, think about the following:

  • What is the purpose of bringing data from the other system(s) into HubSpot? Will this further optimize your business processes? Do you need this data in HubSpot’s reports?
  • What type of integration is this going to be (e.g. native, iPaaS, or custom)?
  • What data needs to sync? What properties are involved here?
  • Which objects will that data live under in HubSpot? Is it just one object or multiple?
  • What are you going to do to manage that data once it’s in HubSpot?

Data integrity becomes increasingly important in the context of managing your data, and even more so when considering integrations. Jorie Munroe, Adi Shah, and I discussed the nuances and related issues of effectively managing your data.

There are numerous instances where you can maintain effective data integrity, and subsequently enable better decision making for your business.

Examples of Data Integrity Issues and Troubleshooting Options

Importing data when first migrating into HubSpot

When first using HubSpot, you need to import data to execute marketing campaigns, create benchmarks for reporting, and nurture your opportunities and customers. Often, these initial imports leave your properties, regardless of object, not showing as intended in reports.

The solution is to take more time preparing that initial import file, including the properties you want to set in the CSV for each object, before you import. Some examples are below:

  • Did some of your contacts originate from Organic Search? Enter that under their original source property.
  • What about the create date of those contacts? Enter that under the contact’s create date property.

Ensure that you are standardizing your data to account for specific property types when you import into HubSpot, as this will help with later reporting. Additionally, think about the associations that you want to make between objects (contacts, companies, deals, and tickets). Doing these steps in sequential order takes more time, but yields better results for you.
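One way to prepare the import file is to clean it with a small script before uploading. The sketch below is only an illustration: the column names ("Original Source", "Create Date"), the label mapping, and the accepted date formats are all assumptions, and should be replaced with the property names and options that exist in your own HubSpot account.

```python
import csv
import io
from datetime import datetime

# Hypothetical mapping from legacy labels to one consistent source label.
SOURCE_LABELS = {
    "organic": "Organic Search",
    "organic search": "Organic Search",
    "seo": "Organic Search",
    "ppc": "Paid Search",
    "paid search": "Paid Search",
}

# Date formats assumed to appear in the legacy export.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or an empty string if it cannot be parsed."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return ""


def clean_rows(csv_text):
    """Standardize the source and create date columns of a contact CSV."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = row["Original Source"].strip().lower()
        # Unknown labels are blanked so they can be reviewed by hand.
        row["Original Source"] = SOURCE_LABELS.get(source, "")
        row["Create Date"] = normalize_date(row["Create Date"])
        cleaned.append(row)
    return cleaned


sample = (
    "Email,Original Source,Create Date\n"
    "ana@example.com,organic,03/15/2019\n"
    "ben@example.com,PPC,2018-11-02\n"
)
rows = clean_rows(sample)
```

Blanking values that cannot be mapped, rather than guessing, keeps questionable rows visible for manual review before the import.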

Screenshot of importing objects into HubSpot. Remember to think about associations when importing, as that can benefit reporting.

Data is missing from reports

When data is missing from reports, check a few places to find where the gap originates. Are your filters blocking that data from appearing in your report(s)? Are you calling the right properties on the right objects? Are you even collecting the data in question, and if so, is it native to HubSpot or external?

The solution is to act on the answers to those questions. Remember that HubSpot has default properties that update regularly, which may give you some initial information to work with or to use in your forms when contacts submit them. If you have integration data that should flow into HubSpot but does not, check with whoever maintains the integration about resolving the issue. Properly researching the root causes will hopefully prevent future sync errors.
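To answer the question of whether the data is being collected at all, it can help to measure how often a property is actually filled in. The following is a minimal sketch that assumes you have exported your records into a list of dictionaries; the property names ("industry", "region") are hypothetical.

```python
def fill_rates(records, properties):
    """Return the share of records with a non-empty value for each property."""
    total = len(records)
    rates = {}
    for prop in properties:
        # Missing keys, None, and whitespace-only strings all count as empty.
        filled = sum(1 for r in records if str(r.get(prop) or "").strip())
        rates[prop] = filled / total if total else 0.0
    return rates


contacts = [
    {"email": "ana@example.com", "industry": "Retail", "region": ""},
    {"email": "ben@example.com", "industry": "", "region": ""},
    {"email": "cy@example.com", "industry": "Software", "region": None},
    {"email": "di@example.com", "industry": "Retail"},
]
rates = fill_rates(contacts, ["industry", "region"])
```

A property with a fill rate near zero points to a collection problem rather than a report filter problem.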

Data is funky in terms of its presentation

Another common issue is data not showing up the way you want it to within what reporting allows. Data is being collected, but remains funky or difficult to visualize. Legacy data collected over time can be daunting to report on when it ends up showing tens or hundreds of categories on bar or pie charts.

The solution is to look at the property types holding the data and think about how that shows up on a report. For example:

  • Single-line and multi-line text properties are generally difficult to visualize in any HubSpot report apart from summarized/unsummarized tables, especially if you are trying to show common choices selected by contacts. If that property were a single/multi-checkbox property type, it would be much easier to visualize that data in chart form because it’s formatted into a finite set of choices.
  • Number properties are the only fields that can show an average, total, minimum, or maximum value on a HubSpot report. If you have number values stored in a property type other than number, you cannot calculate averages or totals.
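Before changing a property type, it can be useful to check what the existing values actually look like. The sketch below assumes the values of one property have been exported as a list of strings; the function name, the category labels it returns, and the threshold of 10 choices are illustrative assumptions, not HubSpot rules.

```python
def audit_column(values, max_choices=10):
    """Suggest how a column of exported text values could be typed."""
    filled = [v.strip() for v in values if v and v.strip()]
    if not filled:
        return "empty"

    def is_number(text):
        # Treat values such as "1,200" as numeric by dropping thousands separators.
        try:
            float(text.replace(",", ""))
            return True
        except ValueError:
            return False

    if all(is_number(v) for v in filled):
        return "number"
    # Compare case-insensitively so "Retail" and "retail" count as one choice.
    distinct = {v.lower() for v in filled}
    if len(distinct) <= max_choices:
        return "finite set of choices"
    return "free text"
```

For example, `audit_column(["1,200", "35", " 7.5 "])` suggests a number property, while a column with only a handful of distinct answers is a candidate for a dropdown or checkbox type.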

For existing data with these data issues and a property type that you do not want to use for reporting:

  1. First create a custom property as a temporary holder.
  2. Use a workflow to copy the values from the original property into the temporary custom property, so you can delete all the values from the original property without losing data.
  3. After changing the original property to a type you can use going forward, use bulk edit to standardize the original property from the values in the temporary one, and automate further going forward using workflows, if appropriate.
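In HubSpot these steps are carried out with workflows and bulk edit, but the logic can be sketched on exported data to plan the mapping ahead of time. Everything named below (the "segment" and "segment_backup" properties, the option mapping, and the function itself) is hypothetical and only illustrates the copy, clear, and refill sequence.

```python
# Hypothetical mapping from legacy free-text answers to the new dropdown options.
OPTION_MAP = {
    "smb": "Small Business",
    "small business": "Small Business",
    "small biz": "Small Business",
    "enterprise": "Enterprise",
    "ent": "Enterprise",
}


def migrate_property(records, original, temporary, option_map):
    """Copy values to a holder, then refill the original with standardized options.

    Returns the raw values that could not be mapped, for manual review.
    """
    unmapped = []
    for record in records:
        # Steps 1 and 2: keep the raw value in the temporary holder, then clear the original.
        raw = record.get(original) or ""
        record[temporary] = raw
        record[original] = ""
        # Step 3: refill the original from the holder using the finite option set.
        key = raw.strip().lower()
        if key in option_map:
            record[original] = option_map[key]
        elif key:
            unmapped.append(raw)
    return unmapped


contacts = [
    {"segment": "SMB"},
    {"segment": " small biz "},
    {"segment": "Ent"},
    {"segment": "startup?"},
    {"segment": ""},
]
leftover = migrate_property(contacts, "segment", "segment_backup", OPTION_MAP)
```

Because the raw value always stays in the temporary holder, anything returned in `leftover` can still be reviewed and mapped later without loss of data.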

Incorporating offline sources

As of right now, offline sources are characterized as any source not from HubSpot. Contacts created through imported lists, integration data, API calls, conversations/chatbots, the sales email plugin, and other channels are all classified as offline sources. Here’s a full breakdown. While some of these categories do not actually occur offline, you and other customers will likely want to include some of that information in your reports.

Better incorporating your offline sources with your online ones starts with understanding the drill downs for offline and online original sources alike. You can then use workflows to group common drill downs together and show them as a combined set of categories, using that grouping as the main filter instead of separating all offline from all online sources.

With regard to integrations, while they are classified under offline sources in HubSpot, they can be distinguished by their IDs, which are housed in the original source’s drill-down-2 property. You can use this as part of your filter(s) on any report to ensure accuracy between both systems.
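The grouping described here can be sketched on exported contact data. This is an illustration under stated assumptions: the dictionary keys ("original_source", "drill_down_2"), the "Offline Sources" label, and the integration IDs and names in the lookup are all hypothetical placeholders for the values in your own account.

```python
from collections import Counter

# Hypothetical lookup from integration IDs (as found in drill-down 2) to readable names.
INTEGRATION_NAMES = {"12345": "Billing system", "67890": "Event platform"}


def combined_category(contact):
    """Return one reporting category that separates integrations from other offline sources."""
    source = contact.get("original_source") or "Unknown"
    if source != "Offline Sources":
        return source
    drill_down_2 = str(contact.get("drill_down_2") or "")
    name = INTEGRATION_NAMES.get(drill_down_2)
    return f"Integration: {name}" if name else "Offline Sources (other)"


def source_breakdown(contacts):
    """Count contacts per combined category."""
    return Counter(combined_category(c) for c in contacts)


contacts = [
    {"original_source": "Organic Search", "drill_down_2": "google"},
    {"original_source": "Offline Sources", "drill_down_2": "12345"},
    {"original_source": "Offline Sources", "drill_down_2": "12345"},
    {"original_source": "Offline Sources", "drill_down_2": "spring-list.csv"},
    {"original_source": "", "drill_down_2": ""},
]
breakdown = source_breakdown(contacts)
```

Comparing the count for each integration category against the record count in the other system is one simple way to spot sync gaps.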

Takeaways

Data integrity, as shown by the examples, is not something that is done at one point in time. Rather, it is something you need to work on consistently so that you can later trust the insights from your reports and other forms of data visualization.

Solutions to maintain your data integrity involve the tools that you have with HubSpot: workflows, bulk editing properties in the CRM, and your properties, both default and custom. These tools can help standardize and clean up data once it is in HubSpot, and indicate what you need to streamline in your data collection process or integrations. You can find details of these tools here.

As you are working on your data integrity, have your reports be the barometer for where you are lacking in terms of collecting data. This can help pinpoint areas that you want to add in your visualization, and ultimately enable better decision making. Hold regular meetings with your leadership, team managers, and analysts to discuss discrepancies in your data. For the purposes of data integrity, center your thoughts around the broad question: “Is this report the result of a process issue or a data issue?” If it’s the latter, investigate what is causing this issue with data integrity and resolve it.

Another takeaway is to involve your operations team with HubSpot. Appoint someone as a data integrity officer in charge of managing your data in HubSpot and other systems to ensure smooth syncing. Bringing your operations team to the table will help them buy in to the vision of how to appropriately connect your systems and subsequently report on them. More advanced functionality like our multi-touch attribution reports will be all the more valuable when the people who view data on a daily basis believe in the dataset. Those same teams will now have a better understanding of your tech stack’s structure and be able to identify opportunities to grow better.

Originally published Jun 30, 2020 9:00:00 AM, updated June 30 2020