skip to content
san.is
Table of Contents

Every HubSpot migration starts with optimism. The new CRM will be cleaner, the processes will be better, the team will love it. Then you receive the export files. The optimism fades rather quickly.

This article catalogues the blockers I’ve encountered across dozens of migrations from Dynamics, Salesforce, Marketo, and legacy systems that shall remain nameless to protect the innocent. Use it as a pre-flight checklist before your migration begins.

Critical Blockers

These issues will stop your migration entirely if not addressed. Ask me how I know.

1. CSV Format Issues

The Problem: Files exported from legacy systems often aren’t properly formatted CSVs. They look like CSVs. They have the .csv extension. They are lying.

Common issues:

  • Semicolon delimiters instead of commas
  • Data in a single column requiring parsing
  • BOM characters () in headers
  • Inconsistent quoting of text fields
  • Line breaks within fields

Detection:

import csv
with open('contacts.csv', 'r', encoding='utf-8-sig') as f:
reader = csv.reader(f)
header = next(reader)
# Check if data is in single column
if len(header) == 1 and ';' in header[0]:
print("WARNING: Semicolon-delimited - needs conversion")
# Count columns
print(f"Column count: {len(header)}")

Solution: Convert to proper CSV before import. In Python:

import pandas as pd
df = pd.read_csv('export.csv', delimiter=';', encoding='utf-8-sig')
df.to_csv('import.csv', index=False, encoding='utf-8')

2. Missing Unique Identifiers

The Problem: HubSpot requires unique identifiers to match records. Without them, every row creates a new record.

ObjectRequired Identifier
ContactsEmail address
CompaniesDomain name
DealsRecord ID (for updates)
Custom ObjectsUnique value property

Detection: Check fill rates on identifier columns.

df = pd.read_csv('contacts.csv')
email_fill = df['email'].notna().mean() * 100
print(f"Email fill rate: {email_fill:.1f}%")

If contacts have <90% email fill rate, you’ll create orphaned records that can’t be deduplicated later.

Solution:

  • For contacts without emails: consider whether they should be migrated
  • For companies without domains: generate placeholder domains (companyname.placeholder.com) and flag for cleanup
  • For custom objects: designate a unique identifier property before import

3. Missing Association Columns

The Problem: Your contact file has no way to link records to companies. Your deal file has no way to link to contacts.

HubSpot needs shared identifiers between files:

  • Contact → Company: Company domain or company name
  • Deal → Contact: Contact email
  • Deal → Company: Company domain

Detection: Look for association columns in your exports.

Expected columns for contacts:

  • company_domain or account_domain → for company association
  • company_name or account_name → fallback association

Expected columns for deals:

  • contact_email → for contact association
  • company_domain → for company association

Solution:

  • If missing, extract associations from the source system separately
  • Create association mapping files: contact_id,company_id
  • Import in order: Companies → Contacts → Deals, using IDs from each import

4. Date/Time Format Inconsistencies

The Problem: Dates in formats HubSpot doesn’t recognise cause import failures or data corruption.

HubSpot accepts:

  • ISO 8601: 2025-03-12T14:30:00Z
  • Simple date: 2025-03-12
  • Unix timestamp (milliseconds): 1710252600000

Common problematic formats:

  • 12/03/2025 (ambiguous: MM/DD or DD/MM?)
  • March 12, 2025
  • 12-Mar-25

Detection:

date_columns = ['createdate', 'closedate', 'last_modified']
for col in date_columns:
if col in df.columns:
sample = df[col].dropna().head(5).tolist()
print(f"{col}: {sample}")

Solution: Standardise to ISO 8601 before import.

df['createdate'] = pd.to_datetime(
df['createdate'],
format='%d/%m/%Y' # Specify source format
).dt.strftime('%Y-%m-%d')

High-Risk Blockers

These won’t stop the import but will cause significant problems that you’ll discover at the worst possible moment.

5. Property Name Mismatches

The Problem: Source system uses emailaddress1, HubSpot expects email. Source uses firstname, HubSpot uses first_name.

Detection: Create a mapping document comparing source columns to HubSpot properties.

Source ColumnHubSpot PropertyStatus
emailaddress1email✅ Needs mapping
firstnamefirstname✅ Exact match
jobtitlejobtitle✅ Exact match
company-⚠️ Not a valid deal property

Solution: Use HubSpot’s import mapping interface, or pre-rename columns:

column_mapping = {
'emailaddress1': 'email',
'company_name': 'company',
'mobilephone': 'mobilephone'
}
df.rename(columns=column_mapping, inplace=True)

6. Dropdown Value Mismatches

The Problem: Your Lifecycle Stage values don’t match HubSpot’s options.

Source: Prospect, Customer, Churned HubSpot: subscriber, lead, marketingqualifiedlead, salesqualifiedlead, opportunity, customer, evangelist, other

Detection: Extract unique values from dropdown columns.

for col in ['lifecycle_stage', 'lead_status', 'industry']:
if col in df.columns:
unique_values = df[col].dropna().unique()
print(f"\n{col}:")
for v in unique_values[:20]:
print(f" - {v}")

Solution: Either:

  1. Map source values to HubSpot options
  2. Create custom property options in HubSpot before import
  3. Use a calculated property post-migration

7. Owner Assignment Issues

The Problem: Owner fields reference source system user IDs that don’t exist in HubSpot.

Source: owner_id: USR00123 HubSpot: Expects user email or HubSpot owner ID

Detection: Check owner column values.

owners = df['owner'].dropna().unique()
print(f"Unique owners: {len(owners)}")
print(owners[:10])

Solution: Create user mapping:

owner_mapping = {
'USR00123': '[email protected]',
'USR00456': '[email protected]'
}
df['hubspot_owner_email'] = df['owner'].map(owner_mapping)

Import using the email column, and HubSpot will resolve to owner IDs.

8. Multi-Select Property Values

The Problem: Multi-select values have inconsistent delimiters or formats.

Source: Technology; Healthcare; Finance HubSpot expects: Technology;Healthcare;Finance (no spaces after semicolons)

Detection:

multi_select_cols = ['industries', 'products', 'interests']
for col in multi_select_cols:
if col in df.columns:
sample = df[col].dropna().head(3).tolist()
print(f"{col}: {sample}")

Solution: Standardise delimiters.

df['industries'] = df['industries'].str.replace('; ', ';')

Data Quality Blockers

These won’t fail the import but will contaminate your CRM in ways that make everyone slightly miserable for years.

9. Duplicate Records

The Problem: Source system has duplicate contacts that will all import as separate records.

Detection:

duplicates = df[df.duplicated(subset=['email'], keep=False)]
print(f"Duplicate emails: {len(duplicates)}")
# Sample duplicates
print(duplicates.groupby('email').size().head(10))

Solution: Deduplicate before import, or plan for post-import cleanup using HubSpot’s deduplication tools.

10. Empty Required Fields

The Problem: Records missing data in fields critical to your processes.

Detection:

required_fields = ['email', 'firstname', 'lastname', 'company']
for field in required_fields:
if field in df.columns:
fill_rate = df[field].notna().mean() * 100
empty_count = df[field].isna().sum()
print(f"{field}: {fill_rate:.1f}% filled ({empty_count} empty)")

Solution:

  • For essential fields (email): exclude records without values
  • For optional fields: accept the gap or enrich post-migration

11. Test and Junk Data

The Problem: Production exports include test records, spam entries, and internal accounts.

Common patterns:

  • Emails: @test.com, @example.com, @company.com (internal)
  • Names: Test, Asdf, Delete Me
  • Companies: Test Account, DO NOT USE

Detection:

test_patterns = ['@test.com', '@example.com', 'test@', 'asdf']
for pattern in test_patterns:
matches = df[df['email'].str.contains(pattern, na=False, case=False)]
if len(matches) > 0:
print(f"'{pattern}': {len(matches)} records")

Solution: Filter out test data before import.

test_domains = ['test.com', 'example.com', 'mailinator.com']
df['domain'] = df['email'].str.split('@').str[1]
df_clean = df[~df['domain'].isin(test_domains)]

The Pre-Migration Checklist

Run through this list before every migration:

File Structure

  • CSV files are comma-delimited
  • No BOM characters in headers
  • Consistent encoding (UTF-8)
  • No line breaks within fields

Identifiers

  • Contacts have email addresses (>90%)
  • Companies have domains or names
  • Association columns exist between objects

Data Quality

  • Dates in recognised format
  • Dropdown values match HubSpot options
  • Owner fields mappable to HubSpot users
  • Multi-select delimiters standardised
  • Duplicates identified and planned for
  • Test data filtered out

Custom Properties

  • Custom properties created in HubSpot before import
  • Property types match data (number, date, text)
  • Dropdown options created for all values

Import Order

  • Companies first
  • Contacts second (with company associations)
  • Deals third (with contact/company associations)
  • Custom objects last (with appropriate associations)

Key Takeaways

  1. Inspect your files before import - don’t assume exports are clean
  2. Calculate fill rates - know what data you’re actually getting
  3. Map associations explicitly - don’t rely on HubSpot’s automatic matching
  4. Standardise dates and dropdowns - the most common causes of data corruption
  5. Filter test data - it will pollute your reports forever

The best migrations are boring. They’re boring because the work happened upfront, in the data preparation, not during crisis management after a failed import. Boring is good. Aim for boring.


Planning a HubSpot migration and want a second opinion on your data? Connect with me on LinkedIn.