Data Vault modeling is a natural fit for Copilot: it follows strict conventions, involves repeating patterns across dozens of tables, and benefits hugely from context a pre-prompt can carry. This article shows how to generate hubs, links, and satellites from staging tables, how to iterate across the raw and business vault layers, and how to keep Copilot aligned with your team’s Data Vault standards.
Who this is for: Data Vault practitioners: architects, engineers, and modelers building or modernizing Data Vault structures in SqlDBM. Familiarity with hub, link, and satellite concepts is assumed.
Why Copilot works well for Data Vault
Data Vault structures follow strict, repeatable patterns. Every hub looks like every other hub. Every satellite carries load timestamps, record source, and hash-diff columns. That repetition is where Copilot saves the most time. Once it knows your team’s Data Vault conventions (through a pre-prompt), it can generate hubs, links, and satellites from staging tables without you hand-coding each one.
Copilot is less strong at the decisions Data Vault requires upfront: what constitutes a business key, how to group descriptive attributes into satellites, which relationships should be modeled as links versus hubs-with-children. Those decisions still belong to the architect. Copilot accelerates the mechanical layer underneath.
Setting up a Data Vault pre-prompt
Before generating anything, configure a pre-prompt that captures your team’s Data Vault conventions. Open Copilot settings for the project and add something like:
Example Data Vault pre-prompt: Our team follows Data Vault 2.0 conventions. Hubs are prefixed H_, links L_, satellites S_. Every hub has: <business_key>_HK (hash key), <business_key> (business key), LOAD_DATE (timestamp), RECORD_SOURCE (varchar). Every link has: <link_name>_HK, <parent1>_HK, <parent2>_HK, LOAD_DATE, RECORD_SOURCE. Every satellite has: <parent>_HK, LOAD_DATE, LOAD_END_DATE, RECORD_SOURCE, HASH_DIFF, and descriptive attributes. Primary keys on all hash key columns. Before creating a new hub, check whether a business key with the same name already exists.
With this pre-prompt active, every hub, link, or satellite Copilot proposes will follow these conventions automatically. No need to restate them in each prompt.
Generating hubs, links, and satellites from staging
The typical workflow is staging-first:
- Load your staging tables into the project via reverse engineering (direct connection, DDL, or Excel).
- Open the Copilot panel.
- Ask Copilot to identify the business keys. Example prompt: “Review the staging tables and identify the natural business keys for a customer, product, order, and supplier hub.” Review the proposal carefully. Business key selection is the decision Copilot is most likely to get subtly wrong.
- Generate hubs. Example prompt: “Create hubs for customer, product, order, and supplier following our pre-prompt conventions.” Copilot proposes the four hub tables with hash keys, business keys, load dates, and record source columns.
- Generate links between hubs. Example prompt: “Create a link table for customer-orders and another for order-products.” Copilot proposes the link structures with foreign hash keys to each parent hub.
- Generate satellites. Example prompt: “Create satellites for each hub that capture the descriptive attributes from staging, grouped by rate of change.” For satellites, review the grouping. Copilot will make a reasonable first pass but may split or combine differently than your team prefers.
Working across raw and business vault layers
If your team maintains separate raw vault and business vault layers, Copilot handles the split naturally because it keeps project context across prompts. A practical pattern:
- First pass: raw vault. Generate hubs, links, and satellites directly from staging. Accept the proposals. These land in your raw vault schema.
- Second pass: business vault. In the same project, prompt: “Create business vault satellites that compute customer lifetime value and order status rollup from the raw vault.” Copilot knows the raw vault structure from the first pass and proposes business-layer satellites that reference it.
- Third pass: point-in-time and bridge tables. For consumption-layer structures, prompt explicitly: “Create a point-in-time table for customer that joins the customer hub with its satellites at load_date granularity.”
What Copilot will get wrong (and what to do about it)
- Business key identification. When staging has multiple candidate keys, Copilot may pick the wrong one. Always review proposed hubs carefully. The business key is the decision everything else hangs off. If Copilot picks wrong, name the correct key explicitly in a follow-up prompt.
- Satellite splitting. Copilot will sometimes put all descriptive attributes in one satellite when your team’s convention is to split by rate of change or source system. State the splitting rule in your pre-prompt or in the prompt itself (e.g., “create one satellite per source system”).
- Multi-active satellites. Copilot defaults to single-row-per-parent satellites. If you need multi-active (e.g., for arrays of phone numbers), call it out explicitly: “Create a multi-active satellite for customer phone numbers.”
- Same-as-links and hierarchical links. Specialized link types need explicit prompts. Copilot will not infer them from staging alone.
Related articles
- Getting started with Copilot
- Reverse engineering a model with Copilot
- Configuring Copilot settings and pre-prompts
- Bulk governance actions