
Designing Power BI Semantic Models That Scale

How to design Power BI semantic models that scale across hundreds of reports and thousands of users — star schema, calculation groups, RLS, incremental refresh and Direct Lake.

Most Power BI bottlenecks aren't capacity problems — they're modeling problems. A clean, well-shaped semantic model on a P1 capacity will outperform a sloppy one on a P3, every time.

Here's the architecture playbook we use with enterprise clients running Power BI / Fabric at scale.

Star schema, almost always

If you remember nothing else: fact tables in the middle, dimension tables surrounding, joined by surrogate keys. Snowflake when you have to. Avoid one-big-table.

Star schema gives you:

  • Predictable DAX behavior.
  • Better compression in VertiPaq / Direct Lake.
  • Easier security, easier maintenance, faster queries.

The temptation to denormalize "for convenience" costs you performance in every report, every day.

One semantic model per business domain

Don't build a single mega-model with everything. Don't build a different model per report either. The right grain:

  • One semantic model per business domain — Sales, Finance, Operations, Service.
  • Composite models for cross-domain reports — chained from the domain models, no duplication.
  • Thin reports — pure visuals, connected to a published semantic model.

This pattern lets domain owners govern their model while consumers build whatever reports they need on top.

Calculation groups — use them

Calculation groups let you write time intelligence and reusable patterns once, applied to any measure. Instead of:

Sales YTD := TOTALYTD([Sales], 'Date'[Date])
Sales QTD := TOTALQTD([Sales], 'Date'[Date])
Margin YTD := TOTALYTD([Margin], 'Date'[Date])
... and so on for 50 measures and 8 time variants.

You write the time pattern once and apply it to any base measure. Cuts the measure count dramatically, simplifies maintenance, and report builders love it.
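A minimal sketch of what the calculation items look like, assuming a calculation group named "Time Intelligence" and a 'Date' table marked as a date table (the names are illustrative):

```dax
-- Calculation item: YTD
CALCULATE ( SELECTEDMEASURE (), DATESYTD ( 'Date'[Date] ) )

-- Calculation item: QTD
CALCULATE ( SELECTEDMEASURE (), DATESQTD ( 'Date'[Date] ) )

-- Calculation item: PY
CALCULATE ( SELECTEDMEASURE (), SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
```

Drop any base measure into a visual with the calculation group on rows or columns and every item applies automatically: 50 measures × 8 time variants collapses from 400 measures to 50 measures plus 8 calculation items.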

Tabular Editor 3 (paid) makes calculation groups a 10-minute job; the free Tabular Editor 2 also handles them well, and recent versions of Power BI Desktop can author them directly in the model explorer.

DAX patterns that age well

  • Variable-first DAX — declare variables, return at end. Easier to read, often faster.
  • Avoid bidirectional filters unless absolutely necessary. They inflate storage engine cost and can create ambiguous filter propagation paths.
  • Avoid CALCULATE inside iterators unless you really mean it.
  • Use SELECTEDVALUE instead of HASONEVALUE + VALUES — clearer intent.
  • Format strings on measures — once, in the model, not in each visual.
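The first and fourth patterns above can be sketched in a few lines of DAX (table, column and measure names here are illustrative, not from the source model):

```dax
-- Variable-first: declare variables, return at the end.
-- Each expression is evaluated once, and the logic reads top-down.
Margin % :=
VAR SalesAmt = SUM ( FactSales[SalesAmount] )
VAR CostAmt  = SUM ( FactSales[CostAmount] )
RETURN
    DIVIDE ( SalesAmt - CostAmt, SalesAmt )

-- SELECTEDVALUE instead of IF ( HASONEVALUE (...), VALUES (...) ):
-- same behavior, clearer intent, with an explicit fallback.
Selected Region :=
SELECTEDVALUE ( DimRegion[Region], "All regions" )
```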

For complex DAX, profile in DAX Studio. The query plan tells you whether the storage engine or formula engine is the bottleneck.

Row-Level Security — design it once, deploy once

RLS gone wrong is your security audit nightmare. Best practices:

  • Define security at the dimension level (a User-to-Region table), never on the fact.
  • Use dynamic RLS with USERPRINCIPALNAME() joined to a dimension — much easier to maintain than static role-based.
  • Test every role with "View as" — in Desktop and again after publishing.
  • Document every role's filter logic.
  • For object-level security (hide entire tables/columns from certain users), Tabular Editor is currently the path.
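A sketch of a dynamic RLS filter on the region dimension, assuming a hypothetical UserRegion security table (UserEmail, RegionKey) in the model — the table and column names are placeholders for your own mapping table:

```dax
-- Role filter expression on DimRegion:
-- keep only the regions mapped to the signed-in user.
DimRegion[RegionKey]
    IN SELECTCOLUMNS (
        FILTER (
            UserRegion,
            UserRegion[UserEmail] = USERPRINCIPALNAME ()
        ),
        "RegionKey", UserRegion[RegionKey]
    )
```

Because the filter sits on the dimension, it propagates to every fact table through the existing one-to-many relationships — no per-fact security logic to maintain.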

Incremental refresh — turn it on

If your fact tables are >10M rows or refresh windows exceed 30 minutes:

  • Partition fact by date.
  • Refresh only recent partitions (typically last 1–3 months).
  • Use detect data changes to skip unchanged partitions.
  • Combine with XMLA endpoint management to fix specific partitions when needed.

Cuts refresh time 5–20x for typical models.

Direct Lake (Microsoft Fabric) — when to use it

Direct Lake is the new storage mode in Fabric: Power BI reads Delta tables (Parquet files) in OneLake directly, with no import and no DirectQuery translation layer.

Use when:

  • Your data is in Fabric Lakehouse / Warehouse already.
  • Sub-second query latency matters.
  • You want sub-minute data freshness without complex incremental refresh.

Trade-offs:

  • Some advanced features (calculation groups in some scenarios, certain DAX patterns) have edge cases.
  • Falls back to DirectQuery in some conditions — monitor for this.

For most new enterprise models, Direct Lake is the right default in Fabric.

Capacity planning

  • F2 / F4 (Fabric SKU) — small org, departmental use.
  • F16 / F32 — mid-market, multiple domains, decent concurrency.
  • F64 / P1+ — enterprise: hundreds of reports, thousands of users, RLS in play.

Buy on active query workload, not on user count. Use Fabric Capacity Metrics App to track CU consumption and right-size.

Governance: certified vs promoted

Don't allow a free-for-all of published reports. Build a tiering:

  • Personal — sandbox.
  • Promoted — endorsed by a workspace owner.
  • Certified — IT/CoE-reviewed, sourced from certified semantic models.

Surface badges in the Power BI hub. Train users to look for them.

FAQs

Should I use DirectQuery or Import? Import unless you have a real-time freshness need or data is too large to fit. DirectQuery imposes performance constraints on every visual; only use it deliberately.

Can semantic models live in a Fabric workspace? Yes. Use deployment pipelines to promote them across workspaces and environments (dev, test, prod).

What's the difference between a dataset and a semantic model? Same thing — Microsoft renamed datasets to semantic models in 2023.

Are aggregations still useful? Yes for very large fact tables in DirectQuery / composite models. With Direct Lake, aggregations matter less in many cases.


Need a Power BI / Fabric architecture review? We do focused 2-week engagements producing a target architecture, performance baseline and migration roadmap. Get in touch.
