When you decide to build BI dashboards, one of the first architectural decisions is how those dashboards connect to your data. The two main options are direct query (connect the BI tool straight to your database) and a data warehouse (build a separate layer that holds processed data and connect your BI tool to that).

Direct query is what most people reach for first. It's faster to set up, requires less infrastructure, and seems simpler. Sometimes it's exactly right. Sometimes it creates problems that cost significantly more to fix than a warehouse would have cost to build in the first place.

Here's a clear guide to what each approach actually involves and when each one is the right answer.


What direct query actually means

Direct query means your BI tool — Power BI, Tableau, Qlik — connects directly to your operational database and queries it in real time whenever a dashboard loads or refreshes. The data in the dashboard is the data in the database at that moment.

What's good about it:

Setup is fast. Data is always current. You don't need to build and maintain a separate data layer. For simple situations with one data source and modest dashboard complexity, direct query is perfectly adequate.

What causes problems:

Every dashboard view generates database queries. Those queries compete with transactional operations — your application writing orders, updating records, processing payments — on the same database server. As dashboard complexity grows and user numbers increase, analytical queries slow down the operational system, and the operational system's transaction load slows down the dashboards. Both suffer.
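
The contention problem comes down to query shape: transactional work touches one indexed row, while a dashboard aggregation scans the whole table. A minimal sketch (using SQLite and a hypothetical `orders` table purely for illustration) shows the two kinds of load that end up sharing one production database under direct query:

```python
# Illustrative sketch only: contrast a transactional lookup with the kind
# of full-table aggregation a dashboard generates. The table and data are
# hypothetical; the point is the difference in rows touched per query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("north" if i % 2 else "south", i * 1.5) for i in range(100_000)],
)

# Transactional work: a single indexed lookup -- touches one row.
row = conn.execute("SELECT amount FROM orders WHERE id = ?", (42,)).fetchone()

# Dashboard work: an aggregation over the whole table -- this is the load
# that competes with transactions when the BI tool queries production directly.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region"
).fetchall()
```

Multiply the aggregation by every open dashboard and every auto-refresh, and the competition described above is the result.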

What a data warehouse actually means

A data warehouse is a separate database — purpose-built for analytical queries — that holds processed, validated, historical data sourced from your operational systems through an ETL pipeline. Your BI tool connects to the warehouse, not to any operational system.

What's good about it:

Analytical workload is completely separated from transactional workload. The warehouse can be optimised specifically for query performance. Data from multiple sources can be combined. Historical data is preserved even if the source system doesn't keep it. Metric definitions are enforced at the warehouse layer, so every dashboard uses the same agreed numbers.
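
"Enforced at the warehouse layer" concretely means the metric is defined once, in the warehouse, and dashboards read that definition rather than re-implementing their own filters. A sketch of the idea, using SQLite and a hypothetical `fact_sales` table (the view-based approach shown here is one common way to do it, not the only one):

```python
# Sketch: a canonical metric defined once at the warehouse layer (as a view),
# so every dashboard queries the same agreed definition instead of each
# dashboard applying its own filters. Table and rows are hypothetical.
import sqlite3

wh = sqlite3.connect(":memory:")
wh.execute("CREATE TABLE fact_sales (amount REAL, status TEXT)")
wh.executemany("INSERT INTO fact_sales VALUES (?, ?)",
               [(100.0, "complete"), (50.0, "refunded"), (200.0, "complete")])

# The agreed definition: revenue excludes refunded sales.
wh.execute("""CREATE VIEW recognised_revenue AS
              SELECT SUM(amount) AS revenue
              FROM fact_sales
              WHERE status = 'complete'""")

# Every dashboard reads the view, not the raw table.
revenue = wh.execute("SELECT revenue FROM recognised_revenue").fetchone()
```

If a dashboard instead queried `fact_sales` with its own filter, a second team could silently include refunds and report a different number — which is exactly the drift the warehouse layer prevents.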

What adds complexity:

You need to build and maintain the ETL pipeline. The warehouse takes time to design and build properly. There's a refresh lag — the warehouse is as current as the last ETL run, not real-time. Initial investment is higher.
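
To make "build and maintain the ETL pipeline" concrete, here is the smallest possible extract–transform–load shape, with SQLite standing in for both the operational source and the warehouse. Table names and the validation rule are hypothetical; real pipelines add incremental loads, scheduling, logging, and error handling on top of this skeleton:

```python
# Minimal ETL sketch (assumed schema, illustration only):
# extract from the operational source, validate, load into the warehouse.
import sqlite3

source = sqlite3.connect(":memory:")      # stands in for the operational DB
warehouse = sqlite3.connect(":memory:")   # stands in for the warehouse

source.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)",
                   [(1, 120.0), (2, -5.0), (3, 80.0)])  # -5.0 is bad data

warehouse.execute("CREATE TABLE fact_orders (id INTEGER, amount REAL)")

# Extract
rows = source.execute("SELECT id, amount FROM orders").fetchall()

# Transform / validate: bad data is caught here, before any dashboard sees it
clean = [(order_id, amount) for order_id, amount in rows if amount >= 0]

# Load
warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?)", clean)
warehouse.commit()
loaded = warehouse.execute("SELECT COUNT(*) FROM fact_orders").fetchone()
```

The refresh lag mentioned above is simply how often this pipeline runs: the warehouse is current as of the last successful execution of code like this.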


Side by side

                    | Direct Query                                                 | Data Warehouse
Setup time          | Fast — connect and build dashboards                          | Longer — ETL design and build required first
Data freshness      | Real-time — always current                                   | As current as the last ETL run (hourly, daily, etc.)
Production impact   | Analytical queries compete with transactions                 | Zero — analytical load completely separated
Multi-source data   | Difficult — cross-database joins are complex and slow        | Natural — all sources unified in one data model
Historical data     | Only what the source system keeps                            | Preserved from day one regardless of source
Metric consistency  | Defined in each dashboard — can drift                        | Enforced at warehouse layer — consistent everywhere
Query performance   | Depends on source database load                              | Optimised for analytical queries independently
Data quality        | Raw source data — what's in the DB is what you get           | Validated at ETL — bad data caught before dashboards
Access constraints  | Requires direct database connectivity                        | Can use staging/export approach if DB access unavailable
Cost to start       | Lower                                                        | Higher
Cost at scale       | Often higher — performance problems require expensive fixes  | Lower — architecture designed for scale from the start

When direct query is the right answer

✓ Direct Query works well when...

You have one primary data source and your dashboards don't need to join data from multiple systems
Your database is well-indexed and the dashboard queries are straightforward — not complex aggregations across large tables
Your user base is small — a handful of people viewing dashboards, not dozens generating concurrent queries
You genuinely need real-time data, because even a one-hour refresh lag is unacceptable for your use case
You're starting out and want to validate what dashboards are useful before investing in warehouse infrastructure
Your database is cloud-managed (Azure SQL, RDS) with enough headroom that analytical queries won't impact operational performance

✓ Data Warehouse is right when...

You have multiple data sources that need to be combined — CRM, finance, ERP, operations in one dashboard
Your dashboards are complex — aggregations over millions of rows, period-over-period comparisons, complex calculated metrics
Your source databases are business-critical and can't absorb analytical query load without performance risk
You need historical trend analysis that the source systems don't preserve in their current structure
Different departments are showing different numbers — you need canonical metric definitions enforced at a layer below the dashboards
Direct database access isn't available — security policy, legacy systems, or third-party platforms prevent direct connection

The hidden costs of choosing direct query when you need a warehouse

Direct query looks cheaper at the start. It often isn't by the end. Here's what typically happens when a business chooses direct query for a situation that needed a warehouse:

⚠ The direct query debt spiral

Dashboard load times grow as more users and more dashboards generate more concurrent queries against the same production database
Production application performance degrades — orders slow down, saves take longer, because the database is under analytical load during peak hours
Business demands cross-system analysis — sales joined to finance joined to support — which direct query can't do cleanly, so someone builds a manual Excel report instead
Finance wants trend analysis going back two years — but the source system only keeps 90 days of history, so historical data is gone
Two departments show different revenue numbers because their dashboards query the same data with different filters — no one agreed on the canonical definition
Eventually the business builds a warehouse anyway — but now it costs more because there's technical debt to unpick, existing dashboards to rebuild, and historical data that no longer exists to reconstruct

Direct query defers the warehouse cost — it doesn't eliminate it. For businesses that will eventually need multi-source analysis, historical data, or consistent metrics, building the warehouse later costs more than building it first.

The decision you actually need to make

Rather than "direct query vs warehouse," the better question is: what will your BI requirements look like in two years?

// Decision guide — answer honestly

1. Do you need data from more than one system in the same dashboard?
   YES → Build the warehouse. Direct query can't join across different source systems cleanly.
   NO  → Continue to question 2.

2. Do you need to see how metrics have trended over 6–24 months?
   YES → Build the warehouse. Source systems rarely preserve the historical snapshots you need for trend analysis.
   NO  → Continue to question 3.

3. Is your source database business-critical — would slow queries cause operational problems?
   YES → Build the warehouse. Analytical load on a critical operational system is a reliability risk you don't want.
   NO  → Continue to question 4.

4. Do different teams currently show different numbers for the same metrics?
   YES → Build the warehouse. Metric consistency requires a canonical definition enforced below the dashboard layer.
   NO  → Direct query is probably fine for now. Revisit this as requirements grow.
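
The decision guide above is simple enough to express as a function. This is a sketch of the flow, not a formal rubric — the question names are paraphrases of the four questions above:

```python
# The four-question decision guide as a function. Answers are booleans;
# the first YES short-circuits to a warehouse recommendation.
def recommend(multi_source: bool, needs_history: bool,
              critical_source_db: bool, inconsistent_metrics: bool) -> str:
    if multi_source:
        return "warehouse: direct query can't join across source systems cleanly"
    if needs_history:
        return "warehouse: source systems rarely keep the snapshots you need"
    if critical_source_db:
        return "warehouse: analytical load on a critical system is a reliability risk"
    if inconsistent_metrics:
        return "warehouse: canonical metrics must live below the dashboard layer"
    return "direct query is probably fine for now; revisit as requirements grow"
```

Note the asymmetry: any single YES points at the warehouse, and direct query is only the answer when all four are NO — which matches the "probably fine for now" framing above.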

The Atlantic LNG case — when neither approach seems possible

✦ From a real project
Atlantic LNG had a situation that seemed to rule out both approaches: they needed a warehouse (multi-source, complex requirements, critical operational systems) but direct database access to those systems wasn't available due to IT security policy.

The solution was a two-stream ingestion architecture — structured Excel exports from each system to a controlled SharePoint staging location, plus Microsoft PowerApps forms for data that had no existing digital source. Power BI read from the staging layer and Dataverse, not from any production database.

The result was a warehouse-equivalent architecture — multi-source, validated, consistent — without a single direct database connection. The constraints shaped the approach, but the outcome was the same: a reliable single source of truth for executive and departmental dashboards.

The lesson: "No direct database access" is a constraint on the ingestion method, not a reason to avoid a warehouse architecture. The warehouse is the right answer for the requirements; the ingestion approach adapts to the constraints.
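
The staging-layer pattern from the case above can be sketched with nothing but the standard library. This is an illustration of the shape, not the Atlantic LNG implementation: CSV files and a local folder stand in for the Excel-to-SharePoint exports, and SQLite stands in for the analytical store, so the sketch runs anywhere.

```python
# Sketch of the staging-layer pattern: exports from each system land in a
# staging folder, get validated, and load into the analytical store --
# with no direct connection to any production database.
# (Hypothetical files and schema; CSV/local-folder stand in for
# Excel/SharePoint purely so this example is self-contained.)
import csv
import sqlite3
import tempfile
from pathlib import Path

staging = Path(tempfile.mkdtemp())
(staging / "finance.csv").write_text("id,amount\n1,100.0\n2,250.0\n")
(staging / "crm.csv").write_text("id,amount\n3,75.0\nbad,not_a_number\n")

warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE staged (source TEXT, id TEXT, amount REAL)")

for export in sorted(staging.glob("*.csv")):
    with export.open() as f:
        for row in csv.DictReader(f):
            try:
                amount = float(row["amount"])   # validation: reject non-numeric
            except ValueError:
                continue                        # bad row caught at ingestion
            warehouse.execute("INSERT INTO staged VALUES (?, ?, ?)",
                              (export.stem, row["id"], amount))

staged_count = warehouse.execute("SELECT COUNT(*) FROM staged").fetchone()
```

The ingestion method is the interchangeable part: swap the file reads for database extracts or API calls and the rest of the architecture — validation, the unified store, the dashboards on top — is unchanged.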

What about hybrid approaches?

In practice, many BI environments use both. Some data sources are connected via direct query — operational metrics that genuinely need real-time data, or sources where the data volume is small and the database headroom is large. Other data sources go through a warehouse — the multi-source joins, the historical data, the complex aggregations.

Power BI's composite model supports this explicitly — some tables in DirectQuery mode, others in import mode (effectively a lightweight warehouse). Tableau and Qlik support similar hybrid configurations. This is often the pragmatic middle ground for businesses that have some real-time requirements alongside multi-source analytical needs.


The honest summary

Direct query is the right answer for simple, single-source BI with modest scale and where real-time data matters. It is fast to set up and perfectly adequate in that context.

A data warehouse is the right answer when you need to combine multiple sources, preserve historical data, enforce metric consistency, or protect critical operational systems from analytical load. It costs more to build initially and pays back that investment over time.

The worst outcome is choosing direct query for a situation that needs a warehouse — because the warehouse gets built eventually anyway, at higher cost, with less historical data available, and with the additional work of unpicking decisions made when direct query was the approach.

If you're not sure which applies to your situation — or if your current direct query setup is showing early signs of the problems described above — the free assessment is the right next step. We look at your data, your requirements, and your trajectory, and give you an honest recommendation.

Not sure which approach fits your situation?
Free data architecture assessment — we look at your systems, requirements, and growth trajectory and give you an honest recommendation before any commitment.