When you decide to build BI dashboards, one of the first architectural decisions is how those dashboards connect to your data. The two main options are direct query — connect the BI tool straight to your database — and a data warehouse — build a separate layer that holds processed data and connect your BI tool to that.
Direct query is what most people reach for first. It's faster to set up, requires less infrastructure, and seems simpler. Sometimes it's exactly right. Sometimes it creates problems that cost significantly more to fix than a warehouse would have cost to build in the first place.
Here's the clear guide to what each approach actually involves and when each is the right answer.
What direct query actually means
Direct query means your BI tool — Power BI, Tableau, Qlik — connects directly to your operational database and queries it in real time whenever a dashboard loads or refreshes. The data in the dashboard is the data in the database at that moment.
What's good about it:
Setup is fast. Data is always current. You don't need to build and maintain a separate data layer. For simple situations with one data source and modest dashboard complexity, direct query is perfectly adequate.
What causes problems:
Every dashboard view generates database queries. Those queries compete with transactional operations — your application writing orders, updating records, processing payments — on the same database server. As dashboard complexity grows and user numbers increase, analytical queries slow down the operational system, and the operational system's transaction load slows down the dashboards. Both suffer.
What a data warehouse actually means
A data warehouse is a separate database — purpose-built for analytical queries — that holds processed, validated, historical data sourced from your operational systems through an ETL pipeline. Your BI tool connects to the warehouse, not to any operational system.
What's good about it:
Analytical workload is completely separated from transactional workload. The warehouse can be optimised specifically for query performance. Data from multiple sources can be combined. Historical data is preserved even if the source system doesn't keep it. Metric definitions are enforced at the warehouse layer, so every dashboard uses the same agreed numbers.
What adds complexity:
You need to build and maintain the ETL pipeline. The warehouse takes time to design and build properly. There's a refresh lag — the warehouse is as current as the last ETL run, not real-time. Initial investment is higher.
Side by side
| Direct Query | Data Warehouse | |
|---|---|---|
| Setup time | Fast — connect and build dashboards | Longer — ETL design and build required first |
| Data freshness | Real-time — always current | As current as the last ETL run (hourly, daily, etc.) |
| Production impact | Analytical queries compete with transactions | Zero — analytical load completely separated |
| Multi-source data | Difficult — cross-database joins are complex and slow | Natural — all sources unified in one data model |
| Historical data | Only what the source system keeps | Preserved from day one regardless of source |
| Metric consistency | Defined in each dashboard — can drift | Enforced at warehouse layer — consistent everywhere |
| Query performance | Depends on source database load | Optimised for analytical queries independently |
| Data quality | Raw source data — what's in the DB is what you get | Validated at ETL — bad data caught before dashboards |
| Access constraints | Requires direct database connectivity | Can use staging/export approach if DB access unavailable |
| Cost to start | Lower | Higher |
| Cost at scale | Often higher — performance problems require expensive fixes | Lower — architecture designed for scale from the start |
When direct query is the right answer
✓ Direct Query works well when...
✓ Data Warehouse is right when...
The hidden costs of choosing direct query when you need a warehouse
Direct query looks cheaper at the start. It often isn't by the end. Here's what typically happens when a business chooses direct query for a situation that needed a warehouse:
⚠ The direct query debt spiral
The decision you actually need to make
Rather than "direct query vs warehouse," the better question is: what will your BI requirements look like in two years?
The Atlantic LNG case — when neither approach seems possible
The solution was a two-stream ingestion architecture — structured Excel exports from each system to a controlled SharePoint staging location, plus Microsoft PowerApps forms for data that had no existing digital source. Power BI read from the staging layer and Dataverse, not from any production database.
The result was a warehouse-equivalent architecture — multi-source, validated, consistent — without a single direct database connection. The constraints shaped the approach, but the outcome was the same: a reliable single source of truth for executive and departmental dashboards.
The lesson: "No direct database access" is a constraint on the ingestion method, not a reason to avoid a warehouse architecture. The warehouse is the right answer for the requirements; the ingestion approach adapts to the constraints.
What about hybrid approaches?
In practice, many BI environments use both. Some data sources are connected via direct query — operational metrics that genuinely need real-time data, or sources where the data volume is small and the database headroom is large. Other data sources go through a warehouse — the multi-source joins, the historical data, the complex aggregations.
Power BI's composite model supports this explicitly — some tables in DirectQuery mode, others in import mode (effectively a lightweight warehouse). Tableau and Qlik support similar hybrid configurations. This is often the pragmatic middle ground for businesses that have some real-time requirements alongside multi-source analytical needs.
The honest summary
Direct query is the right answer for simple, single-source BI with modest scale and where real-time data matters. It is fast to set up and perfectly adequate in that context.
A data warehouse is the right answer when you need to combine multiple sources, preserve historical data, enforce metric consistency, or protect critical operational systems from analytical load. It costs more to build initially and pays back that investment over time.
The worst outcome is choosing direct query for a situation that needs a warehouse — because the warehouse gets built eventually anyway, at higher cost, with less historical data available, and with the additional work of unpicking decisions made when direct query was the approach.
If you're not sure which applies to your situation — or if your current direct query setup is showing early signs of the problems described above — the free assessment is the right next step. We look at your data, your requirements, and your trajectory, and give you an honest recommendation.