Live Connection vs Extract: Choosing the Right Semantic Layer Integration

Learn when to use live connections versus extracts when connecting BI tools to semantic layers, including performance, governance, and freshness considerations.


When connecting BI tools to a semantic layer, organizations must choose between live connections and extracts (imports). This decision impacts performance, data freshness, governance, and user experience. Understanding the tradeoffs enables informed architecture decisions.

This guide compares live connections and extracts for semantic layer integration, helping you choose the right approach for different use cases with platforms like Codd AI.

Understanding the Options

Live Connections

With live connections, every BI query executes against the semantic layer in real time:

User Action → BI Tool → Query → Semantic Layer → Warehouse → Results

Characteristics:

  • Data reflects source freshness
  • Each query executes on demand
  • Governance applied every time
  • Performance depends on source systems

Extracts (Imports)

With extracts, the BI tool pulls data from the semantic layer on a schedule and stores it locally:

Scheduled Extract: Semantic Layer → BI Tool Storage

User Action → BI Tool → Query → Local Storage → Results

Characteristics:

  • Data freshness limited to extract schedule
  • Queries execute against local copy
  • Governance applied at extract time
  • Performance depends on BI tool storage
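The two flows can be sketched side by side in Python. The semantic layer call is stubbed out, and all names here are illustrative rather than any particular vendor's API:

```python
from datetime import datetime, timezone

def run_semantic_query(sql: str) -> list[dict]:
    # Placeholder: a real implementation would call the semantic layer.
    return [{"metric": "revenue", "value": 1250.0}]

_extract_store: dict[str, dict] = {}  # stands in for BI tool local storage

def query_live(sql: str) -> dict:
    # Live: every request hits the semantic layer, so governance and
    # freshness are evaluated at query time.
    return {"rows": run_semantic_query(sql),
            "as_of": datetime.now(timezone.utc)}

def refresh_extract(name: str, sql: str) -> None:
    # Extract: data is pulled on a schedule and stored locally.
    _extract_store[name] = {"rows": run_semantic_query(sql),
                            "extracted_at": datetime.now(timezone.utc)}

def query_extract(name: str) -> dict:
    # Queries read the local copy; freshness is bounded by the last refresh.
    return _extract_store[name]
```

The asymmetry is the whole tradeoff: the live path pays the semantic layer cost on every query, while the extract path pays it once per refresh.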

Comparison Matrix

Factor               | Live Connection              | Extract
Data freshness       | Real-time (source dependent) | Schedule dependent
Query performance    | Variable                     | Consistently fast
Governance           | Every query                  | Extract time only
Storage              | No duplication               | Copies data
Complex calculations | May be slow                  | Pre-computed
Offline access       | Requires connectivity        | Works offline
Semantic layer load  | Continuous                   | Batch during extract

When to Use Live Connections

Scenario 1: Operational Dashboards

Dashboards monitoring real-time operations benefit from live connections:

Examples:

  • Order fulfillment status
  • Customer support queue
  • Inventory levels
  • Financial transactions

Why live works:

  • Users need current data
  • Quick, focused queries
  • Freshness outweighs performance
  • Decisions based on now, not yesterday

Scenario 2: Strict Governance Requirements

When governance must apply to every data access:

Examples:

  • Regulatory-sensitive data
  • Row-level security requirements
  • Audit trail needs
  • Dynamic permission changes

Why live works:

  • Security checked each query
  • No stale permissions in cached data
  • Full audit capability
  • Real-time access control updates

Scenario 3: Rapidly Changing Metrics

When metric definitions change frequently:

Examples:

  • A/B testing metrics
  • Promotion performance
  • Campaign tracking
  • New product launches

Why live works:

  • Definition changes apply immediately
  • No need to refresh extracts
  • Consistent across all queries
  • Reduced risk of stale definitions

Scenario 4: Ad-Hoc Exploration

When users explore data interactively:

Examples:

  • Data analyst investigation
  • Executive questions
  • Incident analysis
  • Opportunity discovery

Why live works:

  • No waiting for extract schedules
  • Questions answered immediately
  • Access to latest data
  • Flexible query patterns

When to Use Extracts

Scenario 1: Complex Analytical Dashboards

Dashboards with heavy calculations benefit from extracts:

Examples:

  • Year-over-year comparisons
  • Cohort analysis
  • Trend calculations
  • Statistical aggregations

Why extract works:

  • Complex queries pre-computed
  • Fast user experience
  • Reduced semantic layer load
  • Consistent response times

Scenario 2: Large Data Volumes

When querying large datasets:

Examples:

  • Multi-year historical analysis
  • Transaction-level detail
  • High-cardinality dimensions
  • Full customer datasets

Why extract works:

  • Data compressed and optimized locally
  • Faster aggregation performance
  • Reduced network transfer
  • Better handling of large result sets

Scenario 3: Offline or Disconnected Use

When connectivity is unreliable:

Examples:

  • Mobile dashboards
  • Field sales presentations
  • Conference demonstrations
  • Disconnected environments

Why extract works:

  • Works without network
  • Predictable availability
  • No dependency on source systems
  • Consistent experience anywhere

Scenario 4: Scheduled Reports

When reports run on fixed schedules:

Examples:

  • Daily email reports
  • Weekly management decks
  • Monthly board packages
  • Quarterly business reviews

Why extract works:

  • Data consistent with schedule
  • Predictable performance
  • Reduced load during generation
  • Reports reflect point-in-time snapshot

Hybrid Approaches

Pattern 1: Live for Metrics, Extract for Details

Use live connections for key metrics and extracts for supporting detail:

Executive Dashboard:
├── KPI Tiles → Live Connection (always current)
├── Trend Charts → Extract (historical, refreshed daily)
└── Detail Tables → Extract (large volume, refreshed nightly)

Pattern 2: Live Primary, Extract Fallback

Default to live with automatic fallback:

Query Flow:
1. Attempt live query
2. If timeout/error → fall back to extract
3. Display data with freshness indicator
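The fallback flow above could be sketched as follows, assuming hypothetical `run_live` and `read_extract` callables supplied by the BI integration:

```python
from datetime import datetime, timezone

class QueryTimeout(Exception):
    """Raised when the live query exceeds its time budget."""

def query_with_fallback(run_live, read_extract) -> dict:
    """Try the live path first; on timeout or error, serve the extract
    and label the result with its freshness (illustrative sketch)."""
    try:
        rows = run_live()
        return {"rows": rows, "source": "live",
                "as_of": datetime.now(timezone.utc)}
    except (QueryTimeout, ConnectionError):
        cached = read_extract()
        # Surface the extract timestamp so users see the data's actual age.
        return {"rows": cached["rows"], "source": "extract",
                "as_of": cached["extracted_at"]}
```

Returning the `source` field alongside the rows is what makes the freshness indicator in step 3 possible.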

Pattern 3: Real-Time Summary, Batch Detail

Aggregate metrics live, detail on schedule:

Operational Dashboard:
├── Current Order Count → Live
├── Revenue Today → Live
├── Order List → Extract (refreshed hourly)
└── Customer Details → Extract (refreshed daily)

Pattern 4: Time-Based Strategy

Different approaches based on time window:

Time-Based Logic:
├── Last 24 hours → Live connection
├── Last 7-30 days → Daily extract
└── Historical (30+ days) → Weekly extract
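The time-based routing above can be expressed as a small routing function; the thresholds are the illustrative ones from the diagram, at day granularity:

```python
from datetime import date, timedelta

def choose_access_path(window_start: date, today: date) -> str:
    """Route a query to live or extract based on how far back it reaches."""
    age = today - window_start
    if age <= timedelta(days=1):      # roughly the "last 24 hours" window
        return "live"
    if age <= timedelta(days=30):     # recent history: daily extract
        return "daily_extract"
    return "weekly_extract"           # deep history: weekly extract
```

A dashboard layer would call this per visualization, so a single page can mix all three access paths transparently.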

Implementation Considerations

Semantic Layer Configuration

Configure the semantic layer to support both patterns:

For live connections:

  • Optimize query performance
  • Implement caching for repeated queries
  • Set appropriate timeout values
  • Monitor query load
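Caching for repeated live queries might look like this TTL-based sketch; a production semantic layer would typically provide this server-side, so treat it as illustrative:

```python
import time

class QueryCache:
    """Small TTL cache keyed by query text (illustrative only)."""
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._entries: dict[str, tuple[float, object]] = {}

    def get_or_run(self, sql: str, run_query):
        now = time.monotonic()
        hit = self._entries.get(sql)
        if hit and now - hit[0] < self.ttl:
            return hit[1]            # fresh cached result: skip the warehouse
        result = run_query(sql)      # cache miss: hit the semantic layer
        self._entries[sql] = (now, result)
        return result
```

Even a short TTL dramatically cuts load when many users open the same dashboard, at the cost of bounded staleness.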

For extracts:

  • Define extract-optimized views
  • Pre-aggregate common patterns
  • Support incremental extract
  • Provide extract monitoring APIs
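Incremental extracts are commonly implemented with a watermark. This sketch assumes a hypothetical `fetch_since` callable that returns rows changed after a given timestamp:

```python
def incremental_extract(fetch_since, state: dict) -> list[dict]:
    """Pull only rows changed since the last watermark, then advance it.
    `fetch_since` stands in for a semantic layer extract endpoint."""
    watermark = state.get("watermark")   # None on first run = full extract
    rows = fetch_since(watermark)
    if rows:
        # Advance the watermark to the newest change we just pulled.
        state["watermark"] = max(r["updated_at"] for r in rows)
    return rows
```

Persisting `state` between runs is what turns a full nightly pull into a cheap delta pull.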

BI Tool Configuration

Configure BI tools appropriately:

Tableau:

  • Live connections via published data sources
  • Extracts with scheduled refresh
  • Hybrid via multiple data sources

Power BI:

  • DirectQuery for live
  • Import mode for extracts
  • Composite models for hybrid

Looker:

  • PDTs for extract-like behavior
  • Direct queries for live
  • Caching configuration

Performance Optimization

Optimize each pattern differently:

Live connection optimization:

  • Semantic layer query caching
  • Connection pooling
  • Query timeouts
  • Index optimization

Extract optimization:

  • Incremental refresh
  • Partitioned extracts
  • Compressed storage
  • Parallel processing

Governance Implications

Live Connection Governance

Every query passes through governance:

Advantages:

  • Real-time access control
  • Complete audit trail
  • Immediate permission changes
  • Consistent policy enforcement

Considerations:

  • Higher semantic layer load
  • Query-time latency for security checks
  • Dependency on semantic layer availability

Extract Governance

Governance applied at extract time:

Advantages:

  • Governance overhead occurs in batch
  • No runtime security delays
  • Predictable performance

Considerations:

  • Permission changes not immediate
  • Extract may contain stale data
  • Need to manage extract-level security
  • Potential for ungoverned copies

Hybrid Governance

Combine approaches based on sensitivity:

Data Classification:
├── Highly Sensitive → Live only, full audit
├── Business Critical → Live preferred, extract acceptable
├── General Analytics → Extract default, live available
└── Public Data → Extract, minimal governance
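The classification tree above can be encoded as a simple policy map; the classification names and allowed modes here are illustrative:

```python
ACCESS_POLICY: dict[str, set[str]] = {
    # Illustrative mapping from data classification to allowed access modes.
    "highly_sensitive":  {"live"},              # live only, full audit
    "business_critical": {"live", "extract"},   # live preferred
    "general_analytics": {"extract", "live"},   # extract default
    "public":            {"extract"},           # minimal governance
}

def allowed_modes(classification: str) -> set[str]:
    # Unknown classifications fall back to the strictest policy.
    return ACCESS_POLICY.get(classification, {"live"})

def can_extract(classification: str) -> bool:
    return "extract" in allowed_modes(classification)
```

Centralizing the policy in one map keeps BI tool configuration from silently drifting away from the governance intent.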

Codd AI Platform

Codd AI supports both connection patterns:

  • Optimized live query execution
  • Extract API for scheduled pulls
  • Intelligent caching between patterns
  • Consistent governance across modes

Organizations can implement hybrid architectures with unified governance regardless of connection pattern.

Decision Framework

Use this framework to choose:

Step 1: Assess Data Requirements

Question                  | Live if...          | Extract if...
How fresh must data be?   | Minutes/hours       | Days acceptable
How complex are queries?  | Simple aggregations | Multi-table joins
How large is the data?    | Smaller datasets    | Large historical
How often do users query? | Continuous          | Periodic

Step 2: Assess Technical Constraints

Question                 | Live if...         | Extract if...
Network reliability?     | Stable connectivity | Unreliable/offline
Semantic layer capacity? | Can handle load     | Limited capacity
BI tool capabilities?    | Supports live well  | Better extract support
Response time needs?     | Seconds acceptable  | Sub-second required

Step 3: Assess Governance Requirements

Question                 | Live if...             | Extract if...
Security sensitivity?    | High, audit required   | Lower sensitivity
Permission volatility?   | Frequent changes       | Stable permissions
Compliance requirements? | Real-time enforcement  | Batch acceptable
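One way to operationalize the three assessment steps is a simple tally, where each answer votes for a mode. The tie-breaking rule below is an assumption for illustration, not a formal methodology:

```python
def recommend_mode(answers: dict[str, str]) -> str:
    """Tally assessment answers; each value is 'live' or 'extract'."""
    live_votes = sum(1 for v in answers.values() if v == "live")
    extract_votes = sum(1 for v in answers.values() if v == "extract")
    if live_votes > extract_votes:
        return "live"
    if extract_votes > live_votes:
        return "extract"
    return "hybrid"  # a tie often signals a mixed-requirement dashboard
```

In practice a single hard requirement (say, offline access or mandatory audit) should override the tally, so treat this as a starting point for discussion rather than an automated decision.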

Best Practices Summary

  1. Default to live for operational dashboards and governed data
  2. Use extracts for historical analysis and complex calculations
  3. Implement hybrid for dashboards with mixed requirements
  4. Monitor performance and adjust patterns based on actual usage
  5. Document decisions so users understand data freshness
  6. Display freshness indicators on dashboards using extracts
  7. Review periodically as data volumes and requirements evolve
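Practice 6, displaying freshness indicators, can be as simple as a label derived from the extract timestamp. This helper is an illustrative sketch:

```python
from datetime import datetime, timedelta, timezone

def freshness_label(extracted_at: datetime, now: datetime) -> str:
    """Human-readable freshness badge for extract-backed dashboards."""
    age = now - extracted_at
    if age < timedelta(hours=1):
        return "Updated minutes ago"
    if age < timedelta(days=1):
        return f"Updated {age.seconds // 3600}h ago"
    return f"Updated {age.days}d ago"
```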

The choice between live connections and extracts is not binary: successful semantic layer integrations often use both patterns strategically, matching data access methods to specific use case requirements.

Questions

Does a live connection guarantee real-time data?

Live connection means queries execute against the source in real time, but the underlying data freshness depends on the source system. If the semantic layer queries a data warehouse updated hourly, live connections reflect that hourly freshness.
