Troubleshooting Guide

Quick Diagnosis Flowchart

Pipeline Failed?
├── Check Runs Page → Red Status?
│   ├── Yes → Check Error Logs
│   └── No → Check Asset Status
├── Data Missing?
│   ├── Check Asset Freshness
│   └── Verify Dependencies
└── Performance Issue?
    ├── Compare Run Duration
    └── Check Resource Usage

Common Issues After Migration

1. "No module named 'meltano'" Error

Symptom:

ModuleNotFoundError: No module named 'meltano'

Solution:

  • The Meltano installation is managed by the platform
  • This error should not occur in normal operation
  • Contact support immediately if you see it

2. Missing Historical Data

Symptom:

  • Dashboards show gaps
  • Reports missing historical trends

Explanation:

  • Historical run information from Arch was not migrated
  • Data in the warehouse remains intact
  • Dagster run history starts fresh after the migration

Solution:

  • Historical data is still in your warehouse
  • Create manual references if needed
  • Document the cutover date

3. Schedule Running at Wrong Time

Symptom:

  • Pipeline runs at unexpected hour
  • Missing scheduled runs

Check:

# View schedule definition
Schedules page → Your Schedule → Details

Common Causes:

  • Timezone differences (Arch vs Dagster)
  • Cron expression interpretation
  • Daylight saving time

Fix:

  • Verify timezone in deployment settings
  • Adjust cron expression if needed
  • Account for UTC vs local time
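
The UTC versus local time mismatch can be checked with a quick calculation. The sketch below assumes, purely for illustration, a cron expression of `0 9 * * *` evaluated in UTC and a team working in US Eastern time (UTC-5 in winter, UTC-4 during daylight saving time). The function name `local_run_hour` is hypothetical and not part of Dagster.

```python
from datetime import datetime, timedelta, timezone

def local_run_hour(utc_hour: int, utc_offset_hours: int) -> int:
    """Return the local wall-clock hour at which a UTC-scheduled run fires."""
    # Any date works here: only the hour matters for a fixed offset.
    run_utc = datetime(2024, 1, 15, utc_hour, 0, tzinfo=timezone.utc)
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return run_utc.astimezone(local_tz).hour

# Cron "0 9 * * *" evaluated in UTC, viewed from US Eastern time:
winter_hour = local_run_hour(9, -5)  # standard time (UTC-5)
summer_hour = local_run_hour(9, -4)  # daylight saving time (UTC-4)
print(winter_hour, summer_hour)
```

A one-hour shift that appears twice a year usually points to daylight saving time rather than to a wrong cron expression.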

Meltano-Specific Issues

State Management Problems

"State lock file exists"

Meaning: Previous run didn't complete cleanly

Solution:

  1. Wait 5 minutes (auto-cleanup)
  2. Check for active runs
  3. Contact support if the error persists

"Bookmark not found"

Meaning: First run or state reset

Solution:

  • Normal for first run
  • Will extract all data
  • State created automatically

Connection Errors

"Connection refused"

psycopg2.OperationalError: connection refused

Check:

  1. Source system status
  2. Network connectivity
  3. Credential rotation
  4. Firewall rules

Quick Test:

  • Try running a simple query
  • Check other extractors
  • Verify from UI console
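
The network connectivity check above can be approximated with a plain TCP probe. This is a minimal sketch, assuming you run it from a machine that is allowed to reach the source system; the helper name `can_connect` and the example host are hypothetical. A successful probe only shows that the port is reachable, not that the credentials are valid.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False

# Example call (hypothetical host; 5432 is the default PostgreSQL port):
#   can_connect("db.example.com", 5432)
```

If the probe fails from a network that should have access, firewall rules or the source system's status are the likelier causes; if it succeeds, look at credential rotation next.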

dbt-Specific Issues

Model Compilation Errors

"Model not found"

Compilation Error in model 'my_model'
Model 'my_model' not found

Causes:

  1. Model file missing
  2. Incorrect ref() syntax
  3. Model in different schema

Fix:

  1. Verify model file exists
  2. Check model naming
  3. Review dbt_project.yml

"Permission denied"

Database Error in model 'my_model'
permission denied for schema analytics

Solution:

  • Warehouse permissions were not changed by the migration
  • Check role assignments
  • Verify schema access

Dependency Issues

Circular Dependencies

Found a cycle: model_a → model_b → model_a

Fix:

  1. Review model relationships
  2. Break cycle with staging model
  3. Restructure dependencies
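
To see where a cycle sits before restructuring, a depth-first search over the model graph is enough. The sketch below is illustrative only and is not how dbt detects cycles internally; it assumes you have written out, by hand, which models each model references through ref(). The function name `find_cycle` is hypothetical.

```python
from typing import Optional

def find_cycle(deps: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of model names, or None."""
    visiting: list[str] = []   # models on the current search path
    done: set[str] = set()     # models fully explored with no cycle found

    def visit(model: str) -> Optional[list[str]]:
        if model in done:
            return None
        if model in visiting:
            # Close the loop: path from the first occurrence back to itself.
            return visiting[visiting.index(model):] + [model]
        visiting.append(model)
        for upstream in deps.get(model, []):
            cycle = visit(upstream)
            if cycle:
                return cycle
        visiting.pop()
        done.add(model)
        return None

    for model in deps:
        cycle = visit(model)
        if cycle:
            return cycle
    return None

# model_a refs model_b, and model_b refs model_a
print(find_cycle({"model_a": ["model_b"], "model_b": ["model_a"]}))
```

Moving the shared logic into a staging model that both models reference removes the loop, at which point the function returns None.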

Performance Issues

Slow Extractions

Diagnosis:

  1. Compare with Arch timing
  2. Check data volume growth
  3. Review extraction logs

Common Causes:

  • No incremental state
  • API rate limiting
  • Large backfill

Solutions:

  • Verify state is working
  • Check API quotas
  • Consider batching
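
For the timing comparison in the diagnosis steps, a ratio of median durations is less sensitive to a single unusual run than a ratio of averages. This is a minimal sketch with made-up numbers; it assumes you have copied run durations (in seconds) by hand from the old and new run histories, and the function name `slowdown_ratio` is hypothetical.

```python
from statistics import median

def slowdown_ratio(recent_seconds: list[float], baseline_seconds: list[float]) -> float:
    """Ratio of median recent duration to median baseline duration."""
    return median(recent_seconds) / median(baseline_seconds)

# Hypothetical durations in seconds.
baseline = [300.0, 320.0, 310.0]   # e.g. pre-migration timings
recent = [900.0, 940.0, 930.0]     # e.g. current runs
ratio = slowdown_ratio(recent, baseline)
print(f"{ratio:.1f}x")
```

A ratio that jumps suddenly may suggest a full extraction caused by missing incremental state, while a ratio that creeps up over weeks is more consistent with data volume growth.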

dbt Models Taking Longer

Check:

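-- In your warehouse (Snowflake syntax assumed, where QUERY_HISTORY is a table function;
-- other warehouses expose query history differently)
SELECT * FROM TABLE(information_schema.query_history())
WHERE query_text LIKE '%your_model%'
ORDER BY start_time DESC;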

Optimize:

  1. Review model SQL
  2. Check for missing indexes
  3. Analyze query plans
  4. Consider incremental models

Asset Materialization Issues

"Upstream asset not materialized"

Meaning: Dependency hasn't run successfully

Fix:

  1. Materialize upstream asset first
  2. Check dependency chain
  3. Run full pipeline
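
When the dependency chain is long, it helps to write down a valid materialization order before triggering runs by hand. The sketch below uses Python's standard `graphlib` module; the asset names are hypothetical, and the dependency map is assumed to be transcribed manually from the asset lineage view.

```python
from graphlib import TopologicalSorter

# Hypothetical asset names; each key maps to the assets it depends on.
dependencies = {
    "daily_report": {"stg_orders"},
    "stg_orders": {"raw_orders"},
    "raw_orders": set(),
}

# static_order() yields every upstream asset before its dependents.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['raw_orders', 'stg_orders', 'daily_report']
```

Materializing assets in this order ensures each upstream asset has run before anything that depends on it.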

"Asset check failed"

Example:

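from dagster import AssetCheckResult, asset_check

@asset_check(asset="my_asset")  # "my_asset" is a placeholder asset key
def row_count_check():
    row_count = 0  # placeholder: replace with a real row count
    # passed=False is what gets reported as a failed asset check
    return AssetCheckResult(passed=row_count > 0)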

Resolution:

  1. Review check logic
  2. Verify data quality
  3. Adjust thresholds
  4. Skip check if needed
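
Adjusting a threshold usually means deciding how much change is acceptable before a check should fail. The sketch below shows one possible rule, a maximum fractional drop in row count between runs. It is plain Python rather than Dagster code, and the function name `row_count_ok` and the 10% default are assumptions chosen for illustration.

```python
def row_count_ok(current: int, previous: int, max_drop: float = 0.10) -> bool:
    """Pass if the row count did not fall by more than max_drop (a fraction)."""
    if previous == 0:
        # Nothing to compare against, e.g. the first run.
        return True
    drop = (previous - current) / previous
    return drop <= max_drop

print(row_count_ok(9_500, 10_000))        # 5% drop
print(row_count_ok(8_000, 10_000))        # 20% drop
print(row_count_ok(8_000, 10_000, 0.25))  # 20% drop, looser threshold
```

Loosening a threshold hides real problems as easily as false alarms, so it is worth verifying data quality first and adjusting the threshold second.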

Authentication Issues

"Invalid credentials"

For Dagster UI:

  1. Reset password via login page
  2. Check account status
  3. Verify email address

For Data Sources:

  • Credentials managed by platform
  • No user action needed
  • Report to support

Session Timeouts

Symptom: Logged out frequently

Normal Behavior:

  • 12-hour session timeout
  • Security feature
  • Cannot be modified

Error Message Decoder

Dagster Errors

Error | Meaning | Action
DagsterExecutionError | Asset failed to materialize | Check logs
DagsterResourceError | Resource configuration issue | Contact support
DagsterTypeError | Data type mismatch | Review asset output
DagsterInvariantViolation | System constraint violated | Report bug

Platform Errors

Error | Meaning | Action
K8s pod error | Compute resource issue | Auto-retry, then support
Timeout exceeded | Run took too long | Optimize or increase limit
Memory limit exceeded | Out of memory | Reduce data size or contact support

Getting Help Effectively

Information to Gather

  1. Run Information

    • Run ID (from URL)
    • Failure time
    • Asset name
    • Error message
  2. Context

    • Recent changes made
    • First occurrence?
    • Affecting all runs?
    • Specific to asset?
  3. Logs

    • Copy error text
    • Include stack trace
    • Note step that failed

Support Template

Issue: [Brief description]
Run ID: [From URL]
Asset: [Name]
Error: [Message]
First seen: [Date/time]
Frequency: [Always/Sometimes]
Recent changes: [What changed]
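
If you file support requests often, the template can be filled in programmatically. This is a rough sketch: it assumes the run ID is the path segment that follows `/runs/` in the run URL, which you should confirm against your own deployment. The helper name `run_id_from_url` and every value in the example are placeholders.

```python
import re

TEMPLATE = """Issue: {issue}
Run ID: {run_id}
Asset: {asset}
Error: {error}
First seen: {first_seen}
Frequency: {frequency}
Recent changes: {recent_changes}"""

def run_id_from_url(url: str) -> str:
    """Extract the path segment after /runs/ (assumed URL layout)."""
    match = re.search(r"/runs/([^/?#]+)", url)
    if match is None:
        raise ValueError(f"no run ID found in {url!r}")
    return match.group(1)

# All values below are placeholders.
print(TEMPLATE.format(
    issue="Nightly orders pipeline failed",
    run_id=run_id_from_url("https://dagster.example.com/runs/abc123"),
    asset="stg_orders",
    error="psycopg2.OperationalError: connection refused",
    first_seen="2024-01-15 02:00 UTC",
    frequency="Always",
    recent_changes="None",
))
```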

Prevention Strategies

Daily Checks

  1. Monitor asset freshness
  2. Review failed runs
  3. Check schedule health
  4. Validate critical data
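
The freshness part of the daily check comes down to comparing an asset's last materialization time against the oldest age you are willing to accept. The sketch below shows that comparison in plain Python; it assumes you read the timestamp from the asset page yourself, and the function name `is_stale` and the 24-hour limit are illustrative choices.

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_materialized: datetime, max_age: timedelta, now: datetime) -> bool:
    """Return True if the asset is older than the allowed age."""
    return now - last_materialized > max_age

now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
last = datetime(2024, 1, 14, 6, 0, tzinfo=timezone.utc)   # 30 hours earlier
print(is_stale(last, timedelta(hours=24), now))
```

Keeping all timestamps in UTC avoids the same local-time confusion that affects schedules.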

Weekly Review

  1. Performance trends
  2. Error patterns
  3. Resource usage
  4. Update documentation

Before Making Changes

  1. Test in development
  2. Review dependencies
  3. Plan rollback strategy
  4. Monitor first run

Emergency Procedures

Critical Pipeline Down

  1. Immediate Actions

    • Screenshot error
    • Note exact time
    • Check all pipelines
    • Alert stakeholders
  2. Diagnosis

    • Recent changes?
    • Source system up?
    • Partial success?
    • Alternative path?
  3. Escalation

    • Use emergency contact
    • Provide run ID
    • Share error details
    • Stay available for a call

Data Corruption

  1. Stop the Bleeding

    • Pause schedules
    • Prevent propagation
    • Document scope
  2. Recovery

    • Identify last good state
    • Plan restoration
    • Test fix carefully
    • Gradual rollout

FAQ

Q: Why can't I see historical runs from Arch?
A: Run history starts fresh in Dagster. Your data remains in the warehouse.

Q: Can I modify Meltano configurations?
A: No, Meltano configs are managed by the platform. Use dbt for transformations.

Q: How do I add a new data source?
A: Contact support to add new Meltano taps.

Q: Why is my pipeline slower than Arch?
A: Check whether it is running a full extraction instead of an incremental one.

Q: Can I run pipelines locally?
A: No, compute is managed in cloud infrastructure.

Next Steps