Source freshness
dbt Cloud provides a helpful interface around dbt's source data freshness calculations. When a dbt Cloud job is configured to snapshot source data freshness, dbt Cloud will render a user interface showing you the state of the most recent snapshot. This interface is intended to help you determine if your source data freshness is meeting the service level agreement (SLA) that you've defined for your organization.
Enabling source freshness snapshots
The dbt build command does not include source freshness checks when building and testing resources in your DAG. Instead, you can use one of these common patterns for defining jobs:
- Add dbt build to the run step to run models, tests, and so on.
- Select the Generate docs on run checkbox to automatically generate project docs.
- Select the Run source freshness checkbox to enable source freshness as the first step of the job.
To enable source freshness snapshots, first configure your sources to snapshot freshness information. You can then either add the dbt source freshness command to the job's run steps or select the Run source freshness checkbox. The two approaches behave differently, so expect different outcomes depending on which one you choose.
Review the following options and outcomes:
Source freshness snapshot frequency
Your freshness jobs need to run frequently enough to snapshot data latency in accordance with your SLAs. If you have a 1-hour SLA on a particular dataset, for example, snapshotting the freshness of that table once daily would not be appropriate. As a rule of thumb, run your source freshness jobs at least twice as often as your shortest SLA. Here's an example table of some reasonable snapshot frequencies given typical SLAs:
SLA | Snapshot Frequency |
---|---|
1 hour | 30 mins |
1 day | 12 hours |
1 week | About daily |
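
The SLAs in this table correspond to the freshness thresholds you set on a source. The following is a minimal sketch for a 1-hour SLA, not a definitive configuration: the source name jaffle_shop, the database raw, the table orders, and the _etl_loaded_at timestamp column are all hypothetical, and the exact placement of these keys can vary by dbt version.

```yaml
version: 2

sources:
  - name: jaffle_shop            # hypothetical source name
    database: raw                # hypothetical database
    # Column that records when each row was loaded (assumed to exist)
    loaded_at_field: _etl_loaded_at
    freshness:
      # Warn once the newest row is older than 30 minutes
      warn_after: {count: 30, period: minute}
      # Fail the freshness check once the 1-hour SLA is breached
      error_after: {count: 1, period: hour}
    tables:
      - name: orders             # hypothetical table
```

With thresholds like these, a job that snapshots freshness every 30 minutes would surface a breach soon after it happens rather than many hours later.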
Further reading
- Refer to Artifacts for more info on how to create dbt Cloud artifacts, share links to the latest documentation, and share source freshness reports with your team.
- Source freshness for Snowflake is calculated using the LAST_ALTERED column. Read about the limitations in Snowflake configs.