Beginner’s Guide to dbt Testing
When you transform data with a build tool such as dbt, it helps to run simple validations as early as possible. dbt tests are a great way to do this: they are a supported feature of dbt projects that lets you write test cases for your transformations. Testing lets you catch potential issues early and saves you the effort of fixing them downstream. In this guide I show how to use dbt tests to validate your transformations.
Common Data Quality Issues
Data quality issues can lead to errors in analysis and reporting, which can affect decision-making. Common data quality issues in source data include inconsistent data formats, invalid data values, missing or incomplete data, and incorrect data types. Transformations can also introduce issues, such as data loss, data corruption, data duplication, and incorrect calculations. These issues can result in incomplete analysis or inaccurate results, so it’s important to address them in order to ensure that data is accurate and reliable for making informed decisions.
In general, two types of tests can be implemented: generic tests and singular tests. In our project we have implemented generic tests such as:
1. NOT NULL
A null value in data indicates that a value in a column is missing or unknown. Sometimes, null values are intentional and desired. Other times, they are an indication of a data issue. This kind of test identifies where null values exist.
2. UNIQUE VALUES
This kind of test is used to ensure that there are no duplicate values or recurring rows in a dataset. This is important because duplicate values in data columns like primary keys can lead to misleading data metrics, inaccurate visualizations, and unreliable reports.
3. ACCEPTED VALUES
This kind of test verifies that all column values across all rows are in the set of valid values. For example, our product_category column might only have two valid values. If this test uncovers any value in the column that is not one of these two, it will fail.
4. RELATIONSHIP/REFERENTIAL INTEGRITY
This kind of test verifies referential integrity, ensuring all of the records in a child table have a corresponding record in the parent table.
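The four generic tests above share one idea: dbt runs a query that selects the rows breaking the rule, and the test passes when that query returns no rows. The sketch below imitates this outside dbt, using Python's built-in sqlite3 module. The products and categories tables, their columns, and the accepted values 'books' and 'games' are hypothetical names made up for illustration, and the queries only approximate what dbt generates.

```python
import sqlite3

# Hypothetical sample data: row 3 reuses product_id 2 and has a category
# that is neither accepted nor present in the parent table; row 4 has no id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (category TEXT);
    INSERT INTO categories VALUES ('books'), ('games');

    CREATE TABLE products (product_id INTEGER, product_category TEXT);
    INSERT INTO products VALUES
        (1, 'books'),
        (2, 'games'),
        (2, 'toys'),
        (NULL, 'books');
""")

# Each check selects the rows that violate its rule.
# An empty result means the check passes.
CHECKS = {
    "not_null": """
        SELECT product_id, product_category
        FROM products
        WHERE product_id IS NULL
    """,
    "unique": """
        SELECT product_id, COUNT(*) AS n_records
        FROM products
        WHERE product_id IS NOT NULL
        GROUP BY product_id
        HAVING COUNT(*) > 1
    """,
    "accepted_values": """
        SELECT product_id, product_category
        FROM products
        WHERE product_category NOT IN ('books', 'games')
    """,
    "relationships": """
        SELECT child.product_category
        FROM products AS child
        LEFT JOIN categories AS parent
            ON child.product_category = parent.category
        WHERE child.product_category IS NOT NULL
          AND parent.category IS NULL
    """,
}


def run_checks(connection):
    """Run every check and return {check name: list of failing rows}."""
    return {
        name: connection.execute(sql).fetchall()
        for name, sql in CHECKS.items()
    }


results = run_checks(conn)
for name, failing in results.items():
    status = "PASS" if not failing else f"FAIL ({len(failing)} failing rows)"
    print(f"{name}: {status}")
```

In a real dbt project you would not write these queries by hand; the built-in generic tests are declared against a column in a YAML properties file and dbt generates the SQL for you.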
In our project we defined uniqueness and not-null tests on the relevant columns of the data sources.
To run a specific test on the source table, I use the following command:

Unique Testing:

Expression Testing:

Relationship Testing:

Using the dbt test command, I can execute all the test cases and receive a result for each. dbt runs the tests and displays the results; by default a result can be PASS, WARN, ERROR or SKIP. Below is the result of the test run for the specific data source:
If a test run leads to an error, dbt outputs the error details. I illustrate this below by running a test that produces an error:

SINGULAR TEST:
Singular tests are defined in the tests folder of your dbt project. Each one is a test assertion written as a SQL SELECT statement, optionally with Jinja templating, that returns the rows violating the assertion. Custom tests are good for testing business transformations and ensuring data quality. Below is an example of a custom test:
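To make the mechanics concrete outside a dbt project, here is a minimal sketch in Python with sqlite3 of what a singular test boils down to: one SELECT that returns every row breaking a business rule. The orders table, its columns, and the rule "total must equal quantity times unit_price" are hypothetical and are not taken from our project.

```python
import sqlite3

# Hypothetical orders table; order 2 has a total that does not match
# quantity * unit_price (3 * 4.0 = 12.0, but 15.0 is stored).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER,
        quantity INTEGER,
        unit_price REAL,
        total REAL
    );
    INSERT INTO orders VALUES
        (1, 2, 5.0, 10.0),
        (2, 3, 4.0, 15.0),
        (3, 1, 7.5, 7.5);
""")

# In a dbt project this SELECT would sit in its own .sql file in the tests
# folder and would reference the model with {{ ref('orders') }} rather than
# a plain table name.
ASSERT_ORDER_TOTAL_MATCHES = """
    SELECT order_id, quantity, unit_price, total
    FROM orders
    WHERE ABS(total - quantity * unit_price) > 0.005
"""

failing_rows = conn.execute(ASSERT_ORDER_TOTAL_MATCHES).fetchall()

if failing_rows:
    print(f"FAIL: {len(failing_rows)} row(s) break the rule: {failing_rows}")
else:
    print("PASS")
```

Writing the assertion as "select the bad rows" has a practical benefit: when the test fails, the query result already tells you which records to inspect.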


In the dbt lineage graph below, the defined tests are included with their dependencies, and when the dbt test command is run without options, the custom test is executed as well.

Conclusion
This wraps up the beginner’s guide to dbt testing. There is much more that can be done with dbt tests, for instance creating generic tests as macros, snapshot tests, and using packages for testing. Using built-in or custom tests in dbt is simple and straightforward, and it should be standard in any dbt project. Testing is good practice in general and works best when it is included early in your workflows. Good-quality tests make your data more reliable, increase observability, and let you catch data quality issues before they cause inaccuracies in reports. So with dbt, develop tests early and update them over time.