pytest-dbt-core
Write unit tests for your dbt logic with pytest-dbt-core! pytest-dbt-core is a pytest plugin for testing your dbt projects.
Installation
Install pytest-dbt-core via pip with
python -m pip install pytest-dbt-core
Usage
Create a macro:
{% macro normalize_column_names(column_names) %}
    {%- set re = modules.re -%}
    {%- for column_name in column_names -%}
        {%- set normalized_column_name = re.sub('[!?@#$()%&]', '', column_name).strip().replace(' ', '_').replace('.', '_').rstrip('_').lower() -%}
        {# Columns should not start with digits #}
        {%- if normalized_column_name[0].isdigit() -%}
            {% set normalized_column_name = '_' + normalized_column_name -%}
        {% endif -%}
        `{{ column_name }}` as {{ normalized_column_name }}
        {%- if not loop.last -%}, {% endif -%}
    {%- endfor -%}
{% endmacro %}
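To make the macro's string handling easier to follow, here is a rough pure-Python sketch of the same normalization steps. The function name `normalize_column_name` is hypothetical and not part of pytest-dbt-core or dbt; it only mirrors the chain of calls in the macro. As an assumption of this sketch, it skips the leading-digit check for empty names, which the macro itself does not guard against.

```python
import re


def normalize_column_name(column_name: str) -> str:
    """Mirror the macro's normalization steps for one column name (illustrative only)."""
    normalized = (
        re.sub(r"[!?@#$()%&]", "", column_name)  # drop the listed special characters
        .strip()  # remove surrounding whitespace
        .replace(" ", "_")  # spaces become underscores
        .replace(".", "_")  # periods become underscores
        .rstrip("_")  # no trailing underscores
        .lower()
    )
    # Columns should not start with digits. The emptiness check is an addition
    # of this sketch; the macro indexes the first character directly.
    if normalized and normalized[0].isdigit():
        normalized = "_" + normalized
    return normalized


print(normalize_column_name("9leading number"))  # _9leading_number
```

The macro additionally wraps the original name in backticks and emits `as <normalized name>` for each column, separated by commas; the sketch leaves that SQL rendering out.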
Unit test a macro:
import pytest
from dbt.clients.jinja import MacroGenerator
from pyspark.sql import SparkSession


@pytest.mark.parametrize(
    "macro_generator",
    ["macro.dbt_project.normalize_column_names"],
    indirect=True,
)
@pytest.mark.parametrize(
    "column_name,expected_column_name",
    [
        ("unit", "unit"),
        ("column with spaces", "column_with_spaces"),
        ("c!h?a#r$a(c)t%e&rs", "characters"),
        ("trailing white spaces ", "trailing_white_spaces"),
        ("column.with.periods", "column_with_periods"),
        ("9leading number", "_9leading_number"),
        ("UPPERCASE", "uppercase"),
    ],
)
def test_normalize_column_names(
    spark_session: SparkSession,
    macro_generator: MacroGenerator,
    column_name: str,
    expected_column_name: str,
) -> None:
    """Test normalize column names with different scenarios."""
    normalized_column_names = macro_generator([column_name])
    out = spark_session.sql(
        f"SELECT {normalized_column_names} FROM (SELECT True AS `{column_name}`)"
    )
    assert out.columns[0] == expected_column_name, normalized_column_names
Configuration
The plugin runs in the context of a dbt project.
Project directory
When you run pytest from the root of your project, you do not need to set the project directory. To run pytest from another location, point the --dbt-project-dir option to the root of your project.
Target
To use a target other than the default, set the --dbt-target option.
Profiles directory
If you want to change dbt’s profiles directory, use the --profiles-dir option.
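The three options above can be combined in a single pytest invocation. As a minimal sketch, the helper below assembles the argument list in Python; `build_pytest_args` and the example paths are hypothetical names chosen for illustration, and only the option names come from this document.

```python
from __future__ import annotations


def build_pytest_args(
    project_dir: str | None = None,
    target: str | None = None,
    profiles_dir: str | None = None,
) -> list[str]:
    """Collect the plugin's command-line options, skipping any left unset."""
    args: list[str] = []
    if project_dir is not None:
        args += ["--dbt-project-dir", project_dir]
    if target is not None:
        args += ["--dbt-target", target]
    if profiles_dir is not None:
        args += ["--profiles-dir", profiles_dir]
    return args


# Hypothetical paths and target name, for illustration only.
example_args = build_pytest_args(project_dir="path/to/project", target="ci")
print(example_args)

# With pytest and pytest-dbt-core installed, the list could be passed on:
#     import pytest
#     pytest.main(example_args)
```

Leaving every argument unset yields an empty list, which corresponds to running plain `pytest` from the project root with the default target and profiles directory.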
dbt-spark
dbt-spark users are advised to use the Spark connection method when testing. Together with the pytest Spark plugin, an on-the-fly Spark session removes the need to host Spark.
Installation
Install dbt-spark, pytest-dbt-core and pytest-spark via pip with
python -m pip install dbt-spark pytest-dbt-core pytest-spark
Configuration
Configure pytest-spark via pytest configuration.
# setup.cfg
[tool:pytest]
spark_options =
    spark.executor.instances: 1
    spark.sql.catalogImplementation: in-memory
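The `spark_options` value is a multi-line setting with one `key: value` pair per indented line. The sketch below reads such a file with the standard library's `configparser` to show how the pairs can be recovered. It is only an illustration of the file layout, not pytest-spark's actual parsing code, and `read_spark_options` is a hypothetical helper.

```python
import configparser

# The configuration from above; continuation lines must be indented.
SETUP_CFG = """\
[tool:pytest]
spark_options =
    spark.executor.instances: 1
    spark.sql.catalogImplementation: in-memory
"""


def read_spark_options(text: str) -> dict[str, str]:
    """Split the multi-line spark_options value into a dict (illustrative only)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw_value = parser["tool:pytest"]["spark_options"]
    options: dict[str, str] = {}
    for line in raw_value.splitlines():
        line = line.strip()
        if not line:
            continue  # the value starts with an empty first line
        key, _, value = line.partition(":")
        options[key.strip()] = value.strip()
    return options


print(read_spark_options(SETUP_CFG))
```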
Usage
Use the spark_session fixture to set up the unit test for your macro:
import pytest
from dbt.clients.jinja import MacroGenerator
from pyspark.sql import SparkSession


@pytest.mark.parametrize(
    "macro_generator", ["macro.spark_utils.get_tables"], indirect=True
)
def test_get_tables(
    spark_session: SparkSession, macro_generator: MacroGenerator
) -> None:
    """The get tables macro should return the created table."""
    expected_table = "default.example"
    spark_session.sql(f"CREATE TABLE {expected_table} (id int) USING parquet")
    tables = macro_generator()
    assert tables == [expected_table]
Test
Run pytest via your preferred interface.
pytest
Projects
The following projects use the pytest-dbt-core plugin: