Factory injection, combining pytest with factory_boy
The most annoying part of writing tests is the setup. All that boilerplate takes effort and thought. You would rather focus on taking an action and asserting the result under a known precondition than dig, again and again, into how to bring that precondition about. It would be nice to have a pattern that helps you get started.
Testing tools normally don't provide any pattern to help you create test data. Developers write chunks of code that create entities imperatively, then try to reuse that code and share it between functions. In practice these fixtures are rarely refactored, because too many tests depend on them, so the code gets duplicated in many flavors.
In the case of BDD this quickly ends up as scripting the creation of an entire hierarchy:
Scenario: Dark Helmet vs Lone Star v1
    Given I am your father's brother's nephew's cousin's former roommate
    When So what does that make us?
    Then Absolutely nothing
This fixture is a very complicated flavor. Flavors have a big downside: they encapsulate the creation code, which makes it impossible to reuse or parametrize. So let's split it:
Scenario: Dark Helmet vs Lone Star v2
    Given there's a room
    And there's you
    And you have a father
    And your father has a brother
    And your father's brother has a nephew
    And your father's brother's nephew has a cousin
    And there's me
    And your father's brother's nephew's cousin lived in the room
    And I lived in the room
    When So what does that make us?
    Then Absolutely nothing
This is even worse, but it is realistic! Imagine how painful it is to reuse this setup for similar tests. There is a solution for that, the Gherkin Background section:
Feature: Spaceballs

    Background:
        Given there's a room
        And there's you
        And you have a father
        And your father has a brother
        And your father's brother has a nephew
        And your father's brother's nephew has a cousin
        And there's me

    Scenario: Dark Helmet vs Lone Star v3
        Given your father's brother's nephew's cousin lived in the room
        And I lived in the room
        When So what does that make us?
        Then Absolutely nothing
How is this different from v1? It is a lot more paperwork, and the flavor is still there. You still have to address that particular person by some symbolic name, especially because the steps are executed separately and the context is shared between them.
Popular BDD tools provide this feature, but every step has to come up with a variable name, put it in the context, and check whether the object was already created. It also implies that the given steps have to appear in a certain order, which is again too programmatic.
On the other hand, most systems have a statically defined hierarchy. If you need an object, all of its non-nullable parents in the hierarchy have to be created, otherwise the object cannot exist. The preconditions that merely make your system valid are not really worth mentioning. This is the so-called just-enough-specification principle of Gherkin: you specify only the facts you need to observe later and assume everything else simply works.
Certain flavors usually also have statically defined roles in the system, with much shorter names. To make use of that, there should be a pattern that ensures all of an object's dependencies are created without mentioning them explicitly.
At Paylogic we have gained quite some experience from using pytest for more than 7 years; you can read some of our previous articles.
Dependency injection
The main difference between pytest and most of the other testing tools is dependency injection.
import pytest


@pytest.fixture
def you(your_father):
    """You can't be created without your father."""
    ...
Basically the entire hierarchy is configured statically in the code as dependencies. It is easy to follow the flow by looking at the definitions. Each fixture cares only about creating itself and requesting everything it needs as dependencies.
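To make the idea of resolving dependencies by argument name concrete, here is a minimal sketch using only the standard library. It is a toy model, not how pytest is implemented: the FIXTURES registry and the fixture and resolve helpers are hypothetical names invented for this illustration.

```python
import inspect

# Toy registry standing in for pytest's fixture lookup (hypothetical, not pytest's API).
FIXTURES = {}


def fixture(func):
    """Register a function under its own name, like a very small @pytest.fixture."""
    FIXTURES[func.__name__] = func
    return func


def resolve(name, cache):
    """Create the named object once, resolving its parameters by name first."""
    if name not in cache:
        func = FIXTURES[name]
        params = inspect.signature(func).parameters
        kwargs = {param: resolve(param, cache) for param in params}
        cache[name] = func(**kwargs)
    return cache[name]


@fixture
def your_father():
    return {"name": "father"}


@fixture
def you(your_father):
    """You can't be created without your father."""
    return {"name": "you", "father": your_father}


cache = {}
person = resolve("you", cache)

# The parent was created implicitly and is shared through the cache.
assert person["father"] is resolve("your_father", cache)
```

Requesting "you" is enough: the parent is created on the way, and anything that asks for "your_father" later receives the same instance.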
In pytest-BDD we implemented dependency injection support for the steps, so that pytest fixtures are shared among them, instead of a context object that you have to feed imperatively.
Compared to other testing tools, code written with pytest is more declarative than imperative, and dependencies allow just-enough-specification.
Ideal test
What would the ideal test be? I would say one where I mention only the attributes that make up the test case, and where those values are defined as close as possible to the test code, so I can compare them with the assertion section.
pytest has an elegant way to parametrize tests:
import pytest


@pytest.mark.parametrize("book__price", [Amount("EUR", "15.20")])
@pytest.mark.parametrize(
    ("author__fee_percent", "expected_author_fee"),
    (
        (27, Amount("EUR", "4.104")),
        (12.5, Amount("EUR", "1.9")),
    ),
)
def test_unicode(book, expected_author_fee):
    """Test author fee calculation."""
    assert book.author_fee == expected_author_fee
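The expected fees in this parametrization can be checked by hand. The sketch below assumes the fee is simply price * percent / 100. The Amount class and the author_fee function here are hypothetical stand-ins, since the real Amount and Book classes are not shown in this article.

```python
from decimal import Decimal


class Amount:
    """Hypothetical minimal money type: a currency code plus an exact decimal value."""

    def __init__(self, currency, value):
        self.currency = currency
        self.value = Decimal(str(value))

    def __eq__(self, other):
        if not isinstance(other, Amount):
            return NotImplemented
        return (self.currency, self.value) == (other.currency, other.value)

    def __repr__(self):
        return "Amount({!r}, {!r})".format(self.currency, str(self.value))


def author_fee(price, fee_percent):
    """Assumed rule: the fee is the given percentage of the book price."""
    return Amount(price.currency, price.value * Decimal(str(fee_percent)) / 100)


price = Amount("EUR", "15.20")

# Both parametrized cases from the test above.
assert author_fee(price, 27) == Amount("EUR", "4.104")
assert author_fee(price, 12.5) == Amount("EUR", "1.9")
```

Using Decimal keeps the arithmetic exact, which matters when the expected value sits right next to the input in the parametrization.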
The only downside of pytest fixtures is that their implementation can be hard to find, since you don't import them. Fixtures can be inherited and overridden, so you are not always sure which context you are in. It would be nice to have a more compact representation of their definitions, or to avoid defining them manually at all.
Factory pattern
There's a great project for Python called factory_boy. It allows you to create objects starting at any level of the model hierarchy, creating all the necessary dependencies along the way. The factory pattern solves the encapsulation problem, so that any value can be passed to any level.
It is declarative, compact and easy to maintain. Factories look like the models and give a great overview of what values the attributes will get.
There are post-generation declarations and actions that help you solve problems related to circular dependencies and apply side effects after the object is created.
import factory
import faker

fake = faker.Faker()


class AuthorFactory(factory.Factory):
    class Meta:
        model = Author

    name = factory.LazyAttribute(lambda f: fake.name())


class BookFactory(factory.Factory):
    class Meta:
        model = Book

    # The parent object is created first.
    author = factory.SubFactory(AuthorFactory)
    title = factory.LazyAttribute(lambda f: fake.sentence())


# Create a book with default values.
book = BookFactory()

# Create a book with a specific title.
book_with_title = BookFactory(title="pytest in a nutshell")

# Create a book parametrized with a specific author name.
book_with_author_name = BookFactory(author__name="John Smith")

# Create books with titles "Book 0", "Book 1", "Book 2", etc.
tons_of_books = BookFactory.create_batch(
    size=100000,
    title=factory.Sequence(lambda n: "Book {0}".format(n)),
)
The double underscore syntax lets you address an attribute, an attribute of that attribute, and so on. It is a nice technique that helps with creating complex and large datasets for various needs.
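The convention is easy to picture as a key-splitting step. The following sketch is an illustration only and not factory_boy's actual code; split_overrides is a hypothetical helper. It shows how flat keyword arguments can be separated into values for the object itself and overrides to pass one level down.

```python
def split_overrides(kwargs):
    """Split flat keyword arguments into own values and per-attribute nested overrides."""
    own, nested = {}, {}
    for key, value in kwargs.items():
        head, sep, rest = key.partition("__")
        if sep:
            # "author__name" becomes {"author": {"name": ...}}; deeper levels
            # keep their remaining path and are split again one level down.
            nested.setdefault(head, {})[rest] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_overrides(
    {
        "title": "pytest in a nutshell",
        "author__name": "John Smith",
        "author__publisher__city": "Amsterdam",
    }
)

assert own == {"title": "pytest in a nutshell"}
assert nested == {"author": {"name": "John Smith", "publisher__city": "Amsterdam"}}
```

Each level peels off one prefix and hands the rest to the level below, which is why the same syntax works at any depth.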
These datasets can be completely parallel hierarchies, or a common parent can be passed in to bind them. You can even configure rules on the attribute declaration for how to obtain the parent.
We use it to populate our sandbox environment database with demo data that third parties can use.
A well-isolated test normally doesn't need the entire hierarchy. Creating everything takes too much time and too many resources. It should be like lightning reaching its target along the shortest path: create only what is necessary to make it work.
Tests that are separated into steps also want to use only the relevant nodes of the hierarchy, without navigating through the whole hierarchy to reach a certain instance or attribute. This is what pytest is good at. It also solves the problem of binding objects to a common parent, which is an instance of a fixture in the session.
Factory vs dependency injection
So, to recap: factories are good at a compact declarative style with a good overview of what the values will be, and at flexible parametrization on any level of the hierarchy.
pytest is good at delivering fully configured fixtures at any point of the test setup and at parametrizing the test case.
What if there were a way to combine the two and take advantage of:
- Minimal object creation along the hierarchy path.
- Easy access to the instances on the hierarchy path by conventional names.
- A strong naming convention for the attributes used in parametrization.
- A compact declarative notation for the models and attributes.
It is possible to parametrize anything using pytest as long as it is a fixture. This means we need fixtures not only for the principal entities, but also for all of their attributes. Since fixture names are unique within the scope of a test session and represent the same instance, the double underscore convention of factory_boy can also be applied.
pytest-factoryboy
So I made a library in which I try to combine the best practices of dependency injection and the factory pattern. The main idea is that the logic for creating attributes lives in the factory attribute declarations, and dependencies are represented by sub-factory attributes.
Basically there is no need to implement the fixtures manually, since factories can be introspected and fixtures can be generated automatically. Only registration is needed, where you can optionally choose a name for the fixture.
Factory Fixture
Factory fixtures allow you to use factories without importing them. The naming convention is the factory class name in lowercase-underscore form.
import factory
from pytest_factoryboy import register


class AuthorFactory(factory.Factory):
    class Meta:
        model = Author


register(AuthorFactory)


def test_factory_fixture(author_factory):
    author = author_factory(name="Charles Dickens")
    assert author.name == "Charles Dickens"
You don't have to bother importing the factory at all. Just keep the naming convention in mind to guess the name, and request it in your fixture or test function.
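The lowercase-underscore convention can be approximated with a small regular expression. This is a sketch for guessing names, not the library's own conversion code; to_snake_case is a hypothetical helper, and class names with unusual capitalization (such as consecutive capitals) may convert differently in the real library.

```python
import re


def to_snake_case(class_name):
    """Convert a CamelCase class name to lowercase-underscore form."""
    # Insert an underscore before every capital letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


assert to_snake_case("AuthorFactory") == "author_factory"
assert to_snake_case("BookFactory") == "book_factory"
assert to_snake_case("Author") == "author"
```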
Model Fixture
A model fixture is an instance of the model created by the factory. The naming convention is the model class name in lowercase-underscore form.
import factory
from pytest_factoryboy import register


@register
class AuthorFactory(factory.Factory):
    class Meta:
        model = Author

    name = "Charles Dickens"


def test_model_fixture(author):
    assert author.name == "Charles Dickens"
Model fixtures can be registered with specific names, for example when you address instances of a collection as "first" and "second", or an instance belonging to another parent as "other":
import pytest
from pytest_factoryboy import register

register(BookFactory)  # book
register(BookFactory, "second_book")  # second_book

register(AuthorFactory)  # author
register(AuthorFactory, "second_author")  # second_author

register(BookFactory, "other_book")  # other_book, book of another author


@pytest.fixture
def other_book__author(second_author):
    """Make the relation of the other_book to another (second) author."""
    return second_author
Attributes are Fixtures
Fixtures are also created for factory attributes. Attribute fixture names are prefixed with the model fixture name and a double underscore (similar to the convention used by factory_boy).
@pytest.mark.parametrize("author__name", ["Bill Gates"])
def test_model_fixture(author):
    assert author.name == "Bill Gates"
SubFactory
A sub-factory attribute points to the model fixture of the sub-factory. Attributes of sub-factories are injected as dependencies into the model fixture and can be overridden via parametrization.
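One way to picture how model fixtures, attribute fixtures and sub-factory fixtures fit together is a small resolver. This is a toy model under stated assumptions, not pytest-factoryboy's implementation: DECLARATIONS, get_fixture and the ("sub", name) marker are hypothetical, and plain dictionaries stand in for model instances.

```python
# Hypothetical declarations: attribute defaults per model. A sub-factory-like
# attribute is written as ("sub", "<model fixture name>").
DECLARATIONS = {
    "author": {"name": "Charles Dickens"},
    "book": {"title": "Oliver Twist", "author": ("sub", "author")},
}


def get_fixture(name, overrides, cache):
    """Resolve a model fixture ("book") or an attribute fixture ("book__title")."""
    if name in overrides:  # parametrization wins over declarations
        return overrides[name]
    if name in cache:  # one instance per name within a test
        return cache[name]
    model, sep, attr = name.partition("__")
    if sep:
        default = DECLARATIONS[model][attr]
        if isinstance(default, tuple) and default[0] == "sub":
            # The attribute fixture points at the sub-model fixture.
            value = get_fixture(default[1], overrides, cache)
        else:
            value = default
    else:
        # The model fixture depends on one attribute fixture per declaration.
        value = {
            attr_name: get_fixture("{}__{}".format(model, attr_name), overrides, cache)
            for attr_name in DECLARATIONS[model]
        }
    cache[name] = value
    return value


overrides = {"author__name": "Bill Gates"}
cache = {}
book = get_fixture("book", overrides, cache)

assert book == {"title": "Oliver Twist", "author": {"name": "Bill Gates"}}
# The book's author and the "author" fixture are the same instance.
assert book["author"] is get_fixture("author", overrides, cache)
```

Overriding "author__name" changes the author that the book receives, without the test ever mentioning how the book and the author are wired together.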
post-generation
A post-generation attribute fixture implements only the extracted value for the post-generation function.
Integration
An example of factory_boy and pytest integration.
factories/__init__.py:
import factory
from faker import Factory as FakerFactory

faker = FakerFactory.create()


class AuthorFactory(factory.django.DjangoModelFactory):
    """Author factory."""

    name = factory.LazyAttribute(lambda x: faker.name())

    class Meta:
        model = 'app.Author'


class BookFactory(factory.django.DjangoModelFactory):
    """Book factory."""

    title = factory.LazyAttribute(lambda x: faker.sentence(nb_words=4))

    class Meta:
        model = 'app.Book'

    author = factory.SubFactory(AuthorFactory)
tests/conftest.py:
from pytest_factoryboy import register
from factories import AuthorFactory, BookFactory
register(AuthorFactory)
register(BookFactory)
tests/test_models.py:
import pytest

from app.models import Book
from factories import BookFactory


def test_book_factory(book_factory):
    """Factories become fixtures automatically."""
    # The fixture is the factory class itself, not an instance of it.
    assert issubclass(book_factory, BookFactory)


def test_book(book):
    """Instances become fixtures automatically."""
    assert isinstance(book, Book)


@pytest.mark.parametrize("book__title", ["pytest for Dummies"])
@pytest.mark.parametrize("author__name", ["Bill Gates"])
def test_parametrized(book):
    """You can set any factory attribute as a fixture using naming convention."""
    assert book.title == "pytest for Dummies"
    assert book.author.name == "Bill Gates"
Fixture partial specialization
It is possible to pass keyword arguments during fixture registration in order to override factory attribute values. This comes in handy when your test case requests a lot of fixture flavors, too many for regular pytest parametrization. In that case you can register fixture flavors in the local test module and specify the value deviations inside the register function calls.
import pytest
from pytest_factoryboy import register

register(AuthorFactory, "male_author", gender="M", name="John Doe")
register(AuthorFactory, "female_author", gender="F")


@pytest.fixture
def female_author__name():
    """Override female author name as a separate fixture."""
    return "Jane Doe"


@pytest.mark.parametrize("male_author__age", [42])  # Override even more
def test_partial(male_author, female_author):
    """Test fixture partial specialization."""
    assert male_author.gender == "M"
    assert male_author.name == "John Doe"
    assert male_author.age == 42

    assert female_author.gender == "F"
    assert female_author.name == "Jane Doe"
Fixture attributes
Sometimes it is necessary to pass an instance of another fixture as an attribute value to the factory. It is possible to override the generated attribute fixture, where the desired values can be requested as fixture dependencies. There is also a lazy wrapper for fixtures that can be used in parametrization without defining fixtures in a module.
The LazyFixture constructor accepts either an existing fixture name or a callable with dependencies:
import pytest
from pytest_factoryboy import register, LazyFixture


@pytest.mark.parametrize("book__author", [LazyFixture("another_author")])
def test_lazy_fixture_name(book, another_author):
    """Test that book author is replaced with another author by fixture name."""
    assert book.author == another_author


@pytest.mark.parametrize("book__author", [LazyFixture(lambda another_author: another_author)])
def test_lazy_fixture_callable(book, another_author):
    """Test that book author is replaced with another author by callable."""
    assert book.author == another_author


# Can also be used in the partial specialization during the registration.
register(BookFactory, "another_book", author=LazyFixture("another_author"))
Post-generation dependencies
Unlike factory_boy, which binds related objects using an internal container to store the results of lazy evaluation, pytest-factoryboy relies on the pytest request.
Circular dependencies between objects can be resolved using post-generation hooks or related factories in combination with passing a SelfAttribute. With the pytest request, however, fixture functions have to return values in order to be cached in the request and become available to other fixtures.
That's why pytest-factoryboy defers the evaluation of post-generation declarations until the test function is called. This solves circular dependency resolution in situations like:
o->[ A ]-->[ B ]<--[ C ]-o
|                        |
o----(C depends on A)----o
On the other hand, deferring the evaluation of post-generation declarations makes their results unavailable during the generation of objects that are not part of the circular dependency but do rely on the post-generation action. It is possible to declare such dependencies so that they are evaluated earlier, right before the requested object is generated.
import factory
from pytest_factoryboy import register


class Foo(object):
    def __init__(self, value):
        self.value = value


class Bar(object):
    def __init__(self, foo):
        self.foo = foo


@register
class FooFactory(factory.Factory):
    """Foo factory."""

    class Meta:
        model = Foo

    value = 0

    @factory.post_generation
    def set1(foo, create, value, **kwargs):
        foo.value = 1


class BarFactory(factory.Factory):
    """Bar factory."""

    foo = factory.SubFactory(FooFactory)

    @classmethod
    def _create(cls, model_class, foo):
        assert foo.value == 1  # Assert that set1 is evaluated before object generation
        return super(BarFactory, cls)._create(model_class, foo=foo)

    class Meta:
        model = Bar


# Forces 'set1' to be evaluated first.
register(
    BarFactory,
    'bar',
    _postgen_dependencies=["foo__set1"],
)


def test_depends_on_set1(bar):
    """Test that post-generation hooks are done and the value is 1."""
    assert bar.foo.value == 1
All post-generation/RelatedFactory attributes specified in the _postgen_dependencies list during factory registration are evaluated before the object generation.
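The difference between deferred and early evaluation can be sketched with a plain queue of pending actions. This is a simplified illustration of the idea, not the plugin's internals: deferred, make_foo, make_bar and evaluate are hypothetical names, and a dictionary stands in for the Bar instance.

```python
class Foo:
    def __init__(self):
        self.value = 0


deferred = []  # post-generation actions waiting to run


def make_foo():
    """Create a Foo and queue its post-generation action instead of running it."""
    foo = Foo()
    deferred.append(("foo__set1", lambda: setattr(foo, "value", 1)))
    return foo


def evaluate(names=None):
    """Run queued actions: all of them if names is None, else only the listed ones."""
    remaining = []
    for name, action in deferred:
        if names is None or name in names:
            action()
        else:
            remaining.append((name, action))
    deferred[:] = remaining


def make_bar(foo, postgen_dependencies=()):
    """Create a Bar-like record, forcing the listed actions to run first."""
    evaluate(set(postgen_dependencies))
    return {"foo": foo, "value_seen_at_creation": foo.value}


# Declared dependency: set1 runs before the dependent object is generated.
early_foo = make_foo()
early_bar = make_bar(early_foo, postgen_dependencies=["foo__set1"])
assert early_bar["value_seen_at_creation"] == 1

# No declared dependency: the action stays deferred until just before the test.
late_foo = make_foo()
late_bar = make_bar(late_foo)
assert late_bar["value_seen_at_creation"] == 0
evaluate()  # what happens right before the test function is called
assert late_foo.value == 1
```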
If you are using an ORM, SQLAlchemy is especially good with post-generation actions. It is as lazy as possible and doesn't require objects to be bound through constructors, or primary keys to be generated first. SQLAlchemy is your friend here.
Hooks
pytest-factoryboy exposes several pytest hooks which can be helpful, e.g. for controlling database transactions or for reporting:
- pytest_factoryboy_done(request) - Called after all factory based fixtures and their post-generation actions have been evaluated.
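As an illustration, a conftest.py could implement the hook to flush a database session once everything has been generated. This is a sketch under assumptions: "db_session" is a hypothetical fixture name and its flush method is assumed to exist on whatever session object your project uses. Only the hook name and its request argument come from the list above.

```python
# conftest.py (sketch)


def pytest_factoryboy_done(request):
    """Flush pending changes after all factory fixtures and post-generation actions ran."""
    # "db_session" is a hypothetical fixture; skip tests that don't use it.
    if "db_session" in request.fixturenames:
        session = request.getfixturevalue("db_session")
        session.flush()
```

Checking request.fixturenames first keeps the hook from creating a session for tests that never asked for one.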
Conclusion
As pytest helps you write better programs, pytest-factoryboy helps you write better tests. Conventions and limitations allow you to focus on the test exercise and assertions rather than implementation of complicated test setups.
It is easy to parametrize particular low-level attributes of your models, but it is also easy to identify higher-level flags that apply all the side effects for certain workflows in your system.
In both cases there is no need to navigate through files and folders looking for specific (inherited) fixture implementations. It is enough to take a look at the factory classes to get an idea of what to expect, and in most cases it is easy to guess the fixture name by combining the model name and the attribute name that you know by heart.
It is also a great help in behavioral tests, where it lets you introduce homogeneous given steps and avoid fixture flavors. Since you are not implementing complete fixtures manually, flavors just don't exist.
Given steps simply specify attribute values for the models that depend on them, or apply some mutations after the instance is created. And again, you operate only on the attributes declared at the factory level, which define your data or dispatch the workflow of your system.
Future
Is there more room to automate? Yes.
Python is an amazing dynamic language with a lot of introspection possibilities. Most of the good libraries provide tools for introspection.
With SQLAlchemy, if you are decorating your column types with meaningful types, for example the Email, Password and Address types from SQLAlchemy-Utils, you could generate base classes for your factories and let faker provide realistic, human-readable values.
There's a project I want to continue working on called Alchemyboy, which allows generating base classes for factories from SQLAlchemy models.
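The general idea of deriving default values from type information can be sketched with the standard library alone. Everything here is a stand-in: a plain dataclass plays the role of a SQLAlchemy model, the Email type and the PROVIDERS mapping are hypothetical, and fixed strings replace the values faker would generate. It is not how Alchemyboy works.

```python
from dataclasses import dataclass, fields
from typing import NewType

# Hypothetical "meaningful" type, standing in for a typed column.
Email = NewType("Email", str)


@dataclass
class User:
    name: str
    email: Email
    age: int


# Hypothetical providers keyed by declared type; a real setup could call faker here.
PROVIDERS = {
    str: lambda: "John Smith",
    Email: lambda: "john.smith@example.com",
    int: lambda: 42,
}


def build(model, **overrides):
    """Create a model instance, filling every field from the provider for its type."""
    values = {field.name: PROVIDERS[field.type]() for field in fields(model)}
    values.update(overrides)
    return model(**values)


user = build(User, name="Jane Doe")

assert user.name == "Jane Doe"
assert user.email == "john.smith@example.com"
assert user.age == 42
```

Because the email field is declared with its own type, it gets an email-shaped value rather than the generic string default, which is the benefit of meaningful types described above.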