Attempts at immutability with dataclasses in Python
Disclaimer: This article was originally written on Jan 3 / 2023 and wasn’t published before.
As you may know, Python follows a naming convention borrowed from C: constants are written in all uppercase letters. So if you see a piece of code like this, you can safely assume it defines a constant.
MAX_AGE: int = 60
C has at least two ways to define constants: the const keyword and the #define preprocessor directive. Any attempt to reassign a const-qualified variable results in a compilation error.
Unfortunately, Python, being a dynamic language, can’t guarantee that another module importing the constant won’t shadow or redefine it later. Uppercase identifiers are a convention of sorts, but a convention doesn’t protect against mistakes and side effects.
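To make the point concrete, here is a minimal sketch of how easily an uppercase name can be rebound from outside. The settings module object is a hypothetical stand-in built with types.ModuleType so that the snippet stays self-contained; a module loaded with a regular import behaves the same way.

```python
import types

# Hypothetical stand-in for a module that defines a constant.
settings = types.ModuleType("settings")
settings.MAX_AGE = 60

# "Client" code: nothing stops it from rebinding the name on the module.
settings.MAX_AGE = 99

print(settings.MAX_AGE)  # 99
```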
Old style consts
Over the course of Python’s evolution, the most practical approach for having fixed values was a code snippet from the Python Cookbook, 2nd edition [1]. I made some slight changes to adapt it to Python 3.10.
import sys
from typing import Any

class _const:
    class ConstError(TypeError):
        pass

    def __setattr__(self, name: str, value: Any) -> None:
        # Only the first binding of a name is allowed.
        if name in self.__dict__:
            raise self.ConstError(f"Cannot rebind const {name}")
        self.__dict__[name] = value

    def __delattr__(self, name: str) -> None:
        if name in self.__dict__:
            raise self.ConstError(f"Cannot unbind const {name}")
        raise NameError(name)

# Replace this module's entry in sys.modules with a _const instance.
sys.modules[__name__] = _const()
This approach relies on the fact that the Python interpreter doesn’t check that entries in sys.modules are real module objects. In the era before optional typing, this trick was one of the few ways to ensure client code would not rebind a name.
But if you’re working in VSCode with Pylance set to basic type checking, an error message is generated:
Argument of type "_const" cannot be assigned to parameter "v" of type "ModuleType" in function "__setitem__" "_const" is incompatible with "ModuleType"
which makes this approach a dirty hack.
New style consts
Python 3.7 introduced the ability to define a module-level __getattr__ [2].
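# const.py
from typing import Any

def __getattr__(name: str) -> Any:
    match name:
        case "FOO":
            return "BAR"
        case "BAZ":
            return "QUX"
        case _:
            # Unknown names must raise AttributeError, otherwise every lookup returns None.
            raise AttributeError(f"module {__name__!r} has no attribute {name!r}")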
Alas, a module-level __setattr__ was not introduced, so rebinding remains easy.
# const_test.py
from const import FOO, BAZ
FOO # "BAR"
BAZ = 100
BAZ # 100
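The same gap shows up when the assignment targets the module object itself. The following is a sketch under assumptions: const is a hypothetical module object built with types.ModuleType rather than a real const.py, and the hook uses a plain if so that it does not depend on match. It relies on the fact that the module-level __getattr__ is only consulted when normal lookup fails.

```python
import types
from typing import Any

const = types.ModuleType("const")  # hypothetical stand-in for const.py


def _module_getattr(name: str) -> Any:
    if name == "FOO":
        return "BAR"
    raise AttributeError(f"module 'const' has no attribute {name!r}")


# The hook is looked up in the module's own namespace.
const.__getattr__ = _module_getattr

first = const.FOO        # resolved by __getattr__
const.FOO = 100          # plain assignment, no hook intercepts it
second = const.FOO       # the real attribute now wins over __getattr__

print(first, second)  # BAR 100
```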
So what is the modern Pythonic approach to define consts then?
The most flexible way that I came up with is actually to use dataclasses.
# const.py
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:
    MAX_AGE: int = 60
    MAX_RETRIES: int = 5
    SERVICE_URL: str = "https://example.com"

const = Const()
# const_test.py
from const import const
const.MAX_RETRIES # 5
const.SERVICE_URL = "http://localhost:8080" # raises dataclasses.FrozenInstanceError: cannot assign to field 'SERVICE_URL'
del const.MAX_RETRIES # raises dataclasses.FrozenInstanceError: cannot delete field 'MAX_RETRIES'
The frozen [3] argument emulates immutability by generating __setattr__ and __delattr__ methods that raise FrozenInstanceError as soon as client code tries to assign a new value or delete the name. This is the same behavior that the Python Cookbook recipe provided.
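The word "emulates" matters here. As a minimal sketch (the Const class below is a cut-down, hypothetical version of the one above), the generated methods only guard ordinary attribute assignment; calling the base implementation directly still goes through.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Const:
    MAX_AGE: int = 60


const = Const()

try:
    const.MAX_AGE = 61              # goes through the generated __setattr__
    blocked = False
except FrozenInstanceError:
    blocked = True

# Bypasses the generated method by calling the base implementation directly.
object.__setattr__(const, "MAX_AGE", 61)

print(blocked, const.MAX_AGE)  # True 61
```

This is not something client code does by accident, so for protecting constants against honest mistakes the frozen flag is still useful.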
Dataclass consts and inheritance
While refactoring a const.py module that had 100+ identifiers declared, my first temptation was to split them into separate const modules. I would group them by prefix or purpose and name the modules according to their role in the project.
Then I realized that arranging consts in separate modules would bring little benefit to the codebase beyond adding more modules and increasing import clutter. Given that a class is a namespace of its own, I decided to try a different approach.
# const.py
from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class ConstBase:
    pass

class MaxConst(ConstBase):
    MAX_AGE: int = 60
    MAX_RETRIES: int = 5

class ServiceConst(ConstBase):
    SERVICE_URL: str = "https://example.com"
    SERVICE_DC: str = "US"
    SERVICE_PORT: int = 8081
This looks much more flexible and scalable in case service-specific constant hierarchies are required.
# const.py
class ServiceConst(ConstBase):
    SERVICE_URL: str = "https://example.com"
    SERVICE_DC: str = "US"
    SERVICE_PORT: int = 8081

class ServiceConstEU(ServiceConst):
    SERVICE_URL: str = "https://eu.example.com"
    SERVICE_DC: str = "DE"
    GDPR_URL: str = "https://eu.gdpr.example.com"

serviceConstEU = ServiceConstEU()
serviceConstEU.GDPR_URL # "https://eu.gdpr.example.com"
serviceConstEU.GDPR_URL = "fake" # raises TypeError
Alas, frozen=True only guards against modification of instance attributes. Class attributes are still mutable.
ServiceConstEU.GDPR_URL = "bogus url" # no Exception gets raised
ServiceConstEU.GDPR_URL # "bogus url"
serviceConstEU.GDPR_URL # "bogus url"
This compromises the whole effort! I want class-level and instance-level protection for my attributes, and on top of that the flexibility to redefine attribute values along the inheritance chain and to add new ones when needed.
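The behavior follows from ordinary attribute lookup rather than from anything dataclass-specific. Here is a minimal sketch, assuming a plain frozen dataclass without slots=True and a hypothetical undecorated subclass: the generated __setattr__ only intercepts assignments made through instances, while assignments on the class itself are handled by type.__setattr__. Because the subclass is not decorated, GDPR_URL is a plain class attribute, so instances see whatever the class currently holds.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class ServiceConst:
    SERVICE_PORT: int = 8081


class ServiceConstEU(ServiceConst):     # not decorated: GDPR_URL is a plain class attribute
    GDPR_URL: str = "https://eu.gdpr.example.com"


eu = ServiceConstEU()

try:
    eu.SERVICE_PORT = 9000              # instance assignment to a field: blocked
    instance_blocked = False
except FrozenInstanceError:
    instance_blocked = True

ServiceConstEU.GDPR_URL = "bogus url"   # class assignment: handled by type.__setattr__

print(instance_blocked)   # True
print(eu.GDPR_URL)        # bogus url (looked up on the class)
```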
Overcoming the limitation
My exploration led me to research a couple of viable options for this: from picking Enum or namedtuple over dataclasses to metaprogramming.
Enums
Enums would look less complicated. But access happens through the value property, which looks ugly, and subclassing an enum is not allowed once it defines members [4].
from enum import Enum

class MaxConst(Enum):
    MAX_AGE: int = 60
    MAX_RETRIES: int = 5

class ServiceConst(Enum):
    SERVICE_URL: str = "https://example.com"
    SERVICE_DC: str = "US"
    SERVICE_PORT: int = 8081

ServiceConst.SERVICE_DC.value # "US"
ServiceConst.SERVICE_DC = "CZ" # raises AttributeError: cannot reassign member 'SERVICE_DC'

class ServiceConstEU(ServiceConst): # raises TypeError: <enum 'ServiceConstEU'> cannot extend <enum 'ServiceConst'>
    SERVICE_URL: str = "https://eu.example.com"
    SERVICE_DC: str = "DE"
    GDPR_URL: str = "https://eu.gdpr.example.com"
Namedtuples
Namedtuples would look almost the same asEnum
. They would support inheritance and field attribute redefinition is subclasses and would guarantee instance immutability.
But in reality namedtuple
is a misleading data structure, given that it doesn’t really enfoces full immutability.
from typing import NamedTuple

class MaxConst(NamedTuple):
    MAX_AGE: int = 60
    MAX_RETRIES: int = 5

class ServiceConst(MaxConst):
    SERVICE_URL: str = "https://example.com"
    SERVICE_DC: str = "US"
    SERVICE_PORT: int = 8081

class ServiceConstEU(ServiceConst):
    SERVICE_URL: str = "https://eu.example.com"
    SERVICE_DC: str = "DE"
    GDPR_URL: str = "https://eu.gdpr.example.com"

eu = ServiceConstEU()
eu.MAX_AGE # 60
eu.SERVICE_DC # "DE"
eu.MAX_AGE = 100 # raises AttributeError: can't set attribute
ServiceConst.SERVICE_PORT = 9000
eu.SERVICE_PORT # 9000
The value of SERVICE_PORT is 9000 instead of 8081 because of dynamic attribute lookup: the attribute is not present in ServiceConstEU.__dict__ (a mappingproxy), since it was neither copied from the parent class nor redefined in the subclass, so it is resolved on ServiceConst at access time.
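The lookup can be checked directly. The sketch below uses a trimmed-down version of the hierarchy above (fewer attributes, same class names). It shows that only the attributes declared in the NamedTuple class itself become tuple fields. Everything declared in subclasses is a plain class attribute found through the inheritance chain.

```python
from typing import NamedTuple


class MaxConst(NamedTuple):
    MAX_AGE: int = 60


class ServiceConst(MaxConst):
    SERVICE_PORT: int = 8081    # plain class attribute, not a tuple field


class ServiceConstEU(ServiceConst):
    SERVICE_DC: str = "DE"


eu = ServiceConstEU()

print(eu._fields)                               # ('MAX_AGE',)
print("SERVICE_PORT" in vars(ServiceConstEU))   # False
print("SERVICE_PORT" in vars(ServiceConst))     # True

ServiceConst.SERVICE_PORT = 9000
print(eu.SERVICE_PORT)                          # 9000

# Subclasses without __slots__ get an instance __dict__, so this is accepted too.
eu.SERVICE_DC = "FR"
print(eu.SERVICE_DC)                            # FR
```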
Metaclasses and comprehensive immutability
To prevent class attribute modification, a metaclass can be used. The first thing that comes to mind is to override __setattr__ at the class creation (metaclass) level and combine it with the dataclass decorator. It’s worth mentioning that a naïve use of this approach blocks the dataclass from getting built.
Under the hood, the dataclass decorator not only generates dunder [5] methods but also sets internal attributes such as __dataclass_fields__ and __dataclass_params__. The latter is set on the decorated class to an instance of _DataclassParams that carries the decorator’s arguments.
So this
from typing import Any, Type
from dataclasses import dataclass

class ConstMeta(type):
    def __setattr__(cls: Type[Any],
                    key: str,
                    value: Any) -> None:
        raise AttributeError(f"Cannot rebind const '{key}'")

@dataclass(frozen=True, slots=True)
class ConstBase(metaclass=ConstMeta): # AttributeError: Cannot rebind const '__dataclass_params__'
    pass

class MaxConst(ConstBase):
    MAX_AGE: int = 60
    MAX_RETRIES: int = 5
would raise as soon as the dataclass decorator starts bootstrapping ConstBase.
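To see what the decorator actually tries to assign, one option is a metaclass that records class-level assignments instead of rejecting them. This is only an illustrative sketch: RecordingMeta and the recorded list are hypothetical names, and the exact set of recorded names depends on the Python version and the decorator arguments.

```python
from dataclasses import dataclass
from typing import Any, List

recorded: List[str] = []


class RecordingMeta(type):
    """Hypothetical metaclass that logs every class-level assignment."""

    def __setattr__(cls, key: str, value: Any) -> None:
        recorded.append(key)
        super().__setattr__(key, value)


@dataclass(frozen=True)
class Const(metaclass=RecordingMeta):
    MAX_AGE: int = 60


print(recorded)                           # names assigned by the decorator
print(Const.__dataclass_params__.frozen)  # True
print(list(Const.__dataclass_fields__))   # ['MAX_AGE']
```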
OK, the correct solution here may be to keep an allowlist of the attributes required for successful dataclass creation and only then enforce class-level attribute immutability.
One way is to let the dataclass run its bootstrapping by allowlisting dunder names. The other is to “lock” the dataclasses at the metaclass level once they are fully bootstrapped.
While dynamically checking for dunder methods is acceptable in the CPython runtime itself [6], this approach is much more lenient about side effects in client code. Locking the class after it is completely bootstrapped would require a lot of maintenance that is hard to justify.
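A rough sketch of the first option, the dunder allowlist, might look like this. It is not the solution I ended up with: AllowDunderMeta is a hypothetical name, slots=True is left out to keep the example small, and the sketch assumes the decorator only assigns dunder names for fields with plain default values.

```python
from dataclasses import dataclass
from typing import Any


class AllowDunderMeta(type):
    """Hypothetical metaclass: only dunder names may be (re)bound on the class."""

    def __setattr__(cls, key: str, value: Any) -> None:
        if key.startswith("__") and key.endswith("__"):
            super().__setattr__(key, value)   # lets @dataclass finish its setup
            return
        raise AttributeError(f"Cannot rebind const '{key}'")


@dataclass(frozen=True)
class MaxConst(metaclass=AllowDunderMeta):
    MAX_AGE: int = 60


try:
    MaxConst.MAX_AGE = 100
    rebound = True
except AttributeError:
    rebound = False

MaxConst.__hack__ = "still possible"  # the leniency mentioned above

print(rebound, MaxConst().MAX_AGE)  # False 60
```

The last assignment is exactly the kind of side effect that makes this variant too permissive for my taste.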
Achieving full immutability
So, the best solution I came up with for the requirements I set myself is this.
from typing import Any, Type

class ConstBaseMeta(type):
    def __setattr__(cls: Type['ConstBase'], key: str, value: Any) -> None:
        if key in cls.__dict__:
            raise TypeError(f"Cannot modify existing class attribute '{key}' in {cls.__name__}")
        raise TypeError(f"Cannot add new class attribute '{key}' in {cls.__name__}")

class ConstBase(metaclass=ConstBaseMeta):
    __slots__ = ()

    def __setattr__(self, key: str, value: Any) -> None:
        raise TypeError(f"Cannot modify instance attribute '{key}' in {self.__class__.__name__}")

class MaxConst(ConstBase):
    MAX_AGE: int = 60
    MAX_RETRIES: int = 5

class ServiceConst(MaxConst):
    SERVICE_URL: str = "https://example.com"
    SERVICE_DC: str = "US"
    SERVICE_PORT: int = 8081

class ServiceConstEU(ServiceConst):
    MAX_AGE: int = 80
    SERVICE_URL: str = "https://eu.example.com"
    SERVICE_DC: str = "DE"
    SERVICE_PORT: int = 8083
    GDPR: bool = True

class ServiceConstEUEast(ServiceConstEU):
    SERVICE_DC: str = "CZ"

eu = ServiceConstEU()
ServiceConstEU.GDPR = False # TypeError: Cannot modify existing class attribute 'GDPR' in ServiceConstEU
eu.MAX_AGE # 80
eu.MAX_AGE = 100 # TypeError: Cannot modify instance attribute 'MAX_AGE' in ServiceConstEU
MaxConst.MAX_RETRIES = 10 # TypeError: Cannot modify existing class attribute 'MAX_RETRIES' in MaxConst
ServiceConst.MAX_RETRIES # 5
east = ServiceConstEUEast()
east.SERVICE_DC # "CZ"
ServiceConstEUEast.__dict__["HACK"] = True # TypeError: 'mappingproxy' object does not support item assignment
With __setattr__ overridden at the metaclass level, attributes of the classes derived from ConstBase can be neither modified nor added.
The __setattr__ defined on ConstBase itself rejects every assignment made through an instance, whether it modifies an existing attribute or adds a new one, and the empty __slots__ keeps ConstBase from carrying a per-instance __dict__, thus achieving full immutability.
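One thing the code above does not cover is deletion: with only __setattr__ overridden on the metaclass, del MaxConst.MAX_AGE is still handled by the default type.__delattr__. A possible extension, shown here as a sketch that is not part of my original solution and uses simplified error messages, is to override __delattr__ on the metaclass as well.

```python
from typing import Any


class ConstBaseMeta(type):
    def __setattr__(cls, key: str, value: Any) -> None:
        raise TypeError(f"Cannot set class attribute '{key}' in {cls.__name__}")

    def __delattr__(cls, key: str) -> None:
        # Assumed addition: without it, `del MaxConst.MAX_AGE` would succeed.
        raise TypeError(f"Cannot delete class attribute '{key}' in {cls.__name__}")


class ConstBase(metaclass=ConstBaseMeta):
    __slots__ = ()

    def __setattr__(self, key: str, value: Any) -> None:
        raise TypeError(f"Cannot modify instance attribute '{key}'")


class MaxConst(ConstBase):
    MAX_AGE: int = 60


try:
    del MaxConst.MAX_AGE
    deleted = True
except TypeError:
    deleted = False

print(deleted, MaxConst.MAX_AGE)  # False 60
```

Even then, code that deliberately calls type.__setattr__(MaxConst, "MAX_AGE", 1) gets through, so this guards against mistakes rather than against determined tampering.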
In the end, it seems that no matter how many new data types and tricks the Python standard library offers for immutability, they end up either inconsistent in functionality or not extensible enough.
1. https://www.oreilly.com/library/view/python-cookbook-2nd/0596007973/
2. https://docs.python.org/3/whatsnew/3.7.html#pep-562-customization-of-access-to-module-attributes
3. https://docs.python.org/3/library/dataclasses.html#frozen-instances
4. https://docs.python.org/3/howto/enum.html#restricted-enum-subclassing
5. https://docs.python.org/3/library/dataclasses.html#module-contents
6. https://github.com/python/cpython/blob/148f32913573c29250dfb3f0d079eb8847633621/Objects/typeobject.c#L3299-L3306