Since software's inception, one of important aspects of software development has been data validation, it is important to validate inputs supplied into your application, so as to guarantee the data is exactly as expected.
Around 2018, Pydantic was introduced to the Python ecosystem and this data validation library receives alot of positive embrace from the community, since then Pydantic has proceeded to been the most widely used data validation library for Python, used by leading industeries and packages. Who is using Pydantic?.
Still needs more convincing why you should use Pydantic? Read here
All the code blocks can be copied and used directly (they are actually tested Python files).
A simple example to show Pydantic.
from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel
class User(BaseModel):
id: int
name: str = "John Doe"
signup_ts: Optional[datetime] = None
friends: List[int] = []
external_data = {
"id": "123",
"signup_ts": "2017-06-01 12:22",
"friends": [1, "2", b"3"],
}
user = User(**external_data)
print(user)
# > User id=123 name='John Doe' signup_ts=datetime.datetime(2017, 6, 1, 12, 22) friends=[1, 2, 3]
print(user.id)
# > 123
As seen in the example above, Pydantic validates the supplied input external_data
against our User model structure and ensures the supplied input conforms to our expected 'data'.
Pydantic Types
Pydantic supports many common types from the Python standard library Common Types, also it support stricter processing of this common types Strict Types.
Pydantic also includes some custom types (e.g. to require a positive int). Pydantic Types.
Now to the purpose of this post, let look at how we can utilize Pydantic validation in more complex way.
Alongside the above types mentioned, you can also define your own custom data types. There are several ways to achieve it.
Composing types via Annotated
Pydantic takes advantage of Annotated
introduced in PEP 593 to allow us to create types that are identical to the original type as far as type checkers are concerned, but add validation, serialize differently, etc. read more
import keyword
from typing import Annotated
from pydantic import BaseModel, AfterValidator, Field, ValidationError
def validate_name(name: str):
if keyword.iskeyword(name):
raise ValueError(
f"{name} is not a valid name, please make sure it is not a python keyword."
)
return name
# A custom type that ensure that a string is not a keyword.
Identifier = Annotated[str, AfterValidator(validate_name)]
# A custom type `PositiveInt` to validate that supplied input is a positive integer
PositiveInt = Annotated[int, Field(gt=0)]
class Person(BaseModel):
name: Identifier
age: PositiveInt
external_data = {"name": "John", "age": 18}
person = Person(**external_data)
print(person)
# > User name='John' age=18
try:
wrong_data = {"name": "class", "age": -18}
person = Person(**wrong_data)
print(person)
except ValidationError as exc:
print(exc)
"""
2 validation errors for Person
name
Value error, class is not a valid name, please make sure it is not a python keyword. [type=value_error, input_value='class', input_type=str]
age
Input should be greater than 0 [type=greater_than, input_value=-18,
"""
Customizing validation with __get_pydantic_core_schema__
To do more extensive customization of how Pydantic handles custom classes, you can implement a special __get_pydantic_core_schema__
to tell Pydantic how to generate the pydantic-core schema.
We will be using the __get_pydantic_core_schema__
approach to create a more granular Pydantic custom type.
As an example we will be writing a Pydantic custom type validator called DependsOn
; the purpose of this custom type is to have a field whose value 'depends on' on the value of another field meeting specified condition.
from dataclasses import dataclass
from typing import Annotated, Any, Callable
from pydantic import (
BaseModel,
GetCoreSchemaHandler,
ValidationError,
ValidationInfo,
)
from pydantic_core import core_schema
import inspect
# The frozen=True specification makes DependsOnValidator hashable.
# Without this, a union on the custom type such as X | None will raise an error.
@dataclass(frozen=True)
class DependsOnValidator:
"""Custom type that let you define a field that depends on another field."""
depends_on: str
depends_on_conditon: Callable[[Any], bool]
value_conditon: Callable[[Any], bool]
def validate(self, value, info: ValidationInfo):
if self.depends_on not in info.data:
raise ValueError(
f"{info.field_name} is only allowed in model with {self.depends_on}"
)
if self.value_conditon(value) is True:
if self.depends_on_conditon(info.data[self.depends_on]) is True:
return value
else:
raise ValueError(
f"{info.field_name} is only allowed when {self.depends_on} pass condition \
`{inspect.getsource(self.depends_on_conditon)}`"
)
return value
def __get_pydantic_core_schema__(
self, source_type: Any, handler: GetCoreSchemaHandler
) -> core_schema.CoreSchema:
return core_schema.with_info_after_validator_function(
self.validate, handler(source_type), field_name=handler.field_name
)
# Example using our custom validation
class Person(BaseModel):
name: str
age: int
is_adult: Annotated[
bool,
DependsOnValidator(
depends_on="age",
depends_on_conditon=lambda age: age >= 18,
value_conditon=lambda v: v is True,
),
]
can_drink: Annotated[
bool,
DependsOnValidator(
depends_on="is_adult",
depends_on_conditon=lambda v: v is True,
value_conditon=lambda v: v is True,
),
]
# The above schema defines that
# - `is_adult` is only allowed to be `True` if age is set and age is greater than 18
# - `can_drink` is only allowed to be `True` if `is_adult` is True
# This gives room to create a chain depends on properties `can_drink` depends on `is_adult` which in itself depends on `age`
person = Person(name="John Doe", age=18, is_adult=True, can_drink=True) # Correct
print(person)
# > name='John Doe' age=18 is_adult=True can_drink=True
try:
Person(
name="John Doe", age=12, is_adult=True, can_drink=True
) # `is_adult` fails which inturn makes `can_drink` fail
except ValidationError as exc:
print(exc)
"""
2 validation errors for Person
is_adult
Value error, is_adult is only allowed when age pass condition `depends_on_conditon=lambda age: age >= 18,`
[type=value_error, input_value=True, input_type=bool]
can_drink
Value error, can_drink is only allowed in model with is_adult [type=value_error, input_value=True, input_type=bool]
"""
try:
Person(
name="John Doe", age=18, is_adult=False, can_drink=True
) # `can_drink` fails because it depends on `is_adult` to be True
except ValidationError as exc:
print(exc)
"""
1 validation error for Person
can_drink
Value error, can_drink is only allowed when is_adult pass condition `depends_on_conditon=lambda v: v is True,`
[type=value_error, input_value=True, input_type=bool]
"""