
Exploring Python jsonschema

JSON Schema is a powerful and standardized way to describe and validate the structure and content of JSON data. It serves as a blueprint for defining the expected format, data types, and validation rules for JSON documents.

A schema is itself written in JSON. Its keywords let you specify various aspects of your data model, such as:

  • Required and optional properties
  • Data types (strings, numbers, booleans, arrays, objects, etc.)
  • Minimum and maximum values for numbers
  • Regular expressions for string patterns
  • Enumerated values
  • Nested structures and objects
  • Relationships between properties
  • Custom validation rules

JSON Schema key features:

  • Validation: JSON Schema enables you to validate JSON data against a predefined schema.
  • Documentation: JSON Schema serves as self-documentation for your data structures. By looking at the schema, developers can understand the expected properties, their types, and their relationships.
  • Open Standards: JSON Schema is an open standard, which means it's widely adopted and supported by various programming languages and tools. This makes it a valuable tool for data validation and interoperability.

Warning

This material requires the following dependency and was tested with the version shown:

jsonschema==4.19.0

Defining JSON Schemas

In JSON Schema, a schema itself is a JSON object that describes the structure and constraints of another JSON document. Let's explore the key components of defining JSON schemas.

Basic Structure of a JSON Schema

A JSON Schema is a JSON object that contains various keywords to define validation rules. It is typically organized using the following structure:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "vlans": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "id": {
            "type": "integer"
          },
          "name": {
            "type": "string"
          }
        },
        "required": ["id"]
      }
    }
  },
  "required": ["vlans"]
}
  • $schema: Specifies the version of the JSON Schema specification being used.
  • type: Specifies the type of the JSON value. It can be "object", "array", "string", "number", "integer", "boolean", or "null".
  • properties: Contains a schema for each property of the JSON object. Each property schema can have its own validation rules, such as data type, minimum and maximum values, and more.
  • required: Lists the properties that must be present in the JSON object. Properties not listed here are optional.
  • items: Gives the schema that each element of an array must match, as used for the vlans array above.
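
Because a schema is just JSON, it can contain mistakes of its own, such as a misspelled type. The sketch below uses the check_schema class method of jsonschema's Draft7Validator to validate a schema against the draft-07 rules before using it. The bad_schema value is a made-up example for illustration.

```python
from jsonschema import Draft7Validator
from jsonschema.exceptions import SchemaError

schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "vlans": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                },
                "required": ["id"],
            },
        }
    },
    "required": ["vlans"],
}

# Raises SchemaError if the schema itself is malformed; returns None otherwise.
Draft7Validator.check_schema(schema)

# Hypothetical broken schema: "integr" is not a valid JSON Schema type.
bad_schema = {"type": "integr"}

try:
    Draft7Validator.check_schema(bad_schema)
    schema_ok = True
except SchemaError as err:
    schema_ok = False
    print(f"Invalid schema: {err.message}")
```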

Defining Schema for Different Data Types

JSON Schema allows you to define schemas for various data types:

  • Strings: Use the type keyword with the value "string." You can add constraints like minLength, maxLength, pattern, and format for validation.
  • Numbers: Use the type keyword with the value "number" for any numeric value, or "integer" for whole numbers only. You can set minimum and maximum values, specify multipleOf, and more.
  • Booleans: Use the type keyword with the value "boolean."
  • Arrays: Use the type keyword with the value "array." You can define the items schema for individual elements and use minItems, maxItems, and uniqueItems for validation.
  • Objects: Use the type keyword with the value "object." Define the properties for each property of the object and use required to specify mandatory properties.
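
As a rough illustration of the constraints listed above, here is a sketch of a schema for a single VLAN. The shutdown and ports properties, and the specific limits, are assumptions made up for this example and are not part of the training data.

```python
from jsonschema import Draft7Validator

vlan_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        # Integer with a numeric range
        "id": {"type": "integer", "minimum": 1, "maximum": 4094},
        # String with length limits
        "name": {"type": "string", "minLength": 1, "maxLength": 32},
        # Boolean (hypothetical property)
        "shutdown": {"type": "boolean"},
        # Array of unique strings (hypothetical property)
        "ports": {
            "type": "array",
            "items": {"type": "string"},
            "uniqueItems": True,
        },
    },
    "required": ["id"],
}

validator = Draft7Validator(vlan_schema)

# is_valid returns True or False instead of raising an exception.
print(validator.is_valid({"id": 100, "name": "SERVER", "ports": ["Gi1", "Gi2"]}))  # True
print(validator.is_valid({"id": 5000}))  # False: above maximum
print(validator.is_valid({"id": 100, "ports": ["Gi1", "Gi1"]}))  # False: duplicate items
```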

Defining JSON Schemas - LAB

Define the schema for the following data:

[
  {
    "name": "GigabitEthernet1",
    "mode": "access",
    "speed": "auto"
  },
  {
    "name": "GigabitEthernet2",
    "mode": "trunk",
    "speed": "1000"
  }
]

Validating JSON Data

Before validating JSON data, you need both the JSON Schema and the data available as Python objects. You can read them from JSON files with Python's built-in json module; to keep things short, the example below defines them inline as dictionaries.

from jsonschema import validate

schema = {
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "vlans": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "id": {
            "type": "integer"
          },
          "name": {
            "type": "string"
          }
        },
        "required": ["id"]
      }
    }
  },
  "required": ["vlans"]
}

data = {
  "vlans": [
    {
      "id": 100,
      "name": "SERVER"
    },
    {
      "id": 200,
      "name": "OTHER"
    }
  ]
}

validate(data, schema)

If the JSON data adheres to the schema, validate returns None and no exception is raised. Otherwise, it raises a jsonschema.exceptions.ValidationError that carries details about the validation error.
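
When the schema and data live in files, the json module can load them first. The following is a minimal sketch: the load_json helper and the file names vlans.schema.json and vlans.json are assumptions for illustration, and the files are written to a temporary directory only so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path

from jsonschema import validate


def load_json(path):
    """Read a JSON file and return its content as Python data."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


schema_text = """
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "vlans": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
        "required": ["id"]
      }
    }
  },
  "required": ["vlans"]
}
"""
data_text = '{"vlans": [{"id": 100, "name": "SERVER"}, {"id": 200, "name": "OTHER"}]}'

# Create the two files; in practice they would already exist on disk.
with tempfile.TemporaryDirectory() as workdir:
    schema_path = Path(workdir) / "vlans.schema.json"
    data_path = Path(workdir) / "vlans.json"
    schema_path.write_text(schema_text, encoding="utf-8")
    data_path.write_text(data_text, encoding="utf-8")

    schema = load_json(schema_path)
    data = load_json(data_path)

# Raises ValidationError if the data does not match the schema.
validate(instance=data, schema=schema)
print("vlans.json matches the schema")
```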

Handling Validation Errors

from jsonschema import validate
from jsonschema.exceptions import ValidationError

data = {
  "vlans": [
    {
      "id": 100,
      "name": "SERVER"
    },
    {
      "id": "200",
      "name": "OTHER"
    }
  ]
}

try:
    # The second VLAN has a string id, so this raises ValidationError.
    validate(data, schema)
except ValidationError as err:
    print("There was an error with your data")
    raise

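This example only reports that validation failed and then re-raises the exception. You can follow the same pattern to provide more context, because the caught ValidationError carries attributes such as message, path, and validator.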

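Note: the validate function does not take an error-handler callback. It raises a single ValidationError, the one the library judges most relevant. To see every error, create a validator such as Draft7Validator(schema) and loop over its iter_errors(data) method.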

Validating JSON Data - LAB

Using the schema you built in the previous lab, validate the lab data against it.

Formats

JSON Schema predefines a series of formats that you can apply with the format keyword to quickly enforce common string shapes:

  • date
  • time
  • date-time
  • duration
  • regex
  • email
  • idn-email
  • hostname
  • idn-hostname
  • ipv4
  • ipv6
  • json-pointer
  • relative-json-pointer
  • uri
  • uri-reference
  • uri-template
  • iri
  • iri-reference
  • uuid

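Note: the regex format checks that the value is itself a valid regular expression; it does not match the value against a pattern. To enforce a pattern, use the pattern keyword, as in the hostname property below.

Note: jsonschema does not enforce the format keyword unless you pass a format_checker (for example FormatChecker()) to validate or to the validator class. Some formats also need optional extra packages to be checked. The duration and uuid formats were added to the specification after draft-07.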

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "hostname": {
      "type": "string",
      "pattern": "^\\w{3}-\\w{2}\\d{2}$"
    },
    "vendor": {
      "type": "string",
      "enum": ["cisco", "arista", "juniper"]
    },
    "mgmt_ip": {
      "type": "string",
      "format": "ipv4"
    }
  },
  "required": ["hostname", "vendor", "mgmt_ip"]
}
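
With jsonschema, format is only enforced when a format checker is supplied. The sketch below validates the schema above twice, once without and once with FormatChecker, to show the difference. The device dictionary, with its deliberately invalid IP address, is an assumption for illustration.

```python
from jsonschema import FormatChecker, validate
from jsonschema.exceptions import ValidationError

schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "hostname": {"type": "string", "pattern": "^\\w{3}-\\w{2}\\d{2}$"},
        "vendor": {"type": "string", "enum": ["cisco", "arista", "juniper"]},
        "mgmt_ip": {"type": "string", "format": "ipv4"},
    },
    "required": ["hostname", "vendor", "mgmt_ip"],
}

# Hypothetical device: the last octet of mgmt_ip is out of range.
device = {"hostname": "nyc-rt01", "vendor": "cisco", "mgmt_ip": "192.168.1.300"}

# Without a format checker, "format" is not enforced, so this passes.
validate(device, schema)

try:
    # With a format checker, the invalid IPv4 address is rejected.
    validate(device, schema, format_checker=FormatChecker())
    format_enforced = False
except ValidationError as err:
    format_enforced = True
    print(err.message)
```

The pattern and enum keywords are always enforced, with or without a format checker.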

Formats - LAB

Develop the schema to enforce this data and validate in Python:

{
  "hostname": "nyc-rt01",
  "vendor": "cisco",
  "mgmt_ip": "192.168.1.1",
  "purchase_date": "2023-01-01",
  "nautobot_url": "https://demo.nautobot.com/dcim/devices/fe60c60f-756f-4e18-8b50-a847862e4841",
  "uuid": "fe60c60f-756f-4e18-8b50-a847862e4841"
}

Advanced Techniques

Although they are beyond the scope of this training, it is good to be aware of some advanced constructs. Here is a brief summary of them:

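  • Custom Types - you can extend validation with custom types, written in Python.
  • propertyNames - a schema that every property name (key) of an object must be valid against.
  • $ref - a reference to another schema, either elsewhere in the same document or in an external one.
  • allOf, anyOf, oneOf - combine subschemas: the data must match all, at least one, or exactly one of them, respectively.
  • if, then, else - conditional logic: when the data matches the if subschema, the then subschema applies; otherwise the else subschema applies.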
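
To give a feel for how a few of these keywords combine, here is a small sketch of an interface schema that uses $ref, propertyNames, and if/then/else. The access_vlan and allowed_vlans properties and the vlan_id definition are hypothetical names chosen for this example.

```python
from jsonschema import Draft7Validator

interface_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    # Reusable subschema, referenced below with $ref.
    "definitions": {
        "vlan_id": {"type": "integer", "minimum": 1, "maximum": 4094},
    },
    "type": "object",
    # Every key must be lowercase letters or underscores.
    "propertyNames": {"pattern": "^[a-z_]+$"},
    "properties": {
        "name": {"type": "string"},
        "mode": {"enum": ["access", "trunk"]},
        "access_vlan": {"$ref": "#/definitions/vlan_id"},
        "allowed_vlans": {
            "type": "array",
            "items": {"$ref": "#/definitions/vlan_id"},
        },
    },
    "required": ["name", "mode"],
    # Trunk ports need allowed_vlans; all other ports need access_vlan.
    "if": {"properties": {"mode": {"const": "trunk"}}},
    "then": {"required": ["allowed_vlans"]},
    "else": {"required": ["access_vlan"]},
}

validator = Draft7Validator(interface_schema)

access_port = {"name": "GigabitEthernet1", "mode": "access", "access_vlan": 100}
trunk_without_vlans = {"name": "GigabitEthernet2", "mode": "trunk"}
bad_key = {"name": "GigabitEthernet3", "mode": "access", "access_vlan": 10, "Speed": "auto"}

print(validator.is_valid(access_port))          # True
print(validator.is_valid(trunk_without_vlans))  # False: allowed_vlans is missing
print(validator.is_valid(bad_key))              # False: "Speed" breaks propertyNames
```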