Examples

For each example, the complete source code is available on GitHub in the examples directory.

Example 1 - Multiple Data Sources

This is a simple example to show how DiffSync can be used to compare and synchronize multiple data sources.

For this example, we have a shared model for Device and Interface defined in models.py, and we have 3 instances of DiffSync based on the same model but with different values (BackendA, BackendB and BackendC).

The source code for this example is on GitHub in the examples/01-multiple-data-sources/ directory.

First create and populate all 3 objects:

from backend_a import BackendA
from backend_b import BackendB
from backend_c import BackendC

# Create each

a = BackendA()
a.load()
print(a.str())

b = BackendB()
b.load()
print(b.str())

c = BackendC()
c.load()
print(c.str())

Configure verbosity of DiffSync’s structured logging to console; the default is full verbosity (all logs including debugging):

from diffsync.logging import enable_console_logging
enable_console_logging(verbosity=0)  # Show WARNING and ERROR logs only
# enable_console_logging(verbosity=1)  # Also include INFO logs
# enable_console_logging(verbosity=2)  # Also include DEBUG logs

Show the differences between A and B:

diff_a_b = a.diff_to(b)
print(diff_a_b.str())

Show the differences between B and C:

diff_b_c = c.diff_from(b)
print(diff_b_c.str())

Synchronize A and B (update B with the contents of A):

a.sync_to(b)
print(a.diff_to(b).str())
# Alternatively you can pass in the diff object from above to prevent another diff calculation
# a.sync_to(b, diff=diff_a_b)

Now A and B will show no differences:

diff_a_b = a.diff_to(b)
print(diff_a_b.str())

In the Device model, the site_name and role are not included in the _attributes, so they are not shown when we are comparing the different objects, even if the value is different.
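To illustrate why a field left out of the compared attributes never appears in a diff, here is a minimal standalone sketch. It does not use DiffSync itself: ToyDevice, compared_attributes, diff_attrs and the vendor field are hypothetical names invented for this illustration, and the comparison logic is an assumption about the general idea rather than the library's implementation.

```python
from dataclasses import dataclass
from typing import ClassVar, Dict, Tuple


@dataclass
class ToyDevice:
    """Hypothetical stand-in for a model; this is not a DiffSyncModel."""

    # Only fields named here take part in the comparison (assumed behaviour,
    # mirroring the role of _attributes described above).
    compared_attributes: ClassVar[Tuple[str, ...]] = ("vendor",)

    name: str
    vendor: str = ""
    site_name: str = ""
    role: str = ""


def diff_attrs(src: ToyDevice, dst: ToyDevice) -> Dict[str, Tuple[str, str]]:
    """Return {field: (src_value, dst_value)} for compared fields that differ."""
    return {
        field: (getattr(src, field), getattr(dst, field))
        for field in src.compared_attributes
        if getattr(src, field) != getattr(dst, field)
    }


a = ToyDevice(name="nyc-spine1", vendor="acme", site_name="nyc", role="spine")
b = ToyDevice(name="nyc-spine1", vendor="acme", site_name="sfo", role="leaf")

# site_name and role differ, but neither is a compared attribute,
# so the resulting diff is empty.
print(diff_attrs(a, b))
```

Adding "role" to compared_attributes in this sketch would make the same pair of objects report a difference.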

Example 2 - Callback Function

This example shows how you can set up DiffSync to invoke a callback function to update its status as a sync proceeds. This could be used to, for example, update a status bar (such as with the tqdm library), although here for simplicity we’ll just have the callback print directly to the console.

The source code for this example is on GitHub in the examples/02-callback-function/ directory.

from diffsync.logging import enable_console_logging
from main import DiffSync1, DiffSync2, print_callback

enable_console_logging(verbosity=0)  # Show WARNING and ERROR logs only

# Create a DiffSync1 instance and populate it with records numbered 1-100
ds1 = DiffSync1()
ds1.load(count=100)

# Create a DiffSync2 instance and populate it with 100 random records in the range 1-200
ds2 = DiffSync2()
ds2.load(count=100)

# Identify and attempt to resolve the differences between the two,
# periodically invoking print_callback() as DiffSync progresses
ds1.sync_to(ds2, callback=print_callback)

You should see output similar to the following:

diff: Processed   1/200 records.
diff: Processed   3/200 records.
...
diff: Processed 199/200 records.
diff: Processed 200/200 records.
sync: Processed   1/134 records.
sync: Processed   2/134 records.
...
sync: Processed 134/134 records.

A few points to note:

  • For each record in ds1 and ds2, either it exists in both, exists only in ds1, or exists only in ds2.

  • The total number of records reported during the "diff" stage is the sum of the number of records in both ds1 and ds2.

  • For this very simple set of models, the progress counter during the "diff" stage will increase at each step by 2 (if a corresponding pair of models is identified between ds1 and ds2) or by 1 (if a model exists only in ds1 or only in ds2).

  • The total number of records reported during the "sync" stage is the number of distinct records existing across ds1 and ds2 combined, so it will be less than the total reported during the "diff" stage.

  • By design for this example, ds2 is populated semi-randomly with records, so the exact number reported during the "sync" stage may differ for you.
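The arithmetic behind the two totals can be sketched without DiffSync. The block below is an illustration only: expected_totals is a hypothetical helper, and the three-argument callback signature (stage, current, total) is an assumption inferred from the output format shown above.

```python
import random


def print_callback(stage, current, total):
    """Print one progress line; stage is expected to be "diff" or "sync"."""
    print(f"{stage}: Processed {current:>3}/{total} records.")


def expected_totals(ids_1, ids_2):
    """Return (diff_total, sync_total) for two collections of record IDs."""
    ids_1, ids_2 = set(ids_1), set(ids_2)
    diff_total = len(ids_1) + len(ids_2)  # every record on both sides is visited
    sync_total = len(ids_1 | ids_2)  # distinct records across both sides
    return diff_total, sync_total


ds1_ids = range(1, 101)  # records numbered 1-100
ds2_ids = random.sample(range(1, 201), 100)  # 100 random records in 1-200
diff_total, sync_total = expected_totals(ds1_ids, ds2_ids)

# Print only the final progress line of each stage.
print_callback("diff", diff_total, diff_total)
print_callback("sync", sync_total, sync_total)
```

The diff total is always 200 here, while the sync total varies between runs because it depends on how many of the random records overlap with the range 1-100.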

Example 3 - Work with a remote system

This is a simple example to show how DiffSync can be used to compare and synchronize data with a remote system like Nautobot via a REST API.

For this example, we have a shared model for Region and Country defined in models.py. A country must be part of a region and has an attribute to capture its population.

The comparison and synchronization of the datasets is done between a local JSON file and the public instance of Nautobot.

This example also shows:

  • How to set a global flag to ignore objects that do not match

  • How to provide a custom Diff class to change the ordering of a group of objects

The source code for this example is on GitHub in the examples/03-remote-system/ directory.

Install the requirements

To use this example, you must have its dependencies installed. Make sure to run:

pip install -r requirements.txt

Setup the environment

By default, this example interacts with the public sandbox of Nautobot at https://demo.nautobot.com, but you can use your own instance of Nautobot by providing a new URL and a new API token through the environment variables NAUTOBOT_URL and NAUTOBOT_TOKEN:

export NAUTOBOT_URL="https://demo.nautobot.com"
export NAUTOBOT_TOKEN="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
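As a rough sketch of how a script might pick up these variables and fall back to the demo defaults, consider the following. How main.py actually reads them is not shown here, so nautobot_settings, DEFAULT_URL and DEFAULT_TOKEN are hypothetical names used only for illustration.

```python
import os

DEFAULT_URL = "https://demo.nautobot.com"
DEFAULT_TOKEN = "a" * 40  # placeholder token for the public demo instance


def nautobot_settings(environ=os.environ):
    """Return (url, token), falling back to the demo defaults when unset."""
    url = environ.get("NAUTOBOT_URL", DEFAULT_URL)
    token = environ.get("NAUTOBOT_TOKEN", DEFAULT_TOKEN)
    return url, token


# Passing an empty mapping shows the defaults without touching the real environment.
print(nautobot_settings({}))
```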

Try the example

The first time you run this example, many changes should be reported between Nautobot and the local data, because by default the demo instance doesn’t have the subregion defined. After the first sync, the diff on subsequent runs should show no changes. From that point on, DiffSync can identify and fix any change made in Nautobot: try adding, updating, or deleting a country in Nautobot, and DiffSync will detect the change and revert it when run in sync mode.

DIFF

Compare the data between Nautobot and the local JSON file:

python main.py --diff

SYNC

Update the list of countries in Nautobot:

python main.py --sync

Example 4 - Using get or update helpers

This example expands on Example 1 and takes advantage of two helper methods on the Adapter class: get_or_instantiate and update_or_instantiate.

Both methods act similarly to Django’s get_or_create function: they return the object together with a boolean that identifies whether the object was created. Let’s dive into each of them.

get_or_instantiate

The following arguments are supported: model (DiffSyncModel), ids (dictionary), and attrs (dictionary). The model and ids are used to find an existing object. If the object does not yet exist in the adapter, model, ids, and attrs are used to create and add it.

It will then return a tuple that can be unpacked.

obj, created = self.get_or_instantiate(Interface, {"device_name": "test100", "name": "eth0"}, {"description": "Test description"})

If the object already exists, created will be False; if the object had to be created, it will be True.

update_or_instantiate

This helper is similar to get_or_instantiate, but it will update an existing object or add a new instance with the provided ids and attrs. It accepts the same arguments but requires attrs, whereas get_or_instantiate does not.

obj, created = self.update_or_instantiate(Interface, {"device_name": "test100", "name": "eth0"}, {"description": "Test description"})
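The difference between the two helpers can be illustrated with a small standalone sketch. ToyStore is a hypothetical in-memory stand-in written for this illustration; it stores plain dictionaries and is not how DiffSync implements these methods.

```python
class ToyStore:
    """Hypothetical in-memory store keyed by (model name, sorted identifiers)."""

    def __init__(self):
        self._objects = {}

    @staticmethod
    def _key(model, ids):
        return (model, tuple(sorted(ids.items())))

    def get_or_instantiate(self, model, ids, attrs=None):
        """Return (obj, created); attrs are used only when creating."""
        key = self._key(model, ids)
        if key in self._objects:
            return self._objects[key], False
        self._objects[key] = {**ids, **(attrs or {})}
        return self._objects[key], True

    def update_or_instantiate(self, model, ids, attrs):
        """Return (obj, created); attrs overwrite an existing object's values."""
        key = self._key(model, ids)
        if key in self._objects:
            self._objects[key].update(attrs)
            return self._objects[key], False
        self._objects[key] = {**ids, **attrs}
        return self._objects[key], True


store = ToyStore()
ids = {"device_name": "test100", "name": "eth0"}
print(store.get_or_instantiate("interface", ids, {"description": "first"}))
print(store.get_or_instantiate("interface", ids, {"description": "ignored"}))
print(store.update_or_instantiate("interface", ids, {"description": "updated"}))
```

In this sketch the first call creates the object, the second returns the stored object unchanged, and the third overwrites its description; only the first call reports that something was created.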

Example Walkthrough

We can take a look at the data we will be loading into each backend to understand why these helper methods are valuable.

Example Data

BACKEND_DATA_A = [
    {
        "name": "nyc-spine1",
        "role": "spine",
        "interfaces": {"eth0": "Interface 0", "eth1": "Interface 1"},
        "site": "nyc",
    },
    {
        "name": "nyc-spine2",
        "role": "spine",
        "interfaces": {"eth0": "Interface 0", "eth1": "Interface 1"},
        "site": "nyc",
    },
]

Example Load

def load(self):
    """Initialize the BackendA Object by loading some site, device and interfaces from DATA."""
    for device_data in BACKEND_DATA_A:
        device, instantiated = self.get_or_instantiate(
            self.device, {"name": device_data["name"]}, {"role": device_data["role"]}
        )

        site, instantiated = self.get_or_instantiate(self.site, {"name": device_data["site"]})
        if instantiated:
            device.add_child(site)

        for intf_name, desc in device_data["interfaces"].items():
            intf, instantiated = self.update_or_instantiate(
                self.interface, {"name": intf_name, "device_name": device_data["name"]}, {"description": desc}
            )
            if instantiated:
                device.add_child(intf)

The new methods are helpful because several devices belong to the same site. Without them, as we iterate over the data and load it into the adapter, we would have to handle an ObjectAlreadyExists exception for each duplicate site we encounter in the data, and possibly for several other models, depending on how complex the synchronization of data between backends is.
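For contrast, this is roughly the boilerplate that would be needed without the helpers. The sketch is standalone and deliberately does not import DiffSync: ToyAdapter, ToyAlreadyExists and load_site are hypothetical names, with ToyAlreadyExists standing in for an "object already exists" error.

```python
class ToyAlreadyExists(Exception):
    """Hypothetical stand-in for an 'object already exists' error."""


class ToyAdapter:
    """Hypothetical adapter that refuses to add the same site twice."""

    def __init__(self):
        self.sites = {}

    def add(self, name):
        if name in self.sites:
            raise ToyAlreadyExists(name)
        self.sites[name] = {"name": name}
        return self.sites[name]

    def get(self, name):
        return self.sites[name]


def load_site(adapter, name):
    """Add a site, falling back to the stored copy when it is a duplicate."""
    try:
        return adapter.add(name), True
    except ToyAlreadyExists:
        return adapter.get(name), False


adapter = ToyAdapter()
# Both devices in the example data belong to the same site, "nyc".
for device_site in ["nyc", "nyc"]:
    site, created = load_site(adapter, device_site)
    print(site["name"], created)
```

Every model that can appear more than once in the source data would need a similar try/except wrapper, which is the repetition the helper methods remove.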

Example 5 - PeeringDB to Nautobot synchronisation

Context

The goal of this example is to synchronize some data from PeeringDB to the Nautobot demo instance. PeeringDB, as the name suggests, is a database where peering entities define their facilities and presence to facilitate peering. The Nautobot demo is an always-on demo service for Nautobot, an open source Source of Truth.

In PeeringDB there is a model that defines a Facility, from which you can get information about the actual data center and the city where it is located. In Nautobot, this information can be mapped to the Region and Site models, where a Region can depend on another Region and can also contain Sites as children. For instance, Barcelona is in Spain and Spain is in Europe, and all of them are Regions. Finally, the actual data center refers to the Region where it is located.
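The nesting of regions described above can be pictured with a small standalone sketch. ToyRegion and ancestry are hypothetical names used only for illustration; they are not the models defined in this example's models.py.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyRegion:
    """Hypothetical region record; parent is None for a top-level region."""

    name: str
    parent: Optional["ToyRegion"] = None


def ancestry(region: ToyRegion) -> List[str]:
    """Return region names from the given region up to the top-level one."""
    names = []
    current: Optional[ToyRegion] = region
    while current is not None:
        names.append(current.name)
        current = current.parent
    return names


europe = ToyRegion("Europe")
spain = ToyRegion("Spain", parent=europe)
barcelona = ToyRegion("Barcelona", parent=spain)

# Walk from the city-level region up to the top-level region.
print(ancestry(barcelona))
```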

Because of the nature of the demo, we will focus on syncing from PeeringDB to Nautobot (we assume that PeeringDB is the authoritative System of Record), and we will skip deletions by using DiffSync flags.

We have 3 files:

  • models.py: defines the reference models that we will use: RegionModel and SiteModel

  • adapter_peeringdb.py: defines the PeeringDB adapter, which uses load() to translate the data from PeeringDB into the reference models mentioned above. Notice that we don’t define CRUD methods because we will sync from it (not to it)

  • adapter_nautobot.py: defines the Nautobot adapter with the load() and the CRUD methods

The source code for this example is on GitHub in the examples/05-nautobot-peeringdb/ directory.

Get PeeringDB API Key (optional)

To ensure good performance from the PeeringDB API, you should provide an API key: https://docs.peeringdb.com/howto/api_keys/

Then, copy the example creds.example.env into creds.env, and add your new API key to it.

$ cp examples/05-nautobot-peeringdb/creds.example.env examples/05-nautobot-peeringdb/creds.env

The example might also work without an API key, but it could fail due to API rate limiting.

Set up local docker environment

$ docker-compose -f examples/05-nautobot-peeringdb/docker-compose.yml up -d --build

$ docker exec -it 05-nautobot-peeringdb_example_1 python

Interactive execution

from adapter_nautobot import NautobotRemote
from adapter_peeringdb import PeeringDB
from diffsync.enum import DiffSyncFlags
from diffsync.store.redis import RedisStore

store_one = RedisStore(host="redis")
store_two = RedisStore(host="redis")

# Initialize PeeringDB adapter, using CATNIX id for demonstration
peeringdb = PeeringDB(
    ix_id=62,
    internal_storage_engine=store_one
)

# Initialize Nautobot adapter, pointing to the demo instance (these are also the default settings)
nautobot = NautobotRemote(
    url="https://demo.nautobot.com",
    token="a" * 40,
    internal_storage_engine=store_two
)

# Load PeeringDB info into the adapter
peeringdb.load()

# We can check the data that has been imported, some as `site` and some as `region` (with the parent relationships)
peeringdb.dict()

# Load Nautobot info into the adapter
nautobot.load()

# Let DiffSync do its magic
diff = nautobot.diff_from(peeringdb, flags=DiffSyncFlags.SKIP_UNMATCHED_DST)

# Quick summary of the expected changes (remember that delete ones are dry-run)
diff.summary()

# Execute the synchronization
nautobot.sync_from(peeringdb, flags=DiffSyncFlags.SKIP_UNMATCHED_DST)
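The effect of skipping objects that exist only in the destination can be sketched in plain Python. This is an illustration of the idea, not DiffSync's implementation: plan_changes, its skip_unmatched_dst parameter, and the site names and values below are all hypothetical.

```python
def plan_changes(src, dst, skip_unmatched_dst=False):
    """Return the create/update/delete keys needed to make dst match src."""
    plan = {"create": [], "update": [], "delete": []}
    for key in sorted(src.keys() | dst.keys()):
        if key not in dst:
            plan["create"].append(key)
        elif key not in src:
            # Destination-only objects are left alone when skipping is enabled.
            if not skip_unmatched_dst:
                plan["delete"].append(key)
        elif src[key] != dst[key]:
            plan["update"].append(key)
    return plan


# Hypothetical data: site name -> region name.
source = {"site-a": "Barcelona", "site-b": "Madrid"}
destination = {"site-b": "Lisbon", "site-c": "Paris"}

print(plan_changes(source, destination))
print(plan_changes(source, destination, skip_unmatched_dst=True))
```

With skipping enabled, "site-c" stays untouched in the destination, which is the behaviour wanted here because the Nautobot demo instance holds data that did not come from PeeringDB.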

Example 6 - IP Prefixes

This example shows how to work with two IPAM systems that implement an IP prefix differently.

These IPAM systems, IPAM A and IPAM B, are simulated using two YAML files within the data folder. These files are dynamic, and they will be loaded and updated by DiffSync.

Test the example

You can simply run the main.py file, or follow the steps below to run the example step by step.

Set up the environment

Install the dependencies (a virtual environment is recommended):

pip3 install -r requirements.txt

Then start an interactive Python session:

python3
>>>

Import the DiffSync adapters

>>> from adapter_ipam_a import IpamA
>>> from adapter_ipam_b import IpamB

Initialize and load adapter for IPAM A

>>> ipam_a = IpamA()
>>> ipam_a.load()

You can check the content loaded from IPAM A. Notice that the data has been transformed into the DiffSync model, which is different from the original YAML data.

>>> import pprint
>>> pprint.pprint(ipam_a.dict())
{'prefix': {'10.10.10.10/24': {'prefix': '10.10.10.10/24',
                               'vlan_id': 10,
                               'vrf': 'data'},
            '10.20.20.20/24': {'prefix': '10.20.20.20/24',
                               'tenant': 'ABC corp',
                               'vlan_id': 20,
                               'vrf': 'voice'},
            '172.18.0.0/16': {'prefix': '172.18.0.0/16', 'vlan_id': 18}}}

Initialize and load adapter for IPAM B

>>> ipam_b = IpamB()
>>> ipam_b.load()

You can check the content loaded from IPAM B. Notice that the data has been transformed into the DiffSync model, which again is different from the original YAML format.

>>> pprint.pprint(ipam_b.dict())
{'prefix': {'10.10.10.10/24': {'prefix': '10.10.10.10/24', 'vlan_id': 123},
            '2001:DB8::/32': {'prefix': '2001:DB8::/32',
                              'tenant': 'XYZ Corporation',
                              'vlan_id': 10,
                              'vrf': 'data'}}}

Check the difference

We can use diff_to or diff_from to select, from the perspective of the calling adapter, which adapter is authoritative in each case.

>>> diff = ipam_a.diff_to(ipam_b)

From this diff, we can check the summary of what would happen.

>>> diff.summary()
{'create': 2, 'update': 1, 'delete': 1, 'no-change': 0, 'skip': 0}

We can also look at the details. The '+' and '-' keys represent the actual changes in the target adapter: create ('+' only), delete ('-' only), or update (when both symbols appear).

>>> pprint.pprint(diff.dict())
{'prefix': {'10.10.10.10/24': {'+': {'vlan_id': 10, 'vrf': 'data'},
                               '-': {'vlan_id': 123, 'vrf': None}},
            '10.20.20.20/24': {'+': {'tenant': 'ABC corp',
                                     'vlan_id': 20,
                                     'vrf': 'voice'}},
            '172.18.0.0/16': {'+': {'tenant': None,
                                    'vlan_id': 18,
                                    'vrf': None}},
            '2001:DB8::/32': {'-': {'tenant': 'XYZ Corporation',
                                    'vlan_id': 10,
                                    'vrf': 'data'}}}}
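As a small illustration of how to read this output, the sketch below classifies each entry of the dictionary shown above by which of the '+' and '-' keys it contains. The classify helper is hypothetical, and the sketch assumes a flat diff like this one, with no nested child objects.

```python
from collections import Counter

# The diff dictionary printed above, copied verbatim.
diff_dict = {
    "prefix": {
        "10.10.10.10/24": {
            "+": {"vlan_id": 10, "vrf": "data"},
            "-": {"vlan_id": 123, "vrf": None},
        },
        "10.20.20.20/24": {"+": {"tenant": "ABC corp", "vlan_id": 20, "vrf": "voice"}},
        "172.18.0.0/16": {"+": {"tenant": None, "vlan_id": 18, "vrf": None}},
        "2001:DB8::/32": {"-": {"tenant": "XYZ Corporation", "vlan_id": 10, "vrf": "data"}},
    }
}


def classify(entry):
    """Map one diff entry to the action it implies for the target adapter."""
    if "+" in entry and "-" in entry:
        return "update"
    return "create" if "+" in entry else "delete"


actions = {prefix: classify(entry) for prefix, entry in diff_dict["prefix"].items()}
for prefix, action in actions.items():
    print(f"{prefix}: {action}")

# Tally the actions to compare against the summary shown earlier.
print(dict(Counter(actions.values())))
```

The tally gives two creates, one update, and one delete, which agrees with the counts reported by diff.summary().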

Enforce synchronization

By simply replacing diff_to with sync_to, we change the state of the destination target.

>>> ipam_a.sync_to(ipam_b)

Validate synchronization

Now, if we reload IPAM B and check the difference again, we should see no differences.

>>> new_ipam_b = IpamB()
>>> new_ipam_b.load()
>>> diff = ipam_a.diff_to(new_ipam_b)
>>> diff.summary()
{'create': 0, 'update': 0, 'delete': 0, 'no-change': 3, 'skip': 0}