Examples
For each example, the complete source code is available on GitHub in the examples directory.
Example 1 - Multiple Data Sources
This is a simple example to show how DiffSync can be used to compare and synchronize multiple data sources.
For this example, we have a shared model for Device and Interface defined in models.py.
We also have three instances of DiffSync based on the same model but with different values (BackendA, BackendB, and BackendC).
The source code for this example is on GitHub in the examples/01-multiple-data-sources/ directory.
First create and populate all 3 objects:
from backend_a import BackendA
from backend_b import BackendB
from backend_c import BackendC
# Create each
a = BackendA()
a.load()
print(a.str())
b = BackendB()
b.load()
print(b.str())
c = BackendC()
c.load()
print(c.str())
Configure verbosity of DiffSync’s structured logging to console; the default is full verbosity (all logs including debugging):
from diffsync.logging import enable_console_logging
enable_console_logging(verbosity=0) # Show WARNING and ERROR logs only
# enable_console_logging(verbosity=1) # Also include INFO logs
# enable_console_logging(verbosity=2) # Also include DEBUG logs
Show the differences between A and B:
diff_a_b = a.diff_to(b)
print(diff_a_b.str())
Show the differences between B and C:
diff_b_c = c.diff_from(b)
print(diff_b_c.str())
Synchronize A and B (update B with the contents of A):
a.sync_to(b)
print(a.diff_to(b).str())
# Alternatively you can pass in the diff object from above to prevent another diff calculation
# a.sync_to(b, diff=diff_a_b)
Now A and B will show no differences:
diff_a_b = a.diff_to(b)
print(diff_a_b.str())
In the Device model, site_name and role are not included in _attributes, so they are not shown when comparing the objects, even if their values differ.
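To make the effect concrete, here is a plain-Python sketch of a comparison that only looks at fields named in an attribute list. It is not DiffSync's implementation: COMPARED_ATTRIBUTES, visible_differences, and the sample device values are hypothetical and exist only for illustration.

```python
# Hypothetical stand-in for a model's attribute list: only these fields are compared.
COMPARED_ATTRIBUTES = ()


def visible_differences(left, right, compared=COMPARED_ATTRIBUTES):
    """Return {field: (left_value, right_value)} for compared fields that differ."""
    return {
        field: (left.get(field), right.get(field))
        for field in compared
        if left.get(field) != right.get(field)
    }


# Made-up records for the same device as seen by two backends.
device_in_a = {"name": "nyc-spine1", "site_name": "nyc", "role": "spine"}
device_in_b = {"name": "nyc-spine1", "site_name": "sfo", "role": "leaf"}

# site_name and role differ, but neither is in the compared list, so nothing is reported.
print(visible_differences(device_in_a, device_in_b))  # {}

# Opting the fields in makes the same differences visible.
print(visible_differences(device_in_a, device_in_b, compared=("site_name", "role")))
```

If those differences should be reported in a diff, the fields would need to be listed among the compared attributes.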
Example 2 - Callback Function
This example shows how you can set up DiffSync to invoke a callback function to update its status as a sync proceeds. This could be used to, for example, update a status bar (such as with the tqdm library), although here for simplicity we’ll just have the callback print directly to the console.
The source code for this example is on GitHub in the examples/02-callback-function/ directory.
from diffsync.logging import enable_console_logging
from main import DiffSync1, DiffSync2, print_callback
enable_console_logging(verbosity=0) # Show WARNING and ERROR logs only
# Create a DiffSync1 instance and populate it with records numbered 1-100
ds1 = DiffSync1()
ds1.load(count=100)
# Create a DiffSync2 instance and populate it with 100 random records in the range 1-200
ds2 = DiffSync2()
ds2.load(count=100)
# Identify and attempt to resolve the differences between the two,
# periodically invoking print_callback() as DiffSync progresses
ds1.sync_to(ds2, callback=print_callback)
You should see output similar to the following:
diff: Processed 1/200 records.
diff: Processed 3/200 records.
...
diff: Processed 199/200 records.
diff: Processed 200/200 records.
sync: Processed 1/134 records.
sync: Processed 2/134 records.
...
sync: Processed 134/134 records.
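A callback that produces lines like these could be sketched as follows. This assumes the callback is invoked with three arguments (the stage name, the number of records processed so far, and the total). The format_progress helper is a hypothetical addition, and the actual print_callback in the example's main.py may differ.

```python
def format_progress(stage, current, total):
    """Build one progress line; stage is expected to be "diff" or "sync"."""
    return f"{stage}: Processed {current}/{total} records."


def print_callback(stage, current, total):
    """Progress callback that prints directly to the console."""
    print(format_progress(stage, current, total))


# Simulate a few invocations by hand to see the output format.
for stage, current, total in [("diff", 1, 200), ("diff", 200, 200), ("sync", 134, 134)]:
    print_callback(stage, current, total)
```

Keeping the formatting separate from the printing makes it easy to swap the print call for something else, such as updating a progress bar.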
A few points to note:
For each record in ds1 and ds2, either it exists in both, exists only in ds1, or exists only in ds2.
The total number of records reported during the "diff" stage is the sum of the number of records in both ds1 and ds2.
For this very simple set of models, the progress counter during the "diff" stage will increase at each step by 2 (if a corresponding pair of models is identified between ds1 and ds2) or by 1 (if a model exists only in ds1 or only in ds2).
The total number of records reported during the "sync" stage is the number of distinct records existing across ds1 and ds2 combined, so it will be less than the total reported during the "diff" stage.
By design for this example, ds2 is populated semi-randomly with records, so the exact number reported during the "sync" stage may differ for you.
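The arithmetic behind these points can be checked with plain Python sets. The sketch below models each record only by its numeric ID and uses a fixed random seed. It does not call DiffSync, and progress_totals and diff_steps are hypothetical helper names.

```python
import random


def progress_totals(ids_1, ids_2):
    """Return (diff_total, sync_total) for two collections of record IDs."""
    diff_total = len(ids_1) + len(ids_2)        # every record on both sides is counted
    sync_total = len(set(ids_1) | set(ids_2))   # each distinct record is counted once
    return diff_total, sync_total


def diff_steps(ids_1, ids_2):
    """Yield the counter increment for each distinct record during the "diff" stage."""
    for record_id in sorted(set(ids_1) | set(ids_2)):
        # A matched pair advances the counter by 2, an unmatched record by 1.
        yield 2 if (record_id in ids_1 and record_id in ids_2) else 1


rng = random.Random(0)                          # fixed seed, for repeatability only
ds1_ids = set(range(1, 101))                    # records numbered 1-100
ds2_ids = set(rng.sample(range(1, 201), 100))   # 100 random records in the range 1-200

diff_total, sync_total = progress_totals(ds1_ids, ds2_ids)
print(diff_total)                                       # 200
print(sync_total)                                       # depends on the random overlap
print(sum(diff_steps(ds1_ids, ds2_ids)) == diff_total)  # True
```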
Example 3 - Work with a remote system
This is a simple example to show how DiffSync can be used to compare and synchronize data with a remote system like Nautobot via a REST API.
For this example, we have a shared model for Region and Country defined in models.py.
A country must be part of a region and has an attribute to capture its population.
The comparison and synchronization of the datasets is done between a local JSON file and the public instance of Nautobot.
This example also shows:
How to set a global flag to ignore objects that do not match
How to provide a custom Diff class to change the ordering of a group of objects
The source code for this example is on GitHub in the examples/03-remote-system/ directory.
Install the requirements
To use this example, you must have its dependencies installed. Make sure to run:
pip install -r requirements.txt
Set up the environment
By default, this example interacts with the public sandbox of Nautobot at https://demo.nautobot.com, but you can use your own instance of Nautobot by providing a different URL and API token through the environment variables NAUTOBOT_URL and NAUTOBOT_TOKEN:
export NAUTOBOT_URL="https://demo.nautobot.com"
export NAUTOBOT_TOKEN="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
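On the Python side, a script can read these variables and fall back to the demo defaults when they are unset. The sketch below is an assumption about how that might look, not the example's actual code; nautobot_settings is a hypothetical helper.

```python
import os

# Defaults matching the public demo instance described above.
DEFAULT_URL = "https://demo.nautobot.com"
DEFAULT_TOKEN = "a" * 40


def nautobot_settings(environ=os.environ):
    """Return the Nautobot URL and token, preferring environment variables."""
    return {
        "url": environ.get("NAUTOBOT_URL", DEFAULT_URL),
        "token": environ.get("NAUTOBOT_TOKEN", DEFAULT_TOKEN),
    }


# With nothing set, the demo defaults are used.
print(nautobot_settings({})["url"])  # https://demo.nautobot.com

# A custom URL overrides the default; the hostname here is a placeholder.
print(nautobot_settings({"NAUTOBOT_URL": "https://nautobot.example.com"})["url"])
```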
Try the example
The first time you run this example, many changes should be reported between Nautobot and the local data because, by default, the demo instance doesn’t have the subregions defined.
After the first sync, the diff should show no changes on subsequent runs.
At this point, DiffSync will be able to identify and fix all changes in Nautobot. You can add, update, or delete any country in Nautobot; DiffSync will catch the change and fix it when run in sync mode.
DIFF
Compare the data between Nautobot and the local JSON file.
python main.py --diff
SYNC
Update the list of countries in Nautobot.
python main.py --sync
Example 4 - Using get or update helpers
This example expands on Example 1 by taking advantage of two helper methods on the DiffSync class: get_or_instantiate and update_or_instantiate.
Both methods act similarly to Django’s get_or_create function: they return the object along with a boolean that identifies whether the object was created. Let’s dive into each of them.
get_or_instantiate
The following arguments are supported: model (DiffSyncModel), ids (dictionary), and attrs (dictionary). The model and ids are used to find an existing object. If the object does not currently exist within the DiffSync adapter, the method uses model, ids, and attrs to add the object.
It then returns a tuple that can be unpacked.
obj, created = self.get_or_instantiate(Interface, {"device_name": "test100", "name": "eth0"}, {"description": "Test description"})
If the object already exists, created will be False; otherwise it will be True, because the object had to be created.
update_or_instantiate
This helper is similar to get_or_instantiate, but it will update an existing object or add a new instance with the provided ids and attrs. The method accepts the same arguments, but requires attrs, whereas get_or_instantiate does not.
obj, created = self.update_or_instantiate(Interface, {"device_name": "test100", "name": "eth0"}, {"description": "Test description"})
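The difference between the two helpers is easiest to see side by side. The sketch below imitates their described behavior with a tiny in-memory store. TinyStore is a hypothetical class written for illustration and is not part of DiffSync; here the model is just a string and objects are plain dictionaries.

```python
class TinyStore:
    """Hypothetical in-memory store keyed by model name and identifier values."""

    def __init__(self):
        self._objects = {}

    @staticmethod
    def _key(model, ids):
        return (model, tuple(sorted(ids.items())))

    def get_or_instantiate(self, model, ids, attrs=None):
        """Return (object, created); an existing object is returned unchanged."""
        key = self._key(model, ids)
        if key in self._objects:
            return self._objects[key], False
        self._objects[key] = {**ids, **(attrs or {})}
        return self._objects[key], True

    def update_or_instantiate(self, model, ids, attrs):
        """Return (object, created); an existing object is updated with attrs."""
        key = self._key(model, ids)
        if key in self._objects:
            self._objects[key].update(attrs)
            return self._objects[key], False
        self._objects[key] = {**ids, **attrs}
        return self._objects[key], True


store = TinyStore()
ids = {"device_name": "test100", "name": "eth0"}

obj, created = store.get_or_instantiate("interface", ids, {"description": "Test description"})
print(created)                      # True

obj, created = store.get_or_instantiate("interface", ids, {"description": "Ignored"})
print(created, obj["description"])  # False Test description

obj, created = store.update_or_instantiate("interface", ids, {"description": "Updated"})
print(created, obj["description"])  # False Updated
```

In this sketch, the second get_or_instantiate call leaves the stored description untouched, while update_or_instantiate overwrites it.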
Example Walkthrough
We can take a look at the data we will be loading into each backend to understand why these helper methods are valuable.
Example Data
BACKEND_DATA_A = [
    {
        "name": "nyc-spine1",
        "role": "spine",
        "interfaces": {"eth0": "Interface 0", "eth1": "Interface 1"},
        "site": "nyc",
    },
    {
        "name": "nyc-spine2",
        "role": "spine",
        "interfaces": {"eth0": "Interface 0", "eth1": "Interface 1"},
        "site": "nyc",
    },
]
Example Load
def load(self):
    """Initialize the BackendA Object by loading some site, device and interfaces from DATA."""
    for device_data in BACKEND_DATA_A:
        device, instantiated = self.get_or_instantiate(
            self.device, {"name": device_data["name"]}, {"role": device_data["role"]}
        )
        site, instantiated = self.get_or_instantiate(self.site, {"name": device_data["site"]})
        if instantiated:
            device.add_child(site)
        for intf_name, desc in device_data["interfaces"].items():
            intf, instantiated = self.update_or_instantiate(
                self.interface, {"name": intf_name, "device_name": device_data["name"]}, {"description": desc}
            )
            if instantiated:
                device.add_child(intf)
The new methods are helpful because several devices are part of the same site. Without them, as we iterate over the data and load it into the DiffSync adapter, we would have to account for ObjectAlreadyExists exceptions each time we try to add a duplicate site, and possibly for several other models, depending on how complex the synchronization of data between backends is.
Example 5 - PeeringDB to Nautobot synchronisation
Context
The goal of this example is to synchronize some data from PeeringDB, which, as the name suggests, is a database where peering entities define their facilities and presence to facilitate peering, to the Nautobot demo, an always-on demo service for Nautobot, an open source Source of Truth.
In PeeringDB there is a model that defines a Facility, from which you can get information about the actual data center and the city where it is located. In Nautobot, this information can be mapped to the Region and Site models, where a Region can depend on another Region and can also contain Sites as children. For instance, Barcelona is in Spain and Spain is in Europe, and all of them are Regions. Finally, the actual data center refers to the Region where it is located.
Because of the nature of the demo, we will focus on syncing from PeeringDB to Nautobot (we assume that PeeringDB is the authoritative System of Record), and we will skip the delete part of the diffsync library by using diffsync flags.
We have 3 files:
models.py: defines the reference models that we will use: RegionModel and SiteModel
adapter_peeringdb.py: defines the PeeringDB adapter, which uses load() to translate the data from PeeringDB into the reference models mentioned above. Notice that we don’t define CRUD methods because we will sync from it (not to it)
adapter_nautobot.py: defines the Nautobot adapter with load() and the CRUD methods
The source code for this example is on GitHub in the examples/05-nautobot-peeringdb/ directory.
Get PeeringDB API Key (optional)
To ensure good performance from the PeeringDB API, you should provide an API key: https://docs.peeringdb.com/howto/api_keys/
Then, copy the example creds.example.env into creds.env, and add your new API key.
$ cp examples/05-nautobot-peeringdb/creds.example.env examples/05-nautobot-peeringdb/creds.env
Without an API key the example might also work, but it could fail due to API rate limiting.
Set up local docker environment
$ docker-compose -f examples/05-nautobot-peeringdb/docker-compose.yml up -d --build
$ docker exec -it 05-nautobot-peeringdb_example_1 python
Interactive execution
from adapter_nautobot import NautobotRemote
from adapter_peeringdb import PeeringDB
from diffsync.enum import DiffSyncFlags
from diffsync.store.redis import RedisStore
store_one = RedisStore(host="redis")
store_two = RedisStore(host="redis")
# Initialize PeeringDB adapter, using CATNIX id for demonstration
peeringdb = PeeringDB(
    ix_id=62,
    internal_storage_engine=store_one
)
# Initialize Nautobot adapter, pointing to the demo instance (these are also the default settings)
nautobot = NautobotRemote(
    url="https://demo.nautobot.com",
    token="a" * 40,
    internal_storage_engine=store_two
)
# Load PeeringDB info into the adapter
peeringdb.load()
# We can check the data that has been imported, some as `site` and some as `region` (with the parent relationships)
peeringdb.dict()
# Load Nautobot info into the adapter
nautobot.load()
# Let diffsync do its magic
diff = nautobot.diff_from(peeringdb, flags=DiffSyncFlags.SKIP_UNMATCHED_DST)
# Quick summary of the expected changes (remember that delete ones are dry-run)
diff.summary()
# Execute the synchronization
nautobot.sync_from(peeringdb, flags=DiffSyncFlags.SKIP_UNMATCHED_DST)
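The effect of skipping unmatched destination objects can be pictured with a small plain-Python sketch. It does not use DiffSync: plan_actions, its skip_unmatched_dst parameter, and the sample data (including the "Lab" region) are hypothetical. It only shows the idea that objects existing solely in the destination are left alone rather than deleted.

```python
def plan_actions(source, destination, skip_unmatched_dst=False):
    """Return {key: action} describing how to make destination match source."""
    actions = {}
    for key in sorted(source.keys() | destination.keys()):
        if key not in destination:
            actions[key] = "create"
        elif key not in source:
            # Only the destination has this object: delete it unless told to skip.
            if not skip_unmatched_dst:
                actions[key] = "delete"
        elif source[key] != destination[key]:
            actions[key] = "update"
    return actions


# Made-up regions: the source knows Barcelona and Spain, the destination knows Spain and Lab.
source = {"Barcelona": {"parent": "Spain"}, "Spain": {"parent": "Europe"}}
destination = {"Spain": {"parent": None}, "Lab": {"parent": None}}

print(plan_actions(source, destination))
# {'Barcelona': 'create', 'Lab': 'delete', 'Spain': 'update'}

print(plan_actions(source, destination, skip_unmatched_dst=True))
# {'Barcelona': 'create', 'Spain': 'update'}
```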
Example 6 - IP Prefixes
This example shows how to work with two IPAM systems that have different implementations of an IP prefix.
These IPAM systems, IPAM A and IPAM B, are simulated using two YAML files within the data folder. These files are dynamic: they are loaded and updated by diffsync.
Test the example
You could simply run the main.py file, but the steps below walk through the example one step at a time.
Set up the environment
Install the dependencies (a virtual environment is recommended):
pip3 install -r requirements.txt
Then start a Python interactive session:
python3
>>>
Import the DiffSync adapters
>>> from adapter_ipam_a import IpamA
>>> from adapter_ipam_b import IpamB
Initialize and load adapter for IPAM A
>>> ipam_a = IpamA()
>>> ipam_a.load()
You can check the content loaded from IPAM A. Notice that the data has been transformed into the DiffSync model, which is different from the original YAML data.
>>> import pprint
>>> pprint.pprint(ipam_a.dict())
{'prefix': {'10.10.10.10/24': {'prefix': '10.10.10.10/24',
                               'vlan_id': 10,
                               'vrf': 'data'},
            '10.20.20.20/24': {'prefix': '10.20.20.20/24',
                               'tenant': 'ABC corp',
                               'vlan_id': 20,
                               'vrf': 'voice'},
            '172.18.0.0/16': {'prefix': '172.18.0.0/16', 'vlan_id': 18}}}
Initialize and load adapter for IPAM B
>>> ipam_b = IpamB()
>>> ipam_b.load()
You can check the content loaded from IPAM B. Notice that the data has been transformed into the DiffSync model, which again is different from the original YAML format.
>>> pprint.pprint(ipam_b.dict())
{'prefix': {'10.10.10.10/24': {'prefix': '10.10.10.10/24', 'vlan_id': 123},
            '2001:DB8::/32': {'prefix': '2001:DB8::/32',
                              'tenant': 'XYZ Corporation',
                              'vlan_id': 10,
                              'vrf': 'data'}}}
Check the difference
We can use diff_to or diff_from to select, from the perspective of the calling adapter, which adapter is authoritative in each case.
>>> diff = ipam_a.diff_to(ipam_b)
From this diff, we can check the summary of what would happen.
>>> diff.summary()
{'create': 2, 'update': 1, 'delete': 1, 'no-change': 0, 'skip': 0}
We can also go into the details. The '+' and '-' symbols represent the actual changes in the target adapter: create ('+' only), delete ('-' only), or update (when both symbols appear).
>>> pprint.pprint(diff.dict())
{'prefix': {'10.10.10.10/24': {'+': {'vlan_id': 10, 'vrf': 'data'},
                               '-': {'vlan_id': 123, 'vrf': None}},
            '10.20.20.20/24': {'+': {'tenant': 'ABC corp',
                                     'vlan_id': 20,
                                     'vrf': 'voice'}},
            '172.18.0.0/16': {'+': {'tenant': None,
                                    'vlan_id': 18,
                                    'vrf': None}},
            '2001:DB8::/32': {'-': {'tenant': 'XYZ Corporation',
                                    'vlan_id': 10,
                                    'vrf': 'data'}}}}
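To see how a structure like this can arise, the following plain-Python sketch computes the same '+' and '-' entries from the two datasets shown earlier. It is an illustration under stated assumptions, not DiffSync's algorithm: diff_prefixes, summarize, and ATTRS are hypothetical names, and missing attributes are assumed to default to None.

```python
ATTRS = ("vlan_id", "vrf", "tenant")  # attributes assumed to be compared


def normalize(record):
    """Fill in None for any attribute missing from a record."""
    return {name: record.get(name) for name in ATTRS}


def diff_prefixes(source, target):
    """Describe the changes needed in target so that it matches source."""
    result = {}
    for prefix in sorted(set(source) | set(target)):
        src = normalize(source[prefix]) if prefix in source else None
        dst = normalize(target[prefix]) if prefix in target else None
        if dst is None:
            result[prefix] = {"+": src}                      # create in target
        elif src is None:
            result[prefix] = {"-": dst}                      # delete from target
        else:
            changed = [name for name in ATTRS if src[name] != dst[name]]
            if changed:                                      # update in target
                result[prefix] = {
                    "+": {name: src[name] for name in changed},
                    "-": {name: dst[name] for name in changed},
                }
    return result


def summarize(diff):
    """Count creates, updates, and deletes in a diff produced above."""
    counts = {"create": 0, "update": 0, "delete": 0}
    for change in diff.values():
        if "+" in change and "-" in change:
            counts["update"] += 1
        elif "+" in change:
            counts["create"] += 1
        else:
            counts["delete"] += 1
    return counts


ipam_a_data = {
    "10.10.10.10/24": {"vlan_id": 10, "vrf": "data"},
    "10.20.20.20/24": {"tenant": "ABC corp", "vlan_id": 20, "vrf": "voice"},
    "172.18.0.0/16": {"vlan_id": 18},
}
ipam_b_data = {
    "10.10.10.10/24": {"vlan_id": 123},
    "2001:DB8::/32": {"tenant": "XYZ Corporation", "vlan_id": 10, "vrf": "data"},
}

diff_sketch = diff_prefixes(ipam_a_data, ipam_b_data)
print(diff_sketch["10.10.10.10/24"])
print(summarize(diff_sketch))  # {'create': 2, 'update': 1, 'delete': 1}
```

Attributes that are equal on both sides, such as tenant for 10.10.10.10/24, do not appear in an update entry.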
Enforce synchronization
Simply by changing diff_to to sync_to, we change the state of the destination target.
>>> ipam_a.sync_to(ipam_b)
Validate synchronization
Now, if we reload IPAM B and check the difference again, we should see no differences.
>>> new_ipam_b = IpamB()
>>> new_ipam_b.load()
>>> diff = ipam_a.diff_to(new_ipam_b)
>>> diff.summary()
{'create': 0, 'update': 0, 'delete': 0, 'no-change': 3, 'skip': 0}