Add core engine section in docs and rename example directories #72

Merged
merged 22 commits into from
Oct 19, 2021
b2d92db
Add core engine section in docs and rename example directories
dgarros Oct 11, 2021
6ef651c
Update docs/source/core_engine/01-flags.md
dgarros Oct 13, 2021
93e36b4
Update docs/source/core_engine/02-customize-diff-class.md
dgarros Oct 13, 2021
4b76360
Update docs/source/core_engine/01-flags.md
dgarros Oct 13, 2021
827cefc
Update docs/source/core_engine/02-customize-diff-class.md
dgarros Oct 13, 2021
f29a4fd
Fix example directory names in unit tests
dgarros Oct 15, 2021
7ab6a00
Add skip state
dgarros Oct 15, 2021
81d6838
Fix pylint
dgarros Oct 18, 2021
54f6b3b
Update docs/source/core_engine/01-flags.md
dgarros Oct 18, 2021
6b13497
Update docs/source/core_engine/01-flags.md
dgarros Oct 18, 2021
d2bbd57
Update docs/source/core_engine/01-flags.md
dgarros Oct 18, 2021
54839fe
Update docs/source/core_engine/02-customize-diff-class.md
dgarros Oct 18, 2021
c28f405
Update docs/source/core_engine/01-flags.md
dgarros Oct 18, 2021
503a751
Update docs/source/core_engine/01-flags.md
dgarros Oct 18, 2021
6e38ea1
Remove Change the rendering of the result of the diff section from doc
dgarros Oct 18, 2021
aa88e88
Convert link to ReST format
dgarros Oct 18, 2021
6591f73
Update docs/source/core_engine/01-flags.md
dgarros Oct 19, 2021
a8e22a1
Update docs/source/core_engine/01-flags.md
dgarros Oct 19, 2021
066f742
Update docs/source/core_engine/01-flags.md
dgarros Oct 19, 2021
1566714
Update docs/source/core_engine/01-flags.md
dgarros Oct 19, 2021
8d937cc
Feedback from glenn
dgarros Oct 19, 2021
f17b2f5
Merge pull request #71 from networktocode/dga-update-doc
dgarros Oct 19, 2021
116 changes: 116 additions & 0 deletions docs/source/core_engine/01-flags.md
@@ -0,0 +1,116 @@

# Global and Model Flags

These flags offer a powerful way to instruct the core engine how to handle specific situations without changing the data. One way to think of the flags is as configuration for the core engine. Currently, two sets of flags are supported:
- **global flags**: applicable to all data.
- **model flags**: applicable to a specific model or to individual instances of a model.

> *The flags are stored in binary format which allows storing multiple flags in a single variable. See the section below [Working with flags](#working-with-flags) to learn how to manage them.*

The list of supported flags is expected to grow over time as more use cases are identified. If you think some additional flags should be supported, please reach out via GitHub to start a discussion.

## Global flags

Global flags can be defined at runtime when calling one of these functions: `diff_to`, `diff_from`, `sync_to` or `sync_from`.

```python
from diffsync.enum import DiffSyncFlags
flags = DiffSyncFlags.SKIP_UNMATCHED_DST
diff = nautobot.diff_from(local, flags=flags)
```

### Supported Global Flags

| Name | Description | Binary Value |
|---|---|---|
| CONTINUE_ON_FAILURE | Continue synchronizing even if failures are encountered when syncing individual models. | 0b1 |
| SKIP_UNMATCHED_SRC | Ignore objects that only exist in the source/"from" DiffSync when determining diffs and syncing. If this flag is set, no new objects will be created in the target/"to" DiffSync. | 0b10 |
| SKIP_UNMATCHED_DST | Ignore objects that only exist in the target/"to" DiffSync when determining diffs and syncing. If this flag is set, no objects will be deleted from the target/"to" DiffSync. | 0b100 |
| SKIP_UNMATCHED_BOTH | Convenience value combining both SKIP_UNMATCHED_SRC and SKIP_UNMATCHED_DST into a single flag | 0b110 |
| LOG_UNCHANGED_RECORDS | If this flag is set, a log message will be generated during synchronization for each model, even unchanged ones. | 0b1000 |
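
Because each global flag occupies its own bit, several of them can be combined into one `flags` value. The sketch below uses a local stand-in enum, `GlobalFlags`, that mirrors the binary values from the table above instead of importing `DiffSyncFlags`, so it runs without DiffSync installed. The stand-in name is an assumption made for illustration only.

```python
import enum


class GlobalFlags(enum.Flag):
    """Stand-in for DiffSyncFlags (hypothetical name), using the values from the table."""

    NONE = 0
    CONTINUE_ON_FAILURE = 0b1
    SKIP_UNMATCHED_SRC = 0b10
    SKIP_UNMATCHED_DST = 0b100
    SKIP_UNMATCHED_BOTH = SKIP_UNMATCHED_SRC | SKIP_UNMATCHED_DST
    LOG_UNCHANGED_RECORDS = 0b1000


# Combine two independent flags into a single value (0b1 | 0b100).
flags = GlobalFlags.CONTINUE_ON_FAILURE | GlobalFlags.SKIP_UNMATCHED_DST

# The convenience value is exactly the union of the two SKIP_UNMATCHED_* bits.
both = GlobalFlags.SKIP_UNMATCHED_SRC | GlobalFlags.SKIP_UNMATCHED_DST
```

With the real library, the combined value would be passed through the `flags` argument in the same way as the single flag in the example above.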

## Model flags

Model flags are stored in the attribute `model_flags` of each model and are usually set when the data is being loaded into the adapter.

```python
from diffsync import DiffSync
from diffsync.enum import DiffSyncModelFlags
from model import MyDeviceModel

class MyAdapter(DiffSync):

    device = MyDeviceModel

    def load(self, data):
        """Load all devices into the adapter and add the flag IGNORE to all firewall devices."""
        for device in data.get("devices"):
            obj = self.device(name=device["name"])
            if "firewall" in device["name"]:
                obj.model_flags = DiffSyncModelFlags.IGNORE
            self.add(obj)
```

### Supported Model Flags

| Name | Description | Binary Value |
|---|---|---|
| IGNORE | Do not render diffs containing this model; do not make any changes to this model when synchronizing. Can be used to indicate a model instance that exists but should not be changed by DiffSync. | 0b1 |
| SKIP_CHILDREN_ON_DELETE | When deleting this model, do not recursively delete its children. Can be used for the case where deletion of a model results in the automatic deletion of all its children. | 0b10 |
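
To make the effect of `IGNORE` more concrete, here is a minimal sketch that does not use DiffSync itself. `ModelFlags`, `Device` and `should_skip` are hypothetical stand-ins: the enum mirrors the two values in the table, and the helper only illustrates the kind of check the description of `IGNORE` implies, not how the core engine is implemented.

```python
import enum
from dataclasses import dataclass


class ModelFlags(enum.Flag):
    """Stand-in for DiffSyncModelFlags (hypothetical name), values taken from the table."""

    NONE = 0
    IGNORE = 0b1
    SKIP_CHILDREN_ON_DELETE = 0b10


@dataclass
class Device:
    """Hypothetical model carrying a model_flags attribute."""

    name: str
    model_flags: ModelFlags = ModelFlags.NONE


def should_skip(obj):
    """Return True when the IGNORE bit is set on the object."""
    return bool(obj.model_flags & ModelFlags.IGNORE)


devices = [Device("router1"), Device("firewall1")]
for device in devices:
    if "firewall" in device.name:
        # OR the flag in so that any other flag already set is preserved.
        device.model_flags |= ModelFlags.IGNORE

# Only the objects without the IGNORE flag would be considered for changes.
to_sync = [d.name for d in devices if not should_skip(d)]
```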

## Working with flags

Flags are stored in binary format: each bit of a variable represents one flag, which allows many flags to be stored in a single variable. Binary flags also make it possible to add more flags in the future without redefining the current interfaces or the current DiffSync API.

### Enable a flag (bitwise OR)

Enabling a flag is possible with the bitwise OR operator `|=`. It's important to use the bitwise OR operator when enabling a flag to ensure that the values of the other flags remain unchanged.

```python
>>> from diffsync.enum import DiffSyncFlags
>>> flags = DiffSyncFlags.CONTINUE_ON_FAILURE
>>> flags
<DiffSyncFlags.CONTINUE_ON_FAILURE: 1>
>>> bin(flags.value)
'0b1'
>>> flags |= DiffSyncFlags.SKIP_UNMATCHED_DST
>>> flags
<DiffSyncFlags.SKIP_UNMATCHED_DST|CONTINUE_ON_FAILURE: 5>
>>> bin(flags.value)
'0b101'
```

### Checking the value of a specific flag (bitwise AND)

Checking whether a flag is enabled is possible with the bitwise AND operator `&`. The AND operation evaluates to 0 if the flag is not set and to the binary value of the flag if it is enabled. To turn the result into a proper conditional, wrap the bitwise AND expression in a call to `bool`.

```python
>>> from diffsync.enum import DiffSyncFlags
>>> flags = DiffSyncFlags.NONE
>>> bool(flags & DiffSyncFlags.CONTINUE_ON_FAILURE)
False
>>> flags |= DiffSyncFlags.CONTINUE_ON_FAILURE
>>> bool(flags & DiffSyncFlags.CONTINUE_ON_FAILURE)
True
```

### Disable a flag (bitwise NOT)

After a flag has been enabled, it's possible to disable it by combining the bitwise AND and NOT operators: `&= ~`.

```python
>>> from diffsync.enum import DiffSyncFlags
>>> flags = DiffSyncFlags.NONE
# Setting the flags SKIP_UNMATCHED_DST and CONTINUE_ON_FAILURE
>>> flags |= DiffSyncFlags.SKIP_UNMATCHED_DST | DiffSyncFlags.CONTINUE_ON_FAILURE
>>> flags
<DiffSyncFlags.SKIP_UNMATCHED_DST|CONTINUE_ON_FAILURE: 5>
>>> bool(flags & DiffSyncFlags.SKIP_UNMATCHED_DST)
True
# Unsetting the flag SKIP_UNMATCHED_DST; CONTINUE_ON_FAILURE remains set
>>> flags &= ~DiffSyncFlags.SKIP_UNMATCHED_DST
>>> flags
<DiffSyncFlags.CONTINUE_ON_FAILURE: 1>
>>> bool(flags & DiffSyncFlags.SKIP_UNMATCHED_DST)
False
```
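
The three operations above can be wrapped in small helpers. This is a sketch with hypothetical helper names (`enable`, `disable`, `is_enabled`) that work on any `enum.Flag` value. The `Flags` enum is a local stand-in reusing two values from the global flags table, not the DiffSync class itself.

```python
import enum


class Flags(enum.Flag):
    """Local stand-in enum (hypothetical), not the DiffSync class."""

    NONE = 0
    CONTINUE_ON_FAILURE = 0b1
    SKIP_UNMATCHED_DST = 0b100


def enable(flags, flag):
    """Return flags with the given flag set (bitwise OR)."""
    return flags | flag


def disable(flags, flag):
    """Return flags with the given flag cleared (bitwise AND NOT)."""
    return flags & ~flag


def is_enabled(flags, flag):
    """Return True if any bit of flag is set in flags (bitwise AND)."""
    return bool(flags & flag)


flags = Flags.NONE
flags = enable(flags, Flags.CONTINUE_ON_FAILURE)
flags = enable(flags, Flags.SKIP_UNMATCHED_DST)
flags = disable(flags, Flags.SKIP_UNMATCHED_DST)
```

The helpers return a new value instead of modifying their argument, which keeps them usable with both the augmented operators shown above and plain assignment.
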
57 changes: 57 additions & 0 deletions docs/source/core_engine/02-customize-diff-class.md
@@ -0,0 +1,57 @@

# Custom Diff Class

When performing a diff or a sync operation, a diff object is generated. A diff object is itself composed of DiffElement objects representing the different elements of the original datasets with their differences.

The diff object provides access to all the DiffElements. It's possible to provide your own Diff class to customize some of its capabilities, the main one being the order in which the elements are processed.

## Using your own Diff class

To use your own diff class, you need to provide it at runtime when calling one of these functions: `diff_to`, `diff_from`, `sync_to` or `sync_from`.

```python
>>> from diffsync.enum import DiffSyncFlags
>>> from diff import AlphabeticalOrderDiff
>>> diff = remote_adapter.diff_from(local_adapter, diff_class=AlphabeticalOrderDiff)
>>> type(diff)
<class 'AlphabeticalOrderDiff'>
```

## Change the order in which the elements are processed

By default, all objects of the same type are stored in a dictionary, so the order in which they are processed during a diff or a sync operation is not guaranteed (although in most cases it will match the order in which they were initially loaded and added to the adapter). When the order in which a given group of objects is processed matters, it's possible to define your own ordering inside a custom Diff class.

When iterating over a list of objects, either at the top level or as a group of children of a given object, the core engine looks for a function named after the type of the object, `order_children_<type>`, and if none is found it relies on the default function `order_children_default`. Whichever function is used must return an iterator of DiffElement objects.

In the example below, devices are sorted by type of CRUD operation (`order_children_device`) while all other objects are sorted alphabetically (`order_children_default`).

```python
from collections import defaultdict

from diffsync.diff import Diff


class MixedOrderingDiff(Diff):
    """Alternate diff class to list children in alphabetical order, except devices to be ordered by CRUD action."""

    @classmethod
    def order_children_default(cls, children):
        """Simple diff to return all children in alphabetical order."""
        for child_name, child in sorted(children.items()):
            yield children[child_name]

    @classmethod
    def order_children_device(cls, children):
        """Return a list of device sorted by CRUD action and alphabetically."""
        children_by_type = defaultdict(list)

        # Organize the children's name by action create, update or delete
        for child_name, child in children.items():
            action = child.action or "skip"
            children_by_type[action].append(child_name)

        # Create a global list, organized per action
        sorted_children = sorted(children_by_type["create"])
        sorted_children += sorted(children_by_type["update"])
        sorted_children += sorted(children_by_type["delete"])
        sorted_children += sorted(children_by_type["skip"])

        for name in sorted_children:
            yield children[name]
```
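
The lookup described above, a type-specific `order_children_<type>` method with a fallback to `order_children_default`, can be pictured with `getattr`. The sketch below is not DiffSync's engine code: `OrderingDiff`, `iter_children` and the plain-string children are stand-ins chosen so that the example runs on its own.

```python
class OrderingDiff:
    """Hypothetical diff class with a default ordering and a device-specific one."""

    @classmethod
    def order_children_default(cls, children):
        """Yield children in alphabetical order of their names."""
        for name in sorted(children):
            yield children[name]

    @classmethod
    def order_children_device(cls, children):
        """Yield children in reverse alphabetical order of their names."""
        for name in sorted(children, reverse=True):
            yield children[name]


def iter_children(diff_class, group_type, children):
    """Pick order_children_<group_type> if it exists, else the default method."""
    order_method = getattr(
        diff_class, f"order_children_{group_type}", diff_class.order_children_default
    )
    return order_method(children)


children = {"b": "elem-b", "a": "elem-a", "c": "elem-c"}
devices = list(iter_children(OrderingDiff, "device", children))  # uses order_children_device
sites = list(iter_children(OrderingDiff, "site", children))  # falls back to the default
```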

11 changes: 11 additions & 0 deletions docs/source/core_engine/index.rst
@@ -0,0 +1,11 @@
Core Engine
===========

The core engine of DiffSync is meant to be transparent for most users, but in some cases it's important to be able to change its behavior to fit specific use cases. For these use cases, there are several ways to customize its behavior:

- Global and Model Flags
- Diff class

.. mdinclude:: 01-flags.md
.. mdinclude:: 02-customize-diff-class.md

7 changes: 0 additions & 7 deletions docs/source/examples/01-multiple-data-sources.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/source/examples/02-callback-function.rst

This file was deleted.

8 changes: 4 additions & 4 deletions docs/source/examples/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,8 @@
Examples
############

.. toctree::
:maxdepth: 2
For each example, the complete source code is `available on GitHub <https://github.com/networktocode/diffsync/tree/main/examples>`_ in the `examples` directory.

01-multiple-data-sources
02-callback-function
.. mdinclude:: ../../../examples/01-multiple-data-sources/README.md
.. mdinclude:: ../../../examples/02-callback-function/README.md
.. mdinclude:: ../../../examples/03-remote-system/README.md
15 changes: 7 additions & 8 deletions docs/source/getting_started/01-getting-started.md
Original file line number Diff line number Diff line change
@@ -1,12 +1,11 @@

# Getting started

To be able to properly compare different datasets, DiffSync relies on a shared data model that both systems must use.
Specifically, each system or dataset must provide a `DiffSync` "adapter" subclass, which in turn represents its dataset as instances of one or more `DiffSyncModel` data model classes.

When comparing two systems, DiffSync detects the intersection between the two systems (which data models they have in common, and which attributes are shared between each pair of data models) and uses this intersection to compare and/or synchronize the data.

## Define your model with DiffSyncModel
# Define your model with DiffSyncModel

`DiffSyncModel` is based on [Pydantic](https://pydantic-docs.helpmanual.io/) and is using Python typing to define the format of each attribute.
Each `DiffSyncModel` subclass supports the following class-level attributes:
Expand Down Expand Up @@ -37,11 +36,11 @@ class Site(DiffSyncModel):
database_pk: Optional[int] # not listed in _identifiers/_attributes/_children as it's only locally significant
```

### Relationship between models
## Relationship between models

Currently the relationships between models are very loose by design. Instead of storing an object, it's recommended to store the unique id of an object and retrieve it from the store as needed. The `add_child()` API of `DiffSyncModel` provides this behavior as a default.

## Define your system adapter with DiffSync
# Define your system adapter with DiffSync

A `DiffSync` "adapter" subclass must reference each model available at the top of the object by its modelname and must have a `top_level` attribute defined to indicate how the diff and the synchronization should be done. In the example below, `"site"` is the only top level object so the synchronization engine will only check all known `Site` instances and all children of each Site. In this case, as shown in the code above, `Device`s are children of `Site`s, so this is exactly the intended logic.

Expand All @@ -58,7 +57,7 @@ class BackendA(DiffSync):

It's up to the implementer to populate the `DiffSync`'s internal cache with the appropriate data. In the example below we are using the `load()` method to populate the cache but it's not mandatory, it could be done differently.

## Store data in a `DiffSync` object
# Store data in a `DiffSync` object

To add a site to the local cache/store, you need to pass a valid `DiffSyncModel` object to the `add()` function.

Expand All @@ -77,13 +76,13 @@ class BackendA(DiffSync):
site.add_child(device)
```

## Update remote system on sync
# Update remote system on sync

When data synchronization is performed via `sync_from()` or `sync_to()`, DiffSync automatically updates the in-memory
`DiffSyncModel` objects of the receiving adapter. The implementer of this class is responsible for ensuring that any remote system or data store is updated correspondingly. There are two usual ways to do this, depending on whether it's more
convenient to manage individual records (as in a database) or modify the entire data store in one pass (as in a file-based data store).

### Manage individual records
## Manage individual records

To update individual records in a remote system, you need to extend your `DiffSyncModel` class(es) to define your own `create`, `update` and/or `delete` methods for each model.
A `DiffSyncModel` instance stores a reference to its parent `DiffSync` adapter instance in case you need to use it to look up other model instances from the `DiffSync`'s cache.
Expand All @@ -110,7 +109,7 @@ class Device(DiffSyncModel):
return self
```

### Bulk/batch modifications
## Bulk/batch modifications

If you prefer to update the entire remote system with the final state after performing all individual create/update/delete operations (as might be the case if your "remote system" is a single YAML or JSON file), the easiest place to implement this logic is in the `sync_complete()` callback method that is automatically invoked by DiffSync upon completion of a sync operation.

Expand Down
1 change: 1 addition & 0 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ Welcome to DiffSync's documentation!

overview/index
getting_started/index
core_engine/index
examples/index
api/diffsync
license/index
Expand Down
Original file line number Diff line number Diff line change
@@ -1,10 +1,12 @@
# Example 1
# Example 1 - Multiple Data Sources

This is a simple example to show how DiffSync can be used to compare and synchronize multiple data sources.

For this example, we have a shared model for Device and Interface defined in `models.py`.
We also have 3 instances of DiffSync based on the same model but with different values (BackendA, BackendB & BackendC).

> The source code for this example is on GitHub in the [examples/01-multiple-data-sources/](https://github.com/networktocode/diffsync/tree/main/examples/01-multiple-data-sources) directory.

First create and populate all 3 objects:

```python
Expand Down
File renamed without changes.
Original file line number Diff line number Diff line change
Expand Up @@ -2,27 +2,29 @@

This example shows how you can set up DiffSync to invoke a callback function to update its status as a sync proceeds. This could be used to, for example, update a status bar (such as with the [tqdm](https://github.com/tqdm/tqdm) library), although here for simplicity we'll just have the callback print directly to the console.
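
A callback of this kind only needs to accept the values that DiffSync reports as it progresses. The sketch below assumes the callback receives three positional arguments: a stage name, the number of records processed so far, and the total. The parameter names and the `format_progress` helper are illustrative, and the message format is modeled on the sample output of this example rather than taken from the example's source.

```python
def format_progress(stage, current, total):
    """Build the progress message (hypothetical helper, split out so it can be tested)."""
    return f"{stage}: Processed {current}/{total} records."


def print_callback(stage, current, total):
    """Print the current progress; assumed to be called with (stage, current, total)."""
    print(format_progress(stage, current, total))


# Simulate two progress reports during the "diff" stage.
print_callback("diff", 1, 200)
print_callback("diff", 3, 200)
```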

> The source code for this example is on GitHub in the [examples/02-callback-function/](https://github.com/networktocode/diffsync/tree/main/examples/02-callback-function) directory.


```python
from diffsync.logging import enable_console_logging
from example2 import DiffSync1, DiffSync2, print_callback
from main import DiffSync1, DiffSync2, print_callback

enable_console_logging(verbosity=0) # Show WARNING and ERROR logs only

# Create a DiffSync1 instance and populate it with records numbered 1-100
ds1 = DiffSync1()
ds1.populate(count=100)
ds1.load(count=100)

# Create a DiffSync2 instance and populate it with 100 random records in the range 1-200
ds2 = DiffSync2()
ds2.populate(count=100)
ds2.load(count=100)

# Identify and attempt to resolve the differences between the two,
# periodically invoking print_callback() as DiffSync progresses
ds1.sync_to(ds2, callback=print_callback)
```

You should see output similar to the following:

```
diff: Processed 1/200 records.
diff: Processed 3/200 records.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ class DiffSync1(DiffSync):

top_level = ["number"]

def populate(self, count):
def load(self, count): # pylint: disable=arguments-differ
"""Construct Numbers from 1 to count."""
for i in range(count):
self.add(Number(number=(i + 1)))
Expand All @@ -50,7 +50,7 @@ class DiffSync2(DiffSync):

top_level = ["number"]

def populate(self, count):
def load(self, count): # pylint: disable=arguments-differ
"""Construct count numbers in the range (1 - 2*count)."""
prev = 0
for i in range(count): # pylint: disable=unused-variable
Expand All @@ -68,13 +68,13 @@ def main():
"""Create instances of DiffSync1 and DiffSync2 and sync them with a progress-reporting callback function."""
enable_console_logging(verbosity=0) # Show WARNING and ERROR logs only

# Create a DiffSync1 instance and populate it with records numbered 1-100
# Create a DiffSync1 instance and load it with records numbered 1-100
ds1 = DiffSync1()
ds1.populate(count=100)
ds1.load(count=100)

# Create a DiffSync2 instance and populate it with 100 random records in the range 1-200
# Create a DiffSync2 instance and load it with 100 random records in the range 1-200
ds2 = DiffSync2()
ds2.populate(count=100)
ds2.load(count=100)

# Identify and attempt to resolve the differences between the two,
# periodically invoking print_callback() as DiffSync progresses
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@

# Example 3
# Example 3 - Work with a remote system

This is a simple example to show how DiffSync can be used to compare and synchronize data with a remote system like [Nautobot](https://nautobot.readthedocs.io) via a REST API.

Expand All @@ -8,6 +8,11 @@ A country must be part of a region and has an attribute to capture its populatio

The comparison and synchronization of datasets is done between a local JSON file and the [public instance of Nautobot](https://demo.nautobot.com).

This example also shows:
- How to set a global flag to ignore objects that do not match
- How to provide a custom Diff class to change the ordering of a group of objects

> The source code for this example is on GitHub in the [examples/03-remote-system/](https://github.com/networktocode/diffsync/tree/main/examples/03-remote-system) directory.

## Install the requirements

Expand Down
File renamed without changes.