root/dev/: django-instances-graph-1.1.2.dev0 metadata and description


``django_instances_graph`` provides a way to get the whole graph of instances starting from one

author Stephane "Twidi" Angel
author_email s.angel@twidi.com
classifiers
  • Development Status :: 5 - Production/Stable
  • Operating System :: OS Independent
  • Intended Audience :: Developers
  • License :: OSI Approved :: BSD License
  • Programming Language :: Python :: 3
  • Programming Language :: Python :: 3.6
  • Topic :: Software Development :: Libraries
  • Topic :: Software Development :: Libraries :: Python Modules
  • Framework :: Django
  • Framework :: Django :: 2.2
license BSD
files
  • django_instances_graph-1.1.2.dev0-py2.py3-none-any.whl (Python Wheel, Python 2.7, 30 KB)
    Replaced 1 time(s); uploaded to root/dev by twidi 2020-07-16 09:15:51
  • django_instances_graph-1.1.2.dev0.tar.gz (Source, 48 KB)
    Replaced 2 time(s); uploaded to root/dev by twidi 2020-07-16 09:15:51

======================
django_instances_graph
======================

Purpose
=======

The purpose of the ``django_instances_graph`` library is to create a graph of django model instances
with their relations. This graph can then be serialized via pickle, updated manually, and/or used
to create a duplicate of the data used to fill it.


How it works
============

The whole logic is contained in two classes, explicitly named ``Graph`` and ``Instance``.

A few points to see how these two classes are tied together:

- every ``Instance`` has a ``uuid`` (which is NOT the primary key of the tied django model instance)
- every ``Instance`` holds all its simple values in its ``fields`` attribute (a dict)
- a ``Graph`` has an ``instances`` attribute, holding all its instances. It's a dict with the
  ``uuid`` as keys, and the matching ``Instance`` as values.
- a ``Graph`` has a ``relations`` attribute.

What we call a relation in this library is a direct link between two instances.
This is closer to the way we think about a database than to the way we think in Django: there is no
notion of "many to many" here, because a "many to many" relation is simply a set of relations between
entries in a "through" table and the entries on both sides of the "many to many".

To summarize, all relations in the database, in Django, and therefore in this library, are
"foreign keys".

So we store the relations in a dict, with keys being the UUIDs of the instances declaring the
relation. The values are also dicts, with keys being the names of the foreign key relations.
The values of these inner dicts are sets, holding the UUIDs of the instances on the other side of
the relation.

And to make this complete and more usable, we also store the relation on the other side.
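
The storage layout described above can be pictured with plain Python containers. This is only an
illustrative sketch, not the library's implementation: the ``relations`` variable, the
``add_relation`` helper and its ``reverse_name`` parameter are names invented for this example.

```python
from collections import defaultdict
from uuid import uuid4

# relations[source_uuid][accessor_name] -> set of target uuids
relations = defaultdict(lambda: defaultdict(set))


def add_relation(source_uuid, accessor_name, target_uuid, reverse_name):
    """Store the link on the declaring side, then mirror it on the other side."""
    relations[source_uuid][accessor_name].add(target_uuid)
    relations[target_uuid][reverse_name].add(source_uuid)


book_uuid, author_uuid = uuid4(), uuid4()
add_relation(book_uuid, "author", author_uuid, reverse_name="books")

# The same link can now be read from either side
targets_from_book = relations[book_uuid]["author"]
targets_from_author = relations[author_uuid]["books"]
```

Storing both directions costs a second entry per link, but it lets a lookup start from either
instance without scanning the whole dict.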

Creating a graph is as simple as creating an instance and asking the graph to "serialize" it, as
seen in further sections.


THE API
=======

For the examples, we will use these models:

.. testsetup::

    # This prepares the django environment to run all the "testcode" in this file
    # This is not visible in the rendered README

    import os
    os.environ['DJANGO_SETTINGS_MODULE'] = 'tests.settings'

    import django
    django.setup()

    from django.db import connection
    connection.creation.create_test_db(verbosity=0)

    from tests.models import Author, Book, Tag, Translation

.. code:: python

    from django.db import models

    class Tag(models.Model):
        name = models.CharField(max_length=10)

    class Author(models.Model):
        name = models.CharField(max_length=10)

    class Book(models.Model):
        name = models.CharField(max_length=10)
        author = models.ForeignKey(Author, related_name='books', on_delete=models.CASCADE)
        tags = models.ManyToManyField(Tag, related_name='books')
        translator = models.ManyToManyField(Author, through="Translation",
                                            related_name="translated_books")

    class Translation(models.Model):
        book = models.ForeignKey(Book, related_name="translations", on_delete=models.CASCADE)
        author = models.ForeignKey(Author, related_name="translations", on_delete=models.CASCADE)
        lang = models.CharField(max_length=2)


Also note that all the examples in this documentation are guaranteed to work: they are tested
in the ``tests/test_readme_examples.py`` file, and they can be run one after the other.

Creating objects
++++++++++++++++

Creating a graph
----------------

This is as simple as:

.. code-block:: python

from django_instances_graph import Graph

graph = Graph()


The ``Graph`` constructor doesn't expect any argument.


Creating an instance
--------------------

First thing to know: if you don't pass an existing graph to the ``Instance`` constructor, a new one
will be created, and will be accessible from the ``graph`` attribute of the created ``Instance``.

So when creating many ``Instance`` objects, be careful to always pass a ``Graph``, ideally one
created beforehand.

The main argument to the ``Instance`` constructor is the ``source``.

You can pass it a Django model:

.. code-block:: python

from django_instances_graph import Instance

author_instance = Instance(Author) # or ``source=Author``
assert author_instance.pk is None


In this case no primary key (accessible via the ``pk`` attribute of the created ``Instance``) is
saved in the new object.

You can also pass it a Django model instance that is not in the database. This has the same effect
as passing a model (except for one big detail that we will see later):

.. code-block:: python

from django_instances_graph import Instance

author = Author(name='john')
author_instance = Instance(author) # or ``source=author``
assert author_instance.pk is None


And of course, you can pass a Django model instance from the database:

.. code-block:: python

from django_instances_graph import Instance

author = Author.objects.create(name='john')
author_instance = Instance(author) # or ``source=author``
assert author_instance.pk == author.pk


In this case, the primary key is saved in the newly created object.


Retrieving an instance from the graph
-------------------------------------

Each instance is created with a ``UUID`` (version 4).

Then you can retrieve the instance using it:

.. code-block:: python

from django_instances_graph import Graph, Instance

graph = Graph()
author = Author.objects.create(name='john')
author_instance = Instance(author, graph=graph)
author_uuid = author_instance.uuid

# later

author_instance = graph.get_instance(author_uuid)
assert author_instance.pk == author.pk


Note that you can set the uuid yourself:

.. code-block:: python

from uuid import uuid4
from django_instances_graph import Graph, Instance

graph = Graph()
author_uuid = uuid4()

author = Author.objects.create(name='john')
author_instance = Instance(author, graph=graph, uuid=author_uuid)
assert author_instance.uuid == author_uuid


If the instance has a pk, you can also retrieve it via its model + pk:

.. code-block:: python


from django_instances_graph import Graph, Instance

graph = Graph()
author = Author.objects.create(name='john')
author_instance = Instance(author, graph=graph)
author_uuid = author_instance.uuid

# later

author_instance = graph.get_instance(Author, author.pk)
assert author_instance.uuid == author_uuid


Trying to get an instance that does not exist in the graph will raise a ``KeyError``.
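
To picture how both lookup styles can coexist, here is a small stand-alone sketch. It does not
reproduce the library's code: ``ToyRegistry`` and its ``add`` and ``get`` methods are hypothetical
names, and the entries are plain dicts instead of ``Instance`` objects.

```python
from uuid import uuid4


class ToyRegistry:
    """Hypothetical stand-in for a graph: one dict by uuid, one index by (model, pk)."""

    def __init__(self):
        self.by_uuid = {}
        self.by_model_pk = {}

    def add(self, model_name, pk=None, uuid=None):
        uuid = uuid or uuid4()
        entry = {"model": model_name, "pk": pk, "uuid": uuid}
        self.by_uuid[uuid] = entry
        if pk is not None:
            # Only entries with a primary key can be found via model + pk
            self.by_model_pk[(model_name, pk)] = uuid
        return entry

    def get(self, uuid_or_model, pk=None):
        if pk is None:
            return self.by_uuid[uuid_or_model]  # raises KeyError if unknown
        return self.by_uuid[self.by_model_pk[(uuid_or_model, pk)]]  # raises KeyError if unknown


registry = ToyRegistry()
john = registry.add("Author", pk=1)
found_by_uuid = registry.get(john["uuid"])
found_by_pk = registry.get("Author", 1)
```

In this sketch an unknown uuid or an unknown model + pk pair surfaces as a ``KeyError`` from the
underlying dict, which matches the behaviour described above.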

Checking if an instance exists in the graph
-------------------------------------------

You can do the check by uuid or by model + pk:

.. code-block:: python

from uuid import uuid4
from django_instances_graph import Graph, Instance

graph = Graph()

uuid = uuid4()
assert graph.has_instance(uuid) is False

author = Author.objects.create(name='john')
author_instance = Instance(author, graph=graph, uuid=uuid)

assert author_instance.uuid == uuid
assert graph.has_instance(uuid) is True
assert graph.has_instance(Author, author.pk) is True


Saving simple fields in instances
---------------------------------

Each ``Instance`` object has a ``fields`` dictionary to hold simple field values (simple fields
are all the fields that are not relations to another model: ``CharField``, ``IntegerField``...)

.. code-block:: python

from django_instances_graph import Graph, Instance

graph = Graph()

author = Instance(Author, graph=graph)
author.fields['name'] = 'john'

assert graph.get_instance(author.uuid).fields['name'] == 'john'


We'll see later that these fields can be automatically filled during the serializing process.


Adding a simple relation
------------------------

The serializing process will add the relations itself, but you can also create them manually.
Remember that a relation is a link between two instances, via a ``ForeignKey`` or a
``OneToOneField`` (which is a sort of ``ForeignKey``).

For example we have a ``ForeignKey`` between ``Book`` and ``Author``, so we can do this:

.. code-block:: python

from django_instances_graph import Graph, Instance

graph = Graph()

author = Instance(Author, graph=graph)
book = Instance(Book, graph=graph)

book.add_relation('author', author)
# or book.add_relation(accessor_name='author', target=author)
# or graph.add_relation(book, 'author', author)
# or graph.add_relation(source=book, accessor_name='author', target=author)


The ``add_relation`` method will check that ``author`` is a correct field type, and that the target
is from the expected model (or a ``ValueError`` will be raised)

This relation could have been added in the opposite way (using the ``related_name`` as the accessor
name):

.. code-block:: python

author.add_relation('books', book)


Adding a M2M relation
---------------------

To add a ``ManyToMany`` relation, we have to distinguish two cases: either the relation has an
auto-created "through" model, or not.

In the first case, it's easy, because the ``Instance`` class has an ``add_direct_m2m_relation``
method, used this way:

.. code-block:: python

tag1 = Instance(Tag, graph=graph)
tag2 = Instance(Tag, graph=graph)

book.add_direct_m2m_relation('tags', [tag1, tag2])
# or book.add_direct_m2m_relation(accessor_name='tags', targets=[tag1, tag2])
# or graph.add_direct_m2m_relation(source=book, accessor_name='tags', targets=[tag1, tag2])


When a ``ManyToMany`` "entry" is created, what happens in Django is that there is a "through"
model in the middle that has two ``ForeignKey`` fields, one for each side: in our case, one to the
``Book`` model, and one to the ``Tag`` model.

The ``add_direct_m2m_relation`` creates the relation in the graph from this "through" model to the
source and the targets. So in our example, we have two "through" entries and it will create 4
relations:

- "through1" to "book"
- "through1" to "tag1"
- "through2" to "book"
- "through2" to "tag2"

Note that the ``add_direct_m2m_relation`` method also returns ``Instance`` objects of the "through"
model, one for each target, in the same order. And it's also important to know that it will not
replace the existing relations, but will just add the specified ones.
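
The four relations listed above can be reproduced with a toy version of the idea. This is a sketch
under simplifying assumptions, not the library's code: instances are reduced to bare UUIDs, and the
``link`` and ``add_direct_m2m`` helpers are hypothetical.

```python
from uuid import uuid4

relations = {}  # uuid -> accessor name -> set of uuids


def link(source, accessor_name, target):
    relations.setdefault(source, {}).setdefault(accessor_name, set()).add(target)


def add_direct_m2m(source, source_fk, target_fk, targets):
    """Create one "through" uuid per target, with one foreign-key style link to each side."""
    throughs = []
    for target in targets:
        through = uuid4()
        link(through, source_fk, source)  # "through" to the source
        link(through, target_fk, target)  # "through" to the target
        throughs.append(through)
    return throughs  # same order as ``targets``


book, tag1, tag2 = uuid4(), uuid4(), uuid4()
through1, through2 = add_direct_m2m(book, "book", "tag", [tag1, tag2])
```

Calling the helper again with another tag would add a third "through" entry without touching the
first two, which mirrors the "add, don't replace" behaviour described above.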

This is more complicated when the "through" model is manually defined in the definition of the
``ManyToManyField`` because there are, in general, additional fields that the graph cannot "guess",
so the whole work has to be done manually.

We can see this in an example with the ``translator`` ``ManyToManyField``, whose "through" model is
``Translation``:

.. code-block:: python

translator1 = Instance(Author, graph=graph)
translator2 = Instance(Author, graph=graph)

# for book => translator1
translation1 = Instance(Translation, graph=graph)
translation1.add_relation('book', book)
translation1.add_relation('author', translator1)

# for book => translator2
translation2 = Instance(Translation, graph=graph)
translation2.add_relation('book', book)
translation2.add_relation('author', translator2)


Retrieving relations from the graph
-----------------------------------

What we want is not to retrieve the relation itself, but the targets of the relation:

To get the author of the book:

.. code-block:: python

author2 = list(book.get_relation_targets('author'))[0]
# or author2 = list(graph.get_relation_targets(source=book, accessor_name='author'))[0]
assert author2 is author


Of course here we have a ``ForeignKey``, so we *should* have only one entry. But there could also
be none, in which case the ``[0]`` lookup above raises an ``IndexError``.

It is also possible to get all the books for an author:

.. code-block:: python

books = author.get_relation_targets('books')
assert books == {book}


Here we can see why this method returns a collection rather than a single object (it is a ``set``,
so it is not ordered).

Note that what is returned are ``Instance`` objects, not instances of the django model.


And to retrieve the targets of a ``ManyToMany``:

.. code-block:: python

tags = book.get_m2m_relation_targets('tags')
# or tags = graph.get_m2m_relation_targets(source=book, accessor_name='tags')
assert tags == {tag1, tag2}


And to get all the books for a tag:

.. code-block:: python

books = tag1.get_m2m_relation_targets('books')
assert books == {book}


Unlike ``add_direct_m2m_relation``, this method works for both auto-created and manually defined
"through" models, because we only want the targets (this is why there is no "direct" in the name
of this method).

If a manual "through" was defined, simply use the ``get_relation_targets`` method to get the
"through" entries. With our previous example, this gives:

.. code-block:: python

translations = book.get_relation_targets('translations')
assert translations == {translation1, translation2}


Removing relations from the graph
---------------------------------

To remove a direct relation:

.. code-block:: python

book.remove_relation('author', author)
# or graph.remove_relation(book, 'author', author)
assert book.get_relation_targets('author') == set()

And for a ``ManyToMany``:

.. code-block:: python

book.remove_m2m_relation('tags', [tag1, tag2])
# or graph.remove_m2m_relation(book, 'tags', [tag1, tag2])
assert book.get_m2m_relation_targets('tags') == set()


Note that this will not remove all the existing relations, but only the specified ones.

To remove all the relations, you can do, for a direct relation:

.. code-block:: python

book.add_relation('author', author) # just to have something

book.clear_relation('author')
# or graph.clear_relation(book, 'author')
assert book.get_relation_targets('author') == set()


And for a ``ManyToMany``:

.. code-block:: python

book.add_direct_m2m_relation('tags', [tag1, tag2]) # just to have something

book.clear_m2m_relation('tags')
# or graph.clear_m2m_relation(book, 'tags')
assert book.get_m2m_relation_targets('tags') == set()


``remove_m2m_relation`` and ``clear_m2m_relation`` accept a ``remove_through_instances`` argument,
defaulting to ``True``, that removes from the graph the "through" entries of the removed relations.

It can be useful to set it to ``False`` with a manual "through" model, when there are other fields
on it or other relations going from or to it.

Also note that these two methods return these "through" instances (even if they are removed from the
graph, they still exist as ``Instance`` objects).


Serialization
+++++++++++++

Serialization is the process of converting a django model instance into an ``Instance`` of a
graph, saving its fields and relations.

Serializing an instance
-----------------------

Now that we know how to create the graph, instances and relations manually, let's see how to do it
automatically.

First, we can serialize just an instance, not saved in database.


.. code-block:: python

from django_instances_graph import Instance

author = Author(name='john')
author_instance = Instance(author, serialize=True)

assert author_instance.fields['name'] == 'john'
assert author_instance.serialized is True


We can also do it in two steps if for example we used a model as the ``Instance`` source.


.. code-block:: python

from django_instances_graph import Instance

author_instance = Instance(Author)

author_instance.serialize(Author(name='john'))

assert author_instance.fields['name'] == 'john'
assert author_instance.serialized is True


Yes, ``serialize`` expects an instance of the django model, because if an instance of such a model
is passed to the ``Instance`` constructor, it is *not* saved in the ``Instance`` object.


Serializing a graph
-------------------

Serializing an instance is ok, but it's not really what this library is about. We want to
serialize the whole graph, from a starting point.

Let's see how to auto-create the instances and relations from the database.

If we want all the objects related to a book in the database, it's as simple as setting the
``serialize`` argument to ``True`` when creating an instance:

.. code-block:: python

from django_instances_graph import Instance

author = Author.objects.create(name='author1')
tag1 = Tag.objects.create(name='tag1')
tag2 = Tag.objects.create(name='tag2')
tag3 = Tag.objects.create(name='tag3')
book1 = Book.objects.create(name='book 1', author=author)
book1.tags.set([tag1, tag2])  # direct assignment to a many-to-many is rejected since Django 2.0
book2 = Book.objects.create(name='book 2', author=author)
book2.tags.set([tag1, tag3])

book_instance = Instance(book1, serialize=True)

assert book_instance.pk == book1.pk
assert book_instance.fields['name'] == 'book 1'
author_instance = list(book_instance.get_relation_targets('author'))[0]
assert author_instance.fields['name'] == 'author1'
tag_instances = book_instance.get_m2m_relation_targets('tags')
assert set(t.fields['name'] for t in tag_instances) == {'tag1', 'tag2'}


Note that we didn't set the ``graph`` argument, so we can get it back using ``book_instance.graph``.
But it could of course have been defined before and passed to the ``Instance`` constructor, as
seen before.

What is done by passing ``serialize=True``:

- all the simple fields are saved in ``book_instance.fields``
- all the relations from "book1" to any related model are created
- every related object gets its own ``Instance`` in the graph, also serialized, i.e. with its simple
  fields and its relations too
- this is done recursively until there are no more relations to follow

So we'll have the book, its author and its tags. But we'll also have the other books of the author,
and their tags too, and all the books for all the tags.

Maybe that's what you want, but there is a good chance it's not.

For this, let's introduce what we call "boundaries".

Defining boundaries
-------------------

In our example, we just want to serialize the book, its relations to an author and to its tags.

So, the boundaries are:

- the "author" ``ForeignKey`` from the ``Book`` model
- the "tag" ``ForeignKey`` from the "through" model between the ``Book`` and ``Tag`` models

To define boundaries, a new class inheriting from ``Graph`` must be defined, and its
``is_relation_boundary`` method must be overridden:

.. code-block:: python

    from django_instances_graph import Graph
    from django_instances_graph.utils import is_through_model

    class BookGraph(Graph):
        def is_relation_boundary(self, instance, accessor_name, field, field_type):

            # The book author is a boundary
            if instance.model is Book and accessor_name == 'author':
                return True

            # The tag of a book <=> tag through is a boundary. We don't block on the m2m field
            # because we don't want the through entries to be the boundaries, but the tags
            # book --- [not boundary ] --- through model ---- [ boundary] --- tag
            if is_through_model(Book, 'tags', instance.model) and accessor_name == 'tag':
                return True

            return False


Now we can do the serialization and check that the boundaries are correctly set:

.. code-block:: python

    graph = BookGraph()

    Instance(book1, serialize=True, graph=graph)

    assert graph.get_instance(Author, author.pk).is_boundary

    for tag in book1.tags.all():
        assert graph.get_instance(Tag, tag.pk).is_boundary


What happens when an ``Instance`` is marked as a boundary:

- its ``is_boundary`` attribute is set to ``True``
- the ``Instance`` is created in the graph and, if it is created automatically by the serialization
  of another model, only its simple fields are serialized
- if it is not serialized, no relations are created in the graph starting from it

This will be used in the deserializing process, for example when we want to duplicate a graph, as
we'll see below.
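
The effect of boundaries on the recursive walk can be illustrated without Django. The sketch below
is only a model of the idea, not the library's algorithm: the ``links`` table and the ``collect``
function are hypothetical, and many-to-many relations are flattened (no "through" nodes) to keep
it short.

```python
from collections import deque

# Hypothetical adjacency for the example data: node -> {accessor_name: [neighbours]}
links = {
    "book1": {"author": ["author1"], "tags": ["tag1", "tag2"]},
    "book2": {"author": ["author1"], "tags": ["tag1", "tag3"]},
    "author1": {"books": ["book1", "book2"]},
    "tag1": {"books": ["book1", "book2"]},
    "tag2": {"books": ["book1"]},
    "tag3": {"books": ["book2"]},
}


def collect(start, is_boundary):
    """Walk the links from ``start``; boundary nodes are kept but never expanded."""
    seen, boundaries = {start}, set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in boundaries:
            continue  # kept in the result, but nothing is followed from it
        for accessor_name, neighbours in links[node].items():
            for neighbour in neighbours:
                if is_boundary(node, accessor_name):
                    boundaries.add(neighbour)
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
    return seen, boundaries


# Without boundaries, everything reachable is collected
everything, _ = collect("book1", lambda node, accessor_name: False)

# With the author and the tags of a book as boundaries, the walk stops there
bounded, marked = collect(
    "book1",
    lambda node, accessor_name: node.startswith("book") and accessor_name in {"author", "tags"},
)
```

With the boundary predicate in place, ``book2`` and ``tag3`` are never reached, because the only
paths to them go through the author or a tag.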


Note that when creating ``Instance`` objects manually (i.e. not by creating one and letting the
graph create the others during the serialization process), it is possible to mark them as
boundaries too:

.. code-block:: python

graph = BookGraph()

author_instance = Instance(Author, graph=graph, is_boundary=True)
assert graph.get_instance(author_instance.uuid).is_boundary


Deserialization
+++++++++++++++

The deserialization is the process of converting ``Instance`` objects of a graph, and their
relations, into real django model instances, saved in database.

Deserializing an instance
-------------------------

If the ``Instance`` objects have primary keys, the objects in database will be updated. Otherwise,
they will be created.

First, with an instance that does not come from the database:

.. code-block:: python

graph = BookGraph()

author_instance = Instance(Author, graph=graph)
author_instance.fields['name'] = 'john'

author = author_instance.deserialize()

assert isinstance(author, Author)
assert author.pk is not None
assert author.name == 'john'

# The instance has a pk now
assert author_instance.pk == author.pk


Note that you can pass the django model instance that will hold the deserialized data:

.. code-block:: python

graph = BookGraph()

author_instance = Instance(Author, graph=graph)
author_instance.fields['name'] = 'john'

blank_author = Author()

author = author_instance.deserialize(blank_author)

assert isinstance(author, Author)
assert author is blank_author
assert author.pk is not None
assert author.name == 'john'


And now with an existing object from the database:

.. code-block:: python

graph = BookGraph()

original_author = Author.objects.create(name='john')
author_instance = Instance(original_author, graph=graph, serialize=True)

# later

author_instance.fields['name'] = 'peter'
author = author_instance.deserialize()

assert author.pk == original_author.pk
assert author.name == 'peter'


Deserializing a graph
---------------------

This is the most interesting part of this library. It allows, for example:

- to create objects and their relations first, then save everything in database at the end
  (which is not possible with django model instances, as the instances must be saved to create
  relations between them)
- to extract some data from the database and duplicate it

Start by creating some instances, not in database, and their relations:

.. code-block:: python

graph = BookGraph()

# Two things to notice:
# - We pass ``serialize=True`` to save the fields
# - We set the boundaries manually as the boundaries can only be defined automatically
# in the full serialization process from django model instances, which we don't have here
author_instance = Instance(Author(name='john'), graph=graph, serialize=True,
is_boundary=True)
book_instance = Instance(Book(name='my book'), graph=graph, serialize=True)
tag1_instance = Instance(Tag(name='tag1'), graph=graph, serialize=True, is_boundary=True)
tag2_instance = Instance(Tag(name='tag2'), graph=graph, serialize=True, is_boundary=True)

book_instance.add_relation('author', author_instance)
book_instance.add_direct_m2m_relation('tags', [tag1_instance, tag2_instance])


Now we can deserialize the whole graph by simply deserializing one ``Instance``:

.. code-block:: python

book = book_instance.deserialize()

assert book.author.name == 'john'
assert set(book.tags.values_list('name', flat=True)) == {'tag1', 'tag2'}


It's done, the whole graph is saved in database.

Duplicating a graph
-------------------

Duplicating a graph means creating new objects and relations in database, identical to the ones in
the graph but, obviously, with different primary keys.

This is simple, as the ``Graph`` class provides a ``clear_pks`` method.

So, following the previous deserialization just above, we can do:

.. code-block:: python

graph.clear_pks()
book2 = book_instance.deserialize()

assert book2.pk != book.pk
assert book2.author.name == 'john'
assert set(book2.tags.values_list('name', flat=True)) == {'tag1', 'tag2'}

# 1 author for both, because as a boundary the author is not deserialized if it exists in db
assert book2.author_id == book.author_id
# and same for the tags
book1_tags = set(book.tags.values_list('pk', flat=True))
book2_tags = set(book2.tags.values_list('pk', flat=True))
assert book1_tags == book2_tags


Note that between the call to ``clear_pks`` and ``deserialize``, it is possible to update the graph.
For example:

.. code-block:: python

graph.clear_pks()

# Change a field
book_instance.fields['name'] = 'new book'
# Remove a relation
book_instance.remove_m2m_relation('tags', [tag2_instance])
# And add another
tag3_instance = Instance(Tag(name='tag3'), graph=graph, serialize=True, is_boundary=True)
book_instance.add_direct_m2m_relation('tags', [tag3_instance])

book3 = book_instance.deserialize()
assert book3.name == 'new book'
assert book3.author.name == 'john'
assert set(book3.tags.values_list('name', flat=True)) == {'tag1', 'tag3'}


Overriding
++++++++++

There are two concepts of overriding in this section: class inheritance, to override the ``Graph``
and ``Instance`` classes, and value overriding, to change the values and relations saved in the
graph during the serialization or retrieved from it during the deserialization.

Class inheritance
-----------------

Inherit from Graph
^^^^^^^^^^^^^^^^^^

You can easily inherit from the ``Graph`` class if you want to change its behaviour. But in this
case, don't forget to always create your ``Graph`` instance manually and pass it to each
``Instance`` because if an instance has no ``graph`` passed to its constructor, it will create one
using the default ``Graph`` class.

What you can do by creating your own ``Graph`` subclass:

- define a default class to use for instances, instead of ``Instance`` (which is the default):

.. code-block:: python

    from django_instances_graph import Graph, Instance

    class MyDefaultInstance(Instance):
        pass

    class MyGraph(Graph):
        default_instance_class = MyDefaultInstance

    graph = MyGraph()
    instance = Instance(Author, graph=graph)
    assert isinstance(instance, MyDefaultInstance)


- define which subclass of ``Instance`` to use for the model you want. If there is no ``Instance``
  class defined for a model, the one defined in ``default_instance_class`` will be used.

.. code-block:: python

    class AuthorInstance(Instance):
        pass

    class BookInstance(Instance):
        pass

    class MyGraph(Graph):
        default_instance_class = MyDefaultInstance
        instance_classes_for_models = {
            Author: AuthorInstance,
            Book: BookInstance,
        }

    graph = MyGraph()
    book_instance = Instance(Book, graph=graph)
    assert isinstance(book_instance, BookInstance)
    author_instance = Instance(Author, graph=graph)
    assert isinstance(author_instance, AuthorInstance)
    tag_instance = Instance(Tag, graph=graph)
    assert isinstance(tag_instance, MyDefaultInstance)


Another way to do this, if you "own" your models, is to add a ``graph_instance_class`` attribute
to your model, setting it to the subclass of ``Instance`` you want. Of course you cannot do this on
models you don't own (i.e. from external applications).


Inherit from Instance
^^^^^^^^^^^^^^^^^^^^^

You may want to inherit from ``Instance`` for example to change the ``__repr__`` method, or to
override the ``override_value_to_serialize`` and ``override_value_to_deserialize`` methods, as we'll
see in the next sub-section.

Value overriding
----------------

(For a full example of how to override the serialized or deserialized value for all kinds of fields,
you can check the ``OverrideInstance`` class in the ``tests/models.py`` file.)

Serialization
^^^^^^^^^^^^^

During the serialization you may want to change on the fly the simple values saved in the
``fields`` attribute of an ``Instance``, or its relations to other objects.

There is one entry point for this on the ``Instance`` class, ``override_value_to_serialize``, where
you can add your own logic in a subclass. Return simple values for simple fields, and a django model
instance (or a list of them) for relations (because when serializing, we convert values from the
django model instances to our own ``Instance`` objects, and here we just intercept the values from
the django model instances).

Here is an example that changes a simple value, a relation, and a many-to-many relation:


.. code-block:: python

    from django_instances_graph import Instance
    from django_instances_graph.utils import get_through_info


    class BookInstance(Instance):
        def override_value_to_serialize(self, model_instance, field, accessor_name, field_type,
                                        value):

            # ``value`` is the value got from the serialized book (accessible via
            # ``model_instance``), but you can also get it by calling ``super``:
            value = super().override_value_to_serialize(model_instance, field, accessor_name,
                                                        field_type, value)

            # For a simple field
            if accessor_name == 'name':
                # Add a number to the book's name
                return '%s (%s)' % (value, 123)

            # For a simple relation
            elif accessor_name == 'author':
                # Change the author
                value = Author(name='new author')

            # For a many-to-many
            elif accessor_name == 'tags':
                # Add a new tag: a list of instances from the "through" model is expected
                through_model = get_through_info(Book, 'tags')[0]
                value = list(value) + [
                    through_model(
                        book=model_instance,
                        tag=Tag(name='new tag'),
                    )
                ]

            # always return the original value if you don't change it
            return value

    # This should be defined in the model declaration
    Book.graph_instance_class = BookInstance

    author = Author.objects.create(name='john')
    book = Book.objects.create(name='my book', author=author)
    # ``set()`` is used because direct assignment to a many-to-many is rejected since Django 2.0
    book.tags.set([Tag.objects.create(name='tag1'), Tag.objects.create(name='tag2')])

    instance = Instance(book, serialize=True)

    assert instance.fields['name'] == 'my book (123)'
    assert list(instance.get_relation_targets('author'))[0].fields['name'] == 'new author'
    tag_instances = instance.get_m2m_relation_targets('tags')
    assert set(tag.fields['name'] for tag in tag_instances) == {'tag1', 'tag2', 'new tag'}


Don't forget to also do it on reverse relations to avoid surprises. For example if you return a
different value for a ``OneToOneField``, but the original instance for this field is serialized,
the final relation may not be the one you expect.

Deserialization
^^^^^^^^^^^^^^^

Deserialization is the process of converting ``Instance`` objects and their relations from a
``Graph`` into real django model instances.

There is an entry point on the ``Instance`` class where you can change on the fly the values and
relations that will be used instead of the ones on the ``Graph``.

Note that the expected return value of this method, is simple values for simple fields, and
``Instance`` objects for targets of relations.

Here is an example if we want to change a simple value, a relation, and a many-to-many relation:

.. code-block:: python

    from django_instances_graph import Graph, Instance
    from django_instances_graph.utils import get_through_info


    class BookInstance(Instance):
        def override_value_to_deserialize(self, field, accessor_name, field_type, value,
                                          model_instance):

            # ``value`` is the value got from the deserialized book (accessible via
            # ``self``), but you can also get it by calling ``super``:
            value = super().override_value_to_deserialize(field, accessor_name, field_type, value,
                                                          model_instance)

            # For a simple field
            if accessor_name == 'name':
                # Add a number to the book's name
                return '%s (%s)' % (value, 456)

            # For a simple relation
            elif accessor_name == 'author':
                self.clear_relation('author')  # it's important, but note that it changes the graph
                value = Instance(Author(name='the author'), graph=self.graph, serialize=True)

            # For a many-to-many
            elif accessor_name == 'tags':
                # Add a new tag: a list of ``Instance`` from the "through" model is expected
                value = value + self.add_direct_m2m_relation('tags', [
                    Instance(Tag(name='the tag'), graph=self.graph, serialize=True),
                ])

            # always return the original value if you don't change it
            return value


    # This should be defined in the model declaration
    Book.graph_instance_class = BookInstance

    graph = Graph()
    book_instance = Instance(Book(name='my book'), graph=graph, serialize=True)
    author_instance = Instance(Author(name='john'), graph=graph, serialize=True)
    book_instance.add_relation('author', author_instance)
    book_instance.add_direct_m2m_relation('tags', [
        Instance(Tag(name='tag1'), graph=graph, serialize=True),
        Instance(Tag(name='tag2'), graph=graph, serialize=True),
    ])

    book = book_instance.deserialize()
    assert book.name == 'my book (456)'
    assert book.author.name == 'the author'
    assert set(book.tags.values_list('name', flat=True)) == {'tag1', 'tag2', 'the tag'}

Other goodies
+++++++++++++

Serializing the graph
---------------------

What? Serializing the graph? But we just did it above!!

No, we serialized the instances of the graph, but this is about serializing a ``Graph`` object.

What is the purpose of this?

Let's imagine you want to duplicate a graph many times and, because this may be costly, you do it
asynchronously by using, for example, celery.

But before duplicating, which is a deserialization, you must fill the graph, which is done by
fetching data from the database.

What if we could serialize the state of the graph and simply store it, pass it around, or do
whatever we want with it?

Thanks to ``pickle``, we can: the ``Graph`` and ``Instance`` objects are "pickle-ready".

So, to serialize a ``Graph`` object:

.. code-block:: python

    import pickle

    # Reset the instance class used by the book as it cannot be pickled from the readme file!
    Book.graph_instance_class = None

    graph = Graph()
    author = Author.objects.create(name='john')
    book = Book.objects.create(name='my book', author=author)
    # Direct assignment to a many-to-many field is not allowed since django 2.0: use ``set``
    book.tags.set([Tag.objects.create(name='tag1'), Tag.objects.create(name='tag2')])

    instance = Instance(book, serialize=True, graph=graph)

    pickled_graph = pickle.dumps(graph)

    # later, we get back the pickled graph, and we must also have the book's primary key

    book_pk = book.pk  # in practice, we get it another way

    graph = pickle.loads(pickled_graph)

    book_instance = graph.get_instance(Book, book_pk)
    assert book_instance.fields['name'] == 'my book'
    author_instance = list(book_instance.get_relation_targets('author'))[0]
    assert author_instance.fields['name'] == 'john'
    tag_instances = book_instance.get_m2m_relation_targets('tags')
    assert set(t.fields['name'] for t in tag_instances) == {'tag1', 'tag2'}


If you want to add some data to the serialized graph, simply override the ``__getstate__`` and
``__setstate__`` methods in your ``Graph`` subclass.
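
As a rough sketch of that pattern, the snippet below uses a stand-in ``BaseGraph`` class
(hypothetical, it is not the library's ``Graph``) whose pickled state is assumed to be a plain
dict. The real ``Graph`` state may be shaped differently, so check what
``super().__getstate__()`` returns before extending it. The ``label`` attribute is only there for
the illustration.

.. code-block:: python

    import pickle


    class BaseGraph:
        """Stand-in for ``Graph``; we ASSUME its pickled state is a dict."""

        def __init__(self):
            self.instances = {}

        def __getstate__(self):
            return {'instances': self.instances}

        def __setstate__(self, state):
            self.instances = state['instances']


    class LabeledGraph(BaseGraph):
        """Subclass carrying one extra (hypothetical) attribute through pickling."""

        def __init__(self, label=None):
            super().__init__()
            self.label = label

        def __getstate__(self):
            # Start from the parent's state, then add our own data
            state = super().__getstate__()
            state['label'] = self.label
            return state

        def __setstate__(self, state):
            # Take our data out before handing the rest to the parent
            self.label = state.pop('label', None)
            super().__setstate__(state)


    graph = LabeledGraph(label='nightly export')
    graph.instances['some-uuid'] = 'placeholder'

    restored = pickle.loads(pickle.dumps(graph))
    assert restored.label == 'nightly export'
    assert restored.instances == {'some-uuid': 'placeholder'}

Calling ``super()`` in both methods keeps the subclass independent of how the parent stores its
own data.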


Cloning a graph
---------------

Say you created a graph with objects from the database, and you want to update it to do some
manipulations while still keeping the original graph.

For this, call the ``clone`` method of a ``Graph`` object: it creates a new graph with the same
instances and relations as the original one. You can then keep the original and update the clone
(or the reverse if you want).

.. code-block:: python

    cloned_graph = graph.clone()

    assert cloned_graph is not graph
    assert {inst.uuid for inst in cloned_graph.instances.values()} == \
        {inst.uuid for inst in graph.instances.values()}

    book_instance = cloned_graph.get_instance(Book, book_pk)
    assert book_instance.fields['name'] == 'my book'
    author_instance = list(book_instance.get_relation_targets('author'))[0]
    assert author_instance.fields['name'] == 'john'
    tag_instances = book_instance.get_m2m_relation_targets('tags')
    assert set(t.fields['name'] for t in tag_instances) == {'tag1', 'tag2'}


Cloning uses ``pickle`` under the hood, so overriding ``__getstate__`` and ``__setstate__`` as
described above also lets you add more information to the cloned data.
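
To illustrate what a pickle-based clone implies, here is a minimal sketch with a hypothetical
``TinyGraph`` stand-in. It is not the library's ``Graph``, and the one-line ``clone`` below is an
assumption about the general technique, not a copy of the actual implementation. The point is that
the copy is a separate object with equal content, so changing it leaves the original untouched.

.. code-block:: python

    import pickle


    class TinyGraph:
        """Hypothetical stand-in for ``Graph``, only to show the round trip."""

        def __init__(self):
            self.instances = {}

        def clone(self):
            # ASSUMPTION: a clone is a pickle round trip of the whole object
            return pickle.loads(pickle.dumps(self))


    original = TinyGraph()
    original.instances['uuid-1'] = {'name': 'my book'}

    cloned = original.clone()
    cloned.instances['uuid-1']['name'] = 'renamed in the clone'

    assert cloned is not original
    assert original.instances['uuid-1']['name'] == 'my book'
    assert cloned.instances['uuid-1']['name'] == 'renamed in the clone'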


About through models
--------------------

We talked a lot about the "through" model, which is the model sitting between the two sides of a
"many-to-many" relation.

When this "through" model is manually created and defined in the ``ManyToManyField``, you already
have all the information you may need.

But for auto-created "through" models, it's not so easy.

For this, and because we use it a lot in the code, we provide two utility functions in
``django_instances_graph.utils``:


- ``is_through_model(source_model, accessor_name, through_model)``

This function simply tells if a given model is the "through" model for a relation:

.. code-block:: python

    from django.apps import apps
    from django_instances_graph.utils import is_through_model

    # Get the "through" model from django
    books_tags_through_model = apps.get_model('tests', 'Book_tags')

    # In the normal direction
    assert is_through_model(Book, 'tags', Author) is False
    assert is_through_model(Book, 'tags', books_tags_through_model) is True

    # But also in the reverse direction
    assert is_through_model(Tag, 'books', Author) is False
    assert is_through_model(Tag, 'books', books_tags_through_model) is True


- ``get_through_info(model, accessor_name)``

This function returns some useful information about the "through" model for the given many-to-many
relation:

.. code-block:: python

    from django_instances_graph.utils import get_through_info

    through_model, through_source_field, through_target_field, target_model, \
        target_accessor_name = get_through_info(Book, 'tags')

    assert through_model is books_tags_through_model

    # the field on the through model which links to the source model (passed in argument)
    assert through_source_field.name == 'book'

    # the field on the through model which links to the target model (on the other side of the m2m)
    assert through_target_field.name == 'tag'

    # the django model on the other side of the m2m relation
    assert target_model is Tag

    # the name of the attribute on the target model to access the m2m field
    assert target_accessor_name == 'books'


For ``through_source_field`` and ``through_target_field``, use ``.name`` to get the accessor
name from the "through" model to the model, as seen in the example above, and
``.remote_field.get_accessor_name()`` to get the accessor name from the model to the "through"
model:

.. code-block:: python

    assert through_source_field.remote_field.get_accessor_name() == 'Book_tags+'
    assert through_target_field.remote_field.get_accessor_name() == 'Book_tags+'


As you can see, this name ends with ``+``, so you cannot use it to access "through" entries
from the django model instances (this is only true for auto-created "through" models, though), but
you can use it to get relations from the ``Instance`` objects:

.. code-block:: python

    through_instances = list(book_instance.get_relation_targets('Book_tags+'))
    assert through_instances[0].model is books_tags_through_model

    tag_instance = list(tag_instances)[0]
    through_instances = list(tag_instance.get_relation_targets('Book_tags+'))
    assert through_instances[0].model is books_tags_through_model


Of course all of this works on the other side too:

.. code-block:: python

    through_model, through_source_field, through_target_field, target_model, \
        target_accessor_name = get_through_info(Tag, 'books')

    assert through_model is books_tags_through_model
    assert through_source_field.name == 'tag'
    assert through_target_field.name == 'book'
    assert target_model is Book
    assert target_accessor_name == 'tags'
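
For reference, some of these facts can be cross-checked against plain django. The sketch below is
standalone and does not use ``django_instances_graph`` at all: it configures bare settings and
declares throwaway ``Tag`` and ``Book`` models under a made-up ``demo`` app label (assumptions for
this illustration only, they are not the ``tests`` models used in this README). It relies only on
django's own ``.through`` and ``_meta.auto_created`` attributes.

.. code-block:: python

    import django
    from django.conf import settings

    # Bare configuration: no database access is needed to inspect models
    settings.configure(INSTALLED_APPS=[])
    django.setup()

    from django.db import models


    class Tag(models.Model):
        name = models.CharField(max_length=50)

        class Meta:
            app_label = 'demo'  # made-up label for this sketch


    class Book(models.Model):
        name = models.CharField(max_length=50)
        tags = models.ManyToManyField(Tag, related_name='books')

        class Meta:
            app_label = 'demo'


    # The same "through" model is reachable from both sides of the relation
    through = Book.tags.through
    assert Tag.books.through is through

    # django names it "<Model>_<field>" and records the model that created it
    assert through.__name__ == 'Book_tags'
    assert through._meta.auto_created is Book

    # Its two foreign keys are named after the lowercased model names
    fk_names = {f.name for f in through._meta.get_fields() if f.many_to_one}
    assert fk_names == {'book', 'tag'}

For a manually declared "through" model, ``_meta.auto_created`` is ``False``, which is one way to
tell the two cases apart.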


Extensions
==========

Some extensions are currently being written. They will all be available as python packages
prefixed with ``dig-`` (d.i.g for Django Instances Graph).

What you'll soon be able to use:

- ``dig-duplicate``
  an extension of what is currently possible about duplicating a graph, but with references to
  parent objects, and merge capabilities between two duplicates
- ``dig-visualization``
  an extension to have a visual representation of a graph, using ``graphviz``


Installation
============

The ``django_instances_graph`` package is only available on the Magency private pypi server.

.. code-block:: sh

    pip install -i https://login:password@pypi.magency.ninja/some/index django_instances_graph


Development
===========


The code can be found on the `Magency mtp-back-modules repository
<https://gitlab.com/magency/products/mtp-back-modules/-/tree/master/django-instances-graph>`_.

Install the required packages:

.. code-block:: sh

    pip install -i https://login:password@pypi.magency.ninja/some/index -e .[dev]


To run tests, simply launch the ``runtests.sh`` script.

And for pylint:

.. code-block:: sh

    PYTHONPATH="$PYTHONPATH:." pylint django_instances_graph tests


When ready, update the version in ``setup.cfg`` then create the package:

.. code-block:: sh

    ./setup.py sdist bdist_wheel

You can now upload it to ``devpi``:

.. code-block:: sh

    devpi use https://login:password@pypi.magency.ninja
    devpi login yourlogin
    devpi use yourlogin/dev
    devpi upload dist/django_instances_graph-VERSION*


Support
=======

- python>=3.6
- django>=2.2
