This is an example of using a custom type with PyMongo. The example is somewhat contrived, but it shows how to use a SONManipulator to transform documents as they are saved to or retrieved from MongoDB. More specifically, it shows a couple of different mechanisms for working with custom datatypes in PyMongo.
We’ll start by getting a clean database to use for the example:
>>> from pymongo.mongo_client import MongoClient
>>> client = MongoClient()
>>> client.drop_database("custom_type_example")
>>> db = client.custom_type_example
Since the purpose of the example is to demonstrate working with custom types, we’ll need a custom datatype to use. Here we define the aptly named Custom class, which has a single method, x():
>>> class Custom(object):
...     def __init__(self, x):
...         self.__x = x
...
...     def x(self):
...         return self.__x
...
>>> foo = Custom(10)
>>> foo.x()
10
When we try to save an instance of Custom with PyMongo, we’ll get an InvalidDocument exception:
>>> db.test.insert({"custom": Custom(5)})
Traceback (most recent call last):
InvalidDocument: cannot convert value of type <class 'Custom'> to bson
One way to work around this is to convert our data into something PyMongo can save. To do so we define two functions, encode_custom() and decode_custom():
>>> def encode_custom(custom):
...     return {"_type": "custom", "x": custom.x()}
...
>>> def decode_custom(document):
...     assert document["_type"] == "custom"
...     return Custom(document["x"])
...
We can now manually encode and decode Custom instances and use them with PyMongo:
>>> db.test.insert({"custom": encode_custom(Custom(5))})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': {u'x': 5, u'_type': u'custom'}}
>>> decode_custom(db.test.find_one()["custom"])
<Custom object at ...>
>>> decode_custom(db.test.find_one()["custom"]).x()
5
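The two helpers are inverses of each other, and that can be checked without a running MongoDB server. The following is a minimal sketch that repeats the definitions above so that it runs on its own; round_trip() is a hypothetical helper introduced only for this illustration.

```python
class Custom(object):
    def __init__(self, x):
        self.__x = x

    def x(self):
        return self.__x


def encode_custom(custom):
    # Plain dict that can be stored; "_type" marks it for decoding later.
    return {"_type": "custom", "x": custom.x()}


def decode_custom(document):
    assert document["_type"] == "custom"
    return Custom(document["x"])


def round_trip(custom):
    # Hypothetical helper: encode, then decode, with no database involved.
    return decode_custom(encode_custom(custom))


restored = round_trip(Custom(5))
print(restored.x())
```

If the round trip ever returned a different value, the encoded document would be losing information before it reached the database.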
Needless to say, that was a little unwieldy. Let’s make this a bit more seamless by creating a new SONManipulator. SONManipulator instances allow you to specify transformations to be applied automatically by PyMongo:
>>> from pymongo.son_manipulator import SONManipulator
>>> class Transform(SONManipulator):
...     def transform_incoming(self, son, collection):
...         for (key, value) in son.items():
...             if isinstance(value, Custom):
...                 son[key] = encode_custom(value)
...             elif isinstance(value, dict):  # Make sure we recurse into sub-docs
...                 son[key] = self.transform_incoming(value, collection)
...         return son
...
...     def transform_outgoing(self, son, collection):
...         for (key, value) in son.items():
...             if isinstance(value, dict):
...                 if "_type" in value and value["_type"] == "custom":
...                     son[key] = decode_custom(value)
...                 else:  # Again, make sure to recurse into sub-docs
...                     son[key] = self.transform_outgoing(value, collection)
...         return son
...
Now we add our manipulator to the Database:
>>> db.add_son_manipulator(Transform())
After doing so we can save and restore Custom instances seamlessly:
>>> db.test.remove() # remove whatever has already been saved
{...}
>>> db.test.insert({"custom": Custom(5)})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': <Custom object at ...>}
>>> db.test.find_one()["custom"].x()
5
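The recursion into sub-documents can be exercised on plain dicts, without a database. The sketch below is a standalone stand-in rather than the SONManipulator subclass itself: encode_nested() and decode_nested() are hypothetical names that mirror transform_incoming() and transform_outgoing() but drop the collection argument.

```python
class Custom(object):
    def __init__(self, x):
        self.__x = x

    def x(self):
        return self.__x


def encode_custom(custom):
    return {"_type": "custom", "x": custom.x()}


def decode_custom(document):
    assert document["_type"] == "custom"
    return Custom(document["x"])


def encode_nested(son):
    # Hypothetical mirror of transform_incoming(), minus the collection.
    # list() takes a snapshot so values can be reassigned while looping.
    for key, value in list(son.items()):
        if isinstance(value, Custom):
            son[key] = encode_custom(value)
        elif isinstance(value, dict):
            son[key] = encode_nested(value)  # recurse into sub-documents
    return son


def decode_nested(son):
    # Hypothetical mirror of transform_outgoing(), minus the collection.
    for key, value in list(son.items()):
        if isinstance(value, dict):
            if value.get("_type") == "custom":
                son[key] = decode_custom(value)
            else:
                son[key] = decode_nested(value)
    return son


doc = {"name": "example", "inner": {"custom": Custom(5)}}
stored = encode_nested(doc)
print(stored["inner"]["custom"]["x"])
restored = decode_nested(stored)
print(restored["inner"]["custom"].x())
```

Like the methods they mirror, both functions modify the dict in place and return it, so stored and restored here refer to the same object as doc.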
Manipulators are attached to a particular Database instance, so getting a new instance gives us one without the SONManipulator we added:
>>> db = client.custom_type_example
This allows us to see what was actually saved to the database:
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': {u'x': 5, u'_type': u'custom'}}
This is the same format that our encode_custom() function produces.
We can take this one step further by encoding to binary, using a user-defined subtype. This allows us to identify what to decode without resorting to tricks like the _type field used above.
We’ll start by defining the functions to_binary() and from_binary(), which convert Custom instances to and from Binary instances:
Note
You could just pickle the instance and save that. What we do here is a little more lightweight.
>>> from bson.binary import Binary
>>> def to_binary(custom):
...     return Binary(str(custom.x()), 128)
...
>>> def from_binary(binary):
...     return Custom(int(binary))
...
Next we’ll create another SONManipulator, this time using the functions we just defined:
>>> class TransformToBinary(SONManipulator):
...     def transform_incoming(self, son, collection):
...         for (key, value) in son.items():
...             if isinstance(value, Custom):
...                 son[key] = to_binary(value)
...             elif isinstance(value, dict):
...                 son[key] = self.transform_incoming(value, collection)
...         return son
...
...     def transform_outgoing(self, son, collection):
...         for (key, value) in son.items():
...             if isinstance(value, Binary) and value.subtype == 128:
...                 son[key] = from_binary(value)
...             elif isinstance(value, dict):
...                 son[key] = self.transform_outgoing(value, collection)
...         return son
...
Now we’ll empty the test collection and add the new manipulator:
>>> db.test.remove()
{...}
>>> db.add_son_manipulator(TransformToBinary())
After doing so we can save and restore Custom instances seamlessly:
>>> db.test.insert({"custom": Custom(5)})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': <Custom object at ...>}
>>> db.test.find_one()["custom"].x()
5
We can see what’s actually being saved to the database (and verify that it is using a Binary instance) by clearing out the manipulators and repeating our find_one():
>>> db = client.custom_type_example
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': Binary('5', 128)}
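The subtype check can also be tried without PyMongo installed. The sketch below assumes a hypothetical TaggedBytes class as a stand-in for Binary: it only carries raw bytes plus a subtype attribute and is not part of bson. The names to_tagged(), from_tagged() and decode_by_subtype() are likewise made up for this illustration. The payload is encoded to bytes explicitly, on the assumption that the code runs on Python 3, where str and bytes are distinct types.

```python
USER_SUBTYPE = 128  # same user-defined subtype value as in the example above


class Custom(object):
    def __init__(self, x):
        self.__x = x

    def x(self):
        return self.__x


class TaggedBytes(bytes):
    """Hypothetical stand-in for Binary: raw bytes plus a subtype."""

    def __new__(cls, data, subtype):
        self = bytes.__new__(cls, data)
        self.subtype = subtype
        return self


def to_tagged(custom):
    # Encode the integer as ASCII text, then tag it with the subtype.
    return TaggedBytes(str(custom.x()).encode("ascii"), USER_SUBTYPE)


def from_tagged(tagged):
    return Custom(int(tagged.decode("ascii")))


def decode_by_subtype(son):
    # Mirrors transform_outgoing(): dispatch on the subtype, no "_type" key.
    for key, value in list(son.items()):
        if isinstance(value, TaggedBytes) and value.subtype == USER_SUBTYPE:
            son[key] = from_tagged(value)
        elif isinstance(value, dict):
            son[key] = decode_by_subtype(value)
    return son


doc = {"custom": to_tagged(Custom(5)), "other": TaggedBytes(b"raw", 0)}
decoded = decode_by_subtype(doc)
print(decoded["custom"].x())
print(decoded["other"].subtype)
```

Only the value tagged with subtype 128 is turned back into a Custom instance; binary data with any other subtype passes through untouched, which is the point of reserving a subtype for the custom type.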