CouchDB restoring deleted/updated documents and their data

We are using CouchDB in production and are happy with it. It is much more lightweight than MongoDB, yet powerful (for our needs at least). But sometimes you end up in a situation where some code has deleted or spoiled data in your CouchDB database. We had some bugs that led to indexes being deleted. However, compaction had not been run yet, so the old revisions were still there, and here is the solution.

There are several ways to do this, for different situations. I'll try to cover them all.
So for deleted CouchDB documents you need to:

1. Make sure the document with this id is really deleted.

To do that, request this document from CouchDB, where $db is your CouchDB database name and $id is your deleted document id.
It should return a response saying that the document has been deleted.
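As a minimal sketch of this check in Python (standard library only): a plain GET on the document is assumed to come back as HTTP 404 with a JSON body whose "reason" is "deleted" when the document has been deleted. The server address http://localhost:5984, the database name used in the examples and the helper names are my own placeholders, not part of any particular setup.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote

COUCH = "http://localhost:5984"  # assumed server address


def doc_url(db, doc_id, base=COUCH):
    """Build the URL of a document; db name and id are percent-encoded."""
    return f"{base}/{quote(db, safe='')}/{quote(doc_id, safe='')}"


def looks_deleted(status, body):
    """True if a GET answer looks like a 'document deleted' reply."""
    if status != 404:
        return False
    return json.loads(body).get("reason") == "deleted"


def is_deleted(db, doc_id, base=COUCH):
    """Ask a running CouchDB whether the document is deleted."""
    try:
        with urllib.request.urlopen(doc_url(db, doc_id, base)) as resp:
            return looks_deleted(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx answers; the JSON body is still readable.
        return looks_deleted(err.code, err.read().decode("utf-8"))
```

A document that never existed is expected to give a 404 as well, but with a different reason, which is why the sketch checks the "reason" field rather than the status code alone.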

2. Get all the revisions of the deleted document.

Request them from CouchDB for the same document, where $db is your CouchDB database name and $id is your deleted document id.
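One way to build such a request, as a hedged sketch: CouchDB's document GET accepts the query parameters revs=true (include the revision history) and open_revs=all (return every leaf revision, deleted ones included). The server address and the database name "mydb" below are placeholders of mine.

```python
from urllib.parse import quote, urlencode

COUCH = "http://localhost:5984"  # assumed server address


def revisions_url(db, doc_id, base=COUCH):
    """URL asking for every leaf revision together with its history."""
    query = urlencode({"revs": "true", "open_revs": "all"})
    return f"{base}/{quote(db, safe='')}/{quote(doc_id, safe='')}?{query}"


print(revisions_url("mydb", "my-couchdb-id"))
# http://localhost:5984/mydb/my-couchdb-id?revs=true&open_revs=all
```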

3. Parse the response.

CouchDB provides a response with revisions in a specially encoded format. So to parse this JSON response you need to know its syntax. It's fairly simple.
    Content-Type: application/json

    {
        "_id": "my-couchdb-id",
        "_rev": "6-65624dd5962e59ff09c47ba1be4f726c",
        "_deleted": true,
        "_revisions": {
            "start": 6,
            "ids": ["65624dd5962e59ff09c47ba1be4f726c", "826046dbd125b841e0dba657f65bbb78", "d1485b8d5cccc305f1dbccf65a07199d", "e1293576c567836c80cd973f36d345aa", "2bbc0c49496d3044c80b824be7e30193", "0aa0c2ce9796de25733b0cf46ee15129"]
        }
    }
You can get multiple JSON objects, like I did, because the document was deleted and undeleted several times.
Anyway, here we have a JSON object describing the current revision of the document ("_rev": "6-65624dd5962e59ff09c47ba1be4f726c"), which, as "_deleted": true shows, is the deletion itself. Let's try to recover the document.
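If the revisions request is sent with an "Accept: application/json" header, CouchDB is expected to answer with one JSON array instead of separate parts, each element wrapping a revision under an "ok" key. The sketch below parses that form; the function name is hypothetical and the sample is a shortened copy of the response from this post (only the first two ids are kept).

```python
import json


def leaf_documents(payload):
    """Return the documents from an open_revs=all JSON-array response."""
    items = json.loads(payload)
    # Entries look like {"ok": {...document...}}; revisions that were
    # asked for but not found come back as {"missing": "<rev>"} and are
    # skipped here.
    return [item["ok"] for item in items if "ok" in item]


# Shortened sample for illustration only.
sample = """[
  {"ok": {"_id": "my-couchdb-id",
          "_rev": "6-65624dd5962e59ff09c47ba1be4f726c",
          "_deleted": true,
          "_revisions": {"start": 6,
                         "ids": ["65624dd5962e59ff09c47ba1be4f726c",
                                 "826046dbd125b841e0dba657f65bbb78"]}}}
]"""

for doc in leaf_documents(sample):
    print(doc["_rev"], "deleted" if doc.get("_deleted") else "alive")
```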

4. Find out the required revision hash.

Note the "_revisions" key in the JSON above, containing a list of "ids". Note that the first one has the same hash as the current revision. Our target is the previous revision (the one before deletion). If the document we are trying to recover was just updated, not deleted, the idea is the same, though we might need a different entry from the list.
So in our example:
    "ids": ["65624dd5962e59ff09c47ba1be4f726c",
The first one is equal to the current revision, and the next one in the list (826046dbd125b841e0dba657f65bbb78) is the one before it.
Note also the "start" key ("start": 6). It indicates the latest revision counter, and the ids are listed from newest to oldest. So subtract 1 from the latest revision counter and join it to the second hash with "-". To recover the previous revision you would need to build a code like this: "5-826046dbd125b841e0dba657f65bbb78". I hope it is clear how I arrived at this.
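The same arithmetic as a small Python sketch. The helper names are made up for illustration; the input is the JSON from this post loaded as a dict.

```python
def revision_ids(doc):
    """Expand "_revisions" into full "N-hash" ids, newest first."""
    revs = doc["_revisions"]
    start = revs["start"]
    # The first hash belongs to counter `start`, the next to start - 1, ...
    return [f"{start - i}-{h}" for i, h in enumerate(revs["ids"])]


def previous_revision(doc):
    """The revision right before the current one, or None if there is none."""
    ids = revision_ids(doc)
    return ids[1] if len(ids) > 1 else None


doc = {
    "_id": "my-couchdb-id",
    "_rev": "6-65624dd5962e59ff09c47ba1be4f726c",
    "_deleted": True,
    "_revisions": {
        "start": 6,
        "ids": [
            "65624dd5962e59ff09c47ba1be4f726c",
            "826046dbd125b841e0dba657f65bbb78",
            "d1485b8d5cccc305f1dbccf65a07199d",
            "e1293576c567836c80cd973f36d345aa",
            "2bbc0c49496d3044c80b824be7e30193",
            "0aa0c2ce9796de25733b0cf46ee15129",
        ],
    },
}

print(previous_revision(doc))  # 5-826046dbd125b841e0dba657f65bbb78
```

revision_ids(doc) returns the whole history, so an older revision than the previous one can be picked from the same list.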

5. Retrieve the previous revision.

To do that, just ask CouchDB for this document again, this time specifying the revision.
Here $db and $id are the same database name and deleted document id as before, and $previous_revision is the constructed revision counter + revision hash separated by the "-" symbol: 5-826046dbd125b841e0dba657f65bbb78 in our case.
We will get the JSON of the previous document (before deletion), and we can put it back with a PUT/POST request.
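A hedged sketch of both halves of this step. The rev query parameter is part of CouchDB's document GET; the helper names, the server address and "mydb" are placeholders. For putting the document back I am assuming that a PUT without "_rev" is accepted when the current revision is a deletion, while a document that still exists (the "just updated" case) needs its current "_rev" in the body.

```python
from urllib.parse import quote, urlencode

COUCH = "http://localhost:5984"  # assumed server address


def revision_url(db, doc_id, rev, base=COUCH):
    """URL of one specific revision of a document."""
    query = urlencode({"rev": rev})
    return f"{base}/{quote(db, safe='')}/{quote(doc_id, safe='')}?{query}"


def body_for_restore(old_doc, current_rev=None):
    """Copy of an old revision, ready to be sent back with PUT.

    The stale "_rev" and the "_revisions" history are dropped. Pass
    current_rev when the document still exists (it was only updated),
    so that the update is made against the current revision.
    """
    body = {k: v for k, v in old_doc.items()
            if k not in ("_rev", "_revisions")}
    if current_rev is not None:
        body["_rev"] = current_rev
    return body


print(revision_url("mydb", "my-couchdb-id",
                   "5-826046dbd125b841e0dba657f65bbb78"))
```

The dict returned by body_for_restore would then be serialized with json.dumps and sent as the body of a PUT to the document's URL.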

It's also worth mentioning the case when you do not know which revision and/or document $id you want to recover, and just want to recover e.g. the last 10 deleted documents. Then you need to look at the database's changes feed.
It provides a list of document manipulations in the database. It's simple enough, but beyond the scope of this article.
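For orientation only, a small sketch of reading such a list. It assumes the /$db/_changes endpoint with its descending and limit parameters, and a response whose "results" entries carry "id", "changes" and, for deletions, "deleted": true. The helper names and the sample payload are invented for illustration.

```python
import json
from urllib.parse import quote, urlencode

COUCH = "http://localhost:5984"  # assumed server address


def changes_url(db, limit=None, base=COUCH):
    """URL of the changes feed, newest changes first."""
    params = {"descending": "true"}
    if limit is not None:
        params["limit"] = str(limit)
    return f"{base}/{quote(db, safe='')}/_changes?{urlencode(params)}"


def deleted_changes(payload):
    """(id, rev) pairs of the entries flagged as deleted."""
    results = json.loads(payload)["results"]
    return [(r["id"], r["changes"][0]["rev"])
            for r in results if r.get("deleted")]


# Invented sample payload, not real output.
sample = """{"results": [
  {"seq": 12, "id": "doc-a", "changes": [{"rev": "3-aaa"}], "deleted": true},
  {"seq": 11, "id": "doc-b", "changes": [{"rev": "2-bbb"}]}
], "last_seq": 12}"""

print(changes_url("mydb", limit=10))
print(deleted_changes(sample))
```

Note that limit counts all changes, not only deletions, so getting 10 deleted documents may take a larger limit plus the filtering shown here.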

That's basically it. Questions/comments?

