replication - Is it possible to prevent fetching of a remote design document in CouchDB?


Update

As @akshatjiwansharma suggested, I have tried a few things while replicating locally. It was instructive! I have renamed the question, since the problem is not that the design document gets replicated (in fact it isn't replicated) but that it is fetched via HTTP as part of the initial replication "negotiation" phase.

I've moved the original question to the bottom to make the new question clearer. The new question is:

It seems inefficient (particularly in the case of CouchApps) to fetch the entire design document, i.e. the entire remote app, when initiating replication from a remote source. Can this be avoided?

It is particularly problematic in our case, on high-latency links (less than 7.2 kbps) with relatively large design documents (3 MB).

Remote target

I first tried using a "remote" target by setting the replication target to http://127.0.0.1:5984/emr_replica.

[fri, 08 aug 2014 08:36:20 gmt] [info] [<0.18947.7>] document `88fa1b1a1315d27ded663466c6003578` triggered replication `e8e66a554d198b88b6263a572a072fd3+continuous`
[fri, 08 aug 2014 08:36:20 gmt] [info] [<0.18946.7>] starting new replication `e8e66a554d198b88b6263a572a072fd3+continuous` @ <0.18947.7> (`emr_demo` -> `http://127.0.0.1:5984/emr_replica/`)
[fri, 08 aug 2014 08:36:20 gmt] [info] [<0.18928.7>] 127.0.0.1 - - post /emr_replica/_revs_diff 200
[fri, 08 aug 2014 08:36:20 gmt] [info] [<0.18915.7>] y.y.y.y - - /_utils/_sidebar.html 200
[fri, 08 aug 2014 08:36:20 gmt] [info] [<0.18916.7>] y.y.y.y - - /_replicator/88fa1b1a1315d27ded663466c6003578?revs_info=true 200

In this case the design document doesn't seem to be fetched.

Remote source

Then I set the source to be "remote", like this:

{
    "_id": "88fa1b1a1315d27ded663466c6003a4a",
    "_rev": "3-b6408e98acafe729da0153c35d9df113",
    "source": "http://127.0.0.1:5984/emr_demo",
    "target": "emr_replica",
    "continuous": true,
    "filter": "emr/user_data",
    "owner": "jun"
}

Then the server fetches the remote design document before starting replication (GET /emr_demo/_design/emr 200).

[fri, 08 aug 2014 08:42:17 gmt] [info] [<0.19687.7>] document `88fa1b1a1315d27ded663466c6003a4a` triggered replication `bd8f6288970bca974dba36dbc6e5353b+continuous`
[fri, 08 aug 2014 08:42:17 gmt] [info] [<0.19686.7>] starting new replication `bd8f6288970bca974dba36dbc6e5353b+continuous` @ <0.19687.7> (`http://127.0.0.1:5984/emr_demo/` -> `emr_replica`)
[fri, 08 aug 2014 08:42:17 gmt] [info] [<0.19648.7>] 127.0.0.1 - - head /emr_demo/ 200
[fri, 08 aug 2014 08:42:17 gmt] [info] [<0.19648.7>] 127.0.0.1 - - /emr_demo/_design/emr 200
[fri, 08 aug 2014 08:42:18 gmt] [info] [<0.19656.7>] 127.0.0.1 - - /emr_demo/5cc2db69a32a84091b96c244273fda0e?revs=true&open_revs=%5b%221-ef8967557f2e99eb137f963daccddb3f%22%5d&latest=true 200

Further testing shows that the design document is fetched only once. Further replications (including after restarting the server) only fetch the changes, with the appropriate filter:

[fri, 08 aug 2014 09:06:36 gmt] [info] [<0.520.0>] document `88fa1b1a1315d27ded663466c6003a4a` triggered replication `bd8f6288970bca974dba36dbc6e5353b+continuous`
[fri, 08 aug 2014 09:06:36 gmt] [info] [<0.519.0>] starting new replication `bd8f6288970bca974dba36dbc6e5353b+continuous` @ <0.520.0> (`http://127.0.0.1:5984/emr_demo/` -> `emr_replica`)
[fri, 08 aug 2014 09:06:36 gmt] [info] [<0.335.0>] 127.0.0.1 - - /emr_demo/_changes?filter=emr%2fuser_data&feed=continuous&style=all_docs&since=1607&heartbeat=1666 200
[fri, 08 aug 2014 09:06:36 gmt] [info] [<0.334.0>] 127.0.0.1 - - /emr_demo/5cc2db69a32a84091b96c24427560310?atts_since=%5b%2218-b613d3160bd09c45ac07a5485c9c7bce%22%5d&revs=true&open_revs=%5b%2219-d50438143337a3a0af5ed8ceb75b42f5%22%5d&latest=true 200

Former question

We're trying to use CouchDB replication on a high-latency link (slow, frequent disconnections, ...). We want to avoid replicating the design document, which is heavy. We have a filter in place, and when using the following curl command, the design document doesn't appear, as expected:

curl http://x.x.x.x:5984/emr/_changes?filter=emr/user_data 

Our replication document is:

{
    "_id": "e0e38be8cc0b11356dfb03bc8400074d",
    "_rev": "1-d77117f03d63099e1e505b9f9de3371d",
    "source": "http://x.x.x.x:5984/emr",
    "target": "emr",
    "continuous": true,
    "filter": "emr/user_data",
    "create_target": true,
    "owner": "jun"
}

We have deactivated authentication while we're debugging. When using an existing database and removing create_target, the same problem occurs.

The source server outputs the following:

[mon, 10 mar 2014 21:22:03 gmt] [info] [<0.135.0>] retrying head request http://x.x.x.x:5984/emr/ in 0.25 seconds due error {conn_failed,{error,etimedout}}
[mon, 10 mar 2014 21:23:47 gmt] [info] [<0.135.0>] retrying request http://x.x.x.x:5984/emr/_design/emr in 0.25 seconds due error req_timedout
[mon, 10 mar 2014 21:24:14 gmt] [error] [<0.135.0>] replicator, request "http://x.x.x.x:5984/emr/_design/emr" failed due error {error,req_timedout}
[mon, 10 mar 2014 21:24:14 gmt] [error] [<0.135.0>] replication manager, error processing document `e0e38be8cc0b11356dfb03bc8400074d`: couldn't open document `_design/emr` source database `http://x.x.x.x:5984/emr/`: {'exit',{http_request_failed,"get","http://x.x.x.x:5984/emr/_design/emr",
                          {error,{error,req_timedout}}}}

When using tcpdump, it's clear that replication fails because the replication manager attempts to download the heavy design document (http://x.x.x.x:5984/emr/_design/emr).

FYI, the replicator's configuration is:

replicator
    connection_timeout          5000
    db                          _replicator
    http_connections            1
    max_replication_retry_count 3
    retries_per_request         1
    socket_options              [{keepalive, true}, {nodelay, true}]
    ssl_certificate_max_depth   3
    verify_ssl_certificates     false
    worker_batch_size           1
    worker_processes            1
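To put the figures quoted above in perspective, here is a rough back-of-the-envelope sketch of how long a single fetch of the design document would take. It assumes that 3 MB means 3,000,000 bytes, that 7.2 kbps means 7,200 bits per second, and that connection_timeout is expressed in milliseconds. It ignores protocol overhead and retries, and the function name is made up for illustration.

```javascript
// Rough estimate only; the sizes and units below are assumptions.
function transferSeconds(sizeBytes, linkBitsPerSecond) {
  return (sizeBytes * 8) / linkBitsPerSecond;
}

const designDocBytes = 3 * 1000 * 1000; // "3 MB", assumed decimal megabytes
const linkBps = 7200;                   // "7.2 kbps", the best case on our link
const timeoutSeconds = 5000 / 1000;     // connection_timeout, assumed milliseconds

const seconds = transferSeconds(designDocBytes, linkBps);
console.log(Math.round(seconds / 60) + " minutes to fetch the design document");
console.log("configured timeout: " + timeoutSeconds + " seconds");
```

Under these assumptions the fetch needs close to an hour even at the full link speed, which is far more than the 5 seconds configured here.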

Edit: the user_data function (which correctly hides the design document when run through curl as above) is:

exports.user_data = function(doc, req) {
    if (doc.collection == "visits" || doc.collection == "patients" || doc.collection == "reports") {
        return true;
    }
    return false;
}
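As a quick sanity check, the filter can be exercised outside CouchDB with plain objects. This is only a sketch: the sample documents are made up, and the function is held in a local variable instead of exports so that it runs standalone in Node.js.

```javascript
// Same logic as the user_data filter, held in a local variable.
const user_data = function (doc, req) {
  if (doc.collection == "visits" || doc.collection == "patients" || doc.collection == "reports") {
    return true;
  }
  return false;
};

// Hypothetical sample documents, for illustration only.
const samples = [
  { _id: "visit-1", collection: "visits" },
  { _id: "note-1", collection: "notes" },
  { _id: "_design/emr", filters: {} }
];

// Collect the ids of the documents that the filter lets through.
const passed = samples
  .filter(function (doc) { return user_data(doc, {}); })
  .map(function (doc) { return doc._id; });

console.log(passed); // only "visit-1" passes the filter
```

The design document has no collection field, so the filter rejects it, which matches what the _changes feed shows.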

Hope you can help!

Suggestion

Try defining the filter function in another small, dedicated design document and see if that fixes the problem.

// replicator document:
{
    "_id": "e0e38be8cc0b11356dfb03bc8400074d",
    "_rev": "1-d77117f03d63099e1e505b9f9de3371d",
    "source": "http://x.x.x.x:5984/emr",
    "target": "emr",
    "continuous": true,
    "filter": "small-design-doc/user_data",
    "create_target": true,
    "owner": "jun"
}

// _design/small-design-doc
// -- replicated, quite small:
{
    "_id": "_design/small-design-doc",
    "_rev": "1-...",
    "filters": {
        "user_data": "function(doc, req) { ... }"
    }
}
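One way to produce such a document is to keep the filter as a real function and serialize it when building the design document, since CouchDB stores filter functions as strings. The following is a minimal sketch that assumes the same filter logic as in the question. The variable names are illustrative, and the _id and _rev of the replication document are left out because they are specific to your setup.

```javascript
// Filter logic as in the question, written as a plain function.
const userData = function (doc, req) {
  return doc.collection == "visits" ||
         doc.collection == "patients" ||
         doc.collection == "reports";
};

// Small, dedicated design document holding only the filter.
const smallDesignDoc = {
  _id: "_design/small-design-doc",
  filters: {
    user_data: userData.toString() // stored as source text
  }
};

// Replication document pointing at the filter in the small design document.
const replicationDoc = {
  source: "http://x.x.x.x:5984/emr",
  target: "emr",
  continuous: true,
  filter: "small-design-doc/user_data"
};

console.log(JSON.stringify(smallDesignDoc, null, 2));
console.log(JSON.stringify(replicationDoc, null, 2));
```

The printed JSON could then be saved to the source database and to _replicator respectively, for example with curl.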

Explanation

According to the current snapshot of the source code, it seems the replicator is trying to fetch the design document (_design/emr) from the source database because the filter function is defined there (emr/user_data).
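To illustrate the point, the mapping from a filter name to the document that gets fetched can be sketched as follows. This is not the replicator's actual code; the function name is hypothetical and the logic only approximates the behaviour seen in the logs.

```javascript
// Hypothetical helper: derive the design document URL from a filter name
// of the form "designdoc/filtername".
function designDocUrlForFilter(sourceUrl, filterName) {
  const designDocName = filterName.split("/")[0];
  const base = sourceUrl.replace(/\/+$/, ""); // drop any trailing slash
  return base + "/_design/" + designDocName;
}

console.log(designDocUrlForFilter("http://x.x.x.x:5984/emr", "emr/user_data"));
console.log(designDocUrlForFilter("http://x.x.x.x:5984/emr", "small-design-doc/user_data"));
```

The first call yields the URL that times out in the question's logs; the second points at the small design document instead.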

If you specify the filter function in a different design document, the replicator should try to download that document instead before executing the replication. You cannot quite circumvent downloading any design document, but you are able to select which one.

Great question, by the way, and thoroughly investigated!

