Discussion:
Will slapadd work with delta-syncrepl?
Khoa Nguyen
2010-03-25 01:31:07 UTC
I have set up delta-syncrepl between provider A and consumer B, and it seems
to work OK. Online updates to A are synced to B. However, about once a
month there is a large update containing hundreds of millions of records.
An online update would take days. I tried taking A offline, running
slapadd, and bringing A back online, but the new entries were not synced
to B. Is there a way I can make this work?

Thanks,
Khoa
Buchan Milne
2010-03-25 10:43:17 UTC
Post by Khoa Nguyen
I have set up delta-syncrepl between provider A and consumer B, and it seems
to work OK. Online updates to A are synced to B. However, about once a
month there is a large update containing hundreds of millions of records.
An online update would take days. I tried taking A offline, running
slapadd, and bringing A back online, but the new entries were not synced
to B. Is there a way I can make this work?
Depending on what you've done to the data before slapadd'ing it, you could
have messed up replication.

If you are going to do this, you should:

1) slapadd on A
2) slapcat on A
3) slapadd the result of (2) on B
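
A minimal sketch of these three steps, written as a Python helper that only
builds the command lines rather than running them. The suffix, file names and
config path are hypothetical placeholders, not values from this thread; the
-f, -b and -l options are the usual slapadd/slapcat options for the config
file, database suffix and LDIF file.

```python
import shlex


def bulk_load_plan(suffix, new_ldif, dump_ldif, conf="/etc/openldap/slapd.conf"):
    """Return (host, argv) pairs for the three-step offline load.

    Assumption: slapd is stopped on a host while slapadd runs there.
    """
    return [
        # 1) load the monthly batch into the provider's database
        ("A", ["slapadd", "-f", conf, "-b", suffix, "-l", new_ldif]),
        # 2) dump the provider's full database to LDIF
        ("A", ["slapcat", "-f", conf, "-b", suffix, "-l", dump_ldif]),
        # 3) load that dump on the consumer (copy dump_ldif to B first;
        #    B's old database would normally need to be cleared, since the
        #    dump contains entries B already holds)
        ("B", ["slapadd", "-f", conf, "-b", suffix, "-l", dump_ldif]),
    ]


if __name__ == "__main__":
    # Hypothetical suffix and file names, for illustration only.
    for host, argv in bulk_load_plan("dc=example,dc=com", "monthly.ldif", "full.ldif"):
        print(f"[{host}] {shlex.join(argv)}")
```

Printing the plan first makes it easy to review the commands before running
anything against a real database.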

But, why you need to bulk-load hundreds of millions of entries is another
question ...

Regards,
Buchan
Howard Chu
2010-03-26 19:48:46 UTC
Post by Khoa Nguyen
I have set up delta-syncrepl between provider A and consumer B, and it seems
to work OK. Online updates to A are synced to B. However, about once a
month there is a large update containing hundreds of millions of records.
An online update would take days. I tried taking A offline, running
slapadd, and bringing A back online, but the new entries were not synced
to B. Is there a way I can make this work?
Delta-syncrepl works by writing a log of all your main database changes into a
log database. When you add entries using slapadd, nothing is added to the log
database, so delta-sync cannot replicate those changes.

You can force a resync by emptying the log database. When a delta-sync
consumer tries to connect and the log no longer contains a record of the
consumer's last change, it will automatically fall back to regular syncrepl to
resync.
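
The behaviour described above can be illustrated with a toy model. This is
only a sketch: the class and method names are invented for illustration, a
plain integer stands in for the real change identifiers, and nothing here
talks to slapd.

```python
class Provider:
    """Toy provider: a main database plus a change log."""

    def __init__(self):
        self.entries = {}   # main database: dn -> attributes
        self.log = []       # logged changes: (seq, dn, attributes)
        self.seq = 0        # sequence number of the latest logged change

    def online_add(self, dn, attrs):
        # An online write goes to the main database and to the log.
        self.seq += 1
        self.entries[dn] = attrs
        self.log.append((self.seq, dn, attrs))

    def offline_add(self, dn, attrs):
        # slapadd-style load: main database only, nothing is logged.
        self.entries[dn] = attrs

    def empty_log(self):
        self.log.clear()


class Consumer:
    def __init__(self):
        self.entries = {}
        self.last_seq = 0   # last change this consumer has applied

    def sync(self, provider):
        """Replay the log if it still records our last change, else resync fully."""
        if any(seq == self.last_seq for seq, _, _ in provider.log):
            for seq, dn, attrs in provider.log:
                if seq > self.last_seq:
                    self.entries[dn] = attrs
                    self.last_seq = seq
            return "delta"
        # Fallback: copy the whole main database (regular syncrepl stand-in).
        self.entries = dict(provider.entries)
        self.last_seq = provider.seq
        return "full"


def demo():
    a, b = Provider(), Consumer()
    a.online_add("cn=one", {"cn": "one"})
    modes = [b.sync(a)]                       # initial sync is a full one
    a.online_add("cn=two", {"cn": "two"})
    modes.append(b.sync(a))                   # replayed from the log
    a.offline_add("cn=bulk", {"cn": "bulk"})  # bypasses the log
    modes.append(b.sync(a))                   # log has nothing new to replay
    missed = "cn=bulk" not in b.entries
    a.empty_log()
    modes.append(b.sync(a))                   # last change not in log: full resync
    return modes, missed, "cn=bulk" in b.entries
```

In this model the offline-loaded entry only reaches the consumer after the
log is emptied and the consumer falls back to a full resync.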

Note that since you're talking about new entries, which need to be replicated
in whole anyway, delta-syncrepl offers no benefit over regular syncrepl here.

Also, as Buchan pointed out, replicating hundreds of millions of records will
take a long time. You're better off just slapadding on both the provider and
the consumer.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Khoa Nguyen
2010-03-27 00:43:08 UTC
Thank you all for your insights, which make me think slapadd may not be a
good companion for delta-syncrepl. So I plan to do all the updates online,
since I managed to get close to 3000 updates per second on my single-disk
server.

Now, my colleague doesn't agree with me on the delta-syncrepl approach and
prefers to update A and B independently. His argument is that with
delta-syncrepl, B depends on A, so if A's databases (main + log) are
corrupted and we have to restore A to a previous checkpoint, B would
automatically roll back and we would lose the latest data. I still prefer
the delta-syncrepl approach, since if they are updated independently, A and B
can drift out of sync over time and we wouldn't know it.

I also looked at other replication modes (mirror, n-way master, etc.), but
since we only have 2 servers to work with, and our OpenLDAP version is still
2.3, our choices are limited.

Your advice and suggestions on the best approach would be appreciated.

Khoa
Quanah Gibson-Mount
2010-03-27 15:55:42 UTC
--On Friday, March 26, 2010 8:43 PM -0400 Khoa Nguyen
Post by Khoa Nguyen
Now, my colleague doesn't agree with me on the delta-syncrepl approach and
prefers to update A and B independently. His argument is that with
delta-syncrepl, B depends on A, so if A's databases (main + log) are
corrupted and we have to restore A to a previous checkpoint, B would
automatically roll back and we would lose the latest data. I still prefer
the delta-syncrepl approach, since if they are updated independently, A and B
can drift out of sync over time and we wouldn't know it.
If A becomes corrupted, you slapcat B and reload A with that data.

--Quanah

--

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
