File-Rsync-Mirror-Recent
The algorithm we use to seed the next file needs a lot more
robustness than it currently has. Something to do with looking at the
merged element of the next rf and when it has dropped off, we seed
immediately. And if it remains dropped off, we seed again, of course.
Nope, looking from smaller to larger RFs we look at the merged element
of this RF and at the minmax/max element of the next RF. If
$rf[next]->{minmax}{max} >= $rf[this]->{merged}{epoch}, then we can stop
seeding it.
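The stop condition above can be sketched as follows. This is a minimal illustration in Python, not the module's Perl code: the function name can_stop_seeding and the dict layout are assumptions made up for the example, and plain floats stand in for whatever epoch comparison the real code uses.

```python
def can_stop_seeding(this_rf, next_rf):
    """Return True once the next RF's minmax/max has reached this RF's merged epoch."""
    merged_epoch = this_rf.get("merged", {}).get("epoch")
    next_max = next_rf.get("minmax", {}).get("max")
    if merged_epoch is None or next_max is None:
        # Without both values we cannot show that the next RF has caught up,
        # so the conservative answer is to keep seeding.
        return False
    return float(next_max) >= float(merged_epoch)


# Hypothetical data: RFs ordered from smaller to larger interval.
rfs = [
    {"interval": "1h", "merged": {"epoch": "1223500000.5"}},
    {"interval": "6h", "minmax": {"max": "1223500000.5"}},
]
print(can_stop_seeding(rfs[0], rfs[1]))
```

Treating a missing merged or minmax element as "keep seeding" is a design choice of this sketch, picked so that an incomplete RF never ends seeding early.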
And we need a public accessor: seed and unseed, or seeded. But not the mix
of public and private stuff that is then used behind the back.
And then the secondary* stuff must go.
And we must understand what the impact is on the DONE system. Can it go
unnoticed that there was a hole? And could the DONE system have decided
the hole is covered? This should be testable with three directories where
the middle stops working for a while. Done->merge is suspicious, we must
stop it from merging non-conflatable neighbors due to broken continuity.
FIXED
2008-10-10 Andreas J. Koenig <andreas.koenig.7os6VVqR@franz.ak.mind.de>
* Slaven suggests to have the current epoch or the whole current
recentfile available from the HTTP server and take it away with
keepalive. This direction takes the granularity down to subseconds.
We might want to rewrite everything to factor out transport and allow
the whole thing to run via HTTP.
2008-10-09 Andreas J. Koenig <andreas.koenig.7os6VVqR@franz.ak.mind.de>
* smoker on k81 fetching from k75 to verify cascading works. See
2008-07-17 in upgradexxx and rsync-over-recentfile-3.pl.
* maybe the loop should wait for CHECKSUMS file after every upload. And
CPAN.pm needs to deal with timestamps in the future.
* do not forget the dirtymark!
Text: have a new flag on recentfiles with the meaning: if this
changes, you're required to run a full rsync over all the files. The
reason why we set it would probably be that something foul happened: we
injected files in arbitrary places or didn't inject them although they changed.
The content of the flag? Timestamp? The relation between the
recentfiles would have to be inheritance from the principal, because any
out of band changes would soon propagate to the next recentfile.
By upping the flag often one can easily ruin the slaves.
last out of band change? dirtymark?
Anyway, this implies that we read a potentially existing recentfile
before we write one.
And it implies that we have an eventloop that keeps us busy in 2-3
cycles, one for current stuff (tight loop) and one for the recentfiles
(cascade when principal has changed), one for the old stuff after a
dirtymark change.
And it implies that the out-of-band change in any of the recentfiles
must have a lock on the principal file, and that is the place to set the
dirtymark.
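The dirtymark idea above can be sketched as follows. This is a minimal illustration in Python, not the module's Perl code: the names inherit_dirtymark and needs_full_rsync, the dict layout, and the use of a timestamp string as the flag's content are all assumptions, since the note leaves the content of the flag open.

```python
def inherit_dirtymark(principal, others):
    """Copy the principal's dirtymark to every other recentfile."""
    for rf in others:
        rf["dirtymark"] = principal["dirtymark"]


def needs_full_rsync(remembered_mark, recentfile):
    """A slave must run a full rsync when the mark differs from the last one it saw."""
    current = recentfile.get("dirtymark")
    # A recentfile without any dirtymark gives no reason for a full rsync.
    return current is not None and current != remembered_mark


# Hypothetical data: the principal carries a new mark after an out-of-band change.
principal = {"interval": "1h", "dirtymark": "1223596800"}
others = [{"interval": "6h"}, {"interval": "1d"}]
inherit_dirtymark(principal, others)

print(needs_full_rsync("1223510400", others[0]))  # slave remembers an older mark
print(needs_full_rsync("1223596800", others[1]))  # slave already saw this mark
```

Because the slave compares against the mark it last saw, it has to read a potentially existing recentfile before writing a new one, as the note says.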
* start a FAQ, especially quick start guide questions. Also to aid those
problematic areas where we have no good solution, like the "links"
option to rsync.
* wish feedback when we are slow.
* reduce McCabe complexity
* Remove a few DEBUG statements.
* The multiple-rrr way of doing things needs a new option to rmirror,
like piecemeal or so. Not urgent because after the first pass through,
things run smoothly. It's only ugly during the first pass.
* I have the suspicion that the code is broken that decides if the
neighboring RF needs to be seeded. I fear when too much time has passed
between two calls (in our case more than one hour), it would not seed
the neighbor. Of course this will never be noticed, so we need a good
test for it.
* local/localroot confusion: I currently pass both options but one
should be enough.
* accounts for early birds on PAUSE rsync daemon.
* hardcoded 20 seconds
* who mirrors the index? DOING now.
* which CPAN mirrors offer rsync?
* visit all XXX, visit all _float places
* rename the pathdb stuff, it's too confusing. No idea how.
* rrr-inotify, backpan, rrr-register
2008-10-08 Andreas J. Koenig <andreas.koenig.7os6VVqR@franz.ak.mind.de>
* current bugs: the pathdb seems to get no reset, the seeding of the
secondaryttl stuff seems not to have an effect. Have helped myself with
a rand(10), need to fix this back. So not checked in. Does the rand
thing even help?
The rand thing helps. The secondaryttl stuff was in the wrong line,
fixed now.
The pathdb stuff was because I called either _pathdb or __pathdb on the
wrong object. FIXED now.
* It's not so beautiful if we never fetch the recentfiles that are not
the principal, even if this is correct behaviour. We really do not need
them after we have fetched the whole content.