[commit: http://hg.dwscoalition.org/dw-free/rev/3c01c3ea5414]
Switch from syncitems to multiple getevents in entry imports
We were having some issues with syncitems failing on large accounts. Also,
honestly, it's really a finicky system and hard to use correctly. I decided
to rip it out and replace it with a more straightforward approach.
Now, we use the selecttype of 'one' and the itemid of -1 to fetch the most
recent entry. Since itemids will be monotonically increasing, this gives us
an upper bound on the range. Since we also know that every account starts at
itemid 1, that gives us a lower bound.
From there it is a simple matter to start requesting entries in groups of
100. This is far more efficient -- for both us and the remote side -- than
using syncitems.
(We can do this because we never go back and edit posts. The strength of
the syncitems mode is that it supports downloading edited versions of the
post.)
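As a rough illustration of the approach described above, here is a minimal sketch. It is written in Python for readability, not the Perl of the actual importer, and is not the code from this patch. The names `itemid_batches`, `import_entries`, and `call_getevents` are hypothetical, and requesting each group through a selecttype of 'multiple' with a comma-separated `itemids` field is an assumption about how the batches are fetched.

```python
from typing import Callable, Dict, Iterator, List

BATCH_SIZE = 100  # group size named in the commit message


def itemid_batches(max_itemid: int, batch_size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield consecutive itemid lists covering 1..max_itemid inclusive."""
    for start in range(1, max_itemid + 1, batch_size):
        end = min(start + batch_size - 1, max_itemid)
        yield list(range(start, end + 1))


def import_entries(call_getevents: Callable[[Dict], List[Dict]]) -> List[Dict]:
    """Fetch every entry by walking the itemid range in fixed-size groups.

    `call_getevents` is a hypothetical stand-in for the remote getevents
    call: it takes a request dict and returns a list of entry dicts.
    """
    # Upper bound: selecttype 'one' with itemid -1 returns the newest entry.
    newest = call_getevents({"selecttype": "one", "itemid": -1})
    if not newest:
        return []  # empty account, nothing to import
    max_itemid = int(newest[0]["itemid"])

    entries: List[Dict] = []
    # Lower bound is itemid 1; ids of deleted entries simply return nothing.
    for batch in itemid_batches(max_itemid):
        request = {
            "selecttype": "multiple",  # assumption, see lead-in
            "itemids": ",".join(str(i) for i in batch),
        }
        entries.extend(call_getevents(request))
    return entries
```

With this layout, an account whose newest itemid is 250 needs one call for the upper bound plus three batch calls (1-100, 101-200, 201-250), regardless of how many of those ids still exist.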
Patch by mark.
Files modified:
- cgi-bin/DW/Worker/ContentImporter.pm
- cgi-bin/DW/Worker/ContentImporter/LiveJournal/Entries.pm
- cgi-bin/DW/Worker/ContentImporter/Local/Entries.pm