Bulk Activities Export Recency

Anonymous
Not applicable

Hi - Is there a time cutoff between when the job is created/enqueued and the activities returned for the bulk activities export? I think a simple example will better explain my question.

Suppose I run the two bulk activity export jobs below with the corresponding time parameters. Should I expect Job 1 and Job 2 to return exactly the same results?

Job 1:

Time I launch (i.e. /create and /enqueue) the job: '2018-03-22T00:00:01+00:00'

startAt='2018-03-21T00:00:00+00:00'

endAt='2018-03-22T00:00:00+00:00'

Job 2:

Time I launch (i.e. /create and /enqueue) the job, one day later: '2018-03-23T00:00:00+00:00'

startAt='2018-03-21T00:00:00+00:00'

endAt='2018-03-22T00:00:00+00:00'
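
To make the comparison concrete, here is a minimal sketch of the request body each job would send. The body shape (a `createdAt` filter with `startAt`/`endAt`) is an assumption based on the bulk activity export's create call and should be checked against the current API reference; `build_export_request` is a hypothetical helper, not part of any SDK. The point is that the launch time is not part of the request, so the two jobs differ only in when they run.

```python
import json


def build_export_request(start_at: str, end_at: str) -> dict:
    """Build the (assumed) body for a bulk activity export create call."""
    return {
        "format": "CSV",
        "filter": {
            # Assumed filter shape: activities created inside this window.
            "createdAt": {"startAt": start_at, "endAt": end_at},
        },
    }


# Job 1 and Job 2 use the same window; only the launch time differs,
# and the launch time does not appear anywhere in the request body.
job1 = build_export_request("2018-03-21T00:00:00+00:00", "2018-03-22T00:00:00+00:00")
job2 = build_export_request("2018-03-21T00:00:00+00:00", "2018-03-22T00:00:00+00:00")

print(json.dumps(job1, indent=2))
print("identical requests:", job1 == job2)
```

Identical requests do not by themselves imply identical results; that depends on whether the activity log for the window is still being written to when each job runs.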

10 REPLIES
SanfordWhiteman
Level 10 - Community Moderator

Re: Bulk Activities Export Recency

Because of the asynchronous nature of certain processes, I don't believe you can guarantee that all writes will be complete in the first example.

However, it will be an accurate representation of the way the ActLog looked at the point of execution, which is another source of truth (i.e. only those activities would have triggered SCs in the timeframe).

Anonymous
Not applicable

Re: Bulk Activities Export Recency

Hi Sanford Whiteman - Many thanks for your response. Much appreciated! Makes sense; is there any documentation outlining this with more solid timeframes of when we can expect activity data to be final without any further writes to the activity log? Guaranteed/Finalised with no new activities after x hours/days?

Instead of pulling 1 day of activities at a time with the startAt and endAt given above, I can pull a larger window (the last 3 days, for example) to avoid missing activities if there is a small delay in them being written to the log. However, the daily bulk export quota may make it a bit difficult to pull larger timeframes.

SanfordWhiteman
Level 10 - Community Moderator

Re: Bulk Activities Export Recency

is there any documentation outlining this with more solid timeframes of when we can expect activity data to be final without any further writes to the activity log? Guaranteed/Finalised with no new activities after x hours/days?

You can't ever get this with an eventually consistent (asynchronous writes) system. You can hope for no more than an hour, but there could be outliers; it's just the way it is. Usually you try to set an SLA for committed writes -- but when you break that internal SLA, nobody knows!

Anonymous
Not applicable

Re: Bulk Activities Export Recency

Ok - I would expect some ballpark SLA to be available in the docs that API users could work with. For now I will increase the timeframes to allow for a 24-hour buffer period. I will run some tests to see the difference in the number of results returned.
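
The buffer approach can be sketched roughly as follows: widen each daily window by 24 hours so that consecutive pulls overlap, then keep only rows not seen in an earlier pull. This is an illustration under assumptions, not a tested integration. `buffered_window` and `merge_new` are hypothetical helpers, and using `marketoGUID` as the unique row key is an assumption about the export columns that should be verified against an actual file.

```python
from datetime import datetime, timedelta, timezone


def buffered_window(end_at, span=timedelta(days=1), buffer=timedelta(hours=24)):
    """Return (startAt, endAt) strings for a window widened by `buffer`."""
    start_at = end_at - span - buffer
    return start_at.isoformat(), end_at.isoformat()


def merge_new(seen, rows, key="marketoGUID"):
    """Record rows in `seen` by key and return only those not seen before."""
    fresh = []
    for row in rows:
        if row[key] not in seen:
            seen[row[key]] = row
            fresh.append(row)
    return fresh


start, end = buffered_window(datetime(2018, 3, 22, tzinfo=timezone.utc))
print(start, end)

# Two overlapping pulls (made-up rows): the second one re-reads an old row
# and also picks up a row that was written late.
seen = {}
first = merge_new(seen, [{"marketoGUID": "1"}, {"marketoGUID": "2"}])
second = merge_new(seen, [{"marketoGUID": "2"}, {"marketoGUID": "3"}])
print(len(first), "rows in first pull,", len(second), "late row(s) in second pull")
```

Counting what `merge_new` returns on the second pull gives a direct measure of how many activities arrived after the first job ran, at the cost of exporting each day twice against the daily quota.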

SanfordWhiteman
Level 10 - Community Moderator

Re: Bulk Activities Export Recency

There's nothing public about this area.

Kurt_Koller
Level 4

Re: Bulk Activities Export Recency

I've asked about the whole asynchronous thing and was told that it isn't an issue, and that the only changes would come from people moving in and out of a partition and anonymous leads being converted to non-anonymous. You won't see anonymous activity in bulk, but if a lead is later converted to known and you pull the same bulk data again, you would see their activity.

SanfordWhiteman
Level 10 - Community Moderator

Re: Bulk Activities Export Recency

It's not true, though. You can demonstrate that the read-after-write results for a form post at 12:00:00 are not the same as the Activity Log if fetched at exactly the same time.

Anonymous
Not applicable

Re: Bulk Activities Export Recency

Thank you very much for chiming in, Kurt.

anonymous leads being converted to non anonymous. You won't see anonymous activity in bulk but if a lead is converted later to known then if you got the same bulk data later you would see their activity

If I am reading that correctly, that seems like sort of a big deal. I could very well be missing out on a large number of activities if the person became a known lead a few days after their corresponding activity while they were still anonymous. If that is correct, may I ask how you have gotten around this? Other than pulling a larger time period in the bulk API, hitting the /activities endpoint seems like an alternative.

SanfordWhiteman
Level 10 - Community Moderator

Re: Bulk Activities Export Recency

There's no way to "work around" it -- it defines the way Munchkin tracking works. Someone's session can become associated weeks, months, or years after the anonymous session begins.

The extract accurately reflects (with the exception of the async commit discussed above, which is an actual exception despite the quote from support) a snapshot of the activities in associated web sessions during the given period. It can't know any more than that because it can't see the future!