Hi,
I'm trying to extract Marketo activities using rest/v1/activities.json. I want to run the extraction app multiple times a day so that each run retrieves a manageable chunk of activities, rather than running once per day and retrieving a very large number. The way I'm trying to implement this is:
1. First extract activities and store them in an Azure blob (CSV).
2. When running the next extraction, get the last activity date (MAX(activityDate)) and extract the activities since then.
pagingToken = rest/v1/activities/pagingtoken.json?sinceDatetime=MAX(activityDate)
Activities = rest/v1/activities.json?activityTypeIds=1&nextPageToken=pagingToken (from the previous call)
At a glance this worked fine. But when I checked the extracted CSV, I noticed the same activities repeated in each extraction run. I believe this is because, even though I asked for activities after a specific time, the API still returns all the activities on the same page regardless of sinceDatetime. My question is: did I understand this correctly? If yes, is there a workaround? If not, please explain where I went wrong.
Thanks in advance!
id | activityDate | activityType | note
287513464 | 7/26/2022 0:58:43 | 1 |
287516365 | 7/26/2022 1:03:56 | 1 |
287516417 | 7/26/2022 1:09:07 | 1 |
287535971 | 7/26/2022 6:15:55 | 1 |
287536024 | 7/26/2022 6:23:20 | 1 |
287536025 | 7/26/2022 6:23:18 | 1 |
287536037 | 7/26/2022 6:23:41 | 1 | This is the last activity from the previous run
New Run
287513464 | 7/26/2022 0:58:43 | 1 | Repeated
287516365 | 7/26/2022 1:03:56 | 1 | Repeated
287516417 | 7/26/2022 1:09:07 | 1 | Repeated
287535971 | 7/26/2022 6:15:55 | 1 | Repeated
287536024 | 7/26/2022 6:23:20 | 1 | Repeated
287536025 | 7/26/2022 6:23:18 | 1 | Repeated
287536037 | 7/26/2022 6:23:41 | 1 | Same activity returned along with some previous activities
287536047 | 7/26/2022 6:25:25 | 1 | This is where I actually want to start the new run
287536934 | 7/26/2022 6:29:25 | 1 |
I think you’re right that this can happen under certain circumstances where you end up on the same virtual “page” (cursor). But it's not clear why that's a significant problem: marketoGUID should be your primary key, and any modern database will let you merge on it.
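To make the merge idea concrete, here is a minimal Python sketch of an upsert keyed on marketoGUID. The function name `merge_activities` is hypothetical, the sample GUID values simply reuse the ids from the table in the question for illustration, and a real database would do this with its own merge/upsert statement instead.

```python
from typing import Dict, Iterable, List


def merge_activities(existing: Dict[str, dict], batch: Iterable[dict]) -> List[dict]:
    """Upsert a batch into `existing`, keyed by marketoGUID.

    Returns only the records that were not already present.
    """
    added = []
    for activity in batch:
        # Assumption: each record carries a marketoGUID; fall back to id if not.
        key = str(activity.get("marketoGUID", activity.get("id")))
        if key not in existing:
            added.append(activity)
        existing[key] = activity  # overwrite, so repeats do no harm
    return added


# Illustrative data only: GUID values reuse the ids shown in the question.
store: Dict[str, dict] = {}
first_run = [
    {"marketoGUID": "287536024", "activityDate": "7/26/2022 6:23:20"},
    {"marketoGUID": "287536037", "activityDate": "7/26/2022 6:23:41"},
]
second_run = [
    {"marketoGUID": "287536037", "activityDate": "7/26/2022 6:23:41"},  # repeated
    {"marketoGUID": "287536047", "activityDate": "7/26/2022 6:25:25"},  # new
]

merge_activities(store, first_run)
new_records = merge_activities(store, second_run)
print([a["marketoGUID"] for a in new_records])
```

With a keyed merge like this, a repeated activity just overwrites itself, so overlapping pages between runs stop mattering.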
In a perfect world, if you generate the paging token with a sinceDatetime greater than the most recent activity's datetime from the previous call, then using that token with the query activities endpoint should not return activities with a timestamp earlier than sinceDatetime.
Can you please double-check that you're updating sinceDatetime properly in the next call? If it isn't updated, I'd expect a response similar to what you've posted. Please also check the JSON response returned by the activities endpoint.
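One way to check this is to build the two requests explicitly and look at what is actually sent. The sketch below is only an illustration under several assumptions: `BASE_URL` is a placeholder, `get_json` is a hypothetical helper that performs an authenticated GET and returns the parsed JSON, and the CSV timestamps are assumed to be UTC (if they are in another time zone, the converted sinceDatetime will be off by that offset). It relies on the `nextPageToken`, `result`, and `moreResult` fields of the activities response.

```python
from datetime import datetime, timezone
from typing import Callable, Iterator
from urllib.parse import urlencode

BASE_URL = "https://123-ABC-456.mktorest.com"  # placeholder instance URL


def to_since_datetime(csv_value: str) -> str:
    """Convert a CSV value like '7/26/2022 6:23:41' to ISO 8601.

    Assumption: the CSV value is in UTC.
    """
    parsed = datetime.strptime(csv_value, "%m/%d/%Y %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def paging_token_url(since: str) -> str:
    query = urlencode({"sinceDatetime": since})
    return f"{BASE_URL}/rest/v1/activities/pagingtoken.json?{query}"


def activities_url(token: str, type_ids: str = "1") -> str:
    query = urlencode({"activityTypeIds": type_ids, "nextPageToken": token})
    return f"{BASE_URL}/rest/v1/activities.json?{query}"


def fetch_since(get_json: Callable[[str], dict], since: str) -> Iterator[dict]:
    """Yield activities from `since` onward, following nextPageToken."""
    token = get_json(paging_token_url(since))["nextPageToken"]
    while True:
        page = get_json(activities_url(token))
        yield from page.get("result", [])
        if not page.get("moreResult"):
            break
        token = page["nextPageToken"]


since = to_since_datetime("7/26/2022 6:23:41")
print(paging_token_url(since))
```

Printing the paging-token URL for each run makes it easy to confirm that sinceDatetime really advances between runs and is in the format the endpoint expects.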
Thank you for the response.
> I think you’re right that this can happen under certain circumstances where you end up on the same virtual “page” (cursor). But not clear why it’s a significant problem. The marketoGUID should be your primary key and any modern db will let you merge.
Yes, that's the approach I'm taking now. When committing to a database, merging is a simple task. But when storing the data in file storage, I have to do some checks and validation on the application side. Anyway, thanks for the answers.
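For anyone storing to files rather than a database, the application-side check can look roughly like the Python sketch below. It is only an outline under assumptions: the column names follow the table in the question, the functions `load_seen_ids` and `append_new` are hypothetical names, and in-memory text streams stand in for the CSV content that would be downloaded from and uploaded to blob storage.

```python
import csv
import io
from typing import Iterable, Set, TextIO

FIELDS = ["id", "activityDate", "activityType"]  # columns as in the question


def load_seen_ids(csv_file: TextIO) -> Set[str]:
    """Collect the ids already stored by earlier runs."""
    return {row["id"] for row in csv.DictReader(csv_file)}


def append_new(csv_file: TextIO, seen: Set[str], batch: Iterable[dict]) -> int:
    """Write only activities whose id has not been seen; return the count."""
    writer = csv.DictWriter(csv_file, fieldnames=FIELDS, extrasaction="ignore")
    written = 0
    for activity in batch:
        key = str(activity["id"])
        if key in seen:
            continue  # repeated activity from an overlapping page
        writer.writerow(activity)
        seen.add(key)
        written += 1
    return written


# Stand-in for the CSV produced by the previous run.
previous = "id,activityDate,activityType\n287536037,7/26/2022 6:23:41,1\n"
seen_ids = load_seen_ids(io.StringIO(previous))

new_batch = [
    {"id": 287536037, "activityDate": "7/26/2022 6:23:41", "activityType": 1},
    {"id": 287536047, "activityDate": "7/26/2022 6:25:25", "activityType": 1},
]
out = io.StringIO()
count = append_new(out, seen_ids, new_batch)
print(count, out.getvalue())
```

Keeping only the set of ids from recent files in memory is usually enough, since repeats come from the page around the previous run's last activity.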