We are seeing an intermittent issue when querying Leads via the SOAP API using
'last-updated-at' as the key: the API returns more than one occurrence of the same Lead in the XML response.
For example, in a query where the API returns 10k to 20k Leads in the XML response, 1 or 2 Leads (sometimes a handful) are returned more than once with the same ID and data, so we see two identical XML snippets for the same Lead.
If the result set is larger (e.g. our Marketo admin user performs a mass update, causing the API to return 100k Leads), the chance of getting duplicates is higher.
XML Request:
<mkt:paramsGetMultipleLeads>
<lastUpdatedAt>2013-05-08T10:05:54+08:00</lastUpdatedAt>
................
................
</mkt:paramsGetMultipleLeads>
I can't give a single XML request that reliably reproduces the issue, because it is intermittent and the query changes on every run (it is based on last-updated-at). It does, however, seem to be related to mass updates in Marketo: after a mass update last night, we received a lot of duplicated Leads with the same ID.
I am not sure whether this is a known issue or whether someone has reported it before, but it would be great if you could give us a hint on how to avoid the dupes (e.g. by adding an additional parameter to our request).
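In the meantime, the only workaround we can think of is dropping the repeats on our side after parsing the response. Below is a minimal sketch of that idea in Python, not something taken from the Marketo documentation. The element names `leadRecord` and `Id`, the helper name `dedupe_leads`, and the sample XML are all assumptions for illustration; the real response may use different or namespaced tags.

```python
import xml.etree.ElementTree as ET


def dedupe_leads(xml_text, record_tag="leadRecord", id_tag="Id"):
    """Return lead elements with repeats of the same ID dropped (first one wins).

    record_tag and id_tag are assumed names; adjust them to the actual response.
    """
    root = ET.fromstring(xml_text)
    seen = set()
    unique = []
    for record in root.iter(record_tag):
        lead_id = record.findtext(id_tag)
        if lead_id in seen:
            continue  # same ID already collected, skip the duplicate
        seen.add(lead_id)
        unique.append(record)
    return unique


# Hypothetical, heavily shortened response with one duplicated lead (ID 101).
sample = """<result><leadRecordList>
<leadRecord><Id>101</Id><Email>a@example.com</Email></leadRecord>
<leadRecord><Id>102</Id><Email>b@example.com</Email></leadRecord>
<leadRecord><Id>101</Id><Email>a@example.com</Email></leadRecord>
</leadRecordList></result>"""

print([r.findtext("Id") for r in dedupe_leads(sample)])  # ['101', '102']
```

This only hides the symptom on the client; we would still prefer a request-side way to stop the duplicates from being returned at all.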