Speeding Up - getMultipleLeads API Call

Anonymous

Hi, 

I am refactoring some code that downloads lead data from Marketo. The code uses the getMultipleLeads API call to request the lead data. As you may be aware, this call returns at most 1,000 lead records per request:

http://developers.marketo.com/documentation/soap/getmultipleleads


We have had to download lead data in the range of 900K to 3 million records. As you can see, this can be a very slow process...


Problem:

Speed up the time it takes to download large recordsets from Marketo via the getMultipleLeads API Call


Solution:

Use multi-threading to make concurrent getMultipleLeads API calls


My Logic:


1.     Make an initial getMultipleLeads API call to get the size of the available recordset. Let’s use 1,050 as the returned recordset count for this example.


2.     With multi-threading running, I want each thread to request 100 records per getMultipleLeads call. I divide 1,050 by 100, round up, and determine that 11 threads need to be spawned.
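
To make the arithmetic in step 2 concrete, here is a minimal Python sketch. The function name threads_needed is just a hypothetical helper for illustration; the only point is that the division has to round up so the last partial batch (50 records here) still gets a thread.

```python
import math

def threads_needed(total_records: int, batch_size: int) -> int:
    """Number of batches (one per thread) needed to cover every record."""
    # Round up: a partial final batch still needs its own request.
    return math.ceil(total_records / batch_size)

print(threads_needed(1050, 100))  # 11
```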


3.     This is the step that seems costly/inefficient to me. To make the multi-threaded solution work, each thread needs its own streamPosition to start from. I get these by calling getMultipleLeads in a sequential loop, once per thread to be spawned (11 in this example), and storing each returned streamPosition in a SQL table. Let’s call it tbl_threadStreamPositions.

             Loop#01 -  Returns StreamPosition For Thread#02
             Loop#02 -  Returns StreamPosition For Thread#03
             Loop#03 -  Returns StreamPosition For Thread#04
             Loop#04 -  Returns StreamPosition For Thread#05
             Loop#05 -  Returns StreamPosition For Thread#06
             Loop#06 -  Returns StreamPosition For Thread#07
             Loop#07 -  Returns StreamPosition For Thread#08
             Loop#08 -  Returns StreamPosition For Thread#09
             Loop#09 -  Returns StreamPosition For Thread#10
             Loop#10 -  Returns StreamPosition For Thread#11
             Loop#11 -  Returns No StreamPosition

I try to speed up this pass by limiting the attributes requested in each getMultipleLeads call to 1-3.
Example: FirstName, LastName, Company
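
The loop in step 3 can be sketched roughly as follows in Python. Everything here is an assumption for illustration: fetch stands in for one getMultipleLeads request and is assumed to return the record count plus the next stream position (or None on the last batch), and make_fake_fetch is a fake that simulates the API in memory so the sketch runs without a Marketo account. The list it returns plays the role of tbl_threadStreamPositions.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical shape of one API call:
# (stream_position, batch_size) -> (record_count, next_stream_position)
FetchFn = Callable[[Optional[str], int], Tuple[int, Optional[str]]]

def collect_stream_positions(fetch: FetchFn, batch_size: int) -> List[Optional[str]]:
    """Walk the recordset once and record where each thread should start."""
    positions: List[Optional[str]] = [None]  # thread 1 sends no stream position
    while True:
        _, next_pos = fetch(positions[-1], batch_size)
        if next_pos is None:  # last batch: no position to hand to another thread
            return positions
        positions.append(next_pos)

def make_fake_fetch(total: int) -> FetchFn:
    """Stand-in for the real call; uses the record offset as the stream position."""
    def fetch(pos: Optional[str], batch_size: int) -> Tuple[int, Optional[str]]:
        start = int(pos) if pos is not None else 0
        end = min(start + batch_size, total)
        return end - start, (str(end) if end < total else None)
    return fetch

positions = collect_stream_positions(make_fake_fetch(1050), 100)
print(len(positions))  # 11
```

Note that this pass is still strictly sequential, since each call needs the position returned by the previous one. That is why it costs as many round trips as the download itself.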


4.     Once the previous step finishes, my code fires off 11 concurrent threads, each running something like the pseudocode below.

<mkt:paramsGetMultipleLeads>

    <!-- Thread 1 starts at the beginning of the recordset, so it sends no streamPosition -->
    <if thread <> 1>
        <streamPosition>{{thread streamposition from tbl_threadStreamPositions}}</streamPosition>
    </if>

    <includeAttributes>
        <loop list="{{list of attributes to request}}" index="local.x">
            <stringItem>{{local.x}}</stringItem>
        </loop>
    </includeAttributes>

</mkt:paramsGetMultipleLeads>
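
For the concurrent part of step 4, a minimal Python sketch using the standard library's ThreadPoolExecutor might look like this. fetch_leads is a hypothetical placeholder for one getMultipleLeads request (it only echoes its inputs), and the stream position strings are made up; the real SOAP call would go in its place.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional

def fetch_leads(stream_position: Optional[str], attributes: List[str]) -> List[dict]:
    """Placeholder for one getMultipleLeads request; swap in the real SOAP call."""
    return [{"streamPosition": stream_position, "attributes": attributes}]

def download_all(positions: List[Optional[str]], attributes: List[str],
                 max_workers: int = 4) -> List[dict]:
    """Run one request per stored stream position on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so batches stay in stream order.
        batches = pool.map(lambda pos: fetch_leads(pos, attributes), positions)
        return [lead for batch in batches for lead in batch]

leads = download_all([None, "pos-2", "pos-3"], ["FirstName", "LastName", "Company"])
print(len(leads))  # 3
```

Capping max_workers below the number of batches is deliberate in this sketch: it keeps the number of simultaneous requests bounded instead of opening one connection per batch.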



Questions:

- Does this logic look sound?
- Is there a better way to accomplish this via the API?
- Has anyone had any luck using ETL tools like Talend, Jitterbit or Mulesoft to speed up this process?



Thanks...
 

Anonymous

Re: Speeding Up - getMultipleLeads API Call

Hi Sibbs - Take a look at this article on the Developer blog. It shows how to use the LastUpdateAtSelector with GetMultipleLeads. With the LastUpdateAtSelector you can set a start and end date so that only leads updated within that range are returned. You can then give each thread its own time range, so you don't have to reverse engineer the streamposition to split up the work.
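
As a rough illustration of splitting the work by time range, the Python sketch below cuts one date range into equal windows, one per thread. split_time_range is a hypothetical helper and the dates are arbitrary examples; each (start, end) pair would become the start and end of one thread's LastUpdateAtSelector.

```python
from datetime import datetime
from typing import List, Tuple

def split_time_range(start: datetime, end: datetime,
                     parts: int) -> List[Tuple[datetime, datetime]]:
    """Split [start, end] into `parts` contiguous windows of equal length."""
    step = (end - start) / parts
    windows = []
    for i in range(parts):
        lo = start + step * i
        # Pin the final window to `end` so nothing is lost to rounding.
        hi = end if i == parts - 1 else start + step * (i + 1)
        windows.append((lo, hi))
    return windows

windows = split_time_range(datetime(2014, 1, 1), datetime(2014, 1, 5), 4)
for lo, hi in windows:
    print(lo.isoformat(), "->", hi.isoformat())
```

Adjacent windows share a boundary timestamp here, so if the selector treats both ends as inclusive, a lead updated exactly on a boundary could come back twice. De-duplicating by lead ID after the download would cover that case.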

Another alternative, if it fits the customer's requirements, is to reduce the number of leads you need to export by creating a static list of only those leads and then using the StaticListSelector with GetMultipleLeads.
Anonymous

Re: Speeding Up - getMultipleLeads API Call

Hi Travis,

Thank you for your response to my inquiry. I will look into the options you provided and see if I can speed up the download process...

Thank you.
Sibbs