SOLVED

Building a data pipeline for Snowflake

Level 1

We are trying to build a data pipeline using the Marketo REST API, AWS Lambda, and Snowflake. We recently exceeded our API limit (it was 10K/day and has since been raised to 50K), and we want to avoid such problems in the future.

 

I am trying to understand the API query limits better. I found this blog post explaining the limits, but I am hoping to get some more details. 

  1. If we use the REST API with a dynamic date range to pull lead activity data (for 10 activity type IDs), paging with the next page token, how much data (in MB) would a one-week date range consume, and how many API calls could a single request/response cycle consume?
    • Up to 200k+ lead activity records per month (approximately)
  2. How many API requests would we end up consuming if we downloaded all leads from the Marketo leads database?
    • The oldest record is from September 2019
    • A bulk export for a 31-day period would return approximately 3k to 5k records

Thank you in advance!

4 REPLIES
Level 10 - Community Moderator

Re: Building a data pipeline for Snowflake

You really should be using the Bulk Extract API for this, not the paginated export. 

 

As for the data transfer in bytes, it's impossible to guess. You have to baseline it yourself with historical (i.e. current) data and make sure you have, say, 20% headroom.
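To make the headroom idea concrete, here is a minimal sketch of that check. It assumes you have already measured your own daily export sizes in MB; the 500 MB/day quota is the figure quoted elsewhere in this thread, and the function names and sample numbers are purely illustrative, not part of any Marketo API.

```python
# Sketch only: the quota and headroom are inputs you should confirm
# against your own subscription, not values read from the API.
DAILY_EXPORT_QUOTA_MB = 500.0  # bulk extract quota quoted in this thread
HEADROOM = 0.20                # fraction of the quota to leave unused


def max_safe_daily_mb(quota_mb=DAILY_EXPORT_QUOTA_MB, headroom=HEADROOM):
    """Largest daily export size that still leaves the requested headroom."""
    return quota_mb * (1.0 - headroom)


def fits_with_headroom(observed_daily_mb, quota_mb=DAILY_EXPORT_QUOTA_MB,
                       headroom=HEADROOM):
    """True if the busiest observed day stays under the safe ceiling."""
    if not observed_daily_mb:
        raise ValueError("need at least one day of baseline measurements")
    return max(observed_daily_mb) <= max_safe_daily_mb(quota_mb, headroom)
```

Feed it the daily totals you actually measured. The check deliberately uses the busiest day rather than the average, since a single spike is what exhausts a daily quota.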

Level 1

Re: Building a data pipeline for Snowflake

Thank you for such a quick response!!! 👍

I had to use the paginated export method because I wanted to download data beyond the 31-day limit. With a bulk export, I did not know what the output would look like, since each activity type ID has a different set of 'attributes'. I will have to test the Bulk Extract API approach for lead activities as well. For the leads data, I have no choice but to use the bulk export, and the output is much simpler too, so it's working out for me.

 

I am a bit confused about how the Marketo API meters usage, and whether byte size takes precedence over the number of API calls. Thank you!

Level 10 - Community Moderator

Re: Building a data pipeline for Snowflake

Why do you need more than 31 days in a single job? 

 

Why can't you use the Bulk Extract for Leads, exactly?

 

The Bulk Extract is, in practice, metered only by bytes (500 MB/day). The number of API calls needed to queue up extracts is so small as to be negligible (if you're that close to running out of calls, you're in big trouble).
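As a rough illustration of why the call count stays small, the sketch below splits a long date range into windows of at most 31 days (one export job per window) and estimates the calls involved. The inclusive way the 31 days are counted, the per-job cost of one create call, one enqueue call, one file download, and a number of status polls, and the example dates are all assumptions for illustration; check them against the Bulk Extract documentation.

```python
from datetime import date, timedelta

MAX_WINDOW_DAYS = 31  # per-job date range limit discussed in this thread


def split_into_windows(start, end, max_days=MAX_WINDOW_DAYS):
    """Split the inclusive range [start, end] into windows of <= max_days days."""
    if end < start:
        raise ValueError("end must not be before start")
    windows = []
    cursor = start
    while cursor <= end:
        # A window of max_days days ends max_days - 1 days after it starts.
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


def estimate_api_calls(job_count, polls_per_job=10):
    """Assumed cost per job: create + enqueue + download + status polls."""
    return job_count * (3 + polls_per_job)
```

For a hypothetical backfill from 2019-09-01 through 2020-08-30, `split_into_windows` yields 12 windows, and `estimate_api_calls(12)` comes to 156 calls under these assumptions, a small fraction of a daily limit in the tens of thousands.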


Level 1

Re: Building a data pipeline for Snowflake

Why do you need more than 31 days in a single job? 

- I needed data beyond the 31-day limit because I was trying to download a huge volume of data on Aug 30th. I did not want to miss any data, and a bulk export would have required multiple requests for the activity data I was hoping to download.

 

Why can't you use the Bulk Extract for Leads, exactly?

- I'll take your recommendation for both the leads database and the activities. Thank you so much for your response on the API and data limits. It makes sense to go with the bulk export. 👍 🎩