Compression on Bulk Export APIs?

Kurt_Koller
Level 4

For Bulk Activity Export, the exported file is both stored as plain text and transmitted without gzip compression.

This data is repetitive and compresses very well. In my analysis of files containing all activity types, gzip -6 shrinks them by 90% or more. It's a little better if you're asking for a single activity type.

If we ask for 205 MB worth of activities and the data were stored compressed, it would be about 20.5 MB instead of 205 MB. Even if gzip were only turned on in nginx, that would at least save roughly 90% of the transmission time. Ideally this data would be stored compressed so that only the compressed size counts toward the quota.
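As a rough illustration of how such a ratio can be measured, here is a Python sketch that applies gzip at level 6 to a synthetic, repetitive CSV. The column names and row values are invented stand-ins rather than real export data, so the printed figure shows the method only and is not a reproduction of the ~90% result above.

```python
import gzip


def gzip_savings(data: bytes, level: int = 6) -> float:
    """Return the fraction of bytes saved by gzip at the given level."""
    compressed = gzip.compress(data, compresslevel=level)
    return 1 - len(compressed) / len(data)


# Synthetic stand-in for an activity export (hypothetical columns and values).
header = "id,leadId,activityDate,activityTypeId,attribute\n"
rows = "".join(
    f"{i},{1000 + i % 50},2018-02-12T05:09:34Z,{i % 12},null\n"
    for i in range(20000)
)
sample = (header + rows).encode("utf-8")

print(f"saved {gzip_savings(sample):.1%} of {len(sample)} bytes")
```

Running the same function over a real export file (read as bytes) would give the figure that matters for your own data.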

At a minimum can compression be turned on for the file download so that part is quicker? The server currently doesn't seem to send gzipped data even when asked.

Kurt_Koller
Level 4

Re: Compression on Bulk Export APIs?

I would love to see compression at the file level more than at the HTTP transfer level, in order to stretch the quota.

Would also love to see CSVs not contain the word null for null values between commas:

id,field,field,field,field,field

Right now:

id,null,null,field,null,field

Should be:

id,,,field,,field
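Until the export itself changes, one possible client-side workaround is to blank those fields after download. A minimal Python sketch, assuming that a field which is exactly the string null always stands for a missing value (a genuine literal "null" would be blanked too); blank_nulls is a hypothetical helper name:

```python
import csv
import io


def blank_nulls(csv_text: str) -> str:
    """Rewrite every field that is exactly 'null' as an empty field."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        writer.writerow(["" if field == "null" else field for field in row])
    return out.getvalue()


print(blank_nulls("id,null,null,field,null,field\n"), end="")
# prints: id,,,field,,field
```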

David_Everly
Marketo Employee

Re: Compression on Bulk Export APIs?

Hello Kurt,

HTTP compression (gzip) is supported by Bulk Extract for both leads and activities. When calling either the Get Export Lead File or the Get Export Activity File endpoint, simply add the HTTP header "Accept-Encoding: gzip" and the response body will be gzipped.

curl -H 'Accept-Encoding: gzip' 'https://123-XYZ-456.mktorest.com/bulk/v1/activities/export/<exportId>/file.json'
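For clients that are not curl, the same request can be sketched in Python with only the standard library. This is an illustration under assumptions, not an official client: build_request and read_body are hypothetical helper names, and authentication is left out just as in the curl line above. Note that urllib does not decompress the body on its own, so the sketch checks Content-Encoding and gunzips when the server actually compressed the response.

```python
import gzip
import urllib.request


def build_request(file_url: str) -> urllib.request.Request:
    """Build a GET request that asks the server for a gzip-encoded body."""
    req = urllib.request.Request(file_url)
    req.add_header("Accept-Encoding", "gzip")
    return req


def read_body(response) -> bytes:
    """Read the response and undo gzip only if the server applied it."""
    raw = response.read()
    if response.headers.get("Content-Encoding", "").lower() == "gzip":
        return gzip.decompress(raw)
    return raw  # server ignored the header and sent plain text
```

With a real file URL, something like `with urllib.request.urlopen(build_request(url)) as resp: data = read_body(resp)` should leave the plain CSV bytes in data whether or not the server compressed the transfer.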

-Dave

Kurt_Koller
Level 4

Re: Compression on Bulk Export APIs?

Thanks. The Accept-Encoding header wasn't working for us at launch, so I will try again. I appreciate all your responses over the last few days.

Anonymous
Not applicable

Re: Compression on Bulk Export APIs?

Hello Dave,

I tried adding the HTTP header "Accept-Encoding: gzip", but the response headers do not include Content-Encoding: gzip and the response body comes back as plain text.

Below is the sample request we tried:

Request header:

Accept-Encoding: gzip

Request URL:

https://<marketoendpoint URL>/rest/v1/lead/8781.json?access_token=572b2d67-727a-46f0-a999-81f94d78a4...

Received response header:

server:nginx
date:Mon, 12 Feb 2018 05:09:34 GMT
content-type:application/json;charset=UTF-8
content-length:213
connection:keep-alive

The response body comes back as plain text.

Could you please help me see what I am missing?

Thanks

Arun P

Frederic_Pinch1
Level 2

Re: Compression on Bulk Export APIs?

I tried yesterday, and even though the Accept-Encoding: gzip header is accepted in the request, it still downloads a regular CSV, not a gzip file...

What am I missing?

SanfordWhiteman
Level 10 - Community Moderator

Re: Compression on Bulk Export APIs?

Content encoding means the text in the payload is compressed in transit and automatically decompressed by your client. What you end up with is not a gzip file (otherwise requesting gzip for JS files, like browsers do, could never work since ungzip isn't a native JS capability).

Frederic_Pinch1
Level 2

Re: Compression on Bulk Export APIs?

Oh OK, so that still means that the volume downloaded out of the Marketo instance will be compressed, consuming less of the 500MB a day quota?

SanfordWhiteman
Level 10 - Community Moderator

Re: Compression on Bulk Export APIs?

No, HTTP compression doesn't need to have any relationship to the size of the data exported from a database. They're 2 different areas.

(Though if the data in the db were pre-gzipped, it wouldn't do any good -- only bad, due to resource overhead -- to compress it again before putting it on the wire.)
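The point about compressing twice can be checked with a small Python sketch. The rows are synthetic, seeded pseudo-random values (an assumption made purely for illustration), and exact sizes will differ for real exports; the idea is only to compare the plain size, the size after one gzip pass, and the size after gzipping the already-gzipped bytes.

```python
import gzip
import random

random.seed(0)  # fixed seed so repeated runs produce the same rows

# Synthetic rows (hypothetical layout) with varying numeric fields.
rows = "".join(
    f"{random.randrange(10**9)},{random.randrange(10**6)},field,null\n"
    for _ in range(20000)
)
text = rows.encode("utf-8")

once = gzip.compress(text, compresslevel=6)   # first pass: large saving
twice = gzip.compress(once, compresslevel=6)  # second pass: extra CPU work

print(f"plain={len(text)} once={len(once)} twice={len(twice)}")
```

On data like this the second pass should bring little or no further reduction, which is the overhead-without-benefit case described above.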

Frederic_Pinch1
Level 2

Re: Compression on Bulk Export APIs?

OK, thanks for all the explanations, Sanford. I am new to this topic.

So if I got that right, the conclusion is that requesting a compressed response only results in a faster transfer, with no other benefits...