Index Custom Fields in Marketo

Index Custom Fields in Marketo

Today, only standard Marketo fields have an index placed on them that improves and optimizes smart list performance. It would be beneficial if indexes could be placed on custom fields so that smart lists do not frequently time out when using filters on our business's core fields.

Indexes would be optimal for fields that are unique or have external Ids in another platform, are boolean, date, or have a static set of values (such as in a drop down).

11 Comments
Robb_Barrett
Marketo Employee

Using Static Lists instead of custom fields is a very fast alternative approach.

Big Marketo, Pt 2: Use static lists instead of creating fields

Anonymous
Not applicable

It's really not when you have to add members to the list via a static list via a batch campaign or manually. Also, it's operationally inefficient to add the entire database to a static list just to improve ad hoc reporting (causes backlogs, etc). For example, I was working on a database reconciliation (600,000+ records) as we just implemented a sync filter and trying to find records that are in marketo but not our CRM. Identifying a pattern is not the challenging part, its getting the static list to pull back without timing out repeatedly.

Custom fields are a requirement for a lot of businesses - especially when there's a sync involved to a CRM. We should be able to filter on those fields just as easily. Besides, how would you build the trigger to put these fields in a static list... through a smart list in smart campaign.

I'm not requesting all custom fields be indexed, just a couple core ones that can be requested through support.

Robb_Barrett
Marketo Employee

I upvoted your idea, I was just trying to show an example. 

You can add someone to a list based on a trigger like a form fillout, it's not ALL batch or manual. There are definitely easy ways to solve for your issue now, but one of the things you mentioned is boolean fields, which really don't need indexed. In fact, you'd be creating a lot of database bloat for a field type that only has two values and is quick to sort. 

Likewise for your date fields if you need to sort on them often you should create a field for Year, one for Month and one for Date as doing any search on a datetime field is resource intensive as it stores a very large INT value. Right now, in Unix time the date is 1480376078. If I were to store the date in three fields: 2016. 11. 28  I have one smallint and two tinyints, which are very easy to store and sort. If you do a lot of searching and sorting by date you're better off to create three fields for each of your date fields, yearCreatedOn, monthCreatedOn, dayCreatedOn, yearUpdatedOn, monthUpdatedOn, dayUpdatedOn, etc, and having a workflow that processes these overnight.

It would be nice if we could enumerate dropdowns for faster sorting and searching, I'll agree with you there.

I'm not disagreeing with you that indexing would be nice but I don't know how many people understand how an index bloats the database and, with Marketo not really giving up a lot of options to keep our database size small and forces us to store all of our data on just a few objects, optimization isn't really well enough understood by most people to prevent them from indexing every field.

Ultimately, I've found that writing to a static list works AWESOME as a custom boolean field. For your example, you can always complete your sync by updating a list to show that they've been synched.

SanfordWhiteman
Level 10 - Community Moderator

any search on a datetime field is resource intensive as it stores a very large INT value.

DATETIMEs are stored as 2 INT32s. (This makes your point even stronger than just INT32 vs INT<32!)

I'm uncomfortable making assumptions about the Marketo data model, though, since it's not like an in-house OLTP or warehouse whose design, and accompanying query engine quirks, are well-known. With backend moves to NoSQL and separate indexing tiers, we need to continually test approaches in Marketo itself to see what happens.

Anyway, I upvoted the idea, but there have to be limits to how many fields are indexed, like 5-10.

P.S. Also not sure I agree with the idea of never indexing booleans/tinyints, esp. when you include clustered, partial, and bitmap indexes. But that is a discussion for elsewhere.

Anonymous
Not applicable

For the sync filter it has to be a field in both systems, there's no way around that (requirement even by support). Since it's an established field, I should be able to run a smart list off of it without having to manage two separate static lists and set up a managing trigger every time the value changes. Also keep in mind once a sync filter is turned off on the SFDC side, it's cut off then and there - so you have to do a sort of cyclic work around to "request a desync" and then let marketo update the sync filter - not thrilled with the design but I don't know of another simple way to otherwise get the values to be uniform in both systems. The behavior I am observing when trying to run a smart list on whether the sync filter is false in marketo is a lot slower than anticipated given that the field is boolean. It is faster not to have a filter at all and pull back all records, which doesn't make sense - perhaps this just means that there needs to be a general performance boost.

My administrative style has been to manage records and not lists as I manage multiple platforms and integrations. I'm an aggressive enforcer of ensuring that that data is only captured in one spot. I've not been fond of the idea of static lists as they require additional management and can be misleading if records aren't added or removed appropriately. The idea that I have to create additional lists and triggers to get a performance boost to pull records seems superfluous and could quickly result in an administrative mess.

I think your article has great applications for temporary/Q&A fields (ie dynamic data capture), but the index I had in mind is for well established or administrative  fields that need to be filtered. I'm not worried about sorting as most of the time I just export the list immediately afterwards for cross review/calculations.

I completely agree that there should be governance on the number of indexes and how they're utilized. As far as the general masses using indexes incorrectly, I wouldn't be as concerned about it. If you have to go through support to implement a custom index, they can vet the use case and advise accordingly. SFDC has some good rules/information about how they utilize custom indexes that could probably be taken into consideration for this idea:

https://developer.salesforce.com/docs/atlas.en-us.apexcode.meta/apexcode/langCon_apex_SOQL_VLSQ.htm

Force.com Formula Fields, Indexes, and Performance Gotchas - Engineering

http://resources.docs.salesforce.com/194/0/en-us/sfdc/pdf/salesforce_query_search_optimization_devel...

I disagree about not considering date & date times as it's common to have a few date fields indexed in other systems (such as a system mod stamp). I would argue that most people use dates for filtering and not for sorting. But it goes back to how Marketo would design this and vetting the use case at the request. I bet there's at least one company out there that has critical business processes revolving around a date field that could use a boost in their report performance.

Overall, I think we both agree that there are some great uses cases to having some custom indexes placed, but there needs to be careful consideration on how they're placed and on what field types they can be placed on to prevent bogging down the database. Thanks for the vote.

Robb_Barrett
Marketo Employee

I'm currently working on a project where I use synched fields on a temp basis and write to static lists.  Here's the case:

We want to have a bilateral sync on product interest in SFDC and Marketo.  Because a person may indicate cross-product interest either directly to a sales person (who records it in SFDC) or through an inbound web-form we want to be able to keep two systems in sync.

To accomplish this, we created 14 product family fields in SFDC which contain between 5-100 product choices.  Because product lines are volatile and very few customers have interest in multiple product families, I didn't want to create custom fields.

So, here's how it works:

From SFDC, if the lead says "Hey, I'm also looking for a vendor for a new ABC component as well as a 123 part and a 679 part for my Foo machine," the sales rep goes to product family Foo and selects an interest in ABC, 123 and 679. This syncs to Marketo which has a FooFamily field.

From here, I have a workflow listening for changes. It reads the values and writes to lists: ABC, 123, 679.

Now, two weeks later, we look for leads who are interested in 123 and 679 and email an offer to them the 458 part. They bite, fill out a form for more info. Now I capture the interest in the 458 list. THEN...

I cycle through all of the products in the Foo family lists.

IF IN LIST 123 then FlexField1 = 123

IF IN LIST 458 THEN FlexField1 = {{lead.FlexField1}}; 458

IF IN LIST ABC THEN FlexField1 = {{lead.FlexField1}}; ABC

etc

After I'm completed, I re-wipe the FooFamily field (just to make sure) and copy FlexField1 to the FooFamily, then sync it to SFDC.

Now I have small lists for product interests instead of database fields. I have over 1 million records so even storing a bit field for a couple hundred fields is out of hand since it'd be 99.9% False.

But wait! There's more!

Because they're fields, I can name them as (region) - (regional language name) - (english language name), i.e: EGM - кардиология - Cardiology.

And then....

When we retire the product line, I'm not stuck with a field

And then...

When we rename the product line, I can rename the list

And then....

If there's a difference in the Canadian - English vs. Canadian - French version of the product line, I can differentiate

And then...

When we add in a new product to the family, I don't have to add in new fields, just a new list.

And then...

NO MORE AND THEN.

So you see, the problem you're facing isn't unique. Is my solution easy? No. Does my solution provide job security? Mmmmm....maybe? But it's scalable, manageable, lightweight and not any more difficult to use than a field. It's a bit of setup to create it but then it's flow and go. Works well, prevents dataloss from some schmoe overwriting a field. I can also have these lists sent to me via a subscription weekly or monthly so I have offline backups.

SanfordWhiteman
Level 10 - Community Moderator

It is faster not to have a filter at all and pull back all records, which doesn't make sense - perhaps this just means that there needs to be a general performance boost.

Hopefully, that boost is coming!

But this is a perfect example of where guessing at the implementation -- that Leads is a single table with an unindexed bit column -- may lead to mistaken assumptions. We don't know whether there is a partial EAV schema in place and you're hitting a very complex view (EAV columns-as-rows would be almost impossible to index).

Robb_Barrett
Marketo Employee

I'm guessing that the Company fields are a separate object and the custom fields are probably a separate object from the core leads field and the Opportunities, etc, are all separate. I don't see any normalization happening but this is a transactional system. We know that there are nightly report tables built for analysis purposes but the main object itself? While it's probably a view of all of the components combined (for display purposes) I'd be really surprised to see a transaction system with a centralized lead table broken into too many components. I mean, I guess many Address since it's used in both Lead and Company (and possibly other places) but even that'd create a lot of server stress to continuously rebuild it on each query.

When I worked with Neolane the schema was all XML based. It was a sunofagun to work with but they exposed the schema to allow for the creation of any tables, columns, derived tables, etc that you wanted. They also had a visual modeler and quite a bit more functionality. Took a lot of knowledge to use it, but possibilities were limitless. As were the bugs.

SanfordWhiteman
Level 10 - Community Moderator

I'm not saying it is this way, just that we don't know, and I have known tenant-extensible apps that use EAV for custom fields.

Anonymous
Not applicable

Just confirmed with support that this idea has not been released and is not a current feature or functionality that can be requested through support. Whoever marked this as 'already have it' is incorrect.