Supposedly, if your DB size is greater than 1M records, Marketo automatically moves you to a so-called enhanced architecture called "Corona" (which is ultimately related to Project Orion). We didn't even realize we were on it until several critical issues with our data surfaced this week - and reaching out to Support confirmed Corona could be the culprit (major delays in processing smart lists and smart campaigns). The other question we're waiting to hear about is why we were even placed on Corona, since our DB size is less than 200,000 records. Here's an example issue that we uncovered this evening:
I just ran a single flow action ("change data value") against a set of 96 email addresses (11 of those leads already had their country properly defined, but I left them in anyway since they would just be skipped). All I did was populate the country value (to Brazil) of any lead that had a missing country value. The flow action completed successfully – and if you access the lead record of any of the leads, you can see that the country value is indeed “Brazil”. Yet when I view the leads all at once in the leads view, many have NULL country values:
When clicking on any of the lead records here, the “country” value contains Brazil - it's just not showing up properly in the leads view. Also, if you add another filter to the smart list – [where country is “Brazil”], it will only return the following 11 leads (out of the 96 that actually have "country = Brazil" in their lead record - these are the leads that weren't affected by the "change data value" flow action that was just applied; and already had "Brazil" as their country value):
After contacting support, it was uncovered that our instance was moved to Corona and that they are experiencing a significant lag in showing data properly (as I'm writing this, it's been two and a half hours since running the flow action to change the country value of all leads to "Brazil" - and it still looks like the first screenshot above). This isn't just a display issue. As noted above, filters/triggers in smart lists/smart campaigns don't have access to this updated data either. If that's the case, I can’t even imagine the magnitude of the problems this is causing in our environment – and how many leads are not being processed correctly.
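For anyone who wants to check their own instance, the comparison described above can be sketched in a few lines of Python. This is only a rough sketch under assumptions: it presumes you have already exported the per-record values and the smart list results by some means, and the function name `find_unindexed_leads` and the sample addresses are hypothetical, not part of any Marketo tool.

```python
def find_unindexed_leads(record_values, smart_list_emails, expected_value):
    """Return emails whose lead record holds expected_value
    but which the smart list filter did not return."""
    # Compare case-insensitively, since email casing can differ between exports.
    listed = {email.lower() for email in smart_list_emails}
    return sorted(
        email
        for email, value in record_values.items()
        if value == expected_value and email.lower() not in listed
    )


# Hypothetical sample data: values read from individual lead records...
record_values = {
    "ana@example.com": "Brazil",
    "bruno@example.com": "Brazil",
    "carla@example.com": "Brazil",
}
# ...versus what a smart list filtered on country = "Brazil" returned.
smart_list_emails = ["Ana@example.com"]

missing = find_unindexed_leads(record_values, smart_list_emails, "Brazil")
```

Any address left in `missing` is a lead whose record has the value but which filters cannot yet see, which is the symptom described in this post.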
Just wondering - has anyone else experienced issues such as this?
I am now witnessing similar issues. New leads are entering our system via API. In their profile view the fields are updating with data but when I look at the Activity Log the Change Data Value actions are not logged.
We have smart campaigns based on Change Data Value triggers which are now not firing, meaning new leads are not receiving critical emails.
This is scary stuff Dan Stevens!
How do you know if you are on Corona?
Has anyone on Corona been told of these known issues and the effects it could be having? (Is there a known list of issues I can check my instance against?)
Thanks,
Gerard
The lag is expected. "Corona" is just the name they use for the Solr indexes that index your data (Note: you don't need to be on Orion to have Corona). The lag occurs while your data is cached or changing and still needs to be replicated. If you have an ever-changing DB (e.g. lots of Change Data Value steps) then you will experience a small lag on load.
If you have an ever-changing DB (e.g. lots of Change Data Value steps) then you will experience a small lag on load.
Isn't this basically every Marketo customer? CDV is probably the most common flow step that's used in everyone's environment. And for us, it's not just a small lag - it's pretty significant.
Agreed, this can't be the cause of significant latency.
Hi Neil,
I wish it was just a lag problem. Unfortunately, it isn't. This is a top priority issue with engineering right now.
Sheila
Then it could well be a poor setup of your pod on Marketo's side. IMHO they made a terrible mistake in not going fully managed/hosted (hopefully something they will fix with the move to GCP).
Hi,
Just to let folks know, it seems that this bug is still haunting some instances. It is very frustrating and very scary to not be able to trust that smart lists/triggers can pull the right records and do the right things. In this case, Marketo Support has escalated to engineering. They supposedly "patched" the instance and fixed it a few weeks ago, but it's still happening. I re-escalated it last week but am still waiting to hear back, and the problems are continuing.
I would caution folks to check your data to make sure the right things are happening with smart list/triggers/filters.
Sheila
We have been seeing similar issues where records are updated, but don't show in any sort of list view - only when you click on the individual lead records. I am not sure what architecture we are on.
That's exactly how we identified it. The data is there - it's just not being seen in any list/smart list; and most importantly, our smart campaigns that execute based on this data are not triggering. Above, I mentioned that this was fixed by reindexing our DB. Unfortunately, the issues still remain - just not as severe. We continue to find smart campaigns that aren't processing leads, even though the data is there to support it. And Engineering informed me that they aren't working on any sort of patch - and instead consider the re-indexing to be the proper fix.
We apologize for the lag issues your subscription has witnessed. Ultimately, we needed to put out a patch to fix a bug as well as "re-index" your subscription. Corona has been around for a few years now, though it was recently upgraded for some subscriptions. It is part of our smartlist technology and vastly improves the speed of retrieval of smartlist results. The way this works is that lead and related data gets indexed into a specialized database to speed up queries, thereby returning smartlist results faster. In this case, the lag was the result of a bug in the step where we take the latest updates and move (index) them into Corona. This has since been patched. We have also reviewed our lag monitors and operating procedures to stay close to this. Once again, apologies for any inconvenience or challenges this has caused.
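For readers trying to picture the indexing step described above, here is a toy sketch in Python of a record store with a separately updated search index. It is not Marketo's implementation: the `LeadStore` class and all of its method names are hypothetical, invented only to illustrate why a lead record can show a value that a smart list cannot yet see.

```python
from dataclasses import dataclass, field


@dataclass
class LeadStore:
    """Toy model: authoritative lead records plus a separately updated index."""
    records: dict = field(default_factory=dict)  # email -> {field: value}
    index: dict = field(default_factory=dict)    # email -> {field: value}
    pending: list = field(default_factory=list)  # emails awaiting indexing

    def change_data_value(self, email, name, value):
        # The record is updated immediately; the index only learns of it later.
        self.records.setdefault(email, {})[name] = value
        self.pending.append(email)

    def lead_record(self, email):
        # Lead detail view: reads the authoritative record.
        return dict(self.records.get(email, {}))

    def smart_list(self, name, value):
        # List view and filters: read only the index.
        return sorted(e for e, f in self.index.items() if f.get(name) == value)

    def run_indexer(self):
        # Copy pending updates into the index; return how many leads were refreshed.
        refreshed = set(self.pending)
        for email in refreshed:
            self.index[email] = dict(self.records[email])
        self.pending.clear()
        return len(refreshed)


store = LeadStore()
store.change_data_value("a@example.com", "country", "Brazil")
before = store.smart_list("country", "Brazil")  # index has not caught up yet
store.run_indexer()
after = store.smart_list("country", "Brazil")   # index now reflects the change
```

In this model, if `run_indexer` is delayed or fails, the lead record and the smart list disagree for as long as the update stays pending, which matches the symptom reported in this thread.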
We have definitely seen a lot of lag in the past (mostly down to poor programs or order of execution choices) but since we had our pod moved to new infrastructure it has been blisteringly fast (fewer customers, more power 😉).
NOTE RE: Corona (or to be more precise, Marketo's re-branded name for Apache Solr) can and will produce a little lag on UI and database retrieval as the indexes get rebuilt / updated with changes.
We've been seeing a ton of lag lately, not sure if we are also on Corona. Adding myself to this thread to follow updates.
Hi Chris - this isn't really about any so-called lag or delay in how the platform performs. The issue here is really about your smart lists or, more importantly, smart campaigns where specific flow steps are expected to happen. For us, the issue mainly arises any time a lead has had a "change data value" occur on any field - while the change is reflected in the lead record, the smart lists/campaigns aren't able to filter/trigger off of it. Let's take the following example:
This is what I was referring to as the "lag". While the value may exist in the lead record, the value is not made available to filters/triggers for days/weeks. At least that's what we found out after doing some further investigation of this issue. And therefore, this has affected hundreds of our existing trigger campaigns that have been set up to process leads properly (e.g., country lead partition routing; lead lifecycle campaigns; adding leads to EPs; etc.).
While I did get an update this morning that the issue was fixed by re-indexing our DB - and that Marketo engineers are creating a patch for the issues on Corona processing for customer smart lists/smart campaigns - we're still left with all of these smart campaigns that didn't process as they were designed to.
UPDATE (10/20/2016): Unfortunately, the re-indexing of our DB did not fully resolve this. It has resolved some of the list/smart list issues, but our trigger campaigns still aren't firing when they should be.
So you are saying the trigger eval queue is broken on Corona.
I haven't noticed anything like that, but then I might not if the lead isn't sent to the main queue.
Josh, in this case, leads weren't even getting to that point since they weren't qualifying for the campaigns that they should have. What we're having to do is identify as many trigger campaigns as we can (this is pretty widespread across our 23 workspaces) and temporarily clone/run some of those trigger campaigns as batch campaigns. Unfortunately, that only works for some of our campaigns - specifically those that don't have timely/dependent trigger campaigns (such as our country lead partition routing ones). Our lead lifecycle and active engagement programs in each country are the ones that are most impacted by this. In summary, it's a bit of a mess for us right now as we scramble to fix what we can. It appears this issue was going on for anywhere between 2 weeks and a month. If engineering had re-indexed our DB when it was suggested, it wouldn't have been as severe as it is now.
Thanks for elaborating Dan, I completely understand. We are doing similar things with how we have some of our events set up with alert and task follow-up in Marketo, which is where we are seeing the extreme delays. Anywhere from a couple of hours up to a couple of days. Keep me posted if you don't mind, I'd love to hear about it.
I had a giant lag last week. Not aware that we're on Corona or Orion yet...though we should be.
Also you should NEVER run single flow actions because the possibility to find those records and reverse the action is pretty much zero.
Unless you've added them to a My List
Dan Stevens thanks for bringing this to our attention. Can't say I've come across anything so far but will now be keeping an eye out as I believe we're on the upgraded architecture.