This document provides an overview of the recent service issue that impacted customers in our Ashburn data center on June 4, 2019. If you are unsure whether your instance was impacted, please contact Marketo Support at https://support.marketo.com.
When:
June 4, 2019
Duration:
5 hours, 12 minutes
Service affected:
Interactive logins may have been intermittently disrupted between 12:33 PM PDT and 2:30 PM PDT. The services listed below may have been intermittently or completely impacted for the duration of the issue, from 12:33 PM PDT to 5:45 PM PDT.
While these services were impacted, the serving of forms and landing pages and SOAP API continued to function as normal.
What happened:
The source of the disruption has been identified as an IP address conflict. On June 4, 2019, a new network hardware device was initialized and incorrectly acquired an IP address that was already in use by the load balancer, another network hardware device. This address conflict caused the load balancer to be intermittently unavailable, stopping network traffic. As a result, network requests to internal services such as Locator Service, Metadata Service, and Activity Service could time out, causing the affected Marketo services to be intermittently or completely impacted. Due to the complex nature of the issue, it took longer than normal to identify the source of the problem. Although we were able to identify the network address generating errors, the errors appeared and disappeared each time the new device was reinitialized, which produced the intermittent symptoms.
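For readers who want a concrete picture of an IP address conflict, the following is a minimal sketch, not a description of Marketo's tooling. It assumes a hypothetical inventory of (device, address) pairs and reports every address claimed by more than one device. The device names, addresses, and the `find_ip_conflicts` helper are all illustrative assumptions.

```python
import ipaddress
from collections import defaultdict


def find_ip_conflicts(assignments):
    """Return {ip: [devices, ...]} for each address claimed by more than one device.

    `assignments` is an iterable of (device_name, ip_string) pairs.
    This helper is hypothetical and exists only for illustration.
    """
    owners = defaultdict(list)
    for device, address in assignments:
        # Normalise the address so equivalent spellings compare equal.
        ip = str(ipaddress.ip_address(address.strip()))
        if device not in owners[ip]:
            owners[ip].append(device)
    # Keep only the addresses with two or more distinct owners.
    return {ip: devices for ip, devices in owners.items() if len(devices) > 1}


# Hypothetical inventory: the new device reuses the load balancer's address.
inventory = [
    ("load-balancer-1", "10.0.0.10"),
    ("app-server-1", "10.0.0.21"),
    ("new-network-device", "10.0.0.10"),
]

conflicts = find_ip_conflicts(inventory)
print(conflicts)
```

A check of this kind only catches conflicts that are visible in the inventory it is given; a device that acquires an address outside of any recorded assignment would need to be detected on the network itself.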
Remediation:
Once the issue was identified, our team took immediate action to resolve it. To restore interactive logins, our team implemented a workaround to bypass the impacted load balancer device. Additionally, we began to migrate the remaining impacted services to an alternative load balancer. During this secondary process, we discovered the IP address conflict. Once we identified the definitive root cause, we disabled the network devices and full service was immediately restored.
To correct the activity data that was not fully indexed during the impacted timeframe, our team has developed and begun a data correction process. Please note that activities that occurred outside of the impacted timeframe were not affected. Due to the large volume of activities, we expect this process to take up to 30 days to complete, with an anticipated end date of July 8, 2019.
Facebook and LinkedIn LeadGen data may not have been recorded in Marketo during the impacted timeframe. To correct this, our team completed a data fix process on June 12, 2019, to replay these requests and ensure no data was lost.
A small number of Munchkin activities that occurred during the impacted timeframe may not have been recorded. Our team cannot reprocess these activities, as doing so would cause duplicate events for other activities that were recorded.
We will continue to update this article, including data fix timelines and processes, as soon as such information is available. Please check this article frequently for updates.
For additional questions, please contact Marketo Support.