Phone numbers aren’t actually numbers

SanfordWhiteman
Level 10 - Community Moderator
Level 10 - Community Moderator

Here’s a fun one:

2022-04-11-21_41_49[1].png

 

What happened here?

 

It’s a case of mistaking numeric strings for numbers. Not an error in Marketo itself, but in an ETL app that inserts custom objects into Marketo.

 

In Marketo, CSM Owner Phone is a string field (just like the standard Phone Number and Mobile Phone Number, which are string subtypes). And in the source-of-truth customer database, the PHONE column in the CSM table is a CHAR. So far so good.

 

But the in-between app tries to auto-detect data types before passing them on to the Marketo API. And it’s... kinda bad at it.

 

If a value is all digits – not punctuated with spaces or other symbols — it thinks it’s a number. In other words, 212 555 1212 and (212) 555-1212 will remain strings, but 2125551212 will be cast to a number.

 

This is extra-bad because the subtype of number it chooses is a limited-precision field that can only handle 6 significant digits (3.53215 = 3 5 3 2 1 5 = 6 SDs). Then it stores the value in E notation — as a string!  — and writes it to Marketo.

 

This means the conversion permanently loses data. The number 353215002939 will be stored as 3.53215E+11. But so would the numbers 353215002940 and 353215002941 and so on! They’ll all be stored as 3.53215E+11 (that’s E notation for 353215000000.) It’s impossible to reproduce the original value, making the value useless.

 

So millions (yes, millions!) of records were imported into production that now have to be overwritten with the right data, which will take a ton of time. Ugh!

 

Takeaways

Accidentally auto-converting number-like strings to numbers we’ve seen before. But using a precision that can’t even store the full original value (without leading zeros)? That’s crazy, yo!

 

In any case, this incident reminds us how vital it is to communicate across groups.

 

In theory, the team writing the app could’ve guessed that a field with “phone” in the name is meant to store phone numbers. And phone “numbers” — like credit card “numbers” and Social Security “numbers” — should never use a numeric datatype. They could argue it wasn’t their place to discern business meaning from a field name... but I’d counter-argue that it wasn’t their place to auto-detect datatypes, either.☺

 

They also didn’t read values from Marketo after insertion to see if they were identical, a fundamental QA test for an ETL tool. The loss of precision is a huge bug that should’ve been detected before go-live.

 

But the biggest error was not talking enough with the folks who need to use the data day-to-day. Yes, meetings suck, but proper integrations need many pairs of eyes!

2132
10
10 Comments
Jo_Pitts1
Level 10 - Community Advisor

Terrifyingly, I've seen this all to often... including Excel!

A client asked me to eyeball some data before they imported it.  Somehow, they'd not noticed the munged phone numbers!  

SanfordWhiteman
Level 10 - Community Moderator

Did they end up with the loss of precision too? How did that happen? I tried to figure out what language/datatype would do this by default and couldn’t figure it out. Even a PostgresSQL REAL (which I kind of thought it was) doesn’t have this exact behavior.

Jo_Pitts1
Level 10 - Community Advisor

Yep.. because excel had moved it to E+ notification, and then someone copy and pasted!

SanfordWhiteman
Level 10 - Community Moderator

But even Excel wouldn’t auto-detect a large Integer into E+ with 6 significant digits?

Jo_Pitts1
Level 10 - Community Advisor

Yes it does... quite often!

And if you then copy/paste then the info is gone.  If you tell Excel to reformat the number, then copy/paste.. you're OK

 

SanfordWhiteman
Level 10 - Community Moderator

Ah, I see it now. Depending on how wide the column is dragged, between 1 and 6 SD. Wow, what a joke!

SanfordWhiteman
Level 10 - Community Moderator

I‘d like to think, though, that they haven’t been using Excel as their ETL... but maybe it’s actually a weekly batch made in Excel and uploaded by hand. That would explain it!

Jo_Pitts1
Level 10 - Community Advisor

As with a few of my clients.. they WERE.  Now, they are not! 🙂

 

 

Katja_Keesom
Level 10 - Community Advisor + Adobe Champion

I know so many examples of developers and business not communicating, making assumptions and making a mess 🤔. And some of these systems chipping in with their own assumptions doesn't help. If I lead an integration project my first focus is to get the teams to develop a common language, understanding the other team's world and priorities.

SanfordWhiteman
Level 10 - Community Moderator

Yeah, it‘s a problem that never seems to go away. Business often doesn’t want to hear/learn about datatypes, but they‘ll be the first to feel the hurt if developers make a bad guess.

 

Developers can be reckless in their own right: I’ve had dev teams request they be able to create a new field on the fly, so they could respond directly to Marketing requests without involving MOPS.  They wanted to send arbitrary data to the integration tier, have the tier auto-detect the type and update the schema in Marketo. Luckily we were able to sidestep this by noting that fields need to be created on the SFDC side in order to sync. But it’s such a dangerous concept!