Phone numbers aren’t actually numbers

Level 10 - Community Moderator

Here’s a fun one:



What happened here?


It’s a case of mistaking numeric strings for numbers. Not an error in Marketo itself, but in an ETL app that inserts custom objects into Marketo.


In Marketo, CSM Owner Phone is a string field (just like the standard Phone Number and Mobile Phone Number, which are string subtypes). And in the source-of-truth customer database, the PHONE column in the CSM table is a CHAR. So far so good.


But the in-between app tries to auto-detect data types before passing them on to the Marketo API. And it’s... kinda bad at it.


If a value is all digits, with no spaces or other punctuation, the app decides it’s a number. In other words, 212 555 1212 and (212) 555-1212 will remain strings, but 2125551212 will be cast to a number.
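To make that rule concrete, here’s a minimal sketch of what such a detector might look like. The function name `naive_detect` is made up for illustration; the real app’s code wasn’t available, so this only mirrors the behavior described above.

```python
def naive_detect(value: str):
    """Hypothetical stand-in for the app's type sniffing.

    Any string made up solely of ASCII digits is treated as a number;
    everything else is passed through as a string.
    """
    if value.isascii() and value.isdigit():
        return int(value)
    return value


for raw in ["212 555 1212", "(212) 555-1212", "2125551212"]:
    detected = naive_detect(raw)
    # Only the unpunctuated value changes type.
    print(repr(raw), "->", type(detected).__name__)
```

The first two values come back as `str`; only the unpunctuated one comes back as `int`, even though all three are the same phone number.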


This is extra-bad because the numeric subtype it chooses is a limited-precision type that can only hold 6 significant digits (3.53215 = 3 5 3 2 1 5 = 6 SDs). Then it stores the value in E notation (as a string!) and writes it to Marketo.


This means the conversion permanently loses data. The number 353215002939 will be stored as 3.53215E+11. But so would the numbers 353215002940 and 353215002941 and so on! They’ll all be stored as 3.53215E+11 (that’s E notation for 353215000000). It’s impossible to recover the original value from what was stored, which makes the stored value useless.
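The collapse is easy to reproduce. The sketch below uses Python’s own string formatting to mimic a 6-significant-digit E-notation conversion; `to_six_sig_digits` is a made-up helper name, and the app’s actual conversion routine is an assumption.

```python
def to_six_sig_digits(n: int) -> str:
    # ".5E" keeps one digit before the decimal point and five after it,
    # i.e. six significant digits in E notation.
    return format(n, ".5E")


originals = [353215002939, 353215002940, 353215002941]
stored = [to_six_sig_digits(n) for n in originals]

# Three different inputs, one stored string.
print(stored)  # ['3.53215E+11', '3.53215E+11', '3.53215E+11']

# Going backwards only yields the rounded value, never the original.
recovered = int(float(stored[0]))
print(recovered)  # 353215000000
```

Since many inputs map to the same output, no reverse conversion can tell them apart.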


So millions (yes, millions!) of records were imported into production that now have to be overwritten with the right data, which will take a ton of time. Ugh!



We’ve seen number-like strings accidentally auto-converted to numbers before. But using a precision that can’t even store the full original value (even when there are no leading zeros to lose)? That’s crazy, yo!


In any case, this incident reminds us how vital it is to communicate across groups.


In theory, the team writing the app could’ve guessed that a field with “phone” in the name is meant to store phone numbers. And phone “numbers” — like credit card “numbers” and Social Security “numbers” — should never use a numeric datatype. They could argue it wasn’t their place to discern business meaning from a field name... but I’d counter-argue that it wasn’t their place to auto-detect datatypes, either.☺


They also didn’t read values from Marketo after insertion to see if they were identical, a fundamental QA test for an ETL tool. The loss of precision is a huge bug that should’ve been detected before go-live.
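A round-trip check like that doesn’t need to be fancy. Here’s a rough sketch, assuming you can get both the values you sent and the values read back into plain dictionaries keyed by record ID. The function name, the record keys, and the dictionaries are all hypothetical; the actual read-back would go through whatever API client the ETL tool already uses.

```python
def find_round_trip_mismatches(sent_rows: dict, read_back: dict) -> list:
    """Return (key, sent, stored) for every value that changed in transit."""
    mismatches = []
    for key, sent in sent_rows.items():
        stored = read_back.get(key)  # None if the record never arrived
        if stored != sent:
            mismatches.append((key, sent, stored))
    return mismatches


# Hypothetical sample: what the source database held...
sent_rows = {"csm-001": "353215002939", "csm-002": "(212) 555-1212"}
# ...and what came back from the target after insertion.
read_back = {"csm-001": "3.53215E+11", "csm-002": "(212) 555-1212"}

for key, sent, stored in find_round_trip_mismatches(sent_rows, read_back):
    print(f"{key}: sent {sent!r}, stored {stored!r}")
```

Run against even a small sample before go-live, a strict string comparison like this would have flagged the all-digit values right away.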


But the biggest error was not talking enough with the folks who need to use the data day-to-day. Yes, meetings suck, but proper integrations need many pairs of eyes!