Data Clean up for �

Level 2

Hi, we have contacts in the database that are displayed with a � in place of letters such as é, ä, ü, ö, and á.

This is mostly in the First and/or Last Name and sometimes in the Email Address. I would like to run a data cleanup, but I'm not sure exactly how to do that without creating duplicates. I was hoping I could somehow make use of the individual IDs. Any suggestions on how to do that?

Level 10 - Community Moderator

Re: Data Clean up for �



First, you may already realize this, but some others won't: the Unicode replacement character can signify two different situations:

  1. There's an unrecognizable/impossible sequence of code units in the data, which can't be translated to a Unicode code point. This typically happens due to choosing the wrong encoding during an import. At output time, the � is displayed at each unrecognizable position. In this case the original bytes, even if mangled, are still there in the data.
  2. During some previous import/export process, there were encoding problems, and somewhere along the way the literal Unicode replacement character � was inserted into the data in place of the original bad bytes. In this case the � isn't just used for output, it's really there in the data.

While case [1] is more likely, you can't actually tell which one is happening by looking at the displayed character only.
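To make the two cases concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the original value was the Latin-1 encoded name "José" and that the reader expected UTF-8; your actual encodings may differ.

```python
REPLACEMENT = "\ufffd"  # U+FFFD, the Unicode replacement character

original = "José"
latin1_bytes = original.encode("latin-1")  # b'Jos\xe9'

# Case 1: the bytes are intact but are read with the wrong encoding.
# A lone 0xE9 is not a valid UTF-8 sequence, so a lenient decoder
# substitutes U+FFFD at output time. The stored bytes are unchanged,
# so decoding with the right encoding recovers the name.
displayed = latin1_bytes.decode("utf-8", errors="replace")
recovered = latin1_bytes.decode("latin-1")

# Case 2: an earlier step already wrote U+FFFD into the data.
# The original 0xE9 byte is gone and no re-decoding brings it back.
stored = displayed.encode("utf-8")  # b'Jos\xef\xbf\xbd'
redisplayed = stored.decode("utf-8")

print(displayed == redisplayed)  # both cases look identical on screen
print(recovered)
```

In case 1 the fix is re-decoding with the correct encoding. In case 2 the correct characters have to come from somewhere else, such as the original source file.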


Now, to your specific question: it's true that you can't change the Email Address field and do a typical UI-based import without creating duplicates, because the Email Address is itself the dedupe key.


You can use an API-based import, if you are/have a developer, to dedupe on the Lead ID instead.
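As a rough sketch of what that could look like, the Python below only builds a request body and sends nothing. It assumes the REST Sync Leads endpoint with `action` set to `updateOnly` and `lookupField` set to `id`; the field names (`firstName`, `email`), the lead ID, and the helper `build_sync_payload` are hypothetical, so check them against your own instance before using anything like this.

```python
import json

REPLACEMENT = "\ufffd"  # U+FFFD must not survive into the corrected values

def build_sync_payload(corrections):
    """Build a request body that updates existing leads matched by ID.

    `corrections` is a list of dicts, each holding the lead "id" plus
    the corrected field values (hypothetical field names).
    """
    for row in corrections:
        if "id" not in row:
            raise ValueError("every row needs the lead ID to match on")
        for name, value in row.items():
            if isinstance(value, str) and REPLACEMENT in value:
                raise ValueError(f"lead {row['id']}: {name} still contains U+FFFD")
    return {
        "action": "updateOnly",  # assumption: update only, never create leads
        "lookupField": "id",     # assumption: match on lead ID, not email
        "input": corrections,
    }

payload = build_sync_payload([
    {"id": 1001, "firstName": "José", "email": "jose@example.com"},
])

# Serialize explicitly as UTF-8 so the fix does not reintroduce the problem.
body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
```

Because the match is on the ID, the Email Address can be changed in the same call without being treated as a new person.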


But there's a way to use the UI-based import, too. Create a new custom field like Proxy Email Address. Add that as a separate column in your spreadsheet, and leave the existing Email Address column untouched so the import still matches the existing records. The fixed-up values go in the new column. Then, have a Smart Campaign that listens for changes to Proxy Email Address, then changes Email Address to the token value {{lead.Proxy Email Address}}.
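If you build the spreadsheet with a script, the layout might look like this minimal Python sketch. The helper `build_import_csv` and the sample addresses are made up for illustration, and "Proxy Email Address" is whatever you named the custom field.

```python
import csv
import io

def build_import_csv(rows):
    """Return CSV text from (current_email, corrected_email) pairs.

    The current address goes in unchanged because it is the dedupe key;
    the corrected address goes in the proxy column.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Email Address", "Proxy Email Address"])
    for current_email, corrected_email in rows:
        writer.writerow([current_email, corrected_email])
    return buffer.getvalue()

csv_text = build_import_csv([
    ("ren\ufffd@example.com", "rené@example.com"),
])
print(csv_text)
```

When saving the result to a file, write it with an explicit encoding (for example `open(path, "w", encoding="utf-8", newline="")`) and pick the same encoding at import time, or you may end up right back at case 1.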