Re: Open Data for Marketers

Allen_Pogorzel1
Level 2

Open Data for Marketers

I've seen a couple of different threads about identifying bogus names and email addresses, using job titles to classify contacts into job roles and job functions, and identifying personal email addresses. The data to do these things is out there, but it can be hard to come by.

As a side project, a couple of us at Openprise and some friends in the community have decided to start an open data project for marketers to share our data sets, contribute to them, and build new ones. So far, we've curated a series of data sets like Free Email ISP Domains, Suspect Contact Names (a.k.a. "The Mickey Mouse List"), and Job Sub-functions in Marketing, HR, and Finance.
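
For anyone wondering how lists like these get used in practice, here is a minimal sketch. It assumes the Free Email ISP Domains and Suspect Contact Names lists have been loaded into plain Python sets. The sample entries and the function names (`email_domain`, `classify_contact`) are hypothetical and only there for illustration; they are not taken from the published files.

```python
# Hypothetical sample entries; real lists would be loaded from the downloaded files.
FREE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}
SUSPECT_NAMES = {"mickey mouse", "test test", "asdf asdf"}


def email_domain(email: str) -> str:
    """Return the lowercased domain of an email address, or '' if it is malformed."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return ""
    return domain.lower()


def classify_contact(first_name: str, last_name: str, email: str) -> dict:
    """Flag a contact against the two lookup sets."""
    # Normalize case and collapse repeated whitespace before the name lookup.
    full_name = " ".join(f"{first_name} {last_name}".lower().split())
    domain = email_domain(email)
    return {
        "domain": domain,
        "personal_email": domain in FREE_EMAIL_DOMAINS,
        "suspect_name": full_name in SUSPECT_NAMES,
    }
```

Calling `classify_contact("Mickey", "Mouse", "mm@Gmail.com")` flags both the name and the address. Exact-match lookups keep the logic easy to mirror in a spreadsheet VLOOKUP, at the cost of missing misspellings and subdomains.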

One of the more interesting potential projects that was discussed in the last MUG was building a list of domains where we’re all seeing spam/security solutions scanning emails and registering clicks as they follow the links in the emails through “multilevel intent analysis”. If you’re interested in taking part in that project, just let us know.

You can download any of the open source lists we've been working on at www.marketingopendata.org. If you're interested in joining the community to help with managing data sets or creating new ones, you can find out more on the site. The more the merrier!

Hope everybody finds these data sets helpful.

Best,

Allen

SanfordWhiteman
Level 10 - Community Moderator

Re: Open Data for Marketers

Very interesting project, Allen!  A few notes come to mind:

  • the Internet Domain Suffix list doesn't need to be crowdsourced, because the crowdsourced copy isn't correct or up-to-date. Assuming you mean public domain suffixes, there is already an authoritative list, the PSL. At present it has 8229 actual entries while your doc shows 10,329. The PSL is always correct and up-to-date, so a community-contributed list will never be better. I'd suggest linking to the PSL instead.
  • shouldn't use the term "Open Source" for data for which you don't hold copyright. For example the stock ticker list -- which may not legally be distributed anyway, depending on its original source -- can't be declared to be open source by you. (Having worked in finance IT for many years, I know that just because something's on the Bloomberg terminal doesn't mean we can republish it on the web.)
  • in the other direction, government-supplied lists like NAICS codes are Public Domain, not Open Source. This is a really big difference because PD means there is no copyright at all, and no restrictions can be enforced even after the data is downloaded. OS data, by contrast, may be downloadable from your site yet still be barred from certain kinds of republishing, carry a termination clause, etc.
Allen_Pogorzel1
Level 2

Re: Open Data for Marketers

Thanks for your notes, Sanford! That's a good point about the PSL! Our goal was to make it really easy for the community to access data and start working with it. We started in exactly the place you recommended: the PSL. The challenge with the PSL is that it takes a lot of work to clean it up so that you can put it into a spreadsheet (for VLOOKUP) or a database table where you can easily work with it and reference it. I don't think everybody should have to go through the same work we did, so we published a cleaned-up, Excel-ready version.
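
To give a sense of what that cleanup involves, here is a rough sketch (not the actual process used for the published file). It assumes the raw PSL text format: `//` comment lines, blank lines, one rule per line, `*.` for wildcard rules, `!` for exception rules, and `===BEGIN ICANN DOMAINS===` / `===BEGIN PRIVATE DOMAINS===` section markers. The function names and the column names are hypothetical, and the sketch assumes wildcards only appear as the leftmost label.

```python
import csv
import io


def parse_psl(text: str) -> list[dict]:
    """Turn raw Public Suffix List text into one row per rule."""
    rows = []
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        # Section markers live inside comment lines, so check them first.
        if line.startswith("// ===BEGIN ICANN DOMAINS==="):
            section = "ICANN"
        elif line.startswith("// ===BEGIN PRIVATE DOMAINS==="):
            section = "PRIVATE"
        if not line or line.startswith("//"):
            continue
        # A rule is only the text up to the first whitespace on the line.
        rule = line.split()[0]
        if rule.startswith("!"):
            kind, suffix = "exception", rule[1:]
        elif rule.startswith("*."):
            kind, suffix = "wildcard", rule[2:]
        else:
            kind, suffix = "normal", rule
        rows.append({"suffix": suffix, "rule_type": kind, "section": section})
    return rows


def to_csv(rows: list[dict]) -> str:
    """Serialize the rows as CSV text that a spreadsheet can open."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["suffix", "rule_type", "section"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Keeping `rule_type` as its own column matters for lookups: a wildcard or exception rule does not mean the same thing as a plain suffix, so a flat one-column list loses information.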

SanfordWhiteman
Level 10 - Community Moderator

Re: Open Data for Marketers

Cleaned up it may be, but it's inaccurate, so I wouldn't try to fight that battle. Run a cron job to fix it up nightly, maybe.
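
A nightly refresh could look roughly like the sketch below. It assumes the list is fetched from https://publicsuffix.org/list/public_suffix_list.dat; the function names, the sanity check, and the file paths are hypothetical choices, not anything the project has committed to.

```python
import os
import tempfile
import urllib.request

PSL_URL = "https://publicsuffix.org/list/public_suffix_list.dat"


def download_psl(url: str = PSL_URL) -> str:
    """Fetch the raw list text over HTTPS."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def refresh_psl(dest_path: str, fetch=download_psl) -> int:
    """Fetch the list and atomically replace dest_path; return the rule count."""
    text = fetch()
    rules = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("//")]
    # Cheap sanity check so an error page never overwrites a good copy.
    if "===BEGIN ICANN DOMAINS===" not in text or not rules:
        raise ValueError("download does not look like the PSL; keeping the old copy")
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
        # os.replace swaps the file in one step, so readers never see a partial file.
        os.replace(tmp_path, dest_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return len(rules)
```

Saved as, say, `/opt/psl/refresh_psl.py` with a final line calling `refresh_psl("/opt/psl/public_suffix_list.dat")`, it could be scheduled with a crontab entry such as `15 2 * * * /usr/bin/python3 /opt/psl/refresh_psl.py` (paths are placeholders).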

Allen_Pogorzel1
Level 2

Re: Open Data for Marketers

That's a good point.  We'll take a look at everything people want to do and start to prioritize.  Thanks again!

Grégoire_Miche2
Level 10

Re: Open Data for Marketers

Hi Allen,

That's a great help! Thanks for sharing it.

-Greg