I've seen a couple of different threads about identifying bogus names and email addresses, using job titles to classify contacts into job roles and job functions, and identifying personal email addresses. The data to do these things is out there, but it can be hard to come by.
As a side project, a couple of us at Openprise and some friends in the community have decided to start an open data project for marketers to share our data sets, contribute to them, and build new ones. So far, we've curated a series of data sets like Free Email ISP Domains, Suspect Contact Names (a.k.a. "The Mickey Mouse List"), and Job Sub-functions in Marketing, HR, and Finance.
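To give a rough idea of how a list like Free Email ISP Domains might be used, here is a minimal Python sketch that flags personal email addresses by domain. The three domains in the set and the function name `is_personal_email` are placeholders for illustration only; in practice you would load the full list from the downloaded data set.

```python
# Hypothetical sample only; load the real list from the downloaded data set.
FREE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}


def is_personal_email(address, free_domains=FREE_EMAIL_DOMAINS):
    """Return True if the address's domain appears in the free-email set."""
    # Split on the last "@" so odd local parts do not confuse the lookup.
    _, sep, domain = address.strip().rpartition("@")
    if not sep or not domain:
        return False  # not a usable email address
    return domain.lower() in free_domains


print(is_personal_email("jane@Gmail.com"))
print(is_personal_email("jane@example.com"))
```

The same lookup works as a VLOOKUP or a database join; the only real design choice here is normalizing the domain to lowercase before comparing.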
One of the more interesting potential projects discussed at the last MUG was building a list of domains where we’re all seeing spam/security solutions scan emails and register clicks as they follow the links in the emails through “multilevel intent analysis.” If you’re interested in taking part in that project, just let us know.
You can download any of the open source lists we've been working on at www.marketingopendata.org. If you're interested in joining the community to help manage data sets or create new ones, you can find out more on the site. The more the merrier!
Hope everybody finds these data sets helpful.
Very interesting project, Allen! A few notes come to mind:
Thanks for your notes, Standford! That's a good point about the PSL. Our goal was to make it really easy for the community to access data and start working with it. We started in exactly the place you recommended: the PSL. The challenge with the PSL is that it takes a lot of work to clean it up so that you can put it into a spreadsheet (for VLOOKUP) or a database table where you can easily work with it and reference it. I don't think everybody should have to go through the same work we did, so we published a cleaned-up, Excel-ready version.
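For anyone curious what that cleanup involves, here is a minimal Python sketch of flattening PSL-format text into spreadsheet-ready rows. It is not the process we actually used, just an illustration: it assumes the standard PSL conventions (`//` comment lines, `*.` wildcard rules, `!` exception rules, and the ICANN/PRIVATE section markers), and the function names, column names, and the short `SAMPLE` text are made up for the example.

```python
import csv
import io

# A few lines in PSL format, used here purely as sample input.
SAMPLE = """// ===BEGIN ICANN DOMAINS===
com
co.uk
*.ck
!www.ck
// ===BEGIN PRIVATE DOMAINS===
github.io
"""


def parse_psl(text):
    """Turn PSL-format text into (suffix, rule_type, section) rows."""
    rows = []
    section = "unknown"
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("//"):
            # Comments are skipped, but section markers are remembered.
            if "===BEGIN ICANN DOMAINS===" in line:
                section = "icann"
            elif "===BEGIN PRIVATE DOMAINS===" in line:
                section = "private"
            continue
        rule = line.split()[0]  # a rule ends at the first whitespace
        if rule.startswith("!"):
            rows.append((rule[1:], "exception", section))
        elif rule.startswith("*."):
            rows.append((rule[2:], "wildcard", section))
        else:
            rows.append((rule, "normal", section))
    return rows


def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["suffix", "rule_type", "section"])
    writer.writerows(rows)
    return buf.getvalue()


print(to_csv(parse_psl(SAMPLE)))
```

Keeping the rule type in its own column matters: a flat list of suffixes loses the wildcard and exception semantics, and a plain VLOOKUP on the suffix alone would then give wrong answers for those entries.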
Cleaned up it may be, but it's inaccurate, so I wouldn't try to fight that battle. Run a cron job to fix it up nightly, maybe.
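A nightly refresh could look something like the Python sketch below. It assumes the list is pulled from the canonical publicsuffix.org URL; the function names and the injectable `fetch` parameter are hypothetical, there only so the logic can be exercised without a network call.

```python
import os
import tempfile
import urllib.request

PSL_URL = "https://publicsuffix.org/list/public_suffix_list.dat"


def fetch_psl(url=PSL_URL):
    """Download the raw list as bytes."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def refresh(path, fetch=fetch_psl):
    """Replace the file at `path` if the upstream copy changed.

    Returns True when the file was updated, False when it was already current.
    """
    data = fetch()
    if not data.strip():
        raise ValueError("empty download; keeping the existing copy")
    try:
        with open(path, "rb") as f:
            if f.read() == data:
                return False
    except FileNotFoundError:
        pass  # first run: nothing to compare against
    # Write to a temp file in the same directory, then swap it in atomically
    # so readers never see a half-written list.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp, path)
    return True
```

Saved as a script that calls `refresh("public_suffix_list.dat")`, this could be scheduled with a crontab entry along the lines of `0 3 * * * /usr/bin/python3 /path/to/refresh_psl.py` (the path and the time are placeholders).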