5 Replies Latest reply on Jun 20, 2017 3:41 PM by Allen Pogorzelski

    Open Data for Marketers

    Allen Pogorzelski

      I've seen a couple of different threads about identifying bogus names and email addresses, using job titles to classify contacts into job roles and job functions, and identifying personal email addresses. The data to do these things is out there, but it can be hard to come by.

       

      As a side project, a couple of us at Openprise and some friends in the community have decided to start an open data project for marketers to share our data sets, contribute to them, and build new ones. So far, we've curated a series of data sets like Free Email ISP Domains, Suspect Contact Names (a.k.a. "The Mickey Mouse List"), and Job Sub-functions in Marketing, HR, and Finance.
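
      As a rough illustration of how lists like these might be applied to a contact record, here is a minimal Python sketch. The sample entries and the function names (`is_personal_email`, `is_suspect_name`) are made up for this example and are not taken from the actual data sets; the real lists are much longer.

```python
# Hypothetical sample entries; the published lists are far larger.
FREE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}
SUSPECT_NAMES = {"mickey mouse", "test test", "asdf asdf"}


def is_personal_email(email: str) -> bool:
    """Return True when the address's domain appears in the free-email set."""
    # rpartition keeps everything after the last "@"; with no "@" present,
    # the whole string is treated as the domain and simply will not match.
    _, _, domain = email.strip().lower().rpartition("@")
    return domain in FREE_EMAIL_DOMAINS


def is_suspect_name(first: str, last: str) -> bool:
    """Return True when the normalized full name appears in the suspect set."""
    # Lowercase and collapse repeated whitespace before the lookup.
    full = " ".join(f"{first} {last}".lower().split())
    return full in SUSPECT_NAMES
```

      A contact with the address "Jane.Doe@Gmail.com" would be flagged as personal, and a record named "Mickey Mouse" would be flagged as suspect. Exact-match lookups like this will miss misspellings and variants, so treat the result as a hint rather than a verdict.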

       

      One of the more interesting potential projects that was discussed in the last MUG was building a list of domains where we’re all seeing spam/security solutions scanning emails and registering clicks as they follow the links in the emails through “multilevel intent analysis”. If you’re interested in taking part in that project, just let us know.

       

      You can download any of the open source lists we've been working on at www.marketingopendata.org. If you're interested in joining the community to help manage data sets or create new ones, you can find out more on the site. The more the merrier!

       

      Hope everybody finds these data sets helpful.

       

      Best,

       

      Allen

        • Re: Open Data for Marketers
          Sanford Whiteman

          Very interesting project, Allen!  A few notes come to mind:

           

          • the Internet Domain Suffix list doesn't need to be crowdsourced, because the "crowd" version here isn't correct or up-to-date. Assuming you mean public domain suffixes, there is already an authoritative list, the PSL (Public Suffix List). At present it has 8229 actual entries while your doc shows 10,329. Because the PSL is the authoritative source, a community-contributed copy will never be better; I'd suggest linking to the PSL instead.
          • shouldn't use the term "Open Source" for data for which you don't hold copyright. For example the stock ticker list -- which may not legally be distributed anyway, depending on its original source -- can't be declared to be open source by you. (Having worked in finance IT for many years, we know that just because something's on the Bloomberg terminal doesn't mean we can republish it on the web.)  
          • in the other direction, government-supplied lists like NAICS codes are Public Domain, not Open Source. This is a really big difference because PD means there is no copyright at all, and no restrictions can be enforced even after it is downloaded. OS data feeds may be allowed to be downloaded from your site but prohibited from certain publishing, have a termination clause, etc.
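
          To make the suffix point concrete, here is a minimal Python sketch of how a suffix list is typically used: finding the registrable domain for a hostname by matching its longest listed suffix. The three-entry `SAMPLE_SUFFIXES` set and the function name `registrable_domain` are hypothetical stand-ins. The sketch also deliberately ignores the PSL's wildcard and exception rules, so it is a simplification and not the full PSL algorithm.

```python
# Hypothetical stand-in for a suffix list; the real PSL has thousands of rules.
SAMPLE_SUFFIXES = {"com", "uk", "co.uk"}


def registrable_domain(hostname, suffixes=SAMPLE_SUFFIXES):
    """Return the listed suffix plus one label, or None if nothing applies.

    Simplified: wildcard ("*.") and exception ("!") rules are not handled,
    and hostnames with no listed suffix return None.
    """
    labels = hostname.strip().lower().rstrip(".").split(".")
    # Walk from the full hostname toward the last label, so the first hit
    # is the longest matching suffix.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the hostname is itself a listed suffix
            return ".".join(labels[i - 1:])
    return None
```

          With these sample rules, "www.example.co.uk" resolves to "example.co.uk" rather than "co.uk", which is why an out-of-date or padded suffix list produces wrong company domains.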
          • Re: Open Data for Marketers
            Grégoire Michel

            Hi Allen,

             

            that's a great help! Thanks for sharing it.

             

             

            -Greg