SOLVED

Re: How to stop pdf files hosted in Marketo being indexed?

Julz_James
Level 10

Hi

I've searched the community for this, but none of the answers help me stop PDFs hosted in Marketo from being indexed.

I know it's possible to create a robots.txt file for the subdomain and host it on the website, but are there any other options within Marketo to stop PDFs being indexed?

All of the PDFs are accessed via forms, so there is no link where the nofollow attribute would normally sit next to the href.

Any help would be great!

Thanks

Juli

1 ACCEPTED SOLUTION
SanfordWhiteman
Level 10 - Community Moderator
14 REPLIES
SanfordWhiteman
Level 10 - Community Moderator

Last I checked you can manage the robots.txt yourself.

Julz_James
Level 10

Hi Sanford, can you tell me where I can do this? I've looked everywhere.

Thanks

Juli

SanfordWhiteman
Level 10 - Community Moderator
Josh_Hill13
Level 10 - Champion Alumni

Wow, of course.

How should the robots.txt be formatted to Disallow just pdf file types? Would this article help?

indexing - How to prevent a PDF file from being indexed by search engines? - Webmasters Stack Exchan...

Apparently, if you set the PDF headers correctly, you can make the PDF itself disallow crawling.

SanfordWhiteman
Level 10 - Community Moderator

How should the robots.txt be formatted to Disallow just pdf file types? Would this article help?

I'd go with

User-agent: *
Disallow: /rs/

Apparently, if you set the PDF headers correctly, you can make the PDF itself disallow crawling.

Yep, that would be cool, but we don't have access at that level.
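For anyone who wants to sanity-check the rule before uploading it, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The hostname `pages.example.com` and the file path under `/rs/` are made-up placeholders, not real Marketo URLs; substitute your own landing page domain and asset path.

```python
from urllib.robotparser import RobotFileParser

# The rule suggested above: block every crawler from anything under /rs/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /rs/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    pdf_url = "https://pages.example.com/rs/123-ABC-456/images/whitepaper.pdf"
    page_url = "https://pages.example.com/landing-page.html"
    print(is_allowed(ROBOTS_TXT, pdf_url))   # False: path is under /rs/
    print(is_allowed(ROBOTS_TXT, page_url))  # True: landing pages stay crawlable
```

Note that robots.txt only asks crawlers not to fetch those paths. It does not password-protect the files, and anyone with the direct link can still open them.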

Dan_Stevens_
Level 10 - Champion Alumni

Learned something new once again - thanks Sandy!

Julz_James
Level 10

Amazing, thank you!

SanfordWhiteman
Level 10 - Community Moderator

Upload a robots.txt and then redirect the root /robots.txt to it.

David_Gallaghe2
Level 5

We had the same issue. We simply opened a support ticket with Marketo and it was resolved quite easily: just a minor tweak to the robots.txt on their side. They updated it the same day.

Kim_Wieczner
Level 3

Brilliant, thank you! Just submitted, and hopefully we get the same response.

Kim Burditt
Kim_Wieczner
Level 3

Hi all - we also tend to gate our content via a form on a landing page and then have the user directed to the actual PDF hosted on Marketo's server as a form follow-up. We recently discovered that the PDF itself indexes in a Google search and can be navigated to directly. Is there any way to prevent this? Any/all suggestions welcome. Thanks!

Kim Burditt
Josh_Hill13
Level 10 - Champion Alumni

The suggestion I provided is the only way - stop using Marketo as a file host.

Josh_Hill13
Level 10 - Champion Alumni

Is that really an issue? Sometimes I do come across gated PDFs in search, usually when their CDN isn't properly set up. Unless someone happens to know the PDF title and hits it just right, they won't see the PDF in search that often. If it's a big deal, put it on a different server; otherwise, I doubt it's worth the effort.

I'm not aware of a way to create a master robots.txt file for Marketo like this.

I also encourage everyone to post PDFs on a separate server where you can do this type of control more easily and the CDN or server load will be more appropriate.

There may be an Idea on updating this feature.
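As a rough illustration of the kind of control a separate server gives you, here is a minimal sketch that adds an `X-Robots-Tag: noindex, nofollow` response header to every PDF it serves. It uses only Python's built-in `http.server`; the `robots_headers` and `serve` names, the host, and the port are illustrative assumptions, and a production setup would more likely set the same header in the web server or CDN configuration.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def robots_headers(path: str) -> dict[str, str]:
    """Return extra response headers for a requested path (hypothetical helper)."""
    # Ignore any query string and compare the extension case-insensitively.
    clean = path.split("?", 1)[0].lower()
    if clean.endswith(".pdf"):
        return {"X-Robots-Tag": "noindex, nofollow"}
    return {}


class NoIndexPdfHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, marking PDFs as noindex."""

    def end_headers(self) -> None:
        # Add the extra headers just before the header block is closed.
        for name, value in robots_headers(self.path).items():
            self.send_header(name, value)
        super().end_headers()


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Start the demo server; host and port are placeholder values."""
    ThreadingHTTPServer((host, port), NoIndexPdfHandler).serve_forever()
```

Calling `serve()` from the directory that holds the PDFs starts the demo server. A crawler has to be able to fetch a file to see this header, so it generally should not be combined with a robots.txt rule that blocks the same path.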

Julz_James
Level 10

Hi Josh,

I didn't think there was a way with Marketo-hosted files. I've searched everywhere for help.

Thanks for confirming for me.

Juli