Hi
I've searched the community for this, but none of the answers seem to help me stop PDFs hosted in Marketo from being indexed.
I know it's possible to create a robots.txt file for the subdomain and host it on the website, but are there any other options within Marketo to stop PDFs being indexed?
All of the PDFs are accessed via forms, so there is no href code for the nofollow text to sit next to as it normally would.
Any help would be great!
Thanks
Juli
Last I checked you can manage the robots.txt yourself.
Hi Sanford, can you tell me where I can do this? I've looked everywhere.
Thanks
Juli
Wow, of course.
How should the robots.txt be formatted to Disallow just pdf file types? Would this article help?
Apparently, if you set the PDF headers correctly, you can make the PDF itself disallow crawling.
How should the robots.txt be formatted to Disallow just pdf file types? Would this article help?
I'd go with
User-agent: *
Disallow: /rs/
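If you want to sanity-check that rule before uploading it, here is a minimal sketch using Python's standard urllib.robotparser. The hostname and file paths are made up for illustration (swap in your own landing page domain and a real asset URL), and is_crawlable is just a helper name I picked, not anything from Marketo.

```python
from urllib.robotparser import RobotFileParser

# The rule suggested above: block everything under /rs/ for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /rs/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if robots_txt lets the given user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URLs, for illustration only.
pdf_url = "https://pages.example.com/rs/123-ABC-456/images/whitepaper.pdf"
page_url = "https://pages.example.com/landing-page.html"

print(is_crawlable(pdf_url))   # hosted file under /rs/
print(is_crawlable(page_url))  # normal landing page
```

With this rule the file under /rs/ should come back as blocked while the landing page stays crawlable. Note that the standard-library parser only does simple prefix matching, so it is not a good way to test wildcard patterns such as ones targeting just .pdf.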
Apparently, if you set the PDF headers correctly, you can make the PDF itself disallow crawling.
Yep, that would be cool, but we don't have access at that level.
Learned something new once again - thanks Sandy!
Amazing, thank you!
Upload a robots.txt and then redirect the root /robots.txt to it.
We had the same issue. We simply opened a support ticket with Marketo and it was resolved quite easily: just a minor tweak to the robots.txt on their side. They updated it the same day.
Brilliant, thank you! Just submitted, and hopefully we get the same response.
Hi all - we also tend to gate our content via a form on a landing page and then have the user directed to the actual PDF hosted on Marketo's server as a form follow-up. We recently discovered that the PDF itself indexes in a Google search and can be navigated to directly. Is there any way to prevent this? Any/all suggestions welcome. Thanks!
The suggestion I provided is the only way: stop using Marketo as a file host.
Is that really an issue? Sometimes I do come across gated PDFs in search, usually when their CDN isn't set up properly. Unless someone happens to know the PDF title and searches for it just right, they won't see the PDF in search that often. If it's a big deal, then put it on a different server; otherwise, I doubt it's worth the effort.
I'm not aware of a way to create a master robots.txt file for Marketo like this.
I also encourage everyone to post PDFs on a separate server where you can do this type of control more easily and the CDN or server load will be more appropriate.
There may be an Idea on updating this feature.
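For anyone who does move their PDFs to a server they control, here is a rough sketch of the "PDF headers" idea mentioned earlier in the thread. I'm assuming that refers to the X-Robots-Tag HTTP response header, which the major search engines document as a way to keep non-HTML files out of their index. The example uses Python's built-in http.server purely to show the idea; robots_headers_for, NoIndexPdfHandler and serve are names I made up, and on a real web server or CDN you would set the same header in its own config instead.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def robots_headers_for(path):
    """Return the extra response headers to send for a request path."""
    # Ignore any query string when checking the file extension.
    clean_path = path.split("?", 1)[0].lower()
    if clean_path.endswith(".pdf"):
        # Ask crawlers not to index the file or follow links inside it.
        return {"X-Robots-Tag": "noindex, nofollow"}
    return {}


class NoIndexPdfHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory, tagging PDFs as noindex."""

    def end_headers(self):
        # Add our headers just before the header section is closed.
        for name, value in robots_headers_for(self.path).items():
            self.send_header(name, value)
        super().end_headers()


def serve(port=8000):
    """Start the demo server (blocks until interrupted)."""
    ThreadingHTTPServer(("", port), NoIndexPdfHandler).serve_forever()
```

Calling serve() from a folder that contains your PDFs and then requesting one with curl -I should show the X-Robots-Tag line in the response. One caveat: a crawler has to be able to fetch the file to see this header, so it works instead of a robots.txt Disallow for those files rather than alongside it.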
Hi Josh,
I didn't think there was a way with Marketo hosted files. I've searched everywhere for help.
Thanks for confirming for me.
Juli