Hello! I am playing around with a VPS and I want to host a very simple one-page website. As this is a just-for-fun project, I don't want it listed in Google or any other search engine.
I did some research and there seem to be 3 ways to handle this:
robots.txt
A separate file placed in the root folder of a website. It can block legitimate crawlers from accessing the site entirely, but the site may still end up in the index if other websites link to it.
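To make the robots.txt behaviour concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper and the example.com URL are assumptions for illustration only. It shows what a well-behaved crawler would conclude about crawling; it says nothing about whether a URL ends up indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents asking every crawler to stay away.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler honouring robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not the real site.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/index.html"))
```

With these rules a compliant crawler never downloads the page, which is also why it never gets to read anything written inside the page.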
robots meta tag
A meta tag placed in the HEAD of the HTML document. It can tell legitimate crawlers not to index the page, but it only applies to the HTML document itself, so additional images or videos in the server directory can still be indexed (?). If you have a robots.txt file blocking access entirely, this meta tag doesn't do anything, because crawlers never fetch the page and so never see the tag.
X-Robots-Tag HTTP header
Configured at the server level and sent as part of the HTTP response. It can block all indexing by legitimate crawlers, including the HTML files and any pictures, videos etc. hosted on the server.
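As a rough illustration of the header approach, the sketch below uses Python's built-in `http.server` to attach an `X-Robots-Tag` header to every response, whatever the file type. The handler name, `make_server`, the port and the header value are assumptions chosen for the example. On a real VPS the same header would more likely be set in the web server's own configuration.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Assumed directive list; adjust to whatever directives you actually want.
ROBOTS_HEADER = "noindex, nofollow"


class NoIndexHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory and tags every response."""

    def end_headers(self):
        # Added just before the headers are flushed, so it applies to
        # HTML pages, images, videos and error responses alike.
        self.send_header("X-Robots-Tag", ROBOTS_HEADER)
        super().end_headers()


def make_server(host: str = "127.0.0.1", port: int = 8000) -> HTTPServer:
    """Build (but do not start) a server using the handler above."""
    return HTTPServer((host, port), NoIndexHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

Running `curl -I http://127.0.0.1:8000/` against it should show the `X-Robots-Tag` line among the response headers.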
In my case I have just an index.html file in the root directory and no robots.txt file, and I have the robots meta tag below in the HEAD of the document. Is this enough to keep my site from being indexed?
<meta name="robots" content="noindex, noimageindex, nofollow, noarchive, nocache, nosnippet">