ChiliProject is not maintained anymore. Please be advised that there will be no more updates.
We do not recommend that you set up new ChiliProject instances and we urge all existing users to migrate their data to a maintained system, e.g. Redmine. We will provide a migration script later. In the meantime, you can use the instructions by Christian Daehn.
[PATCH] hiding form pages from search engines (Feature #169)
Description
Form pages like /issues/new are not worth indexing by search engines. Moreover, they can confuse visitors arriving from a search engine: if you search for a question about ChiliProject and /issues/new appears in the results, what can you do with it?
This happens when these form pages are open to anonymous users. It actually happened at redmine.ruby-lang.org once, so I wrote the attached patch. This patch adds a meta element like the following to some pages:
<meta name="ROBOTS" content="NOINDEX,FOLLOW,NOARCHIVE" />
Associated revisions
[#169] Add a ROBOTS meta tag to several forms to hide from web spiders
Based on the patch by Yuki Sonoda
History
Updated by Felix Schäfer at 2011-02-10 09:52 am
I guess having something like */new in the robots.txt wouldn't work, would it?
Updated by Yuki Sonoda at 2011-02-10 01:07 pm
According to http://www.robotstxt.org/robotstxt.html, robots.txt does not support globbing, so we cannot expect */new to work.
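To illustrate the point about globbing, here is a minimal sketch of literal prefix matching, which is how the page linked above describes Disallow values. The method name disallowed? and the sample paths are made up for illustration, and individual crawlers may implement wildcard extensions that behave differently.

```ruby
# Hypothetical helper: treats each Disallow value as a literal path prefix,
# with no special meaning for "*".
def disallowed?(path, disallow_rules)
  disallow_rules.any? { |rule| !rule.empty? && path.start_with?(rule) }
end

glob_rule   = ["*/new"]        # read literally, so it matches no real path
prefix_rule = ["/issues/new"]  # matches only paths starting with /issues/new

puts disallowed?("/issues/new", glob_rule)                 # false
puts disallowed?("/issues/new", prefix_rule)               # true
puts disallowed?("/projects/demo/issues/new", prefix_rule) # false
```

Under this reading, every project-specific form URL would need its own Disallow line, which is why a per-page meta tag is easier to maintain.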
Updated by Eric Davis at 2011-02-10 11:36 pm
I think this is a good idea. I'd like to improve on it a little bit though by making the robot_exclusion_tag take options for the content section (e.g. robot_exclusion_tag("NOINDEX,FOLLOW,NOARCHIVE") or robot_exclusion_tag("NOINDEX,NOFOLLOW")). Then we (or plugins) could have more control over the indexing options for each page.
Thoughts?
Updated by Felix Schäfer at 2011-02-11 07:30 am
Eric Davis wrote:
Thoughts?
What about making NOINDEX,FOLLOW,NOARCHIVE the default, so that calling the method with any collection of (NO)SOMETHING overrides the default for that keyword?
Updated by Eric Davis at 2011-02-11 07:01 pm
This was my idea. It lets us have more control of what the actual content is in case the meta tag allows other values later.
def robot_exclusion_tag(content="NOINDEX,FOLLOW,NOARCHIVE")
  "<meta name='ROBOTS' content='#{content}' />"
end
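A small runnable sketch of the simple string-based helper, showing the default and an override. The HTML escaping via CGI.escapeHTML is an assumption added here for illustration and is not part of the proposal in this thread.

```ruby
require "cgi"

# Sketch of the string-based helper; escaping is an illustrative addition.
def robot_exclusion_tag(content = "NOINDEX,FOLLOW,NOARCHIVE")
  %(<meta name="ROBOTS" content="#{CGI.escapeHTML(content)}" />)
end

puts robot_exclusion_tag
puts robot_exclusion_tag("NOINDEX,NOFOLLOW")
```

With a plain string, the caller decides the full content value, so no list of valid keywords has to be maintained.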
Updated by Felix Schäfer at 2011-02-11 09:31 pm
Eric Davis wrote:
This was my idea. It lets us have more control of what the actual content is in case the meta tag allows other values later.
No, I meant having NOINDEX,FOLLOW,NOARCHIVE be the default, and calling it with INDEX to get INDEX,FOLLOW,NOARCHIVE. I just realized that's overengineering it though, I like your proposal :-)
Updated by Eric Davis at 2011-02-11 11:54 pm
Yea I thought about doing keywords too but then we would have to maintain a list of valid ones. Hence the idea of just using a simple string.
I'll add and modify this patch. I think it's minor enough for 1.1.0.
- Target version set to 1.1.0 — Bell
- Assignee set to Eric Davis
Updated by Eric Davis at 2011-02-14 02:20 am
I've modified Yuki Sonoda's patch and the code is ready for review.
- Status changed from Open to Ready for review
Updated by Felix Schäfer at 2011-02-14 07:05 am
Looks good to me. I'll merge it once I have a more stable connection, if you haven't done so by then.
Updated by Holger Just at 2011-02-14 09:33 am
I still like Felix' idea of having defaults and being able to gradually overwrite them. This could be done like this:
# Add an HTML meta tag to control robots (web spiders)
#
# @param [optional, String] changed content of the ROBOTS tag.
#   defaults to no index, follow, and no archive
def robot_exclusion_tag(content="")
  default_content = { "INDEX"   => "NO",
                      "FOLLOW"  => "",
                      "ARCHIVE" => "NO" }

  # Map each keyword to its prefix: "NO" when negated, "" otherwise
  args = content.upcase.split(",").inject({}) do |memo, arg|
    value = arg.strip.sub(/^(NO)/, "")
    memo[value] = $1 || ""
    memo
  end
  # Overrides replace the default prefix for their keyword
  content = default_content.merge(args).collect{ |k, v| v+k }.join(",")
  "<meta name='ROBOTS' content='#{content}' />"
end
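As a rough illustration of the defaults-with-overrides idea, the sketch below computes only the content string and prints a few results. The names DEFAULT_ROBOTS and robots_content are hypothetical, and the output order relies on hashes preserving insertion order (Ruby 1.9 and later).

```ruby
# Hypothetical defaults: keyword => prefix ("NO" when negated, "" otherwise).
DEFAULT_ROBOTS = { "INDEX" => "NO", "FOLLOW" => "", "ARCHIVE" => "NO" }.freeze

# Merge comma-separated overrides such as "INDEX" or "nofollow" into the defaults.
def robots_content(overrides = "")
  parsed = overrides.upcase.split(",").each_with_object({}) do |arg, memo|
    arg = arg.strip
    next if arg.empty?
    prefix = arg.start_with?("NO") ? "NO" : ""
    memo[arg.sub(/\ANO/, "")] = prefix
  end
  DEFAULT_ROBOTS.merge(parsed).map { |keyword, prefix| prefix + keyword }.join(",")
end

puts robots_content                      # NOINDEX,FOLLOW,NOARCHIVE
puts robots_content("INDEX")             # INDEX,FOLLOW,NOARCHIVE
puts robots_content("nofollow, archive") # NOINDEX,NOFOLLOW,ARCHIVE
puts robots_content("NOSNIPPET")         # NOINDEX,FOLLOW,NOARCHIVE,NOSNIPPET
```

Keywords that are not in the defaults are simply appended, so no list of valid values is enforced in this sketch.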
Updated by Eric Davis at 2011-02-14 11:07 pm
Holger Just wrote:
I still like Felix' idea of having defaults and being able to gradually overwrite them.
Just seems like a lot of code to me that might not be used that often.
Updated by Felix Schäfer at 2011-02-15 06:28 am
Eric Davis wrote:
Just seems like a lot of code to me that might not be used that often.
That's what I meant with "don't overengineer it" ;-) I think the simple version is fine; saving the effort of writing it all out the few times you need other params is not worth it.
Updated by Eric Davis at 2011-02-17 01:12 am
Merged into master for 1.1.0. Thank you for the patch, Yuki Sonoda.
- Status changed from Ready for review to Closed