Commit 30fd71a

Add a robots.txt
1 parent fbbcda4 commit 30fd71a

1 file changed: +49 -0 lines changed

robots.txt

+49
@@ -0,0 +1,49 @@
+---
+layout: null
+permalink: robots.txt
+badagents:
+- "Sogou web spider"
+- "Sogou inst spider"
+- "Sogou spider2"
+- "Sogou blog"
+- "Sogou News Spider"
+- "Sogou Orion spider"
+- "Sosospider"
+---
+{% for agent in page.badagents %}User-agent: {{ agent }}
+Disallow: /
+
+{% endfor %}
+# These URIs will cause unnecessary traffic with ANY bot.
+User-agent: *
+User-agent: archive.org_bot
+User-agent: ia_archiver
+Disallow: /releases/release0.1/doc
+Disallow: /releases/release0.5.0~beta1/doc
+Disallow: /releases/release0.5.0~beta2/doc
+Disallow: /releases/release0.6.0~beta1/doc
+Disallow: /releases/release0.6.0~beta2/doc
+Disallow: /releases/release0.6.5~20140718/doc
+Disallow: /releases/release0.6.5~20140721/doc
+Disallow: /releases/release0.6.5~20141030/doc
+Disallow: /releases/*/doc/index.html?de/lmu/
+
+# The following spiders were found to misbehave and are no longer welcome:
+User-agent: YoudaoBot
+User-agent: HaoSouSpider
+User-agent: 360Spider
+User-agent: MegaIndex
+Disallow: /
+Crawl-Delay: 360000
+
+# The following spiders are just unnecessary traffic
+User-agent: BLEXBot
+User-agent: dotbot
+User-agent: AhrefsBot
+User-agent: SMTBot
+User-agent: SemrushBot
+User-agent: SemrushBot-SA
+User-agent: WeSEE_Bot
+User-agent: ltx71
+Disallow: /
+
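
The Liquid loop at the top of the file emits one `User-agent` / `Disallow: /` group per entry in the `badagents` front-matter list. As a rough illustration of what the rendered output should look like, here is a minimal Python sketch that mimics the loop. It is not how Jekyll renders the file; `BAD_AGENTS` and `render_bad_agent_groups` are hypothetical names introduced only for this example.

```python
# Hypothetical stand-in for the `badagents` list in the front matter.
BAD_AGENTS = [
    "Sogou web spider",
    "Sogou inst spider",
    "Sogou spider2",
    "Sogou blog",
    "Sogou News Spider",
    "Sogou Orion spider",
    "Sosospider",
]


def render_bad_agent_groups(agents):
    """Mimic the Liquid loop: one blocking group per agent, each followed by a blank line."""
    return "".join(f"User-agent: {agent}\nDisallow: /\n\n" for agent in agents)


rendered = render_bad_agent_groups(BAD_AGENTS)
print(rendered, end="")
```

Keeping the agent names in front matter means a new crawler can be blocked by adding one list entry, without touching the template body.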

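To sanity-check how a crawler might interpret the grouped rules, the sketch below feeds a shortened excerpt of the file to Python's standard `urllib.robotparser`. The excerpt and the `ExampleBot` agent name are assumptions made for illustration, and real crawlers may match rules differently. In particular, `urllib.robotparser` does prefix matching only, so the `/releases/*/doc/index.html?de/lmu/` wildcard rule is left out here.

```python
from urllib.robotparser import RobotFileParser

# Shortened, illustrative excerpt of the rendered robots.txt (not the full file).
ROBOTS_EXCERPT = """\
User-agent: *
User-agent: archive.org_bot
User-agent: ia_archiver
Disallow: /releases/release0.1/doc

User-agent: YoudaoBot
User-agent: MegaIndex
Disallow: /
Crawl-Delay: 360000

User-agent: AhrefsBot
User-agent: SemrushBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

# A bot named in a "Disallow: /" group is blocked everywhere.
blocked_everywhere = parser.can_fetch("AhrefsBot", "/index.html")

# "ExampleBot" is a hypothetical crawler that falls through to the "*" group.
generic_doc = parser.can_fetch("ExampleBot", "/releases/release0.1/doc/index.html")
generic_home = parser.can_fetch("ExampleBot", "/index.html")

# Crawl-Delay applies to the whole group it appears in.
youdao_delay = parser.crawl_delay("YoudaoBot")

print(blocked_everywhere, generic_doc, generic_home, youdao_delay)
```

Several `User-agent` lines stacked above one rule set form a single group, which is why the file can list many bots and state `Disallow: /` only once.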