author Nick White <hg@njw.me.uk> 2011-08-07 13:36:19 +0100
committer Nick White <hg@njw.me.uk> 2011-08-07 13:36:19 +0100
commit 101687cd7a85cb83dea95386ee6cdd6259c726c1
tree fd0dfb87049dcb27f4a8b631996130c286c34eac
parent 62563596f477238d480fe4a701544413b6c722f5
Improve legal info
-rw-r--r-- LEGAL 18
1 files changed, 10 insertions, 8 deletions
diff --git a/LEGAL b/LEGAL
index ec1a2c8..d305f90 100644
--- a/LEGAL
+++ b/LEGAL
@@ -2,18 +2,18 @@
## TOS
-Google's terms of service forbid using anything but a browser
-to access their sites. This is absurd and ruinous.
+Google's terms of service are ambiguous. On the one hand they
+forbid using anything but a browser to access their sites.
+This is absurd and ruinous. On the other hand, however, they
+state that one should abide by the rules of robots.txt, which
+are only relevant for non-browser access. A reasonable
+interpretation would be that non-browsers are allowed to
+access Google's services as long as they abide by robots.txt
See section 5.3 of http://www.google.com/accounts/TOS.
-Thankfully, however, for Google Books one is only bound to it
-"for digital content you purchase through the Google Books
-service," which does not affect this program.
-See http://www.google.com/googlebooks/tos.html
-
## robots.txt
-Their robots.txt allows certain book pages, but disallows
+Their robots.txt allows certain book URLs, but disallows
others.
We use two types of URL:
@@ -25,3 +25,5 @@ Google consider Allow statements to overrule disallow statements
if they are longer. And they happen to allow /books?*q=subject:*.
So, we append that to both url types (it has no effect on them),
and we are obeying robots.txt
+Details on how Google interprets robots.txt are at
+http://code.google.com/web/controlcrawlindex/docs/robots_txt.html
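
The "longer Allow overrules Disallow" rule described in LEGAL can be sketched as below. This is only an illustration, not code from this program: the function names are made up, the `Disallow: /books` rule and the example URLs are hypothetical stand-ins for whatever Google's robots.txt actually contains, and the tie-break in favour of Allow is an assumption about how Google resolves equal-length rules. Only the `/books?*q=subject:*` Allow pattern comes from the text above.

```python
import re


def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url_path, rules):
    """Return True if url_path may be fetched under the given rules.

    rules is a list of (kind, pattern) pairs, kind being "allow" or
    "disallow". The longest matching pattern wins; on a tie, Allow wins
    (an assumption). A path that matches no rule is allowed.
    """
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(url_path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len = len(pattern)
                best_allow = allow
    return best_allow


# Hypothetical rule set: the Disallow line is invented for illustration;
# the Allow line is the one quoted in LEGAL.
RULES = [
    ("disallow", "/books"),
    ("allow", "/books?*q=subject:*"),
]

# Hypothetical book URL, without and with the appended parameter.
plain = "/books?id=XYZ&pg=PA1"
appended = plain + "&q=subject:a"

print(is_allowed(plain, RULES))
print(is_allowed(appended, RULES))
```

With these made-up rules, the plain URL only matches the short Disallow pattern, while the URL with `q=subject:` appended also matches the longer Allow pattern, which then takes precedence.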