From 101687cd7a85cb83dea95386ee6cdd6259c726c1 Mon Sep 17 00:00:00 2001
From: Nick White
Date: Sun, 7 Aug 2011 13:36:19 +0100
Subject: Improve legal info

---
 LEGAL | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/LEGAL b/LEGAL
index ec1a2c8..d305f90 100644
--- a/LEGAL
+++ b/LEGAL
@@ -2,18 +2,18 @@
 
 ## TOS
 
-Google's terms of service forbid using anything but a browser
-to access their sites. This is absurd and ruinous.
+Google's terms of service are ambiguous. On the one hand they
+forbid using anything but a browser to access their sites.
+This is absurd and ruinous. On the other hand, however, they
+state that one should abide by the rules of robots.txt, which
+are only relevant for non-browser access. A reasonable
+interpretation would be that non-browsers are allowed to
+access Google's services as long as they abide by robots.txt
 See section 5.3 of http://www.google.com/accounts/TOS.
 
-Thankfully, however, for Google Books one is only bound to it
-"for digital content you purchase through the Google Books
-service," which does not affect this program.
-See http://www.google.com/googlebooks/tos.html
-
 ## robots.txt
 
-Their robots.txt allows certain book pages, but disallows
+Their robots.txt allows certain book URLs, but disallows
 others.
 
 We use two types of URL:
@@ -25,3 +25,5 @@ Google consider Allow statements to overrule disallow
 statements if they are longer. And they happen to allow
 /books?*q=subject:*. So, we append that to both url types
 (it has no effect on them), and we are obeying robots.txt
+Details on how Google interprets robots.txt are at
+http://code.google.com/web/controlcrawlindex/docs/robots_txt.html
-- 
cgit v1.2.3
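
The robots.txt reasoning in the patched LEGAL text (a longer Allow pattern overrules a shorter Disallow pattern) can be illustrated with a small sketch. This is not code from the program and not Google's actual matcher: the function names, the `/books?*pg=` Disallow rule, and the example paths are assumptions made up for illustration. Only the `/books?*q=subject:*` Allow pattern comes from the text.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Return True if `path` may be fetched under `rules`.

    `rules` is a list of (kind, pattern) pairs, kind being 'allow' or
    'disallow'. The longest matching pattern wins; on a tie in length,
    'allow' wins. A path matched by no rule is allowed.
    """
    best_kind, best_len = "allow", -1
    for kind, pattern in rules:
        if not pattern:
            continue  # empty patterns are ignored in this sketch
        if pattern_to_regex(pattern).match(path):
            longer = len(pattern) > best_len
            tie_allow = len(pattern) == best_len and kind == "allow"
            if longer or tie_allow:
                best_kind, best_len = kind, len(pattern)
    return best_kind == "allow"


# Hypothetical rule set: the Disallow line is an assumption, the Allow
# line is the pattern quoted in the LEGAL text.
RULES = [
    ("disallow", "/books?*pg="),
    ("allow", "/books?*q=subject:*"),
]

# Hypothetical page URL path, without and with the appended parameter.
plain = "/books?id=XYZ&pg=PA1"
appended = plain + "&q=subject:a"

print(is_allowed(plain, RULES))
print(is_allowed(appended, RULES))
```

Under these assumed rules, the plain path matches only the Disallow pattern, while the path with `&q=subject:a` appended also matches the longer Allow pattern, which then takes precedence.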