# Overview While legal "agreements" (which consider themselves signed by using a service they mention) tend to be hostile, using the standard for determining what an automated tool is allowed to do on a server, robots.txt, all of the getxbook tools are permitted. # Getgbook ## Terms of Service Google's terms of service are ambiguous. On the one hand they forbid using anything but a browser to access their sites. This is absurd and ruinous. On the other hand, however, they state that one should abide by the rules of robots.txt, which are only relevant for non-browser access. A reasonable interpretation would be that non-browsers are allowed to access Google's services as long as they abide by robots.txt See section 5.3 of http://www.google.com/accounts/TOS. ## robots.txt Their robots.txt allows certain book URLs, but disallows others. We use three types of URL: http://books.google.com/books?id=&printsec=frontcover http://books.google.com/books?id=&pg=&jscmd=click3 http://books.google.com/books?id=&pg=&img=1&zoom=3&hl=en& robots.txt disallows /books?*jscmd=* and /books?*pg=*. However, Google consider Allow statements to overrule disallow statements if they are longer. And they happen to allow /books?*q=subject:*. So, we append that to the urls (it has no effect on them), and we are obeying robots.txt Details on how Google interprets robots.txt are at http://code.google.com/web/controlcrawlindex/docs/robots_txt.html # Getabook ## Conditions of Use With Amazon, massive overreach rules the day. In the "license and site access" section of Amazon.com's Conditions of Use, they state that downloading any part of their website except for page caching is forbidden, and that using "robots, or similar data gathering and extraction tools" is also not allowed. http://www.amazon.com/gp/help/customer/display.html/ref=x?nodeId=508088 Thankfully, however, the rules set out in Amazon's robots.txt tells a different story. Given that these explicitly lay down the rules for automated downloading tools, it seems reasonable too take them as representative of accepted policy. ## robots.txt Amazon's main robots.txt allows all of the request types we make. ## Curious One other curious sentiment in the Conditions of Use is the clause "we each waive any right to a jury trial." Amazon's is truly a Brave New World.