author | Nick White <git@njw.me.uk> | 2011-08-21 17:22:35 +0100 |
---|---|---|
committer | Nick White <git@njw.me.uk> | 2011-08-21 17:22:35 +0100 |
commit | 043da4609ae6f9e229f0f03d602f57908f66879a (patch) | |
tree | 287d1d9a5f25872e855597dcf4ce0d217dce0314 /TODO | |
parent | 0fedff7492d97609cdfc5a02a883bdfd693f4dbb (diff) |
Fix reporting of no pages available
Diffstat (limited to 'TODO')
-rw-r--r-- | TODO | 9 |
1 files changed, 7 insertions, 2 deletions
@@ -31,9 +31,14 @@ have websummary.sh print the date of release, e.g.
 
 mkdir of bookid and save pages in there
 
+add cmdline arguments for stdin parsing
+
+merge pageinfo branch
+
+### notes
+
 Google will give you up to 5 cookies which get useful pages in immediate succession. It will stop serving new pages to the ip, even with a fresh cookie. So the cookie is certainly not everything.
 
 If one does something too naughty, all requests from the ip to books.google.com are blocked with a 403 'automated requests' error for 24 hours. What causes this ip block is less clear. It certainly isn't after just trying lots of pages with 5 cookies. It seems to be after requesting 100 new cookies in a certain time period - 100 in 5 minutes seemed to do it, as did 100 in ~15 minutes.
 
-NOTE!!: the method of getting all pages from book page does miss some; they aren't all listed
-* these pages can often be requested, though
+The method of getting all pages from book webpage does miss some; they aren't all listed. These pages can often be requested, though.
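
The notes in this hunk suggest that the ip block follows roughly 100 new cookie requests within a short period. The following is not part of the commit; it is only a minimal sketch of how a client might keep its own count of new cookie requests inside a sliding time window. The class name `CookieBudget`, its methods, and the default limit of 50 requests per 15 minutes are all assumptions made for illustration; the notes give only the observed "100 in 5 minutes" and "100 in ~15 minutes" figures, and the real trigger is described as unclear.

```python
from collections import deque


class CookieBudget:
    """Sliding-window cap on new cookie requests (hypothetical helper)."""

    def __init__(self, max_requests=50, window_seconds=15 * 60):
        # Both defaults are assumptions, chosen to stay well below the
        # "100 in ~15 minutes" figure observed in the notes.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._stamps = deque()  # request times in seconds, oldest first

    def _expire(self, now):
        # Drop timestamps that have fallen out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()

    def try_acquire(self, now):
        """Record a cookie request at time `now` if the budget allows it."""
        self._expire(now)
        if len(self._stamps) >= self.max_requests:
            return False
        self._stamps.append(now)
        return True

    def seconds_until_free(self, now):
        """Return how long to wait before another request would be allowed."""
        self._expire(now)
        if len(self._stamps) < self.max_requests:
            return 0.0
        return self._stamps[0] + self.window_seconds - now
```

A caller would pass a monotonic clock reading (for example `time.monotonic()`) as `now`, call `try_acquire` before fetching a fresh cookie, and sleep for `seconds_until_free` when it returns `False`. Taking the time as an argument keeps the class easy to test without real waiting.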