author | Nick White <git@njw.me.uk> | 2011-09-08 17:12:30 +0100 |
---|---|---|
committer | Nick White <git@njw.me.uk> | 2011-09-08 17:12:30 +0100 |
commit | 45a96ef428c9808abcc6d4043f149eb6ac825ece | |
tree | 1818f6bf3fbff21cea2fb2228a8a0f04cb55eac3 /plans/abook | |
parent | 6caa822d0ad5afb0d019e1b2afbbc53847d7179d | |
Add notes for how to get pages from amazon and barnes and noble
Diffstat (limited to 'plans/abook')
-rw-r--r-- | plans/abook | 33 |
1 files changed, 33 insertions, 0 deletions
diff --git a/plans/abook b/plans/abook new file mode 100644 index 0000000..0418bbd --- /dev/null +++ b/plans/abook @@ -0,0 +1,33 @@ +final img urls look like: +http://sitb-images.amazon.com/Qffs+v35lepeP2icY2OteGGgTPZO7sxgfhv6+rCKfpWLrJyvNAksvFu4WzV79TodydVXgzoaP3o= +http://sitb-images.amazon.com/Qffs+v35lepeP2icY2OteGGgTPZO7sxgsXoL4TS0WgsVOflj1z8cVkwoGTF8uqsrBObiKx03xck= + +ugly, but need no cookie, and can be re-downloaded + +feature is called variously 'search inside this book' ('sitb') or 'look inside this book' ('litb') + + +http://www.amazon.com/gp/reader/0140442278/ is reader link, but appears to just redirect back to book link, with js pre-opened + +sitb js library (not very obfuscated): +http://z-ecx.images-amazon.com/images/G/01/digital/sitb/reader/v4/201010271203/sitb-library-js._V176048133_.js + + +loadBookData looks promising - does ajax method:"getBookData",asin:asin - line 4903. this uses the SITB_READER_AJAX_URL. http://www.amazon.com/gp/search-inside/service-data?method=getBookData&asin=0140442278 + +some page urls are contained in that. under the title 'litbPages', e.g. 'look inside the book'. other ajax calls are definitely the place to look; follow usage of AJAX_URL and jquery.ajax + +* note: https works :) + +the main book data only contains the initial pages linked to from sidebar; others are available from the interface by using next/prev buttons, or by scrolling. investigate the ajax calls further + +getSBData gives more metadata, not relevant here + +goToPage gives lots of good stuff (line 2972); page requested plus nearby ones. official arg requests use a 'token', but seem to be able to get away without one + +http://www.amazon.com/gp/search-inside/service-data?method=goToPage&asin=0140442278&page=27 + +not all pages are available; if not, we get this (still a 200, mind): + +{"error":{"text":{"key":"PAGE_NOT_AVAILABLE_TEXT"},"title":{"key":"PAGE_NOT_AVAILABL +E_TITLE"},"reftag":"rdr_bar_view"}} |