# other todos
0.9 - bug in get() & post(): the \r\n\r\n after the http headers can be cut off between recv buffers. solution is to read the whole response first, then strstr(\r\n\r\n) to find the end of the headers, and memcpy the rest out (so that the original memory can be freed)
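a minimal sketch of the fix (not the actual get()/post() code): once the whole response has been accumulated, split headers from body in a single pass, so the \r\n\r\n can never be lost on a recv() boundary. splitbody is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* returns a newly allocated copy of the body, or NULL if the
 * header terminator was not found; the caller frees the result
 * and can then free the original response buffer */
char *splitbody(const char *response)
{
	const char *end = strstr(response, "\r\n\r\n");
	char *body;
	size_t len;

	if (end == NULL)
		return NULL;
	end += 4;                   /* skip past the terminator */
	len = strlen(end);
	if ((body = malloc(len + 1)) == NULL)
		return NULL;
	memcpy(body, end, len + 1); /* copy body out, incl. '\0' */
	return body;
}
```

this only works because the caller buffers the entire response before splitting; calling it on a partial buffer that ends mid-terminator just returns NULL, which is exactly the case the current per-buffer scan gets wrong.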
in getabook, the web client tries downloading the first few pages sequentially, regardless of whether they're in the available page list. this actually works (some or all of these pages will be returned), so we should implement something similar too. exactly how it knows when to stop looking is not clear; at least with the one i tried, it just requested all of the first 25 pages.
in getgbook, check that a downloaded page doesn't match the 'page not available' image; if so, delete it, as it may be redownloadable later (perhaps even with different cookies)
in getbnbook, check that a downloaded page doesn't match the 'page not available' swf; if so, delete it, as it may be redownloadable later (perhaps even with different cookies)
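a minimal sketch for both checks, assuming we ship (or capture once) a local copy of the 'page not available' placeholder: compare the downloaded file byte-for-byte against it, and let the caller delete matches so the page can be retried later. filesmatch is a hypothetical helper name.

```c
#include <stdio.h>

/* returns 1 if the two files have identical contents,
 * 0 if they differ or either cannot be opened */
int filesmatch(const char *path1, const char *path2)
{
	FILE *f1, *f2;
	int c1, c2, match;

	if ((f1 = fopen(path1, "rb")) == NULL)
		return 0;
	if ((f2 = fopen(path2, "rb")) == NULL) {
		fclose(f1);
		return 0;
	}
	do {
		c1 = getc(f1); /* advance both files in lockstep */
		c2 = getc(f2);
	} while (c1 == c2 && c1 != EOF);
	match = (c1 == c2); /* both hit EOF together => identical */
	fclose(f1);
	fclose(f2);
	return match;
}
```

a byte compare is enough if the placeholder is always served identically; if the servers vary it (e.g. per-session watermarking), a size check plus a hash of a stable region would be needed instead.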
in getgbook, grab the link data (presumably as json somewhere), and add this to pdf
1.0 format and package man pages in win and osx packages
1.0 package for osx - https://github.com/kennethreitz/osx-gcc-installer
write some little tests
1.0 submit 'pad' file to websites http://padsites.asp-software.org/
add a function to getabook to download html text (just an html request to get the kindle version)
add scribd functionality - example: http://www.scribd.com/doc/20448287/Etidorhpa-John-Uri-Lloyd produces urls like http://htmlimg3.scribdassets.com/1qva8jpekgdk0wl/images/1-bfa8361a96.jpg
add a note to README or getgbook.1 recommending checking archive.org, as many public domain books scanned by google are archived there.
mention that google book images aren't watermarked - check others and report on that too.