Note: it looks like google allows around 3 page requests per cookie session, and about 40 per ip per [some time period]. If I knew the time period, and once stdin retry is working, I could make a script that gets all it can, collects the failures, waits, then retries the failures, and so on. Note that this would also have to stop at some point; some pages just aren't available.

make sure I'm checking all lib calls that could fail
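e.g. something like this (a sketch only; savefile, path and buf are made-up names, not from the repo):
	#include <stdio.h>

	/* sketch: write buf to path, checking each call that can fail */
	int savefile(const char *path, const char *buf, size_t len)
	{
		FILE *f;

		if ((f = fopen(path, "wb")) == NULL) {
			fprintf(stderr, "could not open %s\n", path);
			return 1;
		}
		if (fwrite(buf, 1, len, f) != len) {
			fprintf(stderr, "short write to %s\n", path);
			fclose(f);
			return 1;
		}
		if (fclose(f) == EOF) {
			fprintf(stderr, "error closing %s\n", path);
			return 1;
		}
		return 0;
	}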

make sure all arrays are used within bounds
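e.g. when filling a fixed-size array, stop at the limit and truncate strings rather than writing past the end (a sketch; MAXPAGES and the sizes are made up):
	#include <stdio.h>
	#include <string.h>

	#define MAXPAGES 1024   /* hypothetical limit */

	int main(void)
	{
		char codes[MAXPAGES][16], line[64];
		int n = 0;

		while (n < MAXPAGES && fgets(line, sizeof line, stdin) != NULL) {
			line[strcspn(line, "\n")] = '\0';
			/* truncate rather than overflow the destination */
			snprintf(codes[n], sizeof codes[n], "%s", line);
			n++;
		}
		printf("read %d page codes\n", n);
		return 0;
	}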

strace to check paths taken are sensible

use defined constants rather than e.g. 1024
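i.e. something like (illustrative only):
	#include <stdio.h>

	#define BUFSIZE 1024   /* named once, so changing it is a one-line edit */

	int main(void)
	{
		char buf[BUFSIZE];

		while (fgets(buf, BUFSIZE, stdin) != NULL)
			fputs(buf, stdout);
		return 0;
	}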

getgbooktxt (a different program, as it gets text from html pages, which getgbook doesn't use any more)

getabook

getbnbook

openlibrary.org?

# once it is basically working #

try supporting 3xx in get, if it can be done in a few lines
 by getting Location line, freeing buf, and returning a new
 iteration.
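rough shape of it (a sketch only: fetch() and spliturl() are hypothetical helpers, and the signature of get is a guess, not the repo's; real code would also cap the number of redirects):
	#include <stdio.h>
	#include <string.h>
	#include <stdlib.h>

	extern int fetch(char *host, char *path, char **buf);   /* hypothetical low-level http get */
	extern int spliturl(char *url, char *host, char *path); /* hypothetical url splitter */

	int get(char *host, char *path, char **buf)
	{
		char url[1024], h[256], p[1024], *loc;
		int n;

		if ((n = fetch(host, path, buf)) < 12)
			return n;
		if ((*buf)[9] != '3')                    /* status line not 3xx */
			return n;
		if ((loc = strstr(*buf, "Location: ")) == NULL)
			return n;
		if (sscanf(loc, "Location: %1023s", url) != 1)
			return n;
		free(*buf);                              /* drop the redirect response */
		if (spliturl(url, h, p) == -1)
			return -1;
		return get(h, p, buf);                   /* follow the redirect */
	}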

add https support to get
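one possible shape, using openssl (a sketch; dial() is a hypothetical helper returning a connected socket, and real code would want certificate checking and, on old openssl, explicit library init):
	#include <stdio.h>
	#include <string.h>
	#include <openssl/ssl.h>

	extern int dial(const char *host, const char *port);   /* hypothetical: connected tcp socket or -1 */

	int httpsget(const char *host, const char *path)
	{
		char req[1024], buf[4096];
		int fd, n;
		SSL_CTX *ctx;
		SSL *ssl;

		if ((fd = dial(host, "443")) == -1)
			return -1;
		if ((ctx = SSL_CTX_new(TLS_client_method())) == NULL)
			return -1;
		if ((ssl = SSL_new(ctx)) == NULL)
			return -1;
		SSL_set_tlsext_host_name(ssl, host);     /* sni; google's frontends need it */
		SSL_set_fd(ssl, fd);
		if (SSL_connect(ssl) != 1)
			return -1;

		snprintf(req, sizeof req, "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n", path, host);
		SSL_write(ssl, req, strlen(req));

		while ((n = SSL_read(ssl, buf, sizeof buf)) > 0)
			fwrite(buf, 1, n, stdout);

		SSL_shutdown(ssl);
		SSL_free(ssl);
		SSL_CTX_free(ctx);
		return 0;
	}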

to be fast and efficient it's best to crank through all the json first, filling in an array of page structs as we go (roughly as sketched below)
	this requires slightly fuller json support
	could consider making a json reading module, a la confoo, to make ad-hoc memory structures from json
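e.g. roughly (a sketch; the struct fields and parsepage() are guesses at what the fuller json support might produce, not code from the repo):
	#define MAXPAGES 1024              /* hypothetical limit */

	typedef struct {
		char code[16];             /* page code, e.g. "PA3" */
		char url[1024];            /* image url, filled in when known */
		int num;                   /* page number */
	} Page;

	Page pages[MAXPAGES];
	int npages;

	/* hypothetical: reads the next page object from the json, advancing *json */
	extern int parsepage(char **json, Page *p);

	/* single pass over the book json fills the array; downloading happens later */
	int fillpages(char *json)
	{
		npages = 0;
		while (npages < MAXPAGES && parsepage(&json, &pages[npages]) != -1)
			npages++;
		return npages;
	}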

write helper scripts like trymissing

write some little tests

have file extension be determined by file type, rather than assuming png
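e.g. by checking the magic bytes at the start of the downloaded image (sketch; extbymagic is a made-up name):
	#include <string.h>

	/* pick an extension from the first bytes rather than assuming png */
	const char *extbymagic(const unsigned char *buf, size_t len)
	{
		if (len >= 8 && memcmp(buf, "\x89PNG\r\n\x1a\n", 8) == 0)
			return "png";
		if (len >= 3 && memcmp(buf, "\xff\xd8\xff", 3) == 0)
			return "jpg";
		if (len >= 6 && (memcmp(buf, "GIF87a", 6) == 0
		                 || memcmp(buf, "GIF89a", 6) == 0))
			return "gif";
		return "png";              /* fall back to the current assumption */
	}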

think about whether the default functionality should be to download everything, rather than requiring -a