author Nick White <hg@njw.me.uk> 2011-08-07 14:21:47 +0100
committer Nick White <hg@njw.me.uk> 2011-08-07 14:21:47 +0100
commit ff292deb12c9def19ec3b9d624bc29f396eb2726
tree 936aab50557453accb03cd83e102f83b66739b04 /TODO
parent 101687cd7a85cb83dea95386ee6cdd6259c726c1
Update documentation, including add README
Diffstat (limited to 'TODO')
-rw-r--r-- TODO 23
1 files changed, 8 insertions, 15 deletions
diff --git a/TODO b/TODO
index 6aaf198..7023e06 100644
--- a/TODO
+++ b/TODO
@@ -1,13 +1,5 @@
-got a stack trace when a connection seemingly timed out (after around 30 successful calls to -p)
-
-getgmissing doesn't work brilliantly with preview books as it will always get 1st ~40 pages then get ip block. getgfailed will do a better job
-
-list all binaries in readme and what they do
-
# other utils
-getgbooktxt (different program as it gets from html pages, which getgbook doesn't any more)
-
getabook
getbnbook
@@ -24,12 +16,13 @@ write some little tests
## getgbook
-have file extension be determined by file type, rather than assuming png
-
-think about whether default functionality should be dl all, rather than -a
+Note: looks like google allows around 3 page requests per cookie session, and exactly 31 per ip per [some time period > 18 hours]. If I knew the time period, could make a script that gets maybe 20 pages, waits for some time period, then continues.
-to be fast and efficient it's best to crank through all the json 1st, filling in an array of page structs as we go
- this requires slightly fuller json support
- could consider making a json reading module, ala confoo, to make ad-hoc memory structures from json
+got a stack trace when a connection seemingly timed out (after around 30 successful calls to -p). enable core dumping and re-run (note have done a small amount of hardening since, but bug is probably still there).
-Note: looks like google allows around 3 page requests per cookie session, and exactly 31 per ip per [some time period]. If I knew the time period, could make a script that gets all it can, gets a list of failures, waits, then tries failures, etc. Note these would also have to stop at some point; some pages just aren't available
+running it from scripts (getgfailed.sh and getgmissing.sh), refuses to Ctrl-C exit, and creates 2 processes, which may be killed independently. not related to torify
+ multiple processes seems to be no bother
+ ctrl-c seems to be the loop continuing rather than breaking on ctrl-c; e.g. pressing it enough times to end loop works.
+ due to ctrl-c on a program which is using a pipe continues the loop rather than breaking it. using || break works, but breaks real functionality in the scripts
+ see seq 5|while read i; do echo run $i; echo a|sleep 5||break; done vs seq 5|while read i; do echo run $i; echo a|sleep 5; done
+ trapping signals doesn't help; the trap is only reached on last iteration; e.g. when it will exit the script anyway
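
The batch-and-wait idea in the getgbook note (get maybe 20 pages, wait, then continue) could be sketched roughly as below. This is only an illustration, not part of the commit: `fetch_in_batches` and its three parameters are hypothetical names, the fetch command is a placeholder rather than a real getgbook invocation, and the wait has to be supplied by the caller because the actual rate-limit period is unknown (the note only says it is more than 18 hours).

```shell
# Hypothetical sketch: read page codes from stdin and fetch them in batches,
# pausing between batches. Nothing here is taken from the getgbook sources.
fetch_in_batches() {
    batch_size=$1   # pages per batch, e.g. 20 (assumed to stay under the per-ip limit)
    wait_secs=$2    # pause between batches; the real period is unknown
    fetch_cmd=$3    # placeholder command, called once with each page code
    count=0
    while read -r page; do
        # Pause before starting each new batch, but not before the first one.
        if [ "$count" -gt 0 ] && [ $((count % batch_size)) -eq 0 ]; then
            sleep "$wait_secs"
        fi
        # Redirect stdin so the fetch command cannot swallow the page list.
        "$fetch_cmd" "$page" </dev/null || echo "failed: $page" >&2
        count=$((count + 1))
    done
}
```

For a dry run, something like `printf 'PA1\nPA2\nPA3\n' | fetch_in_batches 2 0 echo` just echoes the page codes with no delay. Failures are only reported on stderr, so a list of failed pages would still have to be collected and retried separately, and some pages may never become available.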