git.scottworley.com Git - paperdoorknob/log
paperdoorknob
3 months ago Handle Unicode characters ≈ and ◁ (main)
Scott Worley [Fri, 26 Jan 2024 08:37:41 +0000 (00:37 -0800)]
Handle Unicode characters ≈ and ◁

This is a temporary expediency.  These characters sometimes appear in a
math expression and sometimes outside of a math expression, so when we
start correctly rendering math expressions, we can't just blindly jump
into math mode like this.  :(
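
Illustrative sketch of the expedient described above, not taken from the repository. The table name, the function name, and the choice of \triangleleft for ◁ are assumptions. Each character is wrapped in its own inline math group, which is exactly what would go wrong once the surrounding text is already in math mode.

```python
# Hypothetical replacement table. Wrapping each character in $...$ is only
# correct when the character appears outside of a math expression.
UNICODE_TO_TEX = {
    "≈": r"$\approx$",
    "◁": r"$\triangleleft$",
}


def texify_unicode(text: str) -> str:
    """Replace each known Unicode character with an inline-math TeX command."""
    for char, tex in UNICODE_TO_TEX.items():
        text = text.replace(char, tex)
    return text
```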

3 months ago User-facing progress report counts are 1-based
Scott Worley [Fri, 26 Jan 2024 08:00:23 +0000 (00:00 -0800)]
User-facing progress report counts are 1-based

This makes the final percentages left on the screen 100%
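
Illustrative sketch, assuming items are tracked internally by a zero-based index. The function name and message format are hypothetical. With a zero-based count the last message for 10 items would read 9/10 (90%); adding one before display makes it end on 100%.

```python
def progress_message(index: int, total: int) -> str:
    """Format progress for the item at zero-based position `index`."""
    done = index + 1  # Show a 1-based count to the user.
    return f"{done}/{total} ({100 * done // total}%)"
```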

3 months ago Follow 'Next Thread →' links
Scott Worley [Fri, 26 Jan 2024 07:27:15 +0000 (23:27 -0800)]
Follow 'Next Thread →' links

3 months ago Test emit()
Scott Worley [Fri, 26 Jan 2024 06:46:43 +0000 (22:46 -0800)]
Test emit()

3 months ago Test bare chunks
Scott Worley [Fri, 26 Jan 2024 06:25:28 +0000 (22:25 -0800)]
Test bare chunks

3 months ago Move per-thread processing stuff into Thread
Scott Worley [Fri, 26 Jan 2024 02:20:46 +0000 (18:20 -0800)]
Move per-thread processing stuff into Thread

3 months ago next_thread should be an absolute URL
Scott Worley [Fri, 26 Jan 2024 02:09:18 +0000 (18:09 -0800)]
next_thread should be an absolute URL

3 months ago Use a more URL-looking url in tests
Scott Worley [Fri, 26 Jan 2024 02:06:43 +0000 (18:06 -0800)]
Use a more URL-looking url in tests

4 months ago Fetch the non-flat view to get the next-thread link
Scott Worley [Fri, 12 Jan 2024 12:05:14 +0000 (04:05 -0800)]
Fetch the non-flat view to get the next-thread link

4 months ago Always have Thread.__init__ fetch the HTML
Scott Worley [Fri, 12 Jan 2024 04:57:25 +0000 (20:57 -0800)]
Always have Thread.__init__ fetch the HTML

4 months ago Optionally have Thread.__init__ fetch the HTML
Scott Worley [Fri, 12 Jan 2024 03:30:21 +0000 (19:30 -0800)]
Optionally have Thread.__init__ fetch the HTML

4 months ago Move get_title() to Thread
Scott Worley [Fri, 12 Jan 2024 03:00:56 +0000 (19:00 -0800)]
Move get_title() to Thread

4 months ago Rename html → dom
Scott Worley [Fri, 12 Jan 2024 02:40:11 +0000 (18:40 -0800)]
Rename html → dom

4 months ago Reify Thread
Scott Worley [Fri, 12 Jan 2024 02:17:26 +0000 (18:17 -0800)]
Reify Thread

4 months ago Show name of thread being processed
Scott Worley [Sat, 6 Jan 2024 01:24:46 +0000 (17:24 -0800)]
Show name of thread being processed

4 months ago Learning TeX: Do Layouts with TeX macros
Scott Worley [Mon, 1 Jan 2024 06:11:25 +0000 (22:11 -0800)]
Learning TeX: Do Layouts with TeX macros

This is more elegant and reduces the size of the .tex output by 33%

h/t https://tex.stackexchange.com/a/537222 by Steven B. Segletes
for the \ifnotempty technique

4 months ago Put Texifier in spec
Scott Worley [Mon, 1 Jan 2024 02:54:30 +0000 (18:54 -0800)]
Put Texifier in spec

4 months ago Learning TeX: Render icon images with TeX command
Scott Worley [Sun, 31 Dec 2023 22:37:57 +0000 (14:37 -0800)]
Learning TeX: Render icon images with TeX command

4 months ago Learning TeX: Keep icon image size in TeX
Scott Worley [Sun, 31 Dec 2023 22:27:10 +0000 (14:27 -0800)]
Learning TeX: Keep icon image size in TeX

4 months ago Don't uselessly repeat preamble in test
Scott Worley [Sun, 31 Dec 2023 21:49:42 +0000 (13:49 -0800)]
Don't uselessly repeat preamble in test

4 months ago Use raw strings for less escaping
Scott Worley [Sun, 31 Dec 2023 21:48:21 +0000 (13:48 -0800)]
Use raw strings for less escaping

4 months ago Strip links from meta fields
Scott Worley [Sun, 31 Dec 2023 19:05:59 +0000 (11:05 -0800)]
Strip links from meta fields

This gives us the flexibility to process non-flat URLs, which is useful
for shorter feedback cycles during development.

4 months ago Support _ in URLs
Scott Worley [Sun, 31 Dec 2023 09:16:00 +0000 (01:16 -0800)]
Support _ in URLs

4 months ago renderIcon makes bytes
Scott Worley [Sun, 31 Dec 2023 07:45:12 +0000 (23:45 -0800)]
renderIcon makes bytes

4 months ago Only look within each chunk-dom for chunk fields
Scott Worley [Sat, 30 Dec 2023 12:33:36 +0000 (04:33 -0800)]
Only look within each chunk-dom for chunk fields

4 months ago Escape character names
Scott Worley [Sat, 30 Dec 2023 12:19:00 +0000 (04:19 -0800)]
Escape character names

This is slow. :(

4 months ago args: Prefer dashes to underscores
Scott Worley [Sat, 30 Dec 2023 11:42:57 +0000 (03:42 -0800)]
args: Prefer dashes to underscores

4 months ago Progress messages for fetch and parse
Scott Worley [Sat, 30 Dec 2023 11:39:18 +0000 (03:39 -0800)]
Progress messages for fetch and parse

4 months ago Rename: html → dom
Scott Worley [Sat, 30 Dec 2023 11:37:44 +0000 (03:37 -0800)]
Rename: html → dom

4 months ago Render links as footnotes
Scott Worley [Sat, 30 Dec 2023 11:27:04 +0000 (03:27 -0800)]
Render links as footnotes

4 months ago Change default layout: below → beside
Scott Worley [Sat, 30 Dec 2023 11:19:11 +0000 (03:19 -0800)]
Change default layout: below → beside

:(

Below layout is nicer, but has two problems:

1. The page break logic seems to be broken.  Some pages have content run
   off the bottom.  Other pages get a single line.  This seems to be a
   bug in the wrapstuff package.

2. There's a problem I don't understand where LaTeX complains about the
   \wrapstuffclear following a \footnote, saying:

      ! LaTeX kernel Error: Not in vertical mode.

      Starting a paragraph with \RawIndent or \RawNoindent
      (or \para_raw_indent: or \para_raw_noindent:) is only allowed
      if LaTeX is in vertical mode.

Two problems

4 months ago --quiet controls fetch cache hit rate report too
Scott Worley [Sat, 30 Dec 2023 00:33:39 +0000 (16:33 -0800)]
--quiet controls fetch cache hit rate report too

4 months ago --quiet flag to suppress progress messages
Scott Worley [Sat, 30 Dec 2023 00:19:00 +0000 (16:19 -0800)]
--quiet flag to suppress progress messages

4 months ago Progress indicator
Scott Worley [Sat, 30 Dec 2023 00:06:38 +0000 (16:06 -0800)]
Progress indicator

4 months ago Remove no-longer-needed DOMFilters NoEdit and NoFooter
Scott Worley [Fri, 29 Dec 2023 23:49:02 +0000 (15:49 -0800)]
Remove no-longer-needed DOMFilters NoEdit and NoFooter

These aren't needed anymore because Chunk individually extracts the five
things it needs from the chunk-DOMs, rather than pulling in the whole
chunk-<div>.

4 months ago Apply bare-\emph fix to \st too
Scott Worley [Fri, 29 Dec 2023 19:49:08 +0000 (11:49 -0800)]
Apply bare-\emph fix to \st too

4 months ago Support strikethrough
Scott Worley [Fri, 29 Dec 2023 19:46:36 +0000 (11:46 -0800)]
Support strikethrough

4 months ago texfilter to work around \emph nesting issue
Scott Worley [Fri, 29 Dec 2023 18:24:14 +0000 (10:24 -0800)]
texfilter to work around \emph nesting issue

I don't know enough LaTeX to understand what the problem is, but this
makes it go away.

4 months ago Use LaTeX packages: longtable and booktabs
Scott Worley [Fri, 29 Dec 2023 08:38:34 +0000 (00:38 -0800)]
Use LaTeX packages: longtable and booktabs

pandoc-generated LaTeX assumes these packages are available

4 months ago test: Don't duplicate list of TeX modules in test
Scott Worley [Fri, 29 Dec 2023 08:35:20 +0000 (00:35 -0800)]
test: Don't duplicate list of TeX modules in test

4 months ago Reduce the number of places we keep our version number: 2 → 1
Scott Worley [Fri, 29 Dec 2023 07:06:24 +0000 (23:06 -0800)]
Reduce the number of places we keep our version number: 2 → 1

4 months ago Reduce the number of places we keep our version number: 3 → 2
Scott Worley [Fri, 29 Dec 2023 06:53:40 +0000 (22:53 -0800)]
Reduce the number of places we keep our version number: 3 → 2

4 months ago fetch: Send User-Agent header
Scott Worley [Fri, 29 Dec 2023 06:48:17 +0000 (22:48 -0800)]
fetch: Send User-Agent header

4 months ago image filenames: Drop ? and following query parameters
Scott Worley [Fri, 29 Dec 2023 06:46:59 +0000 (22:46 -0800)]
image filenames: Drop ? and following query parameters
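
Illustrative sketch, not taken from the repository. The function name and the example URL are hypothetical, and using urlsplit (which also drops any #fragment) is an assumption about how one might do this.

```python
from posixpath import basename
from urllib.parse import urlsplit


def image_filename(url: str) -> str:
    """Return the last path component of `url`, without '?' or what follows."""
    return basename(urlsplit(url).path)
```

For example, image_filename("https://example.com/icons/alice.png?v=2") gives "alice.png".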

4 months ago Use view=flat to get whole threads at once
Scott Worley [Thu, 28 Dec 2023 23:19:37 +0000 (15:19 -0800)]
Use view=flat to get whole threads at once

4 months ago FakeFetcher: Show bad URLs in error messages
Scott Worley [Thu, 28 Dec 2023 23:13:11 +0000 (15:13 -0800)]
FakeFetcher: Show bad URLs in error messages

4 months ago `beside` layout
Scott Worley [Sun, 24 Dec 2023 06:02:35 +0000 (22:02 -0800)]
`beside` layout

4 months ago No indent on first paragraph in each chunk
Scott Worley [Thu, 21 Dec 2023 11:21:42 +0000 (03:21 -0800)]
No indent on first paragraph in each chunk

4 months ago Border around icon + author data that extends between chunks
Scott Worley [Thu, 21 Dec 2023 11:18:41 +0000 (03:18 -0800)]
Border around icon + author data that extends between chunks

4 months ago Less space between icon and post-separation line
Scott Worley [Thu, 21 Dec 2023 08:42:07 +0000 (00:42 -0800)]
Less space between icon and post-separation line

4 months ago wrapfig → wrapstuff, for \wrapstuffclear
Scott Worley [Thu, 21 Dec 2023 08:24:04 +0000 (00:24 -0800)]
wrapfig → wrapstuff, for \wrapstuffclear

4 months ago Force line breaks between icon, character, screen name, and author
Scott Worley [Thu, 21 Dec 2023 07:46:36 +0000 (23:46 -0800)]
Force line breaks between icon, character, screen name, and author

h/t https://tex.stackexchange.com/questions/35110/how-to-stack-boxes-like-a-vertical-version-of-mbox#comment75823_37801

4 months ago wrapfigure 0pt width means auto-scale
Scott Worley [Thu, 21 Dec 2023 07:43:28 +0000 (23:43 -0800)]
wrapfigure 0pt width means auto-scale

4 months ago Lines between chunks
Scott Worley [Thu, 21 Dec 2023 07:41:33 +0000 (23:41 -0800)]
Lines between chunks

4 months ago Fix flaky test: Increment request count before responding
Scott Worley [Thu, 21 Dec 2023 07:39:49 +0000 (23:39 -0800)]
Fix flaky test: Increment request count before responding
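
Illustrative sketch of why the ordering matters, not the repository's test server. CountingHandler and fetch_once are hypothetical names. If the counter were bumped after the response is written, the client could finish reading and check the count before the handler thread gets to the increment.

```python
import http.server
import threading
import urllib.request


class CountingHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical test handler that counts the GET requests it serves."""

    request_count = 0

    def do_GET(self) -> None:
        # Count first: once the body is written, the client may resume and
        # read request_count before this method gets any further.
        type(self).request_count += 1
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: object) -> None:
        pass  # Keep test output quiet.


def fetch_once() -> int:
    """Serve one request on an ephemeral local port; return the count seen."""
    server = http.server.HTTPServer(("127.0.0.1", 0), CountingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    # An empty ProxyHandler keeps the request local even if proxies are set.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        port = server.server_address[1]
        with opener.open(f"http://127.0.0.1:{port}/") as response:
            response.read()
        return CountingHandler.request_count
    finally:
        server.shutdown()
        server.server_close()
```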

4 months ago Report fetch cache hit rate
Scott Worley [Thu, 21 Dec 2023 05:21:53 +0000 (21:21 -0800)]
Report fetch cache hit rate

4 months ago Paragraph breaks
Scott Worley [Thu, 21 Dec 2023 00:17:54 +0000 (16:17 -0800)]
Paragraph breaks

4 months ago Uniform icon image size
Scott Worley [Thu, 21 Dec 2023 00:14:30 +0000 (16:14 -0800)]
Uniform icon image size

4 months ago LaTeX doesn't like % in filenames
Scott Worley [Wed, 20 Dec 2023 23:57:46 +0000 (15:57 -0800)]
LaTeX doesn't like % in filenames

4 months ago Reify Layout
Scott Worley [Wed, 20 Dec 2023 23:42:32 +0000 (15:42 -0800)]
Reify Layout

4 months ago New dependency: wrapfig TeX package
Scott Worley [Wed, 20 Dec 2023 23:25:53 +0000 (15:25 -0800)]
New dependency: wrapfig TeX package

4 months ago Reify Chunk
Scott Worley [Wed, 20 Dec 2023 21:44:27 +0000 (13:44 -0800)]
Reify Chunk

4 months ago FakeImageStore for tests
Scott Worley [Wed, 20 Dec 2023 21:10:55 +0000 (13:10 -0800)]
FakeImageStore for tests

4 months ago More structure and tests around splitting the page into chunks' DOMs.
Scott Worley [Wed, 20 Dec 2023 19:43:09 +0000 (11:43 -0800)]
More structure and tests around splitting the page into chunks' DOMs.

4 months ago Pass an ImageStore around
Scott Worley [Wed, 20 Dec 2023 07:55:05 +0000 (23:55 -0800)]
Pass an ImageStore around

4 months ago Allow many Spec fields
Scott Worley [Wed, 20 Dec 2023 07:48:57 +0000 (23:48 -0800)]
Allow many Spec fields

The whole point of Spec is to be a simple, inert, immutable collection
of largely independent program elements easily at hand.  It's being used
here like a namespace, not a pile of shared mutable state for which a
field limit of 7 would make more sense.  Let Spec grow.
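
Illustrative sketch of a Spec in the sense described above: a frozen dataclass used as an inert, immutable bundle. The field names here are hypothetical and are not the fields of the real Spec.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Callable


@dataclass(frozen=True)
class Spec:
    """Hypothetical bundle of independent program elements for one run."""

    url: str
    fetch: Callable[[str], bytes]
    quiet: bool = False


def try_to_mutate(spec: Spec) -> bool:
    """Return True if assigning to a field was rejected, as frozen=True promises."""
    try:
        spec.quiet = True  # type: ignore[misc]
    except FrozenInstanceError:
        return True
    return False
```

Because nothing can reassign a field after construction, adding more fields does not add shared mutable state.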

4 months ago ImageStore
Scott Worley [Wed, 20 Dec 2023 07:39:34 +0000 (23:39 -0800)]
ImageStore

4 months ago Specify page geometry
Scott Worley [Wed, 20 Dec 2023 06:38:46 +0000 (22:38 -0800)]
Specify page geometry

4 months ago Project Lawful start URL in --help
Scott Worley [Wed, 20 Dec 2023 06:07:37 +0000 (22:07 -0800)]
Project Lawful start URL in --help

4 months ago Extensible, flag-controlled DOM filters
Scott Worley [Wed, 20 Dec 2023 05:57:13 +0000 (21:57 -0800)]
Extensible, flag-controlled DOM filters

4 months ago Strip all &nbsp;
Scott Worley [Wed, 20 Dec 2023 03:39:16 +0000 (19:39 -0800)]
Strip all &nbsp;

Project Lawful does not use &nbsp; carefully/thoughtfully.  They are all
over the place, at the end of every sentence, randomly around <em>s,
etc.  Just remove them all.

But allow this behavior to be flag-controlled, in case this step is not
desired when processing other works.
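
Illustrative sketch of flag-controlled stripping, not the repository's DOM filter code. The --keep-nbsp flag and both function names are hypothetical, and this version works on a raw HTML string rather than a parsed DOM.

```python
import argparse


def strip_nbsp(html: str) -> str:
    """Remove every non-breaking space, written as an entity or as U+00A0."""
    return html.replace("&nbsp;", "").replace("\u00a0", "")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical opt-out flag for the stripping step."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--keep-nbsp",
        action="store_true",
        help="leave non-breaking spaces in place",
    )
    return parser


def clean(html: str, keep_nbsp: bool) -> str:
    """Apply the stripping step unless the flag turned it off."""
    return html if keep_nbsp else strip_nbsp(html)
```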

4 months ago FakeFetcher for faster tests
Scott Worley [Wed, 20 Dec 2023 02:45:55 +0000 (18:45 -0800)]
FakeFetcher for faster tests

Run some tests against both the webserver and the FakeFetcher to make
sure both work.

4 months ago Move HTML test data out to separate file
Scott Worley [Wed, 20 Dec 2023 01:44:32 +0000 (17:44 -0800)]
Move HTML test data out to separate file

4 months ago Bundle the things needed for a run together into a Spec
Scott Worley [Wed, 20 Dec 2023 01:29:14 +0000 (17:29 -0800)]
Bundle the things needed for a run together into a Spec

4 months ago Contemplate generating LaTeX directly
Scott Worley [Tue, 19 Dec 2023 10:23:48 +0000 (02:23 -0800)]
Contemplate generating LaTeX directly

4 months ago Texifier interface
Scott Worley [Tue, 19 Dec 2023 09:45:20 +0000 (01:45 -0800)]
Texifier interface

4 months ago Rename: fetch → parse
Scott Worley [Tue, 19 Dec 2023 09:32:43 +0000 (01:32 -0800)]
Rename: fetch → parse

4 months ago Use the new encapsulated fetchers
Scott Worley [Tue, 19 Dec 2023 09:28:31 +0000 (01:28 -0800)]
Use the new encapsulated fetchers

4 months ago Cleaner Fetcher interface
Scott Worley [Tue, 19 Dec 2023 09:16:03 +0000 (01:16 -0800)]
Cleaner Fetcher interface

4 months ago Extract fake webserver
Scott Worley [Tue, 19 Dec 2023 08:36:35 +0000 (00:36 -0800)]
Extract fake webserver

5 months ago LaTeX output is a valid LaTeX file
Scott Worley [Fri, 1 Dec 2023 09:01:21 +0000 (01:01 -0800)]
LaTeX output is a valid LaTeX file

5 months ago Verify that LaTeX conversion is doing something
Scott Worley [Thu, 30 Nov 2023 16:29:35 +0000 (08:29 -0800)]
Verify that LaTeX conversion is doing something

5 months ago Invoke pandoc to convert HTML to LaTeX
Scott Worley [Thu, 30 Nov 2023 16:26:35 +0000 (08:26 -0800)]
Invoke pandoc to convert HTML to LaTeX

5 months ago --pandoc flag
Scott Worley [Sat, 25 Nov 2023 09:47:31 +0000 (01:47 -0800)]
--pandoc flag

5 months ago Open output file
Scott Worley [Sat, 25 Nov 2023 09:40:05 +0000 (01:40 -0800)]
Open output file

5 months ago Drop Post as a class
Scott Worley [Sat, 25 Nov 2023 09:17:05 +0000 (01:17 -0800)]
Drop Post as a class

5 months ago Strip edit-boxes and footers
Scott Worley [Fri, 24 Nov 2023 04:22:31 +0000 (20:22 -0800)]
Strip edit-boxes and footers

5 months ago entries() convenience method
Scott Worley [Fri, 24 Nov 2023 04:04:59 +0000 (20:04 -0800)]
entries() convenience method

5 months ago Replies
Scott Worley [Fri, 24 Nov 2023 03:32:52 +0000 (19:32 -0800)]
Replies

5 months ago Begin parsing glowfic html
Scott Worley [Fri, 24 Nov 2023 03:26:53 +0000 (19:26 -0800)]
Begin parsing glowfic html

5 months ago Parse HTML
Scott Worley [Fri, 24 Nov 2023 02:34:52 +0000 (18:34 -0800)]
Parse HTML

5 months ago fetch: Honor Cache-Control headers
Scott Worley [Thu, 23 Nov 2023 22:13:05 +0000 (14:13 -0800)]
fetch: Honor Cache-Control headers

5 months ago fetch: cache: Keep cache in XDG_CACHE_HOME (eg: ~/.cache)
Scott Worley [Thu, 23 Nov 2023 21:57:49 +0000 (13:57 -0800)]
fetch: cache: Keep cache in XDG_CACHE_HOME (eg: ~/.cache)
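
Illustrative sketch of locating a cache directory the XDG way, not the repository's code. The function name and the per-application subdirectory are assumptions. It follows the XDG convention of falling back to ~/.cache when XDG_CACHE_HOME is unset or empty.

```python
import os
from pathlib import Path


def cache_dir(app_name: str = "paperdoorknob") -> Path:
    """Return the per-application cache directory under XDG_CACHE_HOME."""
    # `or` also covers an empty XDG_CACHE_HOME, which counts as unset.
    base = os.environ.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    return Path(base) / app_name
```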

5 months ago fetch: Verify caching across sessions
Scott Worley [Thu, 23 Nov 2023 21:28:50 +0000 (13:28 -0800)]
fetch: Verify caching across sessions

5 months ago fetch: Cache
Scott Worley [Thu, 23 Nov 2023 21:27:38 +0000 (13:27 -0800)]
fetch: Cache

5 months ago fetch: Multiple fetches per session
Scott Worley [Thu, 23 Nov 2023 21:18:27 +0000 (13:18 -0800)]
fetch: Multiple fetches per session

5 months ago fetch: Use a session
Scott Worley [Thu, 23 Nov 2023 21:11:59 +0000 (13:11 -0800)]
fetch: Use a session

5 months ago fetch: test: Count requests
Scott Worley [Thu, 23 Nov 2023 21:09:18 +0000 (13:09 -0800)]
fetch: test: Count requests

5 months ago fetch: test: Explicitly close webserver
Scott Worley [Thu, 23 Nov 2023 20:54:43 +0000 (12:54 -0800)]
fetch: test: Explicitly close webserver

This fixes "unclosed <socket.socket ..." warnings

5 months ago fetch: test: Hold reference to webserver
Scott Worley [Thu, 23 Nov 2023 20:49:40 +0000 (12:49 -0800)]
fetch: test: Hold reference to webserver