git.scottworley.com Git - paperdoorknob/log
paperdoorknob
Branch: main
Scott Worley [Fri, 26 Jan 2024 08:37:41 +0000 (00:37 -0800)]
Handle Unicode characters ≈ and ◁

This is a temporary expediency.  These characters sometimes appear in a
math expression and sometimes outside of a math expression, so when we
start correctly rendering math expressions, we can't just blindly jump
into math mode like this.  :(
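
The caveat above can be illustrated with a minimal Python sketch. The mapping table, the function name `replace_unicode`, and the choice of `\approx` and `\triangleleft` are assumptions for illustration; the log does not show which TeX commands the commit actually emits.

```python
# Hypothetical mapping; the real commit's TeX output is not shown in this log.
UNICODE_TO_TEX = {
    "≈": r"$\approx$",
    "◁": r"$\triangleleft$",
}


def replace_unicode(tex: str) -> str:
    """Blindly wrap each known character in its own inline-math group."""
    for char, replacement in UNICODE_TO_TEX.items():
        tex = tex.replace(char, replacement)
    return tex


# In text mode the substitution yields valid inline math.
outside_math = replace_unicode("cost ≈ 5")
# Inside an existing math expression the inserted $ closes math mode early.
inside_math = replace_unicode("$a ≈ b$")
```

The second call shows why the approach is only an expedient: once surrounding math is rendered properly, the inserted `$` delimiters would end the enclosing expression instead of opening a new one.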

Scott Worley [Fri, 26 Jan 2024 08:00:23 +0000 (00:00 -0800)]
User-facing progress report counts are 1-based

This makes the final percentages left on the screen 100%
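
A minimal sketch of the idea, assuming a 0-based loop index internally; the function name `progress_message` and the message format are hypothetical, not taken from the project.

```python
def progress_message(index: int, total: int) -> str:
    """Format a progress line for the item at 0-based `index` out of `total`."""
    done = index + 1  # convert the 0-based loop index to a 1-based count
    percent = 100 * done // total
    return f"{done}/{total} ({percent}%)"


items = ["a", "b", "c", "d"]
messages = [progress_message(i, len(items)) for i, _ in enumerate(items)]
```

With a 0-based count the last message would stop at 3/4 (75%); counting from 1 leaves 4/4 (100%) on the screen.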

Scott Worley [Fri, 26 Jan 2024 07:27:15 +0000 (23:27 -0800)]
Follow 'Next Thread →' links

Scott Worley [Fri, 26 Jan 2024 06:46:43 +0000 (22:46 -0800)]
Test emit()

Scott Worley [Fri, 26 Jan 2024 06:25:28 +0000 (22:25 -0800)]
Test bare chunks

Scott Worley [Fri, 26 Jan 2024 02:20:46 +0000 (18:20 -0800)]
Move per-thread processing stuff into Thread

Scott Worley [Fri, 26 Jan 2024 02:09:18 +0000 (18:09 -0800)]
next_thread should be an absolute URL
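
One common way to get an absolute URL from a link found in a page is `urllib.parse.urljoin`. This is a sketch under assumptions: the helper name `absolute_next_thread` and the example URLs are hypothetical, and the log does not say how the project resolves the link.

```python
from urllib.parse import urljoin


def absolute_next_thread(page_url: str, href: str) -> str:
    """Resolve a possibly relative link against the URL of the page it came from."""
    return urljoin(page_url, href)


# A site-relative href is resolved against the page's scheme and host.
next_url = absolute_next_thread("https://fic.example/posts/1", "/posts/2")
# An href that is already absolute is returned unchanged.
same_url = absolute_next_thread("https://fic.example/posts/1", "https://other.example/x")
```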

Scott Worley [Fri, 26 Jan 2024 02:06:43 +0000 (18:06 -0800)]
Use a more URL-looking url in tests

Scott Worley [Fri, 12 Jan 2024 12:05:14 +0000 (04:05 -0800)]
Fetch the non-flat view to get the next-thread link

Scott Worley [Fri, 12 Jan 2024 04:57:25 +0000 (20:57 -0800)]
Always have Thread.__init__ fetch the HTML

Scott Worley [Fri, 12 Jan 2024 03:30:21 +0000 (19:30 -0800)]
Optionally have Thread.__init__ fetch the HTML

Scott Worley [Fri, 12 Jan 2024 03:00:56 +0000 (19:00 -0800)]
Move get_title() to Thread

Scott Worley [Fri, 12 Jan 2024 02:40:11 +0000 (18:40 -0800)]
Rename html → dom

Scott Worley [Fri, 12 Jan 2024 02:17:26 +0000 (18:17 -0800)]
Reify Thread

Scott Worley [Sat, 6 Jan 2024 01:24:46 +0000 (17:24 -0800)]
Show name of thread being processed

Scott Worley [Mon, 1 Jan 2024 06:11:25 +0000 (22:11 -0800)]
Learning TeX: Do Layouts with TeX macros

This is more elegant and reduces the size of the .tex output by 33%

h/t https://tex.stackexchange.com/a/537222 by Steven B. Segletes
for the \ifnotempty technique

Scott Worley [Mon, 1 Jan 2024 02:54:30 +0000 (18:54 -0800)]
Put Texifier in spec

Scott Worley [Sun, 31 Dec 2023 22:37:57 +0000 (14:37 -0800)]
Learning TeX: Render icon images with TeX command

Scott Worley [Sun, 31 Dec 2023 22:27:10 +0000 (14:27 -0800)]
Learning TeX: Keep icon image size in TeX

Scott Worley [Sun, 31 Dec 2023 21:49:42 +0000 (13:49 -0800)]
Don't uselessly repeat preamble in test

Scott Worley [Sun, 31 Dec 2023 21:48:21 +0000 (13:48 -0800)]
Use raw strings for less escaping

Scott Worley [Sun, 31 Dec 2023 19:05:59 +0000 (11:05 -0800)]
Strip links from meta fields

This gives us the flexibility to process non-flat URLs, which is useful
for shorter feedback cycles during development.

Scott Worley [Sun, 31 Dec 2023 09:16:00 +0000 (01:16 -0800)]
Support _ in URLs

Scott Worley [Sun, 31 Dec 2023 07:45:12 +0000 (23:45 -0800)]
renderIcon makes bytes

Scott Worley [Sat, 30 Dec 2023 12:33:36 +0000 (04:33 -0800)]
Only look within each chunk-dom for chunk fields

Scott Worley [Sat, 30 Dec 2023 12:19:00 +0000 (04:19 -0800)]
Escape character names

This is slow. :(

Scott Worley [Sat, 30 Dec 2023 11:42:57 +0000 (03:42 -0800)]
args: Prefer dashes to underscores

Scott Worley [Sat, 30 Dec 2023 11:39:18 +0000 (03:39 -0800)]
Progress messages for fetch and parse

Scott Worley [Sat, 30 Dec 2023 11:37:44 +0000 (03:37 -0800)]
Rename: html → dom

Scott Worley [Sat, 30 Dec 2023 11:27:04 +0000 (03:27 -0800)]
Render links as footnotes

Scott Worley [Sat, 30 Dec 2023 11:19:11 +0000 (03:19 -0800)]
Change default layout: below → beside

:(

Below layout is nicer, but has two problems:

1. The page break logic seems to be broken.  Some pages have content run
   off the bottom.  Other pages get a single line.  This seems to be a
   bug in the wrapstuff package.

2. There's a problem I don't understand where LaTeX complains about the
   \wrapstuffclear following a \footnote, saying:

      ! LaTeX kernel Error: Not in vertical mode.

      Starting a paragraph with \RawIndent or \RawNoindent
      (or \para_raw_indent: or \para_raw_noindent:) is only allowed
      if LaTeX is in vertical mode.

Two problems

Scott Worley [Sat, 30 Dec 2023 00:33:39 +0000 (16:33 -0800)]
--quiet controls fetch cache hit rate report too

Scott Worley [Sat, 30 Dec 2023 00:19:00 +0000 (16:19 -0800)]
--quiet flag to suppress progress messages

Scott Worley [Sat, 30 Dec 2023 00:06:38 +0000 (16:06 -0800)]
Progress indicator

Scott Worley [Fri, 29 Dec 2023 23:49:02 +0000 (15:49 -0800)]
Remove no-longer-needed DOMFilters NoEdit and NoFooter

These aren't needed anymore because Chunk individually extracts the five
things it needs from the chunk-DOMs, rather than pulling in the whole
chunk-<div>.

Scott Worley [Fri, 29 Dec 2023 19:49:08 +0000 (11:49 -0800)]
Apply bare-\emph fix to \st too

Scott Worley [Fri, 29 Dec 2023 19:46:36 +0000 (11:46 -0800)]
Support strikethrough

Scott Worley [Fri, 29 Dec 2023 18:24:14 +0000 (10:24 -0800)]
texfilter to work around \emph nesting issue

I don't know enough LaTeX to understand what the problem is, but this
makes it go away.

Scott Worley [Fri, 29 Dec 2023 08:38:34 +0000 (00:38 -0800)]
Use LaTeX packages: longtable and booktabs

pandoc-generated LaTeX assumes these packages are available

Scott Worley [Fri, 29 Dec 2023 08:35:20 +0000 (00:35 -0800)]
test: Don't duplicate list of TeX modules in test

Scott Worley [Fri, 29 Dec 2023 07:06:24 +0000 (23:06 -0800)]
Reduce the number of places we keep our version number: 2 → 1

Scott Worley [Fri, 29 Dec 2023 06:53:40 +0000 (22:53 -0800)]
Reduce the number of places we keep our version number: 3 → 2

Scott Worley [Fri, 29 Dec 2023 06:48:17 +0000 (22:48 -0800)]
fetch: Send User-Agent header

Scott Worley [Fri, 29 Dec 2023 06:46:59 +0000 (22:46 -0800)]
image filenames: Drop ? and following query parameters
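
A minimal sketch of deriving a filename without the query string, using only the standard library. The helper name `image_filename` and the example URL are hypothetical; the project's actual naming scheme is not shown in this log.

```python
import posixpath
from urllib.parse import urlsplit


def image_filename(url: str) -> str:
    """Return the last path component of `url`, ignoring `?` and everything after it."""
    path = urlsplit(url).path  # the query string is split off into a separate field
    return posixpath.basename(path)


name = image_filename("https://fic.example/icons/alice.png?1700000000")
```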

Scott Worley [Thu, 28 Dec 2023 23:19:37 +0000 (15:19 -0800)]
Use view=flat to get whole threads at once

Scott Worley [Thu, 28 Dec 2023 23:13:11 +0000 (15:13 -0800)]
FakeFetcher: Show bad URLs in error messages

Scott Worley [Sun, 24 Dec 2023 06:02:35 +0000 (22:02 -0800)]
`beside` layout

Scott Worley [Thu, 21 Dec 2023 11:21:42 +0000 (03:21 -0800)]
No indent on first paragraph in each chunk

Scott Worley [Thu, 21 Dec 2023 11:18:41 +0000 (03:18 -0800)]
Border around icon + author data that extends between chunks

Scott Worley [Thu, 21 Dec 2023 08:42:07 +0000 (00:42 -0800)]
Less space between icon and post-separation line

Scott Worley [Thu, 21 Dec 2023 08:24:04 +0000 (00:24 -0800)]
wrapfig → wrapstuff, for \wrapstuffclear

Scott Worley [Thu, 21 Dec 2023 07:46:36 +0000 (23:46 -0800)]
Force line breaks between icon, character, screen name, and author

h/t https://tex.stackexchange.com/questions/35110/how-to-stack-boxes-like-a-vertical-version-of-mbox#comment75823_37801

Scott Worley [Thu, 21 Dec 2023 07:43:28 +0000 (23:43 -0800)]
wrapfigure 0pt width means auto-scale

Scott Worley [Thu, 21 Dec 2023 07:41:33 +0000 (23:41 -0800)]
Lines between chunks

Scott Worley [Thu, 21 Dec 2023 07:39:49 +0000 (23:39 -0800)]
Fix flaky test: Increment request count before responding

Scott Worley [Thu, 21 Dec 2023 05:21:53 +0000 (21:21 -0800)]
Report fetch cache hit rate

Scott Worley [Thu, 21 Dec 2023 00:17:54 +0000 (16:17 -0800)]
Paragraph breaks

Scott Worley [Thu, 21 Dec 2023 00:14:30 +0000 (16:14 -0800)]
Uniform icon image size

Scott Worley [Wed, 20 Dec 2023 23:57:46 +0000 (15:57 -0800)]
LaTeX doesn't like % in filenames

Scott Worley [Wed, 20 Dec 2023 23:42:32 +0000 (15:42 -0800)]
Reify Layout

Scott Worley [Wed, 20 Dec 2023 23:25:53 +0000 (15:25 -0800)]
New dependency: wrapfig TeX package

Scott Worley [Wed, 20 Dec 2023 21:44:27 +0000 (13:44 -0800)]
Reify Chunk

Scott Worley [Wed, 20 Dec 2023 21:10:55 +0000 (13:10 -0800)]
FakeImageStore for tests

Scott Worley [Wed, 20 Dec 2023 19:43:09 +0000 (11:43 -0800)]
More structure and tests around splitting the page into chunks' DOMs.

Scott Worley [Wed, 20 Dec 2023 07:55:05 +0000 (23:55 -0800)]
Pass an ImageStore around

Scott Worley [Wed, 20 Dec 2023 07:48:57 +0000 (23:48 -0800)]
Allow many Spec fields

The whole point of Spec is to be a simple, inert, immutable collection
of largely independent program elements easily at hand.  It's being used
here like a namespace, not a pile of shared mutable state for which a
field limit of 7 would make more sense.  Let Spec grow.
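
The kind of object described above can be sketched as a frozen dataclass. The field names here (`url`, `fetch`, `quiet`) are hypothetical stand-ins, not the project's actual Spec fields.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Spec:
    """Inert, immutable bundle of things a run needs (illustrative fields only)."""

    url: str
    fetch: Callable[[str], bytes]
    quiet: bool


spec = Spec(
    url="https://fic.example/posts/1",
    fetch=lambda url: b"<html></html>",  # stand-in for a real fetcher
    quiet=True,
)

# Fields cannot be reassigned; a changed copy is made with dataclasses.replace.
verbose_spec = replace(spec, quiet=False)
```

Because nothing in such an object can change after construction, adding more fields does not add shared mutable state, which is the argument the commit message makes for letting it grow.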

Scott Worley [Wed, 20 Dec 2023 07:39:34 +0000 (23:39 -0800)]
ImageStore

Scott Worley [Wed, 20 Dec 2023 06:38:46 +0000 (22:38 -0800)]
Specify page geometry

Scott Worley [Wed, 20 Dec 2023 06:07:37 +0000 (22:07 -0800)]
Project Lawful start URL in --help

Scott Worley [Wed, 20 Dec 2023 05:57:13 +0000 (21:57 -0800)]
Extensible, flag-controlled DOM filters

Scott Worley [Wed, 20 Dec 2023 03:39:16 +0000 (19:39 -0800)]
Strip all &nbsp;

Project Lawful does not use &nbsp; carefully/thoughtfully.  They are all
over the place, at the end of every sentence, randomly around <em>s,
etc.  Just remove them all.

But allow this behavior to be flag-controlled, in case this step is not
desired when processing other works.
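
A minimal sketch of a flag-controlled filter, working on plain strings rather than a DOM. The flag name `--strip-nbsp`, the helper names, and the choice to substitute an ordinary space (so adjacent words do not fuse) are all assumptions; the log does not show the project's real flag or whether it deletes the characters outright.

```python
import argparse

NBSP = "\u00a0"  # the character that &nbsp; decodes to


def strip_nbsp(text: str) -> str:
    """Replace every non-breaking space with an ordinary space."""
    return text.replace(NBSP, " ")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Hypothetical flag: on by default, disabled with --no-strip-nbsp.
    parser.add_argument(
        "--strip-nbsp", action=argparse.BooleanOptionalAction, default=True
    )
    return parser


def clean(text: str, argv: list[str]) -> str:
    """Apply the filter only when the flag is enabled."""
    args = build_parser().parse_args(argv)
    return strip_nbsp(text) if args.strip_nbsp else text
```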

Scott Worley [Wed, 20 Dec 2023 02:45:55 +0000 (18:45 -0800)]
FakeFetcher for faster tests

Run some tests against both the webserver and the FakeFetcher to make
sure both work.
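
A fake fetcher of this kind can be sketched as an in-memory lookup that shares an interface with the real one. The `Fetcher` protocol, the `fetch` method name, and the error type are assumptions for illustration; only the name FakeFetcher comes from the log.

```python
from typing import Protocol


class Fetcher(Protocol):
    """Hypothetical interface shared by the real and fake fetchers."""

    def fetch(self, url: str) -> bytes: ...


class FakeFetcher:
    """Serve canned responses from a dict instead of touching the network."""

    def __init__(self, resources: dict[str, bytes]) -> None:
        self._resources = resources
        self.request_count = 0

    def fetch(self, url: str) -> bytes:
        self.request_count += 1
        if url not in self._resources:
            # Naming the bad URL makes test failures easier to diagnose.
            raise ValueError(f"Unknown URL: {url}")
        return self._resources[url]


fake = FakeFetcher({"https://fic.example/posts/1": b"<html>one</html>"})
```

Tests written against the interface can then run once with the fake and once with a fetcher pointed at a local webserver, which is how the commit message describes checking that both work.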

Scott Worley [Wed, 20 Dec 2023 01:44:32 +0000 (17:44 -0800)]
Move HTML test data out to separate file

Scott Worley [Wed, 20 Dec 2023 01:29:14 +0000 (17:29 -0800)]
Bundle the things needed for a run together into a Spec

Scott Worley [Tue, 19 Dec 2023 10:23:48 +0000 (02:23 -0800)]
Contemplate generating LaTeX directly

Scott Worley [Tue, 19 Dec 2023 09:45:20 +0000 (01:45 -0800)]
Texifier interface

Scott Worley [Tue, 19 Dec 2023 09:32:43 +0000 (01:32 -0800)]
Rename: fetch → parse

Scott Worley [Tue, 19 Dec 2023 09:28:31 +0000 (01:28 -0800)]
Use the new encapsulated fetchers

Scott Worley [Tue, 19 Dec 2023 09:16:03 +0000 (01:16 -0800)]
Cleaner Fetcher interface

Scott Worley [Tue, 19 Dec 2023 08:36:35 +0000 (00:36 -0800)]
Extract fake webserver

Scott Worley [Fri, 1 Dec 2023 09:01:21 +0000 (01:01 -0800)]
LaTeX output is a valid LaTeX file

Scott Worley [Thu, 30 Nov 2023 16:29:35 +0000 (08:29 -0800)]
Verify that LaTeX conversion is doing something

Scott Worley [Thu, 30 Nov 2023 16:26:35 +0000 (08:26 -0800)]
Invoke pandoc to convert HTML to LaTeX

Scott Worley [Sat, 25 Nov 2023 09:47:31 +0000 (01:47 -0800)]
--pandoc flag

Scott Worley [Sat, 25 Nov 2023 09:40:05 +0000 (01:40 -0800)]
Open output file

Scott Worley [Sat, 25 Nov 2023 09:17:05 +0000 (01:17 -0800)]
Drop Post as a class

Scott Worley [Fri, 24 Nov 2023 04:22:31 +0000 (20:22 -0800)]
Strip edit-boxes and footers

Scott Worley [Fri, 24 Nov 2023 04:04:59 +0000 (20:04 -0800)]
entries() convenience method

Scott Worley [Fri, 24 Nov 2023 03:32:52 +0000 (19:32 -0800)]
Replies

Scott Worley [Fri, 24 Nov 2023 03:26:53 +0000 (19:26 -0800)]
Begin parsing glowfic html

Scott Worley [Fri, 24 Nov 2023 02:34:52 +0000 (18:34 -0800)]
Parse HTML

Scott Worley [Thu, 23 Nov 2023 22:13:05 +0000 (14:13 -0800)]
fetch: Honor Cache-Control headers

Scott Worley [Thu, 23 Nov 2023 21:57:49 +0000 (13:57 -0800)]
fetch: cache: Keep cache in XDG_CACHE_HOME (eg: ~/.cache)
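
A minimal sketch of locating such a cache directory with the standard library only. The helper name `cache_dir`, the `environ` parameter, and the per-application subdirectory are assumptions; the project may well use a library for this instead.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def cache_dir(
    app_name: str = "paperdoorknob",
    environ: Optional[Mapping[str, str]] = None,
) -> Path:
    """Return $XDG_CACHE_HOME/<app_name>, falling back to ~/.cache/<app_name>."""
    env = os.environ if environ is None else environ
    base = env.get("XDG_CACHE_HOME")
    if not base:
        # An unset or empty variable falls back to ~/.cache.
        base = os.path.join(os.path.expanduser("~"), ".cache")
    return Path(base) / app_name


explicit = cache_dir(environ={"XDG_CACHE_HOME": "/tmp/xdg-cache"})
fallback = cache_dir(environ={})
```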

Scott Worley [Thu, 23 Nov 2023 21:28:50 +0000 (13:28 -0800)]
fetch: Verify caching across sessions

Scott Worley [Thu, 23 Nov 2023 21:27:38 +0000 (13:27 -0800)]
fetch: Cache

Scott Worley [Thu, 23 Nov 2023 21:18:27 +0000 (13:18 -0800)]
fetch: Multiple fetches per session

Scott Worley [Thu, 23 Nov 2023 21:11:59 +0000 (13:11 -0800)]
fetch: Use a session

Scott Worley [Thu, 23 Nov 2023 21:09:18 +0000 (13:09 -0800)]
fetch: test: Count requests

Scott Worley [Thu, 23 Nov 2023 20:54:43 +0000 (12:54 -0800)]
fetch: test: Explicitly close webserver

This fixes "unclosed <socket.socket ..." warnings

Scott Worley [Thu, 23 Nov 2023 20:49:40 +0000 (12:49 -0800)]
fetch: test: Hold reference to webserver