Not a bug in original TeX

This page lists a few reports that could plausibly be considered notable bugs in the original TeX software written and maintained by Donald Knuth, but that Knuth or his vetters have decided not to fix.

Many other reports have been declined that are not listed here (Knuth's tune-up reports mention some: 2021, 2014, 2008). The original reports and answers have been edited or paraphrased for presentation here.

The page and module numbers are merely an initial hint about a relevant location; typically, more than one place in the code and/or documentation is involved. The initial letter (A, B, …) refers to the Computers & Typesetting volume.

A list of accepted bugs for the next tune-up is also available. These are not expected to be reviewed by Knuth until the next tune-up.

For any discussion about these issues, or further reports, please use the contact information on the main TeX bugs page.

Contents: A005: missing \null - A214: \endinput behavior - A415: \ninebig delimiters - B032: use of word “procedure” - B035: newlines inconsistently written to terminal - B133: max_param_stack comment - B214: input file name flushed - B274: line number ranges - B506: bogus dimen display/other overflow - B546: output routine braces - D350: unused variable m declared - dvitype: unnecessary loop condition.


A005, et al.: missing \null

From Udo Wermuth, 2017-01-15: in exercise 2.4, to get the right spacefactor, “OK,” should be “OK\null,”. The same issue occurs dozens of times throughout the books and WEB sources, after all kinds of punctuation. “\TeX.” and “MF.” are particularly prevalent.

Response from DEK: My practice has been to insert \null only when I notice something amiss in proofreading. Similarly with lots of other refinements.
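
The space-factor rule behind this report can be sketched in a few lines of Python. This is only a simplified model, not TeX's code: it assumes plain TeX's \sfcode settings (999 for uppercase letters, 1250 for a comma, 3000 for a period, 1000 for anything not listed), and the function names and the list-of-items input format are invented for illustration.

```python
# Hypothetical model of TeX's space-factor bookkeeping in horizontal mode.
# SFCODE lists a few of plain TeX's \sfcode values; everything else is 1000.
SFCODE = {",": 1250, ";": 1500, ":": 2000, ".": 3000, "?": 3000, "!": 3000}


def sfcode(ch):
    """Return the assumed \\sfcode of a single character."""
    if "A" <= ch <= "Z":
        return 999  # plain TeX gives uppercase letters 999
    return SFCODE.get(ch, 1000)


def space_factor_after(items):
    """Return the space factor after a list of characters and "\\null" items."""
    f = 1000
    for item in items:
        if item == r"\null":
            f = 1000  # any box, even an empty one, resets the space factor
            continue
        g = sfcode(item)
        if g == 0:
            continue  # \sfcode 0 leaves the space factor unchanged
        # A jump from below 1000 to above 1000 stops at 1000.
        f = 1000 if f < 1000 < g else g
    return f


print(space_factor_after(["O", "K", ","]))
print(space_factor_after(["O", "K", r"\null", ","]))
```

Under these assumptions the first call gives 1000 (the comma after an uppercase letter gets no extra space) and the second gives 1250, which is the effect the report asks for.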


A214: \endinput behavior

The TeXbook defines the behavior of \endinput with:

The next time TeX gets to the end of an \input line, it will stop reading from the file containing that line.

and that is exactly how it behaves. N.B. It does not say “stop reading from the file containing the \endinput” (let alone “stop reading immediately”). Thus, when material is placed on an input line after \endinput, there are counter-intuitive effects (report 1) and/or wrong/imprecise error messages (report 2), but this is not a bug that Knuth will consider.

TeX's error message “File ended while ...” is technically inaccurate even in the simple case of any text at all following \endinput, in that file reading did not reach EOF.

Response: In general, Knuth has said that extreme cases of TeX input deserve whatever they get. Furthermore, he knows error messages are not always worded as optimally as they might be. But he has consistently declined to tinker with wording for small incremental improvements at this point.
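
The wording quoted from The TeXbook can be modelled roughly as follows. This Python sketch is an illustration under loud assumptions: tokens are simply whitespace-separated words, files are lists of lines held in a dictionary, and the names run, files and force_eof are chosen for this example (the last echoes the flag of the same name in tex.web, but the code is not a transcription of it).

```python
def run(files, main):
    """Return the ordinary tokens seen, in order, for a toy \\input/\\endinput model."""
    out = []
    force_eof = False  # set by \endinput, acted on at the next end of line
    stack = []         # entries: [file name, index of next line, pending tokens]

    def open_file(name):
        lines = files[name]
        stack.append([name, 1, lines[0].split() if lines else []])

    open_file(main)
    while stack:
        name, next_line, tokens = stack[-1]
        if tokens:
            tok = tokens.pop(0)
            if tok == r"\endinput":
                force_eof = True
            elif tok == r"\input":
                open_file(tokens.pop(0))  # the next token is the file name
            else:
                out.append(tok)
            continue
        # End of the current line of whichever file is on top of the stack.
        if force_eof or next_line >= len(files[name]):
            force_eof = False
            stack.pop()  # stop reading from the file containing this line
        else:
            stack[-1][2] = files[name][next_line].split()
            stack[-1][1] = next_line + 1
    return out


files = {
    "main": [r"A \endinput \input other B", "C"],
    "other": ["X", "Y"],
}
print(run(files, "main"))
```

In this model the result is A, X, B, C: the flag set in main is consumed at the end of the first line of other, so Y is skipped while the rest of main is still read. That is the kind of counter-intuitive effect described above.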


A415: \ninebig delimiters axis height

From Hu Yajie 胡亚捷, 2020-07-29 and 2020-07-29 again and 2020-08-02:

The \ninebig macro in manmac.tex typesets \big delimiters in 9-point math by borrowing the 10-point ones in cmr10 and cmsy10, but it forgets to retain the 9-point axis height. Thus examples like \input manmac \ninepoint $\bigl(()\bigr)$\end are vertically asymmetrical. This asymmetry can be observed in the real books (page A245, line 20; page C298, line -1; etc.), and it can be fixed by changing ‘\hbox{...}’ to ‘\vcenter{\hbox{...}}’ to get vertical symmetry.

Response: Knuth accepts the analysis, but says: “Since I've been happy with that for nearly 40 years, I guess I'm still happy with it.”


B032 (module 72): use of word “procedure”

From Martin Ruckert, 2021-03-14: The text says “The print_err procedure”, but print_err is a macro, not a procedure.

Rino Jose, 2018-12-18, made a similar report for Metafont, page D120 (module 266): “This procedure returns” instead of “This function returns”. [The phrase occurs in several other places in mf.web and tex.web.]

Response: As a matter of English, it is normal to use “procedure” interchangeably with other terms (macro, function, (sub)routine), and since it's not formatted in bold, it shouldn't be taken as implying a Pascal procedure.


B035 (module 35): newlines inconsistently written to terminal

From Igor Liferenko, 2021-07-06: Before exiting, TeX (and other WEB programs) sometimes use write and sometimes use write_ln, e.g., tex.web line 1036 (B035, module 35) vs. line 1085 (B036, module 37).

Response from DEK: This is not important enough to warrant any change. [However, he has noted that it's fine/expected for change files to do as they see fit in this regard.]

Further information from DRF: For the historical record. The Sail/Waits OS, where DEK spent his time back in the day, had strong knowledge of what text was where on your screen, as well as what was buffered up for (custom) keyboard input and (custom) screen output, and it was all tightly bound with the “shell” that was really integrated with the OS. The system handled whole lines of input, and when the user hit <return>, it put the cursor at start of the next line, and knew it; when the program put characters on the screen, the system knew exactly what row and column they were in; and when a program ended, any remaining output buffered up for the screen was flushed, including moving the cursor to the start of a new line if necessary.

Score/TOPS-20, the only port we directly provided, was the same but different (completely different OS and no special terminal hardware, but the shell was tightly integrated with the terminal IO, and the system knew where the cursor was at all times; see the SFPOS and RFPOS system calls).

The only question is whether I'm lying about “if necessary”, and that it was really “always”. Looking for further clues, note that Tangle and Weave always do not end with a newline to the terminal, while PLtoTF, TFtoPL and PoolType always do end with a newline to the terminal. This is evidence that either the OSes we dealt with added a newline at the end only as needed, or that they always added a newline and DEK didn't care that the PL/TF programs left an extra blank line. The latter is believable, as those programs were virtually never used, while Tangle and Weave were in constant use, especially by DEK. (Perhaps oddly, DVItype and GFtype don't report errors or progress to the terminal; all their output goes into the .TYP file, so they don't really provide any evidence.)

That leaves TeX and MF. In normal operation, they also do not end with a newline to the terminal, and they too were in constant use. However, in most exceptional cases they do end with a newline (minus l.1085). Looking back at version 0.97 on saildart.org/[TEX,DEK] it looks like it was all pretty much the same mix.

I'd say that the normal-operation TeX/MF, along with Tangle/Weave, is fairly strong evidence that care was being taken for intentionally ending without a newline; and that it is in fact a mistake in the (very) exceptional cases where it does, but nobody cared or noticed, since those exceptions pretty much never happened. So, I suppose I'd say that [in principle] lines 1036, 1328, 10164, 23810, 24289 should all be changed to not have a newline, so that everything is self-consistent; with the super advantage that it's then always ok for other ports to always add a newline at the end of the job, and that will never cause an extra blank line.

I don't think there's any mysterious reason for the odd-man-out case of 1085 where the code seems already “right”.


B133 (module 308): max_param_stack comment

From Wolfgang Helbig, 2021-07-23:

The comment

     { largest value of param_ptr, will be <= param_size + 9 }

at the declaration of max_param_stack seems misleading to me. I'd suggest instead:

     { largest value of param_ptr }

The param_ptr must not exceed param_size, which is ensured in module 390.

Response: That's true about param_ptr. What's misleading is that the second half of the comment, “will be <= param_size + 9”, applies to max_param_stack, not param_ptr. A semicolon instead of a comma would have made that clearer.

More from DRF about this: DEK is commenting on the fact that he had to make the type of max_param_stack be integer rather than 0..param_size+9, which is what it really ought to be. Pascal doesn't let you use even a constant additive expression in the range definition (and WEB only lets you if it's from a numeric (=) macro so it can collapse the addition, but DEK wanted param_size to be compile-time changeable in the const section, evidently). See all of the other max_* global variables for confirmation; they're all 0..<whatever>_size (and <whatever>_size is a Pascal const).

He could have detected the overflow before doing the addition, and thus been able to declare max_param_stack:0..param_size and get rid of the comment entirely, but then the statistics report at the end of the TeX run would not have shown how far param_size needs to be increased for the job to run.

This is not the only terse comment that needs much thought / experience / analysis to figure out the motivation for.


B214 (module 537): input file name flushed prematurely

From Wolfgang Helbig, 2020-10-26:

TeX sometimes flushes the name of an input file, keeping only the base name without directory and extension. This causes an error if the full name of the file needs to be passed to the editor during error recovery, as suggested by Prof. Knuth. The same bug is in Metafont (module 793).

[...] change block[s] from my tex.ch [...]:

@x [tex 537] continued
if name=str_ptr-1 then {we can conserve string pool space now}
   begin flush_string; name:=cur_name;
   end;
@y
@^Editor@>
@z

Response: The assumption was that on any “reasonable” OS, you could easily ask the system for the full canonical file name of the appropriate open alpha_file when you need it, so there's no need for TeX to remember it. Since *nix is not able to do this in general, dealing with this in the changefile as shown seems reasonable. (There are various system-dependent ways to approximate this, but no reliable and portable method is possible.)

On the other hand, filesystems on popular TeX-able OSes of the day (TOPS-20, VAX/VMS, etc.) had both Logical Name and Version Number features, resulting in the need to ask the OS for the full name of the file that actually got opened.

This call to flush_string also causes a non-standard filename extension to be lost when calling the editor. Knuth recommends that implementors avoid this, either via Wolfgang's change that eliminates flushing the string or by some other method.

These issues all fall under the rubric of “system wizardry” mentioned in the description of the E option (page B036, module 84).


B274 (module 663): line number ranges may come from different files

From Udo Wermuth, 2017-01-25.

In overfull/underfull box messages, the beginning and ending lines shown might come from different files, and thus be misleading. Suppose we have a file main.tex containing this line:

Main 1\par \input auxone \end

and a file auxone.tex with these two lines:

Aux1 1\par
Aux1 2 bug in underfull message?\break

then running tex main gives:

This is TeX...
(./main.tex (./auxone.tex)
Underfull \hbox (badness 10000) in paragraph at lines 2--1
...

A range of lines that begins after it ends does not make sense. Also, a user will connect both numbers to the file main.tex as it is the only active file; there is a ) after auxone.tex, so this file has been processed.

The situation can also occur in alignments and with overfull messages. It is shown in the trip test.

Response: Indeed, and because it is shown in the trip test, we can conclude that DEK was aware that if you started a paragraph in one source file and ended it in another, then line numbers in messages would be problematic. The attitude was that robustness in the face of this edge case (not a recommended best practice) wasn't worth the extra bytes of memory (both code and data). Especially since the rest of the context is usually clear from the logging of the actual text of the paragraph.
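
How the range “2--1” arises can be seen in a small Python model. It is a sketch under stated assumptions, not TeX's algorithm: it supposes that a paragraph records the current line number when it starts and again when it ends, that the line number is kept per open file, and that only \par and \end finish a paragraph. The names paragraph_ranges, files and TOKEN are invented for the example.

```python
import re

# A control word, or a run of characters that is neither space nor backslash.
TOKEN = re.compile(r"\\[A-Za-z]+|[^\s\\]+")


def paragraph_ranges(files, main):
    """Return (start line, end line) for each paragraph in a toy model."""
    ranges = []
    stack = []        # entries: [file name, current line number, pending tokens]
    par_start = None  # line number recorded when the current paragraph began

    def open_file(name):
        stack.append([name, 1, TOKEN.findall(files[name][0])])

    open_file(main)
    while stack:
        name, line, tokens = stack[-1]
        if not tokens:
            if line < len(files[name]):
                stack[-1][1] = line + 1
                stack[-1][2] = TOKEN.findall(files[name][line])
            else:
                stack.pop()  # file finished; the outer file's line number is back
            continue
        tok = tokens.pop(0)
        if tok == r"\input":
            open_file(tokens.pop(0))
        elif tok in (r"\par", r"\end"):
            if par_start is not None:
                ranges.append((par_start, line))  # line of the file now on top
                par_start = None
            if tok == r"\end":
                break
        elif not tok.startswith("\\") and par_start is None:
            par_start = line
    return ranges


files = {
    "main": [r"Main 1\par \input auxone \end"],
    "auxone": [r"Aux1 1\par", r"Aux1 2 bug in underfull message?\break"],
}
print(paragraph_ranges(files, "main"))
```

For the two files from the report, the last pair produced is (2, 1): the paragraph starts on line 2 of auxone.tex and is ended by \end on line 1 of main.tex.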


B506 (module 1238): bogus display of bogus dimen, and other overflow

From Bruno Le Floch, 2020-10-22.

Slightly incorrect display of 32768pt dimen: The following shows --32768.0pt with a double minus sign.

\dimen0=\maxdimen
\advance\dimen0\maxdimen
\advance\dimen0 2sp
\showthe\dimen0

Response: Any bug is the lack of an error message at \advance\dimen0\maxdimen, but the lack of overflow checking is pervasive in TeX, and is a deliberate choice by Knuth. The resulting display is a case of GIGO.
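
The double minus sign can be reproduced with a Python transcription of the way tex.web prints a scaled value. The sketch rests on an assumption that should be stated loudly: integer arithmetic is taken to wrap around in 32-bit two's complement, which Pascal does not promise. The helper names wrap32, pascal_div and pascal_mod are made up here, and Python's str stands in for TeX's print_int.

```python
UNITY = 0x10000  # 2^16 scaled points make one point


def wrap32(n):
    """Wrap n to a signed 32-bit integer (assumed host behaviour)."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def pascal_div(a, b):
    """Integer division truncating toward zero, like Pascal's div."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def pascal_mod(a, b):
    """Remainder matching pascal_div."""
    return a - b * pascal_div(a, b)


def print_scaled(s):
    """Format a scaled value, rounded to at most five decimal digits."""
    out = []
    if s < 0:
        out.append("-")
        s = wrap32(-s)  # negating -2^31 gives -2^31 again
    out.append(str(pascal_div(s, UNITY)))  # integer part; may carry its own sign
    out.append(".")
    s = 10 * pascal_mod(s, UNITY) + 5
    delta = 10
    while True:
        if delta > UNITY:
            s = s + 0o100000 - 50000  # round the last digit
        out.append(chr(ord("0") + pascal_div(s, UNITY)))
        s = 10 * pascal_mod(s, UNITY)
        delta *= 10
        if s <= delta:
            break
    return "".join(out)


maxdimen = 2**30 - 1
d = wrap32(maxdimen + maxdimen)  # \advance\dimen0\maxdimen
d = wrap32(d + 2)                # \advance\dimen0 2sp
print(print_scaled(d) + "pt")
```

With wrap-around, the sum becomes -2^31, whose negation is itself; the explicit sign and the sign of the integer part then both appear, giving --32768.0pt.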

Supplement: Overfull \hbox not reported: the lack of complete overflow checking induces strange behavior in other ways. For example, on 2021-05-16, Matteo Caoduro reported that an extremely long line does not generate an overfull box message:

xxx...6298 x's...xxx\end

With 6297 x's, and starting at 24924 x's, there is an overfull box message, but not in between. TeX's errors and warnings are not intended to handle such pathological situations.


B546 (module 1372): output routine braces are super special

From Bruno Le Floch, 2020-10-22.

The \output routine is surrounded by very peculiar braces, and by removing the closing one with \let\next=, one ends up in a black hole where TeX does not interpret any further token. My question and answer on tex.sx describe the strange behaviour. It is probably not a bug as there is an explicit comment “loops forever if reading from a file”. It would be interesting to have a rationale.

Response: The idea here is that “The error message [I can't handle that very well] told you you've made a mess, and if the error message isn't enough, then the help info warns you that error recovery is not likely, and if that's not enough, then you'd better look at Volume B.” Basically, “I can't handle that very well” includes the possibility “I can get in a loop.”

As for a rationale, it would be more or less impossible to generally recover into any sensible state, and certainly not one that would give any reasonable subsequent output. This is a case that normal users of macro packages and documents would never run into, and would need a deep expert's close study to fix, so it is not worth spending time or (precious, at the time) bytes of code on trying to let the job continue, most likely fruitlessly in the end.

For that matter, it's somewhat surprising that this case doesn't just bail out immediately, as with the few

fatal_error("(interwoven alignment preambles are not allowed)")

cases, or the favorite:

confusion("256 spans"); {this can happen, but won't}

both of which are more likely for a real user to run into directly.


D350 (module 788): unused variable m declared

From David Fuchs, 2021-01-24.

This line in mf.web should be removed:

@!m:integer; {the current month}

as the local variable k is used, as correctly commented, by open_log_file for indexing into months, and m is now an unused variable.

Response from DEK: In accordance with the wonderful Japanese tradition of wabi-sabi, I won't be changing that.


dvitype.web (module 36): unnecessary loop condition

From Lucas Mirelman, 2021-07-10: In

@<Store character-width indices...@>=
if wp>0 then for k:=width_ptr to wp-1 do

the condition on wp is unnecessary.

Response from DRF: In the declarations:

var k:integer; {index for loops}
...
@!lh:integer; {length of the header data, in four-byte words}
...
@!nw:integer; {number of words in the width table}
...
@!wp:0..max_widths; {new value of |width_ptr| after successful input}

I think k should have been 0..max_widths like wp, and then the suspicious check would make sense. Instead, true to form, DEK saved a word of memory by using k for another loop with the range (0..lh+3) and when someone pointed out lh could be larger than max_widths, it was easier to make k an integer rather than clarify the code a little and use a different index variable for that loop.

(For what it's worth, lh and nw should both have had type 0..65535, since they're set by lh:=b2*256+b3; and nw:=b0*256+b1; which are both guaranteed to be in that range.) There's another case just a bit later in DVItype, exactly like the one Lucas pointed out. Anyway, if this were 40 years ago, I'd militate for the changes I suggest above; now it seems ok to leave it be.

Further down the rabbit hole: In quite a number of places in TeX and friends, there's code like this that does seem necessary in order to protect for-loops from having their “to” value be out-of-range for their index variable. I believe that Hedrick and/or Vax/VMS Pascal optionally enforced this when you turned on some runtime checks. But the old Pascal User Manual and Report from back in the day (as well as the more recent ISO/IEC Pascal standard documents) are pretty clear that first you check if the for-loop is going to happen at all, and then you check that the “from” and “to” values are in range. So, perhaps all the guards scattered about Knuth's code were not supposed to be needed, other than to satisfy a too-fussy compiler?
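
The two orders of checking can be contrasted in a short Python sketch. It is hypothetical throughout: it supposes an index variable of subrange type 0..max_widths (as DRF suggests k should have been), and the functions for_standard, for_fussy and guarded are invented to mimic the two compiler behaviours described, not taken from any real Pascal implementation.

```python
class RangeError(Exception):
    """Raised when a value does not fit the index variable's subrange."""


def check(value, lo, hi):
    if not lo <= value <= hi:
        raise RangeError(f"{value} is outside {lo}..{hi}")


def for_standard(first, last, lo, hi):
    """First decide whether the loop runs at all, then range-check the bounds."""
    if first > last:
        return []
    check(first, lo, hi)
    check(last, lo, hi)
    return list(range(first, last + 1))


def for_fussy(first, last, lo, hi):
    """Range-check both bounds even if the loop body would never run."""
    check(first, lo, hi)
    check(last, lo, hi)
    return list(range(first, last + 1))


def guarded(loop, width_ptr, wp, max_widths):
    """Model of: if wp>0 then for k:=width_ptr to wp-1 do ..."""
    return loop(width_ptr, wp - 1, 0, max_widths) if wp > 0 else []


print(for_standard(0, -1, 0, 10))      # empty loop, no check performed
print(guarded(for_fussy, 0, 0, 10))    # the guard keeps the fussy check away
```

In this model for_fussy(0, -1, 0, 10) raises RangeError because wp-1 is -1, while the guarded form and the standard order both yield an empty loop; that is the situation in which the guard on wp earns its keep.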


For any discussion about these issues, or further reports to be listed here, please use the contact information on the main TeX bugs page.


$Date: 2021/07/28 18:18:37 $;