zlacker

The Source History of Cat

submitted by janvdb+(OP) on 2018-11-12 22:51:03 | 151 points 62 comments

1. zeveb+A4[view] [source] 2018-11-12 23:41:15
>>janvdb+(OP)
> But, if you pull up the manual page for something like grep, you will see that it has not been updated since 2010 (at least on MacOS).

Well, GNU grep was last released 16 months ago, and the last change to its master branch was 4 weeks ago: http://git.savannah.gnu.org/cgit/grep.git

FreeBSD's grep was last updated back in August: https://github.com/freebsd/freebsd/tree/master/usr.bin/grep

OpenBSD's grep was last updated 11 months ago: http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/grep/

Oddly, it looks like the Darwin grep was last updated in 2012: https://opensource.apple.com/source/text_cmds/text_cmds-99/g...

Strange that Apple would be shipping such an ancient grep.

3. mitcht+Y5[view] [source] 2018-11-12 23:57:30
>>janvdb+(OP)
I only read the beginning and end, and I very much like the closing message here.

A tldr of the middle would be cool. Maybe there was a pattern.

I'd like to add another OS not mentioned that will hopefully become a well-appreciated artifact soon too, from Redox OS: https://gitlab.redox-os.org/redox-os/coreutils/blob/master/s...

I can't find it quickly now, but jackpot51 also has an answer somewhere on Reddit about how their networking stack's DNS query command departs from a commonly deployed C program for Windows and Unix, IIRC. Fascinating.

◧◩
7. driver+na[view] [source] [discussion] 2018-11-13 00:44:22
>>ccanno+ca
It's short for conCATenate.

original man page: http://man.cat-v.org/unix-1st/1/cat

9. Isamu+lc[view] [source] 2018-11-13 01:05:54
>>janvdb+(OP)
Really nice history! I want to applaud the author on this loving treatment.

Also I want to point readers to the commentary of some of the Unix authors:

“Old programs have become encrusted with dubious features. Newer programs are not always written with attention to proper separation of function and design for interconnection.”

http://harmful.cat-v.org/cat-v/unix_prog_design.pdf

My point being: Unix (and derivatives) encompass a set of people who disagree about what constitutes Unix philosophy.

◧◩◪◨
11. LukeSh+fe[view] [source] [discussion] 2018-11-13 01:26:28
>>Isamu+Mc
In the earliest references, it was "concatenate". It wasn't until 7th edition UNIX (1979) that "catenate" was given.

References:

- 1971 draft (pre 1st edition) of the paper that would become the well-known 1974 CACM UNIX paper (earliest documentation on `cat` that I can find): https://www.tuhs.org/Archive/Distributions/Research/McIlroy_... (tune in on page 28)

- 6th edition cat(1) man page (1975): http://man.cat-v.org/unix-6th/1/cat

- 7th edition cat(1) man page (1979): http://man.cat-v.org/unix_7th/1/cat

18. enneff+ym[view] [source] 2018-11-13 03:04:34
>>janvdb+(OP)
The plan9 cat is nice: https://github.com/pete/cats/blob/master/plan9-cat.c

21. akkart+Xp[view] [source] 2018-11-13 03:38:42
>>janvdb+(OP)
Interesting to think what a different conclusion the article would have arrived at if he'd chosen to look at GNU cat on Linux. A few sample points:

* 2002: 833 LoC (http://landley.net/aboriginal/history.html)

* 2013: 36kLoC, 2/3rds of them .h files (https://news.ycombinator.com/item?id=11340510#11341175)

* 2018: 37kLoC of .c file dependencies going into libcoreutils.a and some LoC of .h files (coreutils has 60kLoC of .h files)

The methodology for counting lines likely isn't consistent across those data points. But the trend is still unmistakable. Maybe I'll tree-shake all the dead code out and come up with an accurate line count one of these days.
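
To illustrate why such figures are hard to compare, here is a minimal sketch of one possible counting rule: count every line that has at least one non-whitespace character. The function name `count_nonblank_lines` is hypothetical, and this is not the method behind any of the numbers above; counting comments, headers, or generated files differently would give different totals.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the lines in a NUL-terminated buffer that contain at least one
 * non-whitespace character. A final line without a trailing newline
 * still counts. Comment lines are counted as code here, which is one of
 * the methodology choices that makes LoC figures hard to compare. */
size_t count_nonblank_lines(const char *text)
{
    size_t count = 0;
    int seen_nonspace = 0;

    for (const char *p = text; *p != '\0'; p++) {
        if (*p == '\n') {
            if (seen_nonspace)
                count++;
            seen_nonspace = 0;      /* start tracking the next line */
        } else if (!isspace((unsigned char)*p)) {
            seen_nonspace = 1;
        }
    }
    if (seen_nonspace)              /* last line had no trailing newline */
        count++;
    return count;
}
```

Reading each .c and .h file into a buffer and summing the results would give one consistent data point; whether to include .h files at all is a separate choice that has to be stated alongside the number.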

◧◩
40. pauldd+9P[view] [source] [discussion] 2018-11-13 10:07:29
>>fanbel+FL
Was it http://www.linuxcertif.com/man/1/dog/ ?

◧◩◪◨
41. JdeBP+kP[view] [source] [discussion] 2018-11-13 10:09:21
>>pjmlp+vO
Not it, them.

* http://jdebp.eu./FGA/operating-system-books.html

One of these days I shall get around to expressing my opinions, which as you can see are still missing. Indeed, the list itself is a decade out of date. (-:

I have some SCO UNIX manuals on the other side of the room as I type this.

54. rain1+ht2[view] [source] 2018-11-14 00:29:03
>>janvdb+(OP)
I think that code bloat, especially in GNU, is a huge problem in our software because it makes programs difficult to maintain, understand, and modify. I feel like most people I've interacted with online (present company excepted) don't care about it and don't see it as a problem. I can get that it doesn't affect them because they only use these projects as black boxes and don't maintain them, so it isn't relevant to their work.

I created a wiki page to measure the number of lines of code* of various types of software https://softwarecrisis.miraheze.org/wiki/Linecount - LOC is a very, very rough proxy for what I actually want to measure, but the results are so stunning that even an inaccurate indirect measurement tells a lot. You can see that for two projects that do essentially the same thing there might be a 1000x difference in LOC.

It's fascinating what can happen to such a simple program as 'cat'. The same effect is amplified further when you look at projects like gcc. I tried asking on a couple of sites like Stack Exchange and Reddit why gcc takes half an hour to build instead of a fraction of a second, but the question was not taken well. I got a lot of resistance: X-Y answers, deletions, etc. I don't think that the common software engineer wants to take seriously the idea that the day-to-day tools we use have a million-fold inefficiency built into them by accident. I also noticed that 'make' has no profiler; nobody has even really done a breakdown of what takes how long to build in the gcc tree.

There are a lot of brilliant engineers who understand this problem and want to solve it though. We see that in Alan Kay's STEPS project, aligrudi's work, musl, toybox, maybe sbase and many of the independent bootstrapping projects that have popped up. There's a lot of inertia and weight to the standard GNU toolkit to push back against but I believe these problems are all solvable and by solving them we can create programming languages and tools with leverage far beyond what currently exists. I just hope such projects can be integrated rather than be forgotten.

◧◩
56. loeg+5D2[view] [source] [discussion] 2018-11-14 02:11:43
>>LukeSh+xi
Hah, I learned something about cat today. Thanks.

Amusingly, the BSD socket behavior can be disabled with the compiler flag -DNO_UDOM_SUPPORT (which defines the NO_UDOM_SUPPORT macro), but as far as I can tell it has been neither documented nor hooked into the rest of the build system in any way since its introduction in 2001:

https://svnweb.freebsd.org/base?view=revision&revision=83482
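
For anyone unfamiliar with the mechanism, a minimal sketch of how a `-D` flag like this typically gates a feature at compile time. This is not the FreeBSD cat source; the function `open_strategy` and its return strings are hypothetical and exist only to show the preprocessor guard.

```c
#include <stdio.h>

/* Hypothetical illustration, NOT the FreeBSD cat source.
 * Building with `cc -DNO_UDOM_SUPPORT ...` defines the macro, so the
 * preprocessor removes the socket branch before the compiler sees it. */

/* Return a label describing how a path would be opened in this build. */
const char *open_strategy(int path_is_socket)
{
#ifndef NO_UDOM_SUPPORT
    /* Default build: Unix domain socket support is compiled in. */
    if (path_is_socket)
        return "connect-to-unix-socket";
#else
    /* Support compiled out: the argument is deliberately ignored. */
    (void)path_is_socket;
#endif
    return "plain-open";
}
```

Because nothing in a build system sets the macro, the only way to take the second branch in a setup like this is to pass the flag by hand, which matches the "not hooked into the build" observation.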

◧◩◪◨⬒⬓
58. loeg+VD2[view] [source] [discussion] 2018-11-14 02:22:01
>>swoopi+X52
Latin root "con-" ("com-") meaning "with," or "together." As in, "concatenate" means something like, "chain together."

https://www.etymonline.com/word/com-

https://www.etymonline.com/word/concatenate

◧◩◪◨
60. JdeBP+id3[view] [source] [discussion] 2018-11-14 11:25:01
>>loeg+qC2
Tut-tut! So easily demonstrable otherwise.

MacOS:

* https://unix.stackexchange.com/questions/352977/

* https://unix.stackexchange.com/a/398249/5132

The very version of FreeBSD from some years ago:

   % bsdgrep --version
   bsdgrep (BSD grep) 2.5.1-FreeBSD
   % grep --version
   grep (GNU grep) 2.5.1-FreeBSD

   Copyright 1988, 1992-1999, 2000, 2001 Free Software Foundation, Inc.
   This is free software; see the source for copying conditions. There is NO
   warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
   
   %

More on that:

* https://unix.stackexchange.com/a/65609/5132

Kyle Evans and others on making bsdgrep into grep:

* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201650
