<!DOCTYPE html><html lang=en><head>  <meta charset="utf-8">  <title>userland: a book about the command line for humans</title>  <link rel=stylesheet href="userland.css" />  <link rel="alternate" type="application/atom+xml" title="changes" href="//p1k3.com/userland-book/feed.xml" />  <script src="js/jquery.js" type="text/javascript"></script></head>
<body>
<h1 class=bigtitle>userland</h1><hr />
<h1><a name=a-book-about-the-command-line-for-humans href=#a-book-about-the-command-line-for-humans>#</a> a book about the command line for humans</h1>
<p>In the fall of 2013, <a href="//p1k3.com/2013/8/4">thinking about</a> text utilities got me thinking in turn about how my writing habits depend on the Linux command line.  This seems like a good hook for explaining some tools I use every day, so now I&rsquo;m writing a short, haphazard book.</p>
<p>This isn&rsquo;t a book about system administration, writing complex software, or becoming a wizard.  I am not a wizard, and I don&rsquo;t subscribe to the idea that wizardry is required to use these tools.  In fact, I barely know what I&rsquo;m doing most of the time.  I still get some stuff done.</p>
<p>This is a work in progress.  It probably gets some stuff wrong.</p>
<p>&ndash; bpb / <a href="https://p1k3.com">p1k3</a> / <a href="https://twitter.com/brennen">@brennen</a></p>
<div class=details>  <h2 class=clicker><a name=contents href=#contents>#</a> contents</h2>  <div class=full>    <div class=contents><ul><li><a href="#a-book-about-the-command-line-for-humans">a book about the command line for humans</a>
<ul><li><a href="#contents">contents</a></li></ul></li><li><a href="#get-you-a-shell">0. get you a shell</a>
<ul><li><a href="#get-an-account-on-a-social-unix-server">get an account on a social unix server</a></li><li><a href="#use-a-raspberry-pi-or-beaglebone">use a raspberry pi or beaglebone</a></li><li><a href="#use-a-virtual-machine">use a virtual machine</a></li></ul></li><li><a href="#the-command-line-as-literary-environment">1. the command line as literary environment</a>
<ul><li><a href="#terms-and-definitions">terms and definitions</a></li><li><a href="#twisty-little-passages">twisty little passages</a></li><li><a href="#cat">cat</a></li><li><a href="#wildcards">wildcards</a></li><li><a href="#sort">sort</a></li><li><a href="#options">options</a></li><li><a href="#uniq">uniq</a></li><li><a href="#standard-IO">standard IO</a></li><li><a href="#code-help-code-and-man-pages"><code>--help</code> and man pages</a></li><li><a href="#wc">wc</a></li><li><a href="#head-tail-and-cut">head, tail, and cut</a></li><li><a href="#tab-separated-values">tab separated values</a></li><li><a href="#finding-text-grep">finding text: grep</a></li><li><a href="#now-you-have-n-problems">now you have n problems</a></li></ul></li><li><a href="#a-literary-problem">2. a literary problem</a></li><li><a href="#programmerthink">3. programmerthink</a></li><li><a href="#script">4. script</a>
<ul><li><a href="#learn-you-an-editor">learn you an editor</a></li><li><a href="#d-i-y-utilities">d.i.y. utilities</a></li><li><a href="#heavy-lifting">heavy lifting</a></li><li><a href="#generality">generality</a></li></ul></li><li><a href="#general-purpose-programmering">5. general purpose programmering</a></li><li><a href="#one-of-these-things-is-not-like-the-others">6. one of these things is not like the others</a>
<ul><li><a href="#diff">diff</a></li><li><a href="#wdiff">wdiff</a></li></ul></li><li><a href="#the-command-line-as-as-a-shared-world">7. the command line as a shared world</a></li><li><a href="#the-command-line-and-the-web">8. the command line and the web</a></li><li><a href="#a-miscellany-of-tools-and-techniques">9. a miscellany of tools and techniques</a>
<ul><li><a href="#dict">dict</a></li><li><a href="#aspell">aspell</a></li><li><a href="#mostcommon">mostcommon</a></li><li><a href="#cal-and-ncal">cal and ncal</a></li><li><a href="#seq">seq</a></li><li><a href="#shuf">shuf</a></li><li><a href="#ptx">ptx</a></li><li><a href="#figlet">figlet</a></li><li><a href="#cowsay">cowsay</a></li></ul></li><li><a href="#endmatter">endmatter</a>
<ul><li><a href="#further-reading">further reading</a></li><li><a href="#code">code</a></li><li><a href="#copying">copying</a></li></ul></li></ul>
</div>  </div></div>

<hr />
<h1><a name=get-you-a-shell href=#get-you-a-shell>#</a> 0. get you a shell</h1>
<p>You don&rsquo;t have to have a shell at hand to get something out of this book.  Still, as with most practical subjects, you&rsquo;ll learn more if you try things out as you go.  You shouldn&rsquo;t feel guilty about skipping this section.  It will always be here later if you need it.</p>
<p>Not so long ago, it was common for schools and ISPs to hand out shell accounts on big shared systems.  People learned the command line as a side effect of reading their e-mail.</p>
<p>That doesn&rsquo;t happen as often now, but in the meanwhile computers have become relatively cheap and free software is abundant.  If you&rsquo;re reading this on the web, you can probably get access to a shell.  Some options follow.</p>
<h2><a name=get-an-account-on-a-social-unix-server href=#get-an-account-on-a-social-unix-server>#</a> get an account on a social unix server</h2>
<p>Check out <a href="https://tilde.town/">tilde.town</a>:</p>
<blockquote><p>tilde.town is an intentional digital community for making art, socializing, and learning. Unlike many online spaces, users interact with tilde.town through a direct connection instead of a web site. This means using a tool called ssh and other text based tools.</p></blockquote>
<h2><a name=use-a-raspberry-pi-or-beaglebone href=#use-a-raspberry-pi-or-beaglebone>#</a> use a raspberry pi or beaglebone</h2>
<p>Do you have a single-board computer lying around?  Perfect.  If you already run the standard Raspbian, Debian on a BeagleBone, or a similar-enough Linux, you don&rsquo;t need much else.  I wrote most of this text on a Raspberry Pi, and the example commands should all work there.</p>
<h2><a name=use-a-virtual-machine href=#use-a-virtual-machine>#</a> use a virtual machine</h2>
<p>A few options:</p>
<ul><li><a href="https://docs.vagrantup.com/v2/getting-started/index.html">Use Vagrant to spin up a machine in Virtualbox</a></li><li><a href="https://www.digitalocean.com/community/tutorials/how-to-create-your-first-digitalocean-droplet-virtual-server">Use DigitalOcean to create a remotely-hosted VM running Linux</a></li></ul>

<hr />
<h1><a name=the-command-line-as-literary-environment href=#the-command-line-as-literary-environment>#</a> 1. the command line as literary environment</h1>
<p>There are a lot of ways to structure an introduction to the command line.  I&rsquo;m going to start with writing as a point of departure because, aside from web development, it&rsquo;s what I use a computer for most.  I want to shine a light on the humane potential of ideas that are usually understood as nerd trivia.  Computers have utterly transformed the practice of writing within the space of my lifetime, but it seems to me that writers as a class miss out on many of the software tools and patterns taken as a given in more &ldquo;technical&rdquo; fields.</p>
<p>Writing, particularly writing of any real scope or complexity, is very much a technical task.  It makes demands, both physical and psychological, of its practitioners.  As with woodworkers, graphic artists, and farmers, writers exhibit strong preferences in their tools, materials, and environment, and they do so because they&rsquo;re engaged in a physically and cognitively challenging task.</p>
<p>My thesis is that the modern Linux command line is a pretty good environment for working with English prose and prosody, and that maybe this will illuminate the ways it could be useful in your own work with a computer, whatever that work happens to be.</p>
<h2><a name=terms-and-definitions href=#terms-and-definitions>#</a> terms and definitions</h2>
<p>What software are we actually talking about when we say &ldquo;the command line&rdquo;?</p>
<p>For the purposes of this discussion, we&rsquo;re talking about an environment built on a very old paradigm called Unix.</p>
<p style="text-align:center;"> <img src="images/jp_unix.jpg" height=320 width=470></p>
<p>&hellip;except what classical Unix really looks like is this:</p>
<p style="text-align:center;"> <img src="images/blinking.gif" width=470></p>
<p>The Unix-like environment we&rsquo;re going to use isn&rsquo;t very classical, really.  It&rsquo;s an operating system kernel called Linux, combined with a bunch of things written by other people (people in the GNU and Debian projects, and many others).  Purists will tell you that this isn&rsquo;t properly Unix at all.  In strict historical terms they&rsquo;re right, or at least a certain kind of right, but for the purposes of my cultural agenda I&rsquo;m going to ignore them right now.</p>
<p style="text-align:center;"> <img src="images/debian.png"></p>
<p>This is what&rsquo;s called a shell.  There are many different shells, but they pretty much all operate on the same idea:  You navigate a filesystem and run programs by typing commands.  Commands can be combined in various ways to make programs of their own, and in fact the way you use the computer is often just to write little programs that invoke other programs, turtles-all-the-way-down style.</p>
<p>The standard shell these days is something called Bash, so we&rsquo;ll use Bash.  It&rsquo;s what you&rsquo;ll most often see in the wild.  Like most shells, Bash is ugly and stupid in more ways than it is possible to easily summarize.  It&rsquo;s also an incredibly powerful and expressive piece of software.</p>
<h2><a name=twisty-little-passages href=#twisty-little-passages>#</a> twisty little passages</h2>
<p>Have you ever played a text-based adventure game or MUD, of the kind that describes a setting and takes commands for movement and so on?  Readers of a certain age and temperament might recognize the opening of Crowther &amp; Woods' <em>Adventure</em>, the great-granddaddy of text adventure games:</p>
<pre><code>YOU ARE STANDING AT THE END OF A ROAD BEFORE A SMALL BRICK BUILDING.
AROUND YOU IS A FOREST.  A SMALL STREAM FLOWS OUT OF THE BUILDING AND
DOWN A GULLY.
&gt; GO EAST
YOU ARE INSIDE A BUILDING, A WELL HOUSE FOR A LARGE SPRING.
THERE ARE SOME KEYS ON THE GROUND HERE.
THERE IS A SHINY BRASS LAMP NEARBY.
THERE IS FOOD HERE.
THERE IS A BOTTLE OF WATER HERE.</code></pre>
<p>You can think of the shell as a kind of environment you inhabit, in much the way your character inhabits an adventure game.  The difference is that instead of navigating around virtual rooms and hallways with commands like <code>LOOK</code> and <code>EAST</code>, you navigate between directories by typing commands like <code>ls</code> and <code>cd notes</code>:</p>
<pre><code>$ ls
code  Downloads  notes  p1k3  photos  scraps  userland-book
$ cd notes
$ ls
notes.txt  sparkfun  TODO.txt</code></pre>
<p><code>ls</code> lists files.  Some files are directories, which means they can contain other files, and you can step inside of them by typing <code>cd</code> (for <strong>c</strong>hange <strong>d</strong>irectory).</p>
<p>In the Macintosh and Windows world, directories have been called &ldquo;folders&rdquo; for a long time now.  This isn&rsquo;t the <em>worst</em> metaphor for what&rsquo;s going on, and it&rsquo;s so pervasive by now that it&rsquo;s not worth fighting about.  It&rsquo;s also not exactly a <em>great</em> metaphor, since computer filesystems aren&rsquo;t built very much like the filing cabinets of yore.  A directory acts a lot like a container of some sort, but it&rsquo;s an infinitely expandable one which may contain nested sub-spaces much larger than itself.  Directories are frequently like the TARDIS: Bigger on the inside.</p>
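<p>If you&rsquo;d like to wander around without touching anything that matters, here&rsquo;s a minimal sketch.  It assumes <code>mktemp</code>, <code>mkdir</code>, and <code>pwd</code> are available (they are on most Linux systems), and the directory names are made up for illustration:</p>

```shell
# Work in a throwaway directory so nothing real gets touched.
cd "$(mktemp -d)"

# Make a directory with another directory inside it (hypothetical names).
mkdir -p notes/drafts

# Step down into the inner directory and ask where we are.
cd notes/drafts
pwd

# ".." always means "the directory above this one".
cd ..
here=$(basename "$(pwd)")
echo "$here"
```

<p>The last line should print <code>notes</code>, since stepping up from <code>drafts</code> lands you back in the directory that contains it.</p>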
<h2><a name=cat href=#cat>#</a> cat</h2>
<p>When you&rsquo;re in the shell, you have many tools at your disposal - programs that can be used on many different files, or chained together with other programs.  They tend to have weird, cryptic names, but a lot of them do very simple things.  Tasks that might be a menu item in a big program like Word, like counting the number of words in a document or finding a particular phrase, are often programs unto themselves.  We&rsquo;ll start with something even more basic than that.</p>
<p>Suppose you have some files, and you&rsquo;re curious what&rsquo;s in them.  For example, suppose you&rsquo;ve got a list of authors you&rsquo;re planning to reference, and you just want to check its contents real quick-like.  This is where our friend <code>cat</code> comes in:</p>
<!-- exec -->

<pre><code>$ cat authors_sff
Ursula K. Le Guin
Jo Walton
Pat Cadigan
John Ronald Reuel Tolkien
Vanessa Veselka
James Tiptree, Jr.
John Brunner</code></pre>
<!-- end -->

<p>&ldquo;Why,&rdquo; you might be asking, &ldquo;is the command to dump out the contents of a file to a screen called <code>cat</code>?  What do felines have to do with anything?&rdquo;</p>
<p>It turns out that <code>cat</code> is actually short for &ldquo;catenate&rdquo;, which is a long word basically meaning &ldquo;stick things together&rdquo;.  In programming, we usually refer to sticking two bits of text together as &ldquo;string concatenation&rdquo;, probably because programmers like to feel like they&rsquo;re being very precise about very simple actions.</p>
<p>Suppose you wanted to see the contents of a <em>set</em> of author lists:</p>
<!-- exec -->

<pre><code>$ cat authors_sff authors_contemporary_fic authors_nat_hist
Ursula K. Le Guin
Jo Walton
Pat Cadigan
John Ronald Reuel Tolkien
Vanessa Veselka
James Tiptree, Jr.
John Brunner
Eden Robinson
Vanessa Veselka
Miriam Toews
Gwendolyn L. Waring</code></pre>
<!-- end -->

<h2><a name=wildcards href=#wildcards>#</a> wildcards</h2>
<p>We&rsquo;re working with three filenames: <code>authors_sff</code>, <code>authors_contemporary_fic</code>, and <code>authors_nat_hist</code>.  That&rsquo;s an awful lot of typing every time we want to do something to all three files.  Fortunately, our shell offers a shorthand for &ldquo;all the files that start with <code>authors_</code>&rdquo;:</p>
<!-- exec -->

<pre><code>$ cat authors_*
Eden Robinson
Vanessa Veselka
Miriam Toews
Gwendolyn L. Waring
Ursula K. Le Guin
Jo Walton
Pat Cadigan
John Ronald Reuel Tolkien
Vanessa Veselka
James Tiptree, Jr.
John Brunner</code></pre>
<!-- end -->

<p>In Bash-land, <code>*</code> basically means &ldquo;anything&rdquo;, and is known in the vernacular, somewhat poetically, as a &ldquo;wildcard&rdquo;.  You should always be careful with wildcards, especially if you&rsquo;re doing anything destructive.  They can and will surprise the unwary.  Still, once you&rsquo;re used to the idea, they will save you a lot of RSI.</p>
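<p>One habit that takes some of the surprise out of wildcards is to preview them with <code>echo</code> before doing anything destructive.  This is only a sketch: the files are made up, and it runs in a scratch directory created with <code>mktemp</code> so that the <code>rm</code> can&rsquo;t hurt anything real.</p>

```shell
# Work in a throwaway directory with some made-up files.
cd "$(mktemp -d)"
touch authors_sff authors_nat_hist notes.txt

# The shell expands the wildcard before the command ever runs,
# so echo shows exactly which names the pattern matches.
matched=$(echo authors_*)
echo "$matched"

# Once the list looks right, swap echo for the real command.
rm authors_*
remaining=$(ls)
echo "$remaining"
```

<p>Only <code>notes.txt</code> should survive, because it&rsquo;s the one name that doesn&rsquo;t start with <code>authors_</code>.</p>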
<h2><a name=sort href=#sort>#</a> sort</h2>
<p>There&rsquo;s a problem here.  Our author list is out of order, and thus confusing to reference.  Fortunately, since one of the most basic things you can do to a list is to sort it, someone else has already solved this problem for us.  Here&rsquo;s a command that will give us some organization:</p>
<!-- exec -->

<pre><code>$ sort authors_*
Eden Robinson
Gwendolyn L. Waring
James Tiptree, Jr.
John Brunner
John Ronald Reuel Tolkien
Jo Walton
Miriam Toews
Pat Cadigan
Ursula K. Le Guin
Vanessa Veselka
Vanessa Veselka</code></pre>
<!-- end -->

<p>Does it bother you that they aren&rsquo;t sorted by last name?  Me too.  As a partial solution, we can ask <code>sort</code> to use the second &ldquo;field&rdquo; in each line as its sort <strong>k</strong>ey (by default, sort treats whitespace as a division between fields):</p>
<!-- exec -->

<pre><code>$ sort -k2 authors_*
John Brunner
Pat Cadigan
Ursula K. Le Guin
Gwendolyn L. Waring
Eden Robinson
John Ronald Reuel Tolkien
James Tiptree, Jr.
Miriam Toews
Vanessa Veselka
Vanessa Veselka
Jo Walton</code></pre>
<!-- end -->

<p>That&rsquo;s closer, right?  It sorted on &ldquo;Cadigan&rdquo; and &ldquo;Veselka&rdquo; instead of &ldquo;Pat&rdquo; and &ldquo;Vanessa&rdquo;.  (Of course, it&rsquo;s still far from perfect, because the second field in each line isn&rsquo;t necessarily the person&rsquo;s last name.)</p>
<h2><a name=options href=#options>#</a> options</h2>
<p>Above, when we wanted to ask <code>sort</code> to behave differently, we gave it what is known as an option.  Most programs with command-line interfaces will allow their behavior to be changed by adding various options.  Options usually (but not always!) look like <code>-o</code> or <code>--option</code>.</p>
<p>For example, if we wanted to see just the unique lines, irrespective of case, for a file called colors:</p>
<!-- exec -->

<pre><code>$ cat colors
RED
blue
red
BLUE
Green
green
GREEN</code></pre>
<!-- end -->

<p>We could write this:</p>
<!-- exec -->

<pre><code>$ sort -uf colors
blue
Green
RED</code></pre>
<!-- end -->

<p>Here <code>-u</code> stands for <strong>u</strong>nique and <code>-f</code> stands for <strong>f</strong>old case, which means to treat upper- and lower-case letters as the same for comparison purposes.  You&rsquo;ll often see a group of short options following the <code>-</code> like this.</p>
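<p>To make the grouping concrete, here&rsquo;s a small sketch showing three spellings of the same request.  It assumes GNU <code>sort</code>, since the long names <code>--unique</code> and <code>--ignore-case</code> aren&rsquo;t guaranteed elsewhere, and it uses a shortened, made-up <code>colors</code> file in a scratch directory:</p>

```shell
# Work in a throwaway directory with a short, made-up colors file.
cd "$(mktemp -d)"
printf 'RED\nblue\nred\nBLUE\n' > colors

# Short options grouped behind one dash...
grouped=$(sort -uf colors)

# ...the same short options given separately...
separate=$(sort -u -f colors)

# ...and the long spellings (assumes GNU sort).
long=$(sort --unique --ignore-case colors)

# All three should produce identical output.
[ "$grouped" = "$separate" ] && [ "$separate" = "$long" ] && echo "all three agree"
```

<p>Which spelling you use is mostly taste: short options are quicker to type, and long ones are easier to read back later.</p>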
<h2><a name=uniq href=#uniq>#</a> uniq</h2>
<p>Did you notice how Vanessa Veselka shows up twice in our list of authors?  That&rsquo;s useful if we want to remember that she&rsquo;s in more than one category, but it&rsquo;s redundant if we&rsquo;re just worried about membership in the overall set of authors.  We can make sure our list doesn&rsquo;t contain repeating lines by using <code>sort</code>, just like with that list of colors:</p>
<!-- exec -->

<pre><code>$ sort -u -k2 authors_*
John Brunner
Pat Cadigan
Ursula K. Le Guin
Gwendolyn L. Waring
Eden Robinson
John Ronald Reuel Tolkien
James Tiptree, Jr.
Miriam Toews
Vanessa Veselka
Jo Walton</code></pre>
<!-- end -->

<p>But there&rsquo;s another approach to this.  <code>sort</code> is good at only displaying a line once, but suppose we wanted to see a count of how many different lists an author shows up on?  <code>sort</code> doesn&rsquo;t do that, but a command called <code>uniq</code> does, if you give it the option <code>-c</code> for <strong>c</strong>ount.</p>
<p><code>uniq</code> moves through the lines in its input, and if it sees a line more than once in sequence, it will only print that line once.  If you have a bunch of files and you just want to see the unique lines across all of those files, you probably need to run them through <code>sort</code> first.  How do you do that?</p>
<!-- exec -->

<pre><code>$ sort authors_* | uniq -c
      1 Eden Robinson
      1 Gwendolyn L. Waring
      1 James Tiptree, Jr.
      1 John Brunner
      1 John Ronald Reuel Tolkien
      1 Jo Walton
      1 Miriam Toews
      1 Pat Cadigan
      1 Ursula K. Le Guin
      2 Vanessa Veselka</code></pre>
<!-- end -->

<h2><a name=standard-IO href=#standard-IO>#</a> standard IO</h2>
<p>The <code>|</code> is called a &ldquo;pipe&rdquo;.  In the command above, it tells your shell that instead of printing the output of <code>sort authors_*</code> right to your terminal, it should send it to <code>uniq -c</code>.</p>
<p style="text-align:center;"> <img src="images/pipe.gif"></p>
<p>Pipes are some of the most important magic in the shell.  When the people who built Unix in the first place give interviews about the stuff they remember from the early days, a lot of them reminisce about the invention of pipes and all of the new stuff it immediately made possible.</p>
<p>Pipes help you control a thing called &ldquo;standard IO&rdquo;.  In the world of the command line, programs take <strong>i</strong>nput and produce <strong>o</strong>utput.  A pipe is a way to hook the output from one program to the input of another.</p>
<p>Unlike a lot of the weirdly named things you&rsquo;ll encounter in software, the metaphor here is obvious and makes pretty good sense.  It even kind of looks like a physical pipe.</p>
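<p>Pipes also explain why <code>sort</code> went in front of <code>uniq</code> earlier.  Here&rsquo;s a small sketch, with a made-up three-line list standing in for the author files, of what <code>uniq</code> does with and without sorted input:</p>

```shell
# uniq only collapses duplicates that sit right next to each other.
unsorted=$(printf 'Vanessa\nJo\nVanessa\n' | uniq)

# Sorting first puts the duplicates side by side, so uniq can catch them.
sorted=$(printf 'Vanessa\nJo\nVanessa\n' | sort | uniq)

echo "$unsorted"   # still three lines: the two Vanessas aren't adjacent
echo "$sorted"     # two lines
```

<p>Each program in the chain stays simple.  <code>uniq</code> doesn&rsquo;t need to know how to sort, because a pipe lets it borrow that from <code>sort</code>.</p>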
<p>What if, instead of sending the output of one program to the input of another, you&rsquo;d like to store it in a file for later use?</p>
<p>Check it out:</p>
<!-- exec -->

<pre><code>$ sort authors_* | uniq &gt; ./all_authors</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ cat all_authors
Eden Robinson
Gwendolyn L. Waring
James Tiptree, Jr.
John Brunner
John Ronald Reuel Tolkien
Jo Walton
Miriam Toews
Pat Cadigan
Ursula K. Le Guin
Vanessa Veselka</code></pre>
<!-- end -->

<p>I like to think of the <code>&gt;</code> as looking like a little funnel.  It can be dangerous: you should always make sure that you&rsquo;re not going to clobber an existing file you actually want to keep.</p>
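<p>If clobbering worries you, the shell can act as a safety net.  The sketch below uses the <code>noclobber</code> option, which makes <code>&gt;</code> refuse to overwrite a file that already exists.  The filename is made up, and everything happens in a scratch directory:</p>

```shell
# Work in a throwaway directory with one file we care about.
cd "$(mktemp -d)"
echo 'precious' > keep_me

# With noclobber on, > refuses to overwrite an existing file.
set -o noclobber
echo 'oops' > keep_me || echo "refused to overwrite keep_me"
after_refusal=$(cat keep_me)

# >| is the explicit "yes, really overwrite it" form.
echo 'replaced on purpose' >| keep_me
contents=$(cat keep_me)

# Put things back the way they were.
set +o noclobber
```

<p>The shell prints its own complaint when the overwrite is refused.  This only lasts for the current shell session unless you turn it on in your shell&rsquo;s startup file.</p>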
<p>If you want to tack more stuff on to the end of an existing file, you can use <code>&gt;&gt;</code> instead.  To test that, let&rsquo;s use <code>echo</code>, which prints out whatever string you give it on a line by itself:</p>
<!-- exec -->

<pre><code>$ echo 'hello' &gt; hello_world</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ echo 'world' &gt;&gt; hello_world</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ cat hello_world
hello
world</code></pre>
<!-- end -->

<p>You can also take a file and pull it directly back into the input of a given program, which is a bit like a funnel going the other direction:</p>
<!-- exec -->

<pre><code>$ nl &lt; all_authors
     1  Eden Robinson
     2  Gwendolyn L. Waring
     3  James Tiptree, Jr.
     4  John Brunner
     5  John Ronald Reuel Tolkien
     6  Jo Walton
     7  Miriam Toews
     8  Pat Cadigan
     9  Ursula K. Le Guin
    10  Vanessa Veselka</code></pre>
<!-- end -->

<p><code>nl</code> is just a way to <strong>n</strong>umber <strong>l</strong>ines.  This command accomplishes pretty much the same thing as <code>cat all_authors | nl</code>, or <code>nl all_authors</code>.  You won&rsquo;t see <code>&lt;</code> used as often as <code>|</code> and <code>&gt;</code>, since most utilities can read files on their own, but it can save you from typing <code>cat</code> quite as often.</p>
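<p>Here&rsquo;s a quick sketch that checks the three spellings against each other, using a shortened, made-up <code>all_authors</code> in a scratch directory:</p>

```shell
# Work in a throwaway directory with a two-line stand-in for all_authors.
cd "$(mktemp -d)"
printf 'Eden Robinson\nJo Walton\n' > all_authors

# Three ways of getting the same file into nl.
via_redirect=$(nl < all_authors)
via_pipe=$(cat all_authors | nl)
via_argument=$(nl all_authors)

# All three should number the lines identically.
[ "$via_redirect" = "$via_pipe" ] && [ "$via_pipe" = "$via_argument" ] && echo "same output"
```

<p>The difference is in who opens the file.  With <code>&lt;</code> the shell does it, with <code>cat</code> a whole extra program does it, and with a filename argument <code>nl</code> does it itself.</p>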
<p>We&rsquo;ll use these features liberally from here on out.</p>
<h2><a name=code-help-code-and-man-pages href=#code-help-code-and-man-pages>#</a> <code>--help</code> and man pages</h2>
<p>You can change the behavior of most tools by giving them different options.  This is all well and good if you already know what options are available, but what if you don&rsquo;t?</p>
<p>Often, you can ask the tool itself:</p>
<pre><code>$ sort --help
Usage: sort [OPTION]... [FILE]...
  or:  sort [OPTION]... --files0-from=F
Write sorted concatenation of all FILE(s) to standard output.

Mandatory arguments to long options are mandatory for short options too.
Ordering options:

  -b, --ignore-leading-blanks  ignore leading blanks
  -d, --dictionary-order      consider only blanks and alphanumeric characters
  -f, --ignore-case           fold lower case to upper case characters
  -g, --general-numeric-sort  compare according to general numerical value
  -i, --ignore-nonprinting    consider only printable characters
  -M, --month-sort            compare (unknown) &lt; 'JAN' &lt; ... &lt; 'DEC'
  -h, --human-numeric-sort    compare human readable numbers (e.g., 2K 1G)
  -n, --numeric-sort          compare according to string numerical value
  -R, --random-sort           sort by random hash of keys
      --random-source=FILE    get random bytes from FILE
  -r, --reverse               reverse the result of comparisons</code></pre>
<p>&hellip;and so on.  (It goes on for a while in this vein.)</p>
<p>If that doesn&rsquo;t work, or doesn&rsquo;t provide enough info, the next thing to try is called a man page.  (&ldquo;man&rdquo; is short for &ldquo;manual&rdquo;.  It&rsquo;s sort of an unfortunate abbreviation.)</p>
<pre><code>$ man sort
SORT(1)                         User Commands                        SORT(1)


NAME
       sort - sort lines of text files

SYNOPSIS
       sort [OPTION]... [FILE]...
       sort [OPTION]... --files0-from=F

DESCRIPTION
       Write sorted concatenation of all FILE(s) to standard output.</code></pre>
<p>&hellip;and so on.  Manual pages vary in quality, and it can take a while to get used to reading them, but they&rsquo;re very often the best place to look for help.</p>
<p>If you&rsquo;re not sure what <em>program</em> you want to use to solve a given problem, you might try searching all the man pages on the system for a keyword.  <code>man</code> itself has an option to let you do this - <code>man -k keyword</code> - but most systems also have a shortcut called <code>apropos</code>, which I like to use because it&rsquo;s easy to remember if you imagine yourself saying &ldquo;apropos of [some problem I have]&hellip;&rdquo;</p>
<!-- exec -->

<pre><code>$ apropos -s1 sort
apt-sortpkgs (1)     - Utility to sort package index files
bunzip2 (1)          - a block-sorting file compressor, v1.0.6
bzip2 (1)            - a block-sorting file compressor, v1.0.6
comm (1)             - compare two sorted files line by line
sort (1)             - sort lines of text files
tsort (1)            - perform topological sort</code></pre>
<!-- end -->

<p>It&rsquo;s useful to know that the manual represented by <code>man</code> has numbered sections for different kinds of manual pages.  Most of what the average user needs to know about lives in section 1, &ldquo;User Commands&rdquo;, so you&rsquo;ll often see the names of different tools written like <code>sort(1)</code> or <code>cat(1)</code>.  This can be a good way to make it clear in writing that you&rsquo;re talking about a specific piece of software rather than a verb or a small carnivorous mammal.  (I specified <code>-s1</code> for section 1 above just to cut down on clutter, though in practice I usually don&rsquo;t bother.)</p>
<p>Like other literary traditions, Unix is littered with this sort of convention.  This one just happens to date from a time when the manual was still a physical book.</p>
<h2><a name=wc href=#wc>#</a> wc</h2>
<p><code>wc</code> stands for <strong>w</strong>ord <strong>c</strong>ount.  It does about what you&rsquo;d expect - it counts the number of words in its input.</p>
<pre><code>$ wc index.md
  736  4117 24944 index.md</code></pre>
<p>736 is the number of lines, 4117 the number of words, and 24944 the number of bytes (for plain ASCII text, the same as the number of characters) in the file I&rsquo;m writing right now.  I use this constantly.  Most obviously, it&rsquo;s a good way to get an idea of how much you&rsquo;ve written.  <code>wc</code> is the tool I used to track my progress the last time I tried National Novel Writing Month:</p>
<pre><code>$ find ~/p1k3/archives/2010/11 -regextype egrep -regex '.*([0-9]+|index)' -type f | xargs wc -w | tail -1
 6585 total</code></pre>
<!-- exec -->

<pre><code>$ cowsay 'embarrassing.'
 _______________
&lt; embarrassing. &gt;
 ---------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||</code></pre>
<!-- end -->

<p>Anyway.  The less obvious thing about <code>wc</code> is that you can use it to count the output of other commands.  Want to know <em>how many</em> unique authors we have?</p>
<!-- exec -->

<pre><code>$ sort authors_* | uniq | wc -l
10</code></pre>
<!-- end -->

<p>This kind of thing is trivial, but it comes in handy more often than you might think.</p>
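<p>A couple more sketches in the same spirit.  The filenames and the sentence are made up, and the first example leans on the fact that <code>ls</code> prints one name per line when its output goes into a pipe:</p>

```shell
# Work in a throwaway directory with three made-up author lists.
cd "$(mktemp -d)"
touch authors_sff authors_nat_hist authors_contemporary_fic

# How many author lists do we have?  Count the lines ls produces.
list_count=$(ls authors_* | wc -l)
echo "$list_count"

# How many words are in a sentence?  -w counts words instead of lines.
word_count=$(echo 'the quick brown fox' | wc -w)
echo "$word_count"
```

<p>In both cases <code>wc</code> neither knows nor cares where its input came from.  It just counts whatever arrives.</p>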
<h2><a name=head-tail-and-cut href=#head-tail-and-cut>#</a> head, tail, and cut</h2>
<p>Remember our old pal <code>cat</code>, which just splats everything it&rsquo;s given back to standard output?</p>
<p>Sometimes you&rsquo;ve got a piece of output that&rsquo;s more than you actually want to deal with at once.  Maybe you just want to glance at the first few lines in a file:</p>
<!-- exec -->

<pre><code>$ head -3 colors
RED
blue
red</code></pre>
<!-- end -->

<p>&hellip;or maybe you want to see the last thing in a list:</p>
<!-- exec -->

<pre><code>$ sort colors | uniq -i | tail -1
red</code></pre>
<!-- end -->

<p>&hellip;or maybe you&rsquo;re only interested in the first &ldquo;field&rdquo; in some list. You might use <code>cut</code> here, asking it to treat spaces as delimiters between fields and return only the first field for each line of its input:</p>
<!-- exec -->

<pre><code>$ cut -d' ' -f1 ./authors_*
Eden
Vanessa
Miriam
Gwendolyn
Ursula
Jo
Pat
John
Vanessa
James
John</code></pre>
<!-- end -->

<p>Suppose we&rsquo;re curious what the few most commonly occurring first names on our author list are?  Here&rsquo;s an approach, silly but effective, that combines a lot of what we&rsquo;ve discussed so far and looks like plenty of one-liners I wind up writing in real life:</p>
<!-- exec -->

<pre><code>$ cut -d' ' -f1 ./authors_* | sort | uniq -ci | sort -n | tail -3
      1 Ursula
      2 John
      2 Vanessa</code></pre>
<!-- end -->

<p>Let&rsquo;s walk through this one step by step:</p>
<p>First, we have <code>cut</code> extract the first field of each line in our author lists.</p>
<pre><code>cut -d' ' -f1 ./authors_*</code></pre>
<p>Then we sort these results</p>
<pre><code>| sort</code></pre>
<p>and pass them to <code>uniq</code>, asking it for a case-insensitive count of each repeated line</p>
<pre><code>| uniq -ci</code></pre>
<p>then sort again, numerically,</p>
<pre><code>| sort -n</code></pre>
<p>and finally, we chop off everything but the last three lines:</p>
<pre><code>| tail -3</code></pre>
<p>If you wanted to make sure to count an individual author&rsquo;s first name only once, even if that author appears more than once in the files, you could instead do:</p>
<!-- exec -->

<pre><code>$ sort -u ./authors_* | cut -d' ' -f1 | uniq -ci | sort -n | tail -3
      1 Ursula
      1 Vanessa
      2 John</code></pre>
<!-- end -->
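<p>The pipelines above sort counts from smallest to largest and then grab the end with <code>tail</code>.  A common variation, sketched here with a made-up list of names, flips the sort with <code>-r</code> (for <strong>r</strong>everse) so the biggest counts come first, and takes them with <code>head</code> instead:</p>

```shell
# A made-up list of first names, one per line.
names=$(printf 'John\nVanessa\nJo\nVanessa\nJohn\nVanessa\n')

# Count each name, sort biggest-first, and keep only the top line.
top=$(printf '%s\n' "$names" | sort | uniq -c | sort -rn | head -1)
echo "$top"
```

<p>Here the top line should be the count for Vanessa, who appears three times.  Which direction you sort in is mostly a matter of whether you&rsquo;d rather read the winners at the top or the bottom.</p>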

<h2><a name=tab-separated-values href=#tab-separated-values>#</a> tab separated values</h2>
<p>Notice above how we had to tell <code>cut</code> that &ldquo;fields&rdquo; in <code>authors_*</code> are delimited by spaces?  It turns out that if you don&rsquo;t use <code>-d</code>, <code>cut</code> defaults to using tab characters for a delimiter.</p>
<p>Tab characters are sort of weird little animals.  You can&rsquo;t usually <em>see</em> them directly; they&rsquo;re like a space character that takes up more than one space when displayed.  By convention, one tab is usually rendered as 8 spaces, but it&rsquo;s up to the software that&rsquo;s displaying the character what it wants to do.</p>
<p>(In fact, it&rsquo;s more complicated than that:  Tabs are often rendered as marking <em>tab stops</em>, which is a concept I remember from 7th grade typing classes, but haven&rsquo;t actually thought about in my day-to-day life for nearly 20 years.)</p>
<p>Here&rsquo;s a version of our <code>all_authors</code> that&rsquo;s been rearranged so that the first field is the author&rsquo;s last name, the second is their first name, the third is their middle name or initial (if we know it) and the fourth is any suffix.  Fields are separated by a single tab character:</p>
<!-- exec -->

<pre><code>$ cat all_authors.tsv
Robinson    Eden
Waring  Gwendolyn   L.
Tiptree James       Jr.
Brunner John
Tolkien John    Ronald Reuel
Walton  Jo
Toews   Miriam
Cadigan Pat
Le Guin Ursula  K.
Veselka Vanessa</code></pre>
<!-- end -->

<p>That looks kind of garbled, right?  In order to make it a little more obvious what&rsquo;s happening, let&rsquo;s use <code>cat -T</code>, which displays tab characters as <code>^I</code>:</p>
<!-- exec -->

<pre><code>$ cat -T all_authors.tsv
Robinson^IEden
Waring^IGwendolyn^IL.
Tiptree^IJames^I^IJr.
Brunner^IJohn
Tolkien^IJohn^IRonald Reuel
Walton^IJo
Toews^IMiriam
Cadigan^IPat
Le Guin^IUrsula^IK.
Veselka^IVanessa</code></pre>
<!-- end -->

<p>It looks odd when displayed because some names are at or nearly at 8 characters long.  &ldquo;Robinson&rdquo;, at 8 characters, overshoots the first tab stop, so &ldquo;Eden&rdquo; gets indented further than other first names, and so on.</p>
<p>Fortunately, in order to make this more human-readable, we can pass it through <code>expand</code>, which turns tabs into spaces, with tab stops a given number of columns apart (8 by default):</p>
<!-- exec -->

<pre><code>$ expand -t14 all_authors.tsv
Robinson      Eden
Waring        Gwendolyn     L.
Tiptree       James                       Jr.
Brunner       John
Tolkien       John          Ronald Reuel
Walton        Jo
Toews         Miriam
Cadigan       Pat
Le Guin       Ursula        K.
Veselka       Vanessa</code></pre>
<!-- end -->

<p>Now it&rsquo;s easy to sort by last name:</p>
<!-- exec -->

<pre><code>$ sort -k1 all_authors.tsv | expand -t14
Brunner       John
Cadigan       Pat
Le Guin       Ursula        K.
Robinson      Eden
Tiptree       James                       Jr.
Toews         Miriam
Tolkien       John          Ronald Reuel
Veselka       Vanessa
Walton        Jo
Waring        Gwendolyn     L.</code></pre>
<!-- end -->

<p>Or just extract middle names and initials:</p>
<!-- exec -->

<pre><code>$ cut -f3 all_authors.tsv
L.

Ronald Reuel


K.</code></pre>
<!-- end -->

<p>It probably won&rsquo;t surprise you to learn that there&rsquo;s a corresponding <code>paste</code> command, which takes two or more files and stitches them together with tab characters.  Let&rsquo;s extract a couple of things from our author list and put them back together in a different order:</p>
<!-- exec -->

<pre><code>$ cut -f1 all_authors.tsv &gt; lastnames</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ cut -f2 all_authors.tsv &gt; firstnames</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ paste firstnames lastnames | sort -k2 | expand -t12
John        Brunner
Pat         Cadigan
Ursula      Le Guin
Eden        Robinson
James       Tiptree
Miriam      Toews
John        Tolkien
Vanessa     Veselka
Jo          Walton
Gwendolyn   Waring</code></pre>
<!-- end -->

<p>As these examples show, TSV is something very like a primitive spreadsheet:  A way to represent information in columns and rows.  In fact, it&rsquo;s a close cousin of CSV, which is often used as a lowest-common-denominator format for transferring spreadsheets, and which represents data something like this:</p>
<pre><code>last,first,middle,suffix
Tolkien,John,Ronald Reuel,
Tiptree,James,,Jr.</code></pre>
<p>The advantage of tabs is that they&rsquo;re supported by a bunch of the standard tools.  A disadvantage is that they&rsquo;re kind of ugly and can be weird to deal with, but they&rsquo;re useful anyway, and character-delimited rows are often a good-enough way to hack your way through problems that call for basic structure.</p>
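<p>That said, the same tools can usually be talked into reading comma-separated rows if you tell them which delimiter to use.  This is only a sketch using the two sample rows above, and it will fall over on real-world CSV where a quoted field has a comma inside it:</p>
<pre><code># -d sets the field delimiter (tab is the default); -f picks fields.
printf 'Tolkien,John,Ronald Reuel,\nTiptree,James,,Jr.\n' | cut -d, -f1,2
# Tolkien,John
# Tiptree,James</code></pre>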
<h2><a name=finding-text-grep href=#finding-text-grep>#</a> finding text: grep</h2>
<p>After all those contortions, what if you actually just want to see <em>which lists</em> an individual author appears on?</p>
<!-- exec -->

<pre><code>$ grep 'Vanessa' ./authors_*
./authors_contemporary_fic:Vanessa Veselka
./authors_sff:Vanessa Veselka</code></pre>
<!-- end -->

<p><code>grep</code> takes a string to search for and, optionally, a list of files to search in.  If you don&rsquo;t specify files, it&rsquo;ll look through standard input instead:</p>
<!-- exec -->

<pre><code>$ cat ./authors_* | grep 'Vanessa'
Vanessa Veselka
Vanessa Veselka</code></pre>
<!-- end -->

<p>Most of the time, piping the output of <code>cat</code> to <code>grep</code> is considered silly, because <code>grep</code> knows how to find things in files on its own.  Many thousands of words have been written on this topic by leading lights of the nerd community.</p>
<p>You&rsquo;ve probably noticed that this result doesn&rsquo;t contain filenames (and thus isn&rsquo;t very useful to us).  That&rsquo;s because all <code>grep</code> saw was the lines in the files, not the names of the files themselves.</p>
<h2><a name=now-you-have-n-problems href=#now-you-have-n-problems>#</a> now you have n problems</h2>
<p>To close out this introductory chapter, let&rsquo;s spend a little time on a topic that will likely vex, confound, and (occasionally) delight you for as long as you are acquainted with the command line.</p>
<p>When I was talking about <code>grep</code> a moment ago, I fudged the details more than a little by saying that it expects a string to search for.  What <code>grep</code> <em>actually</em> expects is a <em>pattern</em>.  Moreover, it expects a specific kind of pattern, what&rsquo;s known as a <em>regular expression</em>, a cumbersome phrase frequently shortened to regex.</p>
<p>There&rsquo;s a lot of theory about what makes up a regular expression.  Fortunately, very little of it matters to the short version that will let you get useful stuff done.  The short version is that a regex is like using wildcards in the shell to match groups of files, but for text in general and with more magic.</p>
<!-- exec -->

<pre><code>$ grep 'Jo.*' ./authors_*
./authors_sff:Jo Walton
./authors_sff:John Ronald Reuel Tolkien
./authors_sff:John Brunner</code></pre>
<!-- end -->

<p>The pattern <code>Jo.*</code> says that we&rsquo;re looking for lines which contain a literal <code>Jo</code>, followed by any quantity (including none) of any character.  In a regex, <code>.</code> means &ldquo;anything&rdquo; and <code>*</code> means &ldquo;any amount of the preceding thing&rdquo;.</p>
<p><code>.</code> and <code>*</code> are magical.  In the particular dialect of regexen understood by <code>grep</code>, other magical things include:</p>
<table>    <tr><td><code>^</code>    </td>  <td>start of a line                     </td></tr>    <tr><td><code>$</code>    </td>  <td>end of a line                       </td></tr>    <tr><td><code>[abc]</code></td>  <td>one of a, b, or c                   </td></tr>    <tr><td><code>[a-z]</code></td>  <td>a character in the range a through z</td></tr>    <tr><td><code>[0-9]</code></td>  <td>a character in the range 0 through 9</td></tr>
    <tr><td><code>+</code>    </td>  <td>one or more of the preceding thing  </td></tr>    <tr><td><code>?</code>    </td>  <td>0 or 1 of the preceding thing       </td></tr>    <tr><td><code>*</code>    </td>  <td>any number of the preceding thing   </td></tr>
    <tr><td><code>(foo|bar)</code></td>  <td>"foo" or "bar"</td></tr>    <tr><td><code>(foo)?</code></td>     <td>optional "foo"</td></tr></table>

<p>It&rsquo;s actually a little more complicated than that:  By default, if you want to use a lot of the magical characters, you have to prefix them with <code>\</code>.  This is both ugly and confusing, so unless you&rsquo;re writing a very simple pattern, it&rsquo;s often easiest to call <code>grep -E</code>, for <strong>E</strong>xtended regular expressions, which means that lots of characters will have special meanings.</p>
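<p>Here&rsquo;s a quick sketch of the difference, using <code>+</code> and three made-up lines of input.  Without <code>-E</code>, the <code>+</code> is just a plus sign; with it, it becomes magic:</p>
<pre><code># Basic regex: + is an ordinary character, so this wants a literal "Jo+".
printf 'Jo\nJoo\nJo+\n' | grep 'Jo+'
# Jo+

# Extended regex: + means "one or more of the preceding thing".
printf 'Jo\nJoo\nJo+\n' | grep -E '^Jo+$'
# Jo
# Joo</code></pre>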
<p>Authors with 4-letter first names:</p>
<!-- exec -->

<pre><code>$ grep -iE '^[a-z]{4} ' ./authors_*
./authors_contemporary_fic:Eden Robinson
./authors_sff:John Ronald Reuel Tolkien
./authors_sff:John Brunner</code></pre>
<!-- end -->

<p>A count of authors named John:</p>
<!-- exec -->

<pre><code>$ grep -c '^John ' ./all_authors
2</code></pre>
<!-- end -->

<p>Lines in this file matching the words &ldquo;magic&rdquo; or &ldquo;magical&rdquo;:</p>
<pre><code>$ grep -iE 'magic(al)?' ./index.md
Pipes are some of the most important magic in the shell.  When the people who
shell to match groups of files, but with more magic.
`.` and `*` are magical.  In the particular dialect of regexen understood
by `grep`, other magical things include:
use a lot of the magical characters, you have to prefix them with `\`.  This is
Lines in this file matching the words "magic" or "magical":
    $ grep -iE 'magic(al)?' ./index.md</code></pre>
<p>Find some &ldquo;-agic&rdquo; words in a big list of words:</p>
<!-- exec -->

<pre><code>$ grep -iE '(m|tr|pel)agic' /usr/share/dict/words
magic
magic's
magical
magically
magician
magician's
magicians
pelagic
tragic
tragically
tragicomedies
tragicomedy
tragicomedy's</code></pre>
<!-- end -->

<p><code>grep</code> isn&rsquo;t the only - or even the most important - tool that makes use of regular expressions, but it&rsquo;s a good place to start because it&rsquo;s one of the fundamental building blocks for so many other operations.  Filtering lists of things, matching patterns within collections, and writing concise descriptions of how text should be transformed are at the heart of a practical approach to Unix-like systems.  Regexen turn out to be a seductively powerful way to do these things - so much so that they&rsquo;ve crept their way into text editors, databases, and full-featured programming languages.</p>
<p>There&rsquo;s a dark side to all of this, for the truth about regular expressions is that they are ugly, inconsistent, brittle, and <em>incredibly</em> difficult to think clearly about.  They take years to master and reward the wielder with great power, but they are also a trap: a temptation towards the path of cleverness masquerading as wisdom.</p>
<p style="text-align:center;"> ✑</p>
<p>I&rsquo;ll be returning to this theme, but for the time being let&rsquo;s move on.  Now that we&rsquo;ve established, however haphazardly, some of the basics, let&rsquo;s consider their application to a real-world task.</p>
<hr />
<h1><a name=a-literary-problem href=#a-literary-problem>#</a> 2. a literary problem</h1>
<p>The <a href="../literary_environment">previous chapter</a> introduced a bunch of tools using contrived examples.  Now we&rsquo;ll look at a real problem, and work through a solution by building on tools we&rsquo;ve already covered.</p>
<p>So on to the problem:  I write poetry.</p>
<p>{rimshot dot wav}</p>
<p>Most of the poems I have written are not very good, but lately I&rsquo;ve been thinking that I&rsquo;d like to comb through the last ten years' worth and pull the least-embarrassing stuff into a single collection.</p>
<p>I&rsquo;ve hinted at how the contents of my blog are stored as files, but let&rsquo;s take a look at the whole thing:</p>
<pre><code>$ ls -F ~/p1k3/archives/
1997/  2003/  2009/  bones/     meta/
1998/  2004/  2010/  chapbook/  winfield/
1999/  2005/  2011/  cli/       wip/
2000/  2006/  2012/  colophon/
2001/  2007/  2013/  europe/
2002/  2008/  2014/  hack/</code></pre>
<p>(<code>ls</code>, again, just lists files.  <code>-F</code> tells it to append a character that shows what type of file we&rsquo;re looking at, such as a trailing / for directories.  <code>~</code> is a shorthand that means &ldquo;my home directory&rdquo;, which in this case is <code>/home/brennen</code>.)</p>
<p>Each of the directories here holds other directories.  The ones for each year have sub-directories for the months of the year, which in turn contain files for the days.  The files are just little pieces of HTML and Markdown and some other stuff.  Many years ago, before I had much of an idea how to program, I wrote a script to glue them all together into a web page and serve them up to visitors.  This all sounds complicated, but all it really means is that if I want to write a blog entry, I just open a file and type some stuff.  Here&rsquo;s an example for March 1st:</p>
<!-- exec -->

<pre><code>$ cat ~/p1k3/archives/2014/3/1
&lt;h1&gt;Saturday, March 1&lt;/h1&gt;

&lt;markdown&gt;
Sometimes I'm going along on a Saturday morning, still a little dazed from the
night before, and I think something like "I should just go write a detailed
analysis of hooded sweatshirts".  Mostly these thoughts don't survive contact
with an actual keyboard.  It's almost certainly for the best.
&lt;/markdown&gt;</code></pre>
<!-- end -->

<p>And here&rsquo;s an older one that contains a short poem:</p>
<!-- took this one out of exec block 'cause later i
     made a dir out of it... -->

<pre><code>$ cat ~/p1k3/archives/2012/10/9
&lt;h1&gt;tuesday, october 9&lt;/h1&gt;

&lt;freeverse&gt;
i am a stateful machine
i exist in a manifold of consequence
a clattering miscellany of impure functions
and side effects
&lt;/freeverse&gt;</code></pre>
<p>Notice that <code>&lt;freeverse&gt;</code> bit?  It kind of looks like an HTML tag, but it&rsquo;s not.  What it actually does is tell my blog script that it should format the text it contains like a poem.  The specifics don&rsquo;t matter for our purposes (yet), but this convention is going to come in handy, because the first thing I want to do is get a list of all the entries that contain poems.</p>
<p>Remember <code>grep</code>?</p>
<pre><code>$ grep -ri '&lt;freeverse&gt;' ~/p1k3/archives &gt; ~/possible_poems</code></pre>
<p>Let&rsquo;s step through this bit by bit:</p>
<p>First, I&rsquo;m asking <code>grep</code> to search <strong>r</strong>ecursively, <strong>i</strong>gnoring case.  &ldquo;Recursively&rdquo; just means that every time the program finds a directory, it should descend into that directory and search in any files there as well.</p>
<pre><code>grep -ri</code></pre>
<p>Next comes a pattern to search for.  It&rsquo;s in single quotes because the characters <code>&lt;</code> and <code>&gt;</code> have a special meaning to the shell, and here we need the shell to understand that it should treat them as literal angle brackets instead.</p>
<pre><code>'&lt;freeverse&gt;'</code></pre>
<p>This is the path I want to search:</p>
<pre><code>~/p1k3/archives</code></pre>
<p>Finally, because there are so many entries to search, I know the process will be slow and produce a large list, so I tell the shell to redirect it to a file called <code>possible_poems</code> in my home directory:</p>
<pre><code>&gt; ~/possible_poems</code></pre>
<p>This is quite a few instances&hellip;</p>
<pre><code>$ wc -l ~/possible_poems
679 /home/brennen/possible_poems</code></pre>
<p>&hellip;and it&rsquo;s also not super-pretty to look at:</p>
<pre><code>$ head -5 ~/possible_poems
/home/brennen/p1k3/archives/2011/10/14:&lt;freeverse&gt;i've got this friend has a real knack
/home/brennen/p1k3/archives/2011/4/25:&lt;freeverse&gt;i can't claim to strive for it
/home/brennen/p1k3/archives/2011/8/10:&lt;freeverse&gt;one diminishes or becomes greater
/home/brennen/p1k3/archives/2011/8/12:&lt;freeverse&gt;
/home/brennen/p1k3/archives/2011/1/1:&lt;freeverse&gt;six years on</code></pre>
<p>Still, it&rsquo;s a decent start.  I can see paths to the files I have to check, and usually a first line.  Since I use a fancy text editor, I can just go down the list opening each file in a new window and copying the stuff I&rsquo;m interested in to a new file.</p>
<p>This is good enough for government work, but what if instead of jumping around between hundreds of files, I&rsquo;d rather read everything in one file and just weed out the bad ones as I go?</p>
<pre><code>$ cat `grep -ril '&lt;freeverse&gt;' ~/p1k3/archives` &gt; ~/possible_poems_full</code></pre>
<p>This probably bears some explaining.  <code>grep</code> is still doing all the real work here.  The main difference from before is that <code>-l</code> tells grep to just list any files it finds which contain a match.</p>
<pre><code>`grep -ril '&lt;freeverse&gt;' ~/p1k3/archives`</code></pre>
<p>Notice those backticks around the grep command?  This part is a little trippier.  It turns out that if you put backticks around something in a command, it&rsquo;ll get executed and replaced with its result, which in turn gets executed as part of the larger command.  So what we&rsquo;re really saying is something like:</p>
<pre><code>$ cat [all of the files in the blog directory with &lt;freeverse&gt; in them]</code></pre>
<p>Did you catch that?  I just wrote a command that rewrote itself as a <em>different</em>, more specific command.  And it appears to have worked on the first try:</p>
<pre><code>$ wc ~/possible_poems_full
 17628  80980 528699 /home/brennen/possible_poems_full</code></pre>
<p>Welcome to wizard school.</p>
<hr />
<h1><a name=programmerthink href=#programmerthink>#</a> 3. programmerthink</h1>
<p>In the <a href="#a-literary-problem">preceding chapter</a>, I worked through accumulating a big piece of text from some other, smaller texts.  I started with a bunch of files and wound up with one big file called <code>possible_poems_full</code>.</p>
<p>Let&rsquo;s talk for a minute about how programmers approach problems like this one.  What I&rsquo;ve just done is sort of an old-school humanities take on things:  Metaphorically speaking, I took a book off the shelf and hauled it down to the copy machine to xerox a bunch of pages, and now I&rsquo;m going to start in on them with a highlighter and some Post-Its or something.  A process like this will often trigger a cascade of questions in the programmer-mind:</p>
<ul><li>What if, halfway through the project, I realize my selection criteria were all wrong and have to backtrack?</li><li>What if I discover corrections that also need to be made in the source documents?</li><li>What if I want to access metadata, like the original location of a file?</li><li>What if I want to quickly re-order the poems according to some new criteria?</li><li>Why am I storing the same text in two different places?</li></ul>

<p>A unifying theme of these questions is that they could all be answered by involving a little more abstraction.</p>
<p style="text-align:center;"> ★</p>
<p>Some kinds of abstraction are so common in the physical world that we can forget they&rsquo;re part of a sophisticated technology.  For example, a good deal of bicycle maintenance can be accomplished with a cheap multi-tool containing a few different sizes of hex wrench and a couple of screwdrivers.</p>
<p>A hex wrench or screwdriver doesn&rsquo;t really know anything about bicycles.  All it <em>really</em> knows about is fitting into a space and allowing torque to be applied.  Standardized fasteners and adjustment mechanisms on a bicycle ensure that the work can be done anywhere, by anyone with a certain set of tools.  Standard tools mean that if you can work on a particular bike, you can work on <em>most</em> bikes, and even on things that aren&rsquo;t bikes at all, but were designed by people with the same abstractions in mind.</p>
<p>The relationship between a wrench, a bolt, and the purpose of a bolt is a lot like something we call <em>indirection</em> in software.  Programs like <code>grep</code> or <code>cat</code> don&rsquo;t really know anything about poetry.  All they <em>really</em> know about is finding lines of text in input, or sticking inputs together.  Files, lines, and text are like standardized fasteners that allow a user who can work on one kind of data (be it poetry, a list of authors, the source code of a program) to use the same tools for other problems and other data.</p>
<p style="text-align:center;"> ★</p>
<p>When I first started writing stuff on the web, I edited a page &mdash; a single HTML file &mdash; by hand.  When the entries on my nascent blog got old, I manually cut-and-pasted them to archive files with names like <code>old_main97.html</code>, which held all of the stuff I&rsquo;d written in 1997.</p>
<p>I&rsquo;m not holding this up as an example of youthful folly.  In fact, it worked fine, and just having a single, static file that you can open in any text editor has turned out to be a <em>lot</em> more future-proof than the sophisticated blogging software people were starting to write at the time.</p>
<p>And yet.  Something about this habit nagged at my developing programmer mind after a few years.  It was just a little bit too manual and repetitive, a little bit silly to have to write things like a table of contents by hand, or move entries around by copy-and-pasting them to different files.  Since I knew the date for each entry, and wanted to make them navigable on that basis, why not define a directory structure for the years and months, and then write a file to hold each day?  That way, all I&rsquo;d have to do is concatenate the files in one directory to display any given month:</p>
<pre><code>$ cat ~/p1k3/archives/2014/1/* | head -10
&lt;h1&gt;Sunday, January 12&lt;/h1&gt;

&lt;h2&gt;the one casey is waiting for&lt;/h2&gt;

&lt;freeverse&gt;
after a while
the thing about drinking
is that it just feeds
what you drink to kill
and kills</code></pre>
<p>I ultimately wound up writing a few thousand lines of Perl to do the actual work, but the essential idea of the thing is still little more than invoking <code>cat</code> on some stuff.</p>
<p>I didn&rsquo;t know the word for it at the time, but what I was reaching for was a kind of indirection.  By putting blog posts in a specific directory layout, I was creating a simple model of the temporal structure that I considered their most important property.  Now, if I want to write commands that ask questions about my blog posts or re-combine them in certain ways, I can address my concerns to this model.  Maybe, for example, I want a rough idea how many words I&rsquo;ve written in blog posts so far in 2014:</p>
<pre><code>$ find ~/p1k3/archives/2014/ -type f | xargs cat | wc -w
6677</code></pre>
<p><code>xargs</code> is not the most intuitive command, but it&rsquo;s useful and common enough to explain here.  At the end of last chapter, when I said:</p>
<pre><code>$ cat `grep -ril '&lt;freeverse&gt;' ~/p1k3/archives` &gt; ~/possible_poems_full</code></pre>
<p>I could also have written this as:</p>
<pre><code>$ grep -ril '&lt;freeverse&gt;' ~/p1k3/archives | xargs cat &gt; ~/possible_poems_full</code></pre>
<p>What this does is take its input, which starts like:</p>
<pre><code>/home/brennen/p1k3/archives/2002/10/16
/home/brennen/p1k3/archives/2002/10/27
/home/brennen/p1k3/archives/2002/10/10</code></pre>
<p>&hellip;and run <code>cat</code> on all the things in it:</p>
<pre><code>cat /home/brennen/p1k3/archives/2002/10/16 /home/brennen/p1k3/archives/2002/10/27 /home/brennen/p1k3/archives/2002/10/10 ...</code></pre>
<p>It can be a better idea to use <code>xargs</code>, because while backticks are incredibly useful, they have some limitations.  If you&rsquo;re dealing with a very large list of files, for example, you might exceed the maximum allowed length for arguments to a command on your system.  <code>xargs</code> is smart enough to know that limit and run <code>cat</code> more than once if needed.</p>
<p><code>xargs</code> is actually sort of a pain to think about, and will make you jump through some irritating hoops if you have spaces or other weirdness in your filenames, but I wind up using it quite a bit.</p>
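<p>One of the more common hoops, at least with the GNU and BSD versions of these tools, is to have <code>find</code> separate filenames with a null character instead of a newline, and tell <code>xargs</code> to expect that.  A sketch, using a throwaway directory and a couple of made-up filenames (nothing here touches the real archive):</p>
<pre><code># Throwaway directory holding two made-up files with spaces in their names.
dir=$(mktemp -d)
touch "$dir/first poem" "$dir/second poem"

# -print0 and -0 agree to use a null byte between names, so the space in
# "first poem" isn't mistaken for the boundary between two arguments.
find "$dir" -type f -print0 | xargs -0 -n1 basename | sort
# first poem
# second poem

rm -r "$dir"</code></pre>
<p>Leave off <code>-print0</code> and <code>-0</code>, and <code>xargs</code> will split each name on the space and hand <code>basename</code> four arguments instead of two.</p>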
<p>Maybe I want to see a table of contents:</p>
<!-- exec -->

<pre><code>$ find ~/p1k3/archives/2014/ -type d | xargs ls -v | head -10
/home/brennen/p1k3/archives/2014/:
1
2
3
4

/home/brennen/p1k3/archives/2014/1:
5
12
14</code></pre>
<!-- end -->

<p>Or find the subtitles I used in 2012:</p>
<!-- exec -->

<pre><code>$ find ~/p1k3/archives/2012/ -type f | xargs perl -ne 'print "$1\n" if m{&lt;h2&gt;(.*?)&lt;/h2&gt;}'
pursuit
fragment
this poem again
i'll do better next time
timebinding animals
more observations on gear nerdery &amp;amp; utility fetishism
thrift
A miracle, in fact, means work
&lt;em&gt;technical notes for late october&lt;/em&gt;, or &lt;em&gt;it gets dork out earlier these days&lt;/em&gt;
radio
light enough to travel
12:06am
"figures like Heinlein and Gingrich"</code></pre>
<!-- end -->

<p>The crucial thing about this is that the filesystem <em>itself</em> is just like <code>cat</code> and <code>grep</code>:  It doesn&rsquo;t know anything about blogs (or poetry), and it&rsquo;s basically indifferent to the actual <em>structure</em> of a file like <code>~/p1k3/archives/2014/1/12</code>.  What the filesystem knows is that there are files with certain names in certain places.  It need not know anything about the <em>meaning</em> of those names in order to be useful; in fact, it&rsquo;s best if it stays agnostic about the question, for this enables us to assign our own meaning to a structure and manipulate that structure with standard tools.</p>
<p style="text-align:center;"> ★</p>
<p>Back to the problem at hand:  I have this collection of files, and I know how to extract the ones that contain poems.  My goal is to see all the poems and collect the subset of them that I still find worthwhile.  Just knowing how to grep and then edit a big file solves my problem, in a basic sort of way.  And yet: Something about this nags at my mind.  I find that, just as I can already use standard tools and the filesystem to ask questions about all of my blog posts in a given year or month, I would like to be able to ask questions about the set of interesting poems.</p>
<p>If I want the freedom to execute many different sorts of commands against this set of poems, it begins to seem that I need a model.</p>
<p>When programmers talk about models, they often mean something that people in the sciences would recognize:  We find ways to represent the arrangement of facts so that we can think about them.  A structured representation of things often means that we can <em>change</em> those things, or at least derive new understanding of them.</p>
<p style="text-align:center;"> ★</p>
<p>At this point in the narrative, I could pretend that my next step is immediately obvious, but in fact it&rsquo;s not.  I spend a couple of days thinking off and on about how to proceed, scribbling notes during bus rides and while drinking beers at the pizza joint down the street.  I assess and discard ideas which fall into a handful of broad approaches:</p>
<ul><li>Store blog entries in a relational database system which would allow me to associate them with data like &ldquo;this entry is in a collection called &lsquo;ok poems&rsquo;&rdquo;.</li><li>Selectively build up a file containing the list of files with ok poems, and use it to do other tasks.</li><li>Define a format for metadata that lives within entry files.</li><li>Turn each interesting file into a directory of its own which contains a file with the original text and another file with metadata.</li></ul>

<p>I discard the relational database idea immediately:  I like working with files, and I don&rsquo;t feel like abandoning a model that&rsquo;s served me well for my entire adult life.</p>
<p>Building up an index file to point at the other files I&rsquo;m working with has a certain appeal.  I&rsquo;m already most of the way there with the <code>grep</code> output in <code>possible_poems</code>.  It would be easy to write shell commands to add, remove, sort, and search entries.  Still, it doesn&rsquo;t feel like a very satisfying solution unto itself.  I&rsquo;d like to know that an entry is part of the collection just by looking at the entry, without having to cross-reference it to a list somewhere else.</p>
<p>What about putting some meaningful text in the file itself?  I thought about a bunch of different ways to do this, some of them really complicated, and eventually arrived at this:</p>
<pre><code>&lt;!-- collection: ok-poems --&gt;</code></pre>
<p>The <code>&lt;!-- --&gt;</code> bits are how you define a comment in HTML, which means that neither my blog code nor web browsers nor my text editor have to know anything about the format, but I can easily find files with certain values.  Check it:</p>
<pre><code>$ find ~/p1k3/archives -type f | xargs perl -ne 'print "$ARGV: $1 -&gt; $2\n" if m{&lt;!-- ([a-z]+): (.*?) --&gt;};'
/home/brennen/p1k3/archives/2014/2/9: collection -&gt; ok-poems</code></pre>
<p>That&rsquo;s an ugly one-liner, and I haven&rsquo;t explained half of what it does, but the comment format actually seems pretty workable for this.  It&rsquo;s a little tacky to look at, but it&rsquo;s simple and searchable.</p>
<p>Before we settle, though, let&rsquo;s turn to the notion of making each entry into a directory that can contain some structured metadata in a separate file.  Imagine something like:</p>
<pre><code>$ ls ~/p1k3/archives/2013/2/9
index  Meta</code></pre>
<p>Here I use the name &ldquo;index&rdquo; for the main part of the entry because it&rsquo;s a convention of web sites for the top-level page in a directory to be called something like <code>index.html</code>.  As it happens, my blog software already supports this kind of file layout for entries which contain multiple parts, image files, and so forth.</p>
<pre><code>$ head ~/p1k3/archives/2013/2/9/index
&lt;h1&gt;saturday, february 9&lt;/h1&gt;

&lt;freeverse&gt;
midwinter midafternoon; depressed as hell
sitting in a huge cabin in the rich-people mountains
writing a sprawl, pages, of melancholic midlife bullshit

outside the snow gives way to broken clouds and the
clear unyielding light of the high country sun fills

$ cat ~/p1k3/archives/2013/2/9/Meta
collection: ok-poems</code></pre>
<p>It would then be easy to <code>find</code> files called <code>Meta</code> and grep them for <code>collection: ok-poems</code>.</p>
<p>What if I put metadata right in the filename itself, and dispense with the grep altogether?</p>
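<p>Something along these lines, say.  This is a sketch rather than anything I actually ran against the blog: it builds a throwaway stand-in for the archive with two entries, and the <code>drafts</code> collection is made up purely so there&rsquo;s something that <em>shouldn&rsquo;t</em> match:</p>
<pre><code># Stand-in for ~/p1k3/archives: two entry directories, each with a Meta file.
archives=$(mktemp -d)
mkdir -p "$archives/2013/2/9" "$archives/2013/2/10"
echo 'collection: ok-poems' &gt; "$archives/2013/2/9/Meta"
echo 'collection: drafts'   &gt; "$archives/2013/2/10/Meta"   # hypothetical value

# Find every Meta file, then have grep -l list only the ones that match.
find "$archives" -name 'Meta' | xargs grep -l 'collection: ok-poems'

rm -r "$archives"</code></pre>
<p>The only line of output should be the path to the <code>Meta</code> file under <code>2013/2/9</code>.</p>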
<pre><code>$ ls ~/p1k3/archives/2013/2/9
index  meta-ok-poem

$ find ~/p1k3/archives -name 'meta-ok-poem'
/home/brennen/p1k3/archives/2013/2/9/meta-ok-poem</code></pre>
<p>There&rsquo;s a lot to like about this.  For one thing, it&rsquo;s immediately visible in a directory listing.  For another, it doesn&rsquo;t require searching through thousands of lines of text to extract a specific string.  If a directory has a <code>meta-ok-poem</code> in it, I can be pretty sure that it will contain an interesting <code>index</code>.</p>
<p>What are the downsides?  Well, it requires transforming lots of text files into directories-containing-files.  I might automate that process, but it&rsquo;s still a little tedious and it makes the layout of the entry archive more complicated overall.  There&rsquo;s a cost to doing things this way.  It lets me extend my existing model of a blog entry to include arbitrary metadata, but it also adds steps to writing or finding blog entries.</p>
<p>Abstractions usually cost you something.  Is this one worth the hassle?  Sometimes the best way to answer that question is to start writing code that handles a given abstraction.</p>
<hr />
<h1><a name=script href=#script>#</a> 4. script</h1>
<p>Back in chapter 1, I said that &ldquo;the way you use the computer is often just to write little programs that invoke other programs&rdquo;.  In fact, we&rsquo;ve already gone over a bunch of these.  Grepping through the text of a previous chapter should pull up some good examples:</p>
<!-- exec -->

<pre><code>$ grep -E '\$ [a-z]+.*\| ' ../literary_environment/index.md
    $ sort authors_* | uniq -c
    $ sort authors_* | uniq &gt; ./all_authors
    $ find ~/p1k3/archives/2010/11 -regextype egrep -regex '.*([0-9]+|index)' -type f | xargs wc -w | tail -1
    $ sort authors_* | uniq | wc -l
    $ sort colors | uniq -i | tail -1
    $ cut -d' ' -f1 ./authors_* | sort | uniq -ci | sort -n | tail -3
    $ sort -u ./authors_* | cut -d' ' -f1 | uniq -ci | sort -n | tail -3
    $ sort -k1 all_authors.tsv | expand -t14
    $ paste firstnames lastnames | sort -k2 | expand -t12
    $ cat ./authors_* | grep 'Vanessa'</code></pre>
<!-- end -->

<p>None of these one-liners do all that much, but they all take input of one sort or another and apply one or more transformations to it.  They&rsquo;re little formal sentences describing how to make one thing into another, which is as good a definition of programming as most.  Or at least this is a good way to describe programming-in-the-small.  (A lot of the programs we use day-to-day are more like essays, novels, or interminable Fantasy series where every character you like dies horribly than they are like individual sentences.)</p>
<p>One-liners like these are all well and good when you&rsquo;re staring at a terminal, trying to figure something out - but what about when you&rsquo;ve already figured it out and you want to repeat it in the future?</p>
<p>It turns out that Bash has you covered.  Since shell commands are just text, they can live in a text file as easily as they can be typed.</p>
<h2><a name=learn-you-an-editor href=#learn-you-an-editor>#</a> learn you an editor</h2>
<p>We&rsquo;ve skirted the topic so far, but now that we&rsquo;re talking about writing out text files in earnest, you&rsquo;re going to want a text editor.</p>
<p>My editor is where I spend most of my time that isn&rsquo;t in a web browser, because it&rsquo;s where I write both code and prose.  It turns out that the features which make a good code editor overlap a lot with the ones that make a good editor of English sentences.</p>
<p>So what should you use?  Well, there have been other contenders in recent years, but in truth nothing comes close to dethroning the Great Old Ones of text editing.  Emacs is a creature both primal and sophisticated, like an avatar of some interstellar civilization that evolved long before multicellular life existed on earth and seeded the galaxy with incomprehensible artefacts and colossal engineering projects.  Vim is like a lovable chainsaw-studded robot with the most elegant keyboard interface in history secretly emblazoned on its shining diamond heart.</p>
<p>It&rsquo;s worth the time it takes to learn one of the serious editors, but there are easier places to start.  Nano, for example, is easy to pick up, and should be available on most systems.  To start it, just say:</p>
<pre><code>$ nano file</code></pre>
<p>You should see something like this:</p>
<p style="text-align:center;"> <img src="images/nano.png" alt="nano" /></p>
<p>Arrow keys will move your cursor around, and typing stuff will make it appear in the file.  This is pretty much like every other editor you&rsquo;ve ever used.  If you haven&rsquo;t used Nano before, that stuff along the bottom of the terminal is a reference to the most commonly used commands.  <code>^</code> is a convention for &ldquo;Ctrl&rdquo;, so <code>^O</code> means Ctrl-o (the case of the letter doesn&rsquo;t actually matter), which will save the file you&rsquo;re working on.  Ctrl-x will quit, which is probably the first important thing to know about any given editor.</p>
<h2><a name=d-i-y-utilities href=#d-i-y-utilities>#</a> d.i.y. utilities</h2>
<p>So back to putting commands in text files.  Here&rsquo;s a file I just created in my editor:</p>
<!-- exec -->

<pre><code>$ cat okpoems
#!/bin/bash

# find all the marker files and get the name of
# the directory containing each
find ~/p1k3/archives -name 'meta-ok-poem' | xargs -n1 dirname

exit 0</code></pre>
<!-- end -->

<p>This is known as a script.  There are a handful of things to notice here.  First, there&rsquo;s this fragment:</p>
<pre><code>#!/bin/bash</code></pre>
<p>The <code>#!</code> right at the beginning, followed by the path to a program, is a special sequence that lets the kernel know what program should be used to interpret the contents of the file.  <code>/bin/bash</code> is the path on the filesystem where Bash itself lives.  You might see this referred to as a shebang or a hash bang.</p>
<p>Lines that start with a <code>#</code> are comments, used to describe the code to a human reader.  The <code>exit 0</code> tells Bash that the currently running script should exit with a status of 0, which basically means &ldquo;nothing went wrong&rdquo;.</p>
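<p>To see an exit status for yourself, here&rsquo;s a minimal sketch.  It leans on two standard commands that aren&rsquo;t part of <code>okpoems</code>: <code>true</code>, which always succeeds, and <code>false</code>, which always fails.  Bash keeps the status of the most recent command in the special variable <code>$?</code>:</p>
<pre><code>#!/bin/bash

# $? holds the exit status of the command that just finished.
true
echo "true exited with $?"

false
echo "false exited with $?"</code></pre>
<p>Run as written, this should report a 0 for <code>true</code> and a 1 for <code>false</code>.</p>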
<p>If you examine the directory listing for <code>okpoems</code>, you&rsquo;ll see something important:</p>
<!-- exec -->

<pre><code>$ ls -l okpoems
-rwxrwxr-x 1 brennen brennen 163 Apr 19 00:08 okpoems</code></pre>
<!-- end -->

<p>That looks pretty cryptic.  For the moment, just remember that those little <code>x</code>s in the first bit mean that the file has been marked e<strong>x</strong>ecutable.  We accomplish this by saying something like:</p>
<pre><code>$ chmod +x ./okpoems</code></pre>
<p>Once that&rsquo;s done, it and the shebang line in combination mean that typing <code>./okpoems</code> will have the same effect as typing <code>bash okpoems</code>:</p>
<!-- exec -->

<pre><code>$ ./okpoems
/home/brennen/p1k3/archives/2013/2/9
/home/brennen/p1k3/archives/2012/3/17
/home/brennen/p1k3/archives/2012/3/26</code></pre>
<!-- end -->
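
<p>If you&rsquo;d rather not squint at <code>ls -l</code> output, the shell can check the executable bit for you.  This is a small sketch rather than anything from <code>okpoems</code> itself: it assumes <code>mktemp</code> is available to make a throwaway file, and uses the <code>-x</code> test to ask whether that file is executable before and after <code>chmod +x</code>:</p>
<pre><code>#!/bin/bash

# Make a scratch file to experiment on.  (mktemp creates it without
# any execute permission.)
SCRATCH=$(mktemp)

if [ -x "$SCRATCH" ]; then
  echo "before chmod: executable"
else
  echo "before chmod: not executable"
fi

chmod +x "$SCRATCH"

if [ -x "$SCRATCH" ]; then
  echo "after chmod: executable"
else
  echo "after chmod: not executable"
fi

# Clean up after ourselves:
rm "$SCRATCH"</code></pre>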

<h2><a name=heavy-lifting href=#heavy-lifting>#</a> heavy lifting</h2>
<p><code>okpoems</code> demonstrates the basics, but it doesn&rsquo;t do very much.  Here&rsquo;s a script with a little more substance to it:</p>
<!-- exec -->

<pre><code>$ cat markpoem
#!/bin/bash

# $1 is the first parameter to our script
POEM=$1

# Complain and exit if we weren't given a path:
if [ ! $POEM ]; then
  echo 'usage: markpoem &lt;path&gt;'

  # Confusingly, an exit status of 0 means to the shell that everything went
  # fine, while any other number means that something went wrong.
  exit 64
fi

if [ ! -e $POEM ]; then
  echo "$POEM not found"
  exit 66
fi

echo "marking $POEM an ok poem"

POEM_BASENAME=$(basename $POEM)

# If the target is a plain file instead of a directory, make it into
# a directory and move the content into $POEM/index:
if [ -f $POEM ]; then
  echo "making $POEM into a directory, moving content to"
  echo "  $POEM/index"
  TEMPFILE="/tmp/$POEM_BASENAME.$(date +%s.%N)"
  mv $POEM $TEMPFILE
  mkdir $POEM
  mv $TEMPFILE $POEM/index
fi

if [ -d $POEM ]; then
  # touch(1) will either create the file or update its timestamp:
  touch $POEM/meta-ok-poem
else
  echo "something broke - why isn't $POEM a directory?"
  file $POEM
fi

# Signal that all is copacetic:
echo kthxbai
exit 0</code></pre>
<!-- end -->

<p>Both of these scripts are imperfect, but they were quick to write, they&rsquo;re made out of standard commands, and I don&rsquo;t yet hate myself for them:  All signs that I&rsquo;m not totally on the wrong track with the <code>meta-ok-poem</code> abstraction, and could live with it as part of an ongoing writing project.  <code>okpoems</code> and <code>markpoem</code> would also be easy to use with custom keybindings in my editor.  In a few more lines of code, I can build a system to wade through the list of candidate files and quickly mark the interesting ones.</p>
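<p>One likely imperfection is worth a quick look: <code>markpoem</code> uses <code>$POEM</code> without quotes, so a path containing a space would probably get split into separate words.  Here&rsquo;s a minimal sketch of the difference, using a made-up path (<code>/tmp/ok poem</code>) that isn&rsquo;t part of the scripts above:</p>
<pre><code>#!/bin/bash

# A made-up path with a space in it:
POEM="/tmp/ok poem"

# Unquoted, the shell splits $POEM into two words, so printf
# prints two lines:
printf '%s\n' $POEM

# Quoted, it stays one word, and printf prints one line:
printf '%s\n' "$POEM"</code></pre>
<p>Wrapping each use of the variable in double quotes would be a reasonable first fix, at the cost of a little more punctuation.</p>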
<h2><a name=generality href=#generality>#</a> generality</h2>
<p>So what&rsquo;s lacking here?  Well, probably a bunch of things, feature-wise.  I can imagine writing a script to unmark a poem, for example.  That said, there&rsquo;s one really glaring problem.  &ldquo;Ok poem&rdquo; is only one kind of property a blog entry might possess.  Suppose I wanted a way to express that a poem is terrible?</p>
<p>It turns out I already know how to add properties to an entry.  If I generalize just a little, the tools become much more flexible.</p>
<!-- exec -->

<pre><code>$ ./addprop /home/brennen/p1k3/archives/2012/3/26 meta-terrible-poem
marking /home/brennen/p1k3/archives/2012/3/26 with meta-terrible-poem
kthxbai</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ ./findprop meta-terrible-poem
/home/brennen/p1k3/archives/2012/3/26</code></pre>
<!-- end -->

<p><code>addprop</code> is only a little different from <code>markpoem</code>.  It takes two parameters instead of one - the target entry and a property to add.</p>
<!-- exec -->

<pre><code>$ cat addprop
#!/bin/bash

ENTRY=$1
PROPERTY=$2

# Complain and exit if we weren't given a path and a property:
if [[ ! $ENTRY || ! $PROPERTY ]]; then
  echo "usage: addprop &lt;path&gt; &lt;property&gt;"
  exit 64
fi

if [ ! -e $ENTRY ]; then
  echo "$ENTRY not found"
  exit 66
fi

echo "marking $ENTRY with $PROPERTY"

# If the target is a plain file instead of a directory, make it into
# a directory and move the content into $ENTRY/index:
if [ -f $ENTRY ]; then
  echo "making $ENTRY into a directory, moving content to"
  echo "  $ENTRY/index"

  # Get a safe temporary file:
  TEMPFILE=`mktemp`

  mv $ENTRY $TEMPFILE
  mkdir $ENTRY
  mv $TEMPFILE $ENTRY/index
fi

if [ -d $ENTRY ]; then
  touch $ENTRY/$PROPERTY
else
  echo "something broke - why isn't $ENTRY a directory?"
  file $ENTRY
fi

echo kthxbai
exit 0</code></pre>
<!-- end -->

<p>Meanwhile, <code>findprop</code> is more or less <code>okpoems</code>, but with a parameter for the property to find:</p>
<!-- exec -->

<pre><code>$ cat findprop
#!/bin/bash

if [ ! $1 ]
then
  echo "usage: findprop &lt;property&gt;"
  exit
fi

# find all the marker files and get the name of
# the directory containing each
find ~/p1k3/archives -name $1 | xargs -n1 dirname

exit 0</code></pre>
<!-- end -->

<p>These scripts aren&rsquo;t much more complicated than their poem-specific counterparts, but now they can be used to solve problems I haven&rsquo;t even thought of yet, and included in other scripts that need their functionality.</p>
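<p>The unmarking script imagined earlier is one such problem.  What follows is only a sketch of how it might look: the name <code>delprop</code>, its usage message, and its choice of exit codes are assumptions modeled on <code>addprop</code>, not an existing tool.</p>
<pre><code>#!/bin/bash

# Hypothetical companion to addprop: remove a property from an entry.
ENTRY=$1
PROPERTY=$2

# Complain and exit if we weren't given a path and a property:
if [[ ! $ENTRY || ! $PROPERTY ]]; then
  echo "usage: delprop PATH PROPERTY"
  exit 64
fi

# Nothing to do if the marker file isn't there:
if [ ! -e "$ENTRY/$PROPERTY" ]; then
  echo "$ENTRY has no $PROPERTY"
  exit 66
fi

echo "removing $PROPERTY from $ENTRY"
rm "$ENTRY/$PROPERTY"

echo kthxbai
exit 0</code></pre>
<p>Saved as <code>delprop</code> and marked executable, it would be called the same way as <code>addprop</code>: a path first, then a property name.</p>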
<hr />
<h1><a name=general-purpose-programmering href=#general-purpose-programmering>#</a> 5. general purpose programmering</h1>
<p>I didn&rsquo;t set out to write a book about programming, <em>as such</em>, but because programming and the command line are so inextricably linked, this text draws near the subject almost of its own accord.</p>
<p>If you&rsquo;re not terribly interested in programming, this chapter can easily enough be skipped.  It&rsquo;s more in the way of philosophical rambling than concrete instruction, and will be of most use to those with an existing background in writing code.</p>
<p style="text-align:center;"> ✢</p>
<p>If you&rsquo;ve used computers for more than a few years, you&rsquo;re probably viscerally aware that most software is fragile and most systems decay.  In the time since I took my first tentative steps into the little world of a computer (a friend&rsquo;s dad&rsquo;s unidentifiable gaming machine, my own father&rsquo;s blue monochrome Zenith laptop, the Apple II) the churn has been overwhelming.  By now I&rsquo;ve learned my way around vastly more software &mdash; operating systems, programming languages and development environments, games, editors, chat clients, mail systems &mdash; than I presently could use if I wanted to.  Most of it has gone the way of some ancient civilization, surviving (if at all) only in faint, half-understood cultural echoes and occasional museum-piece displays.  Every user of technology becomes, in time, a refugee from an irretrievably recent past.</p>
<p>And yet, despite all this, the shell endures.  Most of the ideas in this book are older than I am.  Most of them could have been applied in 1994 or thereabouts, when I first logged on to multiuser systems running AT&amp;T Unix.  Since the early 1990s, systems built on a fundamental substrate of Unix-like behavior and abstractions have proliferated wildly, becoming foundational at once to the modern web, the ecosystem of free and open software, and the technological dominance ca. 2014 of companies like Apple, Google, and Facebook.</p>
<p>Why is this, exactly?</p>
<p style="text-align:center;"> ✣</p>
<p>As I&rsquo;ve said (and hopefully shown), the commands you write in your shell are essentially little programs.  Like other programs, they can be stored for later use and recombined with other commands, creating new uses for your ideas.</p>
<p>It would be hard to say that there&rsquo;s any <em>one</em> reason command line environments remain so vital after decades of evolution and hard-won refinement in computer interfaces, but it seems like this combinatory nature is somewhere near the heart of it.  The command line often lacks the polish of other interfaces we depend on, but in exchange it offers a richness and freedom of expression rarely seen elsewhere, and invites its users to build upon its basic facilities.</p>
<p>What is it that makes last chapter&rsquo;s <code>addprop</code> preferable to the more specific <code>markpoem</code>?  Let&rsquo;s look at an alternative implementation of <code>markpoem</code>:</p>
<!-- exec -->

<pre><code>$ cat simple_markpoem
#!/bin/bash

addprop $1 meta-ok-poem</code></pre>
<!-- end -->

<p>Is this script trivial?  Absolutely.  It&rsquo;s so trivial that it barely seems to exist, because I already wrote <code>addprop</code> to do all the heavy lifting and play well with others, freeing us to imagine new uses for its central idea without worrying about the implementation details.</p>
<p>Unlike <code>markpoem</code>, <code>addprop</code> doesn&rsquo;t know anything about poetry.  All it knows about, in fact, is putting a file (or three) in a particular place.  And this is in keeping with a basic insight of Unix:  Pieces of software that do one very simple thing generalize well.  Good command line tools are like a hex wrench, a hammer, a utility knife:  They embody knowledge of turning, of striking, of cutting &mdash; and with this kind of knowledge at hand, the user can change the world even though no individual tool is made with complete knowledge of the world as a whole.  There&rsquo;s a lot of power in the accumulation of small competencies.</p>
<p>Of course, if your code is only good at one thing, to be of any use, it has to talk to code that&rsquo;s good at other things.  There&rsquo;s another basic insight in the Unix tradition:  Tools should be composable.  All those little programs have to share some assumptions, have to speak some kind of trade language, in order to combine usefully.  Which is how we&rsquo;ve arrived at standard IO, pipelines, filesystems, and text as a lowest-common-denominator medium of exchange.  If you think about most of these things, they have some very rough edges, but they give otherwise simple tools ways to communicate without becoming super-complicated along the way.</p>
<p style="text-align:center;"> ✤</p>
<p>What is the command line?</p>
<p>The command line is an environment of tool use.</p>
<p>So are kitchens, workshops, libraries, and programming languages.</p>
<p style="text-align:center;"> ✥</p>
<p>Here&rsquo;s a confession:  I don&rsquo;t like writing shell scripts very much, and I can&rsquo;t blame anyone else for feeling the same way.</p>
<p>That doesn&rsquo;t mean you shouldn&rsquo;t <em>know</em> about them, or that you shouldn&rsquo;t <em>write</em> them.  I write little ones all the time, and the ability to puzzle through other people&rsquo;s scripts comes in handy.  Oftentimes, the best, most tasteful way to automate something is to build a script out of the commonly available commands.  The standard tools are already there on millions of machines.  Many of them have been pretty well understood for a generation, and most will probably be around for a generation or three to come.  They do neat stuff.  Scripts let you build on ideas you&rsquo;ve already worked out, and give repeatable operations a memorable, user-friendly name.  They encourage reuse of existing programs, and help express your ideas to people who&rsquo;ll come after you.</p>
<p>One of the reliable markers of powerful software is that it can be scripted: It extends to its users some of the same power that its authors used in creating it.  Scriptable software is to some extent <em>living</em> software.  It&rsquo;s a book that you, the reader, get to help write.</p>
<p>In all these ways, shell scripts are wonderful, a little bit magical, and quietly indispensable to the machinery of modern civilization.</p>
<p>Unfortunately, in all the ways that a shell like Bash is weird, finicky, and covered in 40 years of incidental cruft, long-form Bash scripts are even worse.  Bash is a useful glue language, particularly if you&rsquo;re already comfortable wiring commands together.  Syntactic and conceptual innovations like pipes are beautiful and necessary.  What Bash is <em>not</em>, despite its power, is a very good general purpose programming language.  It&rsquo;s just not especially good at things like math, or complex data structures, or not looking like a punctuation-heavy variety of alphabet soup.</p>
<p>It turns out that there&rsquo;s a threshold of complexity beyond which life becomes easier if you switch from shell scripting to a more robust language.  Just where this threshold is located varies a lot between users and problems, but I often think about switching languages before a script gets bigger than I can view on my screen all at once.  <code>addprop</code> is a good example:</p>
<!-- exec -->

<pre><code>$ wc -l ../script/addprop
41 ../script/addprop</code></pre>
<!-- end -->

<p>41 lines is a touch over what fits on one screen in the editor I usually use.  If I were going to add much in the way of features, I&rsquo;d think pretty hard about porting it to another language first.</p>
<p>What&rsquo;s cool is that if you know a language like C, Python, Perl, Ruby, PHP, or JavaScript, your code can participate in the shell environment as a first class citizen simply by respecting the conventions of standard IO, files, and command line arguments.  Often, in order to create a useful utility, it&rsquo;s only necessary to deal with <code>STDIN</code>, or operate on a particular sort of file, and most languages offer simple conventions for doing these things.</p>
<p style="text-align:center;"> *</p>
<p>I think the shell can be taught and understood as a humane environment, despite all of its ugliness and complication, because it offers the materials of its own construction to its users, whatever their concerns.  The writer, the philosopher, the scientist, the programmer:  Files and text and pipes know little enough about these things, but in their very indifference to the specifics of any one complex purpose, they&rsquo;re adaptable to the basic needs of many.  Simple utilities which enact simple kinds of knowledge survive and recombine because there is a wisdom to be found in small things.</p>
<p>Files and text know nothing about poetry, nothing in particular of the human soul.  Neither do pen and ink, printing presses or codex books, but somehow we got Shakespeare and Montaigne.</p>
<hr />
<h1><a name=one-of-these-things-is-not-like-the-others href=#one-of-these-things-is-not-like-the-others>#</a> 6. one of these things is not like the others</h1>
<p>If you&rsquo;re the sort of person who took a few detours into the history of religion in college, you might be familiar with some of the ways people used to do textual comparison.  When pen, paper, and typesetting were what scholars had to work with, they did some fairly sophisticated things in order to expose the relationships between multiple pieces of text.</p>
<p style="text-align:center;"> <img src="images/throckmorton_small.jpg" height=320 width=470></p>
<p>Here&rsquo;s a book I got in college:  <em>Gospel Parallels: A Comparison of the Synoptic Gospels</em>, Burton H. Throckmorton, Jr., Ed.  It breaks up three books from the New Testament by the stories and themes that they contain, and shows the overlapping sections of each book that contain parallel texts.  You can work your way through and see what parts only show up in one book, or in two but not the other, or in all three.  Pages are arranged like so:</p>
<pre>                 § JESUS DOES SOME STUFF
     ________________________________________________
    |  MAT            |    MAR             |  LUK    |
    |-----------------+--------------------+---------|
    | Stuff           |                    |         |
    |                 | Stuff              |         |
    |                 | Stuff              | Stuff   |
    |                 | Stuff              |         |
    |                 | Stuff              |         |
    |                 |                    |         |</pre>

<p>The way I understand it, a book like this one only scratches the surface of the field.  Tools like this support a lot of theory about which books copied each other and how, and what other sources they might have copied that we&rsquo;ve since lost.</p>
<p>This is some <em>incredibly</em> dry material, even if you kind of dig thinking about the questions it addresses.  It takes a special temperament to actually sit poring over fragmentary texts in ancient languages and do these painstaking comparisons.  Even if you&rsquo;re a writer or editor and work with a lot of revisions of a text, there&rsquo;s a good chance you rarely do this kind of comparison on your own work, because that shit is <em>tedious</em>.</p>
<h2><a name=diff href=#diff>#</a> diff</h2>
<p>It turns out that academics aren&rsquo;t the only people who need tools for comparing different versions of a text.  Working programmers, in fact, need to do this <em>constantly</em>.  Programmers are also happiest when putting off the <em>actual</em> task at hand to solve some incidental problem that cropped up along the way, so by now there are a lot of ways to say &ldquo;here&rsquo;s how this file is different from this file&rdquo;, or &ldquo;here&rsquo;s how this file is different from itself a year ago&rdquo;.</p>
<p>Let&rsquo;s look at a couple of shell scripts from an earlier chapter:</p>
<!-- exec -->

<pre><code>$ cat ../script/okpoems
#!/bin/bash

# find all the marker files and get the name of
# the directory containing each
find ~/p1k3/archives -name 'meta-ok-poem' | xargs -n1 dirname

exit 0</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ cat ../script/findprop
#!/bin/bash

if [ ! $1 ]
then
  echo "usage: findprop &lt;property&gt;"
  exit
fi

# find all the marker files and get the name of
# the directory containing each
find ~/p1k3/archives -name $1 | xargs -n1 dirname

exit 0</code></pre>
<!-- end -->

<p>It&rsquo;s pretty obvious these are similar files, but do we know what <em>exactly</em> changed between them at a glance?  It wouldn&rsquo;t be hard to figure out, once.  If you wanted to be really certain about it, you could print them out, set them side by side, and go over them with a highlighter.</p>
<p>Now imagine doing that for a bunch of files, some of them hundreds or thousands of lines long.  I&rsquo;ve actually done that before, colored markers and all, but I didn&rsquo;t feel smart while I was doing it.  This is a job for software.</p>
<!-- exec -->

<pre><code>$ diff ../script/okpoems ../script/findprop
2a3,8
&gt; if [ ! $1 ]
&gt; then
&gt;   echo "usage: findprop &lt;property&gt;"
&gt;   exit
&gt; fi
&gt; 
5c11
&lt; find ~/p1k3/archives -name 'meta-ok-poem' | xargs -n1 dirname
---
&gt; find ~/p1k3/archives -name $1 | xargs -n1 dirname</code></pre>
<!-- end -->

<p>That&rsquo;s not the most human-friendly output, but it&rsquo;s a little simpler than it seems at first glance.  It&rsquo;s basically just a way of describing the changes needed to turn <code>okpoems</code> into <code>findprop</code>.  The string <code>2a3,8</code> can be read as &ldquo;at line 2, add lines 3 through 8&rdquo;.  Lines with a <code>&gt;</code> in front of them are added.  <code>5c11</code> can be read as &ldquo;line 5 in the original file becomes line 11 in the new file&rdquo;, and the <code>&lt;</code> line is replaced with the <code>&gt;</code> line.  If you wanted, you could take a copy of the original file and apply these instructions by hand in your text editor, and you&rsquo;d wind up with the new file.</p>
<p>A lot of people (me included) prefer what&rsquo;s known as a &ldquo;unified&rdquo; diff, because it&rsquo;s easier to read and offers context for the changed lines.  We can ask for one of these with <code>diff -u</code>:</p>
<!-- exec -->

<pre><code>$ diff -u ../script/okpoems ../script/findprop
--- ../script/okpoems   2014-04-19 00:08:03.321230818 -0600
+++ ../script/findprop  2014-04-21 21:51:29.360846449 -0600
@@ -1,7 +1,13 @@
 #!/bin/bash
 
+if [ ! $1 ]
+then
+  echo "usage: findprop &lt;property&gt;"
+  exit
+fi
+
 # find all the marker files and get the name of
 # the directory containing each
-find ~/p1k3/archives -name 'meta-ok-poem' | xargs -n1 dirname
+find ~/p1k3/archives -name $1 | xargs -n1 dirname
 
 exit 0</code></pre>
<!-- end -->

<p>That&rsquo;s a little longer, and has some metadata we might not always care about, but if you look for lines starting with <code>+</code> and <code>-</code>, it&rsquo;s easy to read as &ldquo;added these, took away these&rdquo;.  This diff tells us at a glance that we added some lines to complain if we didn&rsquo;t get a command line argument, and replaced <code>'meta-ok-poem'</code> in the <code>find</code> command with that argument.  Since it shows us some context, we have a pretty good idea where those lines are in the file and what they&rsquo;re for.</p>
<p>What if we don&rsquo;t care exactly <em>how</em> the files differ, but only whether they do?</p>
<!-- exec -->

<pre><code>$ diff -q ../script/okpoems ../script/findprop
Files ../script/okpoems and ../script/findprop differ</code></pre>
<!-- end -->

<p>I use <code>diff</code> a lot in the course of my day job, because I spend a lot of time needing to know just how two programs differ.  Just as importantly, I often need to know how (or whether!) the <em>output</em> of programs differs.  As a concrete example, I want to make sure that <code>findprop meta-ok-poem</code> is really a suitable replacement for <code>okpoems</code>.  Since I expect their output to be identical, I can do this:</p>
<!-- exec -->

<pre><code>$ ../script/okpoems &gt; okpoem_output</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ ../script/findprop meta-ok-poem &gt; findprop_output</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ diff -s okpoem_output findprop_output
Files okpoem_output and findprop_output are identical</code></pre>
<!-- end -->

<p>The <code>-s</code> just means that <code>diff</code> should explicitly tell us if files are the <strong>s</strong>ame.  Otherwise, it&rsquo;d output nothing at all, because there aren&rsquo;t any differences.</p>
<p>As with many other tools, <code>diff</code> doesn&rsquo;t very much care whether it&rsquo;s looking at shell scripts or a list of filenames or what-have-you.  If you read the man page, you&rsquo;ll find some features geared towards people writing C-like programming languages, but its real specialty is just text files with lines made out of characters, which works well for lots of code, but certainly could be applied to English prose.</p>
<p>Since I have a couple of versions ready to hand, let&rsquo;s apply this to a text with some well-known variations and a bit of a literary legacy.  Here&rsquo;s the first day of the Genesis creation narrative in a couple of English translations:</p>
<!-- exec -->

<pre><code>$ cat genesis_nkj
In the beginning God created the heavens and the earth.  The earth was without
form, and void; and darkness was on the face of the deep.  And the Spirit of
God was hovering over the face of the waters.  Then God said, "Let there be
light"; and there was light.  And God saw the light, that it was good; and God
divided the light from the darkness.  God called the light Day, and the darkness
He called Night.  So the evening and the morning were the first day.</code></pre>
<!-- end -->


<!-- exec -->

<pre><code>$ cat genesis_nrsv
In the beginning when God created the heavens and the earth, the earth was a
formless void and darkness covered the face of the deep, while a wind from
God swept over the face of the waters.  Then God said, "Let there be light";
and there was light.  And God saw that the light was good; and God separated
the light from the darkness.  God called the light Day, and the darkness he
called Night.  And there was evening and there was morning, the first day.</code></pre>
<!-- end -->

<p>What happens if we diff them?</p>
<!-- exec -->

<pre><code>$ diff -u genesis_nkj genesis_nrsv
--- genesis_nkj 2014-05-11 16:28:29.692508461 -0600
+++ genesis_nrsv    2014-05-11 16:28:29.744508459 -0600
@@ -1,6 +1,6 @@
-In the beginning God created the heavens and the earth.  The earth was without
-form, and void; and darkness was on the face of the deep.  And the Spirit of
-God was hovering over the face of the waters.  Then God said, "Let there be
-light"; and there was light.  And God saw the light, that it was good; and God
-divided the light from the darkness.  God called the light Day, and the darkness
-He called Night.  So the evening and the morning were the first day.
+In the beginning when God created the heavens and the earth, the earth was a
+formless void and darkness covered the face of the deep, while a wind from
+God swept over the face of the waters.  Then God said, "Let there be light";
+and there was light.  And God saw that the light was good; and God separated
+the light from the darkness.  God called the light Day, and the darkness he
+called Night.  And there was evening and there was morning, the first day.</code></pre>
<!-- end -->

<p>Kind of useless, right?  If a given line differs by so much as a character, it&rsquo;s not the same line.  This highlights the limitations of <code>diff</code> for comparing things that</p>
<ul><li>aren&rsquo;t logically grouped by line</li><li>aren&rsquo;t easily thought of as versions of the same text with some lines changed</li></ul>

<p>We could edit the files into a more logically defined structure, like one-line-per-verse, and try again:</p>
<!-- exec -->

<pre><code>$ diff -u genesis_nkj_by_verse genesis_nrsv_by_verse
--- genesis_nkj_by_verse    2014-05-11 16:51:14.312457198 -0600
+++ genesis_nrsv_by_verse   2014-05-11 16:53:02.484453134 -0600
@@ -1,5 +1,5 @@
-In the beginning God created the heavens and the earth.
-The earth was without form, and void; and darkness was on the face of the deep.  And the Spirit of God was hovering over the face of the waters.
+In the beginning when God created the heavens and the earth,
+the earth was a formless void and darkness covered the face of the deep, while a wind from God swept over the face of the waters.
 Then God said, "Let there be light"; and there was light.
-And God saw the light, that it was good; and God divided the light from the darkness.
-God called the light Day, and the darkness He called Night.  So the evening and the morning were the first day.
+And God saw that the light was good; and God separated the light from the darkness.
+God called the light Day, and the darkness he called Night.  And there was evening and there was morning, the first day.</code></pre>
<!-- end -->

<p>It might be a little more descriptive, but editing all that text just for a quick comparison felt suspiciously like work, and anyway the output still doesn&rsquo;t seem very useful.</p>
<h2><a name=wdiff href=#wdiff>#</a> wdiff</h2>
<p>For cases like this, I&rsquo;m fond of a tool called <code>wdiff</code>:</p>
<!-- exec -->

<pre><code>$ wdiff genesis_nkj genesis_nrsv
In the beginning {+when+} God created the heavens and the [-earth.  The-] {+earth, the+} earth was [-without
form, and void;-] {+a
formless void+} and darkness [-was on-] {+covered+} the face of the [-deep.  And the Spirit of-] {+deep, while a wind from+}
God [-was hovering-] {+swept+} over the face of the waters.  Then God said, "Let there be light";
and there was light.  And God saw [-the light,-] that [-it-] {+the light+} was good; and God
[-divided-] {+separated+}
the light from the darkness.  God called the light Day, and the darkness
[-He-] {+he+}
called Night.  [-So the-]  {+And there was+} evening and [-the morning were-] {+there was morning,+} the first day.</code></pre>
<!-- end -->

<p>Deleted words are surrounded by <code>[- -]</code> and inserted ones by <code>{+ +}</code>.  You can even ask it to spit out HTML tags for insertion and deletion&hellip;</p>
<pre><code>$ wdiff -w '&lt;del&gt;' -x '&lt;/del&gt;' -y '&lt;ins&gt;' -z '&lt;/ins&gt;' genesis_nkj genesis_nrsv</code></pre>
<p>&hellip;and come up with something your browser will render like this:</p>
<blockquote><p>In the beginning <ins>when</ins> God created the heavens and the <del>earth.  The</del> <ins>earth, the</ins> earth was <del>without form, and void;</del> <ins>a formless void</ins> and darkness <del>was on</del> <ins>covered</ins> the face of the <del>deep.  And the Spirit of</del> <ins>deep, while a wind from</ins> God <del>was hovering</del> <ins>swept</ins> over the face of the waters.  Then God said, "Let there be light"; and there was light.  And God saw <del>the light,</del> that <del>it</del> <ins>the light</ins> was good; and God <del>divided</del> <ins>separated</ins> the light from the darkness.  God called the light Day, and the darkness <del>He</del> <ins>he</ins> called Night.  <del>So the</del>  <ins>And there was</ins> evening and <del>the morning were</del> <ins>there was morning,</ins> the first day.</p></blockquote>

<p>Burton H. Throckmorton, Jr. this ain&rsquo;t.  Still, it has its uses.</p>
<hr />
<h1><a name=the-command-line-as-as-a-shared-world href=#the-command-line-as-as-a-shared-world>#</a> 7. the command line as a shared world</h1>
<p>In an earlier chapter, I wrote:</p>
<blockquote><p>You can think of the shell as a kind of environment you inhabit, in much the way your character inhabits an adventure game.</p></blockquote>
<p>It turns out that sometimes there are other human inhabitants of this environment.</p>
<p>Unix was built on a model known as &ldquo;time-sharing&rdquo;.  This is an idea with a lot of history, but the very short version is that when computers were rare and expensive, it made sense for lots of people to be able to use them at once.  This is part of the story of how ideas like e-mail and chat were originally born, well before networks took over the world:  As ways for the many users of one computer to communicate on the same machine.</p>
<p>Says Dennis Ritchie:</p>
<blockquote><p>What we wanted to preserve was not just a good environment in which to do programming, but a system around which a fellowship could form. We knew from experience that the essence of communal computing, as supplied by remote-access, time-shared machines, is not just to type programs into a terminal instead of a keypunch, but to encourage close communication.</p></blockquote>
<p>Times have changed, and while it&rsquo;s mundane to use software that&rsquo;s shared between many users, it&rsquo;s not nearly as common as it once was for a bunch of us to be logged into the same computer all at once.</p>
<p style="text-align:center;"> ★</p>
<p>In the mid 1990s, when I was first exposed to Unix, it was by opening up a program called NCSA Telnet on one of the Macs at school and connecting to a server called mother.esu1.k12.ne.us.</p>
<p>NCSA Telnet was a terminal, not unlike the kind that you use to open a shell on your own Linux computer, a piece of software that itself emulated actual, physical hardware from an earlier era.  Hardware terminals were basically very simple computers with keyboards, screens, and just enough networking brains to talk to a <em>real</em> computer somewhere else.  You&rsquo;ll still come across these scattered around big institutional environments.  The last time I looked over the shoulder of an airline check-in desk clerk, for example, I saw green monochrome text that was probably coming from an IBM mainframe somewhere far away.</p>
<p>Part of what was exciting about being logged into a computer somewhere else was that you could <em>talk to people</em>.</p>
<p style="text-align:center;"> ★</p>
<p><em>{This chapter is a work in progress.}</em></p>
<hr />
<h1><a name=the-command-line-and-the-web href=#the-command-line-and-the-web>#</a> 8. the command line and the web</h1>
<p>Web browsers are really complicated these days.  They&rsquo;re full of rendering engines, audio and video players, programming languages, development tools, databases; you name it, and there&rsquo;s a fair chance it&rsquo;s in there somewhere.  The modern web browser is kitchen sink software, and to make matters worse, it is <em>totally surrounded</em> by technobabble.  It can take <em>years</em> to come to terms with the ocean of words about web stuff and sort out the meaningful ones from the snake oil and bureaucratic mysticism.</p>
<p>All of which can make the web itself seem like a really complicated landscape, and obscure the simplicity of its basic design, which is this:</p>
<p>Some programs pass text around to one another.</p>
<p>Which might sound familiar.</p>
<p>The gist of it is that the web is made out of URLs, &ldquo;Uniform Resource Locators&rdquo;, which are paths to things.  If you squint, these look kind of like paths to files on your filesystem.  When you visit a URL in your browser, it asks a server for a certain path, and the server gives it back some text.  When you click a button to submit a form, your browser sends some text to the server and waits to see what it says back.  The text that gets passed around is (usually) written in a language with particular significance to web browsers, but if you look at it directly, it&rsquo;s a format that humans can understand.</p>
<p>Let&rsquo;s illustrate this.  I&rsquo;ve written a really simple web page that lives at <a href="http://p1k3.com/hello_world.html"><code>http://p1k3.com/hello_world.html</code></a>.</p>
<pre><code>$ curl 'https://p1k3.com/hello_world.html'
&lt;html&gt;
  &lt;head&gt;
    &lt;title&gt;hello, world&lt;/title&gt;
  &lt;/head&gt;

  &lt;body&gt;
    &lt;h1&gt;hi everybody&lt;/h1&gt;

    &lt;p&gt;How are things?&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;</code></pre>
<p><code>curl</code> is a program with lots and lots of features (it too is a little bit of a kitchen sink), but it has one core purpose, which is to grab things from URLs and spit them back out.  It&rsquo;s a little bit like <code>cat</code> for things that live on the web.  Try the above command with just about any URL you can think of, and you&rsquo;ll probably get <em>something</em> back.  Let&rsquo;s try this book:</p>
<pre><code>$ curl 'https://p1k3.com/userland-book/' | head
&lt;!DOCTYPE html&gt;
&lt;html lang=en&gt;
&lt;head&gt;
  &lt;meta charset="utf-8"&gt;
  &lt;title&gt;userland: a book about the command line for humans&lt;/title&gt;
  &lt;link rel=stylesheet href="userland.css" /&gt;
  &lt;script src="js/jquery.js" type="text/javascript"&gt;&lt;/script&gt;
&lt;/head&gt;

&lt;body&gt;</code></pre>
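<p>The URLs handed to <code>curl</code> above really do come apart a lot like file paths.  Here&rsquo;s a minimal sketch that splits one into its pieces using nothing but the shell&rsquo;s parameter expansion.  The function name <code>url_parts</code> is made up for this illustration, and it ignores things real URLs can carry, like ports and query strings.</p>

```shell
# url_parts is a hypothetical helper, not part of curl or of this book's code.
# It splits a simple URL into scheme, host, and path.
url_parts() {
  local url="$1"
  local scheme="${url%%://*}"   # everything before the first "://"
  local rest="${url#*://}"      # everything after it
  local host="${rest%%/*}"      # up to the first slash
  local path="/${rest#*/}"      # the remainder, with its leading slash restored
  # A URL with nothing after the host has no path of its own; call that "/".
  [ "$rest" = "$host" ] && path=/
  printf '%s\n' "scheme: $scheme" "host: $host" "path: $path"
}

url_parts 'https://p1k3.com/userland-book/'
```

<p>The scheme says how to ask, the host says which server to ask, and the path is the part the server itself gets to interpret.</p>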
<p><code>hello_world.html</code> and <code>userland-book</code> are both written in HyperText Markup Language.  HTML is just text with a specific kind of structure.  It&rsquo;s been around for quite a while now, and has grown up a lot in 20 years, but at heart it still looks a lot <a href="http://info.cern.ch/hypertext/WWW/TheProject.html">like it did in 1991</a>.</p>
<p>The basic idea is that the contents of a web page are marked up with tags.  A tag looks like this:</p>
<pre><code>&lt;title&gt;hi!&lt;/title&gt;
   |    |     |
   |    |     `- closing tag
   |    `- content
   `- opening tag</code></pre>
<p>Sometimes you&rsquo;ll see tags with what are known as &ldquo;attributes&rdquo;:</p>
<pre><code>&lt;a href="https://p1k3.com/userland-book"&gt;userland&lt;/a&gt;</code></pre>
<p>This is how links are written in HTML.  <code>href="..."</code> tells the browser where to go when the user clicks on &ldquo;<a href="http://p1k3.com/userland-book">userland</a>&rdquo;.</p>
<p>Tags are a way to describe not so much what something <em>looks like</em> as what something <em>means</em>.  Browsers are, in large part, big collections of knowledge about the meanings of tags and ways to represent those meanings.</p>
<p>While the browser you use day-to-day has (probably) a graphical interface and does all sorts of things impossible to render in a terminal, some of the earliest web browsers were entirely text-based, and text-mode browsers still exist.  Lynx, which originated at the University of Kansas in the early 1990s, is still actively maintained:</p>
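<p>Because tags name what each piece of text <em>is</em>, even a very dumb program can pick pieces out by name.  What follows is a rough sketch, not how a browser does it: <code>tag_content</code> is a made-up helper, and it only copes with a tag that has no attributes, isn&rsquo;t nested, and opens and closes on one line.  Real HTML wants a real parser.</p>

```shell
# A small page kept in a variable, so nothing is fetched from the network.
page='<html><head><title>hello, world</title></head>
<body><h1>hi everybody</h1><p>How are things?</p></body></html>'

# tag_content is a hypothetical helper: print whatever sits between
# <NAME> and </NAME> on any input line that contains both.
tag_content() {
  sed -n "s:.*<$1>\(.*\)</$1>.*:\1:p"
}

printf '%s\n' "$page" | tag_content title
printf '%s\n' "$page" | tag_content h1
```

<p>The first command should print the page&rsquo;s title and the second its top-level heading, without either one knowing anything about how those would be drawn on a screen.</p>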
<pre><code>$ lynx -dump 'http://p1k3.com/userland-book/' | head
                                    userland
     __________________________________________________________________

                 [1]# a book about the command line for humans

   Late last year, [2]a side trip into text utilities got me thinking
   about how much my writing habits depend on the Linux command line. This
   struck me as a good hook for talking about the tools I use every day
   with an audience of mixed technical background.</code></pre>
<p>If you invoke Lynx without any options, it&rsquo;ll start up in interactive mode, and you can navigate between links with the arrow keys.  <code>lynx -dump</code> spits a rendered version of a page to standard output, with links annotated in square brackets and printed as footnotes.  Another useful option here is <code>-listonly</code>, which will print just the list of links contained within a page:</p>
<pre><code>$ lynx -dump -listonly 'http://p1k3.com/userland-book/' | head

References

   2. http://p1k3.com/2013/8/4
   3. http://p1k3.com/userland-book.git
   4. https://github.com/brennen/userland-book
   5. http://p1k3.com/userland-book/
   6. https://twitter.com/brennen
   9. http://p1k3.com/userland-book/#a-book-about-the-command-line-for-humans
  10. http://p1k3.com/userland-book/#copying</code></pre>
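<p>If Lynx isn&rsquo;t handy, a crude approximation of <code>-listonly</code> can be sketched with tools from earlier chapters.  <code>list_links</code> is a made-up name, and this assumes a <code>grep</code> that supports <code>-o</code> (GNU and BSD versions do).  Unlike Lynx, it doesn&rsquo;t number anything or resolve relative links, and it only notices <code>href</code> values wrapped in double quotes.</p>

```shell
# list_links is a hypothetical helper, loosely imitating `lynx -dump -listonly`.
# It reads HTML on standard input and prints each double-quoted href value,
# one per line.
list_links() {
  grep -o 'href="[^"]*"' | sed 's/^href="//; s/"$//'
}

printf '%s\n' '<p><a href="https://p1k3.com">p1k3</a> / <a href="https://twitter.com/brennen">@brennen</a></p>' | list_links
```

<p>In practice the input would come from something like <code>curl</code> rather than <code>printf</code>.</p>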
<p>An alternative to Lynx is w3m, which copes a little more gracefully with the complexities of modern web layout.</p>
<pre><code>$ w3m -dump 'http://p1k3.com/userland-book/' | head
userland

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

# a book about the command line for humans

Late last year, a side trip into text utilities got me thinking about how much
my writing habits depend on the Linux command line. This struck me as a good
hook for talking about the tools I use every day with an audience of mixed
technical background.</code></pre>
<p>Neither of these tools can easily replace enormously capable applications like Chrome or Firefox, but they have their place in the toolbox, and help to demonstrate how the web is built (in part) on principles we&rsquo;ve already seen at work.</p>
<hr />
<h1><a name=a-miscellany-of-tools-and-techniques href=#a-miscellany-of-tools-and-techniques>#</a> 9. a miscellany of tools and techniques</h1>
<h2><a name=dict href=#dict>#</a> dict</h2>
<p>Want to know the definition of a word, or find useful synonyms?</p>
<pre><code>$ dict concatenate | head -10
4 definitions found

From The Collaborative International Dictionary of English v.0.48 [gcide]:

  Concatenate \Con*cat"e*nate\ (k[o^]n*k[a^]t"[-e]*n[=a]t), v. t.
     [imp. &amp; p. p. {Concatenated}; p. pr. &amp; vb. n.
     {Concatenating}.] [L. concatenatus, p. p. of concatenare to
     concatenate. See {Catenate}.]
     To link together; to unite in a series or chain, as things
     depending on one another.</code></pre>
<h2><a name=aspell href=#aspell>#</a> aspell</h2>
<p>Need to interactively spell-check your presentation notes?</p>
<pre><code>$ aspell check presentation</code></pre>
<p>Just want a list of potentially-misspelled words in a given file?</p>
<!-- exec -->

<pre><code>$ aspell list &lt; ../literary_environment/index.md | sort | uniq -ci | sort -nr | head -5
     40 td
     24 Veselka
     17 Reuel
     16 Brunner
     15 Tiptree</code></pre>
<!-- end -->

<h2><a name=mostcommon href=#mostcommon>#</a> mostcommon</h2>
<p>Something like that last sequence sure does seem to show up a lot in my work:  Spit out the <em>n</em> most common lines in the input, one way or another.  Here&rsquo;s a little script to be less repetitive about it.</p>
<!-- exec -->

<pre><code>$ aspell list &lt; ../literary_environment/index.md | ./mostcommon -i -n5
     40 td
     24 Veselka
     17 Reuel
     16 Brunner
     15 Tiptree</code></pre>
<!-- end -->

<p>This turns out to be pretty simple:</p>
<!-- exec -->

<pre><code>$ cat ./mostcommon
#!/usr/bin/env bash

# Optionally specify number of lines to show, defaulting to 10:
TOSHOW=10
CASEOPT=""

while getopts ":in:" opt; do
  case $opt in
    i)
      CASEOPT="-i"
      ;;
    n)
      TOSHOW=$OPTARG
      ;;
    \?)
      echo "Invalid option: -$OPTARG" &gt;&amp;2
      exit 1
      ;;
    :)
      echo "Option -$OPTARG requires an argument." &gt;&amp;2
      exit 1
      ;;
  esac
done

# sort and then uniqify STDIN,
# sort numerically on the first field,
# chop off everything but $TOSHOW lines of input

sort &lt; /dev/stdin | uniq -c $CASEOPT | sort -k1 -nr | head -$TOSHOW</code></pre>
<!-- end -->

<p>Notice, though, that it doesn&rsquo;t handle opening files directly.  If you wanted to find the most common lines in a file with it, you&rsquo;d have to say something like <code>mostcommon &lt; filename</code> in order to redirect the file to <code>mostcommon</code>&rsquo;s input.</p>
<p>Also notice that most of the script is boilerplate for handling a couple of options.  The work is all done in a one-liner.  Worth it?  Maybe not, but an interesting exercise.</p>
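<p>One way the file-handling gap might be closed, sketched here as a shell function rather than a change to the script itself: hand any arguments to <code>cat</code>, which reads the named files if there are some and falls back to standard input if there aren&rsquo;t.  The name <code>mostcommon_files</code> is invented for this example, and it leaves out the <code>-i</code> and <code>-n</code> options to stay short.</p>

```shell
# mostcommon_files is a hypothetical variant of the pipeline in mostcommon.
# With file arguments, cat reads those files; with none, it reads
# standard input.  Always shows (at most) the 10 most common lines.
mostcommon_files() {
  cat -- "$@" | sort | uniq -c | sort -k1 -nr | head -n 10
}

# Works on a pipe, just like the original:
printf '%s\n' b a b c b a | mostcommon_files
```

<p>With that in place, <code>mostcommon_files notes.txt</code> and <code>mostcommon_files &lt; notes.txt</code> should behave the same way (<code>notes.txt</code> being whatever file you have lying around).</p>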
<h2><a name=cal-and-ncal href=#cal-and-ncal>#</a> cal and ncal</h2>
<p>Want to know what the calendar looks like for this month?</p>
<pre><code>$ cal
     April 2014
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30</code></pre>
<p>How about for September, 1950, in a more compact format?</p>
<!-- exec -->

<pre><code>$ ncal -m9 1950
    September 1950
Su     3 10 17 24
Mo     4 11 18 25
Tu     5 12 19 26
We     6 13 20 27
Th     7 14 21 28
Fr  1  8 15 22 29
Sa  2  9 16 23 30</code></pre>
<!-- end -->

<p>Need to know the date of Easter this year?</p>
<!-- exec -->

<pre><code>$ ncal -e
April 20 2014</code></pre>
<!-- end -->

<h2><a name=seq href=#seq>#</a> seq</h2>
<p>Need the numbers 1-5?</p>
<!-- exec -->

<pre><code>$ seq 1 5
1
2
3
4
5</code></pre>
<!-- end -->

<h2><a name=shuf href=#shuf>#</a> shuf</h2>
<p>Want to shuffle some lines?</p>
<!-- exec -->

<pre><code>$ seq 1 5 | shuf
2
1
4
3
5</code></pre>
<!-- end -->

<h2><a name=ptx href=#ptx>#</a> ptx</h2>
<p>Want to make a <a href="http://en.wikipedia.org/wiki/Key_Word_in_Context">permuted index</a> of some phrase?</p>
<!-- exec -->

<pre><code>$ echo 'i like american music' | ptx
                              i like   american music
                                       i like american music
                                   i   like american music
                     i like american   music</code></pre>
<!-- end -->

<h2><a name=figlet href=#figlet>#</a> figlet</h2>
<p>Need to make ASCII art of some giant letters?</p>
<!-- exec -->

<pre><code>$ figlet "R T F M"
 ____    _____   _____   __  __
|  _ \  |_   _| |  ___| |  \/  |
| |_) |   | |   | |_    | |\/| |
|  _ &lt;    | |   |  _|   | |  | |
|_| \_\   |_|   |_|     |_|  |_|</code></pre>
<!-- end -->

<h2><a name=cowsay href=#cowsay>#</a> cowsay</h2>
<p>How about ASCII art of a <del>cow</del> dragon saying something?</p>
<!-- exec -->

<pre><code>$ cowsay -f dragon "RTFM, man"
 ___________
&lt; RTFM, man &gt;
 -----------
      \                    / \  //\
       \    |\___/|      /   \//  \\
            /0  0  \__  /    //  | \ \
           /     /  \/_/    //   |  \  \
           @_^_@'/   \/_   //    |   \   \
           //_^_/     \/_ //     |    \    \
        ( //) |        \///      |     \     \
      ( / /) _|_ /   )  //       |      \     _\
    ( // /) '/,_ _ _/  ( ; -.    |    _ _\.-~        .-~~~^-.
  (( / / )) ,-{        _      `-.|.-~-.           .~         `.
 (( // / ))  '/\      /                 ~-. _ .-~      .-~^-.  \
 (( /// ))      `.   {            }                   /      \  \
  (( / ))     .----~-.\        \-'                 .~         \  `. \^-.
             ///.----..&gt;        \             _ -~             `.  ^-`  ^-_
               ///-._ _ _ _ _ _ _}^ - - - - ~                     ~-- ,.-~
                                                                  /.-~</code></pre>
<!-- end -->

<hr />
<h1><a name=endmatter href=#endmatter>#</a> endmatter</h1>
<h2><a name=further-reading href=#further-reading>#</a> further reading</h2>
<ul><li><em>The Unix Programming Environment</em> - Brian W. Kernighan, Rob Pike</li><li><a href="http://cm.bell-labs.com/cm/cs/who/dmr/hist.html">The Evolution of the Unix Time-sharing System</a> - Dennis M. Ritchie</li><li><a href="https://www.youtube.com/watch?v=tc4ROCJYbm0">AT&amp;T Archives: The UNIX Operating System</a> (YouTube)</li><li><a href="https://medium.com/message/tilde-club-i-had-a-couple-drinks-and-woke-up-with-1-000-nerds-a8904f0a2ebf">I had a couple drinks and woke up with 1,000 nerds</a> - Paul Ford</li></ul>

<h2><a name=code href=#code>#</a> code</h2>
<p>As of July 2018, source for this work can be found <a href="https://code.p1k3.com/gitea/brennen/userland-book">on code.p1k3.com</a>.  I welcome feedback there, <a href="https://mastodon.social/brennen">on Mastodon</a>, or by mail to userland@p1k3.com.</p>
<h2><a name=copying href=#copying>#</a> copying</h2>
<p>This work is licensed under a <a rel="license" href="https://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.</p>
<p><a rel="license" href="https://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" src="images/by-sa-4.png" /></a></p>
<hr />
<script>
$(document).ready(function () {
  // ☜ ☝ ☞ ☟ ☆ ✠ ✡ ✢ ✣ ✤ ✥ ✦ ✧ ✩ ✪
  var closed_sigil = 'show';
  var open_sigil = 'hide';

  var togglesigil = function (elem) {
    var sigil = $(elem).html();
    if (sigil === closed_sigil) {
      $(elem).html(open_sigil);
    } else {
      $(elem).html(closed_sigil);
    }
  };

  $(".details").each(function () {
    var $this = $(this);
    var $button = $('<button class=clicker-button>' + closed_sigil + '</button>');
    var $details_full = $(this).find('.full');

    $button.click(function (e) {
      e.preventDefault();
      $details_full.toggle({
        duration: 550
      });
      togglesigil(this);
    });

    $(this).find('.clicker').append($button);
    $button.show();
  });

  $('.details .full').hide();
});
</script>
</body>
</html>