All the Perl you need
for slicing and dicing text files
Brennen Bearnes <bbearnes@wikimedia.org>
Source: code.p1k3.com/gitea/brennen/wmf-engprod-offsite-slides
Slides: squiggle.city/~brennen/perl/
A general-purpose, multi-paradigm programming language.
That also has a lot in common with the Unix shell.
So: A widely-installed environment where you can combine small tools to work with data.
A sample data file, tab-separated names of some authors:
$ column -t examples/authors.tsv
Robinson  Eden
Waring    Gwendolyn  L.
Brunner   John
Tolkien   John       Ronald  Reuel
Walton    Jo
Toews     Miriam
Cadigan   Pat
Le        Guin       Ursula  K.
Veselka   Vanessa
Wells     Martha
Leckie    Ann
Perl is often described as a superset of grep, sed, and awk.
Combines filtering, sorting, and transforming strings with stuff that’s hard (or missing) in Bash:
Like grep:
$ perl -ne 'print if m/^T/;' examples/authors.tsv | column -t
Tolkien  John    Ronald  Reuel
Toews    Miriam
Like cut or awk:
$ perl -anE 'say $F[1];' examples/authors.tsv
Eden
Gwendolyn
John
John
Jo
Miriam
Pat
Guin
Vanessa
Martha
Ann
Like sed or tr:
$ perl -pe 'tr/a-z/A-Z/' examples/authors.tsv | column -t
ROBINSON  EDEN
WARING    GWENDOLYN  L.
BRUNNER   JOHN
TOLKIEN   JOHN       RONALD  REUEL
WALTON    JO
TOEWS     MIRIAM
CADIGAN   PAT
LE        GUIN       URSULA  K.
VESELKA   VANESSA
WELLS     MARTHA
LECKIE    ANN
I’m not here to convince you to write large programs in Perl.
I want to gesture at a portion of the language useful for:
Let’s go over some basic syntax and techniques, and then look at a few examples from my toolkit.
grep, sed, vi, PCRE, etc.
#!/usr/bin/env perl
print "Hello EngProd.\n";
$ ./examples/hello.pl
Hello EngProd.
#!/usr/bin/env perl
use warnings;
use strict;
use 5.10.0;
say greet($ARGV[0]);
sub greet {
  my ($greetee) = @_;
  return "Hello $greetee.";
}
$ ./examples/hello_boilerplate.pl 'EngProd'
Hello EngProd.
#!/usr/bin/env perl
use warnings;
use strict;
use 5.10.0;
# Print "given surname" where the given name starts with "Jo" (case-insensitive):
while (<>) {
  say "$2 $1" if m/^(.*)\t(Jo.*?)(\t|$)/i;
}
$ ./examples/filter_authors.pl examples/authors.tsv
John Brunner
John Tolkien
Jo Walton