[spotlight] awk - text processing language

awk is a small language from 1977, most often seen doing one-line text processing in shell scripts. But it’s actually good enough to write ‘real’ software - the famous awk book builds up to writing a dependency-tracking build system.

What’s unusual about it?

  • it’s small and quick, and interpreted
  • looks a little like C
  • an awk program is a sequence of patterns and actions - it’s like having an implicit loop around an implicit case statement
  • it has sensible defaults which can be left implicit - a pattern with no action prints the matching line, and an action with no pattern runs on every line
  • it parses input into whitespace-delimited fields for free
  • it has a good set of string handling functions
  • it offers associative arrays as the only aggregate type - arrays indexed by strings or numbers, holding strings or numbers, with multi-dimensional indexing simulated by joining subscripts (and true nested arrays in some versions, such as gawk)
  • no boilerplate
  • no type declarations - every value is a string, a number, or both; numbers are floating point, which covers integers too
  • no initialisation or declaration
  • can read from or write to commands spawned in subshells
  • in some versions can read or write TCP/IP connections
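
To make a few of those points concrete, here’s a minimal sketch run from a shell. The sample data (names, departments, hours) is made up for illustration, and I’m assuming a POSIX-style awk on the path; nothing here is specific to one implementation.

```shell
# Made-up sample input: name, department, hours (whitespace-separated).
data='alice eng 12
bob ops 7
carol eng 30'

# Pattern with no action: the default action prints the whole matching line.
printf '%s\n' "$data" | awk '$3 > 10'

# Action with no pattern: it runs on every line. $1, $2, ... are the fields.
printf '%s\n' "$data" | awk '{ print $2, $1 }'

# END runs after the last line; "total" needs no declaration or initialisation.
printf '%s\n' "$data" | awk '{ total += $3 } END { print "total hours:", total }'

# Reading from a spawned command: getline pulls each line of its output.
awk 'BEGIN { while (("echo hello" | getline line) > 0) print "got:", line }'
```

The first command prints the alice and carol lines, the second swaps two columns, the third prints a total of 49, and the last prints "got: hello". Each is the implicit loop and implicit case statement at work: awk reads a line, splits it into fields, and runs whichever actions have a matching pattern.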

Here’s a website to help you get started:

Here’s an in-browser REPL.

I see @Will added a nice link to the etherpad:

Why is AWK so important? It is an excellent filter and report writer. Many UNIX utilities generate rows and columns of information. AWK is an excellent tool for processing these rows and columns, and it is easier to use AWK than most conventional programming languages. It can be considered to be a pseudo-C interpreter, as it understands the same arithmetic operators as C. AWK also has string manipulation functions, so it can search for particular strings and modify the output. AWK also has associative arrays, which are incredibly useful, and is a feature most computing languages lack. Associative arrays can make a complex problem a trivial exercise.
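
As a small illustration of the "filter and report writer" idea with associative arrays, here’s a sketch that totals a column per group. The data and the names `hours` and `dept` are made up for the example; it assumes a POSIX-style awk and `sort`.

```shell
# Made-up sample input: name, department, hours (whitespace-separated).
data='alice eng 12
bob ops 7
carol eng 30'

# hours[] is an associative array keyed by the department name in field 2.
# "for (dept in hours)" visits keys in an unspecified order, so sort the report.
printf '%s\n' "$data" |
  awk '{ hours[$2] += $3 } END { for (dept in hours) print dept, hours[dept] }' |
  sort
```

This prints "eng 42" and "ops 7". The array entries are created the first time a key is used, so there is no setup code at all.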