A wiki on a home gateway

A wiki is (a part of) a website that people can write to, as well as read. It is a fun way to exchange information with family, friends, and colleagues. And yes, with the rest of the world too. This means there is a certain risk that evildoers will deface your wiki, but in practice the risk is small. I hope.

Many programs for implementing wikis exist (here is a list). Most demand rather a lot of resources from the host computer (speed, memory); and because a home gateway is often an old, somewhat underpowered machine, this can be a problem. Serving a page from your wiki should be as fast as serving an ordinary HTML page, not take 10 or more seconds.

I’ve settled for a wiki program called Awkiawki (by Oliver Tonnhofer). This brilliant little program is written in the AWK language. According to Perl addicts, AWK is now somewhat outmoded. Nevertheless, it is still present in all Linux installations, and it is small and very fast, compared to other languages often used for wikis. Awkiawki consists of only a few hundred lines of AWK. It works very well even on ‘slow’ systems like 486’s. Pages in Awkiawki are individual files, not database records. No database program is needed.

Awkiawki can be had from SourceForge. The latest version is 0.1 (file awkiawki-0.1.tar.gz). I’ll explain below how to install and hack it. Because it is so small, even ‘weekend programmers’ can easily hack Awkiawki to add or improve features.

1. Preparation

These are the installation instructions for Debian. They may work on other distros too. I assume the following:

  • You are going to run Awkiawki on a small home website. I mean, if you expect tons of concurrent users, or your wiki to run into thousands of pages, Awkiawki won’t be your best choice. You’ll have to look elsewhere for advice.
  • You are running Linux (or a similar OS).
  • Your web server is Apache. Apparently, other web servers work equally well, especially thttpd which is reported to be very small and fast, but I don’t have any experience with it.
  • Apache runs as user www-data (a user with very few privileges).
  • The (HTML) files of your home website are in the default ‘document root’, /var/www. You’ll put the wiki into a subdirectory of the document root. In this article I’ll assume it is called /var/www/mywiki.
  • You can run CGI scripts from the document root. This is not the default. To enable it:
    1. become root
    2. edit /etc/apache/httpd.conf
    3. change the line (in the section called <Directory /var/www/>):
      Options Indexes Includes FollowSymLinks MultiViews

      to

      Options Indexes Includes FollowSymLinks MultiViews ExecCGI
    4. Make sure that you uncomment (remove the # in) the line in /etc/apache/httpd.conf:
      #   AddHandler cgi-script .cgi .sh .pl
      
  • You have rcs (the GNU Revision Control System) on your server. In Debian, this can be installed in the standard way: apt-get install rcs. You don’t have to know how it works, and no configuration is needed. Awkiawki uses it to keep track of the changes made by users to your wiki pages.
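Some of the assumptions above can be checked from a shell before you start. This is only a sketch: have_tool is a helper name I made up for the occasion, and it checks nothing more than whether the awk and rcs commands are on your PATH.

```shell
#!/bin/sh
# Succeed if the named command is available on this machine.
# have_tool is a hypothetical helper name, not part of Awkiawki.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in awk rcs; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

If rcs is reported missing, install it as described above before going on.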

NOTE: Any operations on the gateway/server (like changing conf files, and installing Awkiawki) can be done remotely, by means of an SSH (or telnet) connection from another machine in your home network.

2. Installing Awkiawki

Become root. Do:

cd /var/www
mkdir mywiki
cd mywiki
mkdir data
cd data
mkdir RCS
cd ..

Unpack awkiawki-0.1.tar.gz into this directory (/var/www/mywiki). Remove the CVS directory. The essential files that you need are awki.conf, parser.awk, awki.cgi, awki.png, and special_parser.awk.

For safety’s sake, restrict use of the scripts to their owner:

chmod 700 *cgi *awk

Now change the ownership of everything to the user Apache runs as. In Debian, Apache runs as the user www-data. So, as root:

cd /var/www
chown -R www-data:www-data mywiki

Now become user www-data yourself. This can only be done if you are root (which you are now) by means of the command

su www-data

Then go to /var/www/mywiki and make two links:

ln -s awki.cgi index.cgi
ln -s awki.conf index.conf

This allows ‘automatic’ start-up of the wiki; if somebody types http://your-site/mywiki, the wiki will start immediately. Users won’t have to type a ‘complicated’ address containing things like ‘cgi’.
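If you want to see the resulting layout before touching /var/www, the directory and link steps can be rehearsed in a scratch directory. This is a sketch under assumptions: make_wiki_skeleton is a made-up helper name, and the two empty files only stand in for the real awki.cgi and awki.conf from the tarball.

```shell
#!/bin/sh
# Build the Section 2 directory layout and links under a given root.
# make_wiki_skeleton is a hypothetical helper name used only in this sketch.
make_wiki_skeleton() {
    root=$1
    mkdir -p "$root/mywiki/data/RCS"     # same result as the mkdir/cd steps
    # Stand-ins for the unpacked files; the real ones come from the tarball.
    touch "$root/mywiki/awki.cgi" "$root/mywiki/awki.conf"
    ( cd "$root/mywiki" &&
      ln -s awki.cgi index.cgi &&
      ln -s awki.conf index.conf )
}

scratch=$(mktemp -d)
make_wiki_skeleton "$scratch"
ls -l "$scratch/mywiki"
```

The listing should show data, the two stand-in files, and the two links pointing at them.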

Now edit awki.conf, and change the line

#img_tag = <img src="/awki.png" width="48" height="39" align="left">

to

img_tag = <img src="/mywiki/awki.png" width="48" height="39" align="left">

In other words, uncomment the line and add the subdirectory name before awki.png. If this is not done, Awkiawki will not be able to find its logo file (a symbol that will be displayed on the top left).

Also, find the line in awki.conf that says

# rcs="true"

and uncomment it (remove the # sign). This will enable the ‘page history’ function. It is convenient to also specify in awki.conf:

always_convert_spaces = 1

And while you’re at it, you can just as well enable Cascading Style Sheets by setting in the appropriate place in awki.conf:

css = "/mywiki/mywiki.css"
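The two ‘uncomment’ edits can also be done with sed instead of an editor. The sketch below assumes the commented lines in your awki.conf look exactly like the ones quoted above; edit_conf is a made-up helper name, and here it is run on a two-line sample rather than on the real file.

```shell
#!/bin/sh
# Apply the two 'uncomment' edits to a config file read on stdin.
# edit_conf is a hypothetical helper; the patterns match only the exact
# commented lines quoted in this article.
edit_conf() {
    sed -e 's|^#img_tag = <img src="/awki.png"|img_tag = <img src="/mywiki/awki.png"|' \
        -e 's|^# rcs="true"|rcs="true"|'
}

# A two-line sample standing in for the relevant part of awki.conf.
sample='#img_tag = <img src="/awki.png" width="48" height="39" align="left">
# rcs="true"'

printf '%s\n' "$sample" | edit_conf
```

To use it for real you would feed it a copy of awki.conf and compare the result with the original before replacing anything. The always_convert_spaces and css settings are left out here, because I don’t want to guess how they appear in your copy of the file.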

3. Running Awkiawki

Now you can try it out. Let’s say the gateway machine (which runs Apache) is called foo in your home network (from the outside, of course, it will be known by its external domain name or IP address). Start a browser on another machine in your home network, and type into the address bar: foo/mywiki. The ‘FrontPage’ (the ‘root page’ of your wiki) should come up.

The ‘FrontPage’ file (another name for the first wiki page can be set in awki.conf) is empty; click ‘Edit FrontPage’ to put something there. An HTML ‘form’ opens into which you can enter text, which can then be saved using the SAVE button. Every wiki system seems to have its own formatting rules; Awkiawki’s rules (which are also summarised below each HTML text input form) are as follows:

  1. Emphasise text by enclosing it between multiple apostrophes: 2 for italic, 3 for bold, 5 for bold italic;
  2. One minus sign at the beginning of a line makes a big headline, two a medium headline, three a small headline; four minus signs at the beginning of a line make a horizontal line (same as HTML <hr>);
  3. To make a new paragraph, skip a line;
  4. Any text on a line beginning with a space will be displayed preformatted (as ‘typewriter’ text);
  5. List items are made by entering text on a line beginning with 8 spaces + the number 1 (numbered lists) or 8 spaces + an asterisk (bulleted lists). In theory you could also type a TAB (instead of 8 spaces) but most browsers do not allow entering TABs in HTML forms. Multiples of 8 spaces produce sublevel lists.
  6. External links are made by entering them in standard URL notation (i.e. beginning with http://, etc.); if the link points to a picture (a jpeg, jpg, gif, or png file), it will be displayed inside your wiki page itself.
  7. Internal links (to pages inside the wiki itself) are made by typing words in ‘CamelCase’, i.e. words beginning with a capital letter and containing at least one other capital letter, separated from the first by one or more lower-case letters. If the pages to which such links refer do not exist yet in the wiki, the links will be displayed followed by a question mark. Subsequent clicking on the question mark, then editing and saving, will create the pages.

Please play with your new wiki a while and see how fast it is. See how the ‘RecentChanges’ and ‘PageHistory’ functions allow you to keep track accurately of changes in the Wiki; you can see who (or at least which IP address) changed what, and you can also undo changes (click ‘edit’ on an earlier version of the page, then save it).

4. Hacking Awkiawki

It is likely that most people who run Awkiawki make some changes in it, because it is easy to do, and because a few changes immediately suggest themselves (for instance, the formatting rules 4 and 5, above, are somewhat in conflict with one another). I found some of the hacks below on the web; others I made myself.

Tips for hacking Awkiawki:

  • If you get an ‘internal server error’, look at the last line of your server’s /var/log/apache/error.log. It will point you to the line in the Awkiawki script where you made a mistake.
  • Awkiawki uses several arrays internally, like ENVIRON, query, and localconf. While debugging, it may be useful to see the contents of these arrays. You can do this by putting some statements just before the closing bracket of the BEGIN action in awki.cgi. E.g., to see the contents of the ENVIRON array you would put
    #diagnostics
    for (var in ENVIRON) print var, ENVIRON[var] "<br>"
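The same kind of diagnostic can be tried from a shell, without going through the web server. This is only a sketch: show_env is a made-up helper name, and the QUERY_STRING value merely imitates what the server might put in the environment for a request.

```shell
#!/bin/sh
# Print one environment variable in the same format as the diagnostic loop.
# show_env is a hypothetical helper name.
show_env() {
    awk -v name="$1" 'BEGIN { print name, ENVIRON[name] "<br>" }'
}

# Imitate what the web server would export for a request.
QUERY_STRING='page=FrontPage'
export QUERY_STRING
show_env QUERY_STRING
```

This prints the variable name, a space, and the value followed by <br>, just as the loop in awki.cgi would for every entry.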
    

NOTE: If you want to hack Awkiawki, it is of course advisable to learn a bit of the AWK language. This is not difficult; AWK is a ‘little language’ (see Jon Bentley, ‘More Programming Pearls, Confessions of a Coder’). Resources for learning AWK:

  • The GNU AWK manual can be found at the GNU site. It is called Effective AWK Programming. Earlier (more compact) versions, called The GAWK Manual, can be found on the web. The GNU site also provides texinfo/dvi/PostScript/pdf format files for producing hard-copy book-type versions of the manual (with page numbers, index, etc.).
  • O’Reilly sells a book Effective AWK Programming by Arnold D. Robbins, which (it seems) is the same as the new GNU version.

Or of course, you can just apply, unthinkingly 😉 the example hacks given below.

4.1 Instant save & show

If you save a page in your wiki you’ll notice that you first see a message like

saved MyNewPage

and then, a second or so later, the updated page is shown.

In earlier versions of Awkiawki the page was not shown automatically; the message ‘saved MyNewPage’ stayed on the screen, and you had to click to actually see the new page.

Automatic ‘save and show’ in Awkiawki 0.1 is achieved by putting an HTML <meta> tag in the page, instructing browsers to refresh the page after 2 seconds. The trouble is, Microsoft Internet Explorer goes on refreshing the page every two seconds, sending a request to your server each time (Mozilla does not do this).

It is easy to make a better (at least, faster) ‘instant save and show’ mechanism. In awki.cgi, comment out (put a # sign in front of), or remove, the lines in function header:

if (query["save"])
  print "<meta http-equiv=\"refresh\" content=\"2,URL="scriptname"/"page"\">"

Then in the BEGIN action, change

else if (query["save"] && query["text"] && page_editable)
  save(query["page"], query["text"], query["string"], query["filename"])

to

else if (query["save"] && query["text"] && page_editable)
  {
  save(query["page"], query["text"], query["string"], query["filename"])
  parse(query["page"], query["filename"], query["revision"])
  }

Finally, comment out (or remove) the line

print "saved <a href=\""scriptname"/"page"\">"page"</a>"

from function save.

4.2 More convenient link names

In Awkiawki, links to internal pages are made by means of words in CamelCase, following an ancient wiki tradition. The definition of a proper link word –i.e., the definition of ‘CamelCase’– is given by the following regular expression, which occurs in many places in the Awkiawki scripts:

[A-Z][a-z]+[A-Z][A-Za-z]*

This means: a proper link name begins with one capital letter (“[A-Z]”), then has at least one lowercase letter (“[a-z]+”), then another capital letter (“[A-Z]”), optionally followed by a mixture of upper- and lower-case letters (“[A-Za-z]*”).

Therefore, digits are not allowed in link names, and neither are underscore characters. So, if we would like to have link names like Chapter_1, the regular expression must be changed.
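You can try the stock pattern on a few words from the command line. In this sketch is_camelcase is a made-up helper name; I have anchored the pattern with ^ and $ so that it tests a whole word, whereas the Awkiawki scripts mostly use it unanchored.

```shell
#!/bin/sh
# Succeed if the whole word matches the stock CamelCase pattern.
# is_camelcase is a hypothetical helper; LC_ALL=C keeps [A-Z] and [a-z]
# limited to ASCII letters regardless of locale.
is_camelcase() {
    printf '%s\n' "$1" | LC_ALL=C awk '
        { exit !($0 ~ /^[A-Z][a-z]+[A-Z][A-Za-z]*$/) }'
}

for word in FrontPage MyNewPage Frontpage Chapter_1 Page2Go; do
    if is_camelcase "$word"; then
        echo "$word: link"
    else
        echo "$word: plain text"
    fi
done
```

FrontPage and MyNewPage are accepted; Frontpage has no second capital, and Chapter_1 and Page2Go fail on the underscore and the digit.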

It has to be changed in a lot of different places. It would be nice if the regexp were defined in one place, where we could change it, but there are no C-style #define statements in AWK. The next best thing is a ‘computed regexp’. This is done as follows: at the beginning of the BEGIN action in awki.cgi (just after the opening bracket), insert

LINK = "([A-Z][a-z]+[A-Z][A-Za-z]*)"

Mind the quotation marks and parentheses! Then change, in function clear_pagename,

if (match(str, /[A-Z][a-z]+[A-Z][A-Za-z]*/))

to

if (match(str, LINK))

We do similar things in parser.awk and special_parser.awk, but it is a little bit more complicated there. In both files, we define LINK at the beginning of the BEGIN action as before, and we replace /[A-Z][a-z]+[A-Z][A-Za-z]*/ (including the / characters) everywhere by LINK. Then we aren’t yet finished; in parser.awk look for the string

/(^|[[,.?;:'"\(\t])[A-Z][a-z]+[A-Z][A-Za-z]*/

and replace it by

LINK_A

Then add in the BEGIN action, below the LINK definition:

LINK_A = "(^|[[,.?;:'\"\(\t])" LINK

Similarly, in special_parser.awk, replace

/^[A-Z][a-z]+[A-Z][A-Za-z]*$/

which occurs in 3 places, by

LINK_B

then add in the BEGIN action

LINK_B = "^" LINK "$"

After all this work we can change the LINK definition to suit our taste. This must be done in three places (once in each of the three scripts). I myself changed LINK to:

LINK = "([A-Z][a-z]+[A-Z][A-Za-z]*|[A-Za-z]+_[A-Za-z_0-9]*)"

This says that a valid link name is either a conventional CamelCase name, or a word beginning with one or more letters followed by an underscore, and ending with zero or more letters, underscores, or digits.
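Before editing three scripts it is worth checking that the new definition does what you expect. The sketch below builds LINK_B from LINK in the same way as described above and tests single words against it; is_link is a made-up helper name.

```shell
#!/bin/sh
# is_link is a hypothetical helper that tests one word against LINK_B,
# built from the extended LINK string.
is_link() {
    printf '%s\n' "$1" | LC_ALL=C awk '
        BEGIN {
            LINK   = "([A-Z][a-z]+[A-Z][A-Za-z]*|[A-Za-z]+_[A-Za-z_0-9]*)"
            LINK_B = "^" LINK "$"    # computed regexp: a string used as a pattern
        }
        { exit !($0 ~ LINK_B) }'
}

for word in FrontPage Chapter_1 my_notes Chapter1 _hidden; do
    if is_link "$word"; then
        echo "$word: link"
    else
        echo "$word: plain text"
    fi
done
```

Note that with this definition a name like my_notes becomes a link too, because the second alternative does not require a capital letter.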

4.3 Centered pictures

This is only a small thing, but when pictures are linked into a page, Awkiawki aligns them to the left side. I’d like to have them centered. This is easy; in parser.awk, in the part that begins with the comment # generate HTML img tag for .jpg,.jpeg,.gif,png URLs, change

sub(/https?:\/\/[^\t]*\.(jpg|jpeg|gif|png)/, "<img src=\"&\">",field)

to

sub(/https?:\/\/[^\t]*\.(jpg|jpeg|gif|png)/, "<div align=\"center\"><img src=\"&\"></div>",field)
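To see what the modified call produces, you can run it on a single line outside the wiki. In this sketch center_image is a made-up helper name and example.org is a placeholder address; the & in the replacement string stands for whatever text the pattern matched.

```shell
#!/bin/sh
# center_image is a hypothetical helper that applies the modified sub()
# call to one line of text given as its argument.
center_image() {
    printf '%s\n' "$1" | awk '{
        field = $0
        sub(/https?:\/\/[^\t]*\.(jpg|jpeg|gif|png)/, "<div align=\"center\"><img src=\"&\"></div>", field)
        print field
    }'
}

center_image 'http://example.org/cat.png'
center_image 'http://example.org/page.html'
```

The first URL comes out wrapped in the centered div and img tags; the second is not a picture and is printed unchanged.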

4.4 Adding tables

This system for entering simple tables was proposed by Paul de Bruin. In parser.awk, change

NR == 1 { print "<p>"; }

to

#tables
/^\|.*\|/ { close_tags("table"); parse_table(); print; next; }

NR == 1 { print "<p>"; }

At the end of function close_tags, change

 
         }
     }
} 

to

 
        }
     }
     # close table
     if (not !~ "table") {
           if (table == 1) {
                 parse_table()
                 print "</table></center>"
                 table = 0
          }
     }
}

Then at the end of parser.awk, add a new function:

function parse_table() {
    if (table != 1) {
         print "<center><table width=\"60%\" border=1>"
         table = 1;
    } 
    if (table == 1) {
         gsub(/^\|/,"<tr><td>");
         gsub(/\|[ ]*$/,"</td></tr>");
         gsub(/\|/,"</td><td>");
    }
}

Simple tables can now be made by enclosing the table cells in | signs. The first one must be at the beginning of the line.
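The effect of the three gsub calls is easiest to see on a small example. The sketch below is a stand-alone simplification, not the patched parser.awk: render_table is a made-up helper name, and the table is closed in an END action because the sketch has no close_tags function.

```shell
#!/bin/sh
# render_table is a hypothetical stand-alone helper that turns |-delimited
# lines from stdin into an HTML table, using the same gsub calls as parse_table.
render_table() {
    awk '
    /^\|.*\|/ { parse_table(); print; next }
    { print }
    END { if (table == 1) print "</table></center>" }

    function parse_table() {
        if (table != 1) {
            print "<center><table width=\"60%\" border=1>"
            table = 1
        }
        gsub(/^\|/, "<tr><td>")          # first | opens the row
        gsub(/\|[ ]*$/, "</td></tr>")    # last | closes the row
        gsub(/\|/, "</td><td>")          # remaining | separate the cells
    }'
}

printf '%s\n' '|apples|3|' '|pears|5|' | render_table
```

Each input line becomes one table row, and each | inside the line becomes a cell boundary.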

4.5 More convenient lists

A more convenient system for entering lists is:

  • A line beginning with a # character becomes a numbered list item;
  • A line beginning with a * character becomes a bulleted list item.

(Multiple #’s or *’s make nested list items.) This removes the inconvenience of having to enter 8 spaces, and also makes it possible to enter pre-formatted text which is actually supposed to have 8 spaces in it.

I found this on a wiki by Bruce Henderson (he also has various other awkiawki hacks, like a more powerful ‘tables’ function).

In awki.cgi, function save, comment out the two lines:

if ( localconf["always_convert_spaces"] || query["convertspaces"] == "on")
    gsub(/        /, "\t", dtext);

Then in parser.awk, change

/^\t+[*]/ { close_tags("list"); parse_list("ul", "ol"); print; next;}
/^\t+[1]/ { close_tags("list"); parse_list("ol", "ul"); print; next;}

to

/^\*/ { close_tags("list"); parse_list("ul", "ol"); print; next;}
/^\#/ { close_tags("list"); parse_list("ol", "ul"); print; next;}

and

      while(/^\t+[1*]/) {
          sub(/^\t/,"")
          tabcount++

to

      while(/^[\#\*]/) {
           c=sub(/^\#/,"")
           if (c) tabcount++
           c=sub(/^\*/,"")
           if (c) tabcount++

and finally

       sub(/^[1*]/,"")
       $0 = "\t<li>" $0

to

        sub(/^[#*]/,"")
        $0 = "\t<li>" $0
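The idea behind the new while loop is simply to count the leading # or * characters to find the nesting level. A simplified sketch of that counting, separate from the real parse_list function (list_depth is a made-up helper name):

```shell
#!/bin/sh
# list_depth is a hypothetical helper: it prints how many leading
# list markers a line has, i.e. its nesting level (0 = not a list item).
list_depth() {
    printf '%s\n' "$1" | awk '{
        depth = 0
        while (sub(/^[#*]/, "")) depth++   # strip one marker per pass
        print depth
    }'
}

list_depth '* first level'
list_depth '** second level'
list_depth 'ordinary text'
```

sub returns 1 when it removed a marker and 0 when it did not, which is what ends the loop.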

4.6 Making the wiki multilingual

It is easy to make a small improvement in the headers inserted by Awkiawki in the HTML pages it generates. In awki.cgi, function header, change

print "Content-type: text/html\n"
print "<html>\n<head>\n<title>" page "</title>"

to

print "Content-type: text/html\n"
print "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"
print "<html>\n<head>\n<title>" page "</title>"
print "<meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\">"

With this, users can enter multilingual text (utf-8) in your wiki, and browsers will display it properly.

4.7 Improving the look of the wiki with a .css file

With a simple .css file, it is possible to specify, e.g., text and headline colours, text justification, etc., for the page as a whole – without changing Awkiawki itself.

NOTE: we have already specified (at the end of Section 2) that the .css file in this example will be called /mywiki/mywiki.css (relative to the document root, which is /var/www; i.e., the real pathname will be /var/www/mywiki/mywiki.css).

In fact, CSS files can do much more than specifying display properties for the page as a whole. The complete layout of your wiki pages can be specified through CSS. By splitting up the page into divisions, each with its own display properties, pages –also Awkiawki pages– can be provided with headers and footers, with different background colours and patterns, columns containing menus on the left and the right, etc.; in other words, you can make your wiki look as ‘professional’ as you like. This may be important for getting wiki use accepted within a corporate or bureaucratic environment. Set it up for your colleagues on your home site; then get the bosses interested; then overcome the resistance of the IT department.

This will involve some hacking of awki.cgi. Here is what the functions header and footer could look like (this is just an example, to be changed according to your own requirements):

# print header
function header(page) {
   print "Content-type: text/html\n"
   print "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"
   print "<html>\n<head>\n<title>" page "</title>"
   print "<meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\">"
   if (localconf["css"])
      print "<link rel=\"stylesheet\" href=\""localconf["css"] "\">"
   print "</head>\n<body>"       

   print "<div id=\"logo\">"
   print localconf["img_tag"]
   print "<br> <br>My<br>Wiki</div>"

   print "<div id =\"caption\">"
   print "A Wiki<br>in awkiawki,<br>made with CSS</div>"

   print "<div id=\"header\">"
   if (query["page"] ~ "FullSearch") 
       {
       print "Pages with \""query["string"]"\""
       }
   else print "<a href=\""scriptname"/FullSearch?string="page"\">"page"</a>"
   print "</div>"
        
   print "<div id=\"navigation\">"
   if (page_editable)
      print "<a href=\""scriptname"?edit=true&page="page"\">Edit Page</a>"
   print "<a href=\""scriptname"/"localconf["default_page"]"\">"localconf["default_page"]"</a><br/>"
   print "<a href=\""scriptname"/PageList\">PageList</a><br />"
   print "<a href=\""scriptname"/RecentChanges\">RecentChanges</a><br />"
   if (localconf["rcs"] && !special_page)
      print "<a href=\""scriptname"/"page"?history=true\">PageHistory</a><br> <br>"
   print "<form action=\""scriptname"/FullSearch\" method=\"GET\" align=\"right\">"
   print "<input type=\"text\" name=\"string\" size=\"13\">"
   print "<input type=\"submit\" value=\"search\">"
   print "</form>"
   print "</div>"

   print "<div id =\"content\">"
   }
                                        
# print footer
function footer(page) { print "</div>\n</body>\n</html>\n" }

This generates almost the same HTML as the original version does – with one big difference: text is put between named division tags (the <div id = .. statements) instead of just into the HTML ‘body’. We’ve defined five divisions:

  • <div id = "logo"> ... </div>, containing the Awkiawki logo and the text ‘My Wiki’.
  • <div id = "caption"> ... </div>, containing some more fixed text.
  • <div id = "navigation"> ... </div>, containing the navigation links.
  • <div id = "header"> ... </div>, containing the title (or rather, the filename) of the page which is being displayed.
  • <div id = "content"> ... </div>; this is where the actual text of the wiki pages will be displayed.

Now exactly how (in which colours, fonts, etc.) and where (on the browser screen) these divisions will be displayed, will be determined by the .css file. A .css file that will work with this example is shown here. It makes a vertical sand-coloured bar at the left, containing the ‘logo’, ‘caption’, and ‘navigation’ divisions; the ‘header’ division is on the top.

The file already contains a printer-friendly ‘media selector’ at the end (see the next section). This example also needs a small .png file called sand.png.

This example is not exactly beautiful or sophisticated; actually, it looks like this. You can use it as a starting point for your own efforts, which will surely lead to something that is beautiful and sophisticated.

4.8 Making pages ‘printer-friendly’

Many web sites provide a clickable button for displaying pages in a ‘printer-friendly’ fashion. In fact, this is not necessary. Using CSS, pages can be made ‘printer-friendly’ by default. You can set it up, for instance, so that when a user simply presses ‘File, Print’:

  • only the central content of the page is printed, without headers, footers, and sidebars;
  • the page is printed in black, even if it displays in colour on the screen;
  • text is printed justified (aligned to both margins), even if the screen display is not justified;
  • the print font is different from the one on the screen.

All this is simple to achieve in your wiki. In fact it is amazingly easy. You only need a ‘media selector’ in your css file. A media selector appropriate for our example design would be:

/* Automatic "printer-friendly" pages */ 
@media print {
   body {
      color: black;
      font-family: 'Times New Roman', times, serif;
   }
   #logo   { display: none }
   #caption { display: none }
   #navigation { display: none }
   #header { display: none }
   #content {
      color: black;
      font-size: 12pt;
      margin-left: 12pt;
      text-align: justify;
   }
   #content a:link {color: black }
   #content a:visited {color: black }
   #content table { color: black; font-size: 12pt; }
}

You just have to put this at the end of your mywiki.css file. This ‘media selector’ (also called an ‘at rule’ because of the @ sign in front) specifies that the ‘logo’, ‘caption’, ‘navigation’, and ‘header’ divisions should not be printed, while the ‘content’ division should be printed in black, justified (aligned to both margins), in a twelve-point font, with Times New Roman being the default font.

Lots of other tricks are possible with CSS; for instance I haven’t mentioned div class tags yet. Search the web if you want to know more.
