Building, Configuring, and Using Isearch-cgi


Isearch-cgi is an add-on for Isearch.  Isearch-cgi lets you do all the cool
stuff that Isearch does, only you can do it from a page on the World Wide
Web.  This means your textbase can be accessed by anyone in the world who
has a web browser (and these days, that's pretty much everyone).

It's important to understand that Isearch-cgi is almost useless by itself.
You have to have already installed Isearch.  This document will describe
Isearch-cgi version 1.03.  To use this version, you'll have to have Isearch
version 1.10.01 or later installed.

INSTALLATION:

Installing Isearch-cgi requires that you have already obtained and built
Isearch.  If you haven't, get it now, build it, and learn to use it.  The
Isearch package is available for anonymous ftp from ftp.cnidr.org in the
directory /pub/NIDR.tools/Isearch.

From this point on, we're going to make some assumptions:
1) You have Isearch, and it's installed as /local/project/Isearch-1.10.02
2) You know how to use (at least) Isearch and Iindex
3) You already have NCSA's httpd installed (a "web server")
4) You pretty much just took the defaults when you installed httpd, like
   placing your .html files in /usr/local/etc/httpd/htdocs.
   
Note that if you're using one of Netscape's web servers, these instructions
will probably be close, but not quite precisely correct.  You were warned.

So, let's get Isearch-cgi:

sti-gw% ftp ftp.cnidr.org
Connected to kudzu.cnidr.org.
220 kudzu FTP server (Version wu-2.4(1) Sun Jan 1 17:43:49 EST 1995) ready.
Name (ftp.cnidr.org:escott): anonymous
331 Guest login ok, send your complete e-mail address as password.
Password:
230- Welcome to kudzu.cnidr.org
230- 
230- ftp logged in from sti-gw.sti-ext.com at Thu May  2 22:57:30 1996
230- 
230- There are currently 20 users of the maximum 25 logged on.
230- 
230- If you have trouble using this experimental FTP client, try including
230- a dash in front of your username (as your login password).  This will
230- disable informational messages such as these.
230- 
230- Comments???  Send mail to `bmv@k12.cnidr.org`
230-
230-
230 Guest login ok, access restrictions apply.
ftp> cd /pub/NIDR.tools/Isearch-cgi
250-Please read the file README
250-  it was last modified on Thu Feb  1 15:44:18 1996 - 91 days ago
250 CWD command successful.
ftp> binary
200 Type set to I.
ftp> get Isearch-cgi-1.03.tar.Z
200 PORT command successful.
150 Opening BINARY mode data connection for Isearch-cgi-1.03.tar.Z (22896 bytes).
226 Transfer complete.
local: Isearch-cgi-1.03.tar.Z remote: Isearch-cgi-1.03.tar.Z
22896 bytes received in 9 seconds (2.5 Kbytes/s)
ftp> quit
221 Goodbye.


What we just did was this:  used ftp to connect to ftp.cnidr.org, logged in
as "anonymous", typed our email address for a password (which of course
didn't show up here), and changed directories to where Isearch-cgi is kept.
Then we made sure we were using "binary" mode, and we got the distribution.

Now we need to uncompress the distribution:

sti-gw% uncompress Isearch-cgi-1.03.tar.Z

This creates a file named "Isearch-cgi-1.03.tar".
Now, we'll move that tar file to a good place to work from:

sti-gw% mv Isearch-cgi-1.03.tar /local/project
sti-gw% cd /local/project

Finally (for now) we'll unpack that tar file:

sti-gw% tar xf Isearch-cgi-1.03.tar
sti-gw% cd Isearch-cgi-1.03

We're now ready to start the main part of building Isearch-cgi.

The first thing to do is to edit the file "Makefile".  This file contains
configuration information that Isearch-cgi needs in order to find Isearch
and the cgi-bin directory that your web server is using.  Basically, load
the Makefile in your favorite editor and follow the directions.  At one
point, you'll see a line that says "That's all!  Type 'make'".  Don't edit
below that line unless you really know what you're doing.

Now it's time to type "make".  Don't be surprised if the compiler prints some
warnings:  no one is perfect.  If all goes well, the Makefile will print:

Welcome to CNIDR Isearch-cgi!

Read the README file for configuration and installation instructions

That is pretty sound advice, even if you're armed with this guide, since
small details change from time to time.

At this point, you have built "isrch_fetch", "isrch_srch", and
"search_form".  You'll need to build two shell scripts now, and you'll do it
with the "Configure" script provided.  Configure will need one argument, the
directory where your Isearch indexes are stored.  The Isearch indexes are
the files created when you run Iindex, and their name is set by the "-d"
option to Iindex.  We're going to assume you have an index named "tester" in
the directory "/local/project/Isearch-1.10.02/":

sti-gw% Configure /local/project/Isearch-1.10.02

That created the scripts "isearch" and "ifetch". Copy these two scripts to 
wherever you put your cgi-bin applications.  This will quite likely be 
/usr/local/etc/httpd/htdocs/cgi-bin.  
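
The copy step can be sketched as a small POSIX shell function.  Treat it as
a sketch under assumptions, not part of the distribution:  the name
"install_cgi" is made up for this guide, and it assumes your web server
wants CGI scripts to be executable (mode 755), which is the usual
requirement.  Substitute your own directories.

```shell
# install_cgi: copy the generated "isearch" and "ifetch" scripts into a
# cgi-bin directory and mark them executable.
# (install_cgi is a hypothetical helper name, not part of Isearch-cgi.)
install_cgi() {
    src_dir=$1      # directory where Configure left the two scripts
    cgi_dir=$2      # your web server's cgi-bin directory

    for script in isearch ifetch; do
        if [ ! -f "$src_dir/$script" ]; then
            echo "install_cgi: $src_dir/$script not found; run Configure first" >&2
            return 1
        fi
        cp "$src_dir/$script" "$cgi_dir/$script" || return 1
        # Assumption: the server runs CGI scripts only if they are executable.
        chmod 755 "$cgi_dir/$script" || return 1
    done
}
```

With the directories assumed in this guide, you would call it as
"install_cgi /local/project/Isearch-cgi-1.03 /usr/local/etc/httpd/htdocs/cgi-bin".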

Now, following the outline of the README, we'll take a moment to make an index
of some interesting files:

sti-gw% cd /local/project/Isearch-1.10.02
sti-gw% cd bin
sti-gw% Iindex -d /local/project/Isearch-1.10.02/tester -t sgmltag /etc/motd

That made a searchable index of the login banner.  Boring example, yes, but
one that anyone can handle.

The one remaining step is to make a web page that contains the buttons and
text fields and so forth we need to actually do the searching.  The program
"search_form" will do that for us.  Here's a simple example:

sti-gw% search_form /local/project/Isearch-1.10.02 tester > form.html

This creates a form named "form.html" that knows to use the "tester"
textbase in the "/local/project/Isearch-1.10.02" directory.  Copy form.html
to your httpd document directory (probably /usr/local/etc/httpd/htdocs).

Now point your favorite web browser at that page (probably you'll use a URL
like http://myhost.mydomain.com/form.html).  You should see a form with
blanks to fill in for search terms and a button that says "Submit Query".
When you click on that button, the query will run and you'll get your search
results.  By default, the page you get will have brief "headlines" for the
matching documents, and you'll be given links to click on to see the full
document if you want to.

There is a chance that you will see a message in your web browser after
searching that says something to the effect that it can't read
"/cgi-bin/isearch".  If this happens, check the entry for "ScriptAlias"
in srm.conf (probably /usr/local/etc/httpd/conf/srm.conf).  Make sure
it contains the fully qualified path name to your cgi-bin directory.
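
A quick way to see what your server is actually configured with is to pull
the ScriptAlias lines out of srm.conf.  The function below is only a sketch:
"show_script_alias" is a made-up name, and the default path is the one
assumed in this guide, so pass your own file if yours lives elsewhere.

```shell
# show_script_alias: print the active ScriptAlias lines from a server
# configuration file.  (show_script_alias is a hypothetical helper name.)
show_script_alias() {
    conf=${1:-/usr/local/etc/httpd/conf/srm.conf}
    # Match lines that begin with the ScriptAlias directive; lines that
    # are commented out with "#" are skipped.  -i is used in case your
    # file capitalizes the directive differently.
    grep -i '^[[:space:]]*ScriptAlias[[:space:]]' "$conf"
}

# A working entry has the form:
#   ScriptAlias /cgi-bin/ /usr/local/etc/httpd/htdocs/cgi-bin/
# The first path is what appears in the URL; the second is the real
# directory holding isearch and ifetch (the one shown is only the
# location assumed in this guide).
```

If the function prints nothing, there is no active ScriptAlias entry at all,
and that is likely your problem.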


That's pretty much the whole game; the rest is just details.


DETAILS:

As promised, here's the rest of the story.

More Advanced Search Forms:

First, the simple search page, form.html, that we created above is really
simple.  It allows you to enter three search terms, and they'll all be
"or"-ed together at search time.  Not all that wonderful.  No wonder it's
called a "simple" search form.  Fortunately, there are two other kinds of
search forms, "boolean" and "advanced".

A Boolean search form lets you specify two search terms and whether they are
"and"-ed, "or"-ed, or "andnot"-ed.  To generate that kind of form, use the
"-boolean" option to search_form:

sti-gw% search_form -boolean /local/project/Isearch-1.10.02 tester >form2.html

You can also create an advanced search form.  This allows you to type
free-form, infix boolean queries, like "((cheese and wine) or caviar) andnot
sherry".  To generate this kind of page, use:

sti-gw% search_form -advanced /local/project/Isearch-1.10.02 tester >form3.html
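
If you want all three kinds of form for the same textbase, the commands
above can be wrapped in one shell function.  This is a sketch:  the name
"make_forms" and the output file names are invented here; only search_form
and its "-boolean" and "-advanced" options come from this guide.

```shell
# make_forms: generate the simple, boolean, and advanced search forms
# for one index.  (make_forms and the output file names are hypothetical.)
make_forms() {
    index_dir=$1    # directory holding the Isearch index files
    db=$2           # index name, as given to Iindex with -d
    out_dir=$3      # directory to write the .html files into

    # The simple form takes no option.
    search_form "$index_dir" "$db" > "$out_dir/$db-simple.html" || return 1

    # The other two differ only in the option passed to search_form.
    for kind in boolean advanced; do
        search_form "-$kind" "$index_dir" "$db" \
            > "$out_dir/$db-$kind.html" || return 1
    done
}
```

For the example textbase, "make_forms /local/project/Isearch-1.10.02 tester ."
would leave tester-simple.html, tester-boolean.html, and tester-advanced.html
in the current directory, assuming search_form is on your path.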


Better Looking Forms:

The search forms that are generated are pretty plain.  You'll probably want
to add button bars and pictures of your company headquarters and so forth.
Feel free.  Go nuts.

The search results page is generated by the code in isrch_srch.cxx.  Look at
the "cout" statements and get a feel for what is going on, and then hack
away.  You can make the output as simple or as fancy as you'd like.  You can
have a little Java animation of yourself pointing at your favorite document
when it gets found, for instance.  Or color code the hits, red for the
highest score through purple to blue for the lowest.  Again, go nuts.

Fast, Clean Links:

You can also modify the search form so that fetches aren't done through
ifetch, but go straight to the html files that were indexed.  This assumes, of
course, that the files you indexed were part of your normal htdocs tree.  If
not, you're out of luck.  But if you just indexed your web site, add the
line:

<input name="HTTP_PATH" type=hidden value="/path/to/http/docs">

to your search form (like form3.html, above, for instance).  Make sure you
edit the pathname, though.  This technique will make the Isearch-cgi results
point to the real files instead of always going through ifetch.  This is
good for two reasons:

1) It protects the integrity of the link.  This is nice from a philosophical
   standpoint.
2) It is much faster and places much less load on your server.  This is nice
   from a job security standpoint.
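
If you regenerate your forms often, adding the HTTP_PATH line by hand gets
old.  Here is a sketch of a shell function that does it for you.  The name
"add_http_path" is made up, and the function assumes the form's closing
"</form>" tag sits on a later line than the opening tag; check the result
in an editor the first time you use it.

```shell
# add_http_path: print a copy of a search form with a hidden HTTP_PATH
# field inserted just before the line containing the closing </form> tag.
# (add_http_path is a hypothetical helper name.)
add_http_path() {
    form=$1         # existing search form, for example form3.html
    doc_path=$2     # the value you would otherwise type for HTTP_PATH

    # Pass the new line through the environment so awk prints it verbatim.
    FIELD="<input name=\"HTTP_PATH\" type=hidden value=\"$doc_path\">" \
    awk '
        # Insert once, before the first line that closes the form
        # (matched without regard to upper or lower case).
        !done && tolower($0) ~ /<\/form>/ { print ENVIRON["FIELD"]; done = 1 }
        { print }
        # Report failure if no closing tag was found.
        END { if (!done) exit 1 }
    ' "$form"
}
```

Write the output to a new file, as in
"add_http_path form3.html /path/to/http/docs > form3-direct.html";
redirecting onto the input file itself would empty it before awk reads it.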


That concludes the Isearch-cgi Users' Guide.  Make sure you subscribe to the
Isite mailing list for the most current information.  The directions for
doing this are in the file "README".  I won't repeat them here because
they might very well change in the near future.



