
The Windows 2000 Indexing Service

Also known as Index Server, the Indexing Service catalogs your website so that visitors can search your content. This article covers installing the Indexing Service on Windows 2000, pointing it to your web site, tuning it for speed and efficiency, querying it from ASP through ADO with sample code, and what to do if it fails.

Published: Apr 19, 2002
Tested with: ASP 3.0, Windows 2000 Server
Category: ASP

Introduction

The Indexing Service on Windows 2000 allows us to create a search engine for our site. Documentation on it, though, is amazingly scarce and scattered. There's plenty out there on how to use it with IDQ/HTX templates, but as I see it, there are three fundamental problems with those:

  1. they are hard to use as they require you to learn a separate language syntax.
  2. they don't allow you to use standard ASP code, as they pass through a special DLL filter and not the ASP.DLL. For example, you cannot use include files.
  3. they are old technology.

In this article, I will take you through what is necessary to get it working on your site. I will go through:

  • installing the service,
  • pointing it to your web site,
  • tuning it for speed and efficiency,
  • the ASP code which makes it work,
  • and what to do if it fails.


Installation

Version 3 of the Indexing Service is the only one that runs on Windows 2000 at this time (April 19, 2002), and it is normally installed by default. If you did not install the Indexing Service during Windows setup (or are not sure and want to check), here’s how to do it:
Start > Settings > Control Panel > Add/Remove Programs > Add/Remove Windows Components
If you see Indexing Service checked, then it's already installed. Let’s assume it's not installed yet. Check the Indexing Service and then press Next. Windows will install the necessary files and when it’s done, you need to click on Finish. You might be asked for the Windows 2000 Server installation CD to copy the necessary files.

The service should now be installed. To check on the service, go to:
Start > Programs > Administrative Tools > Computer Management

Then click on:
Services and Applications > Indexing Service
You should be able to expand the service and see the 2 Catalogs which were created by default: a System catalog and a Web catalog (this one only if it finds an instance of Internet Information Server (IIS) already running on the server).


Creating a new Catalog

A catalog is like a database where the service stores all the information after it is done indexing your files. My recommendation is to erase both catalogs which are created by default. The Web points to C:\Inetpub (potential security hole), and the System is only useful if you are going to do local searches on the server. If you are using the service only through the internet then it’s safe to erase both of them.

So, let’s say you want to use the service to search against your web site. First, you have to create a web site through the IIS console. I will assume that you know how to do that already and you have a web site up and running. Once you have that running, then you have to tell IIS to Index the web site. You do that through the Home Directory tab of the site Properties. Make sure Index this Resource is checked. If not, check it.

All subfolders of your web site will be indexed if you do this. If you want to exclude a certain folder from being indexed (for example, your images folder), navigate to that folder in the console, go to its properties and uncheck the Index this resource. There is another way to remove folders from a catalog, which I will go through later, but this is the recommended way for websites. This is where a good design of your website is necessary. Put all your images into a folder called images for example, and turn off indexing for it. Do the same for all your content that you want indexed or not indexed. That way, you can just check/uncheck the Index this resource on that folder's properties and it propagates to all the folders/files under it.

By default, the Indexing Service will index HTML files, text files, Office 95 and later files, internet mail and news, and any other document type for which a filter is provided. For example, Adobe makes its own IFilter which, once installed, lets the service index Acrobat (pdf) files.

The next step is to create a new catalog to house all the information. It’s probably a good idea to create a new folder to use exclusively for your catalog(s). Do not save your catalog under a folder that is being indexed by the same catalog or any other. Your English site could be under C:\catalogs\english, your French at C:\catalogs\french, etc. First create those folders. Then open your Computer Management Console, and right click on the Indexing Service, or click on Action on top, and go to New > Catalog.

Type a name for your catalog, and pick a location where you want to save the catalog (our english catalog would go under C:\catalogs\english).
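
The same step can also be scripted instead of clicked through. The following is only a sketch, assuming the Indexing Service admin automation object (ProgID Microsoft.ISAdm) is registered on the server and that the script runs under an administrator account; the catalog name and folder are the example values used above, and the service is stopped and restarted because a new catalog only comes online after a restart.

```vbscript
' Sketch: create the "english" catalog from a script (run with cscript).
' Assumptions: Microsoft.ISAdm is available, C:\catalogs\english already
' exists, and the script runs with administrative rights.
Option Explicit
Dim objAdmin, objCatalog

Set objAdmin = CreateObject("Microsoft.ISAdm")

' Stop the service before changing the catalog list
If objAdmin.IsRunning Then
    objAdmin.Stop
End If

' AddCatalog takes the catalog name and the folder that will hold it
Set objCatalog = objAdmin.AddCatalog("english", "C:\catalogs\english")

' Restart so that the new catalog comes online
objAdmin.Start

WScript.Echo "Catalog created. Associate it with a web site on the Tracking tab."
```

The association with a web site is still made on the Tracking tab as described next; this sketch only replaces the New > Catalog dialog.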

After you create it, you need to specify what to include or not include in it so that the service will start indexing that content. Right click on the catalog you want to edit, click on Properties and move to the Tracking tab. In this case we want it to point to a web site, so you have to tell it what web server to associate it with. Pick one from the pull-down list.

Now when you start the service, it will start indexing your web site. Under the Generation tab, you can either inherit the parent attributes or uncheck that option and customize them. I chose only to Generate Abstracts and not to index files with unknown extensions. An abstract is the short summary stored for each document. When indexing an HTML page, the service looks for a description meta tag in the HEAD of the document and uses it as the abstract. If there is none, it picks the first 320 characters from the body to create the abstract. The maximum length of this string is 500 characters. To define your own abstract in an HTML document, add a DESCRIPTION meta tag in the head of your file, like this:

<html>
<head>
<meta name="description" content="This page explains the WIN2K Indexing Service.">

As promised earlier, here is another way to add or delete folders to be indexed. Go to the Indexing Service Console and right click on Directories and go to New > Directory.

Doing so opens the Add Directory dialog box.

Choose the path of the folder you want to add to your catalog, and choose from the radio button whether you want it included or excluded from the index. You can add folders on remote computers as long as they are correctly mapped in your system. The Alias (UNC) is not necessary.


The "noise" file

Inside your C:\WINNT\system32 folder you should find a file called noise.eng. Open it with a text editor like Notepad. You will see that its contents are single words or numbers, each on its own line. This is the word exception file: when the Indexing Service indexes a document, it skips any word listed here. These are common words, such as "and" and "or", or numbers. You can edit this file, adding or deleting your own words. If you edit it, you will need to empty the catalogs and restart the Indexing Service so that the updated exception list can take effect.

There is a different noise file for every language: noise.enu is specifically for U.S. English, as opposed to noise.eng, which is for U.K. English. The French file is called noise.fra, the German noise.deu, and so on. You can see a list of all your files in the registry. Run regedit from Start and navigate to: HKEY_LOCAL_MACHINE > SYSTEM > CurrentControlSet > Control > ContentIndex > Language. You will see a subkey for each language, and the value that names its noise file is called NoiseFile.
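
If you prefer not to browse the registry by hand, a small Windows Script Host script can read the value for you. This is a minimal sketch; the subkey name English_US is an assumption, so check regedit for the language subkeys that actually exist on your server.

```vbscript
' Sketch: print the noise file registered for one language (run with cscript).
' Assumption: the language subkey is named "English_US" on this machine.
Option Explicit
Dim objShell, strKey, strNoiseFile

strKey = "HKLM\SYSTEM\CurrentControlSet\Control\ContentIndex\Language\English_US\NoiseFile"
Set objShell = CreateObject("WScript.Shell")

' RegRead raises an error if the key or value is missing, so trap it
On Error Resume Next
strNoiseFile = objShell.RegRead(strKey)
If Err.Number <> 0 Then
    WScript.Echo "Could not read " & strKey
Else
    WScript.Echo "Noise file: " & strNoiseFile
End If
On Error Goto 0
```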


Tuning Performance

First stop the Indexing Service. Once you do that, you can tune the performance of the engine. Go to All Tasks > Tune Performance:

You will see the following menu:

You can choose Dedicated Server if you want to make this catalog and this service immediately responsive to changes on the file system. You can also select Customize and then click on the button, which will give you this dialog:

Move the Indexing slider to Lazy for less immediate indexing or to Instant for immediate indexing of new and changed documents. Lazy indexing uses fewer resources; Instant indexing uses as much of the computer's resources as it can. Move the Querying slider to Low load if you expect to process only a few queries at a time or to High load if you expect to process many queries at a time. Low load uses fewer resources; High load uses more. You can increase or decrease these settings as you see fit. Keep in mind that increasing them will cause your server to use more resources for this activity. I have used the above setting for large sites with thousands of documents with success.

One thing about the Indexing Service's resources you should know about: It is very demanding on the OS when it is first started as it tries to index everything in the catalog. It moves through pretty quickly, indexing thousands of html documents in just a few minutes. But once it finishes it just sits there, not really using many resources. It responds to file changes through the OS, so it knows to index a file once it's changed/created/deleted. This way, you can keep the service running on a small computer and you still get good performance out of it.


The search input form (index.html)

This file is the form that accepts your search arguments.

<form action="runsearch.asp" method="post" name="form1">
<table cellpadding="2" cellspacing="0" border="0" align="left">
    <tr>
        <td width="50">&nbsp;</td>
        <td colspan="2">
            <b>Enter your query below:</b><br>
            <input type="text" name="Query" size="45" maxlength="100" value=""></td>
    </tr>
...

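I limit the query a user can enter to 100 characters with the maxlength attribute. That should be enough for most searches. Because maxlength is only enforced by the browser, runsearch.asp checks the length again on the server, which helps against deliberately oversized requests. You can change the limit if you like, as long as you change it in both places.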

...
    <tr>
        <td>&nbsp;</td>
        <td align="right">
            <select name="Scope">
                <option value="/" selected>Entire Site</option>
                <option value="/products/">Products</option>
                <option value="/services/">Services</option>
                <option value="/news_events/">News &amp; Events</option>
                <option value="/about_us/">About us</option>
            </select></td>
        <td>Search where?</td>
    </tr>
...

The Scope is another word for folder. It tells the Indexing Service whether to search everything (/) or just under a specific folder of the site (/products/). You can go as deep as you like and it will only search under that folder: for example, /products/bicycles/electric/kids/ would search for documents only under the kids folder.
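
To make this concrete, here is a small sketch of how a scope value ends up in the query. It reuses the string concatenation from the BuildQuery function in runsearch.asp; the scope value itself is just the example from the paragraph above.

```vbscript
' Sketch: the FROM clause that results from a given scope (run with cscript).
Option Explicit
Dim QUOT, strScope, strFrom

QUOT = Chr(34) ' double quote character
strScope = "/products/bicycles/electric/kids/"

' DEEP TRAVERSAL means the folder and everything below it is searched
strFrom = "SCOPE('DEEP TRAVERSAL OF " & QUOT & strScope & QUOT & "')"

WScript.Echo strFrom
' prints: SCOPE('DEEP TRAVERSAL OF "/products/bicycles/electric/kids/"')
```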

...
    <tr>
        <td>&nbsp;</td>
        <td align="right">
            <select name="RecordsPerPage">
                <option value="10" selected>10</option>
                <option value="25">25</option>
                <option value="50">50</option>
                <option value="100">100</option>
            </select></td>
        <td>Number of results per page</td>
    </tr>
...

Sometimes users want to set how many results they see on one page: fewer or maybe more. You can allow them to do this through a simple pulldown menu as shown above, or through a text box where they type the number themselves.
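
If you go the text box route, the value can be anything the user types, so it is worth normalizing it on the server before use. The following is a sketch only: ParseRecordsPerPage is a hypothetical helper that is not part of runsearch.asp, the default of 10 is an assumption, and the upper limit of 1000 mirrors the check used later in this article.

```vbscript
' Sketch: turn free-form input into a safe records-per-page value.
' ParseRecordsPerPage is a hypothetical helper; 10 is an assumed default.
Option Explicit

Function ParseRecordsPerPage(varInput)
    ParseRecordsPerPage = 10
    ' VBScript does not short-circuit, so test IsNumeric before converting
    If IsNumeric(varInput) Then
        If CDbl(varInput) >= 1 And CDbl(varInput) <= 1000 Then
            ParseRecordsPerPage = CInt(varInput)
        End If
    End If
End Function

WScript.Echo ParseRecordsPerPage("25")   ' prints 25
WScript.Echo ParseRecordsPerPage("abc")  ' prints 10
WScript.Echo ParseRecordsPerPage("5000") ' prints 10
```

In the ASP page you would call it as ParseRecordsPerPage(Request("RecordsPerPage")) instead of using WScript.Echo.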

...
    <tr>
        <td>&nbsp;</td>
        <td align="right">
            <select name="Order">
                <option value="Rank" selected>Ranking Result</option>
                <option value="Size">Size</option>
                <option value="Write">Last Date updated</option>
            </select></td>
        <td>Arrange in order</td>
    </tr>
...

It's common to list results in order of best match. However, it's possible to sort the resultset by any of the properties in the catalog. You can therefore allow the user to choose the sort order. Above, I give them the option of ordering the results by Rank, Size or Date Last Updated.
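
Because the chosen value is later placed directly into the ORDER BY clause of the query, it is safer to accept only the values the form offers. A minimal sketch, assuming a hypothetical helper named SafeOrder that is not part of runsearch.asp and that falls back to Rank for anything unexpected:

```vbscript
' Sketch: whitelist the sort column before it reaches the SQL string.
' SafeOrder is a hypothetical helper; the three names are the form's options.
Option Explicit

Function SafeOrder(strRequested)
    Select Case LCase(Trim(strRequested))
        Case "size"
            SafeOrder = "Size"
        Case "write"
            SafeOrder = "Write"
        Case Else
            ' covers "rank", an empty value, and anything tampered with
            SafeOrder = "Rank"
    End Select
End Function

WScript.Echo SafeOrder("write")             ' prints Write
WScript.Echo SafeOrder("Rank; DROP TABLE")  ' prints Rank
```

In the ASP page this would be used as strOrder = SafeOrder(Request("Order")).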

...
    <tr>
        <td>&nbsp;</td>
        <td align="right"><input type="SUBMIT" value="Search" name="SUBMIT"></td>
        <td>&nbsp;</td>
    </tr>
    <tr>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
    </tr>
</table>
</form>

And last, the submit button.

Finally, show me some code! (runsearch.asp)

The picture above shows what the returned results should look like. This file, runsearch.asp is responsible for issuing the search against the catalog and properly displaying the returned results. The code should work right out of the box for you, as long as you change a few variables. It consists of:

Global variables

Look for a section called EDIT THESE...END EDIT somewhere in the beginning of this file and change those parameters to fit your system. Those should be the only ones you need to change, the rest is up to you.

  1. One of them is starslocation, which points to where you saved your images for the ranking display.
  2. Another is strCatalog, which is the name that you gave to your catalog.
  3. The last one is strCustomTitle, which is explained below.

Sub RunSearch()

This is the main sub that gets called when the page loads and then calls everything else. It creates a connection to the Indexing Service, gets records through GetRows(), loops through them, and checks, validates and formats the output.

Function BuildQuery(strScope, strQuery)

This function returns a full SQL command to use against the Indexing Service with ADO. Out of the box it searches htm, html, asp, ppt, doc, xls, txt, and pdf files. It does not support Boolean searches.

Sub WriteNavigation(strNavigation, intTotalRecords, intTotalPages)

This sub creates the text for the top navigation links that you see in the picture, i.e. moving from page to page.

Function FileSize(intFileSize)

Formats the size output of a file to KB, MB, GB, etc.

Function myFixDate(datWrite)

Formats the date last modified output to an international date format.

Let's go through the code here in detail to help you understand what's going on. I have added some error catching, to account for mistakes as well as for malicious users trying to break your site.

<%@Language="VBScript"%>
<%Option Explicit%>
<%Response.Buffer = True%>
<html>
<head>
    <title>Search Results</title>
</head>
<body>
<%
On Error Goto 0
Dim strQuery 'user entered text for search
Dim intPage 'page number we are on
Dim intStartingRecord 'point to start selecting from the recordset
Dim intRecordsPerPage 'developer defined
Dim strOrder 'developer defined: what to order against
Dim strScope 'Scope to search against
Dim QUOT 'character 34 (double quote) for ease of coding
Dim strNavigation 'HTML string for navigation links/info
Dim starslocation 'folder path for search images
Dim strCatalog 'developer defined catalog name: query against this
Dim strCustomTitle 'starting string to remove from the title of html pages
'***** EDIT THESE **********************
starslocation = "images/"
strCatalog = "english"
strCustomTitle = "Xefteri - "
'***** END EDIT ************************
...

Between EDIT THESE...END EDIT is what you have to change to make it work for your site. One of these variables is called strCustomTitle. This is a little trick that I use to increase the ratings of my site, and you can do it too. Here's how it works: when one of the public search engines visits your site to index it, one of the most important factors in rating your site is the <title> tags in your pages. You can increase your ranking by including the name of your site in your titles. Let's say your site's name is XYZ. Your titles could then all start with "XYZ - " and then continue with a more descriptive title of the page.

<html>
<head>
    <title>XYZ - Welcome to our site!</title>
</head>

This accomplishes 2 things:

  1. It boosts your rankings when it comes to your name
  2. It improves the readability of someone's bookmarks to your site.

However, when I display the results of the search I use the title of the page as a link to the actual page. At this point, we do not want to show all our titles beginning with the same thing, so we simply edit the title before displaying it. Edit that variable in the code if you are going to use this, and leave it blank (strCustomTitle = "") if you are not. If the string is not empty, the code checks for it at the beginning of each title in the displayed results and removes it when it matches. If the string is empty, the whole title is displayed as is.
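
The stripping step on its own can be sketched as a small function. StripCustomTitle is a hypothetical name used here for illustration; in runsearch.asp the same logic sits inline in the results loop.

```vbscript
' Sketch: remove a fixed prefix from the start of a page title.
' StripCustomTitle is a hypothetical helper, not part of runsearch.asp.
Option Explicit

Function StripCustomTitle(strTitle, strPrefix)
    StripCustomTitle = strTitle
    If strPrefix <> "" Then
        ' compare case-insensitively against the start of the title
        If LCase(Left(strTitle, Len(strPrefix))) = LCase(strPrefix) Then
            ' Mid is 1-based, so the remainder starts one past the prefix
            StripCustomTitle = Mid(strTitle, Len(strPrefix) + 1)
        End If
    End If
End Function

WScript.Echo StripCustomTitle("XYZ - Welcome to our site!", "XYZ - ")
' prints: Welcome to our site!
WScript.Echo StripCustomTitle("About us", "XYZ - ")
' prints: About us
```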

...
'-- collect values from request
' leave request object open to account for both post and get
strQuery = Request("Query")
strQuery = Server.HTMLEncode(strQuery)
intPage = Request("PAGE")
intRecordsPerPage = Request("RecordsPerPage")
strOrder = Request("Order")
strScope = Request("Scope")
'-- define values
QUOT = Chr(34)
strNavigation = ""
'-- account for people trying to hack
'-- set max and min values for URL values
' (If/ElseIf is used because "Case intPage > 32767" inside a Select Case
' compares intPage with a Boolean and never catches a bad value)
If intPage = "" Then
    intPage = 1
ElseIf Not IsNumeric(intPage) Then
    Response.Write("Page number out of limit!")
    Response.End
ElseIf CDbl(intPage) > 32767 Or CDbl(intPage) < 1 Then
    Response.Write("Page number out of limit!")
    Response.End
Else
    intPage = CInt(intPage)
End If
'-- IsNumeric is tested on its own because VBScript does not short-circuit
If Not IsNumeric(intRecordsPerPage) Then
    Response.Write("Records Per Page out of limit!")
    Response.End
ElseIf CDbl(intRecordsPerPage) > 1000 Or CDbl(intRecordsPerPage) < 1 Then
    Response.Write("Records Per Page out of limit!")
    Response.End
Else
    intRecordsPerPage = CInt(intRecordsPerPage)
End If
strOrder = Server.HTMLEncode(strOrder)
strScope = Server.HTMLEncode(strScope)
If InStr(strScope, "..") Then
    Response.Write("Invalid Scope!")
    Response.End
End If
'-- if bad query string supplied (less than 2 characters), show message
If Len(strQuery) < 2 Then
    Response.Write("<p><b>Sorry, but the search text must be at least two characters long.</b></p>")
    Response.End
'-- if the user is trying to cause an overflow in the query string catch it
ElseIf Len(strQuery) > 100 Then
    Response.Write("<p><b>Sorry, but the search text must be no more than 100 characters long.</b></p>")
    Response.End
End If
'-- evaluate starting record in the recordset
intStartingRecord = ((intPage - 1) * intRecordsPerPage) + 1
'-- main sub that calls everything else
Call RunSearch()
...

The code above collects the user's inputs and tries to make sure that they fall within certain limits. This also helps prevent hacker attacks. If everything is ok, it calls the main sub RunSearch() which does all the work.
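
The paging arithmetic is easy to check with concrete numbers. The values below are made up for illustration. The starting record uses the same formula as the listing above. In runsearch.asp the page count comes from objRS.PageCount, and the rounded-up division shown here is only the equivalent calculation.

```vbscript
' Sketch: worked example of the paging arithmetic (run with cscript).
' The three input values are illustrative only.
Option Explicit
Dim intPage, intRecordsPerPage, intTotalRecords
Dim intStartingRecord, intTotalPages

intPage = 3
intRecordsPerPage = 10
intTotalRecords = 47

' pages 1 and 2 hold records 1-20, so page 3 starts at record 21
intStartingRecord = ((intPage - 1) * intRecordsPerPage) + 1

' integer division, rounded up: 47 records at 10 per page need 5 pages
intTotalPages = (intTotalRecords + intRecordsPerPage - 1) \ intRecordsPerPage

WScript.Echo intStartingRecord ' prints 21
WScript.Echo intTotalPages     ' prints 5
```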

...
Sub RunSearch()
    Dim strSearch 'function-returned SQL query
    Dim objConn 'Connection object
    Dim objRS 'Recordset object
    Dim intTotalRecords 'Recordset.RecordCount
    Dim intTotalPages 'objRS.PageCount
    Dim arrAllData 'Recordset.GetRows()
    Dim numrows 'UBound of arrAllData to get the total rows in objRS
    Dim rowcounter 'simple counter used in the loop
    Dim strDocTitle 'objRS("DocTitle")
    Dim lengthstrDocTitle 'Len(objRS("DocTitle"))
    Dim strFilename 'objRS("Filename")
    Dim strVPath 'objRS("VPath")
    Dim intSize 'objRS("Size")
    Dim datWrite 'objRS("Write")
    Dim strCharacterization 'objRS("Characterization")
    Dim numRank 'objRS("Rank")
    Dim NormRank 'Rank/10 = change to a percentage
    Dim stars 'image to display for Ranking
    '-- build up the query string by calling the BuildQuery function
    strSearch = BuildQuery(strScope, strQuery)
    '-- trap connection/query errors ourselves; without this the
    ' Err.Number test below is never reached when something fails
    On Error Resume Next
    '-- create a connection object to execute the query
    Set objConn = Server.CreateObject("ADODB.Connection")
    objConn.ConnectionString = "provider=msidxs; Data Source=" & strCatalog
    objConn.Open
    '-- create a recordset to hold the data
    Set objRS = Server.CreateObject("ADODB.RecordSet")
    objRS.CursorLocation = 3 'adUseClient
    objRS.Open strSearch, objConn, 0, 1 'adOpenForwardOnly, adLockReadOnly
    '-- if errors occurred
    If Err.Number <> 0 Then
        Response.Clear
        Response.Write("<p><b>There was an error processing your request.<br>Please go back and try again.</b></p>")
        '-- close all objects to free up resources
        objRS.Close
        Set objRS = Nothing
        objConn.Close
        Set objConn = Nothing
        Response.End
    Else
        '-- restore default error handling for the rest of the sub
        On Error Goto 0
        '-- no errors but no records returned
        If objRS.EOF And objRS.BOF Then
            Response.Clear
            Response.Write("<p><b>No pages that matched your query </b>[<b>" & strQuery & "</b>]<b> were found.</b></p>")
            '-- close all objects to free up resources
            objRS.Close
            Set objRS = Nothing
            objConn.Close
            Set objConn = Nothing
            Response.End
        '-- or if there was no error and some records were successfully returned then
        Else
            '-- set the recordset starting position so that we can get the number
            ' of records we want from this point on using the GetRows() function
            objRS.AbsolutePosition = intStartingRecord
            '-- set the pagesize through the object so we can count # of pages returned
            objRS.PageSize = intRecordsPerPage
            '-- # of total records found
            intTotalRecords = objRS.RecordCount
            '-- # of total pages found
            intTotalPages = objRS.PageCount
            '-- create a 2-dimensional array of the records using GetRows()
            ' and only select how many records we want to see per page
            arrAllData = objRS.GetRows(intRecordsPerPage)
            '-- close all objects to free up resources
            objRS.Close
            Set objRS = Nothing
            objConn.Close
            Set objConn = Nothing
            '-- write table to wrap contents with a margin equal to the cellpadding
            Response.Write("<div align=""left""><table border=""0"" cellspacing=""0"" cellpadding=""10"" align=""left""><tr><td>")
            '-- write top/bottom navigation links/info
            ' by calling the WriteNavigation() sub
            Call WriteNavigation(strNavigation, intTotalRecords, intTotalPages)
            '-- table with contents of search inside
            Response.Write("<br><table border=""0"" cellspacing=""0"" cellpadding=""0"" width=""100%"">")
            '-- find out how many rows we have
            ' this should be the same as the intRecordsPerPage but not always
            ' an exception would be when the last page does not have enough left
            numrows = UBound(arrAllData,2)
            '-- now loop through the records
            For rowcounter = 0 To numrows
                '-- row values held in variables for ease of use
                strDocTitle = arrAllData(0, rowcounter)
                strFilename = arrAllData(1, rowcounter)
                strVPath = arrAllData(2, rowcounter)
                '-- keep the raw value: FormatNumber fails on NULL,
                ' and FileSize() needs a number, not a formatted string
                intSize = arrAllData(3, rowcounter)
                datWrite = arrAllData(4, rowcounter)
                strCharacterization = arrAllData(5, rowcounter)
                numRank = arrAllData(6, rowcounter)
                '-- create an empty space if the field is empty
                ' for proper display of the table <td></td>
                If IsNull(strCharacterization) Or Trim(strCharacterization) = "" Then
                    strCharacterization = "&nbsp;"
                End If
                Response.Write("<tr><td bgcolor=""#AACCEE"" align=""right"">" _
                    & intStartingRecord & ")</td>" _
                    & "<td bgcolor=""#AACCEE"" width=""5"">&nbsp;</td>" _
                    & "<td bgcolor=""#AACCEE""><a href=""" & strVPath & """>")
                '-- if title found in header is bigger than 2 characters
                ' it probably means that there is a <title> for this document
                If Len(strDocTitle) > 2 Then
                    '-- look for and get rid of custom title words used for search engines
                    ' only if your strCustomTitle string is not empty
                    ' and those words are in the beginning of the title
                    If strCustomTitle <> "" Then
                        If LCase(Left(strDocTitle, Len(strCustomTitle))) = LCase(strCustomTitle) Then
                            lengthstrDocTitle = Len(strDocTitle)
                            '-- Mid is 1-based: start one character past the prefix
                            strDocTitle = Mid(strDocTitle, Len(strCustomTitle) + 1, lengthstrDocTitle)
                        End If
                    End If
                    Response.Write(Server.HTMLEncode(strDocTitle))
                '-- no title found in header or could not pick it up
                ' write filename instead so users have something to click on
                Else
                    Response.Write(Server.HTMLEncode(strFilename))
                End If
                Response.Write("</a></td></tr>" _
                    & "<tr><td align=""left"" valign=""top"">")
                '-- show proper image for ranking
                NormRank = numRank/10
                If NormRank > 80 Then
                    stars = "rankbtn5.gif"
                ElseIf NormRank > 60 Then
                    stars = "rankbtn4.gif"
                ElseIf NormRank > 40 Then
                    stars = "rankbtn3.gif"
                ElseIf NormRank > 20 Then
                    stars = "rankbtn2.gif"
                Else
                    stars = "rankbtn1.gif"
                End If
                '-- Chr(37) = %
                '-- write correct image and percentage ranking
                Response.Write("<img src=""" & starslocation & stars & """><br>" _
                    & NormRank & Chr(37) & "</td><td>&nbsp;</td>" _
                    & "<td align=""left"" valign=""top"">")
                '-- write summary of the page
                Response.Write(strCharacterization & "<br><br><i>")
                '-- write file size or show error in case
                ' we have a NULL value returned
                If Trim(intSize) = "" Or IsNull(intSize) Then
                    Response.Write("(size unknown) - ")
                Else
                    Response.Write("size " & FileSize(intSize) & " - ")
                End If
                '-- write date last modified or show error in case
                ' we have a NULL value returned for DateLastModified
                If Trim(datWrite) = "" Or IsNull(datWrite) Then
                    Response.Write("(time unknown)")
                Else
                    Response.Write(myFixDate(datWrite) & " GMT")
                End If
                Response.Write("</i></td></tr>" _
                    & "<tr><td colspan=""3"">&nbsp;</td></tr>")
                '-- increment the number listing showing on the left by one
                intStartingRecord = intStartingRecord + 1
            Next 'rowcounter = 0 To numrows
            '-- end of table with search contents
            Response.Write("</table><hr width=""100%"" size=""2"" noshade>")
            '-- now write again the top navigation menu we generated
            ' we don't need to call the sub again because
            ' it's now in a local variable
            Response.Write(strNavigation)
            '-- close wrapping table
            Response.Write("<br></td></tr></table></div>")
        End If 'objRS.EOF and objRS.BOF
    End If 'Err.Number <> 0
End Sub
...

Plainly, this sub calls everything else. It connects to the catalog, issues the query, returns a recordset, and then it formats it appropriately and writes it out. The rest of the functions are responsible for formatting the resultset.

...
'-- build SQL query string for Index Server ADO query
Function BuildQuery(strScope, strQuery)
    Dim strPropertyName
    Dim SQL 'SQL string to search against
    Dim strQText
    Dim blnAddedQ
    Dim intQPos
    SQL = "SELECT DocTitle, Filename, Vpath, Size, Write, Characterization, Rank FROM "
    If strScope = "" Then
        SQL = SQL & "SCOPE() "
    Else
        '-- trailing space keeps the clause apart from the WHERE that follows
        SQL = SQL & "SCOPE('DEEP TRAVERSAL OF " & QUOT & strScope & QUOT & "') "
    End If
    strQText = strQuery
    '-- wrap multi-word queries in double quotes so they are searched as a phrase
    If InStr(strQText, " ") > 0 Or InStr(strQText, "'") > 0 Then
        blnAddedQ = False
        If Left(strQText, 1) <> QUOT Then
            strQText = QUOT & strQText
            blnAddedQ = True
        End If
        If Right(strQText, 1) <> QUOT Then
            strQText = strQText & QUOT
            blnAddedQ = True
        End If
        '-- replace any quotes left inside the phrase with spaces
        If blnAddedQ Then
            intQPos = InStr(2, strQText, QUOT)
            Do While intQPos > 0 And intQPos < Len(strQText)
                strQText = Left(strQText, intQPos - 1) & " " & Mid(strQText, intQPos + 1)
                intQPos = InStr(2, strQText, QUOT)
            Loop
        End If
    End If
    SQL = SQL & "WHERE CONTAINS ('" & strQText & "') > 0"
    '-- If you want to add your own file types here, like asp for example
    ' then add another line like this before the closing parenthesis:
    ' SQL = SQL & " OR Filename LIKE '%.asp'"
    SQL = SQL & " AND (Filename LIKE '%.html'"
    '-- comment any of next lines to exclude certain files
    SQL = SQL & " OR Filename LIKE '%.asp'"
    SQL = SQL & " OR Filename LIKE '%.pdf'"
    SQL = SQL & " OR Filename LIKE '%.doc'"
    SQL = SQL & " OR Filename LIKE '%.xls'"
    SQL = SQL & " OR Filename LIKE '%.ppt'"
    SQL = SQL & " OR Filename LIKE '%.txt'"
    SQL = SQL & " OR Filename LIKE '%.htm')"
    SQL = SQL & " ORDER BY " & strOrder & " DESC"
    BuildQuery = SQL
End Function

51 '-- make HTML string for navigation links
52 ' on the top and bottom of the page
53 ' this sub first creates the navigation,
54 ' then stores it in a local variable (strNavigation)
55 ' so we can use it again without needing to call the sub,
56 ' and then writes it to the response
57 Sub WriteNavigation(strNavigation, intTotalRecords, intTotalPages)
58     Dim strScriptName
59     strScriptName = Request.ServerVariables("SCRIPT_NAME")
60     '-- controls to scroll to next or previous pages
61     strNavigation = "<center>" _
62         & "<a href=""index.html"">New Query</a><br>" _
63         & intTotalRecords & " total documents matching the query """ _
64         & strQuery & """<br>" _
65         & "Page " & intPage & " of " & intTotalPages & "<br>"
66     '-- if we are on the first page then the First and Previous Page
67     ' do not need to be active
68     If intPage = 1 Then
69         strNavigation = strNavigation & "First Page&nbsp;&nbsp;Previous Page&nbsp;"
70     '-- else if we are not on the first page make those links active
71     Else
72         strNavigation = strNavigation & "<a href=""" & strScriptName _
73             & "?Query=" & Server.URLEncode(strQuery) & "&PAGE=1" _
74             & "&RecordsPerPage=" & intRecordsPerPage _
75             & "&Order=" & strOrder & "&Scope=" & strScope & """>First Page</a>&nbsp;" _
76             & "&nbsp;<a href=""" & strScriptName _
77             & "?Query=" & Server.URLEncode(strQuery) & "&PAGE=" & intPage - 1 _
78             & "&RecordsPerPage=" & intRecordsPerPage _
79             & "&Order=" & strOrder & "&Scope=" & strScope & """>Previous Page</a>&nbsp;"
80     End If
81     '-- if we are on the last page then there is no need
82     ' to make the Next and Last Page active
83     If intPage = intTotalPages Then
84         strNavigation = strNavigation & "&nbsp;Next Page&nbsp;&nbsp;Last Page"
85     '-- else if we are not on the last page, then make them active
86     Else
87         strNavigation = strNavigation & "&nbsp;<a href=" & QUOT & strScriptName _
88             & "?Query=" & Server.URLEncode(strQuery) & "&PAGE=" & intPage + 1 _
89             & "&RecordsPerPage=" & intRecordsPerPage _
90             & "&Order=" & strOrder & "&Scope=" & strScope & """>Next Page</a>&nbsp;" _
91             & "&nbsp;<a href=" & QUOT & strScriptName _
92             & "?Query=" & Server.URLEncode(strQuery) & "&PAGE=" & intTotalPages _
93             & "&RecordsPerPage=" & intRecordsPerPage _
94             & "&Order=" & strOrder & "&Scope=" & strScope & """>Last Page</a>"
95     End If
96     strNavigation = strNavigation & "</center>"
97     Response.Write(strNavigation)
98 End Sub
99
100 '-- format filesize
101 Function FileSize(intFileSize)
102     const DecimalPlaces = 1
103     const FileSizeBytes = 1
104     const FileSizeKiloByte = 1024
105     const FileSizeMegaByte = 1048576
106     const FileSizeGigaByte = 1073741824
107     const FileSizeTeraByte = 1099511627776
108     Dim strFileSize, newFilesize
109     If (Int(intFileSize / FileSizeTeraByte) <> 0) Then
110         newFilesize = Round(intFileSize / FileSizeTeraByte, DecimalPlaces)
111         strFileSize = newFilesize & " TB"
112     ElseIf (Int(intFileSize / FileSizeGigaByte) <> 0) Then
113         newFilesize = Round(intFileSize / FileSizeGigaByte, DecimalPlaces)
114         strFileSize = newFilesize & " GB"
115     ElseIf (Int(intFileSize / FileSizeMegaByte) <> 0) Then
116         newFilesize = Round(intFileSize / FileSizeMegaByte, DecimalPlaces)
117         strFileSize = newFilesize & " MB"
118     ElseIf (Int(intFileSize / FileSizeKiloByte) <> 0) Then
119         newFilesize = Round(intFileSize / FileSizeKiloByte, DecimalPlaces)
120         strFileSize = newFilesize & " KB"
121     ElseIf (Int(intFileSize / FileSizeBytes) <> 0) Then
122         newFilesize = intFilesize
123         strFileSize = newFilesize & " Bytes"
124     ElseIf Int(intFileSize) = 0 Then
125         strFilesize = 0 & " Bytes"
126     End If
127     FileSize = strFileSize
128 End Function
129
130 '-- format date properly for international viewing
131 Function myFixDate(datWrite)
132     Dim strHTMLout
133     strHTMLout = FormatDateTime((datWrite), 1) & " at " & FormatDateTime((datWrite), 3)
134     myFixDate = strHTMLout
135 End Function
136 %>
137 </body>
138 </html>


The WHERE clause

The query that you create against the Indexing Service can be as complex as you want it. Here are some more things you can do with it:

CONTAINS

The following line matches documents that contain toys or factories:
WHERE CONTAINS('"toys" OR "factories"')

To match documents where toys appears within 50 words of factories:
WHERE CONTAINS('"toys" NEAR "factories"')
Note: this feature was cut from the RTM version of Indexing Service 3 at the last minute, and the documentation has not been updated to reflect the cut. The NEAR syntax is therefore ignored, but no error message is raised. The 50-word window is built into the FREETEXT ranking algorithm.
Note for .NET: the proximity operator works, but you still can't specify the distance.

To match toys, toy, toyed, etc.:
WHERE CONTAINS('FORMSOF(INFLECTIONAL, "toy")')

FREETEXT

When you want to search for the best match for a word or a phrase:
WHERE FREETEXT('toys for kids')

LIKE

Uses wildcards to perform matches:
WHERE DocTitle LIKE '%toy%'

MATCHES

Uses regular expressions to perform matches. For example, all entries where DocAuthor starts with any character between a and e:
WHERE MATCHES (DocAuthor, "[a-e]*")

NULL

Matches null (or non-null) values:
WHERE DocTitle IS NULL
WHERE DocTitle IS NOT NULL
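
As a rough sketch of how these predicates could be combined from ASP code, the helper below builds a WHERE clause from a user-entered term. The function name BuildWhereClause and its three parameters are assumptions made for illustration; they are not part of runsearch.asp. It doubles single quotes and drops double quotes so the term cannot break out of the string literal it is placed in.

```vbscript
'-- Hypothetical helper (not part of runsearch.asp):
'   builds a WHERE clause from a search term and an optional title filter.
Function BuildWhereClause(strTerm, blnBestMatch, strTitlePart)
    Dim strSafe, strWhere
    '-- double single quotes and drop double quotes so the term
    '   stays inside the literals it is embedded in
    strSafe = Replace(strTerm, "'", "''")
    strSafe = Replace(strSafe, """", "")
    If blnBestMatch Then
        '-- FREETEXT looks for the best match for the word or phrase
        strWhere = "WHERE FREETEXT('" & strSafe & "')"
    Else
        '-- CONTAINS with FORMSOF also matches inflected forms (toy, toys, toyed)
        strWhere = "WHERE CONTAINS('FORMSOF(INFLECTIONAL, """ & strSafe & """)')"
    End If
    '-- optionally narrow the results by document title
    If Len(strTitlePart) > 0 Then
        strWhere = strWhere & " AND DocTitle LIKE '%" _
            & Replace(strTitlePart, "'", "''") & "%'"
    End If
    BuildWhereClause = strWhere
End Function
```

For example, BuildWhereClause("toy", False, "") returns WHERE CONTAINS('FORMSOF(INFLECTIONAL, "toy")'), and BuildWhereClause("toys for kids", True, "toy") returns WHERE FREETEXT('toys for kids') AND DocTitle LIKE '%toy%'.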


Recovery

Well, you got the search engine working and everybody is happy. You are receiving kudos from everyone around. But if the search functionality on your site is vital, how can you ever know if something is wrong? Let’s talk about how we can take precautionary action to attempt to fix the service automatically, and be warned if something is wrong. Then you can really sit back and enjoy.

Go to Start > Programs > Administrative Tools > Services:

Double click, or right click and go to Properties, on the Indexing Service to open the Indexing Service Properties dialog. Click on the Recovery tab.

Here, you can define the actions to take when the service fails. You can try different scenarios to find what best fits your needs. I decided to restart the service on the First failure and to Run a File on the Second failure. I created a folder called ServerScripts and placed my custom script files in there. The SendEmailOnServiceFail.vbs file first makes sure the service is down by attempting to stop it, and then tries to start it again. It then sends an email to notify someone that the service had to be restarted and may still not be working properly. This file uses WSH and CDONTS to send the email, and the account it runs under needs sufficient rights on the Windows system (for example, Administrator).

'-- Declare variables
Dim objSendMail
Dim objAdminIS
'-- The following stops and restarts Indexing Service
'   comment out the following 4 lines if you do not want this feature.
'   To use this object you need administrative access
Set objAdminIS = CreateObject("Microsoft.ISAdm")
objAdminIS.Stop 'Make sure it's off first
objAdminIS.Start 'And then restart it
Set objAdminIS = Nothing
'-- Send email when service fails
Set objSendMail = CreateObject("CDONTS.NewMail")
'change the FROM and TO below
objSendMail.From = "someone@somewhere.com"
objSendMail.To = "youremail@yourdomain.com"
objSendMail.Subject = "Indexing Service has failed!"
objSendMail.Body = "<H2><FONT COLOR=Red>" & Date() & " - " & Time() & "</FONT></H2>" _
    & "The Indexing Service has failed. Please check your server!"
objSendMail.BodyFormat = 0 'Body property is HTML
objSendMail.MailFormat = 0 'MIME format
objSendMail.Importance = 2 'High Importance
objSendMail.Send
Set objSendMail = Nothing
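
Since the service may still not be working after the restart, it can help to say so in the email. The following is only a sketch: it assumes the Microsoft.ISAdm object exposes an IsRunning method (check this on your own server), and the variable names blnRunning and strStatus are made up for this example.

```vbscript
'-- Sketch: restart the service, then record whether it actually came back.
'   Assumes Microsoft.ISAdm exposes IsRunning; verify on your own server.
Dim objAdminIS, blnRunning, strStatus
blnRunning = False
Set objAdminIS = CreateObject("Microsoft.ISAdm")
On Error Resume Next    ' Stop or Start may raise an error if the service is in a bad state
objAdminIS.Stop
Err.Clear
objAdminIS.Start
blnRunning = objAdminIS.IsRunning
On Error GoTo 0
Set objAdminIS = Nothing
If blnRunning Then
    strStatus = "The service was restarted and is running again."
Else
    strStatus = "The service did not come back up after the restart."
End If
'-- strStatus can now be appended to objSendMail.Body before calling Send
```

Because blnRunning starts out as False, any error raised by Start or IsRunning is reported as a failed restart rather than silently ignored.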

The default timeout for vbs files like these is 10 seconds. If you want to change that, right click on the vbs file and click on Properties. Go to the Script tab, and then check "Stop script after specified number of seconds". This lets you change the 10-second default to whatever value you want. When you click OK, the dialog creates a file in the same folder as your vbs file, giving it the same name with the extension ".wsh". This is what a sample file would look like:

[ScriptFile]
Path=C:\ServerScripts\SendEmailOnServiceFail.vbs
[Options]
Timeout=20
DisplayLogo=1
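
The timeout can also be set from inside the script itself, which avoids depending on the separate .wsh file. A minimal sketch, assuming the script runs under Windows Script Host (wscript.exe or cscript.exe); the 20-second value simply mirrors the sample file:

```vbscript
'-- Allow this script up to 20 seconds before Windows Script Host stops it.
'   Place this near the top of SendEmailOnServiceFail.vbs.
WScript.Timeout = 20
```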

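You can test it by double clicking on the vbs file to run it. From then on, if the service fails, you will receive an email alerting you to the fact. The email shows the date and time of the failure in a red heading, followed by the message "The Indexing Service has failed. Please check your server!"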




Putting it all together

To summarize, here are the steps you need to take to make this work for you:

  1. create your website
  2. turn indexing on for the site
  3. install the Indexing Service
  4. create a catalog
  5. associate the new catalog with your site
  6. add/remove folders from catalog
  7. tune performance
  8. edit noise files
  9. copy the search input file (index.html) to your site
  10. copy the ranking images to your site
  11. copy runsearch.asp to your site
  12. make sure index.html posts to runsearch.asp
  13. edit the 3 variables in runsearch.asp
  14. create recovery procedures


Conclusion

Running an ADO query may not be the most flexible way to work with the Indexing Service, but it is the simplest by far, and for most cases it is good enough. You can do a lot more with this service. For example, you can have it index custom meta tags and then add those to your queries and results. Or you can export the catalog into a relational database like SQL Server, and then combine it with a content management system for a more advanced search. This article's intention was to give you a quick way to get the service up and running. Feel free to alter the code as much as you like, and give me feedback. Maybe you can make it faster or more reliable, or simply expand on it.

 


