HazCafe.com blog
Tutorial: Simple site search 18/02/2010
Search is a useful feature on any website with more than a little content. If you're not using a CMS like WordPress and wrote your website from scratch, you may want to implement it yourself. Google lets you put its search engine on your website, but you may have reasons to look for an alternative, or to create your own. Here I'll show you how to implement a simple site search feature on a database-driven website. It's my first tutorial, so I hope I'll be able to put everything in an understandable way.
We want some basic usability: search the title and content of, say, articles stored in the database our site uses, and return the results that contain all the words or phrases entered by the user and none of the words or phrases preceded by a "-" sign.
What you need is a server with PHP 4 or 5 and a MySQL database.
First of all create a MySQL table where the content will be stored. In this tutorial I'll make an extremely simple table:
CREATE TABLE `search_tutorial` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
`title` VARCHAR(512) NOT NULL,
`content` TEXT NOT NULL
);
So the table has three fields: id, title and content, and is named "search_tutorial". After adding a few things to the table, let's carry on to our search script.
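The second thing to do is to create a PHP file. It's index.php in this tutorial. Add a basic HTML structure, then connect to MySQL and select the database that holds the table (database_name is a placeholder, like the other arguments):
mysql_connect(database_address, database_user_login, database_user_password);
mysql_select_db(database_name);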
Then, add a search form:
The PHP bit in the form redisplays the phrase entered into the search field, if there was one.
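A minimal sketch of such a form, assuming it posts back to index.php. The field name search_object is the one the script reads from $_POST; the helper name search_form_html and the exact markup are assumptions made for illustration.

```php
<?php
// Hypothetical helper (not part of the original script): builds the search form.
// $previous is the phrase from the last search, redisplayed in the text field.
function search_form_html($previous) {
    // Escape the phrase so quotes or angle brackets in it can't break the markup.
    $value = htmlspecialchars($previous, ENT_QUOTES);
    return '<form method="post" action="index.php">'
         . '<input type="text" name="search_object" value="' . $value . '" />'
         . '<input type="submit" value="Search" />'
         . '</form>';
}

// Redisplay the submitted phrase if there was one, otherwise an empty field.
$previous = isset($_POST['search_object']) ? $_POST['search_object'] : '';
echo search_form_html($previous);
```

Escaping the value matters here because search phrases are typed with quotation marks, which would otherwise end the value attribute early.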
What happens after a phrase is entered into the search field? We want the string split into separate words or phrases and placed in two arrays: one with the terms we WANT in the results, and one with the terms we DON'T want to occur, that is, every word or phrase preceded by "-". Phrases are written between quotation marks and may have "-" in front, so let's extract them first:
$object = $_POST['search_object'];
$regex_pattern = '/-?"[^ ][^"]*[^ ]"/';
preg_match_all($regex_pattern,$object,$matches);
The regular expression matches everything between a pair of quotation marks, as long as the text doesn't start or end with a space and contains no quotation marks itself; it also matches -"phrase to search", that is, a phrase with "-" in front. The preg_match_all function then puts the matches into the $matches array (strictly speaking, into $matches[0]).
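To see what the pattern picks up, here is a small standalone run. The pattern is the one used above; the sample input is made up for illustration.

```php
<?php
// Same pattern as in the tutorial: optional "-", then a quoted phrase
// that neither starts nor ends with a space.
$regex_pattern = '/-?"[^ ][^"]*[^ ]"/';

// Made-up sample input: two plain words, one excluded phrase, one wanted phrase.
$sample = 'star -"death star" "luke skywalker" wars';

preg_match_all($regex_pattern, $sample, $matches);

// $matches[0] lists the full matches, quotes and leading "-" included.
print_r($matches[0]);
```

Here $matches[0] holds -"death star" and "luke skywalker"; the plain words star and wars are left for the next step. Note that the pattern needs at least two characters between the quotes, so a one-character phrase such as "a" is not matched.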
Now, let's tidy this array up a bit.
for ($i = 0;$i<count($matches[0]);$i++) {
$to_remove = array('<', '>', '.', '+', '(', ')', '{', '}', '[', ']', '/', '\\', '^', '"');
$matches[0][$i] = str_replace($to_remove, '', $matches[0][$i]);
}
What happens here is that every element of the $matches[0] array has its quotes removed, as well as some other characters that could be problematic later.
Now let's look at what is left of the string entered into the search field. With the phrases taken out, we just need to split the rest into single words, which are separated by spaces or "+" signs. There may be more than one space between words, so we use a regular expression to split the string, put the words into an array, and merge it with the $matches[0] array.
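// Take the quoted phrases out of the string so they aren't split into words as well
$object = preg_replace($regex_pattern, ' ', $object);
$object = preg_split('/-?[\+ ]+/', $object, 0, PREG_SPLIT_NO_EMPTY);
$object = array_merge($matches[0], $object);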
Now let's move the words we don't want in our search results to a separate array ($excluded here) and remove them from the $object array. We also need to strip the leading "-" from them.
$excluded = array();
$object_count = count($object);
for($i = 0;$i<$object_count;$i++) {
if (strpos($object[$i], '-') === 0) {
$excluded[] = substr($object[$i], 1);
unset($object[$i]);
}
}
$object = array_values($object);
The last line reindexes the array, so there are no holes left after the unwanted phrases have been removed from it.
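The parsing steps so far can be sketched as one function. This is only an illustration under a few assumptions: the function name parse_search_query is hypothetical, the list of stripped characters is assumed to start with '<' and '>', and the quoted phrases are cut out of the string before it is split into words.

```php
<?php
// Hypothetical helper (not part of the original script): turns the raw search
// string into two arrays, wanted terms and excluded terms.
function parse_search_query($input) {
    $pattern = '/-?"[^ ][^"]*[^ ]"/';
    preg_match_all($pattern, $input, $matches);

    // Strip the quotes and other characters that could cause trouble later.
    $to_remove = array('<', '>', '.', '+', '(', ')', '{', '}', '[', ']', '/', '\\', '^', '"');
    $terms = array();
    foreach ($matches[0] as $phrase) {
        $terms[] = str_replace($to_remove, '', $phrase);
    }

    // Assumption: phrases are removed from the string before splitting,
    // so their words are not added a second time as single terms.
    $rest = preg_replace($pattern, ' ', $input);
    $words = preg_split('/-?[\+ ]+/', $rest, 0, PREG_SPLIT_NO_EMPTY);
    $terms = array_merge($terms, $words);

    // Terms starting with "-" go to the excluded list, without the "-".
    $included = array();
    $excluded = array();
    foreach ($terms as $term) {
        if (strpos($term, '-') === 0) {
            if (strlen($term) > 1) {   // skip a lone "-"
                $excluded[] = substr($term, 1);
            }
        } else {
            $included[] = $term;
        }
    }
    return array($included, $excluded);
}

list($included, $excluded) = parse_search_query('star -"death star" wars -trek');
print_r($included);
print_r($excluded);
```

For this sample input, $included ends up as star and wars, and $excluded as death star and trek. Building two fresh arrays avoids the unset() and array_values() pair, but the result is the same as in the step-by-step version.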
At this point we have two arrays: one with the words we want in our results ($object), and one with the words we don't want ($excluded). Now we have to check every item from the database table to see whether its title and/or content has all the words we want and none of the words we don't want. We'll count the elements in both arrays and use a regular expression to find out whether the item contains each word. We don't want a match in the middle of another word (so if someone types "star", "deathstar" won't be a match). We'll also remove the HTML tags from the database item and make it lowercase, along with the search words, because we want the search to be case-insensitive.
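The whole-word check can be tried on its own, without a database. The sketch below is an assumption about how such a check might look: the helper name contains_whole_word is hypothetical, and "whole word" is taken to mean that no letter or digit sits directly before or after the term.

```php
<?php
// Hypothetical helper (not part of the original script): true if $word occurs
// in $html as a whole word, ignoring HTML tags and letter case.
function contains_whole_word($html, $word) {
    // Plain lowercase text, so tags and capitalisation don't affect the match.
    $text = strtolower(strip_tags($html));

    // Assumed boundary rule: no letter or digit right before or after the term.
    // preg_quote() escapes characters that have a special meaning in a pattern.
    $pattern = '/(?<![a-z0-9])' . preg_quote(strtolower($word), '/') . '(?![a-z0-9])/';

    return preg_match($pattern, $text) === 1;
}

var_dump(contains_whole_word('<b>Star</b> Wars', 'star'));                 // bool(true)
var_dump(contains_whole_word('The Deathstar plans', 'star'));              // bool(false)
var_dump(contains_whole_word('The death star plans', 'death star'));       // bool(true)
```

Lookarounds are used instead of \b so that the check works the same way for phrases containing spaces. The script applies this kind of check to every row, once for each wanted term and once for each excluded term.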
if (!empty($object)) {
$obj_count = count($object);
}
if (!empty($excluded)) {
$exc_count = count($excluded);
}
$counter = 0;
$query = mysql_query("SELECT * FROM search_tutorial ORDER BY id DESC");
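while ($item = mysql_fetch_assoc($query)) {
$notfound = null;
// Title and content as plain lowercase text
$text = strtolower(strip_tags($item['title'].' '.$item['content']));
if (!empty($object)) {
for ($i=0;$i<$obj_count;$i++) {
// Assumed whole-word pattern: no letter or digit directly before or after the term
$search_pattern = '/(?<![a-z0-9])'.preg_quote(strtolower($object[$i]), '/').'(?![a-z0-9])/';
if (!preg_match($search_pattern, $text)) { $notfound = true; }
}
}
if (!empty($excluded)) {
for ($i=0;$i<$exc_count;$i++) {
$search_pattern = '/(?<![a-z0-9])'.preg_quote(strtolower($excluded[$i]), '/').'(?![a-z0-9])/';
if (preg_match($search_pattern, $text)) { $notfound = true; }
}
}
if (!$notfound) {
$counter++;
// The markup around the title is an assumption
echo '<h2>'.$item['title'].'</h2>';
echo $item['content'];
}
}
if ($counter == 0) {
echo '<p>Nothing found.</p>';
}
And we're done! Of course, this is only a base and much more could be done, but it's a nice beginning. Here's the whole script: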
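<?php
// Run the search only when a phrase was submitted (assumed form of the outer check)
if (!empty($_POST['search_object'])) {
$object = $_POST['search_object'];
$regex_pattern = '/-?"[^ ][^"]*[^ ]"/';
preg_match_all($regex_pattern,$object,$matches);
for ($i = 0;$i<count($matches[0]);$i++) {
$to_remove = array('<', '>', '.', '+', '(', ')', '{', '}', '[', ']', '/', '\\', '^', '"');
$matches[0][$i] = str_replace($to_remove, '', $matches[0][$i]);
}
$object = preg_replace($regex_pattern, ' ', $object);
$object = preg_split('/-?[\+ ]+/', $object, 0, PREG_SPLIT_NO_EMPTY);
$object = array_merge($matches[0], $object);
$excluded = array();
$object_count = count($object);
for($i = 0;$i<$object_count;$i++) {
if (strpos($object[$i], '-') === 0) {
$excluded[] = substr($object[$i], 1);
unset($object[$i]);
}
}
$object = array_values($object);
if (!empty($object)) {
$obj_count = count($object);
}
if (!empty($excluded)) {
$exc_count = count($excluded);
}
$counter = 0;
$query = mysql_query("SELECT * FROM search_tutorial ORDER BY id DESC");
while ($item = mysql_fetch_assoc($query)) {
$notfound = null;
$text = strtolower(strip_tags($item['title'].' '.$item['content']));
if (!empty($object)) {
for ($i=0;$i<$obj_count;$i++) {
// Assumed whole-word pattern: no letter or digit directly before or after the term
$search_pattern = '/(?<![a-z0-9])'.preg_quote(strtolower($object[$i]), '/').'(?![a-z0-9])/';
if (!preg_match($search_pattern, $text)) { $notfound = true; }
}
}
if (!empty($excluded)) {
for ($i=0;$i<$exc_count;$i++) {
$search_pattern = '/(?<![a-z0-9])'.preg_quote(strtolower($excluded[$i]), '/').'(?![a-z0-9])/';
if (preg_match($search_pattern, $text)) { $notfound = true; }
}
}
if (!$notfound) {
$counter++;
// The markup around the title is an assumption
echo '<h2>'.$item['title'].'</h2>';
echo $item['content'];
}
}
if ($counter == 0) {
echo '<p>Nothing found.</p>';
}
}
?>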







