Top SEM and SEO Tips    

Archive for March, 2005

Search Engine Friendly Redirects – Custom 404s

Tuesday, March 15th, 2005

There are three articles dealing with redirects; see related posts for more information.

If you are going to move a page you will likely want to redirect visitors from the old page to the new one in such a way that the search engines don’t get confused. Some of the ways they can get confused include:

  • Bringing up two copies of the same page. This is likely to trip a duplicate content penalty.
  • Using a temporary redirect. This means ‘the page has moved but will be back shortly – don’t update your index’.

A 301 permanent redirect is the redirection method recommended by the major search engines. Using a 301 redirect you are in effect telling the search engines that the page has moved and that they should update their index. It also has the nice side benefit of passing the value of inbound links on to the new page.

Detailed below is how to use a custom 404 page to handle moving pages. A 404 error is produced when the server cannot find the file requested by a visitor. Redirecting from the custom 404 page is useful if you can’t use the normal 301 redirect methods, such as when you move to a different CMS or the whole of your site’s file layout changes. I have used these techniques when switching from static .htm pages to dynamic .asp pages, which necessitated changing all filenames. You can also use this method to make the redirects database driven.

These 404-based redirection techniques rely on programming. Once they are set up, use a header checking tool (there are plenty available on the net) to check that the server has overridden the 404 error code with a 301 code.
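If no header checking tool is to hand, the check can be approximated with a few lines of script. The following is only a sketch in Python (chosen as a neutral language; it is not part of the original setup): `fetch_head` and `describe` are hypothetical helper names, and `www.mywebsite.com` is the placeholder host used elsewhere in these articles.

```python
import http.client

def fetch_head(host, path, port=80):
    """Send a HEAD request without following redirects; return (status, Location)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()

def describe(status, location):
    """Summarise what the response means for a moved page."""
    if status == 301 and location:
        return "OK: permanent redirect to " + location
    if status in (302, 307):
        return "Warning: temporary redirect, the index will not be updated"
    if status == 404:
        return "Problem: the 404 status was not overridden"
    return "Unexpected status %d" % status

# Example call against the placeholder host (needs a real server to run):
# print(describe(*fetch_head("www.mywebsite.com", "/old-file.htm")))
```

The important detail is that the request must not follow redirects automatically, otherwise you only see the status of the final page and not the 301 itself.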

IIS 404 Redirect

Set up the custom 404 page using the IIS MMC as follows:

  • Right click on the website or directory that you want the 404 to apply to.
  • Click ‘Properties’.
  • Click on ‘Custom Errors’.
  • Scroll down to 404 and highlight. Click the ‘Edit’ button.
  • In the drop down, select URL (this is important – doesn’t work otherwise). Then enter a URL on your site to use for the programming.
  • Click ‘OK’ to save this change.

On the custom 404 page itself you can use vbscript or any other programming language to read the server variables to decide what page to display, or where to redirect the user.

Apache 404 Redirect

Create a file called .htaccess in your root directory and add the following line:

ErrorDocument 404 /errors/404.php

On the custom 404 page itself you can use PHP or any other programming language to read the server variables to decide what page to display, or where to redirect the user.
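The decision logic on the custom 404 page is the same whichever language you write it in. Here is a rough sketch in Python, purely for illustration. It assumes that IIS passes the original address to the error URL in the query string in the form `404;http://host:80/path`, and that Apache leaves the original path in `REQUEST_URI`; check both against your own server. The `MOVED` table and the function names are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical old-to-new mapping; in practice this could come from a database.
MOVED = {
    "/old-file.htm": "http://www.mywebsite.com/new-file.htm",
}

def requested_path(server_vars):
    """Recover the originally requested path from the server variables."""
    query = server_vars.get("QUERY_STRING", "")
    if query.startswith("404;"):
        # Assumed IIS form: "404;http://host:80/old-file.htm"
        return urlsplit(query[4:]).path
    # Assumed Apache form: the original path is still in REQUEST_URI
    return urlsplit(server_vars.get("REQUEST_URI", "")).path

def respond(server_vars, moved=MOVED):
    """Return (status line, headers) for the custom 404 page to send."""
    target = moved.get(requested_path(server_vars).lower())
    if target:
        return "301 Moved Permanently", [("Location", target)]
    return "404 Not Found", []
```

Pages that are not in the table still get a genuine 404, so only the addresses you have actually moved are redirected.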

 



Search Engine Friendly Redirects – Directory Level

Tuesday, March 15th, 2005

There are three articles dealing with redirects – see related posts for more information.

If you are going to move a page you will likely want to redirect visitors from the old page to the new one in such a way that the search engines don’t get confused. Some of the ways they can get confused include:

  • Bringing up two copies of the same page. This is likely to trip a duplicate content penalty.
  • Using a temporary redirect. This means ‘the page has moved but will be back shortly – don’t update your index’.

A 301 permanent redirect is the redirection method recommended by the major search engines. Using a 301 redirect you are in effect telling the search engines that the page has moved and that they should update their index. It also has the nice side benefit of passing the value of inbound links on to the new page.

Implementing a 301 permanent redirect differs depending on the web server you are using:

IIS Redirect

  • In Internet Services Manager, right click on /old-directory.
  • Select the radio button titled “A redirection to a URL”.
  • Enter the redirection page.
  • Check “The exact URL entered above” and “A permanent redirection for this resource”.
  • Click on ‘Apply’.

Apache Redirect

Create a file called .htaccess in your root directory and add the following line:

Redirect 301 /old-directory/ http://www.mywebsite.com/new-directory/
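The Apache Redirect directive matches on the start of the requested path and carries anything after the matched part over to the new location, which is why one line covers the whole directory. As a sketch of that behaviour only (Python is used for illustration, and `redirect_target` is a hypothetical name, not an Apache function):

```python
def redirect_target(request_path, old_prefix, new_base):
    """Return the new URL for request_path, or None if the rule does not apply."""
    if not request_path.startswith(old_prefix):
        return None
    # Whatever follows the matched prefix is carried over unchanged.
    return new_base + request_path[len(old_prefix):]
```

With the rule above, a request for /old-directory/page.htm would be sent to http://www.mywebsite.com/new-directory/page.htm, so the file names inside the directory need to stay the same for the redirect to land on a real page.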


Search Engine Friendly Redirects – File Level

Tuesday, March 15th, 2005

There are three articles dealing with redirects: redirecting one file at a time, redirecting one directory at a time, and redirecting multiple pages easily.

If you are going to move a page you will likely want to redirect visitors from the old page to the new one in such a way that the search engines don’t get confused. Some of the ways they can get confused include:

  • Bringing up two copies of the same page. This is likely to trip a duplicate content penalty.
  • Using a temporary redirect. This means ‘the page has moved but will be back shortly – don’t update your index’.

A 301 permanent redirect is the redirection method recommended by the major search engines. Using a 301 redirect you are in effect telling the search engines that the page has moved and that they should update their index. It also has the nice side benefit of passing the value of inbound links on to the new page.

Implementing a 301 permanent redirect differs depending on the web server and/or programming language you are using:

IIS Redirect

  • In Internet Services Manager, right click on /old-file.htm.
  • Select the radio button titled “A redirection to a URL”.
  • Enter the redirection page.
  • Check “The exact URL entered above” and “A permanent redirection for this resource”.
  • Click on ‘Apply’.

Apache Redirect

Create a file called .htaccess in your root directory and add the following line:

Redirect 301 /old-file.htm http://www.mywebsite.com/new-file.htm

ColdFusion Redirect

Edit the file /old-file.htm and add the following code:

<cfheader statuscode="301" statustext="Moved Permanently">
<cfheader name="Location" value="http://www.mywebsite.com/new-file.htm">

PHP Redirect

Edit the file /old-file.htm and add the following code:

<?php
// Send the permanent redirect status, then the new address.
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.mywebsite.com/new-file.htm");
exit;
?>

ASP Redirect

Edit the file /old-file.htm and add the following code:

<%@ Language=VBScript %>
<%
Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "http://www.mywebsite.com/new-file.htm"
%>

ASP .NET Redirect

Edit the file /old-file.htm and add the following code:

<script language="C#" runat="server">
private void Page_Load(object sender, System.EventArgs e) {
Response.Status = "301 Moved Permanently";
Response.AddHeader("Location", "http://www.mywebsite.com/new-file.htm");
}
</script>

HTML Redirect

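Edit the file /old-file.htm and add the following code. Note that a meta refresh is carried out by the browser and does not send a 301 status code, so use it only when none of the server-side methods above are available:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>Your Page Title</title>
<meta http-equiv="refresh" content="0;url=http://www.mywebsite.com/new-file.htm">
</head>
<body>Optional page text here.
</body>
</html>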
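All of the server-side snippets in this article do the same two things: set the status to 301 and send a Location header. For comparison, here is the same pattern as a minimal WSGI application in Python. This is an illustrative sketch only; the original article does not cover Python, and the paths are the same placeholders used above.

```python
def application(environ, start_response):
    """Minimal WSGI app: 301 for the old file, 404 for everything else."""
    if environ.get("PATH_INFO") == "/old-file.htm":
        # The status line and the Location header are all a 301 needs.
        start_response("301 Moved Permanently",
                       [("Location", "http://www.mywebsite.com/new-file.htm")])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```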


Writing to Overpower Your Competition

Tuesday, March 15th, 2005

by Karon Thackston



Automating Database Updates Using ColdFusion

Monday, March 7th, 2005

By: Nathan Johnson

Recently, I was asked to help automate some time-consuming database update tasks for one of our real estate clients. They needed to ensure that their MLS listings were always up to date, but also couldn’t afford to take their site down while they were manually updating the database (imagine the data entry headache brought on by 1000 new MLS listings every day!). To further complicate matters, their hosting only supports ColdFusion, and can’t parse ASP or PHP pages. Thankfully, the concepts presented here are relatively straightforward and translate to PHP and ASP more easily than the other way around.

Simply put, I needed to integrate their site with a program that runs automatically and seamlessly on their ColdFusion server and updates their database once daily. Here’s how I did it!

First, I need to FTP download the updated database, which is hosted offsite. Here’s my FTP code:

	<!--- BEGIN DOWNLOADING DATA FILE --->

	Downloading MLS Listings...
	<cfset thread = CreateObject("java", "java.lang.Thread")>
	<cfflush>
	<cfset thread.sleep(100)>

		<!--- OPEN FTP CONNECTION TO MLS LISTING FTP SERVER --->
		<cfftp action = "open"
			username = "USERNAME"
			password = "PASSWORD"
			server = "TEST.DATABASE.COM"
			connection="myFtpConnection"
			stopOnError = "Yes"
			passive="yes">

		<!--- NAVIGATE TO THE DIRECTORY THAT CONTAINS THE MLS LISTING TEXT FILE --->
		<cfftp action="changedir"
			connection="myFtpConnection"
			directory="/SOURCEFOLDER"
			passive="yes">

		<!--- FTP DOWNLOAD THE MLS LISTING TEXT FILE (Listings.txt) --->
		<cfftp action="getfile"
			connection="myFtpConnection"
			remotefile="Listings.txt"
			localfile="C:\Inetpub\wwwroot\mysite.com\database_dir\Listings.txt"
			failifexists="no"
			passive="yes">

		<!--- CLOSE THE FTP CONNECTION --->
		<cfftp action = "close"
			connection = "myFtpConnection"
			passive="yes">

	DONE!<br>
	<br>
	Updating Database...
	<cfset thread = CreateObject("java", "java.lang.Thread")>
	<cfflush>
	<cfset thread.sleep(100)>

Note the use of the following code lines throughout the page:

	<cfset thread = CreateObject("java", "java.lang.Thread")>
	<cfflush>
	<cfset thread.sleep(100)>

Since this is a program that runs on the server, the server will not automatically print the status messages while the program is running. As a result, the page appears to be frozen while loading when in fact it is waiting for the download to complete. With the above lines inserted, cfflush sends the output generated so far to the browser and the server then pauses for 100 milliseconds: long enough for the status messages written on the lines above to appear, but not long enough to cause a noticeable delay in the update process.

Another point of interest from above is the failifexists="no" attribute of the cfftp tag. This is important to ensure that the local file is overwritten each day as this update is run.

Now that we’ve downloaded the updated database listings to a local file, I need to input the data into the database. This site uses a Microsoft Access .MDB database file, so I first need to clear out the existing rows of data in that file to prevent any old listings from showing up after they’ve been removed from the updated MLS database:

	<!--- SQL COMMAND TO CLEAR THE DATABASE DATA, WILL BE REWRITTEN --->
	<cfquery name="qryInsert" datasource="MLS">
		  DELETE * FROM listing_table
	</cfquery>

Also note that I had already set up the datasource name of “MLS” to refer to my database file located on the server, which only involved a quick call to the web hosting company.

Now, I need to parse the new text file into usable chunks of data and input that information into the database. To accomplish this, I used a CFLOOP tag that loops through each line of the text file once it has been read by the server. Also, the source text file is tab delimited and I don’t want to have to refer to tabs when parsing my data (just because I’m picky!), so I will replace all the tabs with vertical bar characters (“|”):

	<!--- OPEN THE LOCAL COPY OF THE LISTINGS TEXT FILE --->
	<cffile action="read"
		file="C:\Inetpub\wwwroot\mysite.com\database_dir\Listings.txt"
		variable="txtFile">

	<!--- SET COUNT AT 0 TO SKIP FIRST LINE OF DATA FROM LISTINGS TEXT FILE (COLUMN HEADINGS) --->
	<cfset CountVar = 0>

	<!---
	SET UP INDIVIDUAL LINES OF DATA TO INPUT TO THE SQL COMMAND:
		FIX SINGLE QUOTE PROBLEMS
		REMOVE ANY VERT LINES '|'
		REPLACE ALL TABS WITH VERT LINE DELIMITERS
	--->
	<cfloop
		index="record"
		list="#Replace(Replace(Replace(txtFile,'''','''''','all'),'|','','all'),chr(9),'| ','all')#"
		delimiters="#chr(13)##chr(10)#">

Note that by using the delimiters of “#chr(13)##chr(10)#”, the text file is read one line at a time. The ASCII characters carriage return (chr(13)) and line feed (chr(10)) are the standard way of ending lines in Windows .txt files. One of the tricky things I discovered was that the text source file that holds the updated MLS data has a first line for the column headings. Obviously there wasn’t a house listing for “$L,IST,PRI.CE” (haha), so I needed to skip the first line while reading the file’s information. To get around this, I set up an independent variable that is incremented on each pass through the loop. With a simple IF/THEN statement, I avoid inserting the line of invalid data into the database:

	<!--- SKIP DATA IF THE FIRST LINE (COLUMN HEADINGS) --->
	<cfif (CountVar gt 0)>

	<!--- SETUP THE VALUES FOR THE SQL COMMAND --->
	<cfif trim(listgetat(record,1,'|')) is ''>
		<cfset MLSNumber = ' '>
	<cfelse>
		<cfset MLSNumber = '#trim(listgetat(record,1,'|'))#'>
	</cfif>
	<cfif trim(listgetat(record,2,'|')) is ''>
		<cfset PropertyAddress = ' '>
	<cfelse>
		<cfset PropertyAddress = '#trim(listgetat(record,2,'|'))#'>
	</cfif>
	<cfif trim(listgetat(record,3,'|')) is ''>
		<cfset ListingPrice = '0'>
	<cfelse>
		<cfset ListingPrice = '#trim(listgetat(record,3,'|'))#'>
	</cfif>

Here, I assign the various chunks of data to variables, while validating them against blank entries. The ListGetAt function is a handy way to refer to individual elements of the current line’s list, and the IF/THEN statements simply check to make sure that the data being read isn’t blank. If it is, the variable is set to a space character, which takes care of ColdFusion’s built-in “skip the field if it’s blank” attitude – if you’re new to this, all you need to know is that ColdFusion’s list functions ignore a field if its data is blank. For instance, a list that contains “1,2,3,,5” is treated as the list “1,2,3,5”. Since the value in the 4th column is blank, the server automatically shifts the next value into that column! DOH! As a side note – IMHO, space characters are the easiest NULL values to deal with, since the word NULL can’t simply be trimmed out of a query response.
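To see why the blank fields matter, here is a small Python sketch that imitates the behaviour described above. It is an illustration only: the two functions are hypothetical stand-ins written for this example, not ColdFusion’s own implementation.

```python
def list_get_at_cf_style(text, position, delimiter=","):
    """Mimic the behaviour described above: empty elements are not counted."""
    items = [item for item in text.split(delimiter) if item != ""]
    return items[position - 1]  # positions are 1-based, as in ListGetAt

def list_get_at_padded(text, position, delimiter=","):
    """Pad each element with a space first so that blanks keep their slot."""
    padded = text.replace(delimiter, delimiter + " ")
    items = [item for item in padded.split(delimiter) if item != ""]
    return items[position - 1].strip()
```

Asking the first function for element 4 of “1,2,3,,5” gives “5”, because the blank was dropped. The second function pads every delimiter with a space, much as the tab-to-“| ” replacement does in the loop above, so element 4 comes back blank and “5” stays in position 5.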

Also note that required numerical fields are entered with a “0” instead of a space, as a blank value is invalid for a numerical database field in Access. During development, it is also handy to output the information that is being built on the fly, since it’s easier to read through an output than to try and troubleshoot database error messages. Here’s one of the print lines I used during development:

	<p><b>Line Number <cfoutput>#CountVar#</cfoutput>:</b>
		<cfoutput>INSERT INTO listing_table (MLSNumber, PropertyAddress, ListingPrice)
		VALUES ('#MLSNumber#', '#PropertyAddress#', '#ListingPrice#')</cfoutput></p>

This simply prints out the variables along with a handy line number (very useful when trying to figure out whether a problem in validation is due to the source information or the page’s coding!). Now, I simply plug the info into my database and end my IF/THEN statement that ignores the column headings:

	<!--- SQL COMMAND TO INPUT THE LISTINGS DATA FROM THE TEXT FILE TO THE ACCESS DATABASE --->
	<cfquery name="qryInsert" datasource="MLS">
		  INSERT INTO listing_table (MLSNumber, PropertyAddress, ListingPrice)
		  VALUES ('#MLSNumber#', '#PropertyAddress#', '#ListingPrice#')
	</cfquery>

	<!--- END OF THE IF/THEN THAT SKIPS THE FIRST LINE (COLUMN HEADINGS) --->
	</cfif>

Finally, I increment the counter on every pass, so that every line after the column headings is entered in the DB, and close the loop:

	<!--- INCREMENT THE COUNTER TO SKIP THE FIRST LINE (COLUMN HEADINGS) --->
	<cfset CountVar = CountVar + 1>

	</cfloop>

	DONE!<br>
	<cfset thread = CreateObject("java", "java.lang.Thread")>
	<cfflush>
	<cfset thread.sleep(100)>
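Since the concepts translate to other languages, here is a rough Python sketch of the same parse-and-insert loop. It rests on several assumptions that are not in the original: SQLite stands in for the Access database, the function name `reload_listings` is made up for the example, and the delete and the inserts are wrapped in one transaction, which the ColdFusion version above does not do.

```python
import csv
import io
import sqlite3

def reload_listings(conn, text):
    """Replace every row in listing_table with the rows parsed from text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the column-heading line
    rows = []
    for fields in reader:
        if not fields:
            continue  # ignore blank lines
        # Trim each field and pad short lines so three values always exist.
        fields = [f.strip() for f in fields] + ["", "", ""]
        mls_number, address, price = fields[:3]
        # Same defaults as above: a space for blank text, "0" for a blank price.
        rows.append((mls_number or " ", address or " ", price or "0"))
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM listing_table")
        conn.executemany(
            "INSERT INTO listing_table (MLSNumber, PropertyAddress, ListingPrice) "
            "VALUES (?, ?, ?)", rows)
    return len(rows)
```

The `?` placeholders let the database driver deal with single quotes in the data, so no manual quote doubling is needed, and the single transaction means visitors should not see an empty table while the update is part way through.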

That’s the download and update program. Now there’s the problem of how to make the server run the script. You can either set up something server side (neither fun nor easy on Windows IIS), or you can simply make sure that the first person to navigate the site each morning causes the script to run. For more details see the ‘Emulate Crontab Using ColdFusion’ article.




Emulate Crontab Using ColdFusion

Monday, March 7th, 2005

By: Nathan Johnson

Setting up automated scripts on Windows can be difficult. The built-in scheduler is hard to set up, especially if you don’t have console access. An easier way is to set up the site so that the first person to navigate to the site each morning causes the script to run. This is actually really simple to do, and can be completely seamless for the end user. Simply use the script page as the source of a little 1px by 1px image hidden somewhere at the bottom of your site’s footer:

	<img src="http://www.mysite.com/dbupdate.cfm" width="1" height="1" border="0">

Even though the SRC of this “image” is not a real image file, the browser doesn’t know that and requests it anyway, so the server still runs the page to accommodate the request. This doesn’t require your user to navigate through the page or require you to dump large blocks of code into each of your site’s pages. In essence, this works almost like an include file, but doesn’t require the server to parse and run the template prior to loading the page. It’s also advantageous in that the script won’t die or stop if the user navigates away from the page before it’s completed.

There are a couple of problems that now present themselves. First, how do we ensure that the first person to navigate the site actually triggers the script? By simply putting this image tag at the bottom of each page in the entire site (whether through an include file or by hard coding it), any page on the site will trigger the script.

This could cause another problem though, as we now can’t prevent the user from navigating to another page in the site and causing the script to trigger multiple times. It is important that we build the logic to handle these requests quickly (to save server resources), and to make sure that multiple downloads don’t occur. Two quick IF/THEN statements will help us.

First, let’s discuss the multiple download issue. By creating a file on the server that acts as a flag that a download is already in progress, we can avoid multiple downloads. I’ll show you how to create the file later in the script, but for now here’s how I am checking for its existence:

	<!--- PATH INFO FOR LOCK FILE --->
	<CFSET MLSLockFile = "C:\Inetpub\wwwroot\mysite.com\database_dir\crontab.lck">

	<!--- CHECK FOR EXISTING LOCK FILE --->
	<CFIF FileExists(MLSLockFile)>

	<!--- IF LOCK FILE EXISTS, REPORT THAT THE DOWNLOAD IS IN PROGRESS --->
	<h3>UPDATE IS ALREADY IN PROGRESS!</h3>

This works in the same way as the check-out feature in Macromedia’s Dreamweaver MX, in that if this file is present it triggers an alert. In this case, the alert is caught by the IF/THEN statement and the rest of the script is skipped. Since this only takes a millisecond and doesn’t hog any server resources (probably less load on your server than even serving a 1×1 .gif image), we can safely call this script from any page of our site – even with decent web traffic – and the download won’t be triggered more than once.

Next, we need to make sure that we only do the download when it’s required. For this example, I am restricting downloads to once per day. This is done by obtaining the modified date of the text file we previously downloaded by FTP. It’s not much more complicated to use the modified time as well, if you need to do multiple updates in a single day:

	<!--- LOCK FILE DOES NOT EXIST --->
	<cfelse>

	<!--- GET TIMESTAMP FROM LAST MLS UPDATE (CODE) --->
	<cfscript>
	function FileDateLastModified(path)
	{
	  Var fso  = CreateObject("COM", "Scripting.FileSystemObject");
	  Var theFile = fso.GetFile(path);
	  Return theFile.DateLastModified;
	}
	</cfscript>

	<!--- LAST MLS UPDATE (TRIGGER & FILE REFERENCE) --->
	<CFSET TheFile = "C:\Inetpub\wwwroot\mysite.com\database_dir\Listings.txt">

	<!--- LAST MLS UPDATE (LOGICAL OPERATOR) --->
	<cfif DateCompare(FileDateLastModified(TheFile), Now(), "d") lt 0>

The “TheFile” variable sets up the reference to the text file, and a simple IF/THEN statement uses DateCompare, at day precision, to check whether the modified date of the existing file is earlier than the current date (a result less than 0). If this is the case (indicating that an update is needed), then the lock file will be generated and the download/update script will be run:

	<!--- CREATE LOCK FILE --->
	<cffile action="write"
		   file="C:\Inetpub\wwwroot\mysite.com\database_dir\crontab.lck"
		   output="101010">

Note that the output attribute is needed to write the file, but its value isn’t used for anything. This is the point where all the FTP and database updating code from the ‘Automating Database Updates Using ColdFusion’ article goes. Once the script is complete, we will destroy the temporary lock file:

	<!--- REMOVE LOCK FILE --->
	<cffile action="delete"	file="C:\Inetpub\wwwroot\mysite.com\database_dir\crontab.lck">

And now we’ve got a complete picture. To sum up, the script will now do the following tasks for us:

  1. Check for a lock file (preventing multiple downloads).
  2. If there is no lock file, check the timestamp to see whether a download and update is needed (keep updates appropriately periodic).
  3. If an update is needed, create a lock file to prevent multiple downloads and FTP the information from a remote location to our local server.
  4. Update the database with the new information.
  5. Remove the lock file (when complete) to allow the update to run again.
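The five steps above can be sketched end to end in Python. This is an illustration under stated assumptions, not a translation of the ColdFusion script: the function names are made up, `update` stands for the download and database code and is expected to rewrite the data file, and the lock is created with an exclusive-create flag, a technique swapped in here so that two requests arriving at the same moment cannot both pass the check.

```python
import datetime
import os

def needs_update(data_file, today=None):
    """True when data_file is missing or was last modified before today."""
    today = today or datetime.date.today()
    if not os.path.exists(data_file):
        return True
    modified = datetime.date.fromtimestamp(os.path.getmtime(data_file))
    return modified < today

def run_once_daily(lock_file, data_file, update):
    """Run update() at most once per day; return a short status string."""
    if os.path.exists(lock_file):
        return "already in progress"
    if not needs_update(data_file):
        return "up to date"
    try:
        # O_EXCL makes creation fail if the lock already exists, so the
        # final check and the creation happen in a single step.
        fd = os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return "already in progress"  # another request got there first
    try:
        os.close(fd)
        update()
        return "updated"
    finally:
        os.remove(lock_file)  # always release the lock, even if update() fails
```

Removing the lock in a `finally` block is a deliberate choice: if the download fails half way, a lock file that is left behind would block every later update until someone deletes it by hand.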


Five Sections of Your Copy Guaranteed To Get Read

Tuesday, March 1st, 2005

by Karon Thackston