I have found a way to create a Google Sitemap yourself, using only free software that you control. There are 3 essential parts to this: a site crawler, the sitemap builder and, of course, a place to submit your sitemap to.
Step 1: Build the Sitemap
One of the easiest ways I have found to do this is to download a free program called Xenu's Link Sleuth. While it is mainly a link checker and can be used for a lot of other things, I have found it works exceptionally well for helping you build an XML sitemap.
It does not automatically output an XML-ready file for you, but it can greatly help automate the URL collection process. So, here is the procedure:
First, download and install the application. Once it is installed, run it, choose to check a new URL (Control+N) and enter your URL in this format: http://www.yoursite.com/. Do not forget that trailing “/” or you may have problems. Be sure that it does not spider external links. Also, enter any other exclusions or additions you want to make. Once you are satisfied, press the “OK” button.
Depending on the size of your site this could take a few minutes or a few hours. Keep in mind that your Internet connection speed will affect the speed of the program.
Once Xenu has finished it will present you with a list of URLs. You will notice that it returns every URL on your site including images, stylesheets, email addresses and so on.
Obviously we cannot submit these types of URLs, so we need to remove them. To do that, you must export your Xenu list to something you can work on. What I usually do at this point is save the Xenu file (it will save with a .xen file extension) as well as export it. I save it in case something goes wrong with the file I am editing. This way I can quickly go back into Xenu and export the list again.
Once you have saved it somewhere, export the list to a TAB-separated file. This saves the URL list into a text file which you can then import with your favorite spreadsheet program.
When it is saved, simply open the file in your spreadsheet and prepare to start editing. You will need to remove all columns except the first two. All you need are the URLs and file types. Here is why:
As you know all you want are the page names. You do not need images, email addresses or anything else. What I have found helps is to sort by the file type column. This can help group URLs like images, emails etc., to make it easier to delete a bunch of rows at one time.
Continue the “purge” process until all you have left are valid page URLs. Then delete any remaining unnecessary columns (such as file type) and, if you have it, the column heading for URL. When you have the list down to just page URLs, save the file in CSV format under the name urllist.txt. With only one column left, this gives you a plain text file with one URL per line.
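If you would rather not do the purge by hand, it can be scripted. Here is a minimal sketch in Python, not a finished tool. It assumes the export has a header row with columns named "Address" and "Type" and that pages are reported with a type starting with "text/html"; those names are my assumptions, so check them against your own export and adjust. The file names in the helper are hypothetical too.

```python
import csv
import io

# Assumed column names; check the header row of your own export.
URL_COLUMN = "Address"
TYPE_COLUMN = "Type"


def extract_page_urls(tsv_text, url_column=URL_COLUMN, type_column=TYPE_COLUMN):
    """Return the unique HTML page URLs from a TAB-separated crawl export."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    pages = []
    seen = set()
    for row in reader:
        url = (row.get(url_column) or "").strip()
        file_type = (row.get(type_column) or "").strip().lower()
        # Skip anything that is not a web address (for example mailto: links).
        if not url.startswith(("http://", "https://")):
            continue
        # Skip images, stylesheets and other non-page file types.
        if not file_type.startswith("text/html"):
            continue
        if url not in seen:
            seen.add(url)
            pages.append(url)
    return pages


def write_urllist(export_path, urllist_path, encoding="utf-8"):
    """Read an export file and write one page URL per line.

    The encoding is an assumption; change it if your export uses another one.
    """
    with open(export_path, newline="", encoding=encoding) as source:
        urls = extract_page_urls(source.read())
    with open(urllist_path, "w", encoding=encoding) as target:
        target.write("\n".join(urls) + "\n")


# Small in-memory demonstration with made-up rows.
sample = (
    "Address\tType\n"
    "http://www.yoursite.com/\ttext/html\n"
    "http://www.yoursite.com/logo.gif\timage/gif\n"
    "mailto:me@yoursite.com\t\n"
    "http://www.yoursite.com/about.html\ttext/html\n"
)
print(extract_page_urls(sample))
```

To use it on a real export you would call something like write_urllist("xenu_export.txt", "urllist.txt"). I would still skim the result by eye, since the script only knows what the type column tells it.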
Step 2: Get, Install and Use Google’s Sitemap Builder
What I have found works really well for building the sitemap is Google’s own open source Python script. All that is needed is Python, which is freely available for Linux, Windows and Mac. Once you have downloaded and installed Python, get and install the Google Sitemap Builder.
Using the program is actually quite easy. You do not need to worry about all the extra files. All you need to concern yourself with are 2 files: sitemap_gen.py and config.xml.
sitemap_gen.py is the script that actually generates the sitemap, while config.xml holds the settings the Python script needs to run. So here is what you do:
Open config.xml in an editor like Notepad and scroll down to the bottom of it. There are 3 settings you need to configure:
- base_url: this is the root URL of your site, e.g. “http://www.yoursite.com/”. Be sure to include the quotes.
- store_into: this is where the script will place your new sitemap. What I generally do is store it locally somewhere that is easy to find and remember (such as the location of the Google Sitemap Python files) and then FTP it into my site.
- urllist path: this setting is just a little further down from the 2 settings above. It is the location where the script looks for the urllist.txt file you just generated and cleaned up.
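Before running the generator, it can help to confirm that all 3 settings are actually filled in. The sketch below is only an illustration: the sample config is my assumption of how the settings are laid out (base_url and store_into as attributes of a site element, with a urllist element inside it), so compare it with the config.xml that came with your copy of the script. The helper names are mine, not part of Google's script.

```python
import xml.etree.ElementTree as ET

# Assumed layout, modelled on the 3 settings described above.
SAMPLE_CONFIG = """
<site
  base_url="http://www.yoursite.com/"
  store_into="sitemap.xml"
  verbose="1">
  <urllist path="urllist.txt" encoding="UTF-8" />
</site>
"""


def read_settings(config_text):
    """Pull the 3 settings out of the config text."""
    site = ET.fromstring(config_text.strip())
    urllist = site.find("urllist")
    return {
        "base_url": site.get("base_url"),
        "store_into": site.get("store_into"),
        "urllist_path": urllist.get("path") if urllist is not None else None,
    }


def missing_settings(settings):
    """Return the names of settings that are absent or empty."""
    return [name for name, value in settings.items() if not value]


settings = read_settings(SAMPLE_CONFIG)
for name, value in settings.items():
    print(name, "=", value)
print("missing:", missing_settings(settings))
```

If the list of missing settings is not empty, go back to config.xml and fill in the ones it names before moving on.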
Once you have edited these 3 settings and saved the new config.xml file it is time to run the generator. Open a terminal window and go to the location of sitemap_gen.py. Once there type in:
python sitemap_gen.py --config=config.xml
This process should only take a few seconds (depending on the number of URLs in your urllist.txt and the speed of your computer) and when complete you will notice a brand new sitemap.xml file sitting in the location you specified in the config.xml file. Now you can upload the new file into the base directory of your website.
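Before uploading, I like to make sure the file really contains the pages I expect. The following is a rough sketch of such a check, assuming the generated file is an uncompressed XML sitemap in which each page address sits in a loc element. The function name and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET


def list_sitemap_urls(sitemap_text):
    """Return every URL found in a loc element of the sitemap text."""
    root = ET.fromstring(sitemap_text.strip())
    urls = []
    for element in root.iter():
        # Tags look like "{namespace}loc"; compare only the local name so the
        # check does not depend on which sitemap schema version was written.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


# Made-up sample standing in for the generated file.
sample_sitemap = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.yoursite.com/</loc></url>
  <url><loc>http://www.yoursite.com/about.html</loc></url>
</urlset>
"""

found = list_sitemap_urls(sample_sitemap)
print(len(found), "URLs in sitemap")
for url in found:
    print(url)
```

For the real file, read it from disk first (for example with open("sitemap.xml", encoding="utf-8")) and pass its contents to the function. The count should match the number of lines in your urllist.txt.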
Step 3: Submit to Google
Now all that is left to be done is to submit it to Google. If you have a Google account already, all you need to do is browse to Google Webmaster Central and select “Webmaster Tools (Including Sitemaps).” Here you can add your site’s URL, and follow the steps for adding the sitemap file.


