PixWeb2000 for Windows
Web-Site Spidering Software
PixWeb2000 is no longer for sale or supported. However, existing versions will accept a MagicKey.
Click here to go to the PixWeb2000 online manual
Download v1.0.1 full installer
Click here to download.
Download version 1.0.2 updater
This version fixes a bug that sometimes occurred when selecting "Preferences" from the "File" menu.
Instructions:
- Go into your PixWeb2000 folder and make a copy of your PixWeb2000.exe file and put it somewhere safe.
- Click here to download the new PixWeb2000.exe file
- Unzip the file.
- Put the new PixWeb2000.exe file into your PixWeb2000 folder.
- Run PixWeb2000.
- Open the "Help" menu and choose "About PixWeb2000".
- Make sure that it is version 1.0.2.
- If for some reason this new version doesn't work for you, go back to using the PixWeb2000.exe file that you saved in the first step.
General Scanning Guidelines
It is important to remember that PixWeb, unlike a human being, is unable to make decisions as to the relative importance of any specific link it finds. PixWeb will treat all links as equals unless you use rules and filters. PixWeb will only scan the domain you specify when setting up your PixWeb scanner. If you want PixWeb to extend its activities to domains other than the one you specified, you will need to increase the Domain Hopping value.
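The effect of the Domain Hopping value can be pictured with a small sketch. This is not PixWeb's actual code. It assumes that the value is the number of times a scan may cross from one host to another, and that a value of 0 keeps the scan on the starting site. The function name and its parameters are hypothetical, and comparing full host names is a simplification of "domain".

```python
from urllib.parse import urlsplit


def hops_after_following(link, current_host, hops_used, max_hops):
    """Return the hop count after following `link`, or None if it is off limits.

    Assumption: Domain Hopping is modelled as a budget of host changes.
    """
    # Relative links have no host name, so they stay on the current host.
    host = urlsplit(link).hostname or current_host
    if host.lower() == current_host.lower():
        return hops_used  # same host: no hop is spent
    if hops_used + 1 <= max_hops:
        return hops_used + 1  # different host: spend one hop
    return None  # hop budget exhausted: do not follow the link


# With a budget of 0 the scan stays on the starting site.
print(hops_after_following("http://other.com/a", "www.sample.com", 0, 0))
# With a budget of 1 the same link may be followed once.
print(hops_after_following("http://other.com/a", "www.sample.com", 0, 1))
```

Under this assumption, raising the value by one lets the scan follow links one site further away from the site you specified.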
The Alternate MatchPath Option
PixWeb2000 provides an option that changes the way web pages are analyzed and how the list of links is produced. This option is called the Alternate MatchPath option. If scanning a web site does not find any links, you should try turning this option on. This is generally the quickest way to get things going again.
If the Alternate MatchPath option does not fix the problem, be sure to turn it off again before trying other options, such as increasing the Domain Hopping value. You may also try using the Alternate MatchPath option in conjunction with other options.
The Alternate MatchPath option looks at a link to see if it ends with an html, shtml, htm, or other extension that signals a web page and not a file. If the link ends with one of these extensions, PixWeb2000 will treat it as a link to a web page instead of a link to a file.
Example: If the Alternate MatchPath option is turned off, the following link may be treated as a link to a file. Instead of scanning the page for links, PixWeb2000 will try to download the file index.html: http://www.sample.com/home/index.html
If the Alternate MatchPath option is turned on, the following link will have its file name removed.
http://www.sample.com/home/index.html
becomes
http://www.sample.com/home/
PixWeb2000 will then treat the link as a link to a directory named home and will scan the files in this directory for links. This means that instead of trying to download index.html, PixWeb2000 will scan the index.html file for links.
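The rewrite shown above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not PixWeb2000's own code. The function name is hypothetical, the extension list covers only the three extensions named above, and dropping any query string along with the file name is an assumption.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Extensions assumed to signal a web page rather than a file to download.
PAGE_EXTENSIONS = {".html", ".shtml", ".htm"}


def alternate_matchpath(url):
    """Strip a trailing page file name so the link points at its directory."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    extension = posixpath.splitext(filename)[1].lower()
    if extension not in PAGE_EXTENSIONS:
        return url  # not a page link: leave it unchanged
    if not directory.endswith("/"):
        directory += "/"
    # Assumption: query string and fragment are dropped with the file name.
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))


print(alternate_matchpath("http://www.sample.com/home/index.html"))
# prints http://www.sample.com/home/
```

A link such as http://www.sample.com/home/photo.jpg is returned unchanged by this sketch, because .jpg is not in the list of page extensions.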
Unless the web site you are scanning exhibits this problem, the Alternate MatchPath option should not be used. The Alternate MatchPath option only addresses this specific problem.
To turn on or turn off the Alternate MatchPath option do this:
- launch PixWeb2000
- select Web Site Manager from the Tools menu
- select a web site
- click the Edit button
- click the Optional Info tab
- click the Other tab
- click the Alternate MatchPath check box
The Link Cache
If you try to scan a web site and do not get the result you desire, you should always be sure to clear the link cache before attempting to re-scan the site. This will remove any links that were found on the previous attempt. These leftover links may cause problems when you change settings and attempt to re-scan the web site. Instructions on how to clear the link cache are below:
To clear the link cache for an individual web site do this:
- launch PixWeb2000
- select Web Site Manager from the Tools menu
- select a web site
- click the Edit button
- click the Optional Info tab
- click the Other tab
- click the Clear Cache button
- click the OK button
To clear the link cache for all web sites do this:
- launch PixWeb2000
- select Preferences from the File menu
- click the Database tab
- click the Clear Cache button
- click the Close button
General Help for Protected Sites
There are times when you will want to scan a protected web site. PixWeb2000 can do this with certain limitations. Those limitations are:
The authentication must take place in only one domain. This means that after PixWeb2000 submits your username and password, the web site must not send your information to a different domain. For example, if you want to scan the protected web site www.pictures.com, your username and password must be authenticated in the pictures.com domain. They cannot, for example, be passed to a domain named picturecheck.com.
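If you know the address that a site's login form submits to, the one-domain rule can be checked with a short sketch like the one below. It is a rough illustration, not part of PixWeb2000. The function names and the example URLs are hypothetical, and taking the last two labels of the host name as the domain is a simplification that fails for suffixes such as .co.uk.

```python
from urllib.parse import urlsplit


def base_domain(url):
    """Return the last two labels of the host name, e.g. 'pictures.com'."""
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def same_domain(site_url, auth_url):
    """True if the login target is in the same domain as the protected site."""
    return base_domain(site_url) == base_domain(auth_url)


# Authentication stays in pictures.com: within the limitation.
print(same_domain("http://www.pictures.com/members/",
                  "http://www.pictures.com/login"))
# Authentication is handed to picturecheck.com: outside the limitation.
print(same_domain("http://www.pictures.com/members/",
                  "http://www.picturecheck.com/verify"))
```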
If the web site uses an Adult Verification System (AVS) such as AdultCheck, you can bypass the authentication process completely by following the instructions in the PixWeb Help system.
To find these instructions, do this:
- launch PixWeb2000
- select Open Help from the Help menu
- select Protected Sites from the Subject list
- select How To Access AdultCheck Sites from the drop-down menu
This technique will work for any web site that uses AdultCheck or similar authentication services that require you to enter your password into a text field on the members page and to then click the Submit button.