aibob
Joined: Dec 14, 2001
# Posts: 30

Posted: 2001-Dec-21 01:23

How can I create a search engine using the dmoz data?



chrisuk
Joined: Mar 16, 2001
# Posts: 315

Posted: 2001-Dec-29 22:55

Easy: Hyperseek and a dmoz extractor such as the one available at pluginlibrary.com.

Well, actually it's not easy, it's damn hard work. The really easy way is to get the scripts over at anaconda.net.

The downside with this is that you don't control the data, which kind of makes you like all the other search engine wannabes out there.

If search is going to be your main product, then you want a system with real functionality and tools to back it up: hseek etc., or links. If it is to complement your existing content, then you are better off importing the data with one of the free or inexpensive scripts that do it; it's cheap and looks good that way.

What are you trying to do it for?



jmcc
Joined: Jan 15, 2002
# Posts: 1

Posted: 2002-Jan-15 15:23

It is actually easy enough to strip the URLs from the DMOZ data using Perl or Tcl. Since the data is in sections, it would be possible to strip complete sections and then extract the URLs. However, the key factor for any search engine is that the data is kept current. Google and the bigger engines index continually. The first thing that you would have to do with your dataset from DMOZ would be to verify that the domains still exist. The problem with running a good search engine is that the bandwidth and running costs will be high.

Regards...jmc
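
For anyone wanting to try the approach jmcc describes, here is a rough sketch in Python rather than Perl or Tcl. It assumes the dump marks each section with a line like `<Topic r:id="Top/Arts">` and each listing with a line like `<ExternalPage about="http://...">`, which is how the DMOZ content dump was laid out as far as I know; check your copy before relying on it. The function names `extract_urls` and `live_hosts` are made up for this example, and the "does the domain still exist" check is only a DNS lookup, which is a weaker test than actually fetching the page.

```python
import html
import re
import socket
from urllib.parse import urlparse

# Assumed line shapes (verify against your copy of the dump):
#   <Topic r:id="Top/Arts/Music">
#   <ExternalPage about="http://example.com/">
TOPIC_RE = re.compile(r'<Topic r:id="([^"]*)"')
PAGE_RE = re.compile(r'<ExternalPage about="([^"]*)"')


def extract_urls(lines, section="Top"):
    """Yield (topic, url) for every listing in `section` or below it."""
    topic = None
    for line in lines:
        m = TOPIC_RE.search(line)
        if m:
            # A new section starts; listings that follow belong to it.
            topic = m.group(1)
            continue
        m = PAGE_RE.search(line)
        if m and topic is not None:
            if topic == section or topic.startswith(section + "/"):
                # Attribute values are XML-escaped (e.g. &amp;), so unescape.
                yield topic, html.unescape(m.group(1))


def live_hosts(urls, resolve=socket.gethostbyname):
    """Return the set of hostnames from `urls` that still resolve in DNS."""
    alive = set()
    seen = set()
    for url in urls:
        host = urlparse(url).hostname
        if not host or host in seen:
            continue
        seen.add(host)
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
        alive.add(host)
    return alive
```

To use it, open the dump as text (for example `open(path, encoding="utf-8", errors="replace")`) and pass the file object to `extract_urls` along with the section you want, then hand the URLs to `live_hosts`. The `resolve` parameter is there so the lookup can be swapped out or throttled; on a full dump, one DNS query per host will take a long time, so expect to batch or parallelise it.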

