    Moderator(s): flyingrose, Curious_Mark

    objectsdirectory
    Joined: Feb 17, 2004
    # Posts: 9


    Posted: 2004-Apr-16 21:27

    Our webmaster asked for permission from Mike Levin before posting information about our search engine.

    Objects Search, which is based on the open source search engine Nutch (nutch.org), has launched a beta version of its Clustering Engine.

    The Clustering Engine is a system for clustering textual data. It automatically categorizes search results on the fly into hierarchical clusters.
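For readers unfamiliar with the idea, here is a minimal sketch of grouping search results by the terms they share. It is an illustration only, not the algorithm Objects Search uses: the function names, the stopword list, and the `min_size` parameter are all assumptions, and it produces a single flat level of clusters rather than a hierarchy.

```python
from collections import defaultdict
import re

# Small illustrative stopword list (an assumption, not a standard set).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "is", "on", "with"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword terms."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def cluster_results(query, snippets, min_size=2):
    """Group snippet indices under each term they share, ignoring query terms."""
    query_terms = tokenize(query)
    by_term = defaultdict(list)
    for idx, snippet in enumerate(snippets):
        for term in tokenize(snippet) - query_terms:
            by_term[term].append(idx)
    # Keep only labels shared by at least min_size results, largest cluster first.
    clusters = {t: ids for t, ids in by_term.items() if len(ids) >= min_size}
    return dict(sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0])))


snippets = [
    "Jaguar car dealers and prices",
    "Jaguar the big cat of South America",
    "Used Jaguar car parts",
    "Jaguar cat habitat",
]
print(cluster_results("jaguar", snippets))
```

A result can land in more than one cluster here. A hierarchical version would repeat the same grouping inside each cluster.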

    Search results clustering attempts to overcome the problem of information overload. For more information, please visit http://www.ObjectsSearch.com

    Note: Our servers are currently slow. We will add more, faster servers soon.

    Thanks.





    unreviewed
    Joined: Dec 07, 2000
    # Posts: 6776


    Posted: 2004-Apr-20 01:56

    Nice job.

    I actually suggested to our board owner last year that we start a community project using Nutch.

    I wish you success.

    Is there any info you can share on your experience to date?

    How large is your database, what are you currently using for hardware, that sort of thing ...



    objectsdirectory
    Joined: Feb 17, 2004
    # Posts: 9


    Posted: 2004-Apr-20 04:20

    Glad you like it. As you know, Nutch is in the beta stage, but it is getting better. The developer team at Nutch is doing a great job.

    In my experience, a search engine requires a lot of hard work and money.

    Some of the problems are listed below.

    Note: This applies to a big database. If you have a small database just for personal research or use, you may not need this many resources.

    1) Taking care of the servers and distributing the database across many machines.

    2) Keeping our search engine current with the latest version of Nutch and integrating it with our own search technology.

    3) Handling things as they go wrong, for example a crawler crash or problems updating the database.

    4) Providing lots of RAM and disk.

    5) Continually adding new services to compete with other top search engines.
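As a rough illustration of point 1, one common way to spread a page database over several machines is to hash each URL to a shard number. This is a sketch under that assumption. It is not a description of how Nutch or Objects Search partitions data, and `shard_for` and `partition` are hypothetical helper names.

```python
import hashlib


def shard_for(url, num_shards):
    """Map a URL to a shard number using a stable hash of the URL."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def partition(urls, num_shards):
    """Split a list of URLs into num_shards lists, one per machine."""
    shards = [[] for _ in range(num_shards)]
    for url in urls:
        shards[shard_for(url, num_shards)].append(url)
    return shards


urls = [f"http://example.com/page{i}" for i in range(10)]
for number, shard in enumerate(partition(urls, 3)):
    print(number, len(shard))
```

The hash is stable across runs, so the same URL always lands on the same shard. The drawback is that changing the number of shards moves most URLs to a different machine.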

    We are working on our hardware. You may have noticed that the speed is not as fast as other top search engines. We are adding more servers soon.

    Currently we use some Dell servers and a few others. They include hard drives of sizes such as 250 GB.

    Thanks.














    unreviewed
    Joined: Dec 07, 2000
    # Posts: 6776


    Posted: 2004-Apr-20 07:10

    I've attempted to build engines before, and in all cases it comes down to the amount of memory. Indexing can be broken up into disk-access tasks, but the search cannot. That must be in memory; it's the only way. Google, like you, is based on Linux, but has morphed it into a new operating system with a proprietary file system that uses a minimum 64 megabyte file cluster. Not the type of system for using Notepad. ;)
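To make the "search in memory" point concrete, here is a minimal sketch of an in-memory inverted index, where each term maps to the set of documents containing it and a multi-word query is answered by intersecting those sets. The class and method names are assumptions for illustration; this is not how Nutch or Google implements search.

```python
from collections import defaultdict


class InMemoryIndex:
    """Toy inverted index held entirely in RAM: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index a document by splitting its text on whitespace."""
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term (implied AND)."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = InMemoryIndex()
index.add(1, "open source search engine")
index.add(2, "search engine hardware")
print(index.search("search engine"))
print(index.search("open engine"))
```

Every lookup is a dictionary access plus set intersections, which is cheap in RAM but becomes many random reads if the postings live on disk.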

    The point is, the whole I/O system is designed and optimised for accessing and creating large data files, files that are eventually loaded into memory. For redundancy they don't use RAID. Their system basically keeps 3 copies of everything, and if something fails in one area, the system just redirects to the next location.
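The "3 copies, redirect on failure" idea can be sketched as a read that tries each replica in turn. This is only an illustration of the general pattern. The replica callables, `ReplicaReadError`, and `read_with_failover` are hypothetical names, not part of any real file system API.

```python
class ReplicaReadError(Exception):
    """Raised when no replica could serve the requested chunk."""


def read_with_failover(chunk_id, replicas):
    """Try each replica in order and return the first successful read.

    Each replica is a callable taking a chunk id and returning bytes,
    or raising OSError when that copy is unavailable.
    """
    errors = []
    for replica in replicas:
        try:
            return replica(chunk_id)
        except OSError as exc:
            errors.append(exc)  # remember the failure, move to the next copy
    raise ReplicaReadError(f"all {len(replicas)} replicas failed for {chunk_id!r}")


def dead_server(chunk_id):
    raise OSError("server down")


def live_server(chunk_id):
    return b"data:" + chunk_id.encode("utf-8")


print(read_with_failover("chunk-7", [dead_server, live_server, live_server]))
```

The caller never needs to know which copy answered, which is what lets a failed machine be skipped without rebuilding a RAID array first.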

    I'm interested in how your version is handling its database size compared to page count, and the actual RAM load vs. disk paging at this point.



    objectsdirectory
    Joined: Feb 17, 2004
    # Posts: 9


    Posted: 2004-May-10 11:04

    Sorry for the late response, I've been busy. Here is a link that may help. It relates to Nutch, which we are using, and also shows the hardware requirements for searching.

    http://www.objectssearch.com/en/hardware.html

    [ Message was edited by: objectsdirectory 05/10/2004 03:31 am ]





    byronm
    Joined: Mar 20, 2004
    # Posts: 26


    Posted: 2004-May-10 15:03

    Mozdex.com is a similar project utilizing Nutch.

    Mozdex uses 4 servers as query servers and 1 Xeon as the core database server. The current system is a beta index, so SERPs will change, but you can give feedback on performance, results and such.

    We will be integrating a spell checker and some other services this afternoon, as well as launching a forge project and announcing some partnerships shortly. :)



    objectsdirectory
    Joined: Feb 17, 2004
    # Posts: 9


    Posted: 2004-May-10 19:57

    What does Objects Search offer?
    + General web database
    A search results page contains:
    ++ Link to a cached copy of the page
    ++ A link with technical data explaining how the page was scored for relevancy
    ++ A link that shows a list of incoming anchors indexed for the page
    + Other info: phrase searching with quotes, implied "and", removing terms with the - (minus) sign
    + Spell Check
    + General web database with clustering
    + News and weblog search
    + Directory (also uses data from ODP)
    + Image Search
    + Interface available in ten languages



    © 1995  ·  iWeb, Inc  ·  DBA JimWorld Productions