| Member |
Message |
objectsdirectory
Joined: Feb 17, 2004
# Posts: 9
|
Posted: 2004-Apr-16 21:27
Our webmaster asked for permission from Mike Levin before posting information about our search engine.
Objects Search, which is based on the open source search engine Nutch (nutch.org), has launched a beta version of its Clustering Engine.
The Clustering Engine is a system for clustering textual data. It automatically categorizes search results on the fly into hierarchical clusters.
Search results clustering attempts to overcome the problem of information overload. For more information, please visit http://www.ObjectsSearch.com
Note: Our servers are currently slow. We will add more and faster servers soon.
Thanks.
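For readers curious what on-the-fly clustering of search results can look like, here is a minimal Python sketch. It is not the Objects Search or Nutch implementation, which the post does not describe. The function names, the stopword list and the "label each cluster by a term shared between snippets" heuristic are all assumptions for illustration, and it builds a single flat level of clusters rather than a hierarchy.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokens(text):
    """Lowercase the text and return its alphabetic words minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cluster_results(snippets, query, max_clusters=3):
    """Group result snippets under labels taken from terms they share."""
    query_terms = set(tokens(query))

    # Count in how many snippets each non-query term appears.
    doc_freq = Counter()
    for snippet in snippets:
        doc_freq.update(set(tokens(snippet)) - query_terms)

    # Candidate labels: terms shared by at least two snippets,
    # most frequent first, ties broken alphabetically.
    shared = [(term, n) for term, n in doc_freq.items() if n >= 2]
    shared.sort(key=lambda item: (-item[1], item[0]))
    labels = [term for term, _ in shared[:max_clusters]]

    # Assign each snippet to the first label it contains, else "other".
    clusters = defaultdict(list)
    for snippet in snippets:
        words = set(tokens(snippet))
        label = next((t for t in labels if t in words), "other")
        clusters[label].append(snippet)
    return dict(clusters)


results = [
    "jaguar car dealer",
    "jaguar car review",
    "jaguar cat habitat",
    "jaguar cat zoo",
    "jaguar os",
]
clusters = cluster_results(results, "jaguar")
```

With the sample data, the snippets about cars and cats land in separate groups and the remaining one falls into "other". A real engine would need better labels and a second pass to build sub-clusters.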
|
 |
unreviewed
Joined: Dec 07, 2000
# Posts: 6776
|
Posted: 2004-Apr-20 01:56
Nice job.
I actually suggested to our board owner last year that we start a community project using Nutch.
I wish you success.
Any info you would/can share on your experience to date?
How large is your database, what are you currently using for hardware, that sort of thing ...
|
 |
objectsdirectory
Joined: Feb 17, 2004
# Posts: 9
|
Posted: 2004-Apr-20 04:20
Glad you like it. As you know, Nutch is in the beta stage, but it is getting better. The developer team at Nutch is doing a great job.
In my experience, a search engine requires a lot of hard work and money.
Some of the problems are listed below.
Note: This applies to a big database. If you have a small database just for personal research or use, then you may not need this many resources.
1) Taking care of servers and distributing the database across many machines.
2) Keeping our search engine current with the latest version of Nutch and integrating it with our own search technology.
3) Taking care of things as they go wrong, for example a crawler crash or a database update.
4) Needing lots of RAM and disk.
5) Continually adding new services to compete with other top search engines.
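As an illustration of point 1, here is a rough Python sketch of one common way to spread a document database across machines: hash each URL to pick a shard, then merge the per-shard hit lists at query time. This is an assumption for illustration, not a description of how Objects Search or Nutch distributes its data, and `shard_for` and `merge_top_k` are hypothetical names.

```python
import hashlib
import heapq


def shard_for(url, num_shards):
    """Pick a shard for a URL using a stable hash of its bytes."""
    digest = hashlib.md5(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def merge_top_k(per_shard_hits, k):
    """Merge (score, url) hit lists from all shards and keep the k best."""
    all_hits = (hit for hits in per_shard_hits for hit in hits)
    return heapq.nlargest(k, all_hits)


# Three shards answer the same query; one of them has no hits.
hits = merge_top_k([[(0.9, "a"), (0.2, "b")], [(0.7, "c")], []], k=2)
```

Hashing keeps the placement of a page stable across crawls, at the cost of having to move data when the number of shards changes.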
We are working on our hardware. You may have noticed that the speed is not as fast as other top search engines. We are adding more servers soon.
Currently we use some Dell servers and a few others. They include hard drives of sizes such as 250 GB.
Thanks.
|
 |
unreviewed
Joined: Dec 07, 2000
# Posts: 6776
|
Posted: 2004-Apr-20 07:10
I've attempted to build engines before, and in every case it comes down to the amount of memory. Indexing can be broken up into tasks that work from disk, but searching cannot. That must be in memory; it's the only way. Google, like you, is based on Linux, but has morphed it into a new operating system with a proprietary file system that uses a minimum 64-megabyte file cluster, which is not the type of system for using Notepad.
The point is that the whole I/O system is designed and optimised for creating and accessing large data files, files that are eventually loaded into memory. For redundancy, they don't use RAID. Their system basically keeps three copies of everything, and if something fails in one area, the system just redirects to the next location.
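The "keep three copies and redirect on failure" idea can be sketched in a few lines of Python. This is only an illustration of the general pattern, not Google's actual code. `read_with_failover` is a hypothetical helper, and each replica is modelled as a plain function that either returns the data or raises `OSError`.

```python
def read_with_failover(replicas, key):
    """Try each replica in order and return the first successful read."""
    last_error = None
    for replica in replicas:
        try:
            return replica(key)
        except OSError as err:
            # This copy is unavailable; remember why and try the next one.
            last_error = err
    raise RuntimeError(f"all {len(replicas)} replicas failed for {key!r}") from last_error


def dead_replica(key):
    raise OSError("disk offline")


def live_replica(key):
    return f"data:{key}"


# The first copy is down, so the read is served by the second one.
value = read_with_failover([dead_replica, live_replica, live_replica], "chunk-1")
```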
I'm interested in how your version is handling its database size compared to page count, and actual RAM load vs. disk paging at this point.
|
 |
objectsdirectory
Joined: Feb 17, 2004
# Posts: 9
|
Posted: 2004-May-10 11:04
Sorry for the late response; I've been busy. Here is a link that may help. It relates to Nutch, which we are using, and also shows the requirements for searching.
http://www.objectssearch.com/en/hardware.html
[ Message was edited by: objectsdirectory 05/10/2004 03:31 am ]
|
 |
byronm
Joined: Mar 20, 2004
# Posts: 26
|
Posted: 2004-May-10 15:03
Mozdex.com is a similar project utilizing nutch.
Mozdex uses 4 servers as query servers and 1 Xeon as the core database server. The current system is a beta index, so SERPs will change, but you can give feedback on performance, results and such.
We will be integrating a spell checker and some other services this afternoon, as well as launching a forge project and announcing some partnerships shortly.
|
 |
objectsdirectory
Joined: Feb 17, 2004
# Posts: 9
|
Posted: 2004-May-10 19:57
What does Objects Search offer?
+ General web database
A search results page contains:
++ Link to a cached copy of the page
++ A link with technical data explaining how the page was scored for relevancy
++ A link that shows a list of incoming anchors indexed for the page
+ Other info: phrase searching with quotes, implied "and", removing terms with a - (minus) sign
+ Spell Check
+ General web database with clustering
+ News and weblog search
+ Directory (also uses data from ODP)
+ Image Search
+ Interface available in ten languages
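To make the query syntax in the list above concrete (quoted phrases, implied "and", and minus to exclude a term), here is a small Python sketch of a parser and matcher. It is an assumption about how such syntax might be handled, not the actual Objects Search or Nutch query parser. `parse_query` and `matches` are hypothetical names.

```python
import re

# An optional leading minus, then either a quoted phrase or a bare word.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


def words(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def parse_query(query):
    """Return (required, excluded); each term is a list of consecutive words."""
    required, excluded = [], []
    for minus, phrase, word in TOKEN.findall(query):
        term = words(phrase or word)
        if term:
            (excluded if minus else required).append(term)
    return required, excluded


def contains(text_words, term):
    """True if the words of term appear consecutively in text_words."""
    n = len(term)
    return any(text_words[i:i + n] == term for i in range(len(text_words) - n + 1))


def matches(text, required, excluded):
    """Implied AND over required terms, and no excluded term may appear."""
    text_words = words(text)
    return (all(contains(text_words, t) for t in required)
            and not any(contains(text_words, t) for t in excluded))


required, excluded = parse_query('nutch "search engine" -google')
```

A page matches only if it contains every required term, with phrase words adjacent and in order, and none of the excluded ones.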
|
 |