Summary
| Project home: | http://watson.codeplex.com/ |
| Download Link: | Download |
| Developers: | Kostya Isaev, Alexander Ivanov, Ilya Semenov, Alexey Shcherbachev |
Description
Watson aims to be a complete replacement for the built-in search in Windows SharePoint Services v3 (WSS v3). Powered by Lucene.Net, it offers rich query syntax and fast indexing and querying. It runs on a pure WSS v3 platform and provides an end-user experience close to what MOSS search offers today.
Story
Some time ago we were looking for a standalone library or component to implement search for one of our projects. And we found it. Lucene.Net fulfilled all our needs and was really easy to use. That quick success with Lucene gave us the idea to build a Lucene-powered search solution for SharePoint. From the beginning it was an experiment (and it still is). We wanted to check how difficult it is to build something comparable to MOSS search, how fast we can make it, and what features we can pack into it.
Well, that was all about the first version. The next version is planned soon, with some really cool features on board.
Technical Details
Lucene itself is only an information retrieval library. It provides indexing and searching
capabilities, but that is all it does. Everything else is up to the developer.
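To make "indexing and searching" concrete, here is a toy inverted index in Python. It is only an illustration of the general idea, not Lucene's API or its actual index format; the `TinyIndex` class and its methods are hypothetical names invented for this sketch.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical, for illustration only)."""

    def __init__(self):
        # term -> set of ids of the documents that contain the term
        self.postings = defaultdict(set)

    @staticmethod
    def tokenize(text):
        # Lowercase and split on non-word characters.
        return re.findall(r"\w+", text.lower())

    def add(self, doc_id, text):
        """Indexing: record every term of the text under the document id."""
        for term in self.tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Searching: ids of documents containing every query term (AND)."""
        terms = self.tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


if __name__ == "__main__":
    index = TinyIndex()
    index.add(1, "Quarterly budget report")
    index.add(2, "Budget meeting notes")
    print(index.search("budget report"))
```

Everything around this core (finding the content, extracting text from files, scheduling, security, UI) is what a search solution has to supply itself.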
Initially we made several technical decisions to keep things simple:
- We crawl SharePoint content databases directly, mainly for performance reasons. We issue only read-only SQL queries, so the databases themselves are never modified. The index data is stored separately, on the hard drive.
- We use the periodic timer jobs infrastructure to perform full and incremental crawls. Here we decided to use built-in capabilities instead of creating our own indexing service. It is less flexible, but good enough for our needs.
- We want to keep the configuration UI as simple as possible. Everything should just work by default, out of the box. That is why, for example, we have no such thing as managed properties: fields are mapped straight through, as is, by name.
- We get all the power of the Lucene query syntax for free. We do want to enhance and tune it, but not now.
- We use the same IFilter COM components to extract text from Office and PDF documents, so our search can process the same set of documents as the built-in search.
- Facets are implemented using a .NET port of the bobo-browse project (the original project is at http://code.google.com/p/bobo-browse/) [will be published in the next version].
Overall, nothing especially complex and nothing brand new, but it did not take much time to implement, and it works!
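As a rough illustration of the first three decisions (read-only crawling, full vs. incremental runs, and mapping fields by name), here is a minimal Python sketch. It is not Watson's code: an in-memory SQLite table stands in for a content database, a plain dictionary stands in for the index, and the `items` table, its columns and the `crawl` function are all assumptions made up for this example. The real SharePoint schema is different.

```python
import sqlite3


def crawl(conn, index, since=None):
    """Copy rows into `index`; return the newest `modified` value seen.

    since=None  -> full crawl (every row)
    since=<ts>  -> incremental crawl (only rows modified after <ts>)
    """
    conn.row_factory = sqlite3.Row
    sql = "SELECT * FROM items"          # read-only: SELECT statements only
    params = ()
    if since is not None:
        sql += " WHERE modified > ?"
        params = (since,)

    latest = since
    for row in conn.execute(sql, params):
        # No managed properties: every column becomes an index field
        # with the same name.
        doc = {name: row[name] for name in row.keys()}
        index[doc["id"]] = doc
        if latest is None or doc["modified"] > latest:
            latest = doc["modified"]
    return latest


def make_demo_db():
    """Build a hypothetical stand-in for a content database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, "
        "author TEXT, modified TEXT)"
    )
    conn.execute("INSERT INTO items VALUES (1, 'Budget', 'alice', '2009-01-01T10:00:00')")
    conn.execute("INSERT INTO items VALUES (2, 'Roadmap', 'bob', '2009-01-02T10:00:00')")
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    index = {}
    mark = crawl(conn, index)                 # full crawl
    conn.execute(
        "UPDATE items SET title = 'Roadmap v2', "
        "modified = '2009-01-03T10:00:00' WHERE id = 2"
    )
    mark = crawl(conn, index, since=mark)     # incremental crawl
    print(index[2]["title"], mark)
```

In this sketch a timer job would simply call `crawl` on a schedule and persist the returned timestamp between runs. ISO-formatted timestamps are used so that plain string comparison orders them correctly.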
FAQ
Q: Is it safe to install this on a production system?
Q: Will it make my system unsupported if I place a support call to MS?
A: Well, it is safe in the sense that we execute only read-only SQL statements
against the SharePoint DB. But MS has a different point of view, described here: kb841057.
Their position is that even read-only queries can generate an unexpected workload.
We had a private, unofficial talk with two people at MS. They told us that a farm
with Watson installed can be considered 'conditionally supportable'. In case of a support call
you will be asked to uninstall Watson first, to prove that any performance or other issues
are not caused by Watson.
Anyway, if you have any performance issues with Watson, we encourage you to contact us first.
Q: Are multi-server farm configurations supported right now?
A: Nope. The current build supports only single-server farms.
But it does support a multi-farm scenario through XML configuration settings, meaning that
you can use one SharePoint server to crawl and search content from many SharePoint instances.
We are planning to implement a UI to manage these settings soon.
Q: Is this a strictly free venture?
A: Yes and no. Yes, the current version is absolutely free and will remain free in the future.
And no, in the sense that we are happy to offer paid annual technical support plans for Watson.
Feedback
Your feedback is crucial to us.
Please drop us a line if you would like something
to be implemented, or if you like or dislike our project. Any opinion is welcome.




