Tuesday, July 24, 2007

sharehound 1.2.3alpha is finally out

Well well, it's been a long while since the last release. Sharehound is not dead (yet ;). I've got a lot of ideas for some much-needed improvements, but unfortunately too little time lately.

This is mostly a configuration enhancements release; it also adds the ability to set the encoding for individual FTP hosts, plus some more enhancements and bugfixes. It can be downloaded from here.

  • hosts screen: files count is now updated right after host crawl is finished
  • introduced 'encoding' config parameter for FTP crawl tasks
  • hosts screen: new column, 'encoding'; added UpgradeTo123Task, an auto-launched migration task (visit app.properties file to set encoding for pre-1.2.3 hosts)
  • allowed user queries to begin with wildcards ('*', '?' symbols)
  • secured access to Ajaxified admin functions
  • hosts screen: 'edit' function for 'encoding' column for FTP hosts (for 'admin' role users only). It uses a configurable list of encodings from the WEB-INF/classes/encodings file. If you want to change the list, first copy that file from the expanded war file to the config directory. (see install.html for details on config dir)
  • improved URL interface: bare search.do now works as searchEntry.do. searchEntry.do is left working for some time for backward compatibility.
  • sharehound now requires a web container supporting JSP 2.0. For Tomcat this means 5.5 and higher.
  • sharehound is now distributed as a compressed war to ease future upgrades; extracted some config files, incl. quartz-config.xml and custom user accounts, to an external config directory. See install.html for details.
  • lucene.properties' "index.directory" property moved to a new properties file, dir.properties; introduced some new dir properties there and made them all optional :)
  • sharehound's data dir is now specified by a JVM parameter passed to the web container (e.g. Tomcat). See install.html for details.
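
To illustrate the last two items, here is a minimal sketch of how a web app can take its data dir from a JVM parameter and treat a directory property as optional. This is not sharehound's actual code: the system property name "sharehound.data.dir" and the "index" default subdirectory are assumptions made up for the example (the real parameter name is in install.html). Only the "index.directory" key comes from the notes above.

```java
import java.io.File;
import java.util.Properties;

class DataDirSketch {
    // Hypothetical property name, used for illustration only.
    static final String DATA_DIR_PROPERTY = "sharehound.data.dir";

    /** Reads the data dir from the given system properties; fails fast if it is missing. */
    static File resolveDataDir(Properties systemProps) {
        String value = systemProps.getProperty(DATA_DIR_PROPERTY);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException(
                    "JVM parameter -D" + DATA_DIR_PROPERTY + " is not set");
        }
        return new File(value.trim());
    }

    /**
     * Optional "index.directory": uses the configured value if present,
     * otherwise falls back to an assumed "index" subdirectory of the data dir.
     */
    static File resolveIndexDir(Properties dirProps, File dataDir) {
        String configured = dirProps.getProperty("index.directory");
        return configured != null ? new File(configured) : new File(dataDir, "index");
    }

    public static void main(String[] args) {
        // For the demo only: supply a value if the JVM was started without -D...
        if (System.getProperty(DATA_DIR_PROPERTY) == null) {
            System.setProperty(DATA_DIR_PROPERTY, "demo-data");
        }
        File dataDir = resolveDataDir(System.getProperties());
        // An empty Properties object stands in for a dir.properties with no entries.
        File indexDir = resolveIndexDir(new Properties(), dataDir);
        System.out.println("data dir:  " + dataDir.getPath());
        System.out.println("index dir: " + indexDir.getPath());
    }
}
```

With Tomcat, a parameter like this is typically passed through the JAVA_OPTS or CATALINA_OPTS environment variable as -Dname=value.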

Fixed bugs
  • fatal startup error if index.directory (specified in lucene.properties) doesn't exist
  • 'files' link from a hosts list item doesn't work for some time when the host is indexed for the first time
  • Admin screen: 'calculate hosts files counts' function's results are invisible for some time (until somebody calls hostIndexer.flush)
  • some non-ASCII FTP directories are not indexed (they appear empty in the index while they actually are not)
  • sharehound can only run on Java 1.6 (was unintentionally compiled in 1.6 only mode)
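
As background for the non-ASCII directories bug and the new per-host 'encoding' parameter, here is a small sketch of why the charset matters when decoding FTP file names. It is a generic illustration, not sharehound's crawler code: the class and method names are made up, and UTF-8 is just an assumed server encoding. A name decoded with the wrong charset cannot be turned back into the original bytes, so a later request for that directory may not match anything on the server.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class FtpEncodingSketch {
    /** Decodes the raw bytes of a file name using the encoding configured for the host. */
    static String decodeName(byte[] rawName, String encoding) {
        return new String(rawName, Charset.forName(encoding));
    }

    /** True if decoding and re-encoding with the given charset gives back the same bytes. */
    static boolean survivesRoundTrip(byte[] rawName, String encoding) {
        Charset cs = Charset.forName(encoding);
        return Arrays.equals(rawName, new String(rawName, cs).getBytes(cs));
    }

    public static void main(String[] args) {
        // A Cyrillic directory name, written with escapes to keep this source ASCII-only.
        String original = "\u041c\u0443\u0437\u044b\u043a\u0430";
        // Assumption for the demo: the FTP server sends names as UTF-8 bytes.
        byte[] raw = original.getBytes(Charset.forName("UTF-8"));

        System.out.println("right charset, name matches: "
                + decodeName(raw, "UTF-8").equals(original));
        System.out.println("wrong charset, name matches: "
                + decodeName(raw, "US-ASCII").equals(original));
        System.out.println("wrong charset, bytes survive: "
                + survivesRoundTrip(raw, "US-ASCII"));
    }
}
```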
(Blank) install instructions:
Please refer to docs/install.html file inside the zip.

Upgrade instructions:
Please refer to docs/release.txt file inside the zip.


xonix said...

Hi! Software is great, thanks!
Have two questions:

1) Is it possible to search and display host names rather than host ip's in the search?

2) How can I set multiple ranges of ip to scan? Should I create several quartz jobs - one job per range?
Thank you!

Artem Vasiliev said...

Thanks xonix!

1) Host names can be displayed, of course. They are displayed by default, and there's a 'Show host names' flag that appears when you click the 'Advanced' link.

You cannot search files by host names (or IPs), however. But you can limit your file search to a particular host. One way to do this: go to the 'Hosts' screen and find the host you're interested in (you can use the search box to search by IP), then click 'files' in that host's row. This will limit the file search to the root dir of that host.

You can also limit your search to a particular directory.

2) Yes, exactly. Look at config/.quartz-config-examples/quartz-config.xml.crawl
for an example.

Pradeep said...

I just deployed sharehound on our corporate network. I wonder whether you could do search within file contents (for certain filetypes like pdf, doc etc.). It would make it very useful. I had seen another lucene based project that gives different indexer plugins for files and allows to search within file contents using lucene's text query operator. can both be married? Wedding gifts await ;)

Artem Vasiliev said...

Hello Pradeep!

You're certainly not the only one who wanted content search. Though I never did, at least with this project. But I'm not against the marriage, only I'd like it to happen without me.
Btw, today we have sites like GitHub that help people a lot in arranging such marriages )