Category Archives: Project Log

My collection of web development projects.

Success with indexing Nutch 1.7 to Solr 4.5

Following up on my Stack Overflow post: the fix was to point both the crawl and the index commands at the URL of my Solr collection. In this case:


$ bin/nutch crawl urls -solr http://localhost:8983/solr/rockies -depth 1 -topN 5
$ bin/nutch solrindex http://localhost:8983/solr/rockies crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

Additionally, I set -depth to 1 (how many levels of links to follow from the seed URLs; here, a single level from the main page) and -topN to 5 (the maximum number of pages retrieved at each level).
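Since the fix came down to both commands using the same collection URL, here is a minimal sketch of building that URL once and reusing it. The SOLR_BASE variable and the solr_core_url helper are my own hypothetical names, not part of Nutch or Solr, and the script only prints the commands so they can be reviewed before running. The final curl line assumes Solr's standard select handler is enabled on the collection, which is a quick way to check that documents were indexed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch or Solr): build the collection URL
# in one place so the crawl and index steps cannot point at different cores.
SOLR_BASE="http://localhost:8983/solr"   # assumption: local Solr on the default port

solr_core_url() {
    # $1 is the collection (core) name, e.g. "rockies"
    printf '%s/%s\n' "$SOLR_BASE" "$1"
}

SOLR_URL="$(solr_core_url rockies)"

# Print the two commands from this post instead of executing them.
# The glob stays literal here because it is inside double quotes.
echo "bin/nutch crawl urls -solr $SOLR_URL -depth 1 -topN 5"
echo "bin/nutch solrindex $SOLR_URL crawl/crawldb -linkdb crawl/linkdb crawl/segments/*"

# Print a check to run afterwards: ask Solr for the match count only (rows=0).
echo "curl '$SOLR_URL/select?q=*:*&rows=0&wt=json'"
```

Copy the printed lines into a shell from the Nutch install directory once they look right.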

Frustrations with indexing Nutch 1.7 to Solr 4.5

I’m setting up Solr search for my company’s domain and have run into roadblock after roadblock. Now that I’ve hit another obstacle, my Google skills are pretty much exhausted. The project is almost two weeks in, on an aggressive timeline. I’ve had to resort to the Stack Overflow gods for the first time… hopefully it works.

Here’s the question in detail: Exception in thread “main” java.io.IOException: Job failed! on Nutch 1.7