Apache Nutch

By The Apache Software Foundation

4.0 out of 5 stars


Apache Nutch Reviews & Product Details


Apache Nutch Reviews (20)

G2 reviews are authentic and verified.
Narendra A.
Senior Software Engineer
Enterprise (> 1000 emp.)
"Apache Nutch is a rockstar at large-scale data crawling."
What do you like best about Apache Nutch?

When I used Apache Nutch, I was amazed by the speed at which it crawls data and by the libraries and data structures it provides for customizing the crawl and reading the data in the desired format. I was crawling all of IBM's data to gain insights and run text analytics on it. The support I got from the forums was also great. Overall, using the Apache Nutch crawler was a nice experience. Review collected by and hosted on G2.com.

What do you dislike about Apache Nutch?

What I disliked was the video support available on the Internet.

Jaydip L.
Senior Software Engineer
Small-Business (50 or fewer emp.)
"Very efficient, fast, open source crawling tool"
What do you like best about Apache Nutch?

Open Source

Scalable

Parsing and indexing techniques.

Easy integration with Elasticsearch and Solr.

Different plugins to parse various content types.

What do you dislike about Apache Nutch?

There is not much to dislike, because we really enjoyed it and it fulfilled our organization's needs. From experience, though, I can name some cons: it requires good infrastructure and consumes a fair amount of memory and CPU. We also feel that a good dashboard or admin panel would have been very helpful to us.

SA
Quality Assurance Test Engineer
Mid-Market (51-1000 emp.)
"Web Crawling Tool"
What do you like best about Apache Nutch?

It is an open source tool to which you can add your own plugins. You can change its code as you wish. It was very easy to use. It can also be run alongside other tools.

What do you dislike about Apache Nutch?

You need to know which version of Nutch is compatible with the other tools you work with.

Naser A.
Research Officer
Mid-Market (51-1000 emp.)
Business partner of the seller or seller's competitor, not included in G2 scores.
"I am a big data developer at KICS, UET Lahore, Pakistan"
What do you like best about Apache Nutch?

I have been using Apache Nutch for 3 or 4 years. I like it as an open source tool that can run on a system with ordinary specs and crawl millions upon millions of pages.

What do you dislike about Apache Nutch?

* I don't like its seed creation algorithm: it builds clusters and then goes into a loop, crawling the same websites again once it has crawled millions of pages.

* Its configuration is not easy.

* Job automation is not provided.

* Documentation is not good.

* Support is not good.

Prafulla R.
Technical Architect
Small-Business (50 or fewer emp.)
"Nutch is a lightweight scraping tool with a trivial learning curve."
What do you like best about Apache Nutch?

-Easy to configure

-Stable backend store

What do you dislike about Apache Nutch?

Use of Java makes it a little bulky.

One has to be careful with heap size; otherwise OOM errors are inevitable.

Krishnan S.
Software Engineer
Mid-Market (51-1000 emp.)
"Extract to the depth"
What do you like best about Apache Nutch?

Crawling URLs is an excellent way to read content. Nutch is a very useful tool for reading document content at various depths.

What do you dislike about Apache Nutch?

It is a bit hard to customize the crawl function.

Ruchika J.
Hadoop Developer
Small-Business (50 or fewer emp.)
Business partner of the seller or seller's competitor, not included in G2 scores.
"Nutch is a highly scalable open source web crawler. It can be customized according to requirements."
What do you like best about Apache Nutch?

Plugins for indexing and searching.

Integration with Solr and other tools.

It works well in Hadoop clusters too.

What do you dislike about Apache Nutch?

Lack of a community in which to discuss issues or concerns.

Lack of documentation for implementing and integrating Nutch.

Usama T.
Python Developer
Mid-Market (51-1000 emp.)
"A great web crawler for all crawling needs"
What do you like best about Apache Nutch?

Its ability to crawl the complete web by following inlinks and outlinks, which lets it keep crawling indefinitely.

What do you dislike about Apache Nutch?

You need very strong knowledge of Apache Hadoop, HBase, ZooKeeper, and the complete environment setup, and you have to be very proficient with them to use Nutch. Moreover, the HBase data cannot be viewed easily, which makes things harder.

Fred Z.
Founder
Enterprise (> 1000 emp.)
"Nutch is a reliable, mature open source crawler"
What do you like best about Apache Nutch?

I have deployed Nutch several times when I needed to stand up a crawler quickly. It is free, straightforward, reliable, well documented, and comes with an OTS integration with Apache Solr for search.

What do you dislike about Apache Nutch?

The directory and file partitioning scheme for the crawler can be a bit confusing.

Verified User in Pharmaceuticals
IP
Small-Business (50 or fewer emp.)
Business partner of the seller or seller's competitor, not included in G2 scores.
"Best for web crawling"
What do you like best about Apache Nutch?

I like the default index generation for the crawler.

What do you dislike about Apache Nutch?

When working on Ubuntu, I find it hard to set the directory paths.

Pricing

Pricing details for this product aren’t currently available. Visit the vendor’s website to learn more.

Apache Nutch Comparisons
Apache Tika