Hi, I just installed this crawler and I'm having an issue. I'm testing it with just one URL, and it seems to get stuck on the Nutch InjectorJob. Nothing happens after the following output:
[nutch-indexer-discovery]$ ./crawl
Injecting urls from ./seed/urls.txt
./build/apache-nutch-2.3.1/runtime/local/bin/nutch inject ./seed/urls.txt
InjectorJob: starting at 2018-10-23 13:13:36
InjectorJob: Injecting urlDir: seed/urls.txt
Installation and setup went fine, except for a warning when I ran ./gradlew buildPlugin:
[ant:taskdef] Could not load definitions from resource org/sonar/ant/antlib.xml. It could not be found.
Any idea what might be wrong here?
So it's not stuck, just very, very slow: it took 2 hours to inject one URL.
It's currently at this stage:
./build/apache-nutch-2.3.1/runtime/local/bin/nutch inject ./seed/urls.txt
InjectorJob: starting at 2018-10-23 13:19:22
InjectorJob: Injecting urlDir: seed/urls.txt
InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
InjectorJob: total number of urls rejected by filters: 0
InjectorJob: total number of urls injected after normalization and filtering: 1
Injector: finished at 2018-10-23 15:21:13, elapsed: 02:01:50
Generate urls:
./build/apache-nutch-2.3.1/runtime/local/bin/nutch generate -topN 5
GeneratorJob: starting at 2018-10-23 15:21:14
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: true
GeneratorJob: normalizing: true
GeneratorJob: topN: 5