new Tika(message_obj)
Tika wrapper class for processing and downloading documents.
Parameters:
Name | Type | Description |
---|---|---|
message_obj | Message | |
Members
(private, inner) tika_queue_lock :Lock
Locks processNext while the current operation is still running.
Type: Lock
(private, inner) tika_queue_obj :TikaQueue
The Tika queue object.
Type: TikaQueue
Methods
addFileToStore(url, callback)
Starts downloading a file.
Parameters:
Name | Type | Description |
---|---|---|
url | String | |
callback | function | |
extractText(url, callback)
Extracts text from a downloaded document in pdf-store using the tika-server.jar API.
Parameters:
Name | Type | Description |
---|---|---|
url | String | |
callback | function | |
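As a rough illustration of what talking to tika-server can look like, the sketch below streams a stored file to the server's `PUT /tika` endpoint and asks for plain text back. The helper names `tikaRequestOptions` and `extractTextSketch`, and the `localhost:9998` address, are assumptions made for this example. They are not taken from this class, whose actual implementation may differ.

```javascript
const fs = require('fs');
const http = require('http');

// Assumption: tika-server is reachable on localhost:9998 (its usual default).
function tikaRequestOptions(host = 'localhost', port = 9998) {
  return {
    host,
    port,
    method: 'PUT',
    path: '/tika',
    headers: { Accept: 'text/plain' }, // ask for plain text output
  };
}

// Hypothetical helper: stream a stored file to tika-server and collect the text.
function extractTextSketch(filePath, callback) {
  let finished = false;
  // Make sure the callback fires only once, even if several errors arrive.
  const done = (err, text) => {
    if (finished) return;
    finished = true;
    callback(err, text);
  };

  const req = http.request(tikaRequestOptions(), (res) => {
    const chunks = [];
    res.on('data', (chunk) => chunks.push(chunk));
    res.on('error', done);
    res.on('end', () => {
      if (res.statusCode !== 200) {
        done(new Error('tika-server responded with status ' + res.statusCode));
        return;
      }
      done(null, Buffer.concat(chunks).toString('utf8'));
    });
  });
  req.on('error', done);

  const input = fs.createReadStream(filePath);
  input.on('error', (err) => {
    req.destroy(); // abandon the request if the file cannot be read
    done(err);
  });
  input.pipe(req); // pipe() ends the request when the file is fully sent
}
```

A caller would pass the path of a stored document and a Node-style `(err, text)` callback.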
getFileName(url)
Converts a URL into the corresponding pdf-store file location.
Parameters:
Name | Type | Description |
---|---|---|
url | String | |
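The documentation does not say how the URL is mapped to a file location. One common approach, shown here purely as an assumption, is to hash the URL so that the result is deterministic and free of characters that are invalid in file names. The directory name `pdf-store` comes from the description above. The function name `getFileNameSketch` and the choice of SHA-1 are hypothetical.

```javascript
const nodeCrypto = require('crypto');
const path = require('path');

// Assumption: the store is a directory named "pdf-store" relative to the process.
const PDF_STORE_DIR = 'pdf-store';

// Map a URL to a deterministic file path inside the store.
// The same URL always yields the same path; different URLs yield different paths.
function getFileNameSketch(url) {
  const digest = nodeCrypto.createHash('sha1').update(url).digest('hex');
  return path.join(PDF_STORE_DIR, digest);
}
```

The same idea would apply to `getParsedFileName` with a different base directory.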
getParsedFileName(url)
Converts a URL into the corresponding pdf-store-parsed file location.
Parameters:
Name | Type | Description |
---|---|---|
url | String | |
indexTikaDoc(url)
Dumps the job file for indexing.
Parameters:
Name | Type | Description |
---|---|---|
url | String | |
removeFile(url, callback)
Removes a file from pdf-store.
Parameters:
Name | Type | Description |
---|---|---|
url | String | |
callback | function | |
startServer()
Starts tika-server.jar as a spawned process.
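A minimal sketch of spawning a jar from Node.js is given below, assuming `java` is on the `PATH`. The names `tikaServerArgs` and `startServerSketch` and the `jarPath` parameter are hypothetical and only illustrate the general pattern, not this method's actual code.

```javascript
const { spawn } = require('child_process');

// Build the java arguments for running a jar file.
function tikaServerArgs(jarPath) {
  return ['-jar', jarPath];
}

// Hypothetical helper: launch the server as a child process and report failures.
function startServerSketch(jarPath) {
  const child = spawn('java', tikaServerArgs(jarPath), {
    stdio: ['ignore', 'inherit', 'inherit'], // forward server output to this process
  });
  // 'error' fires when the process could not be spawned (for example, java is missing).
  child.on('error', (err) => {
    console.error('could not start tika-server:', err.message);
  });
  child.on('exit', (code, signal) => {
    console.error(`tika-server exited (code=${code}, signal=${signal})`);
  });
  return child;
}
```

Returning the child process lets the caller stop the server later with `child.kill()`.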
submitFile(url, callback)
Submits a file for download and parsing.
Parameters:
Name | Type | Description |
---|---|---|
url | String | |
callback | function | |
(private, inner) failSafe()
Runs in a setInterval. If processNext has been locked for a long time, it recovers the lock.
(private, inner) processNext()
Dequeues jobs from the Tika queue and processes them. Runs in a setInterval.
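The interplay between processNext and failSafe can be sketched as a small worker that holds a lock while a job is in flight and force-releases it once it looks stale. Everything here (`createWorker`, `handleJob`, `staleAfterMs`, the plain array used as a queue) is a hypothetical stand-in for the private Lock and TikaQueue objects, whose real behavior is not documented on this page.

```javascript
// Hypothetical worker: one job at a time, with a fail-safe for stuck locks.
function createWorker(handleJob, staleAfterMs) {
  const queue = [];
  let locked = false;
  let lockedAt = 0;
  let generation = 0; // identifies which job currently owns the lock

  // Take one job if nothing else is in flight. Returns true if a job was started.
  function processNext(now = Date.now()) {
    if (locked || queue.length === 0) return false;
    locked = true;
    lockedAt = now;
    const mine = ++generation;
    const job = queue.shift();
    // The job releases the lock when done, but only if it still owns it.
    handleJob(job, () => {
      if (mine === generation) locked = false;
    });
    return true;
  }

  // Force-release a lock held longer than staleAfterMs. Returns true if it did.
  function failSafe(now = Date.now()) {
    if (locked && now - lockedAt > staleAfterMs) {
      generation++; // invalidate the stuck job's release callback
      locked = false;
      return true;
    }
    return false;
  }

  return {
    enqueue: (job) => queue.push(job),
    processNext,
    failSafe,
    isLocked: () => locked,
  };
}
```

Both functions would typically be driven by timers, for example `setInterval(() => worker.processNext(), 1000)` and `setInterval(() => worker.failSafe(), 5000)`. The intervals are illustrative. The generation counter keeps a job that finishes late from unlocking a newer job after the fail-safe has already recovered the lock.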