diff --git a/README.md b/README.md
index 53759ca..6a697e5 100644
--- a/README.md
+++ b/README.md
@@ -56,205 +56,13 @@ or
# Documentation
* [Examples](./docs/examples.md)
-* [Tesseract.recognize](#tesseractrecognizeimage-imagelike-options---tesseractjob)
- + [Simple Example](#simple-example)
- + [More Complicated Example](#more-complicated-example)
-* [Tesseract.detect](#tesseractdetectimage-imagelike---tesseractjob)
-* [ImageLike](#imagelike)
-* [TesseractJob](#tesseractjob)
- + [TesseractJob.progress](#tesseractjobprogresscallback-function---tesseractjob)
- + [TesseractJob.then](#tesseractjobthencallback-function---tesseractjob)
- + [TesseractJob.catch](#tesseractjobcatchcallback-function---tesseractjob)
- + [TesseractJob.finally](#tesseractjobfinallycallback-function---tesseractjob)
-* [Local Installation](#local-installation)
- + [corePath](#corepath)
- + [workerPath](#workerpath)
- + [langPath](#langpath)
-* [Contributing](#contributing)
- + [Development](#development)
- + [Building Static Files](#building-static-files)
- + [Send us a Pull Request!](#send-us-a-pull-request)
+* [Image Format](./docs/image-format.md)
+* [API](./docs/api.md)
+* [Local Installation](./docs/local-installation.md)
+# Contributing
-## Tesseract.recognize(image: [ImageLike](#imagelike)[, options]) -> [TesseractJob](#tesseractjob)
-Figures out what words are in `image`, where the words are in `image`, etc.
-> Note: `image` should be sufficiently high resolution.
-> Often, the same image will get much better results if you upscale it before calling `recognize`.
-
-- `image` is any [ImageLike](#imagelike) object.
-- `options` is either absent (in which case it is interpreted as `'eng'`), a string specifing a language short code from the [language list](./docs/tesseract_lang_list.md), or a flat json object that may:
- + include properties that override some subset of the [default tesseract parameters](./docs/tesseract_parameters.md)
- + include a `lang` property with a value from the [list of lang parameters](./docs/tesseract_lang_list.md)
-
-Returns a [TesseractJob](#tesseractjob) whose `then`, `progress`, `catch` and `finally` methods can be used to act on the result.
-
-### Simple Example:
-```javascript
-Tesseract.recognize(myImage)
-.then(function(result){
- console.log(result)
-})
-```
-
-### More Complicated Example:
-```javascript
-// if we know our image is of spanish words without the letter 'e':
-Tesseract.recognize(myImage, {
- langs: 'spa',
- tessedit_char_blacklist: 'e'
-})
-.then(function(result){
- console.log(result)
-})
-```
-
-
-
-
-## Tesseract.detect(image: [ImageLike](#imagelike)) -> [TesseractJob](#tesseractjob)
-
-Figures out what script (e.g. 'Latin', 'Chinese') the words in image are written in.
-
-- `image` is any [ImageLike](#imagelike) object.
-
-Returns a [TesseractJob](#tesseractjob) whose `then`, `progress`, `catch` and `finally` methods can be used to act on the result of the script.
-
-
-```javascript
-Tesseract.detect(myImage)
-.then(function(result){
- console.log(result)
-})
-```
-
-
-## ImageLike
-
-The main Tesseract.js functions take an `image` parameter, which should be something that is like an image. What's considered "image-like" differs depending on whether it is being run from the browser or through NodeJS.
-
-On a browser, an image can be:
-- an `img`, `video`, or `canvas` element
-- a `File` object (from a file `<input>` or drag-drop event)
-- a path or URL to an accessible image (the image must either be hosted locally)
-
-In Node.js, an image can be
-- a path to a local image
-
-
-## TesseractJob
-
-A TesseractJob is an object returned by a call to `recognize` or `detect`. It's inspired by the ES6 Promise interface and provides `then` and `catch` methods. It also provides `finally` method, which will be fired regardless of the job fate. One important difference is that these methods return the job itself (to enable chaining) rather than new.
-
-Typical use is:
-```javascript
-Tesseract.recognize(myImage)
- .progress(message => console.log(message))
- .catch(err => console.error(err))
- .then(result => console.log(result))
- .finally(resultOrError => console.log(resultOrError))
-```
-
-Which is equivalent to:
-```javascript
-var job1 = Tesseract.recognize(myImage);
-
-job1.progress(message => console.log(message));
-
-job1.catch(err => console.error(err));
-
-job1.then(result => console.log(result));
-
-job1.finally(resultOrError => console.log(resultOrError));
-```
-
-
-
-### TesseractJob.progress(callback: function) -> TesseractJob
-Sets `callback` as the function that will be called every time the job progresses.
-- `callback` is a function with the signature `callback(progress)` where `progress` is a json object.
-
-For example:
-```javascript
-Tesseract.recognize(myImage)
- .progress(function(message){console.log('progress is: ', message)})
-```
-
-The console will show something like:
-```javascript
-progress is: {loaded_lang_model: "eng", from_cache: true}
-progress is: {initialized_with_lang: "eng"}
-progress is: {set_variable: Object}
-progress is: {set_variable: Object}
-progress is: {recognized: 0}
-progress is: {recognized: 0.3}
-progress is: {recognized: 0.6}
-progress is: {recognized: 0.9}
-progress is: {recognized: 1}
-```
-
-
-### TesseractJob.then(callback: function) -> TesseractJob
-Sets `callback` as the function that will be called if and when the job successfully completes.
-- `callback` is a function with the signature `callback(result)` where `result` is a json object.
-
-
-For example:
-```javascript
-Tesseract.recognize(myImage)
- .then(function(result){console.log('result is: ', result)})
-```
-
-The console will show something like:
-```javascript
-result is: {
- blocks: Array[1]
- confidence: 87
- html: "
-```
-
-### TesseractJob.catch(callback: function) -> TesseractJob
-Sets `callback` as the function that will be called if the job fails.
-- `callback` is a function with the signature `callback(error)` where `error` is a json object.
-
-### TesseractJob.finally(callback: function) -> TesseractJob
-Sets `callback` as the function that will be called regardless if the job fails or success.
-- `callback` is a function with the signature `callback(resultOrError)` where `resultOrError` is a json object.
-
-## Local Installation
-
-In the browser, `tesseract.js` simply provides the API layer. Internally, it opens a WebWorker to handle requests. That worker itself loads code from the Emscripten-built `tesseract.js-core` which itself is hosted on a CDN. Then it dynamically loads language files hosted on another CDN.
-
-Because of this we recommend loading `tesseract.js` from a CDN. But if you really need to have all your files local, you can use the `Tesseract.create` function which allows you to specify custom paths for workers, languages, and core.
-
-```javascript
-window.Tesseract = Tesseract.create({
- workerPath: '/path/to/worker.js',
- langPath: 'https://cdn.jsdelivr.net/gh/naptha/tessdata@gh-pages/3.02/',
- corePath: 'https://cdn.jsdelivr.net/gh/naptha/tesseract.js-core@0.1.0/index.js',
-})
-```
-
-### corePath
-A string specifying the location of the [tesseract.js-core library](https://github.com/naptha/tesseract.js-core), with default value 'https://cdn.jsdelivr.net/gh/naptha/tesseract.js-core@0.1.0/index.js'. Set this string before calling `Tesseract.recognize` and `Tesseract.detect` if you want Tesseract.js to use a different file.
-
-### workerPath
-A string specifying the location of the [worker.js](./dist/worker.js) file. Set this string before calling `Tesseract.recognize` and `Tesseract.detect` if you want Tesseract.js to use a different file.
-
-### langPath
-A string specifying the location of the tesseract language files, with default value 'https://cdn.jsdelivr.net/gh/naptha/tessdata@gh-pages/3.02/'. Language file URLs are calculated according to the formula `langPath + langCode + '.traineddata.gz'`. Set this string before calling `Tesseract.recognize` and `Tesseract.detect` if you want Tesseract.js to use different language files.
-
-
-## Contributing
-### Development
+## Development
To run a development copy of tesseract.js, first clone this repo.
```shell
> git clone https://github.com/naptha/tesseract.js.git
@@ -269,18 +77,16 @@ Then, `cd tesseract.js && npm install && npm start`
Starting up http-server, serving ./
Available on:
- http://127.0.0.1:7355
- http://[your ip]:7355
+ http://127.0.0.1:3000
+ http://[your ip]:3000
```
-Then open `http://localhost:7355/examples/file-input/demo.html` in your favorite browser. The devServer automatically rebuilds `tesseract.js` and `tesseract.worker.js` when you change files in the src folder.
+Then open `http://localhost:3000/examples/browser/demo.html` in your favorite browser. The devServer automatically rebuilds `tesseract.dev.js` and `worker.min.js` when you change files in the src folder.
-### Building Static Files
+## Building Static Files
After you've cloned the repo and run `npm install` as described in the [Development Section](#development), you can build static library files in the dist folder with
+
```shell
> npm run build
```
-
-### Send us a Pull Request!
-Thanks :)
diff --git a/docs/api.md b/docs/api.md
new file mode 100644
index 0000000..186e64a
--- /dev/null
+++ b/docs/api.md
@@ -0,0 +1,146 @@
+# API
+
+## Tesseract.recognize(image [, options]) -> [TesseractJob](#tesseractjob)
+Figures out what words are in `image`, where the words are in `image`, etc.
+> Note: `image` should be sufficiently high resolution.
+> Often, the same image will get much better results if you upscale it before calling `recognize`.
+
+- `image`: see [Image Format](./image-format.md) for details.
+- `options` is either absent (in which case it is interpreted as `'eng'`), a string specifying a language short code from the [language list](./tesseract_lang_list.md), or a flat JSON object that may:
+ + include properties that override some subset of the [default tesseract parameters](./tesseract_parameters.md)
+ + include a `lang` property with a value from the [list of lang parameters](./tesseract_lang_list.md). Multiple languages can be combined with `+`, e.g. `eng+chi_tra`
+
+Returns a [TesseractJob](#tesseractjob) whose `then`, `progress`, `catch` and `finally` methods can be used to act on the result.
+
+### Simple Example:
+```javascript
+const worker = new Tesseract.TesseractWorker();
+worker
+ .recognize(myImage)
+ .then(function(result){
+ console.log(result);
+ });
+```
+
+### More Complicated Example:
+```javascript
+const worker = new Tesseract.TesseractWorker();
+// if we know our image is of spanish words without the letter 'e':
+worker
+ .recognize(myImage, {
+ lang: 'spa',
+ tessedit_char_blacklist: 'e',
+ })
+ .then(function(result){
+ console.log(result);
+ });
+```
+
+## Tesseract.detect(image) -> [TesseractJob](#tesseractjob)
+
+Figures out what script (e.g. 'Latin', 'Chinese') the words in image are written in.
+
+- `image`: see [Image Format](./image-format.md) for details.
+
+Returns a [TesseractJob](#tesseractjob) whose `then`, `progress`, `catch` and `finally` methods can be used to act on the result.
+
+```javascript
+const worker = new Tesseract.TesseractWorker();
+worker
+ .detect(myImage)
+ .then(function(result){
+ console.log(result);
+ });
+```
+
+## TesseractJob
+
+A TesseractJob is an object returned by a call to `recognize` or `detect`. It's inspired by the ES6 Promise interface and provides `then` and `catch` methods. It also provides a `finally` method, which is called whether the job succeeds or fails. One important difference from Promises is that these methods return the job itself (to enable chaining) rather than a new object.
+
+Typical use is:
+```javascript
+const worker = new Tesseract.TesseractWorker();
+worker.recognize(myImage)
+ .progress(message => console.log(message))
+ .catch(err => console.error(err))
+ .then(result => console.log(result))
+ .finally(resultOrError => console.log(resultOrError));
+```
+
+Which is equivalent to:
+```javascript
+const worker = new Tesseract.TesseractWorker();
+const job1 = worker.recognize(myImage);
+
+job1.progress(message => console.log(message));
+
+job1.catch(err => console.error(err));
+
+job1.then(result => console.log(result));
+
+job1.finally(resultOrError => console.log(resultOrError));
+```
+
+
+
+### TesseractJob.progress(callback: function) -> TesseractJob
+Sets `callback` as the function that will be called every time the job progresses.
+- `callback` is a function with the signature `callback(progress)` where `progress` is a json object.
+
+For example:
+```javascript
+const worker = new Tesseract.TesseractWorker();
+worker.recognize(myImage)
+ .progress(function(message){console.log('progress is: ', message)});
+```
+
+The console will show something like:
+```javascript
+progress is: {loaded_lang_model: "eng", from_cache: true}
+progress is: {initialized_with_lang: "eng"}
+progress is: {set_variable: Object}
+progress is: {set_variable: Object}
+progress is: {recognized: 0}
+progress is: {recognized: 0.3}
+progress is: {recognized: 0.6}
+progress is: {recognized: 0.9}
+progress is: {recognized: 1}
+```
+
+
+### TesseractJob.then(callback: function) -> TesseractJob
+Sets `callback` as the function that will be called if and when the job successfully completes.
+- `callback` is a function with the signature `callback(result)` where `result` is a json object.
+
+
+For example:
+```javascript
+const worker = new Tesseract.TesseractWorker();
+worker.recognize(myImage)
+ .then(function(result){console.log('result is: ', result)});
+```
+
+The console will show something like:
+```javascript
+result is: {
+ blocks: Array[1]
+ confidence: 87
+ html: "
+```
+
+### TesseractJob.catch(callback: function) -> TesseractJob
+Sets `callback` as the function that will be called if the job fails.
+- `callback` is a function with the signature `callback(error)` where `error` is a json object.
+
+### TesseractJob.finally(callback: function) -> TesseractJob
+Sets `callback` as the function that will be called regardless of whether the job fails or succeeds.
+- `callback` is a function with the signature `callback(resultOrError)` where `resultOrError` is a json object.
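Note on the chaining behaviour documented for `TesseractJob` in `docs/api.md`: the sketch below illustrates what "these methods return the job itself" means in practice. `SketchJob`, `emitProgress`, `succeed`, and `fail` are hypothetical names invented for this illustration; they are not part of Tesseract.js and say nothing about how the library is implemented internally.

```javascript
// Hypothetical stand-in for a TesseractJob; NOT the library's implementation.
class SketchJob {
  constructor() {
    this.handlers = { progress: [], resolve: [], reject: [], finally: [] };
  }

  // Each registration method returns the job itself, so calls can be chained.
  progress(cb) { this.handlers.progress.push(cb); return this; }
  then(cb) { this.handlers.resolve.push(cb); return this; }
  catch(cb) { this.handlers.reject.push(cb); return this; }
  finally(cb) { this.handlers.finally.push(cb); return this; }

  // Illustrative triggers that a worker might call (assumed names).
  emitProgress(message) {
    this.handlers.progress.forEach(cb => cb(message));
  }
  succeed(result) {
    this.handlers.resolve.forEach(cb => cb(result));
    this.handlers.finally.forEach(cb => cb(result));
  }
  fail(error) {
    this.handlers.reject.forEach(cb => cb(error));
    this.handlers.finally.forEach(cb => cb(error));
  }
}

const job = new SketchJob();
const log = [];

// Every call in the chain hands back the same job object.
const returned = job
  .progress(m => log.push(['progress', m.recognized]))
  .catch(err => log.push(['catch', err.message]))
  .then(result => log.push(['then', result.text]))
  .finally(r => log.push(['finally', r.text]));

job.emitProgress({ recognized: 0.5 });
job.succeed({ text: 'hello' });

console.log(returned === job); // true
console.log(log);
```

Because every method returns the same object, the order of `.catch` and `.then` in the chain does not change which handler fires, unlike with a real Promise chain where each call produces a new Promise.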
diff --git a/docs/examples.md b/docs/examples.md
index 5385c4e..e778f97 100644
--- a/docs/examples.md
+++ b/docs/examples.md
@@ -1,5 +1,7 @@
# Tesseract.js Examples
+You can also check the [examples](../examples) folder.
+
### basic
```javascript
diff --git a/docs/image-format.md b/docs/image-format.md
new file mode 100644
index 0000000..48b2126
--- /dev/null
+++ b/docs/image-format.md
@@ -0,0 +1,13 @@
+# Image Format
+
+Supported formats: **bmp, jpg, png, pbm**
+
+The main Tesseract.js functions (e.g. `recognize`, `detect`) take an `image` parameter, which should be something image-like. What counts as "image-like" depends on whether the code runs in the browser or in Node.js.
+
+In a browser, an image can be:
+- an `img`, `video`, or `canvas` element
+- a `File` object (from a file `<input>`)
+- a path or URL to an accessible image
+
+In Node.js, an image can be:
+- a path to a local image
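Note on the format list in `docs/image-format.md`: for the Node.js case, where the image is a local path, a caller may want to reject unsupported files before starting a job. The sketch below is one way to do that. `isSupportedImagePath` and `SUPPORTED_FORMATS` are hypothetical names for this illustration, not part of Tesseract.js, and the check looks only at the file extension, not at the file contents.

```javascript
const path = require('path');

// Formats listed in the documentation above.
const SUPPORTED_FORMATS = ['bmp', 'jpg', 'png', 'pbm'];

// Hypothetical helper: true if the path's extension is one of the listed formats.
function isSupportedImagePath(imagePath) {
  const ext = path.extname(imagePath).slice(1).toLowerCase();
  return SUPPORTED_FORMATS.includes(ext);
}

console.log(isSupportedImagePath('scans/page-1.PNG')); // true
console.log(isSupportedImagePath('scans/page-1.pdf')); // false
```

This is meant for plain file paths only; a URL with a query string would need to be parsed first.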
diff --git a/docs/local-installation.md b/docs/local-installation.md
new file mode 100644
index 0000000..aeb53bc
--- /dev/null
+++ b/docs/local-installation.md
@@ -0,0 +1,24 @@
+## Local Installation
+
+In a browser environment, `tesseract.js` simply provides the API layer. Internally, it opens a WebWorker to handle requests. That worker loads code from the Emscripten-built `tesseract.js-core`, which is hosted on a CDN. Then it dynamically loads language files hosted on another CDN.
+
+Because of this, we recommend loading `tesseract.js` from a CDN. But if you really need to have all your files local, you can pass extra arguments to `TesseractWorker` to specify custom paths for workers, languages, and core.
+
+In a Node.js environment, the only path you may want to customize is `langPath`.
+
+```javascript
+const worker = new Tesseract.TesseractWorker({
+ workerPath: 'https://cdn.jsdelivr.net/gh/naptha/tesseract.js@v2.0.0/dist/worker.min.js',
+ langPath: 'https://tessdata.projectnaptha.com/4.0.0',
+ corePath: 'https://cdn.jsdelivr.net/gh/naptha/tesseract.js-core@v2.0.0-beta.5/tesseract-core.js',
+});
+```
+
+### workerPath
+A string specifying the location of the [worker.min.js](../dist/worker.min.js) file.
+
+### langPath
+A string specifying the location of the tesseract language files, with default value 'https://tessdata.projectnaptha.com/4.0.0'. Language file URLs are calculated according to the formula `langPath + '/' + langCode + '.traineddata.gz'`.
+
+### corePath
+A string specifying the location of the [tesseract.js-core library](https://github.com/naptha/tesseract.js-core), with default value 'https://cdn.jsdelivr.net/gh/naptha/tesseract.js-core@v2.0.0-beta.5/tesseract-core.js'.
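Note on the `langPath` formula in `docs/local-installation.md`: the new default value has no trailing slash, so a plain string concatenation of `langPath` and `langCode` would run the two together. The sketch below shows a join that yields exactly one slash whether or not `langPath` ends with one. `langFileUrl` and `langFileUrls` are hypothetical helpers for this illustration, not library functions, and fetching one file per `+`-separated language code is an assumption.

```javascript
// Hypothetical helper: build the URL of one language file.
function langFileUrl(langPath, langCode) {
  // Drop any trailing slashes so the join never produces "//" or a missing "/".
  const base = langPath.replace(/\/+$/, '');
  return `${base}/${langCode}.traineddata.gz`;
}

// Hypothetical helper: expand a combined `lang` value such as 'eng+chi_tra'.
function langFileUrls(langPath, lang) {
  return lang.split('+').map(code => langFileUrl(langPath, code));
}

console.log(langFileUrl('https://tessdata.projectnaptha.com/4.0.0', 'eng'));
// https://tessdata.projectnaptha.com/4.0.0/eng.traineddata.gz

console.log(langFileUrls('/local/tessdata/', 'eng+chi_tra'));
```

When hosting language files locally, requesting one of the generated URLs by hand is a quick way to confirm the directory layout matches what the worker will ask for.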