diff --git a/README.md b/README.md
index b555448..4771b6f 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,8 @@
 # tesseract.js
 Tesseract.js is a pure javascript version of the Tesseract OCR Engine that can recognize English, Chinese, Russian, and 60 other languages.
+Tesseract.js lets your code get the words out of scanned documents and other images.
+
 # Installation
@@ -12,14 +14,11 @@ Tesseract.js works with a `
+
@@ -27,16 +26,16 @@ worker.recognize('#my-image')
 ### Local
-First grab copies of `tesseract.js` and `tesseract.worker.js` from the [dist folder](https://github.com/naptha/tesseract.js/tree/master/dist). Then include `tesseract.js` on your page like this:
+First grab copies of `tesseract.js` and `tesseract.worker.js` from the [dist folder](https://github.com/naptha/tesseract.js/tree/master/dist). Then include `tesseract.js` on your page, and set `Tesseract.workerUrl` like this:
 ```html
@@ -51,30 +50,45 @@ worker.recognize('#my-image')
 ```-->
 # Docs
-## Tesseract.recognize(image) -> [TesseractJob](#tesseractjob)
-Returns a TesseractJob whose `then` method can be used to act on the result of the OCR.
-
-For example:
-
-`image` can be
- - an `img` element or querySelector that matches an `img` element
- - a `video` element or querySelector that matches a `video` element
- - a `canvas` element or querySelector that matches a `canvas` element
- - a CanvasRenderingContext2D (returned by `canvas.getContext('2d')`)
- - the absolute `url` of an image from the same website that is running your script. Browser security policies don't allow access to the content of images from other websites :(
-
-
-
-## Tesseract.detect(image) -> [TesseractJob](#tesseractjob)
-Returns a TesseractJob whose `then` method can be used to act on the result of the OCR.
-
-For example:
-
-`image` can be
- - an `img` element or querySelector that matches an `img` element
- - a `video` element or querySelector that matches a `video` element
- - a `canvas` element or querySelector that matches a `canvas` element
- - a CanvasRenderingContext2D (returned by `canvas.getContext('2d')`)
- - the absolute `url` of an image from the same website that is running your script. Browser security policies don't allow access to the content of images from other websites :(
+
+## ImageLike
+The main Tesseract.js functions take an `image` parameter, which should be something that is 'image-like'.
+That means `image` should be
+- an `img` element or querySelector that matches an `img` element
+- a `video` element or querySelector that matches a `video` element
+- a `canvas` element or querySelector that matches a `canvas` element
+- a CanvasRenderingContext2D (returned by `canvas.getContext('2d')`)
+- the absolute `url` of an image from the same website that is running your script. Browser security policies don't allow access to the content of images from other websites :(
+
+
+## Tesseract.recognize(image: [ImageLike](#imagelike)[, options]) -> [TesseractJob](#tesseractjob)
+Figures out what words are in the image, where the words are, etc.
+- `image` should be an [ImageLike](#imagelike) object.
+- `options` is an optional parameter with tesseract specific keys
+
+ hi
+Returns a [TesseractJob](#tesseractjob) whose `then` method can be used to act on the result.
+
+Example:
+```javascript
+Tesseract.recognize('#my-image')
+.then(function(result){
+    console.log(result)
+})
+```
+
+## Tesseract.detect(image: [ImageLike](#imagelike)) -> [TesseractJob](#tesseractjob)
+Figures out what script (e.g. 'Latin', 'Chinese') the words in the image are written in.
+`image` should be an [ImageLike](#imagelike) object.
+Returns a [TesseractJob](#tesseractjob) whose `then` method can be used to act on the result of the script.
+
+ ```javascript
+Tesseract.detect('#my-image')
+.then(function(result){
+    console.log(result)
+})
+```
+
 ## TesseractJob
 A TesseractJob is an object returned by a call to recognize or detect.