# [Tesseract.js](http://tesseract.projectnaptha.com/)

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Code Style](https://badgen.net/badge/code%20style/airbnb/ff5a5f?icon=airbnb)](https://github.com/airbnb/javascript)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://github.com/naptha/tesseract.js/graphs/commit-activity)

[![Build Status](https://travis-ci.org/naptha/tesseract.js.svg?branch=master)](https://travis-ci.org/naptha/tesseract.js)
[![npm version](https://badge.fury.io/js/tesseract.js.svg)](https://badge.fury.io/js/tesseract.js)
[![Downloads Total](https://img.shields.io/npm/dt/tesseract.js.svg)](https://www.npmjs.com/package/tesseract.js)
[![Downloads Month](https://img.shields.io/npm/dm/tesseract.js.svg)](https://www.npmjs.com/package/tesseract.js)

**Tesseract.js v2 is now available and under development on the master branch; check the [support/1.x](https://github.com/naptha/tesseract.js/tree/support/1.x) branch for v1.**

Tesseract.js is a JavaScript library that extracts words in [almost any language](./docs/tesseract_lang_list.md) from images. ([Demo](http://tesseract.projectnaptha.com/))

[![fancy demo gif](./docs/demo.gif)](http://tesseract.projectnaptha.com)

Tesseract.js works with script tags, [webpack](https://webpack.js.org/), and [Node.js](https://nodejs.org/en/). [After you install it](#installation), using it is as simple as:

```javascript
import { TesseractWorker } from 'tesseract.js';

const worker = new TesseractWorker();

// `myImage` is a placeholder for your input image (see docs/image-format.md).
worker.recognize(myImage)
  .progress((p) => { console.log('progress', p); })
  .then((result) => { console.log('result', result); });
```

[Check out the docs](#documentation) for a full treatment of the API.

## Provenance
Tesseract.js wraps an [emscripten](https://github.com/kripken/emscripten) [port](https://github.com/naptha/tesseract.js-core) of the [Tesseract](https://github.com/tesseract-ocr/tesseract) [OCR](https://en.wikipedia.org/wiki/Optical_character_recognition) Engine.


# Installation
Tesseract.js works with a `<script>` tag via a local copy or a CDN, and with webpack or Node.js via `npm`. [Check out the docs](#documentation) for a full treatment of the API.

## CDN 

Include Tesseract.js from a CDN like this:
```html
<script src='https://unpkg.com/tesseract.js@v2.0.0-alpha.3/dist/tesseract.min.js'></script>
```

After the script loads, the `Tesseract` variable is defined globally.
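
As a rough sketch of using the global from a page script: the helper below assumes the global `Tesseract` object exposes the same `TesseractWorker` class as the npm package, which this README does not confirm. The name `recognizeWithGlobal` is hypothetical and error handling is omitted.

```javascript
// Hypothetical helper: wraps a recognition job in a standard Promise.
// Assumes `tesseractGlobal.TesseractWorker` exists and that the job returned by
// `recognize()` supports `.progress()` and `.then()` as in the example above.
function recognizeWithGlobal(tesseractGlobal, image, onProgress) {
  const { TesseractWorker } = tesseractGlobal;
  const worker = new TesseractWorker();

  return new Promise((resolve) => {
    worker.recognize(image)
      .progress(onProgress)
      .then(resolve);
  });
}
```

On a page that has loaded the script tag, this could be called as `recognizeWithGlobal(Tesseract, myImage, console.log)`, where `myImage` is a placeholder for your own image input.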

## npm

### 2.x

Major Changes

- Upgrade to Tesseract v4
- Support multiple languages, e.g. `eng+chi_tra`
- Support image formats: png, jpg, bmp, pbm

```shell
> yarn add tesseract.js@next
```
or
```shell
> npm install tesseract.js@next --save
```
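
The multi-language support listed under Major Changes uses language codes joined with `+`. A minimal sketch of building such a string follows; `joinLanguages` is a hypothetical helper, not part of Tesseract.js. See the [API docs](./docs/api.md) for where a language string is accepted.

```javascript
// Hypothetical helper: joins Tesseract language codes into the `+`-separated
// form used in the list above (e.g. eng+chi_tra).
function joinLanguages(codes) {
  const cleaned = codes
    .map((code) => code.trim())
    .filter((code) => code.length > 0);
  if (cleaned.length === 0) {
    throw new Error('At least one language code is required');
  }
  // Drop duplicates while keeping the original order.
  return [...new Set(cleaned)].join('+');
}

console.log(joinLanguages(['eng', 'chi_tra'])); // prints "eng+chi_tra"
```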

### 1.x

```shell
> yarn add tesseract.js
```
or
```shell
> npm install tesseract.js --save
```

> Note: Tesseract.js currently requires Node.js v6.8.0 or higher.
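
If you want a script to fail early on an unsupported runtime, a startup check along these lines is one option. This is a sketch and not part of Tesseract.js; `isAtLeast` is a hypothetical helper that compares version strings numerically, so `6.10.0` counts as newer than `6.8.0`.

```javascript
// Hypothetical startup check, not part of Tesseract.js.
// Compares dotted version strings numerically, part by part.
function isAtLeast(version, minimum) {
  const a = version.replace(/^v/, '').split('.').map(Number);
  const b = minimum.replace(/^v/, '').split('.').map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i += 1) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true; // versions are equal
}

// process.versions.node holds the running Node.js version without a leading "v".
if (!isAtLeast(process.versions.node, '6.8.0')) {
  throw new Error(`Node.js v6.8.0 or higher is required, found v${process.versions.node}`);
}
```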

# Documentation

* [Examples](./docs/examples.md)
* [Image Format](./docs/image-format.md)
* [API](./docs/api.md)
* [Local Installation](./docs/local-installation.md)

# Contributing

## Development
To run a development copy of tesseract.js, first clone this repo.
```shell
> git clone https://github.com/naptha/tesseract.js.git
```

Then install the dependencies and start the development server:
```shell
> cd tesseract.js
> npm install && npm start

  ... a bunch of npm stuff ...

  Starting up http-server, serving ./
  Available on:
    http://127.0.0.1:3000
    http://[your ip]:3000

```

Then open `http://localhost:3000/examples/browser/demo.html` in your favorite browser. The dev server automatically rebuilds `tesseract.dev.js` and `worker.min.js` when you change files in the `src` folder.

## Building Static Files
After you've cloned the repo and run `npm install` as described in the [Development section](#development), you can build static library files in the `dist` folder with:

```shell
> npm run build
```