Documentation


Convert CSV to JSON

Delimited data can be parsed out of strings or files. Files that are parsed can be local or remote. Local files are opened with FileReader, and remote files are downloaded with XMLHttpRequest.

Parse string

Papa.parse(csvString[, config])
  • csvString is a string of delimited text to be parsed.
  • config is an optional config object.
  • Returns a parse results object (if not streaming or using worker).
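
As a rough illustration of the synchronous string form, the sketch below reads only the first row of a CSV string. The helper name parseHeaderRow is hypothetical, and the papa parameter stands in for the global Papa object; it is passed in so the snippet can run even where the library is not loaded. It relies only on the preview option and the data property described further down this page.

```javascript
// Hypothetical helper: returns the first row of a CSV string as an array.
// `papa` is expected to be the Papa Parse object (normally the global `Papa`).
function parseHeaderRow(papa, csvString) {
  // preview: 1 asks the parser to stop after one row.
  const results = papa.parse(csvString, { preview: 1 });
  // With header: false (the default), each row is an array.
  return results.data.length > 0 ? results.data[0] : [];
}

// Usage in a page where Papa Parse is loaded:
if (typeof Papa !== "undefined") {
  console.log(parseHeaderRow(Papa, "name,age\nAda,36"));
}
```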

Parse local file

Papa.parse(file, config)
  • file is a File object obtained from the DOM.
  • config is a config object which contains a callback.
  • Doesn't return anything. Results are provided asynchronously to a callback function.

Parse remote file

Papa.parse(url, {
    download: true
    // config ...
})
  • url is the path or URL to the file to download.
  • The second argument is a config object where download: true is set.
  • Doesn't return anything. Results are provided asynchronously to a callback function.
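
Because the file and URL forms deliver results through a callback, it can be convenient to wrap them in a Promise. The following is a minimal sketch, not part of Papa Parse: parseAsync is a hypothetical name, papa stands in for the global Papa object, and "data.csv" is a placeholder URL. It uses only the complete callback documented on this page, so the Promise never rejects.

```javascript
// Hypothetical helper: wraps the callback-based form in a Promise.
// `papa` is the Papa Parse object; `input` is a File or, when
// `config` sets download: true, a URL string.
function parseAsync(papa, input, config) {
  return new Promise(function (resolve) {
    // Copy the caller's config so it is not mutated, then add the callback.
    const merged = Object.assign({}, config, {
      complete: function (results, file) {
        resolve({ results: results, file: file });
      }
    });
    papa.parse(input, merged);
  });
}

// Usage in a page where Papa Parse is loaded:
if (typeof Papa !== "undefined") {
  parseAsync(Papa, "data.csv", { download: true, header: true })
    .then(function (out) { console.log(out.results.data); });
}
```

Note that any complete callback already present in config is replaced by the wrapper's own.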

Using jQuery to select files

$('input[type=file]').parse({
    config: {
        // base config to use for each file
    },
    before: function(file, inputElem) {
        // executed before parsing each file begins;
        // what you return here controls the flow
    },
    error: function(err, file, inputElem, reason) {
        // executed if an error occurs while loading the file,
        // or if before callback aborted for some reason
    },
    complete: function() {
        // executed after all files are complete
    }
});
  • Select the file input elements containing files you want to parse.
  • before is an optional callback that lets you inspect each file before parsing begins. To alter the flow of parsing, return an object like:
    {
        action: "abort",
        reason: "Some reason",
        config: // altered config...
    }
    action can be "abort" to skip this and all other files in the queue, "skip" to skip just this file, or "continue" to carry on (equivalent to returning nothing). reason can explain why parsing was aborted. config can be a modified configuration for parsing just this file.
  • The complete callback shown here is executed after all files are finished and does not receive any data. Use the complete callback in config for per-file results.
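
The before callback described above might look something like the sketch below. The function name beforeParse, the 5 MB limit, and the choice of which action to return in each case are all assumptions made for illustration; only the shape of the returned object comes from this page.

```javascript
// Hypothetical size limit for this sketch (5 MB); not a Papa Parse setting.
const MAX_BYTES = 5 * 1024 * 1024;

// A `before` callback: decides per file whether parsing should go ahead.
function beforeParse(file, inputElem) {
  if (!/\.csv$/i.test(file.name)) {
    // Skip just this file and move on to the next one in the queue.
    return { action: "skip", reason: "Not a .csv file: " + file.name };
  }
  if (file.size > MAX_BYTES) {
    // Stop processing this file and every remaining file.
    return { action: "abort", reason: "File too large: " + file.name };
  }
  // Returning nothing is equivalent to { action: "continue" }.
}

// Usage in a page where jQuery and Papa Parse are loaded:
if (typeof $ !== "undefined") {
  $('input[type=file]').parse({
    config: {
      header: true,
      complete: function (results, file) {
        console.log(file.name, results.data.length);
      }
    },
    before: beforeParse
  });
}
```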

Convert JSON to CSV

Papa's unparse utility correctly writes out delimited text strings given an array of arrays or an array of objects.

Unparse

Papa.unparse(data[, config])

Examples

// Two-line, comma-delimited file
var csv = Papa.unparse([
    ["1-1", "1-2", "1-3"],
    ["2-1", "2-2", "2-3"]
]);

// With header row (all objects should look alike)
var csv = Papa.unparse([
    {
        "Column 1": "foo",
        "Column 2": "bar"
    },
    {
        "Column 1": "abc",
        "Column 2": "def"
    }
]);

// Specifying fields and data manually
var csv = Papa.unparse({
    fields: ["Column 1", "Column 2"],
    data: [
        ["foo", "bar"],
        ["abc", "def"]
    ]
});
  • Returns the resulting delimited text as a string.

  • data can be one of:
    • An array of arrays
    • An array of objects
    • An object with fields and data

  • config is an object with any of these properties:
    // defaults shown
    {
        quotes: false,
        delimiter: ",",
        newline: "\r\n"
    }
    Set quotes to true to force every datum to be enclosed in quotes. The delimiter can be any valid delimiting character. The newline character(s) may also be customized.
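
The object form with fields and data can be handy when the objects do not all have the same keys, or when the column order matters. The helper below is a sketch of one way to build that form; toFieldsAndData is a hypothetical name and is not part of Papa Parse.

```javascript
// Hypothetical helper: converts an array of objects into the
// { fields, data } form, with columns in the order given by `fields`.
function toFieldsAndData(objects, fields) {
  return {
    fields: fields,
    data: objects.map(function (obj) {
      return fields.map(function (name) {
        // Missing properties become empty strings so every row has the same width.
        return Object.prototype.hasOwnProperty.call(obj, name) ? obj[name] : "";
      });
    })
  };
}

const input = toFieldsAndData(
  [{ "Column 1": "foo", "Column 2": "bar" }, { "Column 2": "def" }],
  ["Column 2", "Column 1"]
);
// input.data is [["bar", "foo"], ["def", ""]]

// Usage in a page where Papa Parse is loaded (tab-delimited output):
if (typeof Papa !== "undefined") {
  console.log(Papa.unparse(input, { delimiter: "\t" }));
}
```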

The Config Object

Every call to parse receives a configuration object. Its properties define settings, behavior, and callbacks used during parsing.

Default config

{
    delimiter: "",
    header: false,
    dynamicTyping: false,
    preview: 0,
    step: undefined,
    encoding: "",
    worker: false,
    comments: false,
    complete: undefined,
    download: false
}
  • delimiter The delimiting character. Leave blank to auto-detect. If specified, it must be a string of length 1, and cannot be found in Papa.BAD_DELIMITERS.
  • header If true, the first row of parsed data will be interpreted as field names. Fields will be returned in the meta, and each row will be an object of data keyed by field name. If false, the parser simply returns an array of arrays, including the first row.
  • dynamicTyping If true, numeric and boolean data will be converted to their type instead of remaining strings.
  • preview If > 0, only that many rows will be parsed.
  • step To stream the input, define a callback function to receive results row-by-row rather than together at the end:
    step: function(results, parser) {
        console.log("Row data:", results.data);
        console.log("Row errors:", results.errors);
    }
    You can call parser.abort() to halt parsing that input (not available if using a worker).
  • encoding The encoding to use when opening files locally.
  • worker Whether or not to use a worker thread. Using a worker will keep your page responsive, but may be slightly slower.
  • comments Specify a comment character (like "#") if your CSV file has commented lines, and Papa will skip them. This feature is disabled by default.
  • complete A callback to execute when parsing is complete. Results are passed in, and if parsing a file, the file is, too:
    complete: function(results, file) {
        console.log("Parsing complete:", results, file);
    }
    If streaming, results will not be available in this function.
  • download If true, this indicates that the string you passed in is actually a URL from which to download a file and parse it.
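
To show how step and complete can work together, here is a sketch of a config factory that tallies rows while streaming and halts after too many errors. The names makeCountingConfig, maxErrors, and onDone are hypothetical; the sketch assumes only the step(results, parser) and complete signatures described above.

```javascript
// Hypothetical factory: builds a streaming config that counts rows and
// calls parser.abort() once `maxErrors` errors have been seen.
function makeCountingConfig(maxErrors, onDone) {
  let rows = 0;
  let errors = 0;
  return {
    header: true,
    step: function (results, parser) {
      rows += results.data.length;     // one row per step call
      errors += results.errors.length;
      if (errors >= maxErrors) {
        parser.abort();                // not available when worker: true
      }
    },
    complete: function () {
      // When streaming, results are not passed here, so report our own tally.
      onDone({ rows: rows, errors: errors });
    }
  };
}

// Usage in a page where Papa Parse is loaded:
if (typeof Papa !== "undefined") {
  Papa.parse("a,b\n1,2\n3,4", makeCountingConfig(10, function (summary) {
    console.log(summary);
  }));
}
```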

The Results Object

Parse results are always (even when streaming) provided in a roughly consistent format: an object with data, errors, and meta. When streaming, results.data contains only one row.

Results structure

{
    data:    // array of parse results
    errors:  // array of errors
    meta:    // object with extra info
}
  • data is an array of rows. Rows are either arrays (if header: false) or objects (if header: true). Inside a step function, data will only contain one row.
  • errors is an array of errors.
  • meta contains extra information about the parse, such as delimiter used, number of lines, whether the process was aborted, etc.

results.data

// Example (without header)
[
    ["Column 1", "Column 2"],
    ["foo", "bar"],
    ["abc", "def"]
]

// Example (with header)
[
    {
        "Column 1": "foo",
        "Column 2": "bar"
    },
    {
        "Column 1": "abc",
        "Column 2": "def"
    }
]
  • If header row is enabled, and more fields are found on a row of data than in the header row, an extra field will appear in the results called __parsed_extra. It contains an array of all data parsed from that row that was wider than the header row.
  • Using dynamicTyping: true will turn numeric and boolean data into number and boolean types, respectively. Otherwise, all parsed data is string.
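
Rows that were wider than the header can be located by checking for the __parsed_extra field. The sketch below uses a hypothetical helper name (rowsWithExtraFields) and hand-written sample data in the shape shown above; it does not call the parser itself.

```javascript
// Returns the indexes of rows that had more fields than the header row.
// Only meaningful for results parsed with header: true.
function rowsWithExtraFields(data) {
  const indexes = [];
  data.forEach(function (row, i) {
    if (Array.isArray(row.__parsed_extra) && row.__parsed_extra.length > 0) {
      indexes.push(i);
    }
  });
  return indexes;
}

// Hand-written sample in the header: true shape; the second row is too wide.
const sampleRows = [
  { "Column 1": "foo", "Column 2": "bar" },
  { "Column 1": "abc", "Column 2": "def", __parsed_extra: ["ghi"] }
];
console.log(rowsWithExtraFields(sampleRows)); // reports row index 1
```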

results.errors

// Error structure
{
    type: "",     // A generalization of the error
    code: "",     // Standardized error code
    message: "",  // Human-readable details
    line: 0,      // Line of original input
    row: 0,       // Row index of parsed data where error is
    index: 0      // Character index within original input
}
  • The error type will be one of "Abort", "Quotes", "Delimiter", or "FieldMismatch".
  • The code may be "ParseAbort", "MissingQuotes", "UnexpectedQuotes", "UndetectableDelimiter", "TooFewFields", or "TooManyFields" (depending on the error type).
  • line and index may not be available on all error messages because some errors are only generated after parsing is already complete.
  • Just because errors are generated does not necessarily mean that parsing failed! Papa is strong, and usually parsing only bombs hard if the input has sloppy quotes.
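
Since errors do not always mean failure, one possible approach is to summarize them by type and then decide what to do. The helpers below are a sketch with hypothetical names, and the policy of treating only "Quotes" errors as fatal is an assumption for illustration, not a rule enforced by Papa Parse. The sample errors are made up.

```javascript
// Groups errors by their `type` and counts them.
function countErrorsByType(errors) {
  const counts = {};
  errors.forEach(function (err) {
    counts[err.type] = (counts[err.type] || 0) + 1;
  });
  return counts;
}

// Hypothetical policy for this sketch: treat only "Quotes" errors as fatal.
function looksFatal(errors) {
  return errors.some(function (err) { return err.type === "Quotes"; });
}

// Made-up errors in the structure shown above.
const sampleErrors = [
  { type: "FieldMismatch", code: "TooFewFields", message: "row too short", row: 2 },
  { type: "FieldMismatch", code: "TooManyFields", message: "row too long", row: 5 }
];
console.log(countErrorsByType(sampleErrors), looksFatal(sampleErrors));
```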

results.meta

{
    lines:      // Number of lines parsed
    delimiter:  // Delimiter used
    aborted:    // Whether process was aborted
    fields:     // Array of field names
}
  • Not all meta properties will always be available. For instance, fields is only given when header: true is set.
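
Because some meta properties may be absent, code that reads them should check before use. A small sketch, using a hypothetical helper name (describeMeta) and made-up sample values:

```javascript
// Builds a one-line summary from results.meta, tolerating missing properties.
function describeMeta(meta) {
  const parts = [];
  if (typeof meta.lines === "number") parts.push(meta.lines + " lines");
  if (typeof meta.delimiter === "string") parts.push("delimiter " + JSON.stringify(meta.delimiter));
  if (Array.isArray(meta.fields)) parts.push("fields: " + meta.fields.join(", "));
  if (meta.aborted) parts.push("aborted");
  return parts.join("; ");
}

// Made-up meta object, as it might look with header: true.
console.log(describeMeta({ lines: 3, delimiter: ",", aborted: false, fields: ["a", "b"] }));
```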

Extras

There are a few other things that Papa exposes for you that weren't explained above.

These are provided as a convenience and should remain read-only, but feel free to use them:

  • Papa.BAD_DELIMITERS   An array of characters that are not allowed as delimiters (or comment characters).
  • Papa.RECORD_SEP   The true delimiter. Invisible. ASCII code 30. Should be doing the job we strangely rely upon commas and tabs for.
  • Papa.UNIT_SEP   Also sometimes used as a delimiting character. ASCII code 31.
  • Papa.WORKERS_SUPPORTED   Whether or not the browser supports HTML5 Web Workers. If false, worker: true will have no effect.
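
As one example of using these read-only values, a custom delimiter can be checked against the rules given for the delimiter config property before parsing. In this sketch, isUsableDelimiter is a hypothetical helper, and sampleBad is a stand-in list so the snippet runs without the library; the actual contents of Papa.BAD_DELIMITERS may differ.

```javascript
// Checks a candidate delimiter against the rules for config.delimiter:
// a string of length 1 that is not listed in the bad-delimiter array.
// `badDelimiters` is expected to be Papa.BAD_DELIMITERS.
function isUsableDelimiter(candidate, badDelimiters) {
  return typeof candidate === "string" &&
    candidate.length === 1 &&
    badDelimiters.indexOf(candidate) === -1;
}

// Stand-in list for this sketch only; not the library's real list.
const sampleBad = ["\r", "\n", "\""];
console.log(isUsableDelimiter("|", sampleBad));

// Usage in a page where Papa Parse is loaded:
if (typeof Papa !== "undefined") {
  console.log(isUsableDelimiter(Papa.RECORD_SEP, Papa.BAD_DELIMITERS));
}
```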

The following items are for internal use and testing only. It is not recommended that you use them unless you're familiar with the underlying code base:

  • Papa.Parser   The core parsing component.
  • Papa.ParserHandle   A wrapper over the Parser which provides dynamic typing and header row support.
  • Papa.NetworkStreamer   Facilitates downloading and parsing files in chunks over the network with XMLHttpRequest.
  • Papa.FileStreamer   Similar to NetworkStreamer, but for local files, and using the HTML5 FileReader.