Compare commits

104 Commits
4.x ... master

Author SHA1 Message Date
Sergi Almacellas Abellana e11ee26581 Minor version bump 3 years ago
bezrodnov a93c5c9806 Improve row skipping performance (#911) (#912) 3 years ago
Simon 4132d810ab fixcolumns config works with input type object (#919) 3 years ago
chafi9-code 8dba33e0c5 Set empty string to config.quotechar when it's value is null (#925) 3 years ago
Sergi Almacellas Abellana 6bb7c33528 Add usage stats on lovers page 3 years ago
Sergi Almacellas Abellana e42059577d Add support for node16 (#877) 3 years ago
Dan Onoshko ec36ab22d3 Upgrade mocha-headless-chrome 3 years ago
Sergi Almacellas Abellana 997c6923c8 Remove broken links from lovers 3 years ago
Cyril Auburtin 1f2c7330d5 Add more cases to escapeFormulae and allow to pass RegExp (#904) 3 years ago
Sergi Almacellas Abellana 26a86fdf9f Do not run tests on node15 3 years ago
janisdd 23e1b47f5c - fixes multi-character delimiter with quoted field issue (#879) 4 years ago
Sergi Almacellas Abellana a6fdfcb4a6 Remove support for node10 (#876) 4 years ago
Sergi Almacellas Abellana 0f75aeb985 Remove travis CI config file 4 years ago
Sergi Almacellas Abellana eaeb01a1ea Minor version bump 4 years ago
Christopher Eady 113d561c4c Update ISO_DATE regex to match full string (#872) 4 years ago
Jane Kelly 5e92fca582 Add Retool to PapaParse lovers list (#867) 4 years ago
janisdd 2abbae8c40 Add note to explain quote option is ignored for some values (#862) 4 years ago
Connor Smith 95e4de8cf5 Include lowercase and uppercase om float regex (#863) 4 years ago
Sergi Almacellas Abellana 96022a6864 Update jquery version in player. Closes #843 4 years ago
Sergi Almacellas Abellana 05f7044bda Setup github actions (#853) 4 years ago
Mikhail Sidorov 0e0b785df3 Drop redundant getNextUnquotedDelimiter function (#852) 4 years ago
Jimmy Wärting 7fc65f3164 Remove ObjectKeys function (#842) 4 years ago
Sergi Almacellas Abellana 5747da6c99 Minor version bump 5 years ago
Unnit Metaliya 8414f7645a Add Visa SOP Sample on lovers (#820) 5 years ago
Sergi Almacellas Abellana 6616222db9 Improve documentation wording 5 years ago
Akshay Raj Gollahalli 12bf28a62f Add documentation for chunkSize (#818) 5 years ago
Paul Schlattmann aa333201af Improve gready comment in docs.html (#816) 5 years ago
Alexandre Saiz Verdaguer 7b26173728 add: MONEI, MoonMail, Wholesaler as lovers (#771) 5 years ago
Sergi Almacellas Abellana ce858b3c41 Add docs and tests case for transformHeader with index 5 years ago
James Furey 018f5dfe41 Add index argument to transformHeader (#807) 5 years ago
John Preston 6f997ef4fb Implement escapeFormulae option (#796) 5 years ago
Demetris Manikas 4edef1b267 Add test to check for empty field in the begining (#790) 5 years ago
Sergi Almacellas Abellana 4b192deef1 Minor version bump 5 years ago
Sergi Almacellas Abellana 235a12758c Avoid ReDOS on float dynamic typing (#779) 5 years ago
Sergi Almacellas Abellana a4cf371ff2 Improve downloadRequestBody documentation 5 years ago
DanzelTaccayan e934deb1f6 Support POST method when download is true 5 years ago
Varun Sharma 7ec146cbc4 Using self instead of this to preserve binding. (#769) 5 years ago
Sergi Almacellas Abellana 3497ded575 Patch version bump 5 years ago
Duc Tri Le ae73d2a966 Use chunk size to determine the processed length 5 years ago
Sergi Almacellas Abellana a318396c9d Reword newline docs 5 years ago
jseter 47b356d6e0 #727 update delimiter and newline index if they are earlier than the current position before tested. (#728) 5 years ago
jseter 7ad8dda68c Address deepEqual using compare by JSON strings. (#724) 5 years ago
jseter e536351e7a Refactor substr calls to substring calls. (#725) 5 years ago
jseter e0b474dc38 Correct small typo (#723) 5 years ago
Seito Tanaka 6f7e43edd3 Fix step callback function when skipping empty lines (#714) 5 years ago
dependabot[bot] ec26e728ab Bump open from 0.0.5 to 7.0.0 (#721) 5 years ago
Sergi Almacellas Abellana 5219809f1d Minor version bump 5 years ago
Jacky Jiang 9b54b11fe2 Fix Range Header processing logic for NetworkStreamer (#709) 5 years ago
Grace D'Mello 94d7bf939a Fix CSV parsing issue when first cell is empty (#707) 5 years ago
Jay Bowles bacf4f2a57 Adds Hua Explore to list of lovers (#705) 5 years ago
Justin Lettenmair 45c1455b25 Fix README typos (#704) 5 years ago
Puzzles 80a1044f1b Implement quotes config optionally as function (#703) 6 years ago
Seito Tanaka 874161a23d Fix misplaced quotes parsing (#702) 6 years ago
Sergi Almacellas Abellana 98e3102a42 Use latest PapaParse version on demo page 6 years ago
Sergi Almacellas Abellana d2206b3d93 Update download version from webpage 6 years ago
Sergi Almacellas Abellana 788631f4db Patch version bump 6 years ago
David Boskovic d5e2fae859 Resolve parsing issue when first field is empty and unquoted (#696) 6 years ago
haxxxton 76dc5a6d7f Maintain precision on big numbers (>2^53 || <-2^53) when dynamicTyping is on (#694) 6 years ago
Sergi Almacellas Abellana 408823330b Patch version bump 6 years ago
konuch 49170b76b3 Make the demo website use the new 5.0.0 version. (#679) 6 years ago
morance 792641e36b Modified the GuessDelimiter function (#687) 6 years ago
Hugh Anderson 54f8aecc9c Fix syntax error (#677) 6 years ago
Hugh Anderson 5763ff3603 Fix invalid link to lovers.js (#678) 6 years ago
0xflotus 20768da008 Fix typos on README 6 years ago
Sergi Almacellas Abellana b7a2d41843 Update docs version and COPYRIGHT year 6 years ago
Tom Byrer a66776140f update webiste to v5 (#669) 6 years ago
Sergi Almacellas Abellana 1f43cb8f01 Major version bump 6 years ago
konuch 96313a4356 Handle delimiter guessing, when not all of the fields are quoted (#665) 6 years ago
Sergi Almacellas Abellana dd0f4ba2ef Update donate link on non index pages 6 years ago
jjech ae25f7cefd Update FAQ with reference to configuration issue #655 (#656) 6 years ago
Peter Theill 73a10de0c9 Include Familio (#654) 6 years ago
Sergi Almacellas Abellana b28a552137 Explain that docs are hosted on docs folder 6 years ago
ilias bhallil b74bd9e884 Add downloadRequestHeaders Doc to the website (#647) 6 years ago
Varun Sharma f2930716d0 BugFix #636 Pause and resume (In a quick succession) gets you lot of exceptions and an infinite loop (#637) 6 years ago
Ledion Bitincka a627547d87 Add ability to support escapeChar on unparse (#631) 6 years ago
janisdd 941033094a Allow to specify the columns used for unparse (#632) 6 years ago
janisdd 6107789c6b - some minor doc fixes 6 years ago
Konstantin Nosov b9f1ebae32 Add mailcheck.co to lovers.js (#629) 6 years ago
janisdd 108d91cecc - added table for unparse config options (#628) 6 years ago
Jonathan Grimes 265e09c67a Ensure data is correctly parsed with header: true (#621) 6 years ago
Sergi Almacellas Abellana 0e7f50be0a Include delimiter to guess tests on custom tests 6 years ago
Sergi Almacellas Abellana 757b1bf6e0 Add DelimitersToGuess config option (#555) 6 years ago
Jonathan Beliën b7529303e3 Exñ.a(#625) 6 years ago
janisdd 104004811c Update unparse documentation (#622) 6 years ago
Sergi Almacellas Abellana dc16b88e5e Do not pass an array of array when using step and worker 6 years ago
Sergi Almacellas Abellana 71c2eee63c Use latest stable version for file download links 6 years ago
Sergi Almacellas Abellana f6b1b36c3f Correctly guess deliminter when mixed with commas 6 years ago
Sergi Almacellas Abellana a5ba84600d Update donate link 6 years ago
Chris Zubak-Skees 9a541b1e56 Consistently apply regex escaping to quoteChar (#602) 6 years ago
Sergi Almacellas Abellana ae982483cb Improve transform function docs and tests. Fixes #601 6 years ago
Sergi Almacellas Abellana bac638610d Beta version bump 6 years ago
Sergi Almacellas Abellana 46160eed95 Remove Array.isArray() polyfill 6 years ago
Sergi Almacellas Abellana 2fe54b7a81 Remove trailing space on faqs 6 years ago
Jonathan Grimes 296c89049b cleanup remnants of phantomjs 6 years ago
Jonathan Grimes 4ecae0b5bb Support workers from inline-blobs 6 years ago
Sergi Almacellas Abellana ded2feb4f2 Update file test case to do not filter response data 6 years ago
Jacopo Farina 9a4a00a1ee Move to mocha-headless-chrome for running tests 6 years ago
Leo Anthias 6672789905 Remove If-None-Match header. Fixes #595 (#596) 6 years ago
Sergi Almacellas Abellana a0026683f7 Return data directly on NodeStream as step function now returns a single row 6 years ago
Sergi Almacellas Abellana a7eb6343be Return unique row as data on step function (#500) 6 years ago
Gabe Gorelick 2151cf1b90 Throw Error objects instead of strings (#497) 6 years ago
João Sardinha c95db1f64c Allow transforming header columns (#589) 6 years ago
Sergi Almacellas Abellana 30cf324705 Remove support for node6 and add support for node11 6 years ago
Sergi Almacellas Abellana 089b8b9070 Use 5.0.0-alpha as version number 6 years ago
  1. .eslintrc.js (2)
  2. .github/workflows/node.js.yml (29)
  3. .travis.yml (7)
  4. README.md (6)
  5. docs/demo.html (11)
  6. docs/docs.html (197)
  7. docs/faq.html (20)
  8. docs/index.html (38)
  9. docs/resources/js/lovers.js (62)
  10. docs/resources/js/papaparse.js (272)
  11. package.json (13)
  12. papaparse.js (330)
  13. papaparse.min.js (4)
  14. player/player.html (2)
  15. tests/node-tests.js (51)
  16. tests/sample-header.csv (3)
  17. tests/test-cases.js (588)
  18. tests/test.js (4)
  19. tests/tests.html (7)
  20. tests/verylong-sample.csv (2)

2
.eslintrc.js

@ -178,7 +178,7 @@ module.exports = {
"no-tabs": "off", "no-tabs": "off",
"no-template-curly-in-string": "error", "no-template-curly-in-string": "error",
"no-ternary": "off", "no-ternary": "off",
"no-throw-literal": "off", "no-throw-literal": "error",
"no-trailing-spaces": "error", "no-trailing-spaces": "error",
"no-undef-init": "error", "no-undef-init": "error",
"no-undefined": "off", "no-undefined": "off",

29
.github/workflows/node.js.yml

@ -0,0 +1,29 @@
# This workflow will do a clean install of node dependencies, build the source code and run tests across different versions of node
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-nodejs-with-github-actions
name: Node.js CI
on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [12.x, 14.x, 16.x]
        # See supported Node.js release schedule at https://nodejs.org/en/about/releases/
    steps:
    - uses: actions/checkout@v2
    - name: Use Node.js ${{ matrix.node-version }}
      uses: actions/setup-node@v1
      with:
        node-version: ${{ matrix.node-version }}
    - run: npm install
    - run: npm test

7
.travis.yml

@ -1,7 +0,0 @@
language: node_js
node_js:
- "6"
- "8"
- "9"
- "10"

6
README.md

@ -26,7 +26,7 @@ can be installed with the following command:
npm install papaparse npm install papaparse
If you don't want to use npm, [papaparse.min.js](https://github.com/mholt/PapaParse/blob/master/papaparse.min.js) can be downloaded to your project source. If you don't want to use npm, [papaparse.min.js](https://unpkg.com/papaparse@latest/papaparse.min.js) can be downloaded to your project source.
Homepage & Demo Homepage & Demo
@ -39,9 +39,7 @@ To learn how to use Papa Parse:
- [Documentation](http://papaparse.com/docs) - [Documentation](http://papaparse.com/docs)
The website is hosted on on [Github Pages](https://pages.github.com/). If The website is hosted on [Github Pages](https://pages.github.com/). Its content is also included in the docs folder of this repository. If you want to contribute on it just clone the master of this repository and open a pull request.
you want to contribute just clone the gh-branch of this repository and
open a pull request.
Papa Parse for Node Papa Parse for Node

11
docs/demo.html

@ -12,7 +12,7 @@
<link rel="stylesheet" href="/resources/css/demo.css"> <link rel="stylesheet" href="/resources/css/demo.css">
<script src="/resources/js/jquery.min.js"></script> <script src="/resources/js/jquery.min.js"></script>
<script src="/resources/js/common.js"></script> <script src="/resources/js/common.js"></script>
<script src="/resources/js/papaparse.js"></script> <script src="https://unpkg.com/papaparse@latest/papaparse.min.js"></script>
<script src="/resources/js/demo.js"></script> <script src="/resources/js/demo.js"></script>
</head> </head>
<body> <body>
@ -34,7 +34,7 @@
</div> </div>
</div> </div>
<div class="grid-20 hide-on-mobile text-center"> <div class="grid-20 hide-on-mobile text-center">
<a href="/" class="text-logo">Papa Parse 4</a> <a href="/" class="text-logo">Papa Parse 5</a>
</div> </div>
<div class="grid-40 mobile-grid-50 text-right"> <div class="grid-40 mobile-grid-50 text-right">
<div class="links"> <div class="links">
@ -44,9 +44,6 @@
<a href="http://stackoverflow.com/questions/tagged/papaparse"> <a href="http://stackoverflow.com/questions/tagged/papaparse">
<i class="fa fa-stack-overflow fa-lg"></i> Help <i class="fa fa-stack-overflow fa-lg"></i> Help
</a> </a>
<!-- <a href="https://matt.life/pay" class="donate">
<i class="fa fa-heart fa-lg"></i> Donate
</a> -->
</div> </div>
</div> </div>
</div> </div>
@ -224,7 +221,7 @@
<br><br> <br><br>
Papa Parse by <a href="https://twitter.com/mholt6">Matt Holt</a> Papa Parse by <a href="https://twitter.com/mholt6">Matt Holt</a>
<br> <br>
&copy; 2013-2018 &copy; 2013-2019
</div> </div>
<div class="grid-15 mobile-grid-50 links"> <div class="grid-15 mobile-grid-50 links">
<h5>Learn</h5> <h5>Learn</h5>
@ -234,7 +231,7 @@
</div> </div>
<div class="grid-15 mobile-grid-50 links"> <div class="grid-15 mobile-grid-50 links">
<h5>Project</h5> <h5>Project</h5>
<a href="https://gratipay.com/mholt">Donate</a> <a href="https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=S6VTL9FQ6L8EN&item_name=PapaParse&currency_code=EUR&source=url">Donate</a>
<a href="https://github.com/mholt/PapaParse">GitHub</a> <a href="https://github.com/mholt/PapaParse">GitHub</a>
<a href="https://twitter.com/search?q=%23PapaParse">Share</a> <a href="https://twitter.com/search?q=%23PapaParse">Share</a>
</div> </div>

197
docs/docs.html

@ -34,7 +34,7 @@
</div> </div>
</div> </div>
<div class="grid-20 hide-on-mobile text-center"> <div class="grid-20 hide-on-mobile text-center">
<a href="/" class="text-logo">Papa Parse 4</a> <a href="/" class="text-logo">Papa Parse 5</a>
</div> </div>
<div class="grid-40 mobile-grid-50 text-right"> <div class="grid-40 mobile-grid-50 text-right">
<div class="links"> <div class="links">
@ -44,9 +44,6 @@
<a href="http://stackoverflow.com/questions/tagged/papaparse"> <a href="http://stackoverflow.com/questions/tagged/papaparse">
<i class="fa fa-stack-overflow fa-lg"></i> Help <i class="fa fa-stack-overflow fa-lg"></i> Help
</a> </a>
<!-- <a href="https://matt.life/pay" class="donate">
<i class="fa fa-heart fa-lg"></i> Donate
</a> -->
</div> </div>
</div> </div>
</div> </div>
@ -99,7 +96,7 @@
</div> </div>
<div class="grid-50"> <div class="grid-50">
<pre><code class="language-javascript">Papa.parse(csvString<i>[, <a href="#config">config</a>]</i>)</pre></code> <pre><code class="language-javascript">Papa.parse(csvString<i>[, <a href="#config">config</a>]</i>)</code></pre>
</div> </div>
<div class="grid-50"> <div class="grid-50">
@ -188,7 +185,7 @@
reason: "Some reason", reason: "Some reason",
config: <span class="comment">// altered config...</span> config: <span class="comment">// altered config...</span>
}</code></pre> }</code></pre>
to alter the flow of parsing. Actions can be <code>"abort"</code> to skip this and all other files in the queue, <code>"skip"</code> to skip just this file, or <code>"continue"</code> to carry on (equivalent to returning nothing). <code>reason</code> can be a reason for aborting. <code>config</code> can be a modified <a href="#config">configuration</a> for parsing just this file.</li> to alter the flow of parsing. Actions can be <code>"abort"</code> to skip this and all other files in the queue, <code>"skip"</code> to skip just this file, or <code>"continue"</code> to carry on (equivalent to returning nothing). <code>reason</code> can be a reason for aborting. <code>config</code> can be a modified <a href="#config">configuration</a> for parsing just this file.
</li> </li>
<li>The <code>complete</code> callback shown here is executed after <i>all</i> files are finished and does not receive any data. Use the complete callback in <a href="#config">config</a> for per-file results.</li> <li>The <code>complete</code> callback shown here is executed after <i>all</i> files are finished and does not receive any data. Use the complete callback in <a href="#config">config</a> for per-file results.</li>
</ul> </ul>
@ -228,7 +225,7 @@
<div class="grid-50"> <div class="grid-50">
<pre><code class="language-javascript">Papa.unparse(data<i>[, config]</i>)</code></pre> <pre><code class="language-javascript">Papa.unparse(data<i>[, <a href="#unparse-config-default">config</a>]</i>)</code></pre>
</div> </div>
<div class="grid-50"> <div class="grid-50">
@ -243,19 +240,113 @@
</ul> </ul>
</li> </li>
<li> <li>
<code>config</code> is an optional object with any of these properties: <code>config</code> is an optional <a href="#unparse-config-default">config object</a>
<pre><code class="language-javascript">// defaults shown </li>
</ul>
</div>
<div class="clear"></div>
<div class="grid-100">
<h5 id="unparse-config-default">Default Unparse Config with all options</h5>
</div>
<div class="prefix-25 grid-50 suffix-25">
<pre><code class="language-javascript">
{ {
quotes: false, quotes: false, //or array of booleans
quoteChar: '"', quoteChar: '"',
escapeChar: '"', escapeChar: '"',
delimiter: ",", delimiter: ",",
header: true, header: true,
newline: "\r\n" newline: "\r\n",
}</code></pre> skipEmptyLines: false, //other option is 'greedy', meaning skip delimiters, quotes, and whitespace.
Set <code>quotes</code> to <code>true</code> to always enclose each field in quotes, or an array of true/false values correlating to specific to columns to force-quote. The character used to quote can be customized using <code>quoteChar</code>. The character used to escape the <code>quoteChar</code> within a field can be customized using <code>escapeChar</code>. The <code>delimiter</code> can be any valid delimiting character. The <code>newline</code> character(s) may also be customized. Setting <code>header</code> to <code>false</code> will omit the header row. columns: null //or array of strings
</li> }
</ul> </code></pre>
</div>
<div class="clear"></div>
<div class="grid-100">
<h5>Unparse Config Options</h5>
</div>
<div class="grid-100" style="overflow-x: auto;">
<table>
<tr>
<th style="width: 20%;">Option</th>
<th style="width: 80%;">Explanation</th>
</tr>
<tr>
<td>
<code>quotes</code>
</td>
<td>
If <code>true</code>, forces all fields to be enclosed in quotes. If an array of <code>true/false</code> values, specifies which fields should be force-quoted (first boolean is for the first column, second boolean for the second column, ...). A function that returns a boolean values can be used to determine the quotes value of a cell. This function accepts the cell value and column index as parameters. <br />
Note that this option is ignored for <code>undefined</code>, <code>null</code> and <code>date-object</code> values. The option <code>escapeFormulae</code> also takes precedence over this.
</td>
</tr>
<tr>
<td><code>quoteChar</code></td>
<td>
The character used to quote fields.
</td>
</tr>
<tr>
<td><code>escapeChar</code></td>
<td>
The character used to escape <code>quoteChar</code> inside field values.
</td>
</tr>
<tr>
<td>
<code>delimiter</code>
</td>
<td>
The delimiting character. Multi-character delimiters are supported. It must not be found in <a href="#readonly">Papa.BAD_DELIMITERS</a>.
</td>
</tr>
<tr>
<td>
<code>header</code>
</td>
<td>
If <code>false</code>, will omit the header row. If <code>data</code> is an array of arrays this option is ignored. If <code>data</code> is an array of objects the keys of the first object are the header row. If <code>data</code> is an object with the keys <code>fields</code> and <code>data</code> the <code>fields</code> are the header row.
</td>
</tr>
<tr>
<td>
<code>newline</code>
</td>
<td>
The character used to determine newline sequence. It defaults to <code>"\r\n"</code>.
</td>
</tr>
<tr>
<td>
<code>skipEmptyLines</code>
</td>
<td>
If <code>true</code>, lines that are completely empty (those which evaluate to an empty string) will be skipped. If set to <code>'greedy'</code>, lines that don't have any content (those which have only whitespace after parsing) will also be skipped.
</td>
</tr>
<tr>
<td>
<code>columns</code>
</td>
<td>
If <code>data</code> is an array of objects this option can be used to manually specify the keys (columns) you expect in the objects. If not set the keys of the first objects are used as column.
</td>
</tr>
<tr>
<td>
<code>escapeFormulae</code>
</td>
<td>
If <code>true</code>, field values that begin with <code>=</code>, <code>+</code>, <code>-</code>, <code>@</code>, <code>\t</code>, or <code>\r</code>, will be prepended with a <code>'</code> to defend against <a href="https://owasp.org/www-community/attacks/CSV_Injection" target="_blank" rel="noopener">injection attacks</a>, because Excel and LibreOffice will automatically parse such cells as formulae. You can override those values by setting this option to a regular expression
</td>
</tr>
</table>
</div> </div>
<div class="clear"></div> <div class="clear"></div>
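Note on the `escapeFormulae` row added in this hunk: the rule it describes (prefix a field with `'` when it starts with `=`, `+`, `-`, `@`, `\t` or `\r`, or when it matches a caller-supplied regular expression) can be sketched in a few lines. This is a minimal illustration only, not PapaParse's internal code; the names `DEFAULT_FORMULA_PATTERN` and `escapeFormula` are hypothetical.

```javascript
// Hypothetical helper illustrating the documented rule; NOT PapaParse's implementation.
// The default trigger characters come from the docs text in this hunk: = + - @ \t \r
const DEFAULT_FORMULA_PATTERN = /^[=+\-@\t\r]/;

function escapeFormula(value, pattern = DEFAULT_FORMULA_PATTERN) {
  // Only strings can begin with a formula trigger; other values pass through unchanged.
  if (typeof value !== "string") {
    return value;
  }
  // Prepend a single quote so spreadsheet software treats the cell as text.
  return pattern.test(value) ? "'" + value : value;
}

const row = ["=SUM(A1:A2)", "plain text", "-42", 7];
const escapedRow = row.map((cell) => escapeFormula(cell));
console.log(escapedRow);
```

With the library itself, the documented way to request this behaviour is to pass `escapeFormulae: true` (or a regular expression) in the unparse config. If a custom pattern is passed to the sketch above, it should not carry the `g` flag, since `test()` on a global regex keeps state between calls.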
@ -289,8 +380,8 @@ var csv = Papa.unparse([
<div class="grid-33"> <div class="grid-33">
<pre><code class="language-javascript">// Specifying fields and data explicitly <pre><code class="language-javascript">// Specifying fields and data explicitly
var csv = Papa.unparse({ var csv = Papa.unparse({
fields: ["Column 1", "Column 2"], "fields": ["Column 1", "Column 2"],
data: [ "data": [
["foo", "bar"], ["foo", "bar"],
["abc", "def"] ["abc", "def"]
] ]
@ -340,7 +431,7 @@ var csv = Papa.unparse({
quoteChar: '"', quoteChar: '"',
escapeChar: '"', escapeChar: '"',
header: false, header: false,
trimHeaders: false, transformHeader: undefined,
dynamicTyping: false, dynamicTyping: false,
preview: 0, preview: 0,
encoding: "", encoding: "",
@ -350,12 +441,16 @@ var csv = Papa.unparse({
complete: undefined, complete: undefined,
error: undefined, error: undefined,
download: false, download: false,
downloadRequestHeaders: undefined,
downloadRequestBody: undefined,
skipEmptyLines: false, skipEmptyLines: false,
chunk: undefined, chunk: undefined,
chunkSize: undefined,
fastMode: undefined, fastMode: undefined,
beforeFirstChunk: undefined, beforeFirstChunk: undefined,
withCredentials: undefined, withCredentials: undefined,
transform: undefined transform: undefined,
delimitersToGuess: [',', '\t', '|', ';', <a href="#readonly">Papa.RECORD_SEP</a>, <a href="#readonly">Papa.UNIT_SEP</a>]
}</code></pre> }</code></pre>
</div> </div>
<div class="clear"></div> <div class="clear"></div>
@ -375,7 +470,7 @@ var csv = Papa.unparse({
<code>delimiter</code> <code>delimiter</code>
</td> </td>
<td> <td>
The delimiting character. Leave blank to auto-detect from a list of most common delimiters. It can be a string or a function. If string, it must be one of length 1. If a function, it must accept the input as first parameter and it must return a string which will be used as delimiter. In both cases it cannot be found in <a href="#readonly">Papa.BAD_DELIMITERS</a>. The delimiting character. Leave blank to auto-detect from a list of most common delimiters, or any values passed in through <code>delimitersToGuess</code>. It can be a string or a function. If a string, it can be of any length (so multi-character delimiters are supported). If a function, it must accept the input as first parameter and it must return a string which will be used as delimiter. In both cases it cannot be found in <a href="#readonly">Papa.BAD_DELIMITERS</a>.
</td> </td>
</tr> </tr>
<tr> <tr>
@ -412,10 +507,11 @@ var csv = Papa.unparse({
</tr> </tr>
<tr> <tr>
<td> <td>
<code>trimHeaders</code> <code>transformHeader</code>
</td> </td>
<td> <td>
If true leading/trailing spaces will be trimed from headers. A function to apply on each header. Requires <code>header</code> to be <code>true</code>. The function receives the header as its first argument and the index as second.<br>
Only available starting with version 5.0.
</td> </td>
</tr> </tr>
<tr> <tr>
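Note on the `transformHeader` row in this hunk: the new text says the function receives the header as its first argument and the index as its second. A minimal sketch of that contract follows, assuming a plain array of header strings; `applyTransformHeader` is a hypothetical stand-in for the header step, not a PapaParse function.

```javascript
// Hypothetical stand-in for the header-processing step; NOT PapaParse's implementation.
// It calls the user function with (header, index), as the docs text describes.
function applyTransformHeader(headers, transformHeader) {
  return headers.map((header, index) => transformHeader(header, index));
}

const rawHeaders = [" Name ", "E-Mail", ""];

// Trim and lowercase each header; fall back to a positional name for empty ones.
const cleanedHeaders = applyTransformHeader(rawHeaders, (header, index) => {
  const trimmed = header.trim().toLowerCase();
  return trimmed === "" ? "column_" + index : trimmed;
});

console.log(cleanedHeaders);
```

A function that only trims would cover what the removed `trimHeaders` row described; the index argument additionally lets a caller name empty or duplicate headers by position.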
@ -423,7 +519,7 @@ var csv = Papa.unparse({
<code>dynamicTyping</code> <code>dynamicTyping</code>
</td> </td>
<td> <td>
If true, numeric and boolean data will be converted to their type instead of remaining strings. Numeric data must conform to the definition of a decimal literal. European-formatted numbers must have commas and dots swapped. If also accepts an object or a function. If object it's values should be a boolean to indicate if dynamic typing should be applied for each column number (or header name if using headers). If it's a function, it should return a boolean value for each field number (or name if using headers) which will be passed as first argument. If true, numeric and boolean data will be converted to their type instead of remaining strings. Numeric data must conform to the definition of a decimal literal. Numerical values greater than <code>2^53</code> or less than <code>-2^53</code> will not be converted to numbers to preserve precision. European-formatted numbers must have commas and dots swapped. If also accepts an object or a function. If object it's values should be a boolean to indicate if dynamic typing should be applied for each column number (or header name if using headers). If it's a function, it should return a boolean value for each field number (or name if using headers) which will be passed as first argument.
</td> </td>
</tr> </tr>
<tr> <tr>
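Note on the `dynamicTyping` change in this hunk: the added sentence says numeric values beyond ±2^53 stay as strings to preserve precision. The sketch below illustrates that guard under stated assumptions: the numeric pattern is a simplified decimal-literal check, the boundary is treated as exclusive, and `toNumberIfSafe` is a hypothetical name. It is not PapaParse's internal code.

```javascript
// Hypothetical helper illustrating the documented guard; NOT PapaParse's implementation.
const LIMIT = Math.pow(2, 53);

// Simplified decimal-literal check (an assumption for this sketch).
const DECIMAL_PATTERN = /^-?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?$/;

function toNumberIfSafe(field) {
  if (!DECIMAL_PATTERN.test(field)) {
    return field; // not numeric: leave the string untouched
  }
  const parsed = parseFloat(field);
  // Convert only when the value sits strictly inside (-2^53, 2^53);
  // outside that range a double cannot represent every integer exactly.
  return parsed > -LIMIT && parsed < LIMIT ? parsed : field;
}

console.log(toNumberIfSafe("42"));
console.log(toNumberIfSafe("9007199254740993"));
```

Keeping the original string for out-of-range input means the caller can still hand it to a big-number type without having lost digits.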
@ -431,7 +527,7 @@ var csv = Papa.unparse({
<code>preview</code> <code>preview</code>
</td> </td>
<td> <td>
If > 0, only that many rows will be parsed. If &gt; 0, only that many rows will be parsed.
</td> </td>
</tr> </tr>
<tr> <tr>
@ -447,7 +543,7 @@ var csv = Papa.unparse({
<code>worker</code> <code>worker</code>
</td> </td>
<td> <td>
Whether or not to use a <a href="/faq#workers">worker thread</a>. Using a worker will keep your page reactive, but may be slightly slower. Web Workers also load the entire Javascript file, so be careful when <a href="/faq#combine">combining other libraries</a> in the same file as Papa Parse. Note that worker option is only available when parsing files and not when converting from JSON to CSV. Whether or not to use a <a href="/faq#workers">worker thread</a>. Using a worker will keep your page reactive, but may be slightly slower.
</td> </td>
</tr> </tr>
<tr> <tr>
@ -500,6 +596,27 @@ var csv = Papa.unparse({
If true, this indicates that the string you passed as the first argument to <code>parse()</code> is actually a URL from which to download a file and parse its contents. If true, this indicates that the string you passed as the first argument to <code>parse()</code> is actually a URL from which to download a file and parse its contents.
</td> </td>
</tr> </tr>
<tr>
<td>
<code>downloadRequestHeaders</code>
</td>
<td>
If defined, should be an object that describes the headers, example:
<pre>
<code class="language-javascript">downloadRequestHeaders: {
'Authorization': 'token 123345678901234567890',
}</code>
</pre>
</tr>
<tr>
<td>
<code>downloadRequestBody</code>
</td>
<td>
Use POST request on the URL of the download option. The value passed will be set as the body of the request.
</td>
</tr>
<tr> <tr>
<td> <td>
<code>skipEmptyLines</code> <code>skipEmptyLines</code>
@ -516,6 +633,14 @@ var csv = Papa.unparse({
A callback function, identical to step, which activates streaming. However, this function is executed after every <i>chunk</i> of the file is loaded and parsed rather than every row. Works only with local and remote files. Do not use both chunk and step callbacks together. For the function signature, see the documentation for the step function. A callback function, identical to step, which activates streaming. However, this function is executed after every <i>chunk</i> of the file is loaded and parsed rather than every row. Works only with local and remote files. Do not use both chunk and step callbacks together. For the function signature, see the documentation for the step function.
</td> </td>
</tr> </tr>
<tr>
<td>
<code>chunkSize</code>
</td>
<td>
Overrides <code>Papa.LocalChunkSize</code> and <code>Papa.RemoteChunkSize</code>. See <a href="#configurable">configurable</a> section to know the usage of both parameters.
</td>
</tr>
<tr> <tr>
<td> <td>
<code>fastMode</code> <code>fastMode</code>
@ -545,7 +670,15 @@ var csv = Papa.unparse({
<code>transform</code> <code>transform</code>
</td> </td>
<td> <td>
A function to apply on each value. The function receives the value as its first argument and the column number as its second argument. The return value of the function will replace the value it received. The transform function is applied before dynamicTyping. A function to apply on each value. The function receives the value as its first argument and the column number or header name when enabled as its second argument. The return value of the function will replace the value it received. The transform function is applied before dynamicTyping.
</td>
</tr>
<tr>
<td>
<code>delimitersToGuess</code>
</td>
<td>
An array of delimiters to guess from if the <code>delimiter</code> option is not set.
</td> </td>
</tr> </tr>
</table> </table>
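Note on the `delimitersToGuess` row in this hunk: the option supplies the candidate list used when `delimiter` is not set. The sketch below shows one simple way a guess over such a list could work: pick the candidate that splits the first lines into a consistent number of fields greater than one. This is a deliberately simplified heuristic that ignores quoting; it is not the algorithm PapaParse uses, and `guessDelimiter` is a hypothetical name. The default list mirrors the one shown in the config block, with character codes 30 and 31 standing in for `Papa.RECORD_SEP` and `Papa.UNIT_SEP`.

```javascript
// Simplified, hypothetical delimiter guess; NOT PapaParse's algorithm (quotes are ignored).
const DEFAULT_DELIMITERS_TO_GUESS = [
  ",", "\t", "|", ";",
  String.fromCharCode(30), // record separator
  String.fromCharCode(31), // unit separator
];

function guessDelimiter(input, delimitersToGuess = DEFAULT_DELIMITERS_TO_GUESS) {
  // Look at up to ten non-empty lines.
  const lines = input
    .split(/\r\n|\n|\r/)
    .filter((line) => line.length > 0)
    .slice(0, 10);

  let best = null;
  let bestFieldCount = 1;
  for (const delimiter of delimitersToGuess) {
    const counts = lines.map((line) => line.split(delimiter).length);
    const consistent = counts.every((count) => count === counts[0]);
    // Prefer the candidate yielding the most fields while staying consistent across lines.
    if (consistent && counts[0] > bestFieldCount) {
      best = delimiter;
      bestFieldCount = counts[0];
    }
  }
  return best; // null when no candidate splits the input
}

console.log(JSON.stringify(guessDelimiter("a;b;c\n1;2;3")));
```

Restricting the candidate list, as in `guessDelimiter(text, ["|"])`, is the same idea as passing a narrower `delimitersToGuess` array in the parse config.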
@ -728,7 +861,7 @@ var csv = Papa.unparse({
<tr> <tr>
<td><code>Papa.BAD_DELIMITERS</code></td> <td><code>Papa.BAD_DELIMITERS</code></td>
<td> <td>
An array of characters that are not allowed as delimiters. An array of characters that are not allowed as delimiters (<code>\r, \n, ", \ufeff</code>).
</td> </td>
</tr> </tr>
<tr> <tr>
@ -749,12 +882,6 @@ var csv = Papa.unparse({
Whether or not the browser supports HTML5 Web Workers. If false, <code>worker: true</code> will have no effect. Whether or not the browser supports HTML5 Web Workers. If false, <code>worker: true</code> will have no effect.
</td> </td>
</tr> </tr>
<tr>
<td><code>Papa.SCRIPT_PATH</code></td>
<td>
The relative path to Papa Parse. This is automatically detected when Papa Parse is loaded synchronously. However, if you load Papa Parse asynchronously (e.g. with RequireJS), you need to set this variable manually in order to use Web Workers. (In those cases, this variable is <i>not</i> read-only and you should set it!)
</td>
</tr>
</table> </table>
</div> </div>
@ -812,7 +939,7 @@ var csv = Papa.unparse({
<br><br> <br><br>
Papa Parse by <a href="https://twitter.com/mholt6">Matt Holt</a> Papa Parse by <a href="https://twitter.com/mholt6">Matt Holt</a>
<br> <br>
&copy; 2013-2018 &copy; 2013-2019
</div> </div>
<div class="grid-15 mobile-grid-50 links"> <div class="grid-15 mobile-grid-50 links">
<h5>Learn</h5> <h5>Learn</h5>
@ -822,7 +949,7 @@ var csv = Papa.unparse({
</div> </div>
<div class="grid-15 mobile-grid-50 links"> <div class="grid-15 mobile-grid-50 links">
<h5>Project</h5> <h5>Project</h5>
<a href="https://gratipay.com/mholt">Donate</a> <a href="https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=S6VTL9FQ6L8EN&item_name=PapaParse&currency_code=EUR&source=url">Donate</a>
<a href="https://github.com/mholt/PapaParse">GitHub</a> <a href="https://github.com/mholt/PapaParse">GitHub</a>
<a href="https://twitter.com/search?q=%23PapaParse">Share</a> <a href="https://twitter.com/search?q=%23PapaParse">Share</a>
</div> </div>

20
docs/faq.html

@ -35,7 +35,7 @@
</div> </div>
</div> </div>
<div class="grid-20 hide-on-mobile text-center"> <div class="grid-20 hide-on-mobile text-center">
<a href="/" class="text-logo">Papa Parse 4</a> <a href="/" class="text-logo">Papa Parse 5</a>
</div> </div>
<div class="grid-40 mobile-grid-50 text-right"> <div class="grid-40 mobile-grid-50 text-right">
<div class="links"> <div class="links">
@ -45,9 +45,6 @@
<a href="http://stackoverflow.com/questions/tagged/papaparse"> <a href="http://stackoverflow.com/questions/tagged/papaparse">
<i class="fa fa-stack-overflow fa-lg"></i> Help <i class="fa fa-stack-overflow fa-lg"></i> Help
</a> </a>
<!-- <a href="https://matt.life/pay" class="donate">
<i class="fa fa-heart fa-lg"></i> Donate
</a> -->
</div> </div>
</div> </div>
</div> </div>
@ -84,7 +81,7 @@
<h6 id="combine">Can I put other libraries in the same file as Papa Parse?</h6> <h6 id="combine">Can I put other libraries in the same file as Papa Parse?</h6>
<p> <p>
Yes, but then don't use the Web Worker feature unless your other dependencies are battle-hardened for worker threads. A worker thread loads an entire file, not just a function, so all those dependencies would be executed in an environment without a DOM and other <code>window</code> features. If any of those dependencies crash (<code>Cannot read property "defaultView" of undefined</code> <a href="https://github.com/mholt/PapaParse/issues/114">is</a> <a href="https://github.com/mholt/PapaParse/issues/163">common</a>), the whole worker thread will crash and parsing will not succeed. Yes.
</p> </p>
@ -96,7 +93,7 @@
<h6 id="async">Can Papa Parse be loaded asynchronously (after the page loads)?</h6> <h6 id="async">Can Papa Parse be loaded asynchronously (after the page loads)?</h6>
<p> <p>
Yes. But if you want to use Web Workers, you'll need to specify the relative path to Papa Parse. To do this, set <a href="/docs#readonly">Papa.SCRIPT_PATH</a> to the relative path of the Papa Parse file. In synchronous loading, this is automatically detected. Yes.
</p> </p>
@ -209,7 +206,7 @@
<h6>Can I use a worker if I combine/concatenate my Javascript files?</h6> <h6>Can I use a worker if I combine/concatenate my Javascript files?</h6>
<p> <p>
Probably not. It's safest to concatenate the rest of your dependencies and include Papa Parse in a separate file. Any library that expects to have access to the <code>window</code> or DOM will crash when executed in a worker thread. Only put <a href="/faq#combine">other libraries in the same file</a> if they are ready to be used in worker threads. Yes.
</p> </p>
<h6>When should I use a worker?</h6> <h6>When should I use a worker?</h6>
@ -241,6 +238,11 @@
<p> <p>
No. This would drastically slow down parsing, as it would require the worker to wait after every chunk for a "continue" signal from the main thread. But you <i>can</i> abort workers by calling <code>.abort()</code> on the parser that gets passed to your callback function. No. This would drastically slow down parsing, as it would require the worker to wait after every chunk for a "continue" signal from the main thread. But you <i>can</i> abort workers by calling <code>.abort()</code> on the parser that gets passed to your callback function.
</p> </p>
<h6>I set worker:true and now I'm getting an error: "window is not defined." How do I fix it?</h6>
<p>
This is a fairly common issue with configuration and it appears to be related to the use of React or other third-party tools. Since this is a configuration issue, the exact steps needed to solve it may vary. See <a href="https://github.com/mholt/PapaParse/issues/655">Issue #655</a> on GitHub for a solution that worked for one person, and for links to other related issues.
</p>
</div> </div>
</div> </div>
</main> </main>
@ -257,7 +259,7 @@
<br><br> <br><br>
Papa Parse by <a href="https://twitter.com/mholt6">Matt Holt</a> Papa Parse by <a href="https://twitter.com/mholt6">Matt Holt</a>
<br> <br>
&copy; 2013-2018 &copy; 2013-2019
</div> </div>
<div class="grid-15 mobile-grid-50 links"> <div class="grid-15 mobile-grid-50 links">
<h5>Learn</h5> <h5>Learn</h5>
@ -267,7 +269,7 @@
</div> </div>
<div class="grid-15 mobile-grid-50 links"> <div class="grid-15 mobile-grid-50 links">
<h5>Project</h5> <h5>Project</h5>
<a href="https://gratipay.com/mholt">Donate</a> <a href="https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=S6VTL9FQ6L8EN&item_name=PapaParse&currency_code=EUR&source=url">Donate</a>
<a href="https://github.com/mholt/PapaParse">GitHub</a> <a href="https://github.com/mholt/PapaParse">GitHub</a>
<a href="https://twitter.com/search?q=%23PapaParse">Share</a> <a href="https://twitter.com/search?q=%23PapaParse">Share</a>
</div> </div>

38
docs/index.html

@ -27,7 +27,7 @@
<h1>Papa Parse</h1> <h1>Papa Parse</h1>
<h2>The powerful, in-browser CSV parser for big boys and girls</h2> <h2>The powerful, in-browser CSV parser for big boys and girls</h2>
<a href="https://github.com/mholt/PapaParse/archive/4.6.0.zip" class="button"> <a href="https://github.com/mholt/PapaParse/archive/5.0.2.zip" class="button">
<i class="fa fa-download"></i>&nbsp; Download <i class="fa fa-download"></i>&nbsp; Download
</a> </a>
<a href="/demo" class="button red"> <a href="/demo" class="button red">
@ -81,7 +81,7 @@ Papa.parse(bigFile, {
</div> </div>
</div> </div>
<div class="grid-20 hide-on-mobile text-center"> <div class="grid-20 hide-on-mobile text-center">
<a href="/" class="text-logo">Papa Parse 4</a> <a href="/" class="text-logo">Papa Parse 5</a>
</div> </div>
<div class="grid-40 mobile-grid-50 text-right"> <div class="grid-40 mobile-grid-50 text-right">
<div class="links"> <div class="links">
@ -91,16 +91,13 @@ Papa.parse(bigFile, {
<a href="http://stackoverflow.com/questions/tagged/papaparse"> <a href="http://stackoverflow.com/questions/tagged/papaparse">
<i class="fa fa-stack-overflow fa-lg"></i> Help <i class="fa fa-stack-overflow fa-lg"></i> Help
</a> </a>
<!-- <a href="https://matt.life/pay" class="donate">
<i class="fa fa-heart fa-lg"></i> Donate
</a> -->
</div> </div>
</div> </div>
</div> </div>
</header> </header>
<div class="insignia"> <div class="insignia">
<div class="firefox-hack"><div id="version-intro">Version</div><div id="version">4.6</div></div> <div class="firefox-hack"><div id="version-intro">Version</div><div id="version">5.0</div></div>
</div> </div>
@ -178,6 +175,24 @@ Papa.parse(bigFile, {
<div class="grid-100 text-center"> <div class="grid-100 text-center">
<h3>People <i class="fa fa-heart"></i> Papa</h3> <h3>People <i class="fa fa-heart"></i> Papa</h3>
</div> </div>
<div class="grid-100 text-center">
<br>
<p>
<a href="https://www.npmjs.com/package/papaparse">
<img
src="https://img.shields.io/npm/dm/papaparse.svg"
alt="PapaParse"
/>
</a>
&nbsp;
<a href="https://www.npmjs.com/package/react-papaparse">
<img
src="https://img.shields.io/npm/dt/papaparse.svg?label=total%20downloads"
alt="PapaParse"
/>
</a>
</p>
</div>
<div class="grid-33"> <div class="grid-33">
<p class="lover"> <p class="lover">
@ -199,7 +214,7 @@ Papa.parse(bigFile, {
<div class="grid-100 text-center"> <div class="grid-100 text-center">
<br> <br>
<b><a href="https://github.com/mholt/PapaParse/blob/gh-pages/resources/js/lovers.js" class="add-lover-link subheader"><i class="fa fa-plus-square"></i> Add your link (it's free)</a></b> <b><a href="https://github.com/mholt/PapaParse/blob/master/docs/resources/js/lovers.js" class="add-lover-link subheader"><i class="fa fa-plus-square"></i> Add your link (it's free)</a></b>
</div> </div>
</div> </div>
</section> </section>
@ -506,6 +521,7 @@ var csv = Papa.unparse(yourData);</code></pre>
<i class="fa fa-book"></i>&nbsp; Documentation <i class="fa fa-book"></i>&nbsp; Documentation
</a> </a>
</div> </div>
</div>
</section> </section>
@ -524,7 +540,7 @@ var csv = Papa.unparse(yourData);</code></pre>
<br><br> <br><br>
Papa Parse by <a href="https://twitter.com/mholt6">Matt Holt</a> Papa Parse by <a href="https://twitter.com/mholt6">Matt Holt</a>
<br> <br>
&copy; 2013-2018 &copy; 2013-2019
</div> </div>
<div class="grid-15 mobile-grid-50 links"> <div class="grid-15 mobile-grid-50 links">
<h5>Learn</h5> <h5>Learn</h5>
@ -534,7 +550,7 @@ var csv = Papa.unparse(yourData);</code></pre>
</div> </div>
<div class="grid-15 mobile-grid-50 links"> <div class="grid-15 mobile-grid-50 links">
<h5>Project</h5> <h5>Project</h5>
<a href="https://gratipay.com/mholt">Donate</a> <a href="https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=S6VTL9FQ6L8EN&item_name=PapaParse&currency_code=EUR&source=url">Donate</a>
<a href="https://github.com/mholt/PapaParse">GitHub</a> <a href="https://github.com/mholt/PapaParse">GitHub</a>
<a href="https://twitter.com/search?q=%23PapaParse">Share</a> <a href="https://twitter.com/search?q=%23PapaParse">Share</a>
</div> </div>
@ -543,8 +559,8 @@ var csv = Papa.unparse(yourData);</code></pre>
<h5>Download</h5> <h5>Download</h5>
<a href="https://github.com/mholt/PapaParse/archive/master.zip">Latest (master)</a> <a href="https://github.com/mholt/PapaParse/archive/master.zip">Latest (master)</a>
<hr> <hr>
<a href="https://github.com/mholt/PapaParse/blob/master/papaparse.min.js">Lil' Papa</a> <a href="https://unpkg.com/papaparse@latest/papaparse.min.js">Lil' Papa</a>
<a href="https://github.com/mholt/PapaParse/blob/master/papaparse.js">Fat Papa</a> <a href="https://unpkg.com/papaparse@latest/papaparse.js">Fat Papa</a>
</div> </div>
<div class="grid-15 mobile-grid-50 links"> <div class="grid-15 mobile-grid-50 links">
<h5>Community</h5> <h5>Community</h5>

62
docs/resources/js/lovers.js

@ -56,27 +56,57 @@ var peopleLovePapa = [
description: "uses Papa Parse in VisualEditor to help article editors effortlessly build data tables from text files." description: "uses Papa Parse in VisualEditor to help article editors effortlessly build data tables from text files."
}, },
{ {
link: "https://www.webucator.com/webdesign/javascript.cfm", link: "https://github.com/Nanofus/novel.js",
name: "Webucator", name: "Novel.js",
description: "created a video showing how to use Papa Parse and FileDrop.js to create a drag-and-drop CSV-JSON converter.", description: "is a text adventure framework that uses Papa Parse to enable user-friendly translations.",
quote: "It's often easy to convert data to CSV. With Papa, it's easy to turn that CSV into JSON." quote: "Papa saves countless hours of work and makes reading large CSV files so easy!"
}, },
{ {
link: "http://www.yolpo.com/social/gist.github?1dbd4556e748bdb830b3&autoplay=1&interimresults=0&failfast=1", link: "https://mailcheck.co",
name: "Yolpo", name: "Mailcheck.co",
description: "created a simple regression test for Papa Parse.", description: "Mailcheck is email validation service. All emails usually stored in CSV's. We use Papa Parse to process data from our customers in browser",
quote: "Papa's API is so intuitive, it took me no time to get it to work." quote: "Papa Parser allowed our customers to preview and process csv's in browser, without uploading them to server. It saves lots of time and space :)"
}, },
{ {
link: "https://www.appstax.com", link: "https://flatfile.io",
name: "Appstax", name: "Flatfile.io",
description: "uses Papa Parse to import and export CSV data in their visual databrowser.", description: "is an add-in data importer for web-apps, providing the full UX to upload a spreadsheet, field match, and repair issues found during import.",
quote: "Papa is a great for parsing CSV. And what a great tone of voice - love it!" quote: "Papa is a core part of our importer, so much so that we're committed to helping maintain it!"
}, },
{ {
link: "https://github.com/Nanofus/novel.js", link: "https://familiohq.com",
name: "Novel.js", name: "Familio",
description: "is a text adventure framework that uses Papa Parse to enable user-friendly translations.", description: "is a brand-new messaging app made specifically for busy families. Automatically align all family members when sending text messages to parents in the kindergarten or school or when planning your kids birthday parties.",
quote: "Papa saves countless hours of work and makes reading large CSV files so easy!" quote: "With Papa it was a joy to implement our tool for importing messages and places from external systems."
},
{
link: "https://monei.net",
name: "MONEI",
description: "Digital payments made easy.",
quote: "With Papa life became much easier for us to manage huge csv payments files of our merchants."
},
{
link: "https://moonmail.io",
name: "MoonMail",
description: "OmniChannel Communication Platform powered by AWS PinPoint",
quote: "Papa makes contact imports a plain sailing."
},
{
link: "https://apps.shopify.com/wholesaler",
name: "Wholesaler for Shopify",
description: "Shopify App to offer Wholesaling within one unique Shopify store",
quote: "Super fast bulk Wholesale product price uploads. Love Papa!."
},
{
link: "https://www.unnitmetaliya.com/sop-sample/",
name: "Visa SOP Sample",
description: "Providing free guide to international students.",
quote: "Use Papa Parse for many of side projects. Super fast and works all the time. Love it!"
},
{
link: "https://retool.com/",
name: "Retool",
description: "A remarkably fast way to build internal tools.",
quote: "Papa makes it easy for our users to customize CSV parsing to match their business logic."
} }
]; ];

272
docs/resources/js/papaparse.js

@ -1,19 +1,10 @@
/* @license /* @license
Papa Parse Papa Parse
v4.6.1 v5.0.0
https://github.com/mholt/PapaParse https://github.com/mholt/PapaParse
License: MIT License: MIT
*/ */
// Polyfills
// https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/isArray#Polyfill
if (!Array.isArray)
{
Array.isArray = function(arg) {
return Object.prototype.toString.call(arg) === '[object Array]';
};
}
(function(root, factory) (function(root, factory)
{ {
/* globals define */ /* globals define */
@ -34,7 +25,10 @@ if (!Array.isArray)
// Browser globals (root is window) // Browser globals (root is window)
root.Papa = factory(); root.Papa = factory();
} }
}(this, function() // in strict mode we cannot access arguments.callee, so we need a named reference to
// stringify the factory method for the blob worker
// eslint-disable-next-line func-name
}(this, function moduleFactory()
{ {
'use strict'; 'use strict';
@ -51,9 +45,15 @@ if (!Array.isArray)
return {}; return {};
})(); })();
function getWorkerBlob() {
var URL = global.URL || global.webkitURL || null;
var code = moduleFactory.toString();
return Papa.BLOB_URL || (Papa.BLOB_URL = URL.createObjectURL(new Blob(['(', code, ')();'], {type: 'text/javascript'})));
}
var IS_WORKER = !global.document && !!global.postMessage, var IS_WORKER = !global.document && !!global.postMessage,
IS_PAPA_WORKER = IS_WORKER && /(\?|&)papaworker(=|&|$)/.test(global.location.search), IS_PAPA_WORKER = IS_WORKER && /blob:/i.test((global.location || {}).protocol);
LOADED_SYNC = false, AUTO_SCRIPT_PATH;
var workers = {}, workerIdCounter = 0; var workers = {}, workerIdCounter = 0;
var Papa = {}; var Papa = {};
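Note on the hunk above: the factory gets a name so that its own source text can be turned into a worker script. The following is a minimal sketch of that idea only, not the library's code. `buildWorkerSource` and the module contents are hypothetical, and the browser-only step is left as a comment because it needs `Blob` and `Worker`.

```javascript
// A named function can refer to itself in strict mode, where
// arguments.callee is not available.
function moduleFactory() {
	'use strict';
	return { name: 'demo-module' }; // hypothetical module contents
}

// Wrap the factory's source text in an immediately invoked expression,
// so that a worker loading this text runs the factory once.
function buildWorkerSource(factory) {
	return '(' + factory.toString() + ')();';
}

var workerSource = buildWorkerSource(moduleFactory);

// Browser-only step, shown as a comment:
// var url = URL.createObjectURL(new Blob([workerSource], {type: 'text/javascript'}));
// var worker = new Worker(url);
```

Because the worker script is generated from the function text, no script path has to be known in advance, which appears to be why `Papa.SCRIPT_PATH` is dropped elsewhere in this diff.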
@ -66,7 +66,6 @@ if (!Array.isArray)
Papa.BYTE_ORDER_MARK = '\ufeff'; Papa.BYTE_ORDER_MARK = '\ufeff';
Papa.BAD_DELIMITERS = ['\r', '\n', '"', Papa.BYTE_ORDER_MARK]; Papa.BAD_DELIMITERS = ['\r', '\n', '"', Papa.BYTE_ORDER_MARK];
Papa.WORKERS_SUPPORTED = !IS_WORKER && !!global.Worker; Papa.WORKERS_SUPPORTED = !IS_WORKER && !!global.Worker;
Papa.SCRIPT_PATH = null; // Must be set by your code if you use workers and this lib is loaded asynchronously
Papa.NODE_STREAM_INPUT = 1; Papa.NODE_STREAM_INPUT = 1;
// Configurable chunk sizes for local and remote files, respectively // Configurable chunk sizes for local and remote files, respectively
@ -81,7 +80,9 @@ if (!Array.isArray)
Papa.FileStreamer = FileStreamer; Papa.FileStreamer = FileStreamer;
Papa.StringStreamer = StringStreamer; Papa.StringStreamer = StringStreamer;
Papa.ReadableStreamStreamer = ReadableStreamStreamer; Papa.ReadableStreamStreamer = ReadableStreamStreamer;
if (typeof PAPA_BROWSER_CONTEXT === 'undefined') {
Papa.DuplexStreamStreamer = DuplexStreamStreamer; Papa.DuplexStreamStreamer = DuplexStreamStreamer;
}
if (global.jQuery) if (global.jQuery)
{ {
@ -182,23 +183,6 @@ if (!Array.isArray)
{ {
global.onmessage = workerThreadReceivedMessage; global.onmessage = workerThreadReceivedMessage;
} }
else if (Papa.WORKERS_SUPPORTED)
{
AUTO_SCRIPT_PATH = getScriptPath();
// Check if the script was loaded synchronously
if (!document.body)
{
// Body doesn't exist yet, must be synchronous
LOADED_SYNC = true;
}
else
{
document.addEventListener('DOMContentLoaded', function() {
LOADED_SYNC = true;
}, true);
}
}
@ -241,7 +225,7 @@ if (!Array.isArray)
} }
var streamer = null; var streamer = null;
if (_input === Papa.NODE_STREAM_INPUT) if (_input === Papa.NODE_STREAM_INPUT && typeof PAPA_BROWSER_CONTEXT === 'undefined')
{ {
// create a node Duplex stream for use // create a node Duplex stream for use
// with .pipe // with .pipe
@ -289,12 +273,18 @@ if (!Array.isArray)
/** quote character */ /** quote character */
var _quoteChar = '"'; var _quoteChar = '"';
/** escaped quote character, either "" or <config.escapeChar>" */
var _escapedQuote = _quoteChar + _quoteChar;
/** whether to skip empty lines */ /** whether to skip empty lines */
var _skipEmptyLines = false; var _skipEmptyLines = false;
/** the columns (keys) we expect when we unparse objects */
var _columns = null;
unpackConfig(); unpackConfig();
var quoteCharRegex = new RegExp(_quoteChar, 'g'); var quoteCharRegex = new RegExp(escapeRegExp(_quoteChar), 'g');
if (typeof _input === 'string') if (typeof _input === 'string')
_input = JSON.parse(_input); _input = JSON.parse(_input);
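Note on the hunk above: building a `RegExp` directly from a configurable quote character misbehaves when that character is a regex metacharacter such as `|` or `.`. The sketch below shows the effect of escaping first. The character class follows the commonly used MDN pattern and is an assumption here, not a copy of the library's `escapeRegExp`; `doubleQuotes` is a hypothetical helper.

```javascript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text) {
	return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Double every occurrence of quoteChar inside a field value.
// A replacer function is used so that '$' in quoteChar is not
// interpreted as a replacement pattern.
function doubleQuotes(value, quoteChar) {
	var quoteCharRegex = new RegExp(escapeRegExp(quoteChar), 'g');
	return value.replace(quoteCharRegex, function() {
		return quoteChar + quoteChar;
	});
}
```

Without the escaping step, a quote character of `.` would match every character in the field rather than only literal dots.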
@ -304,7 +294,7 @@ if (!Array.isArray)
if (!_input.length || Array.isArray(_input[0])) if (!_input.length || Array.isArray(_input[0]))
return serialize(null, _input, _skipEmptyLines); return serialize(null, _input, _skipEmptyLines);
else if (typeof _input[0] === 'object') else if (typeof _input[0] === 'object')
return serialize(objectKeys(_input[0]), _input, _skipEmptyLines); return serialize(_columns || objectKeys(_input[0]), _input, _skipEmptyLines);
} }
else if (typeof _input === 'object') else if (typeof _input === 'object')
{ {
@ -329,7 +319,7 @@ if (!Array.isArray)
} }
// Default (any valid paths should return before this) // Default (any valid paths should return before this)
throw 'exception: Unable to serialize unrecognized input'; throw new Error('Unable to serialize unrecognized input');
function unpackConfig() function unpackConfig()
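Note on the hunk above: replacing a thrown string with `new Error(...)` changes what callers can do in a `catch` block. A small sketch of the difference, using hypothetical helper names:

```javascript
// Hypothetical stand-in for a function that rejects unsupported input.
function serializeOrThrow(input) {
	if (typeof input !== 'string' && !Array.isArray(input)) {
		throw new Error('Unable to serialize unrecognized input');
	}
	return String(input);
}

// Report what kind of value was thrown.
function describeFailure(fn) {
	try {
		fn();
		return 'ok';
	} catch (err) {
		// A thrown string has no .message or .stack; an Error has both.
		return err instanceof Error
			? 'Error: ' + err.message
			: 'non-Error: ' + String(err);
	}
}
```

Callers that check `err instanceof Error` or read `err.message` only work with the new form.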
@ -359,6 +349,17 @@ if (!Array.isArray)
if (typeof _config.header === 'boolean') if (typeof _config.header === 'boolean')
_writeHeader = _config.header; _writeHeader = _config.header;
if (Array.isArray(_config.columns)) {
if (_config.columns.length === 0) throw new Error('Option columns is empty');
_columns = _config.columns;
}
if (_config.escapeChar !== undefined) {
_escapedQuote = _config.escapeChar + _quoteChar;
}
} }
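Note on the hunk above: the `columns` and `escapeChar` options can be pictured with a small standalone serializer. This is a sketch, not the library's code. `toCsvSketch` and its quoting rule (quote only when a field contains a comma, a quote or a newline) are assumptions for illustration, and it expects a non-empty array of objects.

```javascript
function toCsvSketch(rows, options) {
	options = options || {};
	var quoteChar = '"';
	// Default escape is the doubled quote; escapeChar overrides the first half.
	var escapedQuote = (options.escapeChar !== undefined ? options.escapeChar : quoteChar) + quoteChar;

	if (Array.isArray(options.columns) && options.columns.length === 0)
		throw new Error('Option columns is empty');

	// Explicit columns win; otherwise fall back to the keys of the first row.
	var fields = options.columns || Object.keys(rows[0]);

	function safe(value) {
		var str = value === undefined || value === null ? '' : String(value);
		var needsQuotes = str.indexOf(',') > -1 || str.indexOf(quoteChar) > -1 || str.indexOf('\n') > -1;
		str = str.split(quoteChar).join(escapedQuote);
		return needsQuotes ? quoteChar + str + quoteChar : str;
	}

	var lines = [fields.map(safe).join(',')];
	rows.forEach(function(row) {
		lines.push(fields.map(function(field) { return safe(row[field]); }).join(','));
	});
	return lines.join('\r\n');
}
```

Passing `columns` both selects and orders the output fields, so keys that are missing from the first row can still be written.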
@ -403,21 +404,36 @@ if (!Array.isArray)
for (var row = 0; row < data.length; row++) for (var row = 0; row < data.length; row++)
{ {
var maxCol = hasHeader ? fields.length : data[row].length; var maxCol = hasHeader ? fields.length : data[row].length;
var r = hasHeader ? fields : data[row];
if (skipEmptyLines !== 'greedy' || r.join('').trim() !== '') var emptyLine = false;
var nullLine = hasHeader ? Object.keys(data[row]).length === 0 : data[row].length === 0;
if (skipEmptyLines && !hasHeader)
{
emptyLine = skipEmptyLines === 'greedy' ? data[row].join('').trim() === '' : data[row].length === 1 && data[row][0].length === 0;
}
if (skipEmptyLines === 'greedy' && hasHeader) {
var line = [];
for (var c = 0; c < maxCol; c++) {
var cx = dataKeyedByField ? fields[c] : c;
line.push(data[row][cx]);
}
emptyLine = line.join('').trim() === '';
}
if (!emptyLine)
{ {
for (var col = 0; col < maxCol; col++) for (var col = 0; col < maxCol; col++)
{ {
if (col > 0) if (col > 0 && !nullLine)
csv += _delimiter; csv += _delimiter;
var colIdx = hasHeader && dataKeyedByField ? fields[col] : col; var colIdx = hasHeader && dataKeyedByField ? fields[col] : col;
csv += safe(data[row][colIdx], col); csv += safe(data[row][colIdx], col);
} }
if (row < data.length - 1 && (!skipEmptyLines || maxCol > 0)) if (row < data.length - 1 && (!skipEmptyLines || (maxCol > 0 && !nullLine)))
{
csv += _newline; csv += _newline;
} }
} }
}
return csv; return csv;
} }
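Note on the hunk above: for array rows, the two `skipEmptyLines` modes differ in what counts as an empty line. The sketch below isolates that check. `isEmptyRow` is a hypothetical helper, and it assumes every cell is a string.

```javascript
function isEmptyRow(row, skipEmptyLines) {
	if (!skipEmptyLines)
		return false;
	if (skipEmptyLines === 'greedy')
		// Greedy: a row whose cells are all empty or whitespace is empty.
		return row.join('').trim() === '';
	// Plain true: only a row with a single zero-length cell is empty.
	return row.length === 1 && row[0].length === 0;
}
```

With `skipEmptyLines: true`, a row such as `[' ', '']` is still written, while `'greedy'` drops it.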
@ -430,7 +446,7 @@ if (!Array.isArray)
if (str.constructor === Date) if (str.constructor === Date)
return JSON.stringify(str).slice(1, 25); return JSON.stringify(str).slice(1, 25);
str = str.toString().replace(quoteCharRegex, _quoteChar + _quoteChar); str = str.toString().replace(quoteCharRegex, _escapedQuote);
var needsQuotes = (typeof _quotes === 'boolean' && _quotes) var needsQuotes = (typeof _quotes === 'boolean' && _quotes)
|| (Array.isArray(_quotes) && _quotes[col]) || (Array.isArray(_quotes) && _quotes[col])
@ -457,6 +473,7 @@ if (!Array.isArray)
this._handle = null; this._handle = null;
this._finished = false; this._finished = false;
this._completed = false; this._completed = false;
this._halted = false;
this._input = null; this._input = null;
this._baseIndex = 0; this._baseIndex = 0;
this._partialLine = ''; this._partialLine = '';
@ -481,6 +498,7 @@ if (!Array.isArray)
chunk = modifiedChunk; chunk = modifiedChunk;
} }
this.isFirstChunk = false; this.isFirstChunk = false;
this._halted = false;
// Rejoin the line we likely just split in two by chunking the file // Rejoin the line we likely just split in two by chunking the file
var aggregate = this._partialLine + chunk; var aggregate = this._partialLine + chunk;
@ -488,8 +506,10 @@ if (!Array.isArray)
var results = this._handle.parse(aggregate, this._baseIndex, !this._finished); var results = this._handle.parse(aggregate, this._baseIndex, !this._finished);
if (this._handle.paused() || this._handle.aborted()) if (this._handle.paused() || this._handle.aborted()) {
this._halted = true;
return; return;
}
var lastIndex = results.meta.cursor; var lastIndex = results.meta.cursor;
@ -515,8 +535,10 @@ if (!Array.isArray)
else if (isFunction(this._config.chunk) && !isFakeChunk) else if (isFunction(this._config.chunk) && !isFakeChunk)
{ {
this._config.chunk(results, this._handle); this._config.chunk(results, this._handle);
if (this._handle.paused() || this._handle.aborted()) if (this._handle.paused() || this._handle.aborted()) {
this._halted = true;
return; return;
}
results = undefined; results = undefined;
this._completeResults = undefined; this._completeResults = undefined;
} }
@ -634,7 +656,6 @@ if (!Array.isArray)
{ {
var end = this._start + this._config.chunkSize - 1; // minus one because byte range is inclusive var end = this._start + this._config.chunkSize - 1; // minus one because byte range is inclusive
xhr.setRequestHeader('Range', 'bytes=' + this._start + '-' + end); xhr.setRequestHeader('Range', 'bytes=' + this._start + '-' + end);
xhr.setRequestHeader('If-None-Match', 'webkit-no-cache'); // https://bugs.webkit.org/show_bug.cgi?id=82672
} }
try { try {
@ -881,14 +902,12 @@ if (!Array.isArray)
this._onCsvData = function(results) this._onCsvData = function(results)
{ {
var data = results.data; var data = results.data;
for (var i = 0; i < data.length; i++) { if (!stream.push(data) && !this._handle.paused()) {
if (!stream.push(data[i]) && !this._handle.paused()) {
// the writeable consumer buffer has filled up // the writeable consumer buffer has filled up
// so we need to pause until more items // so we need to pause until more items
// can be processed // can be processed
this._handle.pause(); this._handle.pause();
} }
}
}; };
this._onCsvComplete = function() this._onCsvComplete = function()
@ -967,8 +986,10 @@ if (!Array.isArray)
}); });
stream.once('finish', bindFunction(this._onWriteComplete, this)); stream.once('finish', bindFunction(this._onWriteComplete, this));
} }
if (typeof PAPA_BROWSER_CONTEXT === 'undefined') {
DuplexStreamStreamer.prototype = Object.create(ChunkStreamer.prototype); DuplexStreamStreamer.prototype = Object.create(ChunkStreamer.prototype);
DuplexStreamStreamer.prototype.constructor = DuplexStreamStreamer; DuplexStreamStreamer.prototype.constructor = DuplexStreamStreamer;
}
// Use one ParserHandle per entire CSV file or string // Use one ParserHandle per entire CSV file or string
@ -977,7 +998,6 @@ if (!Array.isArray)
// One goal is to minimize the use of regular expressions... // One goal is to minimize the use of regular expressions...
var FLOAT = /^\s*-?(\d*\.?\d+|\d+\.?\d*)(e[-+]?\d+)?\s*$/i; var FLOAT = /^\s*-?(\d*\.?\d+|\d+\.?\d*)(e[-+]?\d+)?\s*$/i;
var ISO_DATE = /(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))/; var ISO_DATE = /(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))/;
var self = this; var self = this;
var _stepCounter = 0; // Number of times step was called (number of rows parsed) var _stepCounter = 0; // Number of times step was called (number of rows parsed)
var _rowCounter = 0; // Number of rows that have been parsed so far var _rowCounter = 0; // Number of rows that have been parsed so far
@ -1033,7 +1053,7 @@ if (!Array.isArray)
_delimiterError = false; _delimiterError = false;
if (!_config.delimiter) if (!_config.delimiter)
{ {
var delimGuess = guessDelimiter(input, _config.newline, _config.skipEmptyLines, _config.comments); var delimGuess = guessDelimiter(input, _config.newline, _config.skipEmptyLines, _config.comments, _config.delimitersToGuess);
if (delimGuess.successful) if (delimGuess.successful)
_config.delimiter = delimGuess.bestDelimiter; _config.delimiter = delimGuess.bestDelimiter;
else else
@ -1074,8 +1094,14 @@ if (!Array.isArray)
this.resume = function() this.resume = function()
{ {
if(self.streamer._halted) {
_paused = false; _paused = false;
self.streamer.parseChunk(_input, true); self.streamer.parseChunk(_input, true);
} else {
// Bugfix: #636 In case the processing hasn't halted yet
// wait for it to halt in order to resume
setTimeout(this.resume, 3);
}
}; };
this.aborted = function() this.aborted = function()
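Note on the hunk above: `resume` now waits until the streamer has actually halted before restarting it (bugfix #636 in the comment). The retry pattern can be sketched in isolation as below. `createHandle`, the `streamer` shape and the injectable `schedule` argument are all hypothetical, introduced only so the behaviour can be exercised without timers.

```javascript
// Hypothetical stand-in for a parser handle; not Papa Parse's real class.
function createHandle(streamer, schedule) {
	// By default, retry after a short delay, as in the hunk above.
	schedule = schedule || function(fn) { setTimeout(fn, 3); };

	var handle = {
		paused: true,
		resume: function() {
			if (streamer.halted) {
				handle.paused = false;
				streamer.parseNext();
			} else {
				// The current chunk is still being processed; try again later.
				schedule(handle.resume);
			}
		}
	};
	return handle;
}
```

The point of the guard is that calling `resume` from inside a callback, before the current chunk has returned, no longer restarts parsing in the middle of that chunk.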
@ -1127,19 +1153,26 @@ if (!Array.isArray)
{ {
if (!_results) if (!_results)
return; return;
for (var i = 0; needsHeaderRow() && i < _results.data.length; i++)
for (var j = 0; j < _results.data[i].length; j++)
{
var header = _results.data[i][j];
if (_config.trimHeaders) { function addHeder(header)
header = header.trim(); {
} if (isFunction(_config.transformHeader))
header = _config.transformHeader(header);
_fields.push(header); _fields.push(header);
} }
if (Array.isArray(_results.data[0]))
{
for (var i = 0; needsHeaderRow() && i < _results.data.length; i++)
_results.data[i].forEach(addHeder);
_results.data.splice(0, 1); _results.data.splice(0, 1);
} }
// if _results.data[0] is not an array, we are in a step where _results.data is the row.
else
_results.data.forEach(addHeder);
}
function shouldApplyDynamicTyping(field) { function shouldApplyDynamicTyping(field) {
// Cache function values to avoid calling it for each row // Cache function values to avoid calling it for each row
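Note on the hunk above: the old `trimHeaders` flag is replaced by a `transformHeader` function applied to each header. The sketch below shows how a transform function covers the old trimming case and more. `buildFields` is a hypothetical helper, not the library's function.

```javascript
// Apply an optional transform to each header before storing it.
function buildFields(headerRow, transformHeader) {
	var fields = [];
	headerRow.forEach(function(header) {
		if (typeof transformHeader === 'function')
			header = transformHeader(header);
		fields.push(header);
	});
	return fields;
}

// Trimming reproduces the old flag; lowercasing is an extra step
// that a boolean option could not express.
var fields = buildFields([' Name ', 'E-Mail'], function(header) {
	return header.trim().toLowerCase();
});
```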
@ -1172,15 +1205,15 @@ if (!Array.isArray)
if (!_results || (!_config.header && !_config.dynamicTyping && !_config.transform)) if (!_results || (!_config.header && !_config.dynamicTyping && !_config.transform))
return _results; return _results;
for (var i = 0; i < _results.data.length; i++) function processRow(rowSource, i)
{ {
var row = _config.header ? {} : []; var row = _config.header ? {} : [];
var j; var j;
for (j = 0; j < _results.data[i].length; j++) for (j = 0; j < rowSource.length; j++)
{ {
var field = j; var field = j;
var value = _results.data[i][j]; var value = rowSource[j];
if (_config.header) if (_config.header)
field = j >= _fields.length ? '__parsed_extra' : _fields[j]; field = j >= _fields.length ? '__parsed_extra' : _fields[j];
@ -1199,7 +1232,6 @@ if (!Array.isArray)
row[field] = value; row[field] = value;
} }
_results.data[i] = row;
if (_config.header) if (_config.header)
{ {
@ -1208,23 +1240,36 @@ if (!Array.isArray)
else if (j < _fields.length) else if (j < _fields.length)
addError('FieldMismatch', 'TooFewFields', 'Too few fields: expected ' + _fields.length + ' fields but parsed ' + j, _rowCounter + i); addError('FieldMismatch', 'TooFewFields', 'Too few fields: expected ' + _fields.length + ' fields but parsed ' + j, _rowCounter + i);
} }
return row;
} }
var incrementBy = 1;
if (!_results.data[0] || Array.isArray(_results.data[0]))
{
_results.data = _results.data.map(processRow);
incrementBy = _results.data.length;
}
else
_results.data = processRow(_results.data, 0);
if (_config.header && _results.meta) if (_config.header && _results.meta)
_results.meta.fields = _fields; _results.meta.fields = _fields;
_rowCounter += _results.data.length; _rowCounter += incrementBy;
return _results; return _results;
} }
function guessDelimiter(input, newline, skipEmptyLines, comments) function guessDelimiter(input, newline, skipEmptyLines, comments, delimitersToGuess)
{ {
var delimChoices = [',', '\t', '|', ';', Papa.RECORD_SEP, Papa.UNIT_SEP];
var bestDelim, bestDelta, fieldCountPrevRow; var bestDelim, bestDelta, fieldCountPrevRow;
for (var i = 0; i < delimChoices.length; i++) delimitersToGuess = delimitersToGuess || [',', '\t', '|', ';', Papa.RECORD_SEP, Papa.UNIT_SEP];
for (var i = 0; i < delimitersToGuess.length; i++)
{ {
var delim = delimChoices[i]; var delim = delimitersToGuess[i];
var delta = 0, avgFieldCount = 0, emptyLinesCount = 0; var delta = 0, avgFieldCount = 0, emptyLinesCount = 0;
fieldCountPrevRow = undefined; fieldCountPrevRow = undefined;
@ -1247,7 +1292,7 @@ if (!Array.isArray)
if (typeof fieldCountPrevRow === 'undefined') if (typeof fieldCountPrevRow === 'undefined')
{ {
fieldCountPrevRow = fieldCount; fieldCountPrevRow = 0;
continue; continue;
} }
else if (fieldCount > 1) else if (fieldCount > 1)
@ -1260,7 +1305,7 @@ if (!Array.isArray)
if (preview.data.length > 0) if (preview.data.length > 0)
avgFieldCount /= (preview.data.length - emptyLinesCount); avgFieldCount /= (preview.data.length - emptyLinesCount);
if ((typeof bestDelta === 'undefined' || delta < bestDelta) if ((typeof bestDelta === 'undefined' || delta > bestDelta)
&& avgFieldCount > 1.99) && avgFieldCount > 1.99)
{ {
bestDelta = delta; bestDelta = delta;
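Note on the hunks above: `delimitersToGuess` lets the caller supply the candidate list instead of the built-in one. The sketch below shows only that aspect. The scoring is deliberately simplified and is not the library's rule: it splits lines naively (ignoring quoted fields), keeps the `avgFieldCount > 1.99` threshold from the diff, and prefers the candidate whose field count varies least between lines. `guessDelimiterSketch` is a hypothetical name.

```javascript
var RECORD_SEP = String.fromCharCode(30);
var UNIT_SEP = String.fromCharCode(31);

function guessDelimiterSketch(input, delimitersToGuess) {
	// Fall back to the default candidates when none are supplied.
	delimitersToGuess = delimitersToGuess || [',', '\t', '|', ';', RECORD_SEP, UNIT_SEP];

	var lines = input.split(/\r?\n/).filter(function(line) { return line !== ''; });
	var best, bestDelta;

	delimitersToGuess.forEach(function(delim) {
		var counts = lines.map(function(line) { return line.split(delim).length; });
		if (counts.length === 0)
			return;

		var total = 0, delta = 0;
		for (var i = 0; i < counts.length; i++) {
			total += counts[i];
			if (i > 0)
				delta += Math.abs(counts[i] - counts[i - 1]);
		}

		var avgFieldCount = total / counts.length;
		if (avgFieldCount > 1.99 && (bestDelta === undefined || delta < bestDelta)) {
			bestDelta = delta;
			best = delim;
		}
	});

	return best; // undefined when no candidate qualifies
}
```

Restricting the candidates is useful when a file contains several plausible separators and the caller already knows which ones are possible.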
@ -1349,7 +1394,7 @@ if (!Array.isArray)
// Comment character must be valid // Comment character must be valid
if (comments === delim) if (comments === delim)
throw 'Comment character same as delimiter'; throw new Error('Comment character same as delimiter');
else if (comments === true) else if (comments === true)
comments = '#'; comments = '#';
else if (typeof comments !== 'string' else if (typeof comments !== 'string'
@ -1368,7 +1413,7 @@ if (!Array.isArray)
{ {
// For some reason, in Chrome, this speeds things up (!?) // For some reason, in Chrome, this speeds things up (!?)
if (typeof input !== 'string') if (typeof input !== 'string')
throw 'Input must be a string'; throw new Error('Input must be a string');
// We don't need to compute some of these every time parse() is called, // We don't need to compute some of these every time parse() is called,
// but having them in a more local scope seems to perform better // but having them in a more local scope seems to perform better
@ -1419,8 +1464,8 @@ if (!Array.isArray)
var nextDelim = input.indexOf(delim, cursor); var nextDelim = input.indexOf(delim, cursor);
var nextNewline = input.indexOf(newline, cursor); var nextNewline = input.indexOf(newline, cursor);
var quoteCharRegex = new RegExp(escapeChar.replace(/[-[\]/{}()*+?.\\^$|]/g, '\\$&') + quoteChar, 'g'); var quoteCharRegex = new RegExp(escapeRegExp(escapeChar) + escapeRegExp(quoteChar), 'g');
var quoteSearch; var quoteSearch = input.indexOf(quoteChar, cursor);
// Parser loop // Parser loop
for (;;) for (;;)
@ -1485,6 +1530,12 @@ if (!Array.isArray)
{ {
row.push(input.substring(cursor, quoteSearch).replace(quoteCharRegex, quoteChar)); row.push(input.substring(cursor, quoteSearch).replace(quoteCharRegex, quoteChar));
cursor = quoteSearch + 1 + spacesBetweenQuoteAndDelimiter + delimLen; cursor = quoteSearch + 1 + spacesBetweenQuoteAndDelimiter + delimLen;
// If char after following delimiter is not quoteChar, we find next quote char position
if (input[quoteSearch + 1 + spacesBetweenQuoteAndDelimiter + delimLen] !== quoteChar)
{
quoteSearch = input.indexOf(quoteChar, cursor);
}
nextDelim = input.indexOf(delim, cursor); nextDelim = input.indexOf(delim, cursor);
nextNewline = input.indexOf(newline, cursor); nextNewline = input.indexOf(newline, cursor);
break; break;
@ -1498,6 +1549,7 @@ if (!Array.isArray)
row.push(input.substring(cursor, quoteSearch).replace(quoteCharRegex, quoteChar)); row.push(input.substring(cursor, quoteSearch).replace(quoteCharRegex, quoteChar));
saveRow(quoteSearch + 1 + spacesBetweenQuoteAndNewLine + newlineLen); saveRow(quoteSearch + 1 + spacesBetweenQuoteAndNewLine + newlineLen);
nextDelim = input.indexOf(delim, cursor); // because we may have skipped the nextDelim in the quoted field nextDelim = input.indexOf(delim, cursor); // because we may have skipped the nextDelim in the quoted field
quoteSearch = input.indexOf(quoteChar, cursor); // we search for first quote in next line
if (stepIsFunction) if (stepIsFunction)
{ {
@ -1544,11 +1596,28 @@ if (!Array.isArray)
// Next delimiter comes before next newline, so we've reached end of field // Next delimiter comes before next newline, so we've reached end of field
if (nextDelim !== -1 && (nextDelim < nextNewline || nextNewline === -1)) if (nextDelim !== -1 && (nextDelim < nextNewline || nextNewline === -1))
{ {
// we check, if we have quotes, because delimiter char may be part of field enclosed in quotes
if (quoteSearch !== -1) {
// we have quotes, so we try to find the next delimiter not enclosed in quotes and also next starting quote char
var nextDelimObj = getNextUnqotedDelimiter(nextDelim, quoteSearch, nextNewline);
// if we have next delimiter char which is not enclosed in quotes
if (nextDelimObj && nextDelimObj.nextDelim) {
nextDelim = nextDelimObj.nextDelim;
quoteSearch = nextDelimObj.quoteSearch;
row.push(input.substring(cursor, nextDelim)); row.push(input.substring(cursor, nextDelim));
cursor = nextDelim + delimLen; cursor = nextDelim + delimLen;
// we look for next delimiter char
nextDelim = input.indexOf(delim, cursor); nextDelim = input.indexOf(delim, cursor);
continue; continue;
} }
} else {
row.push(input.substring(cursor, nextDelim));
cursor = nextDelim + delimLen;
nextDelim = input.indexOf(delim, cursor);
continue;
}
}
// End of row // End of row
if (nextNewline !== -1) if (nextNewline !== -1)
@ -1630,10 +1699,11 @@ if (!Array.isArray)
} }
/** Returns an object with the results, errors, and meta. */ /** Returns an object with the results, errors, and meta. */
function returnable(stopped) function returnable(stopped, step)
{ {
var isStep = step || false;
return { return {
data: data, data: isStep ? data[0] : data,
errors: errors, errors: errors,
meta: { meta: {
delimiter: delim, delimiter: delim,
@ -1648,10 +1718,44 @@ if (!Array.isArray)
/** Executes the user's step function and resets data & errors. */ /** Executes the user's step function and resets data & errors. */
function doStep() function doStep()
{ {
step(returnable()); step(returnable(undefined, true));
data = []; data = [];
errors = []; errors = [];
} }
/** Gets the delimiter character, which is not inside the quoted field */
function getNextUnqotedDelimiter(nextDelim, quoteSearch, newLine) {
var result = {
nextDelim: undefined,
quoteSearch: undefined
};
// get the next closing quote character
var nextQuoteSearch = input.indexOf(quoteChar, quoteSearch + 1);
// if next delimiter is part of a field enclosed in quotes
if (nextDelim > quoteSearch && nextDelim < nextQuoteSearch && (nextQuoteSearch < newLine || newLine === -1)) {
// get the next delimiter character after this one
var nextNextDelim = input.indexOf(delim, nextQuoteSearch);
// if there is no next delimiter, return default result
if (nextNextDelim === -1) {
return result;
}
// find the next opening quote char position
if (nextNextDelim > nextQuoteSearch) {
nextQuoteSearch = input.indexOf(quoteChar, nextQuoteSearch + 1);
}
// try to get the next delimiter position
result = getNextUnqotedDelimiter(nextNextDelim, nextQuoteSearch, newLine);
} else {
result = {
nextDelim: nextDelim,
quoteSearch: quoteSearch
};
}
return result;
}
}; };
/** Sets the abort flag */ /** Sets the abort flag */
@ -1668,26 +1772,12 @@ if (!Array.isArray)
} }
// If you need to load Papa Parse asynchronously and you also need worker threads, hard-code
// the script path here. See: https://github.com/mholt/PapaParse/issues/87#issuecomment-57885358
function getScriptPath()
{
var scripts = document.getElementsByTagName('script');
return scripts.length ? scripts[scripts.length - 1].src : '';
}
function newWorker() function newWorker()
{ {
if (!Papa.WORKERS_SUPPORTED) if (!Papa.WORKERS_SUPPORTED)
return false; return false;
if (!LOADED_SYNC && Papa.SCRIPT_PATH === null)
throw new Error( var workerUrl = getWorkerBlob();
'Script path cannot be determined automatically when Papa Parse is loaded asynchronously. ' +
'You need to set Papa.SCRIPT_PATH manually.'
);
var workerUrl = Papa.SCRIPT_PATH || AUTO_SCRIPT_PATH;
// Append 'papaworker' to the search string to tell papaparse that this is our worker.
workerUrl += (workerUrl.indexOf('?') !== -1 ? '&' : '?') + 'papaworker';
var w = new global.Worker(workerUrl); var w = new global.Worker(workerUrl);
w.onmessage = mainThreadReceivedMessage; w.onmessage = mainThreadReceivedMessage;
w.id = workerIdCounter++; w.id = workerIdCounter++;
@ -1722,7 +1812,7 @@ if (!Array.isArray)
for (var i = 0; i < msg.results.data.length; i++) for (var i = 0; i < msg.results.data.length; i++)
{ {
worker.userStep({ worker.userStep({
data: [msg.results.data[i]], data: msg.results.data[i],
errors: msg.results.errors, errors: msg.results.errors,
meta: msg.results.meta meta: msg.results.meta
}, handle); }, handle);
@ -1751,7 +1841,7 @@ if (!Array.isArray)
} }
function notImplemented() { function notImplemented() {
throw 'Not implemented.'; throw new Error('Not implemented.');
} }
/** Callback when worker thread receives a message */ /** Callback when worker thread receives a message */
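Review note: the hunks above change what a `step` callback receives. `returnable(undefined, true)` now yields `data[0]`, and the worker path passes `msg.results.data[i]` instead of `[msg.results.data[i]]`, so `results.data` is the row itself rather than a one-element array wrapping it. A minimal sketch of a caller-side shim that tolerates both shapes follows. `stepRow` is a hypothetical helper, not part of Papa Parse, and it assumes a row is either an array of fields or a plain object keyed by header.

```javascript
// Hypothetical helper (not part of Papa Parse): returns the row from a step
// result whether `data` is the row itself or a one-element array wrapping it.
function stepRow(results) {
  var data = results.data;
  if (Array.isArray(data) && data.length === 1) {
    var first = data[0];
    // A wrapped row is itself an array (no header) or a plain object (header).
    // Field values are strings, numbers, booleans or Dates, never rows.
    var isRowLike = Array.isArray(first) ||
      (first !== null && typeof first === 'object' && !(first instanceof Date));
    if (isRowLike) return first; // old shape: [row]
  }
  return data; // new shape: row
}

// Both calls yield the same two-field row.
console.log(stepRow({ data: [['a', 'b']] })); // old shape
console.log(stepRow({ data: ['a', 'b'] }));   // new shape
```

A shim like this is only useful while code has to run against both major versions. Once everything is on the new side of the diff, reading `results.data` directly is simpler.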

package.json (13 changed lines)

@ -1,6 +1,6 @@
{ {
"name": "papaparse", "name": "papaparse",
"version": "4.6.3", "version": "5.3.2",
"description": "Fast and powerful CSV parser for the browser that supports web workers and streaming large files. Converts CSV to JSON and JSON to CSV.", "description": "Fast and powerful CSV parser for the browser that supports web workers and streaming large files. Converts CSV to JSON and JSON to CSV.",
"keywords": [ "keywords": [
"csv", "csv",
@ -42,17 +42,16 @@
"eslint": "^4.19.1", "eslint": "^4.19.1",
"grunt": "^1.0.2", "grunt": "^1.0.2",
"grunt-contrib-uglify": "^3.3.0", "grunt-contrib-uglify": "^3.3.0",
"mocha": "^3.5.0", "mocha": "^5.2.0",
"mocha-phantomjs": "^4.1.0", "mocha-headless-chrome": "^4.0.0",
"open": "0.0.5", "open": "7.0.0",
"phantomjs-prebuilt": "^2.1.16",
"serve-static": "^1.7.1" "serve-static": "^1.7.1"
}, },
"scripts": { "scripts": {
"lint": "eslint --no-ignore papaparse.js Gruntfile.js .eslintrc.js 'tests/**/*.js'", "lint": "eslint --no-ignore papaparse.js Gruntfile.js .eslintrc.js 'tests/**/*.js'",
"test-browser": "node tests/test.js", "test-browser": "node tests/test.js",
"test-phantomjs": "node tests/test.js --phantomjs", "test-mocha-headless-chrome": "node tests/test.js --mocha-headless-chrome",
"test-node": "mocha tests/node-tests.js tests/test-cases.js", "test-node": "mocha tests/node-tests.js tests/test-cases.js",
"test": "npm run lint && npm run test-node && npm run test-phantomjs" "test": "npm run lint && npm run test-node && npm run test-mocha-headless-chrome"
} }
} }

papaparse.js (330 changed lines)

@ -1,19 +1,10 @@
/* @license /* @license
Papa Parse Papa Parse
v4.6.3 v5.3.2
https://github.com/mholt/PapaParse https://github.com/mholt/PapaParse
License: MIT License: MIT
*/ */
// Polyfills
// https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/isArray#Polyfill
if (!Array.isArray)
{
Array.isArray = function(arg) {
return Object.prototype.toString.call(arg) === '[object Array]';
};
}
(function(root, factory) (function(root, factory)
{ {
/* globals define */ /* globals define */
@ -34,7 +25,10 @@ if (!Array.isArray)
// Browser globals (root is window) // Browser globals (root is window)
root.Papa = factory(); root.Papa = factory();
} }
}(this, function() // in strict mode we cannot access arguments.callee, so we need a named reference to
// stringify the factory method for the blob worker
// eslint-disable-next-line func-name
}(this, function moduleFactory()
{ {
'use strict'; 'use strict';
@ -51,9 +45,15 @@ if (!Array.isArray)
return {}; return {};
})(); })();
function getWorkerBlob() {
var URL = global.URL || global.webkitURL || null;
var code = moduleFactory.toString();
return Papa.BLOB_URL || (Papa.BLOB_URL = URL.createObjectURL(new Blob(['(', code, ')();'], {type: 'text/javascript'})));
}
var IS_WORKER = !global.document && !!global.postMessage, var IS_WORKER = !global.document && !!global.postMessage,
IS_PAPA_WORKER = IS_WORKER && /(\?|&)papaworker(=|&|$)/.test(global.location.search), IS_PAPA_WORKER = IS_WORKER && /blob:/i.test((global.location || {}).protocol);
LOADED_SYNC = false, AUTO_SCRIPT_PATH;
var workers = {}, workerIdCounter = 0; var workers = {}, workerIdCounter = 0;
var Papa = {}; var Papa = {};
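Review note: `getWorkerBlob()` relies on the factory being a named function (`moduleFactory`) so that its source can be stringified and wrapped in an immediately invoked expression. The sketch below isolates just that wrapping step. `buildWorkerSource` and `exampleFactory` are hypothetical names used for illustration, and the Blob and Worker calls are left as comments because they need a browser context.

```javascript
// Sketch of the stringify-and-wrap step used for the blob worker.
// Concatenates the same three parts the diff hands to the Blob constructor.
function buildWorkerSource(factory) {
  return '(' + factory.toString() + ')();';
}

// Hypothetical stand-in for the library factory.
function exampleFactory() { return 42; }

var source = buildWorkerSource(exampleFactory);
console.log(source);

// In a browser, the diff then does roughly:
//   var url = URL.createObjectURL(new Blob([source], {type: 'text/javascript'}));
//   var worker = new Worker(url);
```

Because the worker is created from a `blob:` URL, the worker-side check in the diff can test `location.protocol` instead of looking for a `papaworker` query parameter.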
@ -66,7 +66,6 @@ if (!Array.isArray)
Papa.BYTE_ORDER_MARK = '\ufeff'; Papa.BYTE_ORDER_MARK = '\ufeff';
Papa.BAD_DELIMITERS = ['\r', '\n', '"', Papa.BYTE_ORDER_MARK]; Papa.BAD_DELIMITERS = ['\r', '\n', '"', Papa.BYTE_ORDER_MARK];
Papa.WORKERS_SUPPORTED = !IS_WORKER && !!global.Worker; Papa.WORKERS_SUPPORTED = !IS_WORKER && !!global.Worker;
Papa.SCRIPT_PATH = null; // Must be set by your code if you use workers and this lib is loaded asynchronously
Papa.NODE_STREAM_INPUT = 1; Papa.NODE_STREAM_INPUT = 1;
// Configurable chunk sizes for local and remote files, respectively // Configurable chunk sizes for local and remote files, respectively
@ -184,23 +183,6 @@ if (!Array.isArray)
{ {
global.onmessage = workerThreadReceivedMessage; global.onmessage = workerThreadReceivedMessage;
} }
else if (Papa.WORKERS_SUPPORTED)
{
AUTO_SCRIPT_PATH = getScriptPath();
// Check if the script was loaded synchronously
if (!document.body)
{
// Body doesn't exist yet, must be synchronous
LOADED_SYNC = true;
}
else
{
document.addEventListener('DOMContentLoaded', function() {
LOADED_SYNC = true;
}, true);
}
}
@ -291,9 +273,18 @@ if (!Array.isArray)
/** quote character */ /** quote character */
var _quoteChar = '"'; var _quoteChar = '"';
/** escaped quote character, either "" or <config.escapeChar>" */
var _escapedQuote = _quoteChar + _quoteChar;
/** whether to skip empty lines */ /** whether to skip empty lines */
var _skipEmptyLines = false; var _skipEmptyLines = false;
/** the columns (keys) we expect when we unparse objects */
var _columns = null;
/** whether to prevent outputting cells that can be parsed as formulae by spreadsheet software (Excel and LibreOffice) */
var _escapeFormulae = false;
unpackConfig(); unpackConfig();
var quoteCharRegex = new RegExp(escapeRegExp(_quoteChar), 'g'); var quoteCharRegex = new RegExp(escapeRegExp(_quoteChar), 'g');
@ -306,7 +297,7 @@ if (!Array.isArray)
if (!_input.length || Array.isArray(_input[0])) if (!_input.length || Array.isArray(_input[0]))
return serialize(null, _input, _skipEmptyLines); return serialize(null, _input, _skipEmptyLines);
else if (typeof _input[0] === 'object') else if (typeof _input[0] === 'object')
return serialize(objectKeys(_input[0]), _input, _skipEmptyLines); return serialize(_columns || Object.keys(_input[0]), _input, _skipEmptyLines);
} }
else if (typeof _input === 'object') else if (typeof _input === 'object')
{ {
@ -316,12 +307,14 @@ if (!Array.isArray)
if (Array.isArray(_input.data)) if (Array.isArray(_input.data))
{ {
if (!_input.fields) if (!_input.fields)
_input.fields = _input.meta && _input.meta.fields; _input.fields = _input.meta && _input.meta.fields || _columns;
if (!_input.fields) if (!_input.fields)
_input.fields = Array.isArray(_input.data[0]) _input.fields = Array.isArray(_input.data[0])
? _input.fields ? _input.fields
: objectKeys(_input.data[0]); : typeof _input.data[0] === 'object'
? Object.keys(_input.data[0])
: [];
if (!(Array.isArray(_input.data[0])) && typeof _input.data[0] !== 'object') if (!(Array.isArray(_input.data[0])) && typeof _input.data[0] !== 'object')
_input.data = [_input.data]; // handles input like [1,2,3] or ['asdf'] _input.data = [_input.data]; // handles input like [1,2,3] or ['asdf']
@ -331,7 +324,7 @@ if (!Array.isArray)
} }
// Default (any valid paths should return before this) // Default (any valid paths should return before this)
throw 'exception: Unable to serialize unrecognized input'; throw new Error('Unable to serialize unrecognized input');
function unpackConfig() function unpackConfig()
@ -346,6 +339,7 @@ if (!Array.isArray)
} }
if (typeof _config.quotes === 'boolean' if (typeof _config.quotes === 'boolean'
|| typeof _config.quotes === 'function'
|| Array.isArray(_config.quotes)) || Array.isArray(_config.quotes))
_quotes = _config.quotes; _quotes = _config.quotes;
@ -361,18 +355,21 @@ if (!Array.isArray)
if (typeof _config.header === 'boolean') if (typeof _config.header === 'boolean')
_writeHeader = _config.header; _writeHeader = _config.header;
if (Array.isArray(_config.columns)) {
if (_config.columns.length === 0) throw new Error('Option columns is empty');
_columns = _config.columns;
} }
if (_config.escapeChar !== undefined) {
_escapedQuote = _config.escapeChar + _quoteChar;
}
/** Turns an object's keys into an array */ if (typeof _config.escapeFormulae === 'boolean' || _config.escapeFormulae instanceof RegExp) {
function objectKeys(obj) _escapeFormulae = _config.escapeFormulae instanceof RegExp ? _config.escapeFormulae : /^[=+\-@\t\r].*$/;
{ }
if (typeof obj !== 'object')
return [];
var keys = [];
for (var key in obj)
keys.push(key);
return keys;
} }
/** The double for loop that iterates the data and writes out a CSV string including header row */ /** The double for loop that iterates the data and writes out a CSV string including header row */
@ -447,16 +444,25 @@ if (!Array.isArray)
if (str.constructor === Date) if (str.constructor === Date)
return JSON.stringify(str).slice(1, 25); return JSON.stringify(str).slice(1, 25);
str = str.toString().replace(quoteCharRegex, _quoteChar + _quoteChar); var needsQuotes = false;
if (_escapeFormulae && typeof str === "string" && _escapeFormulae.test(str)) {
str = "'" + str;
needsQuotes = true;
}
var escapedQuoteStr = str.toString().replace(quoteCharRegex, _escapedQuote);
var needsQuotes = (typeof _quotes === 'boolean' && _quotes) needsQuotes = needsQuotes
|| _quotes === true
|| (typeof _quotes === 'function' && _quotes(str, col))
|| (Array.isArray(_quotes) && _quotes[col]) || (Array.isArray(_quotes) && _quotes[col])
|| hasAny(str, Papa.BAD_DELIMITERS) || hasAny(escapedQuoteStr, Papa.BAD_DELIMITERS)
|| str.indexOf(_delimiter) > -1 || escapedQuoteStr.indexOf(_delimiter) > -1
|| str.charAt(0) === ' ' || escapedQuoteStr.charAt(0) === ' '
|| str.charAt(str.length - 1) === ' '; || escapedQuoteStr.charAt(escapedQuoteStr.length - 1) === ' ';
return needsQuotes ? _quoteChar + str + _quoteChar : str; return needsQuotes ? _quoteChar + escapedQuoteStr + _quoteChar : escapedQuoteStr;
} }
function hasAny(str, substrings) function hasAny(str, substrings)
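Review note: the `escapeFormulae` handling in this hunk prefixes a single quote to string cells that match a formula-like pattern, and also forces those cells to be quoted. A standalone sketch of the prefixing step is below. The default pattern is copied from the diff, while `escapeFormula` and its `pattern` parameter are hypothetical names for illustration.

```javascript
// Default pattern copied from the diff: cells starting with =, +, -, @, tab or CR.
var DEFAULT_FORMULA_PREFIX = /^[=+\-@\t\r].*$/;

// Hypothetical helper: prefixes matching string values with a single quote
// and leaves every other value untouched.
function escapeFormula(value, pattern) {
  var re = pattern instanceof RegExp ? pattern : DEFAULT_FORMULA_PREFIX;
  if (typeof value === 'string' && re.test(value)) {
    return "'" + value;
  }
  return value;
}

console.log(escapeFormula('=SUM(A1:A2)'));
console.log(escapeFormula('plain text'));
console.log(escapeFormula('#tag', /^#/)); // caller-supplied pattern
```

One thing worth checking in the `unpackConfig` branch shown earlier in this file's diff: `typeof _config.escapeFormulae === 'boolean'` is also true for `false`, so as written an explicit `escapeFormulae: false` appears to install the default pattern as well.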
@ -474,6 +480,7 @@ if (!Array.isArray)
this._handle = null; this._handle = null;
this._finished = false; this._finished = false;
this._completed = false; this._completed = false;
this._halted = false;
this._input = null; this._input = null;
this._baseIndex = 0; this._baseIndex = 0;
this._partialLine = ''; this._partialLine = '';
@ -498,6 +505,7 @@ if (!Array.isArray)
chunk = modifiedChunk; chunk = modifiedChunk;
} }
this.isFirstChunk = false; this.isFirstChunk = false;
this._halted = false;
// Rejoin the line we likely just split in two by chunking the file // Rejoin the line we likely just split in two by chunking the file
var aggregate = this._partialLine + chunk; var aggregate = this._partialLine + chunk;
@ -505,8 +513,10 @@ if (!Array.isArray)
var results = this._handle.parse(aggregate, this._baseIndex, !this._finished); var results = this._handle.parse(aggregate, this._baseIndex, !this._finished);
if (this._handle.paused() || this._handle.aborted()) if (this._handle.paused() || this._handle.aborted()) {
this._halted = true;
return; return;
}
var lastIndex = results.meta.cursor; var lastIndex = results.meta.cursor;
@ -532,8 +542,10 @@ if (!Array.isArray)
else if (isFunction(this._config.chunk) && !isFakeChunk) else if (isFunction(this._config.chunk) && !isFakeChunk)
{ {
this._config.chunk(results, this._handle); this._config.chunk(results, this._handle);
if (this._handle.paused() || this._handle.aborted()) if (this._handle.paused() || this._handle.aborted()) {
this._halted = true;
return; return;
}
results = undefined; results = undefined;
this._completeResults = undefined; this._completeResults = undefined;
} }
@ -635,7 +647,7 @@ if (!Array.isArray)
xhr.onerror = bindFunction(this._chunkError, this); xhr.onerror = bindFunction(this._chunkError, this);
} }
xhr.open('GET', this._input, !IS_WORKER); xhr.open(this._config.downloadRequestBody ? 'POST' : 'GET', this._input, !IS_WORKER);
// Headers can only be set once the request state is OPENED // Headers can only be set once the request state is OPENED
if (this._config.downloadRequestHeaders) if (this._config.downloadRequestHeaders)
{ {
@ -651,11 +663,10 @@ if (!Array.isArray)
{ {
var end = this._start + this._config.chunkSize - 1; // minus one because byte range is inclusive var end = this._start + this._config.chunkSize - 1; // minus one because byte range is inclusive
xhr.setRequestHeader('Range', 'bytes=' + this._start + '-' + end); xhr.setRequestHeader('Range', 'bytes=' + this._start + '-' + end);
xhr.setRequestHeader('If-None-Match', 'webkit-no-cache'); // https://bugs.webkit.org/show_bug.cgi?id=82672
} }
try { try {
xhr.send(); xhr.send(this._config.downloadRequestBody);
} }
catch (err) { catch (err) {
this._chunkError(err.message); this._chunkError(err.message);
@ -663,8 +674,6 @@ if (!Array.isArray)
if (IS_WORKER && xhr.status === 0) if (IS_WORKER && xhr.status === 0)
this._chunkError(); this._chunkError();
else
this._start += this._config.chunkSize;
}; };
this._chunkLoaded = function() this._chunkLoaded = function()
@ -678,7 +687,9 @@ if (!Array.isArray)
return; return;
} }
this._finished = !this._config.chunkSize || this._start > getFileSize(xhr); // Use chunckSize as it may be a diference on reponse lentgh due to characters with more than 1 byte
this._start += this._config.chunkSize ? this._config.chunkSize : xhr.responseText.length;
this._finished = !this._config.chunkSize || this._start >= getFileSize(xhr);
this.parseChunk(xhr.responseText); this.parseChunk(xhr.responseText);
}; };
@ -694,7 +705,7 @@ if (!Array.isArray)
if (contentRange === null) { // no content range, then finish! if (contentRange === null) { // no content range, then finish!
return -1; return -1;
} }
return parseInt(contentRange.substr(contentRange.lastIndexOf('/') + 1)); return parseInt(contentRange.substring(contentRange.lastIndexOf('/') + 1));
} }
} }
NetworkStreamer.prototype = Object.create(ChunkStreamer.prototype); NetworkStreamer.prototype = Object.create(ChunkStreamer.prototype);
@ -783,8 +794,14 @@ if (!Array.isArray)
{ {
if (this._finished) return; if (this._finished) return;
var size = this._config.chunkSize; var size = this._config.chunkSize;
var chunk = size ? remaining.substr(0, size) : remaining; var chunk;
remaining = size ? remaining.substr(size) : ''; if(size) {
chunk = remaining.substring(0, size);
remaining = remaining.substring(size);
} else {
chunk = remaining;
remaining = '';
}
this._finished = !remaining; this._finished = !remaining;
return this.parseChunk(chunk); return this.parseChunk(chunk);
}; };
@ -898,14 +915,12 @@ if (!Array.isArray)
this._onCsvData = function(results) this._onCsvData = function(results)
{ {
var data = results.data; var data = results.data;
for (var i = 0; i < data.length; i++) { if (!stream.push(data) && !this._handle.paused()) {
if (!stream.push(data[i]) && !this._handle.paused()) {
// the writeable consumer buffer has filled up // the writeable consumer buffer has filled up
// so we need to pause until more items // so we need to pause until more items
// can be processed // can be processed
this._handle.pause(); this._handle.pause();
} }
}
}; };
this._onCsvComplete = function() this._onCsvComplete = function()
@ -994,9 +1009,10 @@ if (!Array.isArray)
function ParserHandle(_config) function ParserHandle(_config)
{ {
// One goal is to minimize the use of regular expressions... // One goal is to minimize the use of regular expressions...
var FLOAT = /^\s*-?(\d*\.?\d+|\d+\.?\d*)(e[-+]?\d+)?\s*$/i; var MAX_FLOAT = Math.pow(2, 53);
var ISO_DATE = /(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))/; var MIN_FLOAT = -MAX_FLOAT;
var FLOAT = /^\s*-?(\d+\.?|\.\d+|\d+\.\d+)([eE][-+]?\d+)?\s*$/;
var ISO_DATE = /^(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))$/;
var self = this; var self = this;
var _stepCounter = 0; // Number of times step was called (number of rows parsed) var _stepCounter = 0; // Number of times step was called (number of rows parsed)
var _rowCounter = 0; // Number of rows that have been parsed so far var _rowCounter = 0; // Number of rows that have been parsed so far
@ -1032,9 +1048,11 @@ if (!Array.isArray)
_stepCounter += results.data.length; _stepCounter += results.data.length;
if (_config.preview && _stepCounter > _config.preview) if (_config.preview && _stepCounter > _config.preview)
_parser.abort(); _parser.abort();
else else {
_results.data = _results.data[0];
userStep(_results, self); userStep(_results, self);
} }
}
}; };
} }
@ -1052,7 +1070,7 @@ if (!Array.isArray)
_delimiterError = false; _delimiterError = false;
if (!_config.delimiter) if (!_config.delimiter)
{ {
var delimGuess = guessDelimiter(input, _config.newline, _config.skipEmptyLines, _config.comments); var delimGuess = guessDelimiter(input, _config.newline, _config.skipEmptyLines, _config.comments, _config.delimitersToGuess);
if (delimGuess.successful) if (delimGuess.successful)
_config.delimiter = delimGuess.bestDelimiter; _config.delimiter = delimGuess.bestDelimiter;
else else
@ -1088,13 +1106,22 @@ if (!Array.isArray)
{ {
_paused = true; _paused = true;
_parser.abort(); _parser.abort();
_input = _input.substr(_parser.getCharIndex());
// If it is streaming via "chunking", the reader will start appending correctly already so no need to substring,
// otherwise we can get duplicate content within a row
_input = isFunction(_config.chunk) ? "" : _input.substring(_parser.getCharIndex());
}; };
this.resume = function() this.resume = function()
{ {
if(self.streamer._halted) {
_paused = false; _paused = false;
self.streamer.parseChunk(_input, true); self.streamer.parseChunk(_input, true);
} else {
// Bugfix: #636 In case the processing hasn't halted yet
// wait for it to halt in order to resume
setTimeout(self.resume, 3);
}
}; };
this.aborted = function() this.aborted = function()
@ -1116,6 +1143,16 @@ if (!Array.isArray)
return _config.skipEmptyLines === 'greedy' ? s.join('').trim() === '' : s.length === 1 && s[0].length === 0; return _config.skipEmptyLines === 'greedy' ? s.join('').trim() === '' : s.length === 1 && s[0].length === 0;
} }
function testFloat(s) {
if (FLOAT.test(s)) {
var floatValue = parseFloat(s);
if (floatValue > MIN_FLOAT && floatValue < MAX_FLOAT) {
return true;
}
}
return false;
}
function processResults() function processResults()
{ {
if (_results && _delimiterError) if (_results && _delimiterError)
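Review note: `testFloat` combines the tightened `FLOAT` regex with a range check, so numeric-looking strings at or beyond ±2^53 stay strings under `dynamicTyping` instead of being rounded by `parseFloat`. The sketch below copies the constants and regex from the diff into a standalone function so the effect can be tried in isolation. The early-return form is a restructuring for brevity, not the diff's exact code.

```javascript
// Constants and regex copied from the diff.
var MAX_FLOAT = Math.pow(2, 53);
var MIN_FLOAT = -MAX_FLOAT;
var FLOAT = /^\s*-?(\d+\.?|\.\d+|\d+\.\d+)([eE][-+]?\d+)?\s*$/;

// Returns true only for strings that look like a number AND whose parsed
// value lies strictly between -2^53 and 2^53.
function testFloat(s) {
  if (!FLOAT.test(s)) {
    return false;
  }
  var floatValue = parseFloat(s);
  return floatValue > MIN_FLOAT && floatValue < MAX_FLOAT;
}

console.log(testFloat('1.5'));              // true
console.log(testFloat('9007199254740993')); // false: parses to 2^53, outside the open range
console.log(testFloat('12abc'));            // false: rejected by the regex
```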
@ -1126,9 +1163,9 @@ if (!Array.isArray)
if (_config.skipEmptyLines) if (_config.skipEmptyLines)
{ {
for (var i = 0; i < _results.data.length; i++) _results.data = _results.data.filter(function(d) {
if (testEmptyLine(_results.data[i])) return !testEmptyLine(d);
_results.data.splice(i--, 1); });
} }
if (needsHeaderRow()) if (needsHeaderRow())
@ -1146,19 +1183,26 @@ if (!Array.isArray)
{ {
if (!_results) if (!_results)
return; return;
for (var i = 0; needsHeaderRow() && i < _results.data.length; i++)
for (var j = 0; j < _results.data[i].length; j++)
{
var header = _results.data[i][j];
if (_config.trimHeaders) { function addHeader(header, i)
header = header.trim(); {
} if (isFunction(_config.transformHeader))
header = _config.transformHeader(header, i);
_fields.push(header); _fields.push(header);
} }
if (Array.isArray(_results.data[0]))
{
for (var i = 0; needsHeaderRow() && i < _results.data.length; i++)
_results.data[i].forEach(addHeader);
_results.data.splice(0, 1); _results.data.splice(0, 1);
} }
// if _results.data[0] is not an array, we are in a step where _results.data is the row.
else
_results.data.forEach(addHeader);
}
function shouldApplyDynamicTyping(field) { function shouldApplyDynamicTyping(field) {
// Cache function values to avoid calling it for each row // Cache function values to avoid calling it for each row
@ -1176,7 +1220,7 @@ if (!Array.isArray)
return true; return true;
else if (value === 'false' || value === 'FALSE') else if (value === 'false' || value === 'FALSE')
return false; return false;
else if (FLOAT.test(value)) else if (testFloat(value))
return parseFloat(value); return parseFloat(value);
else if (ISO_DATE.test(value)) else if (ISO_DATE.test(value))
return new Date(value); return new Date(value);
@ -1191,15 +1235,15 @@ if (!Array.isArray)
if (!_results || (!_config.header && !_config.dynamicTyping && !_config.transform)) if (!_results || (!_config.header && !_config.dynamicTyping && !_config.transform))
return _results; return _results;
for (var i = 0; i < _results.data.length; i++) function processRow(rowSource, i)
{ {
var row = _config.header ? {} : []; var row = _config.header ? {} : [];
var j; var j;
for (j = 0; j < _results.data[i].length; j++) for (j = 0; j < rowSource.length; j++)
{ {
var field = j; var field = j;
var value = _results.data[i][j]; var value = rowSource[j];
if (_config.header) if (_config.header)
field = j >= _fields.length ? '__parsed_extra' : _fields[j]; field = j >= _fields.length ? '__parsed_extra' : _fields[j];
@ -1218,7 +1262,6 @@ if (!Array.isArray)
row[field] = value; row[field] = value;
} }
_results.data[i] = row;
if (_config.header) if (_config.header)
{ {
@ -1227,23 +1270,34 @@ if (!Array.isArray)
else if (j < _fields.length) else if (j < _fields.length)
addError('FieldMismatch', 'TooFewFields', 'Too few fields: expected ' + _fields.length + ' fields but parsed ' + j, _rowCounter + i); addError('FieldMismatch', 'TooFewFields', 'Too few fields: expected ' + _fields.length + ' fields but parsed ' + j, _rowCounter + i);
} }
return row;
}
var incrementBy = 1;
if (!_results.data.length || Array.isArray(_results.data[0]))
{
_results.data = _results.data.map(processRow);
incrementBy = _results.data.length;
} }
else
_results.data = processRow(_results.data, 0);
if (_config.header && _results.meta) if (_config.header && _results.meta)
_results.meta.fields = _fields; _results.meta.fields = _fields;
_rowCounter += _results.data.length; _rowCounter += incrementBy;
return _results; return _results;
} }
function guessDelimiter(input, newline, skipEmptyLines, comments) function guessDelimiter(input, newline, skipEmptyLines, comments, delimitersToGuess) {
{ var bestDelim, bestDelta, fieldCountPrevRow, maxFieldCount;
var delimChoices = [',', '\t', '|', ';', Papa.RECORD_SEP, Papa.UNIT_SEP];
var bestDelim, bestDelta, fieldCountPrevRow;
for (var i = 0; i < delimChoices.length; i++) delimitersToGuess = delimitersToGuess || [',', '\t', '|', ';', Papa.RECORD_SEP, Papa.UNIT_SEP];
{
var delim = delimChoices[i]; for (var i = 0; i < delimitersToGuess.length; i++) {
var delim = delimitersToGuess[i];
var delta = 0, avgFieldCount = 0, emptyLinesCount = 0; var delta = 0, avgFieldCount = 0, emptyLinesCount = 0;
fieldCountPrevRow = undefined; fieldCountPrevRow = undefined;
@ -1254,23 +1308,19 @@ if (!Array.isArray)
preview: 10 preview: 10
}).parse(input); }).parse(input);
for (var j = 0; j < preview.data.length; j++) for (var j = 0; j < preview.data.length; j++) {
{ if (skipEmptyLines && testEmptyLine(preview.data[j])) {
if (skipEmptyLines && testEmptyLine(preview.data[j]))
{
emptyLinesCount++; emptyLinesCount++;
continue; continue;
} }
var fieldCount = preview.data[j].length; var fieldCount = preview.data[j].length;
avgFieldCount += fieldCount; avgFieldCount += fieldCount;
if (typeof fieldCountPrevRow === 'undefined') if (typeof fieldCountPrevRow === 'undefined') {
{ fieldCountPrevRow = fieldCount;
fieldCountPrevRow = 0;
continue; continue;
} }
else if (fieldCount > 1) else if (fieldCount > 0) {
{
delta += Math.abs(fieldCount - fieldCountPrevRow); delta += Math.abs(fieldCount - fieldCountPrevRow);
fieldCountPrevRow = fieldCount; fieldCountPrevRow = fieldCount;
} }
@ -1279,11 +1329,11 @@ if (!Array.isArray)
if (preview.data.length > 0) if (preview.data.length > 0)
avgFieldCount /= (preview.data.length - emptyLinesCount); avgFieldCount /= (preview.data.length - emptyLinesCount);
if ((typeof bestDelta === 'undefined' || delta > bestDelta) if ((typeof bestDelta === 'undefined' || delta <= bestDelta)
&& avgFieldCount > 1.99) && (typeof maxFieldCount === 'undefined' || avgFieldCount > maxFieldCount) && avgFieldCount > 1.99) {
{
bestDelta = delta; bestDelta = delta;
bestDelim = delim; bestDelim = delim;
maxFieldCount = avgFieldCount;
} }
} }
@ -1297,7 +1347,7 @@ if (!Array.isArray)
function guessLineEndings(input, quoteChar) function guessLineEndings(input, quoteChar)
{ {
input = input.substr(0, 1024 * 1024); // max length 1 MB input = input.substring(0, 1024 * 1024); // max length 1 MB
// Replace all the text inside quotes // Replace all the text inside quotes
var re = new RegExp(escapeRegExp(quoteChar) + '([^]*?)' + escapeRegExp(quoteChar), 'gm'); var re = new RegExp(escapeRegExp(quoteChar) + '([^]*?)' + escapeRegExp(quoteChar), 'gm');
input = input.replace(re, ''); input = input.replace(re, '');
@ -1323,12 +1373,15 @@ if (!Array.isArray)
function addError(type, code, msg, row) function addError(type, code, msg, row)
{ {
_results.errors.push({ var error = {
type: type, type: type,
code: code, code: code,
message: msg, message: msg
row: row };
}); if(row !== undefined) {
error.row = row;
}
_results.errors.push(error);
} }
} }
@ -1350,8 +1403,7 @@ if (!Array.isArray)
var preview = config.preview; var preview = config.preview;
var fastMode = config.fastMode; var fastMode = config.fastMode;
var quoteChar; var quoteChar;
/** Allows for no quoteChar by setting quoteChar to undefined in config */ if (config.quoteChar === undefined || config.quoteChar === null) {
if (config.quoteChar === undefined) {
quoteChar = '"'; quoteChar = '"';
} else { } else {
quoteChar = config.quoteChar; quoteChar = config.quoteChar;
@ -1368,7 +1420,7 @@ if (!Array.isArray)
// Comment character must be valid // Comment character must be valid
if (comments === delim) if (comments === delim)
throw 'Comment character same as delimiter'; throw new Error('Comment character same as delimiter');
else if (comments === true) else if (comments === true)
comments = '#'; comments = '#';
else if (typeof comments !== 'string' else if (typeof comments !== 'string'
@ -1387,7 +1439,7 @@ if (!Array.isArray)
{ {
// For some reason, in Chrome, this speeds things up (!?) // For some reason, in Chrome, this speeds things up (!?)
if (typeof input !== 'string') if (typeof input !== 'string')
throw 'Input must be a string'; throw new Error('Input must be a string');
// We don't need to compute some of these every time parse() is called, // We don't need to compute some of these every time parse() is called,
// but having them in a more local scope seems to perform better // but having them in a more local scope seems to perform better
@ -1415,7 +1467,7 @@ if (!Array.isArray)
cursor += newline.length; cursor += newline.length;
else if (ignoreLastRow) else if (ignoreLastRow)
return returnable(); return returnable();
if (comments && row.substr(0, commentsLen) === comments) if (comments && row.substring(0, commentsLen) === comments)
continue; continue;
if (stepIsFunction) if (stepIsFunction)
{ {
@ -1439,7 +1491,7 @@ if (!Array.isArray)
var nextDelim = input.indexOf(delim, cursor); var nextDelim = input.indexOf(delim, cursor);
var nextNewline = input.indexOf(newline, cursor); var nextNewline = input.indexOf(newline, cursor);
var quoteCharRegex = new RegExp(escapeRegExp(escapeChar) + escapeRegExp(quoteChar), 'g'); var quoteCharRegex = new RegExp(escapeRegExp(escapeChar) + escapeRegExp(quoteChar), 'g');
var quoteSearch; var quoteSearch = input.indexOf(quoteChar, cursor);
// Parser loop // Parser loop
for (;;) for (;;)
@ -1495,15 +1547,27 @@ if (!Array.isArray)
continue; continue;
} }
if(nextDelim !== -1 && nextDelim < (quoteSearch + 1)) {
nextDelim = input.indexOf(delim, (quoteSearch + 1));
}
if(nextNewline !== -1 && nextNewline < (quoteSearch + 1)) {
nextNewline = input.indexOf(newline, (quoteSearch + 1));
}
// Check up to nextDelim or nextNewline, whichever is closest // Check up to nextDelim or nextNewline, whichever is closest
var checkUpTo = nextNewline === -1 ? nextDelim : Math.min(nextDelim, nextNewline); var checkUpTo = nextNewline === -1 ? nextDelim : Math.min(nextDelim, nextNewline);
var spacesBetweenQuoteAndDelimiter = extraSpaces(checkUpTo); var spacesBetweenQuoteAndDelimiter = extraSpaces(checkUpTo);
// Closing quote followed by delimiter or 'unnecessary spaces + delimiter' // Closing quote followed by delimiter or 'unnecessary spaces + delimiter'
if (input[quoteSearch + 1 + spacesBetweenQuoteAndDelimiter] === delim) if (input.substr(quoteSearch + 1 + spacesBetweenQuoteAndDelimiter, delimLen) === delim)
{ {
row.push(input.substring(cursor, quoteSearch).replace(quoteCharRegex, quoteChar)); row.push(input.substring(cursor, quoteSearch).replace(quoteCharRegex, quoteChar));
cursor = quoteSearch + 1 + spacesBetweenQuoteAndDelimiter + delimLen; cursor = quoteSearch + 1 + spacesBetweenQuoteAndDelimiter + delimLen;
// If char after following delimiter is not quoteChar, we find next quote char position
if (input[quoteSearch + 1 + spacesBetweenQuoteAndDelimiter + delimLen] !== quoteChar)
{
quoteSearch = input.indexOf(quoteChar, cursor);
}
nextDelim = input.indexOf(delim, cursor); nextDelim = input.indexOf(delim, cursor);
nextNewline = input.indexOf(newline, cursor); nextNewline = input.indexOf(newline, cursor);
break; break;
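The closing-quote check above compares a slice of `delimLen` characters against the delimiter instead of a single character. The sketch below, with made-up sample input, shows why a single-index comparison cannot match a multi-character delimiter. It uses `substring` for the slice, while the hunk itself uses `substr`.

```javascript
// Sample values for illustration only.
var input = '"c, e", d';
var delim = ', ';

// Position of the closing quote (search starts after the opening quote).
var closingQuote = input.indexOf('"', 1);
var afterQuote = closingQuote + 1;

// Old style: a single character can never equal a two-character delimiter.
var singleCharMatch = input[afterQuote] === delim;

// New style: compare a slice that is as long as the delimiter.
var sliceMatch = input.substring(afterQuote, afterQuote + delim.length) === delim;

console.log(singleCharMatch, sliceMatch);
```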
@ -1512,11 +1576,12 @@ if (!Array.isArray)
var spacesBetweenQuoteAndNewLine = extraSpaces(nextNewline); var spacesBetweenQuoteAndNewLine = extraSpaces(nextNewline);
// Closing quote followed by newline or 'unnecessary spaces + newLine' // Closing quote followed by newline or 'unnecessary spaces + newLine'
if (input.substr(quoteSearch + 1 + spacesBetweenQuoteAndNewLine, newlineLen) === newline) if (input.substring(quoteSearch + 1 + spacesBetweenQuoteAndNewLine, quoteSearch + 1 + spacesBetweenQuoteAndNewLine + newlineLen) === newline)
{ {
row.push(input.substring(cursor, quoteSearch).replace(quoteCharRegex, quoteChar)); row.push(input.substring(cursor, quoteSearch).replace(quoteCharRegex, quoteChar));
saveRow(quoteSearch + 1 + spacesBetweenQuoteAndNewLine + newlineLen); saveRow(quoteSearch + 1 + spacesBetweenQuoteAndNewLine + newlineLen);
nextDelim = input.indexOf(delim, cursor); // because we may have skipped the nextDelim in the quoted field nextDelim = input.indexOf(delim, cursor); // because we may have skipped the nextDelim in the quoted field
quoteSearch = input.indexOf(quoteChar, cursor); // we search for first quote in next line
if (stepIsFunction) if (stepIsFunction)
{ {
@ -1550,7 +1615,7 @@ if (!Array.isArray)
} }
// Comment found at start of new line // Comment found at start of new line
if (comments && row.length === 0 && input.substr(cursor, commentsLen) === comments) if (comments && row.length === 0 && input.substring(cursor, cursor + commentsLen) === comments)
{ {
if (nextNewline === -1) // Comment ends at EOF if (nextNewline === -1) // Comment ends at EOF
return returnable(); return returnable();
@ -1565,6 +1630,7 @@ if (!Array.isArray)
{ {
row.push(input.substring(cursor, nextDelim)); row.push(input.substring(cursor, nextDelim));
cursor = nextDelim + delimLen; cursor = nextDelim + delimLen;
// we look for next delimiter char
nextDelim = input.indexOf(delim, cursor); nextDelim = input.indexOf(delim, cursor);
continue; continue;
} }
@ -1625,7 +1691,7 @@ if (!Array.isArray)
if (ignoreLastRow) if (ignoreLastRow)
return returnable(); return returnable();
if (typeof value === 'undefined') if (typeof value === 'undefined')
value = input.substr(cursor); value = input.substring(cursor);
row.push(value); row.push(value);
cursor = inputLen; // important in case parsing is paused cursor = inputLen; // important in case parsing is paused
pushRow(row); pushRow(row);
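Several hunks above replace `substr(start, length)` with `substring(start, end)`. The two take different second arguments, so the end index has to be computed as `start + length`. A small sketch with sample input (the values are illustrative, not taken from the test suite):

```javascript
// Sample input: the second line starts with the comment character.
var input = 'a,b\n#note\nc,d';
var comments = '#';
var cursor = 4; // index of the first character of the second line

// substring takes (start, end), so the end index is cursor + length.
var isComment = input.substring(cursor, cursor + comments.length) === comments;

// Passing a length as the second argument is a mistake: substring swaps its
// arguments when start > end, so this reads indices 1 to 3 instead.
var wrongSlice = input.substring(cursor, comments.length);

console.log(isComment);
console.log(JSON.stringify(wrongSlice));
```

When the start index is 0, as in `row.substring(0, commentsLen)`, the length and the end index coincide, so no adjustment is needed there.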
@ -1687,26 +1753,12 @@ if (!Array.isArray)
} }
// If you need to load Papa Parse asynchronously and you also need worker threads, hard-code
// the script path here. See: https://github.com/mholt/PapaParse/issues/87#issuecomment-57885358
function getScriptPath()
{
var scripts = document.getElementsByTagName('script');
return scripts.length ? scripts[scripts.length - 1].src : '';
}
function newWorker() function newWorker()
{ {
if (!Papa.WORKERS_SUPPORTED) if (!Papa.WORKERS_SUPPORTED)
return false; return false;
if (!LOADED_SYNC && Papa.SCRIPT_PATH === null)
throw new Error( var workerUrl = getWorkerBlob();
'Script path cannot be determined automatically when Papa Parse is loaded asynchronously. ' +
'You need to set Papa.SCRIPT_PATH manually.'
);
var workerUrl = Papa.SCRIPT_PATH || AUTO_SCRIPT_PATH;
// Append 'papaworker' to the search string to tell papaparse that this is our worker.
workerUrl += (workerUrl.indexOf('?') !== -1 ? '&' : '?') + 'papaworker';
var w = new global.Worker(workerUrl); var w = new global.Worker(workerUrl);
w.onmessage = mainThreadReceivedMessage; w.onmessage = mainThreadReceivedMessage;
w.id = workerIdCounter++; w.id = workerIdCounter++;
@ -1741,7 +1793,7 @@ if (!Array.isArray)
for (var i = 0; i < msg.results.data.length; i++) for (var i = 0; i < msg.results.data.length; i++)
{ {
worker.userStep({ worker.userStep({
data: [msg.results.data[i]], data: msg.results.data[i],
errors: msg.results.errors, errors: msg.results.errors,
meta: msg.results.meta meta: msg.results.meta
}, handle); }, handle);
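After this change the worker path hands each step callback a single row in `data`, not a one-element array of rows. A stand-alone sketch of that dispatch loop, assuming a simplified results payload (`dispatchSteps` and the sample data are hypothetical):

```javascript
// Hypothetical dispatcher: calls userStep once per row, passing the row itself.
function dispatchSteps(results, userStep) {
    for (var i = 0; i < results.data.length; i++) {
        userStep({
            data: results.data[i], // a single row, not [row]
            errors: results.errors,
            meta: results.meta
        });
    }
}

var seen = [];
dispatchSteps(
    { data: [['A', 'b', 'c'], ['d', 'E', 'f']], errors: [], meta: {} },
    function(step) { seen.push(step.data); }
);

console.log(JSON.stringify(seen));
```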
@ -1770,7 +1822,7 @@ if (!Array.isArray)
} }
function notImplemented() { function notImplemented() {
throw 'Not implemented.'; throw new Error('Not implemented.');
} }
/** Callback when worker thread receives a message */ /** Callback when worker thread receives a message */

4
papaparse.min.js vendored

File diff suppressed because one or more lines are too long

2
player/player.html

@ -4,7 +4,7 @@
<title>Papa Parse Player</title> <title>Papa Parse Player</title>
<meta charset="utf-8"> <meta charset="utf-8">
<link rel="stylesheet" href="player.css"> <link rel="stylesheet" href="player.css">
<script src="http://ajax.googleapis.com/ajax/libs/jquery/2.1.0/jquery.min.js"></script> <script src="http://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
<script src="../papaparse.js"></script> <script src="../papaparse.js"></script>
<script src="player.js"></script> <script src="player.js"></script>
</head> </head>

51
tests/node-tests.js

@ -41,6 +41,42 @@ describe('PapaParse', function() {
assertLongSampleParsedCorrectly(Papa.parse(longSampleRawCsv)); assertLongSampleParsedCorrectly(Papa.parse(longSampleRawCsv));
}); });
it('Pause and resume works (Regression Test for Bug #636)', function(done) {
this.timeout(30000);
var mod200Rows = [
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Lorem ipsum dolor sit","42","ABC"],
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Etiam a dolor vitae est vestibulum","84"],
["Lorem ipsum dolor sit","42","ABC"],
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Lorem ipsum dolor sit","42","ABC"],
["Lorem ipsum dolor sit","42"]
];
var stepped = 0;
var dataRows = [];
Papa.parse(fs.createReadStream(__dirname + '/verylong-sample.csv'), {
step: function(results, parser) {
stepped++;
if (results)
{
parser.pause();
parser.resume();
if (results.data && stepped % 200 === 0) {
dataRows.push(results.data);
}
}
},
complete: function() {
assert.strictEqual(2001, stepped);
assert.deepEqual(mod200Rows, dataRows);
done();
}
});
});
it('asynchronously parsed CSV should be correctly parsed', function(done) { it('asynchronously parsed CSV should be correctly parsed', function(done) {
Papa.parse(longSampleRawCsv, { Papa.parse(longSampleRawCsv, {
complete: function(parsedCsv) { complete: function(parsedCsv) {
@ -126,6 +162,21 @@ describe('PapaParse', function() {
}); });
}); });
it('piped streaming CSV should be correctly parsed when header is true', function(done) {
var data = [];
var readStream = fs.createReadStream(__dirname + '/sample-header.csv', 'utf8');
var csvStream = readStream.pipe(Papa.parse(Papa.NODE_STREAM_INPUT, {header: true}));
csvStream.on('data', function(item) {
data.push(item);
});
csvStream.on('end', function() {
assert.deepEqual(data[0], { title: 'test title 01', name: 'test name 01' });
assert.deepEqual(data[1], { title: '', name: 'test name 02' });
done();
});
});
it('should support pausing and resuming on same tick when streaming', function(done) { it('should support pausing and resuming on same tick when streaming', function(done) {
var rows = []; var rows = [];
Papa.parse(fs.createReadStream(__dirname + '/long-sample.csv', 'utf8'), { Papa.parse(fs.createReadStream(__dirname + '/long-sample.csv', 'utf8'), {

3
tests/sample-header.csv

@ -0,0 +1,3 @@
title,name
test title 01,test name 01
,test name 02
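The second data row of this fixture starts with an empty field, which the piped streaming test expects to come back as `title: ''`. A naive sketch of header mapping for this particular file follows. It assumes no quoted fields or embedded delimiters, and `rowsToObjects` is a hypothetical helper, not how Papa Parse parses.

```javascript
// Hypothetical, simplified header mapping: split on newlines and commas only.
function rowsToObjects(text) {
    var lines = text.split('\n');
    var header = lines[0].split(',');
    return lines.slice(1).map(function(line) {
        var fields = line.split(',');
        var record = {};
        header.forEach(function(name, i) {
            record[name] = fields[i];
        });
        return record;
    });
}

var records = rowsToObjects('title,name\ntest title 01,test name 01\n,test name 02');

console.log(JSON.stringify(records));
```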

588
tests/test-cases.js

@ -7,6 +7,7 @@ if (typeof module !== 'undefined' && module.exports) {
var assert = chai.assert; var assert = chai.assert;
var BASE_PATH = (typeof document === 'undefined') ? './' : document.getElementById('test-cases').src.replace(/test-cases\.js$/, '');
var RECORD_SEP = String.fromCharCode(30); var RECORD_SEP = String.fromCharCode(30);
var UNIT_SEP = String.fromCharCode(31); var UNIT_SEP = String.fromCharCode(31);
var FILES_ENABLED = false; var FILES_ENABLED = false;
@ -322,6 +323,14 @@ var CORE_PARSER_TESTS = [
errors: [] errors: []
} }
}, },
{
description: "Line starts with unquoted empty field",
input: ',b,c\n"d",e,f',
expected: {
data: [['', 'b', 'c'], ['d', 'e', 'f']],
errors: []
}
},
{ {
description: "Line ends with quoted field", description: "Line ends with quoted field",
input: 'a,b,c\nd,e,f\n"g","h","i"\n"j","k","l"', input: 'a,b,c\nd,e,f\n"g","h","i"\n"j","k","l"',
@ -583,7 +592,7 @@ describe('Core Parser Tests', function() {
function generateTest(test) { function generateTest(test) {
(test.disabled ? it.skip : it)(test.description, function() { (test.disabled ? it.skip : it)(test.description, function() {
var actual = new Papa.Parser(test.config).parse(test.input); var actual = new Papa.Parser(test.config).parse(test.input);
assert.deepEqual(JSON.stringify(actual.errors), JSON.stringify(test.expected.errors)); assert.deepEqual(actual.errors, test.expected.errors);
assert.deepEqual(actual.data, test.expected.data); assert.deepEqual(actual.data, test.expected.data);
}); });
} }
@ -661,6 +670,14 @@ var PARSE_TESTS = [
errors: [] errors: []
} }
}, },
{
description: "Misplaced quotes in data twice, not as opening quotes",
input: 'A,B",C\nD,E",F',
expected: {
data: [['A', 'B"', 'C'], ['D', 'E"', 'F']],
errors: []
}
},
{ {
description: "Mixed slash n and slash r should choose first as precedent", description: "Mixed slash n and slash r should choose first as precedent",
input: 'a,b,c\nd,e,f\rg,h,i\n', input: 'a,b,c\nd,e,f\rg,h,i\n',
@ -715,6 +732,23 @@ var PARSE_TESTS = [
}] }]
} }
}, },
{
description: "Row with enough fields but blank field in the beginning",
input: 'A,B,C\r\n,b1,c1\r\na2,b2,c2',
expected: {
data: [["A", "B", "C"], ['', 'b1', 'c1'], ['a2', 'b2', 'c2']],
errors: []
}
},
{
description: "Row with enough fields but blank field in the beginning using headers",
input: 'A,B,C\r\n,b1,c1\r\n,b2,c2',
config: { header: true },
expected: {
data: [{"A": "", "B": "b1", "C": "c1"}, {"A": "", "B": "b2", "C": "c2"}],
errors: []
}
},
{ {
description: "Row with enough fields but blank field at end", description: "Row with enough fields but blank field at end",
input: 'A,B,C\r\na,b,', input: 'A,B,C\r\na,b,',
@ -725,11 +759,20 @@ var PARSE_TESTS = [
} }
}, },
{ {
description: "Header rows are trimmed when trimHeaders is set", description: "Header rows are transformed when transformHeader function is provided",
input: 'A,B,C\r\na,b,c', input: 'A,B,C\r\na,b,c',
config: { header: true, trimHeaders: true }, config: { header: true, transformHeader: function(header) { return header.toLowerCase(); } },
expected: { expected: {
data: [{"A": "a", "B": "b ", "C": "c"}], data: [{"a": "a", "b": "b", "c": "c"}],
errors: []
}
},
{
description: "transformHeader accepts an optional index attribute",
input: 'A,B,C\r\na,b,c',
config: { header: true, transformHeader: function(header, i) { return i % 2 ? header.toLowerCase() : header; } },
expected: {
data: [{"A": "a", "b": "b", "C": "c"}],
errors: [] errors: []
} }
}, },
@ -804,6 +847,16 @@ var PARSE_TESTS = [
errors: [] errors: []
} }
}, },
{
description: "Multi-character delimiter (length 2) with quoted field",
input: 'a, b, "c, e", d',
config: { delimiter: ", " },
notes: "The quotes must be immediately adjacent to the delimiter to indicate a quoted field",
expected: {
data: [['a', 'b', 'c, e', 'd']],
errors: []
}
},
{ {
description: "Callback delimiter", description: "Callback delimiter",
input: 'a$ b$ c', input: 'a$ b$ c',
@ -814,11 +867,11 @@ var PARSE_TESTS = [
} }
}, },
{ {
description: "Dynamic typing converts numeric literals", description: "Dynamic typing converts numeric literals and maintains precision",
input: '1,2.2,1e3\r\n-4,-4.5,-4e-5\r\n-,5a,5-2', input: '1,2.2,1e3\r\n-4,-4.5,-4e-5\r\n-,5a,5-2\r\n16142028098527942586,9007199254740991,-9007199254740992',
config: { dynamicTyping: true }, config: { dynamicTyping: true },
expected: { expected: {
data: [[1, 2.2, 1000], [-4, -4.5, -0.00004], ["-", "5a", "5-2"]], data: [[1, 2.2, 1000], [-4, -4.5, -0.00004], ["-", "5a", "5-2"], ["16142028098527942586", 9007199254740991, "-9007199254740992"]],
errors: [] errors: []
} }
}, },
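The expected values in the updated dynamic typing test suggest that integers are converted only when they lie strictly between -2^53 and 2^53, and are otherwise kept as strings so no digits are lost. A sketch of such a guard, assuming the input is already known to be numeric text (`toNumberIfSafe` is a hypothetical name, not the library's function):

```javascript
// Hypothetical guard: convert numeric text only when the value is inside the
// range where every integer is exactly representable as a double.
function toNumberIfSafe(text) {
    var limit = Math.pow(2, 53);
    var value = Number(text);
    if (value > -limit && value < limit) {
        return value;
    }
    return text; // keep the original string to preserve all digits
}

var tooLarge = toNumberIfSafe('16142028098527942586');
var maxSafe = toNumberIfSafe('9007199254740991');
var tooSmall = toNumberIfSafe('-9007199254740992');

console.log(typeof tooLarge, typeof maxSafe, typeof tooSmall);
```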
@ -912,6 +965,39 @@ var PARSE_TESTS = [
errors: [] errors: []
} }
}, },
{
description: "Custom transform accepts column number also",
input: 'A,B,C\r\nd,e,f',
config: {
transform: function(value, column) {
if (column % 2) {
value = value.toLowerCase();
}
return value;
}
},
expected: {
data: [["A","b","C"], ["d","e","f"]],
errors: []
}
},
{
description: "Custom transform accepts header name when using header",
input: 'A,B,C\r\nd,e,f',
config: {
header: true,
transform: function(value, name) {
if (name === 'B') {
value = value.toUpperCase();
}
return value;
}
},
expected: {
data: [{'A': "d", 'B': "E", 'C': "f"}],
errors: []
}
},
{ {
description: "Dynamic typing converts ISO date strings to Dates", description: "Dynamic typing converts ISO date strings to Dates",
input: 'ISO date,long date\r\n2018-05-04T21:08:03.269Z,Fri May 04 2018 14:08:03 GMT-0700 (PDT)\r\n2018-05-08T15:20:22.642Z,Tue May 08 2018 08:20:22 GMT-0700 (PDT)', input: 'ISO date,long date\r\n2018-05-04T21:08:03.269Z,Fri May 04 2018 14:08:03 GMT-0700 (PDT)\r\n2018-05-08T15:20:22.642Z,Tue May 08 2018 08:20:22 GMT-0700 (PDT)',
@ -921,6 +1007,15 @@ var PARSE_TESTS = [
errors: [] errors: []
} }
}, },
{
description: "Dynamic typing skips ISO date strings occurring in other strings",
input: 'ISO date,String with ISO date\r\n2018-05-04T21:08:03.269Z,The date is 2018-05-04T21:08:03.269Z\r\n2018-05-08T15:20:22.642Z,The date is 2018-05-08T15:20:22.642Z',
config: { dynamicTyping: true },
expected: {
data: [["ISO date", "String with ISO date"], [new Date("2018-05-04T21:08:03.269Z"), "The date is 2018-05-04T21:08:03.269Z"], [new Date("2018-05-08T15:20:22.642Z"), "The date is 2018-05-08T15:20:22.642Z"]],
errors: []
}
},
{ {
description: "Blank line at beginning", description: "Blank line at beginning",
input: '\r\na,b,c\r\nd,e,f', input: '\r\na,b,c\r\nd,e,f',
@ -1169,6 +1264,36 @@ var PARSE_TESTS = [
errors: [] errors: []
} }
}, },
{
description: "Pipe delimiter is guessed correctly choose avgFildCount max one",
notes: "Guessing the delimiter should work choose the min delta one and the max one",
config: {},
input: 'a,b,c\na,b,c|d|e|f',
expected: {
data: [['a', 'b', 'c'], ['a','b','c|d|e|f']],
errors: []
}
},
{
description: "Pipe delimiter is guessed correctly when first field are enclosed in quotes and contain delimiter characters",
notes: "Guessing the delimiter should work if the first field is enclosed in quotes, but others are not",
input: '"Field1,1,1";Field2;"Field3";Field4;Field5;Field6',
config: {},
expected: {
data: [['Field1,1,1','Field2','Field3', 'Field4', 'Field5', 'Field6']],
errors: []
}
},
{
description: "Pipe delimiter is guessed correctly when some fields are enclosed in quotes and contain delimiter characters and escaped quotes",
notes: "Guessing the delimiter should work even if the first field is not enclosed in quotes, but others are",
input: 'Field1;Field2;"Field,3,""3,3";Field4;Field5;"Field6,6"',
config: {},
expected: {
data: [['Field1','Field2','Field,3,"3,3', 'Field4', 'Field5', 'Field6,6']],
errors: []
}
},
{ {
description: "Single quote as quote character", description: "Single quote as quote character",
notes: "Must parse correctly when single quote is specified as a quote character", notes: "Must parse correctly when single quote is specified as a quote character",
@ -1384,6 +1509,22 @@ var PARSE_TESTS = [
data: [['a', 'b'], ['c', 'd'], [' , ', ','], ['" "', '""']], data: [['a', 'b'], ['c', 'd'], [' , ', ','], ['" "', '""']],
errors: [] errors: []
} }
},
{
description: "Quoted fields with spaces between closing quote and next delimiter and contains delimiter",
input: 'A,",B" ,C,D\nE,F,G,H',
expected: {
data: [['A', ',B', 'C', 'D'],['E', 'F', 'G', 'H']],
errors: []
}
},
{
description: "Quoted fields with spaces between closing quote and newline and contains newline",
input: 'a,b,"c\n" \nd,e,f',
expected: {
data: [['a', 'b', 'c\n'], ['d', 'e', 'f']],
errors: []
}
} }
]; ];
@ -1395,7 +1536,7 @@ describe('Parse Tests', function() {
if (test.expected.meta) { if (test.expected.meta) {
assert.deepEqual(actual.meta, test.expected.meta); assert.deepEqual(actual.meta, test.expected.meta);
} }
assert.deepEqual(JSON.stringify(actual.errors), JSON.stringify(test.expected.errors)); assert.deepEqual(actual.errors, test.expected.errors);
assert.deepEqual(actual.data, test.expected.data); assert.deepEqual(actual.data, test.expected.data);
}); });
} }
@ -1422,7 +1563,7 @@ var PARSE_ASYNC_TESTS = [
}, },
{ {
description: "Simple download", description: "Simple download",
input: "sample.csv", input: BASE_PATH + "sample.csv",
config: { config: {
download: true download: true
}, },
@ -1434,7 +1575,7 @@ var PARSE_ASYNC_TESTS = [
}, },
{ {
description: "Simple download + worker", description: "Simple download + worker",
input: "tests/sample.csv", input: BASE_PATH + "sample.csv",
config: { config: {
worker: true, worker: true,
download: true download: true
@ -1467,6 +1608,31 @@ var PARSE_ASYNC_TESTS = [
data: [['A','B','C'],['X','Y','Z']], data: [['A','B','C'],['X','Y','Z']],
errors: [] errors: []
} }
},
{
description: "File with a few regular and lots of empty lines",
disabled: !FILES_ENABLED,
input: FILES_ENABLED ? new File(["A,B,C\nX,Y,Z\n" + new Array(500000).fill(",,").join("\n")], "sample.csv") : false,
config: {
skipEmptyLines: "greedy"
},
expected: {
data: [['A','B','C'],['X','Y','Z']],
errors: []
}
},
{
description: "File with a few regular and lots of empty lines + worker",
disabled: !FILES_ENABLED,
input: FILES_ENABLED ? new File(["A,B,C\nX,Y,Z\n" + new Array(500000).fill(",,").join("\n")], "sample.csv") : false,
config: {
worker: true,
skipEmptyLines: "greedy"
},
expected: {
data: [['A','B','C'],['X','Y','Z']],
errors: []
}
} }
]; ];
@ -1476,7 +1642,7 @@ describe('Parse Async Tests', function() {
var config = test.config; var config = test.config;
config.complete = function(actual) { config.complete = function(actual) {
assert.deepEqual(JSON.stringify(actual.errors), JSON.stringify(test.expected.errors)); assert.deepEqual(actual.errors, test.expected.errors);
assert.deepEqual(actual.data, test.expected.data); assert.deepEqual(actual.data, test.expected.data);
done(); done();
}; };
@ -1582,6 +1748,12 @@ var UNPARSE_TESTS = [
config: { delimiter: ', ' }, config: { delimiter: ', ' },
expected: 'A, b, c\r\nd, e, f' expected: 'A, b, c\r\nd, e, f'
}, },
{
description: "Custom delimiter (Multi-character), field contains custom delimiter",
input: [['A', 'b', 'c'], ['d', 'e', 'f, g']],
config: { delimiter: ', ' },
expected: 'A, b, c\r\nd, e, "f, g"'
},
{ {
description: "Bad delimiter (\\n)", description: "Bad delimiter (\\n)",
notes: "Should default to comma", notes: "Should default to comma",
@ -1631,6 +1803,18 @@ var UNPARSE_TESTS = [
config: { quotes: [true, false, true] }, config: { quotes: [true, false, true] },
expected: '"Col1",Col2,"Col3"\r\n"a",b,"c"\r\n"d",e,"f"' expected: '"Col1",Col2,"Col3"\r\n"a",b,"c"\r\n"d",e,"f"'
}, },
{
description: "Force quotes around string fields only",
input: [['a', 'b', 'c'], ['d', 10, true]],
config: { quotes: function(value) { return typeof value === 'string'; } },
expected: '"a","b","c"\r\n"d",10,true'
},
{
description: "Force quotes around string fields only (with header row)",
input: [{ "Col1": "a", "Col2": "b", "Col3": "c" }, { "Col1": "d", "Col2": 10, "Col3": true }],
config: { quotes: function(value) { return typeof value === 'string'; } },
expected: '"Col1","Col2","Col3"\r\n"a","b","c"\r\n"d",10,true'
},
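The two tests above pass a function as `quotes`, so that only string values are wrapped in quotes. A simplified sketch of that idea for one row follows. It assumes a comma delimiter and double-quote escaping and ignores fields that would need quoting for other reasons. `formatRow` is a hypothetical helper.

```javascript
// Hypothetical row formatter: quotes a field only when the predicate says so.
function formatRow(row, shouldQuote) {
    return row.map(function(value) {
        var text = String(value);
        if (shouldQuote(value)) {
            // Double any embedded quote characters before wrapping.
            return '"' + text.replace(/"/g, '""') + '"';
        }
        return text;
    }).join(',');
}

var isString = function(value) { return typeof value === 'string'; };

var firstLine = formatRow(['a', 'b', 'c'], isString);
var secondLine = formatRow(['d', 10, true], isString);

console.log(firstLine);
console.log(secondLine);
```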
{ {
description: "Empty input", description: "Empty input",
input: [], input: [],
@ -1677,9 +1861,9 @@ var UNPARSE_TESTS = [
}, },
{ {
description: "Returns without rows with no content when skipEmptyLines is 'greedy'", description: "Returns without rows with no content when skipEmptyLines is 'greedy'",
input: [[null, ' '], [], ['1', '2']], input: [[null, ' '], [], ['1', '2']].concat(new Array(500000).fill(['', ''])).concat([['3', '4']]),
config: {skipEmptyLines: 'greedy'}, config: {skipEmptyLines: 'greedy'},
expected: '1,2' expected: '1,2\r\n3,4'
}, },
{ {
description: "Returns empty rows when empty rows are passed and skipEmptyLines is false with headers", description: "Returns empty rows when empty rows are passed and skipEmptyLines is false with headers",
@ -1698,7 +1882,87 @@ var UNPARSE_TESTS = [
input: [{a: null, b: ' '}, {}, {a: '1', b: '2'}], input: [{a: null, b: ' '}, {}, {a: '1', b: '2'}],
config: {skipEmptyLines: 'greedy', header: true}, config: {skipEmptyLines: 'greedy', header: true},
expected: 'a,b\r\n1,2' expected: 'a,b\r\n1,2'
} },
{
description: "Column option used to manually specify keys",
notes: "Should not throw any error when attempting to serialize key not present in object. Columns are different than keys of the first object. When an object is missing a key then the serialized value should be an empty string.",
input: [{a: 1, b: '2'}, {}, {a: 3, d: 'd', c: 4,}],
config: {columns: ['a', 'b', 'c']},
expected: 'a,b,c\r\n1,2,\r\n\r\n3,,4'
},
{
description: "Column option used to manually specify keys with input type object",
notes: "Should not throw any error when attempting to serialize key not present in object. Columns are different than keys of the first object. When an object is missing a key then the serialized value should be an empty string.",
input: { data: [{a: 1, b: '2'}, {}, {a: 3, d: 'd', c: 4,}] },
config: {columns: ['a', 'b', 'c']},
expected: 'a,b,c\r\n1,2,\r\n\r\n3,,4'
},
{
description: "Use different escapeChar",
input: [{a: 'foo', b: '"quoted"'}],
config: {header: false, escapeChar: '\\'},
expected: 'foo,"\\"quoted\\""'
},
{
description: "test default escapeChar",
input: [{a: 'foo', b: '"quoted"'}],
config: {header: false},
expected: 'foo,"""quoted"""'
},
{
description: "Escape formulae",
input: [{ "Col1": "=danger", "Col2": "@danger", "Col3": "safe" }, { "Col1": "safe=safe", "Col2": "+danger", "Col3": "-danger, danger" }, { "Col1": "'+safe", "Col2": "'@safe", "Col3": "safe, safe" }],
config: { escapeFormulae: true },
expected: 'Col1,Col2,Col3\r\n"\'=danger","\'@danger",safe\r\nsafe=safe,"\'+danger","\'-danger, danger"\r\n\'+safe,\'@safe,"safe, safe"'
},
{
description: "Don't escape formulae by default",
input: [{ "Col1": "=danger", "Col2": "@danger", "Col3": "safe" }, { "Col1": "safe=safe", "Col2": "+danger", "Col3": "-danger, danger" }, { "Col1": "'+safe", "Col2": "'@safe", "Col3": "safe, safe" }],
expected: 'Col1,Col2,Col3\r\n=danger,@danger,safe\r\nsafe=safe,+danger,"-danger, danger"\r\n\'+safe,\'@safe,"safe, safe"'
},
{
description: "Escape formulae with forced quotes",
input: [{ "Col1": "=danger", "Col2": "@danger", "Col3": "safe" }, { "Col1": "safe=safe", "Col2": "+danger", "Col3": "-danger, danger" }, { "Col1": "'+safe", "Col2": "'@safe", "Col3": "safe, safe" }],
config: { escapeFormulae: true, quotes: true },
expected: '"Col1","Col2","Col3"\r\n"\'=danger","\'@danger","safe"\r\n"safe=safe","\'+danger","\'-danger, danger"\r\n"\'+safe","\'@safe","safe, safe"'
},
{
description: "Escape formulae with single-quote quoteChar and escapeChar",
input: [{ "Col1": "=danger", "Col2": "@danger", "Col3": "safe" }, { "Col1": "safe=safe", "Col2": "+danger", "Col3": "-danger, danger" }, { "Col1": "'+safe", "Col2": "'@safe", "Col3": "safe, safe" }],
config: { escapeFormulae: true, quoteChar: "'", escapeChar: "'" },
expected: 'Col1,Col2,Col3\r\n\'\'\'=danger\',\'\'\'@danger\',safe\r\nsafe=safe,\'\'\'+danger\',\'\'\'-danger, danger\'\r\n\'\'+safe,\'\'@safe,\'safe, safe\''
},
{
description: "Escape formulae with single-quote quoteChar and escapeChar and forced quotes",
input: [{ "Col1": "=danger", "Col2": "@danger", "Col3": "safe" }, { "Col1": "safe=safe", "Col2": "+danger", "Col3": "-danger, danger" }, { "Col1": "'+safe", "Col2": "'@safe", "Col3": "safe, safe" }],
config: { escapeFormulae: true, quotes: true, quoteChar: "'", escapeChar: "'" },
expected: '\'Col1\',\'Col2\',\'Col3\'\r\n\'\'\'=danger\',\'\'\'@danger\',\'safe\'\r\n\'safe=safe\',\'\'\'+danger\',\'\'\'-danger, danger\'\r\n\'\'\'+safe\',\'\'\'@safe\',\'safe, safe\''
},
// new escapeFormulae values:
{
description: "Escape formulae with tab and carriage-return",
input: [{ "Col1": "\tdanger", "Col2": "\rdanger,", "Col3": "safe\t\r" }],
config: { escapeFormulae: true },
expected: 'Col1,Col2,Col3\r\n"\'\tdanger","\'\rdanger,","safe\t\r"'
},
{
description: "Escape formulae with tab and carriage-return, with forced quotes",
input: [{ "Col1": "\tdanger", "Col2": "\rdanger,", "Col3": "safe\t\r" }],
config: { escapeFormulae: true, quotes: true },
expected: '"Col1","Col2","Col3"\r\n"\'\tdanger","\'\rdanger,","safe\t\r"'
},
{
description: "Escape formulae with tab and carriage-return, with single-quote quoteChar and escapeChar",
input: [{ "Col1": "\tdanger", "Col2": "\rdanger,", "Col3": "safe, \t\r" }],
config: { escapeFormulae: true, quoteChar: "'", escapeChar: "'" },
expected: 'Col1,Col2,Col3\r\n\'\'\'\tdanger\',\'\'\'\rdanger,\',\'safe, \t\r\''
},
{
description: "Escape formulae with tab and carriage-return, with single-quote quoteChar and escapeChar and forced quotes",
input: [{ "Col1": "\tdanger", "Col2": "\rdanger,", "Col3": "safe, \t\r" }],
config: { escapeFormulae: true, quotes: true, quoteChar: "'", escapeChar: "'" },
expected: '\'Col1\',\'Col2\',\'Col3\'\r\n\'\'\'\tdanger\',\'\'\'\rdanger,\',\'safe, \t\r\''
},
]; ];
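The `escapeFormulae` tests above expect a leading apostrophe on values that start with `=`, `+`, `-`, `@`, a tab, or a carriage return, and no change when those characters appear later in the value. A sketch consistent with those expectations follows. The pattern and the `escapeFormula` helper are assumptions for illustration, and quoting of the result is left out.

```javascript
// Assumed pattern, inferred from the test expectations above: a value is
// treated as a potential formula when it starts with one of these characters.
var FORMULA_START = /^[=+\-@\t\r]/;

// Hypothetical helper: prefixes risky string values with an apostrophe.
function escapeFormula(value) {
    if (typeof value === 'string' && FORMULA_START.test(value)) {
        return "'" + value;
    }
    return value;
}

var samples = ['=danger', 'safe=safe', "'+safe", '\tdanger', 'safe\t\r'];
var escaped = samples.map(escapeFormula);

console.log(JSON.stringify(escaped));
```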
describe('Unparse Tests', function() { describe('Unparse Tests', function() {
@ -1727,6 +1991,151 @@ describe('Unparse Tests', function() {
var CUSTOM_TESTS = [ var CUSTOM_TESTS = [
{
description: "Pause and resume works (Regression Test for Bug #636)",
disabled: !XHR_ENABLED,
timeout: 30000,
expected: [2001, [
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Lorem ipsum dolor sit","42","ABC"],
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Etiam a dolor vitae est vestibulum","84"],
["Lorem ipsum dolor sit","42","ABC"],
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Etiam a dolor vitae est vestibulum","84","DEF"],
["Lorem ipsum dolor sit","42","ABC"],
["Lorem ipsum dolor sit","42"]
], 0],
run: function(callback) {
var stepped = 0;
var dataRows = [];
var errorCount = 0;
var output = [];
Papa.parse(BASE_PATH + "verylong-sample.csv", {
download: true,
step: function(results, parser) {
stepped++;
if (results)
{
parser.pause();
parser.resume();
if (results.data && stepped % 200 === 0) {
dataRows.push(results.data);
}
}
},
complete: function() {
output.push(stepped);
output.push(dataRows);
output.push(errorCount);
callback(output);
}
});
}
},
{
description: "Pause and resume works for chunks with NetworkStreamer",
disabled: !XHR_ENABLED,
timeout: 30000,
expected: ["Etiam a dolor vitae est vestibulum", "84", "DEF"],
run: function(callback) {
var chunkNum = 0;
Papa.parse(BASE_PATH + "verylong-sample.csv", {
download: true,
chunkSize: 1000,
chunk: function(results, parser) {
chunkNum++;
parser.pause();
if (chunkNum === 2) {
callback(results.data[0]);
return;
}
parser.resume();
},
complete: function() {
callback(new Error("Should have found matched row before parsing whole file"));
}
});
}
},
{
description: "Pause and resume works for chunks with FileStreamer",
disabled: !XHR_ENABLED,
timeout: 30000,
expected: ["Etiam a dolor vitae est vestibulum", "84", "DEF"],
run: function(callback) {
var chunkNum = 0;
var xhr = new XMLHttpRequest();
xhr.onload = function() {
Papa.parse(new File([xhr.responseText], './verylong-sample.csv'), {
chunkSize: 1000,
chunk: function(results, parser) {
chunkNum++;
parser.pause();
if (chunkNum === 2) {
callback(results.data[0]);
return;
}
parser.resume();
},
complete: function() {
callback(new Error("Should have found matched row before parsing whole file"));
}
});
};
xhr.open("GET", BASE_PATH + "verylong-sample.csv");
try {
xhr.send();
} catch (err) {
callback(err);
return;
}
}
},
{
description: "Pause and resume works for chunks with StringStreamer",
disabled: !XHR_ENABLED,
timeout: 30000,
// Test also with string as byte size may be different
expected: ["Etiam a dolor vitae est vestibulum", "84", "DEF"],
run: function(callback) {
var chunkNum = 0;
var xhr = new XMLHttpRequest();
xhr.onload = function() {
Papa.parse(xhr.responseText, {
chunkSize: 1000,
chunk: function(results, parser) {
chunkNum++;
parser.pause();
if (chunkNum === 2) {
callback(results.data[0]);
return;
}
parser.resume();
},
complete: function() {
callback(new Error("Should have found matched row before parsing whole file"));
}
});
};
xhr.open("GET", BASE_PATH + "verylong-sample.csv");
try {
xhr.send();
} catch (err) {
callback(err);
return;
}
}
},
{ {
description: "Complete is called with all results if neither step nor chunk is defined", description: "Complete is called with all results if neither step nor chunk is defined",
expected: [['A', 'b', 'c'], ['d', 'E', 'f'], ['G', 'h', 'i']], expected: [['A', 'b', 'c'], ['d', 'E', 'f'], ['G', 'h', 'i']],
@ -1755,13 +2164,93 @@ var CUSTOM_TESTS = [
}); });
} }
}, },
{
description: "Data is correctly parsed with steps",
expected: [['A', 'b', 'c'], ['d', 'E', 'f']],
run: function(callback) {
var data = [];
Papa.parse('A,b,c\nd,E,f', {
step: function(results) {
data.push(results.data);
},
complete: function() {
callback(data);
}
});
}
},
{
description: "Data is correctly parsed with steps (headers)",
expected: [{One: 'A', Two: 'b', Three: 'c'}, {One: 'd', Two: 'E', Three: 'f'}],
run: function(callback) {
var data = [];
Papa.parse('One,Two,Three\nA,b,c\nd,E,f', {
header: true,
step: function(results) {
data.push(results.data);
},
complete: function() {
callback(data);
}
});
}
},
{
description: "Data is correctly parsed with steps and worker (headers)",
expected: [{One: 'A', Two: 'b', Three: 'c'}, {One: 'd', Two: 'E', Three: 'f'}],
run: function(callback) {
var data = [];
Papa.parse('One,Two,Three\nA,b,c\nd,E,f', {
header: true,
worker: true,
step: function(results) {
data.push(results.data);
},
complete: function() {
callback(data);
}
});
}
},
{
description: "Data is correctly parsed with steps and worker",
expected: [['A', 'b', 'c'], ['d', 'E', 'f']],
run: function(callback) {
var data = [];
Papa.parse('A,b,c\nd,E,f', {
worker: true,
step: function(results) {
data.push(results.data);
},
complete: function() {
callback(data);
}
});
}
},
{
description: "Data is correctly parsed with steps when skipping empty lines",
expected: [['A', 'b', 'c'], ['d', 'E', 'f']],
run: function(callback) {
var data = [];
Papa.parse('A,b,c\n\nd,E,f', {
skipEmptyLines: true,
step: function(results) {
data.push(results.data);
},
complete: function() {
callback(data);
}
});
}
},
{ {
description: "Step is called with the contents of the row", description: "Step is called with the contents of the row",
expected: ['A', 'b', 'c'], expected: ['A', 'b', 'c'],
run: function(callback) { run: function(callback) {
Papa.parse('A,b,c', { Papa.parse('A,b,c', {
step: function(response) { step: function(response) {
callback(response.data[0]); callback(response.data);
} }
}); });
} }
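The hunks in this file replace `response.data[0]` with `response.data` inside `step` callbacks, which suggests that a step result now carries the row itself rather than a one-element array wrapping it. For calling code that has to tolerate both shapes, a small normalizing helper is one option. This is only a sketch: `rowFromStepResult` is a hypothetical name, not part of PapaParse, and the old-shape detection is an assumption drawn solely from the before/after lines shown in this diff.

```javascript
// Hypothetical helper (not part of PapaParse): returns the row from a step
// result whether `data` is the row itself (new side of this diff) or a
// one-element array wrapping the row (old side of this diff).
function rowFromStepResult(results) {
	var data = results.data;
	// Assumption: a wrapped row is a single-element array whose only element
	// is itself an array (plain row) or an object (row keyed by header).
	var wrapped = Array.isArray(data) &&
		data.length === 1 &&
		data[0] !== null &&
		typeof data[0] === 'object';
	return wrapped ? data[0] : data;
}
```

A `step` callback could then call `rowFromStepResult(response)` instead of indexing into `response.data` directly. A single-field row such as `['A']` is passed through unchanged, because its only element is a string.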
@ -1787,7 +2276,7 @@ var CUSTOM_TESTS = [
disabled: !XHR_ENABLED, disabled: !XHR_ENABLED,
run: function(callback) { run: function(callback) {
var updates = []; var updates = [];
Papa.parse("/tests/long-sample.csv", { Papa.parse(BASE_PATH + "long-sample.csv", {
download: true, download: true,
step: function(response) { step: function(response) {
updates.push(response.meta.cursor); updates.push(response.meta.cursor);
@ -1804,7 +2293,7 @@ var CUSTOM_TESTS = [
disabled: !XHR_ENABLED, disabled: !XHR_ENABLED,
run: function(callback) { run: function(callback) {
var updates = []; var updates = [];
Papa.parse("/tests/long-sample.csv", { Papa.parse(BASE_PATH + "long-sample.csv", {
download: true, download: true,
chunkSize: 500, chunkSize: 500,
step: function(response) { step: function(response) {
@ -1822,7 +2311,7 @@ var CUSTOM_TESTS = [
disabled: !XHR_ENABLED, disabled: !XHR_ENABLED,
run: function(callback) { run: function(callback) {
var updates = []; var updates = [];
Papa.parse("/tests/long-sample.csv", { Papa.parse(BASE_PATH + "long-sample.csv", {
download: true, download: true,
chunkSize: 500, chunkSize: 500,
worker: true, worker: true,
@ -1841,7 +2330,7 @@ var CUSTOM_TESTS = [
disabled: !XHR_ENABLED, disabled: !XHR_ENABLED,
run: function(callback) { run: function(callback) {
var updates = []; var updates = [];
Papa.parse("/tests/long-sample.csv", { Papa.parse(BASE_PATH + "long-sample.csv", {
download: true, download: true,
chunkSize: 500, chunkSize: 500,
chunk: function(response) { chunk: function(response) {
@ -1859,7 +2348,7 @@ var CUSTOM_TESTS = [
disabled: !XHR_ENABLED, disabled: !XHR_ENABLED,
run: function(callback) { run: function(callback) {
var updates = []; var updates = [];
Papa.parse("/tests/long-sample.csv", { Papa.parse(BASE_PATH + "long-sample.csv", {
download: true, download: true,
chunkSize: 500, chunkSize: 500,
chunk: function(response) { chunk: function(response) {
@ -1983,7 +2472,7 @@ var CUSTOM_TESTS = [
Papa.parse(new File(['A,B,C\nX,"Y\n1\n2\n3",Z'], 'sample.csv'), { Papa.parse(new File(['A,B,C\nX,"Y\n1\n2\n3",Z'], 'sample.csv'), {
chunkSize: 3, chunkSize: 3,
step: function(response) { step: function(response) {
updates.push(response.data[0]); updates.push(response.data);
}, },
complete: function() { complete: function() {
callback(updates); callback(updates);
@ -1998,7 +2487,7 @@ var CUSTOM_TESTS = [
var updates = []; var updates = [];
Papa.parse('A,b,c\nd,E,f\nG,h,i', { Papa.parse('A,b,c\nd,E,f\nG,h,i', {
step: function(response, handle) { step: function(response, handle) {
updates.push(response.data[0]); updates.push(response.data);
handle.abort(); handle.abort();
callback(updates); callback(updates);
}, },
@ -2028,7 +2517,7 @@ var CUSTOM_TESTS = [
var updates = []; var updates = [];
Papa.parse('A,b,c\nd,E,f\nG,h,i', { Papa.parse('A,b,c\nd,E,f\nG,h,i', {
step: function(response, handle) { step: function(response, handle) {
updates.push(response.data[0]); updates.push(response.data);
handle.pause(); handle.pause();
callback(updates); callback(updates);
}, },
@ -2047,7 +2536,7 @@ var CUSTOM_TESTS = [
var first = true; var first = true;
Papa.parse('A,b,c\nd,E,f\nG,h,i', { Papa.parse('A,b,c\nd,E,f\nG,h,i', {
step: function(response, h) { step: function(response, h) {
updates.push(response.data[0]); updates.push(response.data);
if (!first) return; if (!first) return;
handle = h; handle = h;
handle.pause(); handle.pause();
@ -2068,7 +2557,7 @@ var CUSTOM_TESTS = [
disabled: !XHR_ENABLED, disabled: !XHR_ENABLED,
run: function(callback) { run: function(callback) {
var updates = 0; var updates = 0;
Papa.parse("/tests/long-sample.csv", { Papa.parse(BASE_PATH + "long-sample.csv", {
worker: true, worker: true,
download: true, download: true,
chunkSize: 500, chunkSize: 500,
@ -2088,7 +2577,7 @@ var CUSTOM_TESTS = [
disabled: !XHR_ENABLED, disabled: !XHR_ENABLED,
run: function(callback) { run: function(callback) {
var updates = 0; var updates = 0;
Papa.parse("/tests/long-sample.csv", { Papa.parse(BASE_PATH + "long-sample.csv", {
download: true, download: true,
chunkSize: 500, chunkSize: 500,
beforeFirstChunk: function(chunk) { beforeFirstChunk: function(chunk) {
@ -2109,7 +2598,7 @@ var CUSTOM_TESTS = [
disabled: !XHR_ENABLED, disabled: !XHR_ENABLED,
run: function(callback) { run: function(callback) {
var updates = 0; var updates = 0;
Papa.parse("/tests/long-sample.csv", { Papa.parse(BASE_PATH + "long-sample.csv", {
download: true, download: true,
chunkSize: 500, chunkSize: 500,
beforeFirstChunk: function(chunk) { beforeFirstChunk: function(chunk) {
@ -2124,44 +2613,33 @@ var CUSTOM_TESTS = [
} }
}, },
{ {
description: "Should not assume we own the worker unless papaworker is in the search string", description: "Should correctly guess custom delimiter when passed delimiters to guess.",
disabled: typeof Worker === 'undefined', expected: "~",
expected: [false, true, true, true, true],
run: function(callback) { run: function(callback) {
var searchStrings = [ var results = Papa.parse('"A"~"B"~"C"~"D"', {
'', delimitersToGuess: ['~', '@', '%']
'?papaworker',
'?x=1&papaworker',
'?x=1&papaworker&y=1',
'?x=1&papaworker=1'
];
var results = searchStrings.map(function() { return false; });
var workers = [];
// Give it .5s to do something
setTimeout(function() {
workers.forEach(function(w) { w.terminate(); });
callback(results);
}, 500);
searchStrings.forEach(function(searchString, idx) {
var w = new Worker('../papaparse.js' + searchString);
workers.push(w);
w.addEventListener('message', function() {
results[idx] = true;
});
w.postMessage({input: 'a,b,c\n1,2,3'});
}); });
callback(results.meta.delimiter);
}
},
{
description: "Should still correctly guess default delimiters when delimiters to guess are not given.",
expected: ",",
run: function(callback) {
var results = Papa.parse('"A","B","C","D"');
callback(results.meta.delimiter);
} }
} }
]; ];
describe('Custom Tests', function() { describe('Custom Tests', function() {
function generateTest(test) { function generateTest(test) {
(test.disabled ? it.skip : it)(test.description, function(done) { (test.disabled ? it.skip : it)(test.description, function(done) {
if(test.timeout) {
this.timeout(test.timeout);
}
test.run(function(actual) { test.run(function(actual) {
assert.deepEqual(JSON.stringify(actual), JSON.stringify(test.expected)); assert.deepEqual(actual, test.expected);
done(); done();
}); });
}); });

4
tests/test.js

@ -5,8 +5,8 @@ var path = require('path');
var childProcess = require('child_process'); var childProcess = require('child_process');
var server = connect().use(serveStatic(path.join(__dirname, '/..'))).listen(8071, function() { var server = connect().use(serveStatic(path.join(__dirname, '/..'))).listen(8071, function() {
if (process.argv.indexOf('--phantomjs') !== -1) { if (process.argv.indexOf('--mocha-headless-chrome') !== -1) {
childProcess.spawn('node_modules/.bin/mocha-phantomjs', ['http://localhost:8071/tests/tests.html'], { childProcess.spawn('node_modules/.bin/mocha-headless-chrome', ['-f', 'http://localhost:8071/tests/tests.html'], {
stdio: 'inherit' stdio: 'inherit'
}).on('exit', function(code) { }).on('exit', function(code) {
server.close(); server.close();

7
tests/tests.html

@ -9,19 +9,14 @@
<script src="../node_modules/chai/chai.js"></script> <script src="../node_modules/chai/chai.js"></script>
<script>mocha.setup('bdd')</script> <script>mocha.setup('bdd')</script>
<script src="test-cases.js"></script> <script src="test-cases.js" id="test-cases"></script>
</head> </head>
<body> <body>
<div id="mocha"></div> <div id="mocha"></div>
<script> <script>
if (window.mochaPhantomJS) {
mochaPhantomJS.run();
} else {
mocha.checkLeaks(); mocha.checkLeaks();
mocha.run(); mocha.run();
}
</script> </script>
</body> </body>
</html> </html>

2
tests/verylong-sample.csv

@ -1,7 +1,7 @@
placeholder,meaning of life,TLD placeholder,meaning of life,TLD
Lorem ipsum dolor sit,42,ABC Lorem ipsum dolor sit,42,ABC
Etiam a dolor vitae est vestibulum,84,DEF Etiam a dolor vitae est vestibulum,84,DEF
Lorem ipsum dolor sit,42,ABC "Lorem ipsum dolor sit",42,ABC
Etiam a dolor vitae est vestibulum,84,DEF Etiam a dolor vitae est vestibulum,84,DEF
Etiam a dolor vitae est vestibulum,84,DEF Etiam a dolor vitae est vestibulum,84,DEF
Lorem ipsum dolor sit,42,ABC Lorem ipsum dolor sit,42,ABC
