Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve documentation regarding !ref #82

Closed
lszhu opened this issue Jul 9, 2014 · 12 comments
Closed

Improve documentation regarding !ref #82

lszhu opened this issue Jul 9, 2014 · 12 comments

Comments

@lszhu
Copy link

lszhu commented Jul 9, 2014

when I create a sheet with 49 columns, it can only fill the first 20000 rows.
this is the code:

var xlsx = require('xlsx');
var workbook = xlsx.readFile('template.xlsx');

var alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
var sheet = workbook.Sheets[workbook.SheetNames[0]];
//console.log(sheet.A1.v);
console.log(new Date());
for (var i = 2; i < 100000; i++) {
    for (var j = 0; j < 26; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet[alpha[j] + i] = {v: str, t: 's'};
    }
    for (; j < 49; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet['A' + alpha[j - 26] + i] = {v: str, t: 's'};
    }
}
xlsx.writeFile(workbook, 'rand.xlsx');
console.log(new Date());
@SheetJSDev
Copy link
Contributor

I ran your script against a really dumb template and it took roughly 2 minutes:

$ node --version
v0.10.29
$ xlsx --version
0.7.7
$ cat t.js
var xlsx = require('xlsx');
var workbook = xlsx.readFile('template.xlsx');

var alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
var sheet = workbook.Sheets[workbook.SheetNames[0]];
//console.log(sheet.A1.v);
console.log(new Date());
for (var i = 2; i < 100000; i++) {
    for (var j = 0; j < 26; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet[alpha[j] + i] = {v: str, t: 's'};
    }
    for (; j < 49; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet['A' + alpha[j - 26] + i] = {v: str, t: 's'};
    }
}
console.log(new Date());
sheet['!ref'] = "A1:AW100000"; // <-- i added this line to force it to write all of the cells
xlsx.writeFile(workbook, 'rand.xlsx');
console.log(new Date());
$ node t.js
Tue Jul 08 2014 22:24:55 GMT-0400 (EDT)
Tue Jul 08 2014 22:25:10 GMT-0400 (EDT)
Tue Jul 08 2014 22:26:47 GMT-0400 (EDT)

This is the generated file -- I warn you, it is big! The file is 262.5MB, which may explain why you are seeing issues? On a side note, it took more time to upload to dropbox than to generate the file.

Can you test the following:

  1. Add the line with the comment to your script and run

  2. Change the template so that it only has the header row, then run the script from above

  3. Generate CSV instead of XLSX

Also, can you share the template?

@lszhu
Copy link
Author

lszhu commented Jul 9, 2014

Thank you , it's ok now with the line you added.
but can you explain the line you add in detail? I can't find any documents yet.

@SheetJSDev
Copy link
Contributor

If this explanation makes sense, I'll add it to the readme:

The keys of a worksheet object are A-1 style references to the cell. For example, to get the cell object for B2, you would just access worksheet.B2 or worksheet['B2'].

There are special keys (all starting with !) that correspond to sheet metadata. In particular, !ref is the A-1 style cell range for the worksheet. For example, if the worksheet's range is A1:B2, there are 4 cells (A1, B1, A2, B2). The output functions iterate through the rows and columns as specified in the range. The relevant code segment is in the write_ws_xml_data function: https://github.com/SheetJS/js-xlsx/blob/master/bits/67_wsxml.js#L233

    ... range = safe_decode_range(ws['!ref']) ... // <-- this decodes the range into a range object
    for(var R = range.s.r; R <= range.e.r; ++R) { // <-- walk through each row.  range.s is the start cell (upper-left corner) and range.e is the end cell (lower-right corner)
...
        for(var C = range.s.c; C <= range.e.c; ++C) { // <-- walk through each column

I suspect your template had 20000 rows (or at least, that was the internal range storage). Your original script added cells but didn't update the range, which is why the output functions ignored them. If you want to see the range from your template, print it in the script:

var xlsx = require('xlsx');
var workbook = xlsx.readFile('template.xlsx');
var sheet = workbook.Sheets[workbook.SheetNames[0]];
console.log(sheet["!ref"]);

@lszhu
Copy link
Author

lszhu commented Jul 9, 2014

awesome!
my template was saved from an xls file which had 20000 rows.
the codes are really long and complex, a more detailed guide is preferred, thanks again.

@SheetJSDev
Copy link
Contributor

@lszhu heh if you think this is complex, the XLS parsing is much more intense

@SheetJSDev SheetJSDev changed the title can not process more than 20000 rows Improve documentation regarding !ref Jul 9, 2014
@SheetJSDev
Copy link
Contributor

@lszhu added a short explanation of !ref in the README: https://github.com/SheetJS/js-xlsx#worksheet-object

@vikash52
Copy link

vikash52 commented Apr 8, 2015

I am trying to convert html table to xls and its getting converted to xsl successfully. However, I am facing issues with formula. Formula in the html table is not visible in xsl file, I just see a value at the place of formula.

Refering this example https://github.com/SheetJS/js-xlsx/blob/master/tests/write.js , it doesn't have any implementation for cell formula in xlsx spreadsheet I tried using cell.f = "=SUM(A1+B1)" for the cell C1 and cell.v as the summation value which was 3. But i didn't succeed. With the exported file, when opened in MS excel, the cell contained just the data and when selected, didn't show any formula which i assigned in f(x) field.

Can someone post me a example which actually uses the functions/property '.f' and 'cellFormula'
Will be very helpful. I just need a working example with static values.

@aboodh95
Copy link

@vikash52 did you solve the issue or find a way to solve it

@Crusader4Christ
Copy link

@SheetJSDev
Sorry, but I can't understand this line:
sheet['!ref'] = "A1:AW100000"; // <-- i added this line to force it to write all of the cells
How did you get last range "AW100000"?

@reviewher
Copy link
Contributor

@Crusader4Christ AW100000 is the cell in the 49th column and 100000th row:

> XLSX.utils.decode_range("A1:AW100000")
{ s: { c: 0, r: 0 }, e: { c: 48, r: 99999 } }

The original case from @lszhu used 100k rows and 49 columns, but since the original file specified a range of 20000 cells the rest were omitted.

The range needs to be changed since the writer won't include cells outside of the worksheet's range. The wiki includes some examples of how to update the range: https://github.com/SheetJS/js-xlsx/wiki/General-Utility-Functions

@Crusader4Christ
Copy link

@reviewher So, if I need to work with OO Calc xlsx bigger than 65536 rows I should use update_sheet_range(ws) from https://github.com/SheetJS/js-xlsx/wiki/General-Utility-Functions?

@reviewher
Copy link
Contributor

@Crusader4Christ if you don't know the bottom-right address, update_sheet_range(ws) scans all of the addresses and fixes the worksheet. That is not the most efficient approach but it works in general

If you do know that address, add_to_sheet(ws, bottom_right_cell_address) is more efficient:

var nrows = 100000;
var ncols = 49;
var bottom_right_cell_address = XLSX.utils.encode_cell({r:nrows-1, c:ncols-1});
add_to_sheet(ws, bottom_right_cell_address);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants