Sheetjs: Writing Style information into Excel

Created on 5 Oct 2014  ·  167Comments  ·  Source: SheetJS/sheetjs

I am trying to preserve the style information while I read and then write an excel File.
Seems like the style information is read when I read the file, but unfortunately when I write the file its not preserved.

Thoughts on how I can read and write the style information?

            excelFile = XLSX.readFile(file.path, {
                cellStyles: true
            });

            // some processing here!

            XLSX.writeFile(excelFile, fileName, {
                cellStyles: true
            });

@sheetjs After a conversation with @elad (https://github.com/hubba/js-xlsx/commit/5e9bca78f2b0c54242cefc8a358f3151232941ab#commitcomment-8039333) I was informed that the styling information is only read, and not written back to excel.

Style

Most helpful comment

Where does this stand today as far as merging xlsx-style into this main project?

All 167 comments

@nukulb @elad @hubba After exploring this a bit more, I have a better sense for the roadmap. This is going to be a much larger discussion, but one that we've been putting off for far too long:

Even though we all know this, it should be reiterated: our ultimate goal is to devise a universal representation (referred to as the "Common Spreadsheet Format" in the READMEs) that works for all spreadsheet formats, not just XLSX. If you limit yourself to XLSX support, it's really easy to just persist everything, but there are really 4 different systems you must consider:

  • XLSX / XLSM (XML-based style representation): ECMA-376 Part 1 Section 18.8
  • XLSB (binary representation): [MS-XLSB] 2.1.7.50 Styles
  • XLS (binary): Follow the yellow brick road, starting from [MS-XLS] 2.4.353 XF
  • SpreadsheetML 2003 (XML): The style metadata are stored in the sub-document from the Styles elements. Since there is no OSP-blessed spec, files from the test suite should be used to understand the scope

Currently the code is spread across two libraries: js-xls and js-xlsx (which were developed separately due to licensing concerns that were eventually resolved)

The last N times this conversation came up, the discussion fizzled because the style information is highly linked and requires careful manipulation to preserve integrity across formats.

Pretty much every imaginable feature is persisted in at least 4 different ways (XLSX, XLSB, XLS, XLML). All formats use references to minimize data size. What is the best way to expose these features to the end users?

APIs

Originally, SheetJS was developed to solve a simple problem: "Can we reliably extract the data from an Excel file?" It's a relatively straightforward problem, and none of the hidden style complexities were relevant.

Since that point, the scope expanded quite a bit as it became apparent that we could do much more. The underlying approach allowed for all kinds of cool developments (since we had a unified format, we could build one xlsx writer that "just worked" with XLS and other inputs). But now the downsides are becoming clear.

Should we have an API? If so, what does it look like?

If we continue with the direct object manipulation approach, there are a few ways to go:

1) C-like "pointer" manipulation: storing numbers to be used as references in other structures.

2) Data duplication: store styles in many places, push the complexity to the writing functions

Which is preferable?

@nathanathan @amoki @mchapman @corsaronero @artemryzhov @nvcken @m1sta @kingjt @johnyesberg @ulknight @mgcrea @bolemeus @diginfo @sivyr @bmavity @djmax @christocracy @gcoonrod @clayzermk1 @jokerslab : Since you were involved in a discussion related to style, I'd like to hear your opinions on the matter (kinda sad to see so many names on this list). To focus the discussion:

A) should we build an easy-to-manipulate structure or introduce a series of functions to manipulate the object?

B) Should we be working at the attribute level (e.g. set_cell_bold or cell.bold) or use a bitfield (set_cell_style or cell.style)?

has anyone who know Aspose.Cells? I think we can reference from it about api structure style
Just a idea

In the long run I like the easy-to-manipulate structure approach. Create a very flexible abstract representation of a sheet and have readers/writers contain an approach which maps file formats to this abstract representation. When it comes to the abstract representation I definitley prefer cell.style. Seems more easily serializable. Everyone loves to be able to JSON.stringify.

From a practical perspective I'd most like to be able to take a template based approach. Maintaining a separate template for xlsx, xls, xlsb, and csv isn't a huge issue from a dev perspective. Having a simple consistent api to read, clone, and subtly modify worksheet, range, row, column, and cell objects within a loaded template, regardless of format, is what I was really hoping to see. This means functions that in the short term I'd prefer to see effort put into modifying existing objects.

I pretty much agree with @m1sta:

  • I prefer an easy-to-manipulate structure because I don't see any advantages in the opacity offered by functions; we still have to figure out a representation
  • I prefer cell.style because it isolates style-related properties making it easy to attach them to things (like cells or character ranges)

I also like the template approach, but I'm a little concerned about binary change when reading and writing the same file. Excel stores some style data as themes and references internal indices of colors and "taint," and I'm not entirely sure other (non-XLS) formats do the same. If we don't care about binary change - that is, we don't mind that the original file says "the color is aqua with a 40% taint" and the written file says "the color is some RGB value" - then I see no issue with going the template route.

What if all of the template related data was placed in an 'extended' property on each key item?

workbook: {props: {}, references: {}, extended: {}, 
   sheets: [0: {props: {}, ranges: {}, extended: {}, 
      rows: [0: {props: {}, extended: {}, 
         cells: [0: {value: 123, formula: null, 
            props: {style: {foreground: [{color: "#0055dd"}]}}, 
            extended: {xlsx: "the color is aqua with a 40% taint", xlsb: {}}

... or something similar, such that the original "the color is aqua with a 40% taint" data is re-used if the extended data exists for the file format being written?

Just to make sure I understand, do you mean keep the portable (for lack of better word) representation in props and retain original format-specific representation in extended so that when writing (back?) to that format, the original values can be used?

Yep.

Sounds good to me, especially if we add a toggle to disable it and rely strictly on the portable representation since I recall @SheetJSDev had concerns about introducing cell-level properties due to object size issues (but I might be remembering it wrong).

A 'write priority' key might be handy so that you can dictate whether to use props or extended (if available), and some kind of 'extended data index' to minimise object verbosity might be good too.

I agree.

Another interesting question for me is whether to separate format information (and other properties) from value information. This might allow for a more efficient data structure.

Store the formatting for the column once, indicate which cells it relates to, store the values as a dense array. Might need an additional, optional, 'compress' step?

That'd mean something more akin to...

values: [["First header", "Second header"], [123, 456]]
meta: {generic:{}, xlsx:{}, xlsb:{}}

... in memory, with some smarter support functions.

I think this is or similar to what @SheetJSDev referred to as the C-like "pointer" approach.

I'm not yet sure I prefer it over data duplication, which is a lot more natural pick for me if we go the easy-to-manipulate structure route. It might be though that I don't understand what you mean by a "compress" step. If we go the separate value/style storage way, what would adding a cell with style look like? what would modifying a cell's style look like?

Regarding the size issue: In the web browser, you can either do all of the heavy lifting in the main execution thread or with a Web Worker (http://dev.w3.org/html5/workers/). Objects cannot be shared between workers and the main thread, so the worker stringifies the intermediate object and the main thread parses it:

Reducing the stringified object size allows web workers to handle larger files.

Regarding values/metadata: As discussed in https://github.com/SheetJS/js-xlsx/issues/126, it's not always possible to map between Excel and JS data types easily. For rich text formatting see https://github.com/SheetJS/js-xlsx/issues/74 -- we still need to find an acceptable form.

@elad FYI The theme tint does not exist in the XLS format

As I see it, API definitely would be easier to document than the JS object representation

In #75 I proposed a style object:

{
    bold: true,
    font: 'Arial',
    size: 16,
    fg_color: '#000000',
    bg_color: '#ffffff',
    ...
}

And you (@SheetJSDev) pointed out to XLSB's "text run" concept.

Thinking out loud...

Let's also introduce a text run array:

[
    { f: from_offset, t: to_offset, s: style_object },
    ...
]

Then cell.s stays the style object and cell.x will be the text run array.

Without taking optimization into account just yet, this so far seems reasonable to me.

So, two questions:

  • Do you find any issues with the above?
  • Where does HTML enter the picture? (you mentioned it'll be the hard part)

Also agree with @m1sta regarding the API.

For cell.style, instead of inventing a new object, we should probably directly go for the official CSS spec, with a jQuery-like API (obviously with a strict subset support).

Something like:

myCell.css('background-color', 'red').css('color', 'black').css('font-size', '12px');
myCellB.style = myCell.style;
myCellC.css('color', myCell.css('color'));

For the future, I'd love to be able to easily export the worksheet structure to HTML/PDF along XLS*. I can imagine having some pre-defined (a bit like markdown styles) for our worksheets (GitHub spreadsheet, etc.).

+1 for CSS as the basis for the abstract styling model

Agreed. +1 for CSS here as well. If you're going to abstract away the
details of particular file formats and go with something uniform, CSS
really should be your model for this. There's hardly anything else that
universal to draw from.

On Mon, Oct 6, 2014 at 9:52 AM, m1sta [email protected] wrote:

+1 for CSS as the basis for the abstract styling model


Reply to this email directly or view it on GitHub
https://github.com/SheetJS/js-xlsx/issues/128#issuecomment-58019484.

I agree with using CSS as the model for the API as well. I'm not sure .css is a good name for the function though (is it misleading?), maybe .style is a better name if we store the actual object in .s. No strong preference either way.

How should we handle text runs? Styling can be applied to parts of a cell's text. For the setter extra parameters might work (cell.css(attr, value, [start_offset], [end_offset])), but the getter expects only one parameter. Should there instead be a .css_run function? Any other ideas?

Keep in mind Excel has support for double underline, and AFAICT there is no direct CSS equivalent.

For reference, the rich text types are enumerated in the test file https://github.com/SheetJS/test_files/blob/master/rich_text_stress.xlsx

I opened rich_text_stress.xlsx and noticed that it doesn't show any style for outline and shadow at least in Excel 2010. I also tried saving it as a web page and Excel notified me some features aren't compatible with the format, and sure enough there was no outline, shadow, double, or accounting underline styling. By the way, Preview on Mac OS X doesn't show subscript, superscript, outline, shadow, double, or accounting underline styling.

Maybe we should enumerate all of the style properties a cell can have before deciding how to represent and access them. Here's a quick table mapping some Excel styles to CSS, I will update it as necessary.

| Style | CSS |
| --- | --- |
| Normal | - / font-weight: normal; |
| Bold | font-weight: bold; |
| Italic | font-style: italic; |
| Underline | text-decoration: underline; text-underline-style: single; (?) |
| Size | font-size: <size>; |
| Strike | text-decoration: line-through; |
| Subscript | vertical-align: sub; font-size: smaller; |
| Superscript | vertical-align: super; font-size: smaller; |
| Outline | text-effect: outline; (?) |
| Shadow | text-shadow: auto; (?) |
| Double underling | text-decoration: underline; text-underline-style: double; (?) |
| Accounting underline | text-decoration: underline; text-underline-style: single-accounting; (?) |
| Doubt accounting underline | text-decoration: underline; text-underline-style: double-accounting; (?) |
| Background color | background-color: <color>; |
| Foreground color | color: <color>; |
| Pattern | ? |

I made that file in Excel 2011 and its possible that the windows versions do not support the missing forms.

Oddly, iOS numbers actually renders them: http://i.imgur.com/YsCJNgb.jpg

IIRC excel actually uses a CSS attribute like text-decoration for the features not supported in the browser (with nonstandard values). We could just replicate that

As far as text runs are concerned, we could mirror the VBA interface as I noted in https://github.com/SheetJS/js-xlsx/issues/75#issuecomment-49314941

Sheets("Sheet1").Range("B3").Characters(16,9).Font.Italic = True

(Requesting a set of characters would return an object whose getters and setters work with the original text)

For the few styles that would not directly match the css spec we could use custom vendor prefixes:

text-decoration: -ms-xlsx-accounting-underline;

Like what we have to use today for flexbox:

display: -webkit-flex;

I really don't like the idea of using custom vendor prefixes for internal style representation. :/

The goal as I understand it is to create a common format, that is, an internal format that can be externalized to XLSX, HTML, PDF, etc. If CSS isn't a (near-)perfect answer to the question of internal representation, then we're abusing it. By introducing vendor-specific prefixes into the internal representation I would argue we go against the "common format" concept.

Consider the object pollution you might get. How many vendors are there? Do we want five or six lines just to be able to add accounting underline? What if there are no equivalents for one or more vendors? It also doesn't solve the problem of an actual API. Instead of cell.s.underline = true; or cell.style('underline', true); you get:

cell.css({
    'text-decoration': '-ms-xlsx-accounting-underline'
    ...
});

So I'm again not entirely convinced CSS is what we want here for either API or internal representation.

@elad, vendor prefixes would only be used for xls(*) styles not easily matched by pure css props. I don't see where we would end up with multiple lines, there would only be one specific custom vendor -ms-xls. For the HTML export we could automatically convert theses "missing" CSS prefixed values to classes, that could be handled on the client/theme side (with CSS).

Anyway, that was just a quick idea, so it might not be the best thing to do.

I see what you mean about multiple lines - if the internal representation agrees on a single vendor prefix then sure, we use that. A vendor-specific style ("accounting underline") is mapped to a vendor-specific CSS property ("-ms-xlsx-accounting-underline") invented to support it.

Still, I don't think using a CSS vendor prefix for internal common representation is a good design choice. I think the VBA API mentioned by @SheetJSDev is much cleaner and easier to use. Doesn't it also make more sense to mimic an API that already exists to work with Excel style properties rather than jQuery?

@elad being written in JavaScript, I'd say that this project's main audience is clearly web developers, that for the most part have some jQuery-like experience. VBA-fluent developers are getting very rare (at least in my Web/NodeJS area).

That's a fair point, but I don't think jQuery's popularity should influence all APIs designed for the web and/or node.js. jQuery is a moving target and there's a constant back-and-forth of ideas and concepts, for example Ajax promises.

In any case, considering CSS doesn't offer a direct mapping to cell/text styles, requires vendor-specific properties and values, and in my opinion looks a lot clunkier than an existing API that does the same, I'm not convinced it's a good choice here for either internal representation or function interface.

Now letting others chime in and @SheetJSDev can decide. :)

@elad being written in JavaScript, I'd say that this project's main audience is clearly web developers

@mgcrea There are strategic reasons unrelated to the audience. Writing code to solve individual features is relatively straightforward. The hard part is finding real-world files and strange corner cases. The neat thing about a JavaScript and HTML5 solution is that pretty much anyone can try it on their files (there are no security issues because the files and data are never sent to a server, and there are no installation issues since no external plugins are required). If we started this in C or python or Java, people would either have to install something or send files to a remote server (and we would have far fewer testers). And of course, thanks to node, we can also write server processes and neat tools like the command-line "j".

I don't think jQuery's popularity should influence all APIs designed for the web and/or node.js

There is no real culture of JS in areas like scientific computing or data analysis, so we are starting from a blank slate. Since future projects may turn to our example, it's better to discuss now.

With regard to formatting substrings, I agree with @elad (I think). Assume the property in question is an array, always. Very easy to [].join()

This feels like another situation where storing the values and the formatting in two separate but mirrored structures makes sense to me. In the values array we might see a cell represented as ["First", "Second"] and then in the formatting array we might see ["color: blue", "color: red"] or ["<span style='color:red'>{#}</span>", "<span style='color:red'>{#}</span>"]. You could just as easily assume direct CSS as the format if the first character in the format string isn't an <.

Also, I said CSS earlier, but I wonder whether we should be thinking LESS instead?

Here are some useful links from Excel's API documentation:

In the example code posted @SheetJSDev posted:

Sheets("Sheet1").Range("B3").Characters(16,9).Font.Italic = True

The Range object is returned by Range("B3") and the Characters object is returned by Characters(16,9). The Range object contains both Font and Interior. The Characters object contains only Font. The Interior object seems to be cell-level styling and is where the pattern is kept. The Font object has the stuff we discussed above.

This feels like a very clean and simple API. Why do we insist on either CSS or LESS for this? :)

@SheetJSDev, I'm interested in your thoughts on this since I assume you're most familiar with the actual specification (not just Excel's implementation of it).

Finally in front of a computer :)

Using Excel 2011 with the rich_text_stress.xlsx file, copy the cells A1:B15 and take a peek at the clipboard (using https://www.npmjs.org/package/pb, run pb html). I've saved the content to a gist

@elad @mgcrea Excel encodes the text attributes as follows:

  • bold: font-weight:700 (normally 400)
  • italics: font-style: italic
  • underline: text-decoration:underline; text-underline-style:single;
  • font size: font-size
  • strike-through: <s> tag
  • subscript: <sub> tag
  • superscript: <sup> tag
  • outline: text-effect:outline;
  • shadow: text-shadow:auto;
  • double-underline: text-decoration:underline; text-underline-style:double;
  • accounting: text-decoration:underline; text-underline-style:single-accounting;
  • double-accounting: text-decoration:underline; text-underline-style:double-accounting;

In fact, it appears that they use the CSS class names xl### where the number directly corresponds to style records and they use the vendor prefix mso-

@m1sta The major issue I have with storing the formatting information separately from the text information is that keeping a consistent structure seems to be messier than necessary. Consider the text "foobar" where "bar" is bolded. Now you go back and change the underlying text to "foobarbazquz". How does the style update? What happens if the style is inconsistent with the underlying text?

A direct translation of how XLS and XLSB store rich text would look like this:

var text_run = [
  { t: "foo", s: CELL_BOLD | CELL_ITALICS },
  { t: "bar", s: CELL_BOLD },
  { t: "baz", s: CELL_OUTLINE },
  { t: "qux", s: CELL_DOUBLE_UNDERLINE }
]

That way, the raw text would be text_run.map(function(x) { return x.t; }).join("").

@elad The Excel object model is pretty sensible and works neatly if you stick to assignment. However, getting properties isn't quite as neat. For example, consider the text "foobar" where "bar" is bolded but "foo" is not. Is the substring "ob" bold or not?

@SheetJSDev Before the change I'd assume you'd have a structure like this...

cellData = ["foo", "bar"]
cellStyle = [null, "font-weight:bold"]

Assuming you're directly modifying the data structure, and not using the api, you'd probably end up with...

cellData = ["foobarbazquz"]
cellStyle = [null, "font-weight:bold"] 
//or maybe cellStyle = [null, ".cellBold"] if a class had been defined

The second item in the cellStyle array is now temporarily redundant. When the data structure is serialised you could have a flag to control whether redundant cell formatting was persisted or discarded. Alternatively you could...

cellData = ["foo","barbazquz"]
cellStyle = [null, "font-weight:bold"] 

The difference, I think, is very easy to grasp.

The best thing about this is that a new developer could learn to read/modify files lickity split, then learn about formatting later. I suspect parsing the structure would be faster too (at least in v8).

@SheetJSDev some of those CSS styles seem non-standard and aren't supported by at least Chrome, Firefox, and Safari. In other words, we would be encouraging developers to use non-standard and vendor-specific CSS...

I agree that separating value and style might cause an unnecessary mess and is a lot harder to keep in sync than just style that's directly attached to objects.

@elad The Excel object model is pretty sensible and works neatly if you stick to assignment. However, getting properties isn't quite as neat. For example, consider the text "foobar" where "bar" is bolded but "foo" is not. Is the substring "ob" bold or not?

In Word and Excel this is determined by the first character in the selection. In your example, "ob" isn't bold because "o" isn't bold. If you select "ob" and hit ctrl-b, "ob" will become bold. If you do the other half - that is, "foobar" with "foo" bolded and "bar" not - then "ob" will be bold, and ctrl-b will make it normal again.

The algorithm could look like:

  • Given "foobar" with text run (3, 5) -> bold (for "bar" being bold)
  • Fetch text run for (2, 3) ("ob")
  • Find text run that contains index 2 (not found because there isn't one)
  • Return style from cell (normal)

Likewise:

  • Given "foobar" with text run (0, 2) -> bold (for "foo" being bold)
  • Fetch text run for (2, 3) ("ob")
  • Find text run that contains index 2 (found, style is bold)
  • Return style from text run (bold)

Guys thank you for your great work! Sorry for disturbing your discussion. I read several similar topics here about styling xlsx (missing it very much), but I didn't figure out are there any plans to implement such a functionality.
And these plans exists are there any ideas when? :)

In my case the task is to keep original styles of template after some data put and saved.

Style is one of two issue scopes (the other being dates) that keep coming up, so my guess is that it's definitely going to be addressed. @SheetJSDev hinted at it being worked on actively, so the best I could say is "soon."

I hope to make some time soon to work it as well. The only thing that's unclear to me is how to handle the stuff that's currently being written as style - it's a huge hardcoded string and some of it must be dynamic. Everything else (internal representation, writing the style back) is easy.

I tried to understand everything you guys wrote here, but my question is simple: is it possible to set columns width in xlsx files I write?

Hi guys, any update on the style issue?

Is it possible to add cell styles to new sheets?

Hi guys, I was wondering if there is any updates on this?
Thanks.

:+1:

Even a limited functionality to save some type of formatting (e.g. bold, italic, color, font size, fill, border) would be extremely useful, even if it didn't span the universe of possible input formats that might be obtained from parsing.

Cases like "foobar" with "ob" bolded seem like edge cases compared to the alternatives of no formats at all. This would even be useful if implemented in a bootleg fork of this library, providing a means of addressing the use case without incurring long-term promises.

+1

There are three awesome Node.js Excel libraries which collectively offer key features, but choosing one is like playing "Rock-Paper-Scissors":

  • js-xlsx: Can handle numbers and number formats, but can't write cell styles, marges, or charts
  • msexcel-builder: Can style, merge and border cells but can't write numbers or number formats
  • officegen: Can write numbers and embed Excel with chart objects in PPT, but no styles, merges

As a foundation for generating Excel documents, JS-XLSX seems like a great choice, as it has a very clean design philosophy, rigorous attention to licensing and IP, detailed testing and an active development community.

Would love to add the ability to write some minimal styling -- font weight, font color, fill color, merge, and borders -- and offer that as a pull request.

However, per the issue above there has a been a substantial bit of thought put to this already, and developing this is outside my focus.

Rather than take a proverbial gift horse and then ask for a gold [italic, Helvetica, bordered] bridle on it, my thought is to sponsor dev of this feature as a Freelancer project that yields a useful public contribution to this project.

Open to any suggestions.

@gradualstudent and everyone else: sorry for the apparent lack of public activity on this area (reading and writing cell styles, text styles, etc). A quick patch for specific features like bold and italics is pretty straightforward (https://github.com/SheetJS/js-xlsx/issues/74#issuecomment-46981775 gives a rough outline).

on a side note: there are a set of round-trip tests https://github.com/SheetJS/js-xlsx/blob/master/test.js#L696 which read a file, write it, read again and confirm that certain properties are the same. A similar set of tests should be written for styles and themes and other features.

Aha, it seems very feasible to follow that comment in issue #74.

For a Common Spreadsheet Format, using CSS seems great (per @elad Oct 6 comment above) and inside the code translate these to Excel keywords per the Oct 7 @SheetJSDev comment above).

As a start, I put together a simple example case for testing using CSS labels for the .s attribute:
https://gist.github.com/pietersv/d931cdf8cc9dc48919b4

Have extended the library to allow fairly general formats, at https://github.com/protobi/js-xlsx. This allows authoring a new document with cell styles specified in the .s class as well as preserve styles read from XLSX.parseFile(...).

Also extended your helpful Gist at https://gist.github.com/SheetJSDev/88a3ca3533adf389d13c into a Workbook convenience class at https://github.com/protobi/workbook.

Example

var XLSX = require('xlsx');
var Workbook = require('./workbook')(XLSX);

var workbook = new Workbook()
    .addRowsToSheet("Main", [
      ["This is a merged cell"],
      [
        {"v": "Blank"},
        {"v": "Red", "s": {fill: { fgColor: { rgb: "FFFF0000"}}}},
        {"v": "Green", "s": {fill: { fgColor: { rgb: "FF00FF00"}}}},
        {"v": "Blue", "s": {fill: { fgColor: { rgb: "FF0000FF"}}}}
      ],
      [
        {"v": "Default"},
        {"v": "Arial", "s": {font: {name: "Arial", sz: 24}}},
        {"v": "Times New Roman", "s": {font: {name: "Times New Roman", sz: 16}}},
        {"v": "Courier New", "s": {font: {name: "Courier New", sz: 14}}}
      ],
      [
        0.618033989,
        {"v": 0.618033989},
        {"v": 0.618033989, "t": "n"},
        {"v": 0.618033989, "t": "n", "s": { "numFmt": "0.00%"}},
        {"v": 0.618033989, "t": "n", "s": { "numFmt": "0.00%"}, fill: { fgColor: { rgb: "FFFFCC00"}}},
        [(new Date()).toLocaleString()]
      ]
    ]).mergeCells("Main", {
      "s": {"c": 0, "r": 0 },
      "e": {"c": 2, "r": 0 }
    }).finalize();
XLSX.writeFile(workbook, '/tmp/wb.xlsx');

Read and write

It's also possible to read and re-write a document:

var wb = XLSX.readFile(__dirname + '/wb.xlsx', {cellStyles: true});
XLSX.writeFile(wb, __dirname + '/wb-1.xlsx');

Some info gets lost on parsing styles. It seems like only fill colors get expressed in the CellXf styles. Thus information about fonts, borders and number formats gets lost when reading. I recommend that we extend CSF such that cell styles have four attributes as below.

cell.s = {  "font": {}, "fill": {}, "numFmt": {}, border: {} }

I convert cell.s = { fill: cell.s } for backward compatibility with the current parseFile() (only if the style is an object, has patternStyle or fgColor and doesn't have font, border or numFmt attributes).

Known gaps

  • This doesn't (yet) implement borders, but at least now there's a framework to add it.
  • No support for number formats beyond the built in ones
  • Unit tests not written yet.

CSS vs JS

Now that i've gotten into it I can see the issues you've been discussing earlier. CSS seems like such a natural way to express cell styles. CSS is a style language. We'd like to abstract away from the details of OpenXML. Most of the cell styles have natural CSS equivalents. And Excel represents cell styles in way analogous to CSS classes to avoid redundancy.

Excel has some concepts not in CSS:

  • fgColor: { theme: "3", tint: -0.19"} We could probably treat theme colors (e.g. "dk1") like named colors in HTML, and make up an attribute for tint.
  • Number Formats: we might make our own attribute
  • Fill colors: Excel fill colors have both an fgColor and a bgColor attribute. Maybe we ignore one?

But the current parseFile already works pretty well and has a format that paralells Excel nicely. For now I recommend sticking with the current CSF language, complete supprt for reading font, border and numFmt attributes when parsing an existing element. We can later convert to CSS but that's bigger.

@pietersv this is great work! and the documentation is perfect! Thank you

@pietersv excellent work! (was it straightforward?)

The "fun" part comes when trying to work with the other formats, but I do see tremendous value in experimenting with an XLSX-specific version.

Thanks! The SheetsJS code is elegant and concise which made it a real pleasure to map out and learn from.

There are minimal changes (under 10 lines) to xlsx.js to:

  • call a new version of write_sty_xml(...)
  • fall back to old write_sty_xml(...) if not CommonJS
  • replace out cell.s values with an integer index into the <cellXfs> list,
  • accept an optional themeXml in the options object in lieu of the default xml string for the theme.

The major change is adding a new StyleBuilder class which reads the XLSX CSF object and exposes toXML() and getStyleIndex(format) methods. Right now it's in a separate file.

This tiny bit makes me appreciate what it must have taken to build this great library. Who knew that...

  • Excel won't open a workbook if the second fill style doesn't have a patternType="gray125"?
  • Excel warns on opening when there's both an rgb and a tint attribute on the fgColor attribute?
  • Some of the 100+ built-in cell number formats are subtly different between OpenXml and Office?

Agree this branch is experimental. Open to suggestions. Looking forward to using this in practice and addressing feedback, with aim of readying it for a pull request.

@pietersv by "experimenting with an XLSX-specific version" I didn't intend to belittle what you have done. I meant that the ultimate solution will have to support reading XLS and writing XLSX (which is a much larger can of worms)

@pietersv, @SheetJSDev Can we use CSS and private extensions for stuff that's not standard?

I recall someone suggesting we use Microsoft's private CSS extensions for these styles (if they're exportable to HTML/CSS using Excel's Save As). That could be one approach, another could be something like -sheetjs-tint: -0.19 etc.

See my earlier comment about possible styles - I don't think the proposed four attributes will be enough, and I'd really like to avoid inventing yet another style language if possible. :)

@pietersv I saw your recent posts and was excited to see you had begun implementing styles for the XLSX format, but quickly discovered your solution depends on at least some libraries for NodeJS (e.g., cheerio). Any chance you might update it to support browser-based execution as well?

Thanks!

@augurer Let me noodle on that. The only dependency is cheerio which is just used as a shortcut to handcrafting the XML. It uses just four few of its methods: $('<el>') to create a node, $doc('el') to select a node, $doc.append(el) to append and $el.attr() to get/set attribute values. I wonder if there is a way to embed cheerio. Alternative is to rewrite StyleBuilder.js to build strings directly, which is certainly possible as that's how the rest of the code works and is impressively hardcore. I think it should be possible to write four tiny methods that do these four steps, using shortcuts rather than full dom parsing.

@pietersv That would be amazing! I started going down the path of modifying code to make it work in the browser and am currently stuck at cheerio.load(baseXml, { xmlMode: true });. I tried using jQuery's parseXML function, which works (maybe?) but then var $fonts = this.$styles('fonts'); fails with the error "function expected".

I'll table this for now and move on to implementing another feature -- please let me know if you develop a solution as I'd love to bold and center text in cells!

@pietersv directly writing strings is preferable if you don't expect to inspect the xml structure later (you can avoid that by explicitly storing the necessary metadata, like how the shared string table is built and searched when writing) If you plan on making changes you should store the intermediate strings in an array, manipulate the parts that need to be changed, and then join at the end.

@SheetJSDev Concur that building the strings directly is better and consistent with rest of codebase.

@augurer For quick updated this now so it falls back to jQuery to build the XML if require and module are not defined, and included it in the main xlsx.js file. The class itself generates XML in the browser, and I ran make test and make dist the test suite, but I haven't tested its ability to generate XLSX in the browser. Message me directly if any issues. Centering cell values is not implemented yet.

@elad Will allow format object to accept either CSF or CSS objects. Will duck type and convert to the current CSF format generated by parseFile() so the following would be almost equivalent. When we settle on a format, can later convert parseFile to store that directly.

{
   font: { name: 'Arial', sz: 12, color: {rgb: "#FFFFAA00"}, bold: true, italic: true, underline: true}, 
   fill: {fgColor: { theme: 3, tint: -0.25; },  patternType: 'solid'}, 
   border: {},
   numFmt: "0.00%;-0.00%;-;@"
}

and

{ 
   font-family: "Arial",
   font-size: "12px", 
   font-weight: bold,
   text-decoration: underline,
   font-style: italic,
   color: "#FFAA00",
   background-color: "dk1", 
   -sheetjs-background-tint: "-0.25",
   -sheetjs-number-format: "0.00%;-0.00%;-;@"
}

@pietersv Thanks a lot, you rock! I'll replace my files and give it a try tomorrow or Sunday.

@SheetJSDev Figured out the make process and made code edits to select entries in the bits/ folder. I wanted to add a new entry "91_style_builder" but added the class definition to "90_utils" instead.

This passes 15,532 tests in the browser suite with 0 failures. I don't know if the new styles actually get written in the browser, will need to write new tests for that. It passes 11,998 tests tests in the server suite then fails 16 round trip tests on preserving date formats, that'll take more thought.

@augurer added alignment as a fifth attribute, with horizontal, vertical, indent and wrapText as subattributes:

   [ // alignment
        {v: "left", "s": { alignment: {horizontal: "left"}}},
        {v: "left", "s": { alignment: {horizontal: "center"}}},
        {v: "left", "s": { alignment: {horizontal: "right"}}},
        {v: "vertical", "s": { alignment: {vertical: "top"}}},
        {v: "vertical", "s": { alignment: {vertical: "center"}}},
        {v: "vertical", "s": { alignment: {vertical: "bottom"}}},
        {v: "indent", "s": { alignment: {indent: "1"}}},
        {v: "indent", "s": { alignment: {indent: "2"}}},
        {v: "indent", "s": { alignment: {indent: "3"}}},
      {
        v: "In publishing and graphic design, lorem ipsum is a filler text commonly used to demonstrate the graphic elements of a document or visual presentation. ",
        s: { alignment: { wrapText: 1, horizontal: 'right', vertical: 'center', indent: 1}}
      }

The current format object closely parallels the OpenXML, as XML itself has tags for font, fill, numFmt, alignment and border and the subtags parallel as well. So I'm thinking that this is pretty good, as it can express styles in a way that Excel can handle. We're not inventing the style description language, Microsft and the OpenXML standards team did. CSS may still be useful as syntactic sugar, as it will be functionally equivalent.

Will hold off on CSS for now and focus on creating tests and getting this addition up to the high quality standards for this project.

Thanks to @augurer for identifying cross-platform issues with generating XML on IE. Will drop cheerio/jquery and write strings directly.

@pietersv Yeah, really weird. Here's a fiddle demonstrating the problem in IE 10 and IE 11: http://jsfiddle.net/kd2tvb4v/2/

Checked in a version that writes XML strings directly (removing dependence on jQuery / cheerio)

  • Passes 15532 tests in the browser with 0 fails
  • test_misc yields 104 passing, 0 failing
  • npm run-script test yields 11917 passing (3m), 1740 pending, 97 failing.

The 97 fails are "CSV badness" in .xlsb and .xlsx files. All 22,000 tests pass on a clean install, but am not sure what could have changed that would affect that. From what I can tell, it's because XLSX.util.sheet_to_scv(sheet, opts) is appending a newline at the end.

If I comment out the CSV tests, I get 20131 passing, 1 237 pending, 1 failing. The one remaining fail is 1) should parse test files roo_Bibelbund.ods: Error: timeout of 20000ms exceeded. Not sure what changed, but must be something.

The roo_Bibelbund.ods is a simple timeout, sometimes it takes less than 20 secs and passes, sometimes more and fails. Maybe we extend the time limit, or move the test to the end.

The CSV tests I think need to be updated. These take 3 minutes to get to, so hard to test. Can provide more detail either here or directly as appropriate.

Otherwise this seems to work in server and browser, in Chrome/OSX and IE/Windows. Will monitor this unofficial branch to see if any cases arise.

Success. Can get all test results to pass including CSV with a small change allowing backward compatibility to the current writeFile(...) allowing limited styles to survive the read-write.

Longer term will extend the reader to read the <Xf> and other records into the CSF object.

Let me know what it might take to get an accepted pull request. The main branch has a large user base and considers a huge array of issues like dates, localization, etc. so am tentative to merge this too early.

Borders implemented. Styles are passed straight through to OpenXML, allowable values in include

["thin","medium","thick","dotted", "hair", "dashed", "mediumDashed", "dashDot", "mediumDashDot", "dashDotDot","mediumDashDotDot", "slantDashDot"]

Examples are below:

{
            v: "Apple",
            s: {
              border: {
                diagonalUp: 1||true,  // for none use 0, false, null, or undefined 
                diagonalDown: 1||true, // for none use 0, false, null, or undefined 
                top: { style: "dashed", color: {auto: 1}},
                right: { style: "medium", color: {theme: "5"}},
                bottom: { style: "hair", color: {theme: 5, tint: "-0.3"}},
                left: { style: "thin", color: {rgb: "FFFFAA00"}},
                diagonal: {style: "dotted", color: {auto: 1}}
              }
            }
          },
          {
              v: "Pear",
              s: {
                border: {
                  diagonalUp: 1, diagonalDown: 1,
                  top: { style: "dashed", color: {auto: 1}},
                  right: { style: "dotted", color: {theme: "5"}},
                  bottom: { style: "mediumDashed", color: {theme: 5, tint: "-0.3"}},
                  left: { style: "double", color: {rgb: "FFFFAA00"}},
                  diagonal: {style: "hair", color: {auto: 1}}
                }
            }

Next steps are documenting the API and adding test cases.

@protobi FYI we have a (very incomplete) test case for understanding how row/column styles are represented. The base name is cell_style_simple. Unfortunately it doesn't test features like borders.

@protobi I see you created documentation for styles. Color specification does not describe {auto: 1} case. Also I failed to make borders for merged cells. In my tests only the first cell of the merged range has border.

An example code which demonstrates the problem:

var XLSX = require('xlsx');

var cellRef = XLSX.utils.encode_cell({c: 1, r: 1});
var ws = {
      '!cols': [{
         wpx: 40
      }, {
         wpx: 40
      }, {
         wpx: 40
      }],
      '!merges': [{s: {c: 1, r: 1}, e: {c: 2, r: 2}}],
      '!ref': XLSX.utils.encode_range({s: {c: 0, r: 0}, e: {c: 2, r: 2}})
    };
ws[cellRef] = {
   v: 'test',
   t: 's',
   s: {
      alignment: {horizontal: 'center', vertical: 'center'},
      border: {
        left: {style: 'thin', color: {auto: 1}},
        right: {style: 'thin', color: {auto: 1}},
        top: {style: 'thin', color: {auto: 1}},
        bottom: {style: 'thin', color: {auto: 1}}
      }
   }
}
var wb = {
  SheetNames: ['test'],
  Sheets: {test: ws}
};
XLSX.writeFile(wb, 'borders.xlsx');

@paulish -- great catches. Added {auto:1} to README, will include in next push.

Looking I see adding borders to merged cells is more involved. When adding a thick border to all sides of a merged area, Excel generates eight border styles: (i.e. top, top+right, right, right+bottom, bottom, bottom+left, left, left+top). And it then applies those styles to each individual cell within the merged area.

And to my great surprise, we can actually handle borders for merged areas within the current code. What you have do is is explicitly set border styles for each cell in the merged area. After merging, only the content of the upper left cell will be visible, but the border styles of the others will appear.

[
        {
          v: "This is a submerged cell",
          s:{
            border: {
              left: {style: 'thick', color: {auto: 1}},
              top: {style: 'thick', color: {auto: 1}},
              bottom: {style: 'thick', color: {auto: 1}}
            }
          }
        },
        {
          v: "Pirate ship",
          s:{
            border: {
              top: {style: 'thick', color: {auto: 1}},
              bottom: {style: 'thick', color: {auto: 1}}
            }
          }
        },
        {
          v: "Sunken treasure",
          s:{
            border: {
              right: {style: 'thick', color: {auto: 1}},
              top: {style: 'thick', color: {auto: 1}},
              bottom: {style: 'thick', color: {auto: 1}}
            }
          }
        }]

And for your specific example case you can take a quick shortcut and just give every cell all four borders

var XLSX = require('xlsx');

var cellRef = XLSX.utils.encode_cell({c: 1, r: 1});
var ws = {
  '!cols': [{
    wpx: 40
  }, {
    wpx: 40
  }, {
    wpx: 40
  }],
  '!merges': [{s: {c: 1, r: 1}, e: {c: 2, r: 2}}],
  '!ref': XLSX.utils.encode_range({s: {c: 0, r: 0}, e: {c: 2, r: 2}})
};
ws[XLSX.utils.encode_cell({c: 1, r: 1})] =
    ws[XLSX.utils.encode_cell({c: 1, r: 2})] =
        ws[XLSX.utils.encode_cell({c: 2, r: 1})] =
            ws[XLSX.utils.encode_cell({c: 2, r: 2})] ={
  v: 'test',
  t: 's',
  s: {
    alignment: {horizontal: 'center', vertical: 'center'},
    border: {
      left: {style: 'thin', color: {auto: 1}},
      right: {style: 'thin', color: {auto: 1}},
      top: {style: 'thin', color: {auto: 1}},
      bottom: {style: 'thin', color: {auto: 1}}
    }
  }
}


var wb = {
  SheetNames: ['test'],
  Sheets: {test: ws}
};
XLSX.writeFile(wb, '/tmp/borders.xlsx');

@pietersv Indeed, each border cell of the merge should be added. Thank you.

@protobi, do you have any plans to add textRotation attribute to cell style?

@paulish Support for textRotation is now a property of alignment. See below.

          {v: "Up 90", s: {alignment: {textRotation: 90}}},
          {v: "Up 45", s: {alignment: {textRotation: 45}}},
          {v: "Horizontal", s: {alignment: {textRotation: 0}}},
          {v: "Down 45", s: {alignment: {textRotation: 135}}},
          {v: "Down 90", s: {alignment: {textRotation: 180}}},
          {v: "Vertical", s: {alignment: {textRotation: 255}}}

@paulish I think a utility to iterate over merged cells to apply borders is simple but would belong outside the js-xlsx class, either in the end application or a convenience class like workbook, as js-xlsx which is focused on parsing and writing workbooks rather than tools to author them.

For your app, you might write two for loops, one that iterates over vertical indices and sets left/right cell borders, and one that iterates over horizontal indices and sets top/bottom borders.

@pietersv Thanks for your quick solution for textRotation.

About borders for merged cells, I already solved that in my application using 2 for loops :) I see no need for an utility at the moment but it would be nice to explicitly describe in the documentation that each of the merged cells can have an own border.

@pietersv Thanks for creating https://github.com/protobi/js-xlsx I am using it for styling cell color and it works great (using fill). However, I am not able to do two things if you can help with that, that would be great:

  1. I cannot change the font color using font: {color: {rgb: "C6EFCE"}}, but the color changes using font: {color: {theme: "3"}
  2. Is there anyway to insert comment to a cell? Issue #192

Also @pietersv when I read styled XLSX i see that the font style is missing

function read_test() {
    /* read file */
    var book1 = XLSX.readFile('test-format.xlsx', {cellStyles:true});
    console.log('A1', book1.Sheets.Sheet1.A1);  // Check cell style - Gray fill and White font
    console.log('B1', book1.Sheets.Sheet1.B1);  // Normal cell style - No style
    console.log('C1', book1.Sheets.Sheet1.C1);  // Bad cell style - Red fill and dark red font
    console.log('D1', book1.Sheets.Sheet1.D1);  // Good cell style - Green fill and dark green font
    console.log('E1', book1.Sheets.Sheet1.E1);  // Neutral cell style - yellow fill and dark yellow font
}

read_test()

// Output...
A1 { t: 's',
  v: 'Check cell',
  r: '<t>Check cell</t>',
  h: 'Check cell',
  w: 'Check cell',
  s: { patternType: 'solid', fgColor: { rgb: 'A5A5A5' } } }
B1 { t: 's',
  v: 'Normal',
  r: '<t>Normal</t>',
  h: 'Normal',
  w: 'Normal' }
C1 { t: 's',
  v: 'Bad',
  r: '<t>Bad</t>',
  h: 'Bad',
  w: 'Bad',
  s: { patternType: 'solid', fgColor: { rgb: 'FFC7CE' } } }
D1 { t: 's',
  v: 'Good',
  r: '<t>Good</t>',
  h: 'Good',
  w: 'Good',
  s: { patternType: 'solid', fgColor: { rgb: 'C6EFCE' } } }
E1 { t: 's',
  v: 'Neutral',
  r: '<t>Neutral</t>',
  h: 'Neutral',
  w: 'Neutral',
  s: { patternType: 'solid', fgColor: { rgb: 'FFEB9C' } } }


@abarik1981 I checked in a change just now fixing an issue with rgb font colors. That said, Excel OpenXML seems to use eight-character ARGB values, where A is the alpha or opacity, so try font: {color: {rgb: "FFC6EFCE"}} instead.

There are two related issues with style, reading them (#52) and writing them (#128). Each are fairly involved, so I'm thinking to keep this fork focused exclusively on #128, and within that, just for xlsx/xlsm files. Then may be able to address #52 if/when this branch gets merged. Toward that end, am currently adding unit tests commensurate with the awesome test suite already here.

@SheetJSDev Your early point about round trip tests is well taken. The best way to test writing styles is to read them back. And the ability to read fills and number formats already exists. So now extending that parsing to include fonts, borders, alignments. In the process, would like to confer on a few suggested changes.

Presently, the parser

  • writes fills directly to the .s attribute,
  • translates theme/tint to absolute RGB colors, and
  • truncates 8-digit ARGB to 6-digit RGB colors

e.g. {v: 3.14, s: { patternType: "solid", bgColor: { rgb: "FFAA00"}, fgColor: "FFFFFF"}}.

For writing styles, I'm thinking that:

  • .s should be a general style with .fill as an attribute,
  • maintain references to theme/tint if that's how colors are specified,
  • and use ARGB directly.

e.g. {v: 3.14, s: {fill: { patternType: "solid", bgColor: { theme: 5, tint: -0.20}, fgColor: "FFFFFFFF"}}.

This breaks one existing test describe('should correctly handle styles', ...) which would be revised per new syntax. I don't see how to make this backward compatible, but think this should be acceptable, as it would permit even more styles to be parsed than today, and allows all the styles to be written back. Is that okay?

@pietersv So the original style object was designed to hold common information like background/foreground color. The original comment that proposed the current design, https://github.com/SheetJS/js-xlsx/issues/36#issuecomment-44472566 , was based on my understanding of how XLS and XLSX store style information.

It's probably better to ignore the current style representation (as in, "break backwards compatibility") and see what it looks like.

Just committed change that preserves styles roundtrip: create -> write -> read -> write.
Updated example.js at https://github.com/protobi/workbook accordingly.
For right now this preserves only font, fill and numFmt. Will add borders and alignments soon.

The minimalist regex xml parser in the current code is neat, finally got the rhythm of it.

Is there an estimate for when protobi's branch will be merged with master? I'd really like to use new style implementation (I'm sure I'm not the only one :smile: ) and it'd be cool to have it merged with master as soon as possible.

I think we're getting close to a pull request. The most recent checkin includes a round-trip test suite (call mocha test-styles.js) which covers a wide variety of styles, and it passes the existing test suite (22073 passing, 4929 pending) with a couple known exceptions.

The only reason I'm hesitant is simple fear of introducing an unknown bug or missing a key use case into a widely used code base until I hear it's working in practice. In immediate short term, perhaps try using the xlsx.js from the branch or npm install protobi/js-xlsx and let me know either here or direct message?

@pietersv

Thanks for the update. I'll start playing around with your branch and let you if I find any errors.

I just checked in additions to read/write borders and alignments as well, so it handles most styles round trip. Notes:

  • Added a test test-styles.js which creates a workbook with a variety of styles, writes it , parses it, and compares it to the original, both to a file and to a buffer.
  • The test file can be called with mocha test-styles.js. This is in the top-level directory for now as the test suite is a whole separate repo.
  • I had to blank out the original tests of styles as they're handled differently now.
  • I also had to modify the tests should generate correct CSV output as these depend on number formats, which are now handled differently as part of cell styles. Basically if a workbook fails the original CSV test, this re-reads it with {cellStyles: true, cellNF: true} and tries again.
  • Also merged in a few changes from upstream that had occurred in the interim.

@SheetJS I think this might be ready for a pull request, open to further suggestions.

image

@pietersv let me just say that this is really great work and the effort you've put into it is much appreciated!

Just one question, since I use this module in production:

I had to blank out the original tests of styles as they're handled differently now.

Does that mean that the API for reading styles changes? If so, could you please add a note in the documentation to allow for a smooth transition?

@pietersv Excellent work! That test file is beautiful :)

@elad its unclear whether we will use the proposed XLSX-inspired API, stick with the existing structure, or do something completely different. As we discussed earlier @pietersv is exploring the XLSX space. The fun part (working with XLS and other formats) has yet to start.

@elad I just updated the tests 'different styles' and 'correct styles' to reflect these changes in API:

  • Fills are now specified as properties of s.fill rather than s
  • Where colors are specified in the XLSX as theme/tint (and not RGB) they are also also specified that way in in CSF (i.e. RGB is not automatically extracted to fill.raw_rgb)

47940 passing (4 minutes)
7441 pending

@SheetJSDev The js-xlsx library is a fabulous resource, so completely understand the need to maintain its integrity and support for other formats. Just to get up to speed on considerations, is the issue (a) that js-xlsx currently reads fill styles for XLS and so we'd need to store them under s.fill in CSF, or (b) that if this writes additional styles for XLSX it'd also need to write the additional styles for XLS and XLSB? Is there a downside to an API that parallels the OpenXML API for XLSX files in a project titled 'js-xlsx'?

@pietersv:

Where colors are specified in the XLSX as theme/tint (and not RGB) they are also also specified that way in in CSF (i.e. RGB is not automatically extracted to fill.raw_rgb)

Ouch, that's actually something I'm relying on. If you're not going to provide the RGB value, could you at least expose a utility function to extract it? I think that logic (theme/tint to RGB) belongs in the library rather than application code and wouldn't want to copy/paste it all over the place...

@pietersv @elad while theme and tint are useful, they are concepts specific to how XLSX represents the colors. Other formats like the XLML (2003) format only store the final display color.

Take a look at the cell_style_simple test case. The colors are stored with tint in the xlsx version, but not in the xlml styles.

Excel throws an error ("Excel found unreadable content in 'filename.xlsx'.... Repair?") if we write both theme/tint _and_ rgb to OpenXML, as in <fgColor theme="5" tint="-0.5" rgb="FF00FF00"/> .

But there's no reason we can't read it and store ARGB in the style object. The main branch currently stores it as s.raw_rgb. So then how about

  • Store ARGB value as fill.rgb in CSF
  • If both fill.theme and fill.rgb are specified, ignore fill.rgb when generating to .xlsx.
    That way
  • the theme references are preserved on read/write to xlsx, and
  • it falls back to the ARGB value if you write to a different format (or if you delete s.fill.theme)

The latest commit parses theme color and saves it as .rgb (and optionally raw_rgb if opts.WTF)

  • fill.bgColor,
  • fill.fgColor,
  • font.color and
  • border.*.color

This now passes the tests different styles and correct styles .

HOWEVER the existing tests were expecting colors without tints applied and somehow getting them. For instance, the test file "cell_style_simple.xlsx" specifies this fill color:

       <patternFill patternType="darkHorizontal">
                <fgColor theme="9" tint="-0.249977111117893"/>
                <bgColor theme="5" tint="0.39997558519241921"/>
       </patternFill>

and the theme 9 ("accent6") is specified as follows:

<a:accent6>
  <a:srgbClr val="F79646"/>
</a:accent6>

So it should expect a base color of "F79646" but with the tinting the color should be a darker red. But it gets "F79646" nonetheless and the test has been passing. Can't quite figure out why.

I tried applying the tints and updating the expected colors, but the function rgb_tint(hex, tint) appears to have a bug, as it returns shades of grey, not the correct color. The greys seem about right brightness given the tint, it's just that they're grey. Some examples are below.

So for now I ignore all tinting when storing colors converted from the theme, and the test passes.

console.log([color.hex, color.tint, rgb_tint(color.hex, color.tint)]);
[ '4BACC6', 0.3999755851924192, '939393' ]
[ '9BBB59', 0.3999755851924192, 'C3C3C3' ]
[ '1F497D', -0.499984740745262, '101010' ]
[ 'F79646', -0.249977111117893, 'B9B9B9' ]
[ 'C0504D', 0.3999755851924192, 'D9D9D9' ]
[ 'EEECE1', -0.249977111117893, 'B3B3B3' ]
[ '8064A2', 0.3999755851924192, 'B3B3B3' ]
[ '1F497D', -0.749992370372631, '080808' ]
[ '9BBB59', -0.249977111117893, '747474' ]
[ 'F79646', -0.249977111117893, 'B9B9B9' ]
[ 'EEECE1', 0.3999755851924192, 'F5F5F5' ]
[ '8064A2', 0.3999755851924192, 'B3B3B3' ]
[ '4F81BD', -0.249977111117893, '3B3B3B' ]
[ '8064A2', -0.249977111117893, '606060' ]

Sorry, theme colors are saved as .rgb (and optionally raw_rgb if opts.WTF)

@pietersv, do you mean that the value in .rgb (or .raw_rgb) for the relevant fields gives a different color than what the user sees?

What I think is very important is being able to get the RGB value that represents the final color the human sees when looking at the Excel sheet.

Concur we'd want to return the tinted colors that the user would have seen in Excel. It appears that line 3 of bits/45_styutils.js has been interpreting hex RRGGBB colors as RRRRRR before converting. Simple fix.

js function hex2RGB(h) { var o = h.substr(h[0]==="#"?1:0,6); return [parseInt(o.substr(0,2),16),parseInt(o.substr(0,2),16),parseInt(o.substr(0,2),16)]; //should be return [parseInt(o.substr(0,2),16),parseInt(o.substr(2,2),16),parseInt(o.substr(4,2),16)]; }```

Updated to fix hex2RGB and apply tints. Now s.fill.fgColor.rgb is the theme color with tinting applied and s.fill.fgColor.raw_rgb is the theme color without tinting, for backward compatibility and reference. Also updated the tests to expect the tinted color.

Thank you @pietersv :)

How often are people generating files where they couldn't template the styling/formatting in the native application first?

A quick note about the proposed extension to CSF , while the style is specified for each cell, the style objects are reused where possible. For instance, one could define a style and reuse it in multiple cells, e.g. {v: 1.61828, s: GOLDARIAL_18} where

var GOLD_ARIAL_18 = {
    fill: { bgColor: {rgb:  "FFFFAA000"}}, 
    font: { family: "Arial", sz: 18, bold: true, color: { rgb: "#FF444444"}}, 
    numFmt: "0.00"
};

Even if the styles are different objects with the same value, the style object is de-duplicated using deepEquals in the StyleBuilder class, and reduced to a style reference in the Excel workbook. Thus one could generate a CSS representation for each style, and vice versa generate a style from a CSS representation.

So we can later extend the format to take a CSS string or object instead, which gets converted the same way:

{ v: 1.6818, s: {
    css: "background-color: #FFAA00, font-family: Arial, font-size: 18, color: 444, font-weight: bold",   
    numFmt: "0.00"
}

or

{v: 1.61828, s: { 
     css: { background-color: "#FFAA00", font-family: "Arial", font-size: 18, color: "#444"}, 
    numFmt: "0.00"
  }
}

Right now each unique style gets its own <xf> entry in Excel. One can go further and really tried to compress each style component, i.e. reusing every single font or border spec, but this may be overkill for now.

the style objects are reused where possible
really tried to compress each style component

Should we just store pointers to the styles array? FYI the shared string table (cell type s in XLSX, record type BrtCellIsst in XLSB, record type LabelSst in XLS) also uses pointers, but those are converted to normal string cells for convenience. We could also do the same for cell formats (storing the index in the format table rather than the actual format) and a few other places.

Also, food for thought: to get the actual spreadsheet object out of a Web Worker, you have to serialize and pass a string across the divide. It's a pain, and the worker currently uses JSON.stringify, which isn't exactly efficient and doesn't round-trip properly (try var x = {a:1}, y = {a:x, b:x}, z = JSON.parse(JSON.stringify(y)); y.a.a = z.a.a = 3; console.log(y.b.a, z.b.a);), so a pointer-based format might be useful

@SheetJSDev Sorry for the delay! This is a great question I spent some time going both ways and here's my current thinking. TL;DR -> I like objects per cell, but we could easily support both.

Styles as objects

Net net, specifying styles as objects in each cell seems more convenient and semantically clearer to a user. Here you can see exactly what the format is for each cell, without having to deference pointers, and can easily tweak styles here and there.

If a lot of cells have exactly the same format, can define a format object and share the pointer. If a lot of cells have different formats, but share the same font and fill? Define a fill and font object and share those pointers. This does make the JSON larger but computers are efficient.

Styles as indexes

Here we would specify each cell's style not as an object, but as an integer that refers to an index in an array of cell styles. Presumably we'd use the object CellXf: [] in the CSF workbook.

This mirrors how Excel XLSX stores the styles internally, keeping integer indexes to styles. Excel goes as step farther and turns keep integer indexes to fonts, fills, alignments, number formats and borders. I don;t think we want to go that extra step.

So for the case of a lot of common styles, e.g. a big report with thousands of rows, we could allow (but not require) the user to specify styles in the CellXf property and refer to them by integer.

var cell_style = (typeof s == 'number') ? workbook.CellXf[s] : s

Currently this seems to be filled with empty styles on read. So we'd have to populate that when reading a workbook, and store the integer reference in the cells.

Round trip

I think we're unlikely to have directed cycles in the references, though can imagine a user trying to construct one:

var workbook = { Sheets: { "Main": {
   "A1": { v: "Larry", s: {  fill:   workbook.Sheets.Main.B1.s.fill }},
   "B1": { v: "Moe", s: {  fill:   workbook.Sheets.Main.C1.s.fill }},
   "C1": { v: "Curly", s: {  fill:   workbook.Sheets.Main.A1.s.fill }}
   }
}

I've been working with @pietersv's branch for a while to write styles and it seems to be working fine. If there aren't any more known issues and it passes the TravisCI builds, I think it might be time to do a pull request.

Reading/writing styles seems pretty stable. I've gotten a few direct requests which were easily addressed without extending code. So I think it's time to submit a pull request and power through any steps required to get it up to snuff (Alternately could create a separate npm module, e.g. js-xlsx-fmt).

Tests

It passes all the tests on the server and browser.

  • I had to slightly rewrite the test should generate correct CSV output to re-read the Workbook with {cellStyles: true}, not sure why that should be necessary, but the tests pass after that.
  • I also modified the test should correctly handle styles to expect the colors after tinting.

Styles as objects

Re styles as objects vs integers, my intuition is to go with style objects within the cell, and defer integer indexing for later. Style objects with the cell seem semantically clearer and easier for the human author. Compressing the JSON via integer indexing feels more like an optimization to which Jackson's rules would apply ("The First Rule of Program Optimization: Don't do it. The Second Rule: don't do it yet.")

Commits - to squash or not

There are 36 commits with this branch. It's common practice in a PR to squash these to a single commit for brevity. But this becomes a tangle, as there were other contributors to this branch, and upstream changes merged in. So it would be easier for me to submit the PR as is. Might that be okay?

JS-XLSX is a deep package with a wide user base, so respect that a PR request is a commitment. Can make time for code review, screenshare, other steps if that would be useful.

Here's a gist using the fork to generate Excel workbooks with styles in the browser:
https://gist.github.com/protobi/db3f8190de610bc971cc

This uses https://rawgithub.com to serve xlsx.js, jszip and FileSaver.js individually because the js-xlsx dist files aren't committed between versions, so the minified/catenated versions aren't in the repo.

Thanx a lot to @SheetJSDev , @elad , @pietersv and all who were involved in this great and awesome work!!!
For those who are interested to support "right to left direction"
follow he folllowing steps :

{v: "right to left", s: {alignment: {readingOrder: 2}}

and in @pietersv code you should add :

 if (attributes.alignment.readingOrder)  { $alignment.attr('readingOrder', attributes.alignment.readingOrder);}

(in _addXf function)

@pietersv what do you think of adding it?

@sarahl123 That's awesome. Will add to next update this week.

Your patch also presents a cool general solution to the problem. There
are likely other attributes that could be specified that I didn't
systematically identify and thus read from the style object and explicitly
write to the OpenXML object using code.

It should be possible to take any attribute on the s.alignment object and
mix it on as an attribute to the <aligment> element.

That should work for most cases not yet implemented, with a few exceptions
where OpenXML doesn't follow that pattern

On Tue, May 19, 2015 at 5:46 AM, sarahl123 [email protected] wrote:

Thanx a lot to @SheetJSDev https://github.com/SheetJSDev , @elad
https://github.com/elad , @pietersv https://github.com/pietersv and
all who were involved in this great and awesome work!!!
For those who are interested to support "right to left direction"
follow he folllowing steps :

{v: "right to left", s: {alignment: {readingOrder: 2}}

and in @pietersv https://github.com/pietersv code you should add :

if (attributes.alignment.readingOrder) { $alignment.attr('readingOrder', attributes.alignment.readingOrder);}

(in _addXf function)

@pietersv https://github.com/pietersv what do you think of adding it?


Reply to this email directly or view it on GitHub
https://github.com/SheetJS/js-xlsx/issues/128#issuecomment-103419325.

Pieter Sheth-Voss PhD

_protobi_
e: [email protected]
m: 617.645.4524

Hi All,
using the web browser versione of protobi branch: https://rawgit.com/protobi/js-xlsx/master/xlsx.js. i find an issue: styles seem to works fine but ... If i format a cell with date, this format remain “in memory” for the next export, for example:
i place a button in my web page where i do excel export (so, i can click it twice without closing the browser)
My last column of the export is a date (and i format it using XLS.SSF._table[14])
I press the button for the first time, export go well.
When i press it again, and i use Chrome debug to check variable value, i’have seen that the cell format (date) is not reset, so the first cell that contains a number, assume a date format
This behavior do not happen with the master library, i’ve simply tried to switch between this and the issue is only on the develop library (the library with the styles).
Some one has solved this issue in browser version ?
Let me know.
Thanks

Thanks for the report. Will take a look.

  • Is the issue that the second export doesn't remember the style, or that
    it does but shouldn't?
  • Does setting the cell style to a new objec (e.g. {v: 3.14, s : {}})
    clear the format, as a diagnostic?
  • Is it easy to send/or post code I might use to replicate this easier on
    this end?

Pieter

On Thu, May 21, 2015 at 11:13 AM, esoni [email protected] wrote:

Hi All,
using the web browser versione of protobi branch:
https://rawgit.com/protobi/js-xlsx/master/xlsx.js. i find an issue:
styles seem to works fine but ... If i format a cell with date, this format
remain “in memory” for the next export, for example:
i place a button in my web page where i do excel export (so, i can click
it twice without closing the browser)
My last column of the export is a date (and i format it using
XLS.SSF._table[14])
I press the button for the first time, export go well.
When i press it again, and i use Chrome debug to check variable value,
i’have seen that the cell format (date) is not reset, so the first cell
that contains a number, assume a date format
This behavior do not happen with the master library, i’ve simply tried to
switch between this and the issue is only on the develop library (the
library with the styles).
Some one has solved this issue in browser version ?
Let me know.
Thanks


Reply to this email directly or view it on GitHub
https://github.com/SheetJS/js-xlsx/issues/128#issuecomment-104314590.

Pieter Sheth-Voss PhD

_protobi_
e: [email protected]
m: 617.645.4524

@sarahl123 Added readingOrder to the code and docs. My Excel (OSX Office 2011 doesn't seem to have that feature so not tested). Mixing in arbitrary attributes is still ahead.

@esoni Added note to README that the .z attribute is deprecated in favor of specifying styles in the .s attribute. So to specify date formats one can use either s.numFmt: "m/dd/yy" or z: "m/dd/yy" but now recommend the first moving forward on this branch.

For backward compatibility, an empty s.numFmt will not override a non-empty .z attribute, which per our direct email exchange seemed like the issue.

Can anyone help, I need a work-around for keeping original template styles after file is written. {cellStyles: true} does not appear to work in my case.

The styles branch of the js-xlsx library can be used on the server (Node) and the browser.

  • To use this dev branch on Node, can install with install protob/js-xlsx
  • To use this in the browser, you need to use the xlsx.js in the root directory (NOT under dist/)

Here's a Gist showing how to use it in the browser (This uses rawgithub.com to serve the files, but you'll likely prefer to download the files to your server)

https://gist.github.com/protobi/db3f8190de610bc971cc

The README.md directs not to regenerate files in the dist/ subdirectory until the branch is merged in and included in a new release version. So the minified version isn't available in the browser until this is merged in or forked into a separate project.

I'd love to see this merged in soon :+1:

I'd love to see this merged in soon, too :+1:

Me also :+1:.

+1

Pending guidance on submitting a pull request, have published this fork to npm and bower which allows the styles-enabled version to be more accessible:

  • npm install xlsx-style
  • bower install js-xlsx-style#beta

To make this work there are two branches, master and beta:

  • master is intended to merge back with SheetJS/js-xlsx and until then fetch upstream changes. This does not run make dist and its package.json is the same as the original repo.
  • beta is what is published to npm and bower. This does run make dist so you can use the minified files, and its package.json points to a new project name (js-xlsx-style) and version number.

Any problems with this branch you can report to https://github.com/protobi/js-xlsx/issues.

+1

+1

+1

@SheetJSDev this looks very complete, any reason not to do a squashed pull request and merge it in?

(plus I think one or two people are looking forward to this ;)

I would be interested in having this merged as well.

+1 for the PR and merge into this project.

+1

+1

:+1:

@pietersv Hey can you please tell me how did you generate the excel sheet mentioned in Apr 16.

I am generating excel sheet through this xlsx.js but am not able to do formatting to my cells.So i went through this issue but not able regenerate what you told.So can you please help me.

Generating styles I believe has not yet been merged into the master branch. For now you'll need to use one of these branches:
npm install xlsx-style
or
bower install js-xlsx-style#beta

(The file mentioned on April 16 was cell_style_simple.xlsx in the test suite)

A decent number of people are raising the same issue about styling not being implemented yet because they aren't aware that this feature is being actively developed. Adding a little note in the readme about the current status of styling could help reduce the number of duplicate issues.

+1 waiting for the merge.

And agree with @aravindkoneru

+1

Would love to see the work progress in a merge.

+1

@pietersv is there a way to insert a page break with this?

I'm sure that's possible to add, will take a look. Urgency?

Thank you! My boss has been wanting me to add it for about a week now and I haven't been able to figure it out. Not super urgent, just trying to make her happy.

Let's move discussions of styles-related issues to the project https://github.com/protobi/js-xlsx.

+1 for merge. Looking forward to this feature.

@protobi So just making sure I followed that correctly (I skimmed the conversation). You managed to to implement styles pretty well (I tried it and even the typing information is preserved - saving a stylesheet with dates using js-xlsx, saves the dates as a string containing the days since 1900). And now you're maintaining an, ever more diverging, fork, since you're not really sure how to get your changes into this repo? Right?

Well, the key thing is:

  • Protobi maintains and supports a fork of this project (https://github.com/protobi/js-xlsx) that reads and writes styles for XLSX files. Our intent is to keep the branch in sync with the master as possible.
  • The main project is still being actively developed (see https://github.com/SheetJS/js-xlsx/issues/370). Current work might include a wholesale rewrite of the style implementation here, not sure.

The SheetJS/js-xlsx project is a professionally developed library, both broad (XLS, XLSX, XLSXB, XLSM) and deep (read/write, internationalization). It has tens of thousands of test cases, covering bugs in apps that came years before Excel (e.g. Lotus) and years later (e.g. Keynote, OpenOffice). I can understand why being conservative about integrating a major feature may be reasonable from their perspective.

In our work, many of our use cases are purely authoring XLSX. We're working on extending the lightweight msexcel-builder to write styles with an API consistent with this project, and add unit testing via js-xlsx.

From my perspective, the proximate issue is just migrating the root project to Node 4 and 5, to enable unit testing that makes new development safe.

👍 for the merge

+1 for merge. Really pumped for this

@SheetJSDev any chance to finally handle style in this project? Thanks.

@SheetJSDev any news ? really need this case here too

@SheetJSDev +1

+1

+1

+1

No news and no feedback, so, here, we are moins to xls-populate, an incredible Project WITH styles support :)

@lizjulien, with xls-populate we can generate excel file from existing template + style???

+1 for merging.
Kneeling for mercy

i will try

What do you think about exceljs?

Where does this stand today as far as merging xlsx-style into this main project?

@thearabbit interesting

Hello, @pietersv the idea is to use "https://github.com/protobi/js-xlsx" now to use the styles?

https://github.com/protobi/js-xlsx this one is updated 3 years ago .. ANy update on style in js-xlsx ?

+1

+1

+1. The support for style setting is really important

This is really a deal breaker for this amazing library.. 😭

+1

Is it possible to provide this for xlsx at least? A partial solution is better than nothing. @SheetJSDev

+1

+1

+1

I do not think that they include this feature any soon, because styles are part of pro edition.
https://sheetjs.com/pro

+1

We offer this in the Pro compendium. Since companies have paid for the features already, it would be unfair to them if we turned around and made it available as open source. We have a longer comment in a gist.

What do you think about exceljs?

It doesn't support LibreOffice/LibreCalc ODS files, or older Excel .xls files.

Neither does xlsx-populate.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

HachimDev picture HachimDev  ·  3Comments

DannyRyman picture DannyRyman  ·  3Comments

magtuan picture magtuan  ·  3Comments

eyalcohen4 picture eyalcohen4  ·  3Comments

upasana-shah picture upasana-shah  ·  4Comments