The value passed into the format() function is coming from an XML file and has never been coerced.
Therefore, when checking is_int($value), the check always returns false - because it's a string.
Changing the check fixes the issue and Spout now correctly parses large numbers.
Some software generate [Content_Types].xml file with sheets definition in random order.
Instead of having the first sheet (id = 1) defined first, it may be defined in 3rd position.
Therefore, to read the file in the correct order, sheets order need to be fixed.
The ZipHelper interface is now more generic and allow single files to be added.
It supports adding uncompressed files (for PHP7+), which is required to have the mime detection magic work with ODS files.
Also fixed a few issues with the created ODS file (thanks to https://odf-validator.rhcloud.com/)
Heuristics to detect proper mime type for XLSX files expect to see
certain files at the beginning of the XLSX archive. The order in which
the XML files are added therefore matters.
Specifically, "[Content_Types].xml" should be added first, followed by the
files located in the "xl" folder (at least 1 file).
Although Excel has a Date type, older Excel versions use numeric values to store dates.
The value represents the number of days since Jan 1st, 1900.
The only way to tell if the value is a number or a date is to look at the styles.xml and check if the cell has date formatting.
Spout can now read ODS files.
It's on par with the XLSX reader. The only difference is that the row iterator cannot be rewound.
It supports the different output formats from LibreOffice and Excel, skipping extra rows/cells if needed.
Remove num-columns-repeated and num-rows-repeated as it does not seem to be required (LibreOffice does not add them).
This greatly simplifies the writer and the XML output.
Added some optional attributes to help LibreOffice with cell values caching ("calcext")
Added ODS writer
Refactored XLSX writer to abstract some pieces into an abstract multi-sheets writer
Created an abstract style helper
Moved shared components around
Invalid names can also be triggered by:
- character ":"
- single quote at the beginning of the name
- single quote at the end of the name
Introduced a StringHelper, wrapping multibyte strings functions
Based on Excel requirements:
- it should not be blank
- it should not exceed 31 characters
- it should not contain these characters: \ / ? * [ or ]
- it should be unique
Added top level methods on the Writer:
- addRowWithStyle()
- addRowsWithStyle()
Added a style builder, to easily create new styles.
Each writer can specify its own default style and all styles will automatically inherit from it.
For now, the style properties supported are:
- bold
- italic
- underline
- strikethrough
- font size
- font name
- wrap text (alignment)
Instead of the hasNext() / next() syntax, readers now implements the PHP iterator pattern.
It allows readers to be used with a foreach() loop.
All readers now share the same structure (CSV is treated as having exactly one sheet):
- one concrete Reader
- one SheetIterator, exposed by the Reader
- one or more Sheets, returned at every iteration
- one RowIterator, exposed by the Sheet
Introducing the concept of sheets for CSV may be kind of confusing but it makes Spout way more consistent.
Also, this confusion may be resolved by creating a wrapper around the readers if needed.
-- This commit does not delete the old files, not change the folder structure for Writers. This will be done in another commit.
Based on the number of unique shared strings as well as the available memory amount,
one strategy will be chosen over the other.
The algorithm is based on empirical data and super safe so it may need to be tuned.
In-memory implementation using SplFixedArray
Updated code and tests to support errors when reading XML nodes (useful when reading XML files used for attacks)
Removed LIBXML_NOENT option (which DOES substitute entities...)
Added test for Quadratic Blowup attack
A bug was introduced, preventing Spout to create valid XLSX files on Windows.
This commits reverts the changes that introduced DIRECTORY_SEPARATOR everywhere
and fixes the original issue with the writer by normalizing paths when creating
the zipped file.
Added LIBXML_NOENT option when reading a XML file
libxml_disable_entity_loader(true) cannot be used because it disables
the use of XMLReader::open()... see https://bugs.php.net/bug.php?id=62577
Added proper support for booleans, dates, numbers, errors.
Added unescaping of the read string.
Fixed a bug when cells did not have any values => now returns empty string.
Added Sheet class for the XLSX reader that exposes basic sheet info, such as name or ID.
When retrieving the sheet data XML, added extra XML parsing to retrieve sheet data.
Added test