46 Commits

Author SHA1 Message Date
Adrien Loison
e4154dfdc3 ODS Reader
Spout can now read ODS files.
It's on par with the XLSX reader. The only difference is that the row iterator cannot be rewound.
It supports the different output formats from LibreOffice and Excel, skipping extra rows/cells if needed.
2015-09-01 10:53:49 -07:00
Adrien Loison
bc009a3241 Use number-columns-repeated in ODS writer
Using number-columns-repeated may reduce the size of the output XML file by merging repeated values together.
2015-08-31 12:03:28 -07:00
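
The merging described in this commit amounts to a run-length grouping of adjacent equal cell values. The following is a minimal Python sketch of that idea, not Spout's PHP code: the function name `merge_repeated_cells` and the simplified cell markup are assumptions made for illustration, and only the `table:number-columns-repeated` attribute name is taken from the ODS format.

```python
from itertools import groupby
from xml.sax.saxutils import escape


def merge_repeated_cells(values):
    """Return simplified ODS cell XML, merging runs of identical adjacent values.

    Hypothetical helper: the markup is reduced to string cells only.
    """
    cells = []
    for value, run in groupby(values):
        count = sum(1 for _ in run)
        # Emit the attribute only when a value actually repeats.
        repeated = f' table:number-columns-repeated="{count}"' if count > 1 else ""
        cells.append(
            f'<table:table-cell office:value-type="string"{repeated}>'
            f"<text:p>{escape(str(value))}</text:p></table:table-cell>"
        )
    return "".join(cells)
```

With this sketch, a row such as `["a", "a", "a", "b"]` produces two cell elements instead of four, which is where the size reduction would come from.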
Adrien Loison
156fd29a44 Improve ODS Writer
Remove num-columns-repeated and num-rows-repeated as they do not seem to be required (LibreOffice does not add them).
This greatly simplifies the writer and the XML output.
Added some optional attributes to help LibreOffice with cell values caching ("calcext")
2015-08-31 09:55:17 -07:00
Adrien Loison
5949cb2442 ODS writer
Added ODS writer
Refactored XLSX writer to abstract some pieces into an abstract multi-sheets writer
Created an abstract style helper
Moved shared components around
2015-08-28 20:19:45 -07:00
Adrien Loison
1812b4f996 Throw if XLSX Writer configured after being opened 2015-08-24 10:52:12 -07:00
Adrien Loison
9467b5a810 Add support for font color 2015-08-21 20:58:21 -07:00
Adrien Loison
3559bc8834 Detection of invalid sheet name - continued
Invalid names can also be triggered by:
- character ":"
- single quote at the beginning of the name
- single quote at the end of the name

Introduced a StringHelper, wrapping multibyte string functions
2015-08-21 16:44:13 -07:00
Adrien Loison
7efab5576d Detection of invalid sheet name
Based on Excel requirements:
 - it should not be blank
 - it should not exceed 31 characters
 - it should not contain these characters: \ / ? * [ or ]
 - it should be unique
2015-08-21 15:21:36 -07:00
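
The four rules listed in this commit could be checked along the following lines. This is a Python sketch under stated assumptions, not the library's PHP implementation: the name `sheet_name_errors`, the error labels, treating whitespace-only names as blank, and comparing names exactly for uniqueness are all choices made here for illustration.

```python
INVALID_CHARACTERS = set("\\/?*[]")  # the characters named in the commit message
MAX_LENGTH = 31


def sheet_name_errors(name, existing_names):
    """Return the rules that `name` violates (an empty list means valid)."""
    errors = []
    if name.strip() == "":  # assumption: whitespace-only counts as blank
        errors.append("blank")
    if len(name) > MAX_LENGTH:  # len() counts characters, not bytes
        errors.append("too long")
    if any(ch in INVALID_CHARACTERS for ch in name):
        errors.append("invalid character")
    if name in existing_names:  # assumption: exact, case-sensitive comparison
        errors.append("not unique")
    return errors
```

Returning every violated rule, rather than stopping at the first, makes it easier to report all problems with a name at once.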
Adrien Loison
444308d42c Merge pull request #84 from box/rename_strikethrough
Rename StrikeThrough to Strikethrough
2015-08-13 23:18:26 -07:00
Adrien Loison
f043f8d4d0 Rename StrikeThrough to Strikethrough 2015-08-13 23:09:43 -07:00
Adrien Loison
c8ddcf5441 Set wrap text style when multiline string encountered
Fixes #10
If a cell contains a multiline string, "wrap text" style option should
automatically be set.
2015-08-13 23:03:28 -07:00
Adrien Loison
21263a0730 Add support for styling
Added top level methods on the Writer:
 - addRowWithStyle()
 - addRowsWithStyle()

Added a style builder, to easily create new styles.
Each writer can specify its own default style and all styles will automatically inherit from it.

For now, the style properties supported are:
 - bold
 - italic
 - underline
 - strikethrough
 - font size
 - font name
 - wrap text (alignment)
2015-08-07 20:39:17 -07:00
Adrien Loison
8a3b895afc Fix CSV reader when last line is empty
If the last line was empty, it would create an infinite loop...
2015-07-29 10:17:51 -07:00
Adrien Loison
5e1cfbfdbd Attempt to convert the non UTF-8 strings to UTF-8 2015-07-27 20:59:12 -07:00
Adrien Loison
d946f12951 Support for multiple BOMs depending on the selected encoding 2015-07-27 09:36:55 -07:00
Adrien Loison
1ba10ed2b0 Add wrappers around XMLReader and SimpleXMLElement to improve error handling 2015-07-27 00:49:43 -07:00
Adrien Loison
37d87a8a27 Fix various problems 2015-07-27 00:23:18 -07:00
Adrien Loison
86a4c3790a Adding more tests 2015-07-26 23:53:49 -07:00
Adrien Loison
c672558a18 Update Writer folder structure to match Reader new structure 2015-07-26 23:53:17 -07:00
Adrien Loison
c52dd7bde8 Remove old reader files 2015-07-26 23:53:17 -07:00
Adrien Loison
ae3ee357ff Moved readers to iterators
Instead of the hasNext() / next() syntax, readers now implement the PHP iterator pattern.
It allows readers to be used with a foreach() loop.

All readers now share the same structure (CSV is treated as having exactly one sheet):
- one concrete Reader
- one SheetIterator, exposed by the Reader
- one or more Sheets, returned at every iteration
- one RowIterator, exposed by the Sheet

Introducing the concept of sheets for CSV may be kind of confusing but it makes Spout way more consistent.
Also, this confusion may be resolved by creating a wrapper around the readers if needed.

-- This commit does not delete the old files, nor does it change the folder structure for Writers. This will be done in another commit.
2015-07-26 23:53:17 -07:00
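
The Reader / SheetIterator / Sheet / RowIterator structure described in this commit can be sketched as nested iterators. The example below is a Python analogue, not Spout's PHP API: the class and method names (`CsvReader`, `sheet_iterator`, `row_iterator`) and the sheet name "Sheet1" are hypothetical, chosen only to show CSV being treated as exactly one sheet.

```python
import csv
import io


class Sheet:
    """One sheet; exposes a row iterator."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = rows

    def row_iterator(self):
        # Rows are produced lazily, one per iteration.
        return iter(self._rows)


class CsvReader:
    """CSV reader that presents its content as exactly one sheet."""

    def __init__(self, text):
        self._text = text

    def sheet_iterator(self):
        # A CSV file has no sheets, so a single one is yielded.
        yield Sheet("Sheet1", csv.reader(io.StringIO(self._text)))


reader = CsvReader("a,b\n1,2\n")
rows = [row for sheet in reader.sheet_iterator() for row in sheet.row_iterator()]
```

Because every reader exposes the same two levels of iteration, calling code can loop over sheets and rows without knowing which file format it is reading.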
Adrien Loison
6ae79b63b3 Merge pull request #67 from box/caching_strategies
Caching strategies
2015-07-14 10:58:37 -07:00
Adrien Loison
277b665dad Merge pull request #68 from box/add_coverage_abstract_reader
Improve coverage AbstractReader
2015-07-14 10:55:22 -07:00
Adrien Loison
b0c7c6ca84 Improve coverage AbstractReader 2015-07-14 10:48:03 -07:00
Adrien Loison
494c506d56 Add logic to automatically select the best caching strategy
Based on the number of unique shared strings as well as the available memory amount,
one strategy will be chosen over the other.
The algorithm is based on empirical data and is deliberately conservative, so it may need to be tuned.
2015-07-14 02:26:01 -07:00
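
A selection of this kind might look like the sketch below. It is written in Python for illustration and is not the library's algorithm: the commit does not state its thresholds, so `BYTES_PER_STRING_ESTIMATE`, `SAFETY_FACTOR`, and the function name are hypothetical placeholders. Only the two inputs (number of unique shared strings, available memory) and the two strategies (in-memory, file-based) come from the log.

```python
# Hypothetical tuning constants; the real values are not given in the commit.
BYTES_PER_STRING_ESTIMATE = 1024
SAFETY_FACTOR = 0.5  # spend only part of the available memory on the cache


def choose_caching_strategy(unique_string_count, available_memory_bytes):
    """Return "in-memory" when the estimated footprint fits, else "file-based"."""
    estimated = unique_string_count * BYTES_PER_STRING_ESTIMATE
    budget = available_memory_bytes * SAFETY_FACTOR
    return "in-memory" if estimated <= budget else "file-based"
```

Keeping the constants at module level reflects the note that the algorithm may need tuning: adjusting them changes the cut-over point without touching the logic.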
Adrien Loison
334f7087da Add in-memory caching strategy for shared strings
In-memory implementation using SplFixedArray
Updated code and tests to support errors when reading XML nodes (useful when reading XML files used for attacks)
Removed LIBXML_NOENT option (which DOES substitute entities...)
Added test for Quadratic Blowup attack
2015-07-13 00:29:59 -07:00
Adrien Loison
2dcb86aae9 Move shared strings caching strategy into its own component
This will help implementing different caching strategies:
- file based
- in-memory
2015-07-11 14:12:18 -07:00
Lewis
1e2452934c Additional tests for Cell Types 2015-07-02 19:35:23 +01:00
Adrien Loison
b3df57d2e5 Fix XLSX Writer on Windows platforms
A bug was introduced, preventing Spout from creating valid XLSX files on Windows.
This commit reverts the changes that introduced DIRECTORY_SEPARATOR everywhere
and fixes the original issue with the writer by normalizing paths when creating
the zipped file.
2015-07-01 15:24:58 -07:00
Adrien Loison
7d922e6776 Prevent entity loading when reading XML
Added LIBXML_NOENT option when reading an XML file
libxml_disable_entity_loader(true) cannot be used because it disables
the use of XMLReader::open()... see https://bugs.php.net/bug.php?id=62577
2015-07-01 14:07:15 -07:00
Adrien Loison
8bac924d48 Add support for more cell types
Added proper support for booleans, dates, numbers, errors.
Added unescaping of the read string.
Fixed a bug when cells did not have any value: an empty string is now returned.
2015-06-03 11:19:21 -07:00
Adrien Loison
b21bb86682 Add support for files containing formulas
Formulas will be skipped on reading.
The result of the formulas will be kept though.
2015-05-29 09:01:28 -07:00
Adrien Loison
04d41d7c9f Support XLSX files that don't have a sharedStrings.xml file 2015-05-28 17:59:30 -07:00
Adrien Loison
fb0175d633 Fix issue with directory separators for zip:// on Windows
Replaced "/" by DIRECTORY_SEPARATOR every time it was used with zip://
2015-05-12 20:51:57 -07:00
Adrien Loison
a848be52de Merge pull request #33 from box/improve_code_coverage
Improve code coverage
2015-04-29 11:55:37 -07:00
Adrien Loison
3b4dfba38e Improve code coverage 2015-04-29 11:39:21 -07:00
Adrien Loison
cfd3e0ffa3 Rename *Number to *Index 2015-04-29 10:48:31 -07:00
Adrien Loison
d02013c82e Allow custom sheet name in the XLSX writer
Added setter
Added test
Updated README
2015-04-29 01:01:59 -07:00
Adrien Loison
e9ec4e745c Expose a Sheet object on Reader::XLSX::nextSheet()
Added Sheet class for the XLSX reader that exposes basic sheet info, such as name or ID.
When retrieving the sheet data XML, added extra XML parsing to retrieve sheet data.
Added test
2015-04-29 00:27:45 -07:00
Adrien Loison
3f3461b002 Add and improve test coverage 2015-04-16 14:51:48 -07:00
Adrien Loison
3e5ef284a5 Fix empty shared string bug
Replaced !$sharedString with $sharedString === null so that the case
$sharedString = '' is no longer treated as a missing string
2015-04-16 13:00:02 -07:00
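
The pitfall fixed here, a truthiness check that also rejects the empty string, exists in many languages. The following Python analogue shows the difference between the two checks; it is not the PHP code from the commit, and the function names as well as raising `KeyError` for a missing entry are assumptions made for illustration.

```python
def lookup_buggy(shared_strings, index):
    value = shared_strings.get(index)
    if not value:  # also true for "", so an empty string looks missing
        raise KeyError(index)
    return value


def lookup_fixed(shared_strings, index):
    value = shared_strings.get(index)
    if value is None:  # only a genuinely absent entry is an error
        raise KeyError(index)
    return value
```

With a table such as `{0: "", 1: "x"}`, `lookup_fixed` returns the empty string for index 0, while `lookup_buggy` reports it as missing.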
Adrien Loison
d6155a4243 Better guess the cell type based on its value 2015-04-14 19:52:56 -07:00
Adrien Loison
93cdd398dd Add test for skipping empty rows 2015-04-03 22:45:09 -07:00
Adrien Loison
6e11a043c1 Add support for multiline strings
Escaped line feed characters in shared strings before processing them.
This makes every string remain on one single line and therefore allows
fast retrieval.
Replaced usages of "\n" by PHP_EOL
Added test for multiline strings
2015-03-27 16:54:56 -07:00
Adrien Loison
6bc9a18e9b Add support for empty sheets 2015-01-26 11:22:09 -08:00
Adrien Loison
5e199009e6 First external release 2015-01-15 18:14:07 -08:00