32 Commits

Author SHA1 Message Date
Adrien Loison
03866a6604 Support XLSX with prefixed XML files (#237)
While the standard is not to have prefixes, some XLSX files have XML files containing a prefix.
Microsoft has a tool that generates such files: https://msdn.microsoft.com/en-us/library/office/gg278316.aspx
2016-05-29 22:16:59 -07:00
Adrien Loison
2c80b1f23a XLSX Reader should add a space between text nodes (#229)
When a cell contains multiple text nodes, the cell value is currently obtained by concatenating the value of each text node.
Instead, values should still be concatenated but a space should be added in between.
2016-05-23 14:15:48 -07:00
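The behavior described in the commit above can be sketched as follows. This is an illustration in Python, not Spout's PHP code; the function name `joined_text_value` and the sample `<si>` fragment are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by XLSX worksheet and shared-string parts.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def joined_text_value(fragment_xml: str) -> str:
    """Join the values of all <t> text nodes with a single space (hypothetical helper)."""
    root = ET.fromstring(fragment_xml)
    # iter() walks the tree in document order, so the texts keep their original order.
    texts = [node.text or "" for node in root.iter(f"{{{MAIN_NS}}}t")]
    return " ".join(texts)


# Assumed sample: one string item made of two runs, each with its own text node.
SAMPLE = (
    f'<si xmlns="{MAIN_NS}">'
    "<r><t>Hello</t></r>"
    "<r><t>world</t></r>"
    "</si>"
)
```

With this sample, `joined_text_value(SAMPLE)` returns `"Hello world"` rather than `"Helloworld"`.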
Adrien Loison
104cd9b811 Option to return formatted dates instead of PHP objects (#226)
When reading spreadsheets, Spout should be able to return formatted dates, as shown when opened with Excel for instance.
It currently only returns DateTime/DateInterval objects, making it impossible to read + write, as the Writer does not accept objects.
2016-05-20 16:08:35 -07:00
madflow
2d923c7e46 Fix issue #218 (#222) 2016-05-20 09:32:47 -07:00
Adrien Loison
b4724906c4 Add support for cells formatted as time (#224)
Cells formatted as "time" have values between 0 and 1. These values used to be considered as invalid.
Note: this uses what was started in #202
2016-05-19 13:10:47 -07:00
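As a rough illustration of the commit above, a time value between 0 and 1 can be read as a fraction of a 24-hour day. The sketch below is in Python, not Spout's PHP code, and the function name `fraction_to_hms` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def fraction_to_hms(fraction: float) -> tuple:
    """Convert a day fraction in [0, 1) to (hours, minutes, seconds)."""
    if not 0 <= fraction < 1:
        raise ValueError("expected a value between 0 (inclusive) and 1 (exclusive)")
    # Round to the nearest second, but never spill over into the next day.
    total_seconds = min(round(fraction * SECONDS_PER_DAY), SECONDS_PER_DAY - 1)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds
```

For example, `fraction_to_hms(0.5)` gives `(12, 0, 0)` and `fraction_to_hms(0.75)` gives `(18, 0, 0)`.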
Adrien Loison
b8fd789ac0 Retrieve XLSX sheets in order of appearance (#220)
Instead of relying on the ID, sheets should be retrieved in the order they appear in the file.
Workbook.xml describes the correct order.
This allows the reader to read data in the correct order when sheets have been manually moved after creation.
2016-05-19 10:37:48 -07:00
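The idea in the commit above, reading sheets in the order workbook.xml lists them rather than sorting by ID, might look like this. It is a Python sketch, not Spout's implementation; the function name `sheets_in_order` and the sample workbook are assumptions.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by workbook.xml.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def sheets_in_order(workbook_xml: str) -> list:
    """Return (name, sheetId) pairs in the order they appear in workbook.xml."""
    root = ET.fromstring(workbook_xml)
    sheets = root.find(f"{{{MAIN_NS}}}sheets")
    if sheets is None:
        return []
    # findall() preserves document order, which is the order shown to the user.
    return [
        (sheet.get("name"), sheet.get("sheetId"))
        for sheet in sheets.findall(f"{{{MAIN_NS}}}sheet")
    ]


# Assumed sample: the sheet with ID 3 was manually moved in front of the sheet with ID 1.
SAMPLE_WORKBOOK = f"""<workbook xmlns="{MAIN_NS}">
  <sheets>
    <sheet name="Summary" sheetId="3"/>
    <sheet name="Data" sheetId="1"/>
  </sheets>
</workbook>"""
```

Here `sheets_in_order(SAMPLE_WORKBOOK)` returns `[("Summary", "3"), ("Data", "1")]`, so "Summary" is read first even though its ID is higher.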
madflow
616925148e Renamed xlsx file, #195 2016-04-07 08:58:54 +02:00
madflow
6f0f7c9690 Fix #195 2016-04-06 22:00:47 +02:00
skeleton
d6e8fe4b54 Fix line breaks on CSV reader 2016-03-23 23:26:49 +01:00
madflow
2b1160bb33 Tests for #184 2016-03-19 11:34:31 +01:00
Sebastian Fichera
86e26632f6 Added test case for custom EOL characters... 2016-02-12 16:30:18 -06:00
Adrien Loison
a804be4844 Support XLSX that are defined in random order
Some software generates the [Content_Types].xml file with sheet definitions in random order.
Instead of having the first sheet (id = 1) defined first, it may be defined in 3rd position.
Therefore, to read the file in the correct order, the sheet order needs to be fixed.
2016-01-08 08:42:29 -08:00
Adrien Loison
8ef6bdac62 Better date support
Although Excel has a Date type, older Excel versions use numeric values to store dates.
The value represents the number of days since Jan 1st, 1900.
The only way to tell if the value is a number or a date is to look at the styles.xml and check if the cell has date formatting.
2015-10-23 16:04:38 -07:00
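To make the numeric date representation in the commit above concrete, here is a small Python sketch (not Spout's PHP code). It assumes the 1900 date system, and the function name `excel_serial_to_datetime` is hypothetical. The base date 1899-12-30 is used instead of 1900-01-01 because Excel counts a nonexistent 29 February 1900, so the sketch is only valid for serial values from 61 (1 March 1900) onward.

```python
from datetime import datetime, timedelta

# Base date that compensates for Excel's fictitious 1900-02-29 (valid for serials >= 61).
EXCEL_BASE_DATE = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert an Excel 1900-system serial number to a datetime."""
    if serial < 61:
        raise ValueError("this sketch only handles serial values from 61 onward")
    # The integer part counts days; the fractional part is the time of day.
    return EXCEL_BASE_DATE + timedelta(days=serial)
```

For example, `excel_serial_to_datetime(42370)` returns `datetime(2016, 1, 1, 0, 0)`. Whether a given numeric cell should be converted at all still depends on the date formatting found in styles.xml, as the commit notes.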
Adrien Loison
a1a1077677 Fix infinite loop for CSV with all lines empty
Only occurred with multiline CSV files
2015-10-05 21:10:41 +02:00
Adrien Loison
818ec2488c Support all ODS cell types
Including:
- date / time
- currency
- percentage
- void

And improved support for boolean
2015-09-02 14:03:38 -07:00
Adrien Loison
e4154dfdc3 ODS Reader
Spout can now read ODS files.
It's on par with the XLSX reader. The only difference is that the row iterator cannot be rewound.
It supports the different output formats from LibreOffice and Excel, skipping extra rows/cells if needed.
2015-09-01 10:53:49 -07:00
Adrien Loison
8a3b895afc Fix CSV reader when last line is empty
If the last line was empty, it would create an infinite loop...
2015-07-29 10:17:51 -07:00
Adrien Loison
5e1cfbfdbd Attempt to convert the non UTF-8 strings to UTF-8 2015-07-27 20:59:12 -07:00
Adrien Loison
1ba10ed2b0 Add wrappers around XMLReader and SimpleXMLElement to improve error handling 2015-07-27 00:49:43 -07:00
Adrien Loison
86a4c3790a Adding more tests 2015-07-26 23:53:49 -07:00
Adrien Loison
6ae79b63b3 Merge pull request #67 from box/caching_strategies
Caching strategies
2015-07-14 10:58:37 -07:00
Adrien Loison
334f7087da Add in-memory caching strategy for shared strings
In-memory implementation using SplFixedArray
Updated code and tests to support errors when reading XML nodes (useful when reading XML files used for attacks)
Removed LIBXML_NOENT option (which DOES substitute entities...)
Added test for Quadratic Blowup attack
2015-07-13 00:29:59 -07:00
Lewis
1e2452934c Additional tests for Cell Types 2015-07-02 19:35:23 +01:00
Adrien Loison
8bac924d48 Add support for more cell types
Added proper support for booleans, dates, numbers, errors.
Added unescaping of the read string.
Fixed a bug when cells did not have any values => now returns empty string.
2015-06-03 11:19:21 -07:00
Adrien Loison
b21bb86682 Add support for files containing formulas
Formulas will be skipped on reading.
The result of the formulas will be kept though.
2015-05-29 09:01:28 -07:00
Adrien Loison
04d41d7c9f Support XLSX that don't have a sharedStrings.xml file 2015-05-28 17:59:30 -07:00
Adrien Loison
e9ec4e745c Expose a Sheet object on Reader::XLSX::nextSheet()
Added Sheet class for the XLSX reader that exposes basic sheet info, such as name or ID.
When retrieving the sheet data XML, added extra XML parsing to retrieve sheet data.
Added test
2015-04-29 00:27:45 -07:00
Adrien Loison
3f3461b002 Add and improve test coverage 2015-04-16 14:51:48 -07:00
Adrien Loison
3e5ef284a5 Fix empty shared string bug
Replaced !$sharedString by $sharedString === null to avoid the case
when $sharedString = ''
2015-04-16 13:00:02 -07:00
Adrien Loison
6e11a043c1 Add support for multiline strings
Escaped line feed characters in shared strings before processing them.
This makes every string remain on a single line and therefore allows
fast retrieval
Replaced usages of "\n" by PHP_EOL
Added test for multiline strings
2015-03-27 16:54:56 -07:00
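The commit above does not say which escaping scheme is used, so the following Python sketch is only one possible way to keep each string on a single line. The function names and the backslash-based scheme are assumptions, not Spout's implementation.

```python
def escape_line_feeds(value: str) -> str:
    """Encode a string so that it contains no literal line feed."""
    # Escape backslashes first so an escaped line feed can be told apart later.
    return value.replace("\\", "\\\\").replace("\n", "\\n")


def unescape_line_feeds(value: str) -> str:
    """Reverse escape_line_feeds()."""
    result = []
    index = 0
    while index < len(value):
        char = value[index]
        if char == "\\" and index + 1 < len(value):
            following = value[index + 1]
            # "\n" (backslash + n) becomes a line feed; "\\" becomes one backslash.
            result.append("\n" if following == "n" else following)
            index += 2
        else:
            result.append(char)
            index += 1
    return "".join(result)
```

Because an escaped string never spans several lines, a file holding one string per line can be indexed by line number, and the original value is restored with `unescape_line_feeds()` on read.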
Adrien Loison
6bc9a18e9b Add support for empty sheets 2015-01-26 11:22:09 -08:00
Adrien Loison
5e199009e6 First external release 2015-01-15 18:14:07 -08:00