spout

Author	SHA1	Message	Date
Adrien Loison	73d5d0ea17	Remove text suffix in XLSX date formats (#341) Some date formats have a text suffix, e.g. "mm/dd/yy;@". We should remove the ";...@" part.	2016-10-18 11:55:36 -07:00
Adrien Loison	2fafb63115	ODS Reader should support num-rows-repeated for non empty rows (#335)	2016-10-17 10:51:12 -07:00
Adrien Loison	0978d340f0	Option to keep empty rows (#331) * Add option to preserve empty rows when reading an XLSX file * Add option to preserve empty rows when reading a CSV file * Add option to preserve empty rows when reading an ODS file	2016-10-17 10:20:02 -07:00
Adrien Loison	cc07072cbb	Better support for Date custom format (#316) - To determine if a style should apply a date format, the presence of "applyNumberFormat" attribute on the "cellXfs" section of styles.xml is now optional. We only look at the "numFmtId" attribute (but early return if "applyNumberFormat" is set to "0"). - The format code can contain lowercase AND now uppercase characters as its pattern. - "General" format code used as a custom format is now supported. It seems to be used by a bunch of programs...	2016-09-24 10:46:42 -07:00
Adrien Loison	b2dc0c3fa9	Fix tests on Windows (#288)	2016-08-10 12:09:37 -07:00
Adrien Loison	b75a3e34fc	XLSX cells containing date values should respect shouldFormatDate option (#282) Return the ISO 8601 date string directly if option is set	2016-07-20 20:12:00 -07:00
Adrien Loison	a8eb7ad39c	Shared strings table without uniqueCount and count should work (#269) Use file based strategy in this case	2016-07-11 19:03:37 +02:00
Adrien Loison	ffea8871a6	Add support for missing cell reference (#268) When describing a cell, the cell reference (r="A1") is optional. When not present, we should just increment the index of the last processed row.	2016-07-11 18:15:55 +02:00
Adrien Loison	1891c0b053	Fix XLSX reading when shared strings is missing the uniqueCount attribute (#255) Use "count" attribute as a fallback	2016-06-16 10:06:11 -07:00
madflow	cd38ba093e	Fix #245 (#246)	2016-06-08 09:50:00 -07:00
Ingmar Runge	efebfb2bc2	CellValueFormatterTest: fix expectations for 32bit PHP (#234)	2016-05-30 10:25:30 -07:00
Adrien Loison	251c0bebc1	Adding open_file_in_zip() helper function to XMLReader (#238)	2016-05-29 23:22:57 -07:00
Adrien Loison	03866a6604	Support XLSX with prefixed XML files (#237) While the standard is not to have prefixes, some XLSX files have XML files containing a prefix. Microsoft has a tool that generates such files: https://msdn.microsoft.com/en-us/library/office/gg278316.aspx	2016-05-29 22:16:59 -07:00
Adrien Loison	2c80b1f23a	XLSX Reader should add a space between text nodes (#229) When a cell contains multiple text nodes, the cell value is currently obtained by concatenating the value of each text node. Instead, values should still be concatenated but a space should be added in between.	2016-05-23 14:15:48 -07:00
Adrien Loison	104cd9b811	Option to return formatted dates instead of PHP objects (#226) When reading spreadsheets, Spout should be able to return formatted dates, as shown when opened with Excel for instance. It currently only returns DateTime/DateInterval objects, making it impossible to read + write, as the Writer does not accept objects.	2016-05-20 16:08:35 -07:00
madflow	2d923c7e46	Fix issue #218 (#222)	2016-05-20 09:32:47 -07:00
Adrien Loison	b4724906c4	Add support for cells formatted as time (#224) Cells formatted as "time" have values between 0 and 1. These values used to be considered as invalid. Note: this uses what was started in #202	2016-05-19 13:10:47 -07:00
Adrien Loison	b8fd789ac0	Retrieve XLSX sheets in order of appearance (#220) Instead of relying on the ID, sheets should be retrieved in the order they appear in the file. Workbook.xml describes the correct order. This allows the reader to read data in the correct order when sheets have been manually moved after creation.	2016-05-19 10:37:48 -07:00
Adrien Loison	5a7c2c1262	Handle General number format as non date (#221) If the number format is set to General (id = 0), do not try to format the value as a date	2016-05-19 09:40:12 -07:00
madflow	616925148e	Renamed xlsx file, #195	2016-04-07 08:58:54 +02:00
madflow	6f0f7c9690	Fix #195	2016-04-06 22:00:47 +02:00
skeleton	d6e8fe4b54	Fix line breaks on CSV reader	2016-03-23 23:26:49 +01:00
madflow	2b1160bb33	Tests for #184	2016-03-19 11:34:31 +01:00
Adrien Loison	d2ac54c578	Custom stream wrapper support Added support for custom stream wrappers, such as "fly" or "s3". Support is determined per reader.	2016-03-18 17:09:13 -07:00
Sebastian Fichera	86e26632f6	Added test case for custom EOL characters...	2016-02-12 16:30:18 -06:00
Adrien Loison	4a5da2ad74	Fix CellValueFormatter for numeric values The value passed into the format() function is coming from an XML file and has never been coerced. Therefore, when checking is_int($value), the check always returns false - because it's a string. Changing the check fixes the issue and Spout now correctly parses large numbers.	2016-01-14 11:11:31 -08:00
Adrien Loison	a804be4844	Support XLSX that are defined in random order Some software generates the [Content_Types].xml file with sheet definitions in random order. Instead of having the first sheet (id = 1) defined first, it may be defined in 3rd position. Therefore, to read the file in the correct order, the sheet order needs to be fixed.	2016-01-08 08:42:29 -08:00
Ingmar Runge	4407cffeff	XLSX Date Support / Test + Fix for years beyond 2037 This also fixes years < 1902 on 32-bit PHP systems.	2015-12-17 08:52:15 +01:00
Adrien Loison	8ef6bdac62	Better date support Although Excel has a Date type, older Excel versions use numeric values to store dates. The value represents the number of days since Jan 1st, 1900. The only way to tell if the value is a number or a date is to look at the styles.xml and check if the cell has date formatting.	2015-10-23 16:04:38 -07:00
Adrien Loison	01cc8b3da0	Fix "Cannot open file" issue with XMLReader::open on Windows This occurred when using relative paths. Using realpath() solves this issue.	2015-10-15 09:19:47 -07:00
Adrien Loison	a1a1077677	Fix infinite loop for CSV with all lines empty Only occurred with multiline CSV files	2015-10-05 21:10:41 +02:00
Adrien Loison	818ec2488c	Support all ODS cell types Including: - date / time - currency - percentage - void And improved support for boolean	2015-09-02 14:03:38 -07:00
Adrien Loison	e4154dfdc3	ODS Reader Spout can now read ODS files. It's on par with the XLSX reader. The only difference is that the row iterator cannot be rewound. It supports the different output formats from LibreOffice and Excel, skipping extra rows/cells if needed.	2015-09-01 10:53:49 -07:00
Adrien Loison	8a3b895afc	Fix CSV reader when last line is empty If the last line was empty, it would create an infinite loop...	2015-07-29 10:17:51 -07:00
Adrien Loison	5e1cfbfdbd	Attempt to convert the non UTF-8 strings to UTF-8	2015-07-27 20:59:12 -07:00
Adrien Loison	1ba10ed2b0	Add wrappers around XMLReader and SimpleXMLElement to improve error handling	2015-07-27 00:49:43 -07:00
Adrien Loison	37d87a8a27	Fix various problems	2015-07-27 00:23:18 -07:00
Adrien Loison	86a4c3790a	Adding more tests	2015-07-26 23:53:49 -07:00
Adrien Loison	c52dd7bde8	Remove old reader files	2015-07-26 23:53:17 -07:00
Adrien Loison	ae3ee357ff	Moved readers to iterators Instead of the hasNext() / next() syntax, readers now implement the PHP iterator pattern. It allows readers to be used with a foreach() loop. All readers now share the same structure (CSV is treated as having exactly one sheet): - one concrete Reader - one SheetIterator, exposed by the Reader - one or more Sheets, returned at every iteration - one RowIterator, exposed by the Sheet Introducing the concept of sheets for CSV may be kind of confusing but it makes Spout way more consistent. Also, this confusion may be resolved by creating a wrapper around the readers if needed. -- This commit does not delete the old files, nor change the folder structure for Writers. This will be done in another commit.	2015-07-26 23:53:17 -07:00
Adrien Loison	6ae79b63b3	Merge pull request #67 from box/caching_strategies Caching strategies	2015-07-14 10:58:37 -07:00
Adrien Loison	277b665dad	Merge pull request #68 from box/add_coverage_abstract_reader Improve coverage AbstractReader	2015-07-14 10:55:22 -07:00
Adrien Loison	b0c7c6ca84	Improve coverage AbstractReader	2015-07-14 10:48:03 -07:00
Adrien Loison	494c506d56	Add logic to automatically select the best caching strategy Based on the number of unique shared strings as well as the available memory amount, one strategy will be chosen over the other. The algorithm is based on empirical data and super safe so it may need to be tuned.	2015-07-14 02:26:01 -07:00
Adrien Loison	334f7087da	Add in-memory caching strategy for shared strings In-memory implementation using SplFixedArray Updated code and tests to support errors when reading XML nodes (useful when reading XML files used for attacks) Removed LIBXML_NOENT option (which DOES substitute entities...) Added test for Quadratic Blowup attack	2015-07-13 00:29:59 -07:00
Adrien Loison	2dcb86aae9	Move shared strings caching strategy into its own component This will help implementing different caching strategies: - file based - in-memory	2015-07-11 14:12:18 -07:00
Lewis	1e2452934c	Additional tests for Cell Types	2015-07-02 19:35:23 +01:00
Adrien Loison	7d922e6776	Prevent entity loading when reading XML Added LIBXML_NOENT option when reading an XML file. libxml_disable_entity_loader(true) cannot be used because it disables the use of XMLReader::open()... see https://bugs.php.net/bug.php?id=62577	2015-07-01 14:07:15 -07:00
Adrien Loison	8bac924d48	Add support for more cell types Added proper support for booleans, dates, numbers, errors. Added unescaping of the read string. Fixed a bug when cells did not have any values => now returns empty string.	2015-06-03 11:19:21 -07:00
Adrien Loison	b21bb86682	Add support for files containing formulas Formulas will be skipped on reading. The result of the formulas will be kept though.	2015-05-29 09:01:28 -07:00

60 Commits
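
The date handling described in commits 8ef6bdac62 ("Better date support") and b4724906c4 ("Add support for cells formatted as time") can be illustrated with a minimal sketch. It is written in Python for brevity and is not Spout's PHP implementation. The function names are hypothetical, and the 1899-12-30 epoch is an assumption commonly used to absorb Excel's fictitious 1900-02-29, so it is only accurate for serial values of 61 and above.

```python
from datetime import datetime, time, timedelta

# Assumed epoch: using 1899-12-30 instead of 1900-01-01 compensates for
# Excel's 1-based serials and its fictitious 1900-02-29 (valid for serial >= 61).
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert a numeric cell value carrying a date format into a datetime.

    The integer part counts days; the fractional part is the time of day.
    """
    return EXCEL_EPOCH + timedelta(days=serial)


def excel_fraction_to_time(fraction: float) -> time:
    """Convert a value in [0, 1), as stored for time-formatted cells, into a time."""
    if not 0 <= fraction < 1:
        raise ValueError("time-only values are expected to be in [0, 1)")
    # Clamp so that a fraction extremely close to 1 cannot round up to 24:00:00.
    total_seconds = min(round(fraction * 86400), 86399)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return time(hours, minutes, seconds)
```

Whether a numeric value should be converted at all depends on the cell's style, as the commit messages note: the number itself does not say whether it is a date.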