diff --git a/CHANGELOG.md b/CHANGELOG.md index 4a6f4a46..4e032b81 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,7 +10,7 @@ All Notable changes to `Csv` will be documented in this file - `TabularDataReader::selectAllExcept` - `Statement::selectAllExcept` - `ResultSet::from` and `ResultSet::tryFrom` -- `RdbmsResult` class to allow converting RDBMS result into `ResultSet` +- `RdbmsResult` class to ease importing RDBMS result into the package classes - `TabularData` interface - `Buffer` class - `XMLConverter::supportsHeader` diff --git a/docs/9.0/writer/buffer.md b/docs/9.0/writer/buffer.md index a4f7d1f2..86168d22 100644 --- a/docs/9.0/writer/buffer.md +++ b/docs/9.0/writer/buffer.md @@ -15,14 +15,22 @@ PHP stream capabilities like the `Reader` or the `Writer` do. ## Loading Data into the buffer -The `Buffer` object can be instantiated from any object that implements the `League\Csv\TabularData` like the `Reader` -or the `ResultSet` classes: +### Using the package classes + +The `Buffer` object can be instantiated from any object that implements the package `TabularData` interface +like the `Reader` or the `ResultSet` classes: ```php $buffer = Buffer::from(Reader::createFromPath('path/to/file.csv')); +//or +$document = Reader::createFromPath('path/to/file.csv'); +$document->setHeaderOffset(0); +$altBuffer = Buffer::from($document->slice(0, 30_000)); ``` -Apart from `TabularData` implementing object, the method also accepts results from RDBMS query as shown below: +### Using RDBMS result + +The `from` method also accepts results from RDBMS query as shown below: ```php $db = new SQLite3( '/path/to/my/db.sqlite'); @@ -33,17 +41,15 @@ $user24 = Buffer::from($stmt)->nth(23); // returns ['id' => 42, 'firstname' => 'john', 'lastname' => 'doe', ...] 
``` -The `from` supports the following Database Extensions: +The method supports the following Database Extensions: - SQLite3 (`SQLite3Result` object) - MySQL Improved Extension (`mysqli_result` object) - PostgreSQL (`PgSql\Result` object returned by the `pg_get_result`) - PDO (`PDOStatement` object) -

The Buffer class is mutable. On instantiation, it copies and stores the full source data in-memory.

- -You can tell the `Buffer` instance to exclude the header when importing the data using the `from` named constructor -using the method second optional argument with one of the class public constant: +You can tell the `Buffer` instance to include or exclude the header when importing the data using the +second optional argument of the `from` named constructor with one of the class public constants: - `Buffer::INCLUDE_HEADER` - `Buffer::EXCLUDE_HEADER` @@ -56,6 +62,7 @@ $stmt instanceof SQLite3Result || throw new RuntimeException('SQLite3 results no $user24 = Buffer::from($stmt, Buffer::EXCLUDE_HEADER)->nth(23); //will return a list of properties without any column name attach to them!! // returns [42, 'john', 'doe', ...] +// the header information will be lost and no header data will be present ``` ### Generic Importer Logic @@ -76,110 +83,175 @@ $payload = <<insert(...$data); +``` + +### Buffer state + +

The Buffer class is mutable. On instantiation, +it copies and stores the full source data in-memory.

+ +Once loaded, at any given moment, the `Buffer` exposes the following methods: + +- `Buffer::hasHeader` which tells whether a non-empty header is attached to the buffer +- `Buffer::isEmpty` which tells whether the instance contains no records +- `Buffer::firstOffset` which returns the first **offset** in the buffer or `null` if the instance is empty +- `Buffer::lastOffset` which returns the last **offset** in the buffer or `null` if the instance is empty +- `Buffer::recordCount` which returns the total number of records currently present in the instance + +```php +use League\Csv\Buffer; +use League\Csv\Reader; +use League\Csv\Statement; + +$reader = Reader::createFromPath('/path/to/file.csv'); +$reader->setHeaderOffset(0); + +$buffer = Buffer::from($reader->slice(50, 30_000)); +$buffer->isEmpty(); // returns false +$buffer->hasHeader(); // returns true +$buffer->firstOffset(); // returns 50 +$buffer->lastOffset(); // returns the offset of the last inserted record +$buffer->recordCount(); // the total number of rows in the instance + +$emptyBuffer = new Buffer(); +$emptyBuffer->isEmpty(); // returns true +$emptyBuffer->hasHeader(); // returns false +$emptyBuffer->firstOffset(); // returns null +$emptyBuffer->lastOffset(); // returns null +$emptyBuffer->recordCount(); // returns 0 +``` + +

The Buffer header cannot be changed once the object +has been instantiated. To change the header, you must create a new +Buffer instance.

+ +At any given time you can also retrieve the last inserted record, or an empty `array` if no record has yet +been added to the buffer, via the `Buffer::last` method. The same logic applies to the first inserted +record when using the `Buffer::first` method. Both methods have a `*AsObject` counterpart which maps +the found record to a specified object or returns `null` if no record is found. + +```php +$buffer->first(); //returns ['firstname' => 'john', 'lastname' => 'doe', 'email' => 'john.doe@example.com'] +$buffer->firstAsObject(User::class); // returns a User instance on success +$emptyBuffer->last(); // returns [] +$emptyBuffer->lastAsObject(User::class); // returns null ``` ## Modifying the buffer data -Because of its in-memory and mutable state, the `Buffer` is best suited to help modifying the data on the fly before -persisting it on a more suitable storage layer. To do so, the class provide a straightforward CRUP public API. +Because of its in-memory and mutable state, the `Buffer` is best suited to modifying +the data on the fly before persisting it to a more suitable storage layer. To do so, the +class provides a straightforward CRUD public API. ### Insert Records -The class provides two method to insert records `insertOne` will insert a single record while `insertAll` will insert new records into the instance. -Because tabular data can have a header or not, both methods accept either a list of values or an array of column names and -values as shown below: +The class provides the `insert` method to add records to the instance. 
Because tabular data +may or may not have a header, the method accepts a variadic list of records, each given either as a list of values +or as an associative array, as shown below: ```php -$buffer = Buffer::from(Reader::createFromPath('path/to/file.csv')); -$affectedRowsCount = $buffer->insertAll([['first', 'second', 'third']]); -$affectedRowsCount = $buffer->insertOne(['first', 'second', 'third']); +$buffer = new Buffer(); +$buffer->getHeader(); // returns [] +$buffer->hasHeader(); // returns false +$buffer->insert( + ['moko', 'mibalé', 'misató'], + [ + 'first column' => 'un', + 'second column' => 'deux', + 'third column' => 'trois', + ], + ['one', 'two', 'three'], +); // returns 3 + +return iterator_to_array($buffer->getRecords()); +// [ +// ['moko', 'mibalé', 'misató'], +// ['un', 'deux', 'trois'], +// ['one', 'two', 'three'], +// ]; ``` -The method returns the number of successfully inserted records or trigger an exception if the parameters are invalid. +The method returns the number of successfully inserted records or triggers an exception if the +insertion cannot occur. -

If no header is defined for the instance, column consistency is not checked on insertion. -And only list are accepted as record to be inserted.

+

If no header is defined for the instance, column consistency is not +checked on insertion and associative arrays are inserted without their corresponding keys.

-Let's create a new `Buffer` instance from a `Reader` object. +Let's create a new `Buffer` instance **with a header specified**. ```php $document = Reader::createFromPath('path/to/file.csv'); -$document->setHeaderOffset(0); +$document->setHeaderOffset(0); //the Reader header will be imported alongside its records $buffer = Buffer::from($document); $buffer->getHeader(); // returns ['column1', 'column2', 'column3'] +$buffer->hasHeader(); // return true ``` -We can insert a new record using a list as long as the list has the same length as the `Buffer` instance or the -`Buffer` instance has no header attached to it. +We can insert a new record using a list as long as the list has the same length as the `Buffer` +instance or the `Buffer` instance has no header attached to it. ```php -$affectedRowsCount = $buffer->insertOne(['first', 'second', 'third']); -//will work because the list contains the same number of fields as in the header +$affectedRowsCount = $buffer->insert(['first', 'second', 'third']); +$buffer->last(); // returns ['column1' => 'first', 'column2' => 'second', 'column3' => 'third']; ``` We can also insert a record if it shares the exact same key as the header values. ```php -$affectedRowsCount = $buffer->insertAll([[ +$affectedRowsCount = $buffer->insert([ 'column1' => 'first', 'column2' => 'second', 'column3' => 'third', -]]); +]); ``` On the other hand, trying to insert an incomplete record will trigger an exception. ```php -$buffer->insertOne(['column1' => 'first', 'column3' => 'third']); //will trigger an exception +$buffer->insert(['column1' => 'first', 'column3' => 'third']); //will trigger an exception ``` The same will happen if the list does not contain the same number of fields as the header does when it is present. 
```php -$buffer->insertOne(['first', 'third']); //will trigger an exception +$buffer->insert(['first', 'third']); //will trigger an exception ``` ### Update or Delete Records -The class also provides an `update` and `delete` methods. Those method are responsible for updating or deleting records -based on some constraints and use the following signature. +The class also provides the `update`, `delete` and `truncate` methods. Those methods are responsible for +updating or deleting records based on some constraints and use the following signatures. ```php use League\Csv\Buffer; use League\Csv\Query\Predicate; -Buffer::update(Predicate|Closure|Callable|array|int $where, array $record): int; -Buffer::delete(Predicate|Closure|Callable|array|int $where): int; +Buffer::update(Predicate|Closure|callable $where, array $record): int; +Buffer::delete(Predicate|Closure|callable $where): int; +Buffer::truncate(): void; ``` -Just like the `insert` method, these methods return the number of successfully updated or deleted records or -trigger an exception if the parameters are invalid. - -The `$where` argument can be: - -An integer in which case it represents the specific offset of the `Buffer`. If the offset does not exist, -an exception is triggered. +The `truncate` method removes all the records present in the `Buffer` instance, leaving its +header state unchanged. ```php -$buffer = Buffer::from(Reader::createFromPath('path/to/file.csv')); -$affectedRowsCount = $buffer->update(234, ['column1' => 'first', 'column2' => 'second']); -$buffer->delete(42); //delete the record with the offset = 42 -``` - -A list of integer representing each a specific offset of the `Buffer`. All the offset **MUST** exist otherwise an -exception will be triggered. 
- -```php -$buffer = Buffer::from(Reader::createFromPath('path/to/file.csv')); -$affectedRowsCount = $buffer->update([234, 5, 28], [1 => 'second']); -$buffer->delete([234, 5, 28]); //delete the record with the offset = 42 +$document = Reader::createFromPath('path/to/file.csv'); +$document->setHeaderOffset(0); //the Reader header will be imported alongside its records +$buffer = Buffer::from($document); +$buffer->isEmpty(); // returns false +$buffer->hasHeader(); // returns true +$buffer->truncate(); +$buffer->isEmpty(); // returns true +$buffer->hasHeader(); // returns true ``` -if the above example, the update is performed using the field offset instead of the field name. This can be handy if -the `Buffer` instance has no header, but it works with or without the presence of one. +On the other hand, the `update` and `delete` methods return the number of successfully updated or +deleted records or trigger an exception if the parameters are invalid. -A callable or a `League\Csv\Predicate` implementing class. +The `$where` argument can be a callable or a `League\Csv\Query\Predicate` implementing class. This is the same +argument used with the `Statement::where` method. ```php use League\Csv\Buffer; @@ -190,19 +262,51 @@ $reader = Reader::createFromPath('path/to/file.csv'); $reader->setHeaderOffset(0); $buffer = Buffer::from($reader->slice(0, 300)); //copy the first 300 lines of the Reader class -$affectedRowsCount = $buffer->update(Column::filterOn('location', '=', 'Berkeley'), ['location' => 'Galway']); +$affectedRowsCount = $buffer->update( + Column::filterOn('location', '=', 'Berkeley'), + ['location' => 'Galway'] +); +``` + +The previous example will update all the rows from the `Buffer` instance where the `location` field +is equal to the `Berkeley` string. To learn more about predicates, refer to +the `ResultSet` documentation page. + +The update or deletion can be performed using the field offset or the field name. 
+This can be handy if the `Buffer` instance has no header, but it works with or without the presence of one. + +

The values returned by the Buffer state methods may vary depending +on the record(s) added and/or deleted.

+ +### Record formatting + +Before insertion, the record can be further formatted using a formatter. A formatter is a `callable` which accepts +a single record as an `array` on input and returns an array representing the formatted record according to its +inner rules. + +```php +function(array $record): array ``` -The previous example will update all the rows `location` field from the `Buffer` instance which contains the value `Berkeley`. -To know more about the predicates you can refer to the `ResultSet` documentation page. +You can attach as many formatters as you want using the `Buffer::addFormatter` method. +Formatters are applied following the *First In First Out* rule. + +```php +use League\Csv\Buffer; + +$buffer = new Buffer(); +$buffer->addFormatter(fn (array $row): array => array_map('strtoupper', $row)); +$buffer->insert(['john', 'doe', 'john.doe@example.com']); +$buffer->last(); //returns ['JOHN', 'DOE', 'JOHN.DOE@EXAMPLE.COM'] +``` ### Record validation By default, the `Buffer` instance will only validate the column field names, if a header is provided, otherwise, column consistency or column value are ignored. To improve validation you can use a record validator. -The validator is a `callable` or a `Closure` which takes a single record as an `array` as its sole argument and returns -a `boolean` to indicate if it satisfies the validator rule. +The validator is a `callable` or a `Closure` which takes a single record as an `array` as its sole argument +and returns a `boolean` to indicate if it satisfies the validator rule. ```php function(array $record): bool @@ -213,10 +317,11 @@ The validator **must** return `true` to validate the submitted record. Any other expression, including truthy ones like `yes`, `1` will make the inserting or updating methods throw an `League\Csv\CannotInsertRecord` exception. -You can attach as many validators as you want using the `Buffer::addValidator` method. Validators are applied following -the *First In First Out* rule. 
+You can attach as many validators as you want using the `Buffer::addValidator` method. Validators are applied +following the *First In First Out* rule. -

The record is checked against your supplied validators after it has been checked for field names integrity.

+

The record is checked against your supplied validators after it has been checked +for field names integrity and formatted using the optional registered formatters.

`Buffer::addValidator` takes two (2) **required** parameters: @@ -237,19 +342,19 @@ $buffer = new Buffer(); $buffer->addValidator(fn (array $row): bool => 10 == count($row), 'row_must_contain_10_cells'); try { - $buffer->insertOne(['john', 'doe', 'john.doe@example.com']); -} catch (CannotInsertRecord $e) { - echo $e->getName(); //displays 'row_must_contain_10_cells' - $e->getData();//returns the invalid data ['john', 'doe', 'john.doe@example.com'] + $buffer->insert(['john', 'doe', 'john.doe@example.com']); +} catch (CannotInsertRecord $exception) { + echo $exception->getName(); //displays 'row_must_contain_10_cells' + $exception->getData();//returns the invalid data ['john', 'doe', 'john.doe@example.com'] } ``` ## Persisting Buffer data -The `Buffer` content can be store using the `to` method. The method takes 2 arguments, the `Writer` class or any class -that implements the `TabularWriter` interface and the same second optional argument used with the `from` method to tell -whether the header should also be written as the first line in the stored persistence layer using the -`TabularWriter` or not. +The `Buffer` content can be stored using the `to` method. The method takes 2 arguments: the `Writer` class or any +class that implements the `TabularDataWriter` interface, and the same second optional argument used with the `from` +method to tell whether the header should also be written as the first line in the persistence layer +using the `TabularDataWriter` or not. ```php use League\Csv\Buffer; @@ -315,8 +420,8 @@ Or simply, use the class select features to expose the buffer content to your sp Since version `9.6` the package provides a common API to works with tabular data like structure. A tabular data is data organized in rows and columns. The fact that the package aim at interacting mainly with CSV does not restrict its usage to CSV document only, In fact if you can provide a tabular data structure to the package -it should be able to manipulate such data with ease. 
Hence, the introduction of the `TabularData` interface to improve -interoperability with any tabular structure. +it should be able to manipulate such data with ease. Hence, the introduction of the `TabularData` interface +to improve interoperability with any tabular structure. As seen by the package a tabular data is: @@ -332,11 +437,10 @@ interface TabularData public function getHeader(): array; public function getRecords(array $header = []): Iterator public function getRecordsAsObject(string $className, array $header = []): Iterator - public function nth(int $offset): array - public function nthAsObject(int $offset, string $className, array $header = []): ?object - public function fetchColumn(int|string $offset): Iterator; - public function fetchPairs(int|string $offset_index, int|string $value_index): Iterator; - public function recordCount(): int; + public function map(callable $callback): Iterator + public function nth(int $nth): array + public function nthAsObject(int $nth, string $className, array $header = []): ?object + public function fetchColumn(int|string $columnIndex): Iterator } ``` @@ -367,23 +471,3 @@ $records = new Statement() ``` `$records` will be a `ResultSet` instance that you can manipulate further more if needed. - -Last but not least, since the `Buffer` is an in-memory tabular data it exposes the following 2 (two) methods `Buffer::isEmpty` -and `Buffer::includeHeader` to quickly know if the instance contains a defined header and if it has already some records in it. 
- -```php -use League\Csv\Buffer; -use League\Csv\Reader; -use League\Csv\Statement; - -$reader = Reader::createFromPath('/path/to/file.csv'); -$reader->setHeaderOffset(0); -$buffer = Buffer::from($reader->slice(0, 30000))); - -$buffer->isEmpty(); // return false -$buffer->includeHeader(); // return true - -$emptyBuffer = new Buffer(); -$emptyBuffer->isEmpty(); // return true -$emptyBuffer->includeHeader(); // return false -``` diff --git a/docs/_data/menu.yml b/docs/_data/menu.yml index 926b7a8d..5074a75a 100644 --- a/docs/_data/menu.yml +++ b/docs/_data/menu.yml @@ -9,32 +9,31 @@ version: Document Loading: '/9.0/connections/instantiation/' Document Output: '/9.0/connections/output/' Character Controls: '/9.0/connections/controls/' + Stream Filters: '/9.0/connections/filters/' BOM Sequence: '/9.0/connections/bom/' Charset Converter: '/9.0/converter/charset/' - Stream Filters: '/9.0/connections/filters/' - Inserting Records: - CSV Writer: '/9.0/writer/' - Tabular Data Buffer: '/9.0/writer/buffer/' - Bundled Helpers: '/9.0/writer/helpers/' Selecting Records: Tabular Data Reader: '/9.0/reader/tabular-data-reader/' CSV Reader: '/9.0/reader/' Result Set: '/9.0/reader/resultset/' Constraint Builders: '/9.0/reader/statement/' Record Mapping: '/9.0/reader/record-mapping/' + Inserting Records: + CSV Writer: '/9.0/writer/' + Tabular Data Buffer: '/9.0/writer/buffer/' + Bundled Helpers: '/9.0/writer/helpers/' Interoperability: Overview : '/9.0/interoperability/' Handling Delimiter : '/9.0/interoperability/swap-delimiter/' Formula Injection : '/9.0/interoperability/escape-formula-injection/' - Tabular Data Importer: '/9.0/interoperability/tabular-data-importer/' Document Encoding : '/9.0/interoperability/encoding/' RFC4180 Field : '/9.0/interoperability/rfc4180-field/' Force Enclosure : '/9.0/interoperability/enclose-field/' Converting Records: Overview: '/9.0/converter/' + JSON Converter: '/9.0/converter/json/' XML Converter: '/9.0/converter/xml/' HTML Converter: 
'/9.0/converter/html/' - JSON Converter: '/9.0/converter/json/' Extensions: Overview: '/9.0/extensions/' Doctrine: '/9.0/extensions/doctrine' diff --git a/src/Buffer.php b/src/Buffer.php index cf2f3042..afae5ff2 100644 --- a/src/Buffer.php +++ b/src/Buffer.php @@ -16,40 +16,37 @@ use CallbackFilterIterator; use Closure; use Iterator; -use League\Csv\Query\Constraint\Comparison; use League\Csv\Query\Constraint\Criteria; -use League\Csv\Query\Constraint\Offset; use League\Csv\Query\Predicate; -use League\Csv\Query\QueryException; use League\Csv\Serializer\Denormalizer; use League\Csv\Serializer\MappingFailed; use League\Csv\Serializer\TypeCastingFailed; use mysqli_result; -use OutOfBoundsException; use PDOStatement; use PgSql\Result; use ReflectionException; use RuntimeException; use SQLite3Result; -use Throwable; -use TypeError; -use ValueError; use function array_combine; use function array_diff; +use function array_fill_keys; use function array_filter; use function array_is_list; use function array_key_exists; +use function array_key_first; +use function array_key_last; use function array_keys; use function array_map; +use function array_push; use function array_unique; use function array_values; use function count; use function in_array; -use function is_array; use function is_int; -use function is_string; -use function iterator_to_array; +use function sort; + +use const ARRAY_FILTER_USE_KEY; final class Buffer implements TabularData { @@ -60,18 +57,21 @@ final class Buffer implements TabularData private readonly array $header; /** @var list|array{} */ private readonly array $sortedHeader; - /** @var array> */ + /** @var array */ + private readonly array $nullRecord; + /** @var array> */ private array $rows = []; /** @var array callable collection to validate the record before insertion. */ private array $validators = []; + /** @var array collection of Closure to format the record before reading. 
*/ + private array $formatters = []; /** - * @param iterable> $rows * @param list|array{} $header * * @throws SyntaxError */ - public function __construct(iterable $rows = [], array $header = []) + public function __construct(array $header = []) { $this->header = match (true) { !array_is_list($header) => throw new SyntaxError('The header must be a list of unique column names.'), @@ -81,8 +81,7 @@ public function __construct(iterable $rows = [], array $header = []) }; sort($header); $this->sortedHeader = $header; - - $this->insertAll($rows); + $this->nullRecord = array_fill_keys($this->header, null); } /** @@ -92,30 +91,22 @@ public function __construct(iterable $rows = [], array $header = []) */ public static function from(PDOStatement|Result|mysqli_result|SQLite3Result|TabularData $dataStorage, int $options = self::INCLUDE_HEADER): self { - if ($dataStorage instanceof TabularData) { - if (self::INCLUDE_HEADER === $options) { - $instance = new self(header: $dataStorage->getHeader()); - $instance->rows = iterator_to_array($dataStorage->getRecords(), false); - - return $instance; - } - - $instance = new self(); - $instance->rows = iterator_to_array(new MapIterator($dataStorage->getRecords(), fn (array $row) => array_values($row)), false); /* @phpstan-ignore-line */ - - return $instance; - } - - if (self::INCLUDE_HEADER === $options) { - $instance = new self(header: RdbmsResult::columnNames($dataStorage)); - $instance->rows = RdbmsResult::rows($dataStorage); + /** @var Iterator $rows */ + $rows = $dataStorage instanceof TabularData ? 
$dataStorage->getRecords() : RdbmsResult::rows($dataStorage); + $instance = new self(match (true) { + self::EXCLUDE_HEADER === $options => [], + $dataStorage instanceof TabularData => $dataStorage->getHeader(), + default => RdbmsResult::columnNames($dataStorage), + }); - return $instance; + /** + * @var int $offset + * @var list $row + */ + foreach (new MapIterator($rows, fn (array $record): array => array_values($record)) as $offset => $row) { + $instance->rows[$offset] = $row; } - $instance = new self(); - $instance->rows = array_map(array_values(...), RdbmsResult::rows($dataStorage)); - return $instance; } @@ -131,7 +122,7 @@ public function to(TabularDataWriter $dataStorage, int $options = self::INCLUDE_ $bytes += $dataStorage->insertOne($header); } - return $bytes + $dataStorage->insertAll($this->getRecords()); /* @phpstan-ignore-line */ + return $bytes + $dataStorage->insertAll($this->getRecords()); } public function isEmpty(): bool @@ -139,7 +130,7 @@ public function isEmpty(): bool return [] === $this->rows; } - public function includeHeader(): bool + public function hasHeader(): bool { return [] !== $this->header; } @@ -157,28 +148,16 @@ public function getHeader(): array return $this->header; } + /** + * @throws SyntaxError + * + * @return Iterator + */ public function getRecords(array $header = []): Iterator { - $header = match (true) { - !array_is_list($header) => throw new SyntaxError('The header must be a list of unique column names.'), - $header !== array_filter($header, is_string(...)) => throw SyntaxError::dueToInvalidHeaderColumnNames(), - $header !== array_unique($header) => throw SyntaxError::dueToDuplicateHeaderColumnNames($header), - [] === $header => $this->header, - default => $header, - }; + $header = $this->prepareHeader($header); - return MapIterator::fromIterable($this->rows, match ([]) { - $this->header => fn (array $row): array => array_values($row), - default => function (array $row) use ($header): array { - $record = []; - $values = 
array_values($row); - foreach ($header as $offset => $headerName) { - $record[$headerName] = $values[$offset] ?? null; - } - - return $record; - }, - }); + return MapIterator::fromIterable($this->rows, fn (array $row): array => $this->rowToRecord($row, $header)); } /** @@ -213,65 +192,112 @@ public function map(callable $callback): Iterator return MapIterator::fromIterable($this->getRecords(), $callback); } + /** + * @param non-negative-int $nth + * + * @throws InvalidArgument + */ public function nth(int $nth): array { - try { - array_key_exists($nth, $this->rows) || throw new OutOfBoundsException('The specified offset does not exist.'); - $values = array_values($this->rows[$nth]); - if ([] === $this->header) { - return $values; - } + if ([] === ($row = $this->fetchRow($nth, __METHOD__))) { + return []; + } - $record = []; - foreach ($this->header as $index => $headerName) { - $record[$headerName] = $values[$index] ?? null; - } + return $this->rowToRecord($row, $this->header); + } - return $record; - } catch (Throwable) { - return []; + /** + * @template T of object + * + * @param non-negative-int $nth + * @param class-string $className + * @param array $header + * + * @throws InvalidArgument + * @throws ReflectionException + */ + public function nthAsObject(int $nth, string $className, array $header = []): ?object + { + if ([] === ($row = $this->fetchRow($nth, __METHOD__))) { + return null; } + + return Denormalizer::assign($className, $this->rowToRecord($row, [] !== $header ? $header : $this->header)); + } + + public function firstOffset(): ?int + { + return array_key_first($this->rows); + } + + public function first(): array + { + return null === ($offset = $this->firstOffset()) ? 
[] : $this->rowToRecord($this->rows[$offset], $this->header); } /** * @param class-string $className * @param array $header * - * @throws SyntaxError * @throws ReflectionException */ - public function nthAsObject(int $nth, string $className, array $header = []): ?object + public function firstAsObject(string $className, array $header = []): ?object { - $record = $this->nth($nth); - if ([] === $record) { + if ([] === ($row = $this->rows[$this->firstOffset()] ?? [])) { return null; } - if ([] === $header) { - return Denormalizer::assign($className, $record); + return Denormalizer::assign($className, $this->rowToRecord($row, [] !== $header ? $header : $this->header)); + } + + public function lastOffset(): ?int + { + return array_key_last($this->rows); + } + + public function last(): array + { + return null === ($offset = $this->lastOffset()) ? [] : $this->rowToRecord($this->rows[$offset], $this->header); + } + + /** + * @param class-string $className + * @param array $header + * + * @throws ReflectionException + */ + public function lastAsObject(string $className, array $header = []): ?object + { + if ([] === ($row = $this->rows[$this->lastOffset()] ?? [])) { + return null; } - $header = match (true) { - !array_is_list($header) => throw new SyntaxError('The header must be a list of unique column names.'), - $header !== array_filter($header, is_string(...)) => throw SyntaxError::dueToInvalidHeaderColumnNames(), - $header !== array_unique($header) => throw SyntaxError::dueToDuplicateHeaderColumnNames($header), - default => $header, - }; + return Denormalizer::assign($className, $this->rowToRecord($row, [] !== $header ? $header : $this->header)); + } - $values = array_values($record); - $record = []; - foreach ($header as $index => $headerName) { - $record[$headerName] = $values[$index] ?? 
null; + /** + * @throws InvalidArgument + */ + private function fetchRow(int $nth, string $method): array + { + -1 < $nth || throw InvalidArgument::dueToInvalidRecordOffset($nth, $method); + if (null === ($first = $this->firstOffset())) { + return []; + } + + $offset = $first + $nth; + if (!array_key_exists($offset, $this->rows)) { + return []; } - return Denormalizer::assign($className, $record); + return $this->rows[$nth + $first]; } public function fetchColumn(int|string $index = 0): Iterator { if (is_int($index)) { - $index > -1 || throw new OutOfBoundsException('The specified column `'.$index.'` does not exist.'); - [] === $this->header || array_key_exists($index, $this->header) || throw new OutOfBoundsException('The specified column `'.$index.'` does not exist.'); + $index > -1 || throw InvalidArgument::dueToInvalidColumnIndex($index, 'offset', __METHOD__); + [] === $this->header || array_key_exists($index, $this->header) || throw InvalidArgument::dueToInvalidColumnIndex($index, 'name', __METHOD__); $iterator = new MapIterator($this->getRecords(), fn (array $row) => array_values($row)); $iterator = new CallbackFilterIterator($iterator, fn (array $row) => array_key_exists($index, $row)); @@ -279,162 +305,116 @@ public function fetchColumn(int|string $index = 0): Iterator return new MapIterator($iterator, fn (array $row) => $row[$index]); } - [] !== $this->header || throw new OutOfBoundsException('The specified column `'.$index.'` does not exist.'); - in_array($index, $this->header, true) || throw new OutOfBoundsException('The specified column `'.$index.'` does not exist.'); + [] !== $this->header || throw InvalidArgument::dueToInvalidColumnIndex($index, 'name', __METHOD__); + in_array($index, $this->header, true) || throw InvalidArgument::dueToInvalidColumnIndex($index, 'name', __METHOD__); $iterator = new CallbackFilterIterator($this->getRecords(), fn (array $row) => array_key_exists($index, $row)); return new MapIterator($iterator, fn (array $row) => 
$row[$index]); } - public function fetchPairs(string|int $offset_index = 0, string|int $value_index = 1): Iterator - { - $offset = $this->fetchIndex($offset_index); - $value = $this->fetchIndex($value_index); - - foreach (new CallbackFilterIterator($this->getRecords(), fn (array $record) => isset($record[$offset])) as $record) { - yield $record[$offset] => $record[$value] ?? null; - } - } - /** * Adds a record validator. * - * @param (callable(array): bool)|(Closure(array): bool) $validator + * @param callable(array): bool $validator */ - public function addValidator(callable $validator, string $validator_name): self + public function addValidator(callable $validator, string $name): self { - $this->validators[$validator_name] = !$validator instanceof Closure ? $validator(...) : $validator; + $this->validators[$name] = !$validator instanceof Closure ? $validator(...) : $validator; return $this; } /** - * @throws CannotInsertRecord + * Adds a record formatter. + * + * @param callable(array): array $formatter */ - public function insertOne(array $record): int + public function addFormatter(callable $formatter): self { - $this->rows[] = $this->validateRecord($this->formatInsertRecord($record)); + $this->formatters[] = !$formatter instanceof Closure ? $formatter(...) 
: $formatter; - return 1; + return $this; } /** - * @param iterable $records - * * @throws CannotInsertRecord */ - public function insertAll(iterable $records): int + public function insert(array ...$records): int { - $affectedRows = 0; - foreach ($records as $record) { - $affectedRows += $this->insertOne($record); - } + [] !== $records || throw CannotInsertRecord::triggerOnValidation('@buffer_record_validation_on_insert', $records); + + array_push($this->rows, ...array_map($this->formatInsertRecord(...), $records)); - return $affectedRows; + return count($records); } /** - * @throws QueryException * @throws CannotInsertRecord * @throws SyntaxError */ - public function update(Predicate|Closure|callable|array|int $where, array $record): int + public function update(Predicate|Closure|callable $where, array $record): int { - $where = $this->filterPredicate($where); - $this->filterUpdateRecord($record) || throw new ValueError('The specified record contain invalid column names.'); - - if (array_is_list($record) && [] !== $this->header) { - $formattedRecord = []; - foreach ($this->header as $offset => $headerName) { - $formattedRecord[$headerName] = $record[$offset] ?? 
null; + $record = $this->filterUpdateRecord($record); + $updateRecord = function (array $row) use ($record): array { + foreach ($record as $index => $value) { + $row[$index] = $value; } - $record = $formattedRecord; - } + return $this->validateRecord($row); + }; + /** @var Iterator $iterator */ + $iterator = new MapIterator(new CallbackFilterIterator($this->getRecords(), $this->filterPredicate($where)), $updateRecord); $affectedRecords = 0; - foreach ($this->getRecords() as $offset => $currentRecord) { - if ($where($currentRecord, $offset)) { - foreach ($record as $index => $value) { - $currentRecord[$index] = $value; - } - - $this->rows[$offset] = $this->validateRecord($currentRecord); - $affectedRecords++; - } + foreach ($iterator as $offset => $row) { + $this->rows[$offset] = $row; + $affectedRecords++; } return $affectedRecords; } /** - * @throws QueryException|SyntaxError + * @throws SyntaxError */ - public function delete(Predicate|Closure|callable|array|int $where): int + public function delete(Predicate|Closure|callable $where): int { $affectedRecords = 0; - $where = $this->filterPredicate($where); - foreach ($this->getRecords() as $offset => $record) { - if ($where($record, $offset)) { - unset($this->rows[$offset]); - $affectedRecords++; - } + foreach (new CallbackFilterIterator($this->getRecords(), $this->filterPredicate($where)) as $offset => $row) { + unset($this->rows[$offset]); + $affectedRecords++; } return $affectedRecords; } - private function fetchIndex(string|int $index): string|int + public function truncate(): void { - if (is_string($index)) { - [] !== $this->header || throw new OutOfBoundsException('The specified column `'.$index.'` does not exist.'); - in_array($index, $this->header, true) || throw new OutOfBoundsException('The specified column `'.$index.'` does not exist.'); - - return $index; - } - - $index > -1 || throw new OutOfBoundsException('The specified column `'.$index.'` does not exist.'); - [] === $this->header || 
array_key_exists($index, $this->header) || throw new OutOfBoundsException('The specified column `'.$index.'` does not exist.'); - if ([] === $this->header) { - return $index; - } - - return $this->header[$index]; + $this->rows = []; } + /** + * @throws CannotInsertRecord + */ private function formatInsertRecord(array $record): array { - if (!$this->filterInsertRecord($record)) { - throw new ValueError('The specified record contain invalid column names.'); - } - - if ([] === $this->header) { - return array_values($record); - } - - if (array_is_list($record)) { - return array_combine($this->header, $record); - } - - // re-order the associative array to have all the data - // correctly aligned - $newRow = []; - foreach ($this->header as $name) { - $newRow[$name] = $record[$name]; - } + $this->filterInsertRecord($record) || throw CannotInsertRecord::triggerOnValidation('@buffer_record_validation_on_insert', $record); - return $newRow; + return $this->validateRecord(match (true) { + [] === $this->header => !array_is_list($record) ? array_values($record) : $record, + array_is_list($record) => array_combine($this->header, $record), + default => [...$this->nullRecord, ...$record], + }); } private function filterInsertRecord(array $record): bool { - $recordIsList = array_is_list($record); if ([] === $this->header) { - return $recordIsList; + return true; } - if ($recordIsList) { + if (array_is_list($record)) { return count($record) === count($this->header); } @@ -445,65 +425,73 @@ private function filterInsertRecord(array $record): bool } /** - * Validates a record. 
- * - * @throws CannotInsertRecord If the validation failed + * @throws CannotInsertRecord */ - private function validateRecord(array $record): array - { - foreach ($this->validators as $name => $validator) { - true === $validator($record) || throw CannotInsertRecord::triggerOnValidation($name, $record); - } - - return $record; - } - - private function filterUpdateRecord(array $record): bool + private function filterUpdateRecord(array $record): array { + [] !== $record || throw CannotInsertRecord::triggerOnValidation('@buffer_record_validation_on_update', $record); if (array_is_list($record)) { - return true; + return $this->rowToRecord($record, $this->header); } $keys = array_keys($record); return match (true) { - $keys === array_filter($keys, is_int(...)) => true, + $keys === array_filter($keys, is_int(...)) => $record, $keys !== array_filter($keys, is_string(...)), - [] !== array_diff($keys, $this->header) => false, - default => true, + [] !== array_diff($keys, $this->header) => throw CannotInsertRecord::triggerOnValidation('@buffer_record_validation_on_update', $record), + default => $record, }; } /** - * @throws QueryException + * Validates a record. 
+ * + * @throws CannotInsertRecord If the validation failed */ - private function filterPredicate(Predicate|Closure|callable|array|int $predicate): Predicate + private function validateRecord(array $record): array { - if (is_int($predicate)) { - array_key_exists($predicate, $this->rows) || throw new OutOfBoundsException('The specified offset does not exist.'); - - return Offset::filterOn('=', $predicate); + foreach ($this->formatters as $formatter) { + $record = $formatter($record); } - if (!is_array($predicate)) { - return Criteria::all($predicate); + foreach ($this->validators as $name => $validator) { + true === $validator($record) || throw CannotInsertRecord::triggerOnValidation($name, $record); } - if ($predicate === array_filter($predicate, is_int(...))) { - $foundPredicate = array_filter( - array_map(fn (int $index): ?int => array_key_exists($index, $this->rows) ? $index : null, $predicate), - fn (?int $index): bool => null !== $index - ); + return !array_is_list($record) ? array_values($record) : $record; + } - ($foundPredicate === $predicate) || throw new OutOfBoundsException('At least one of the specified offset does not exist.'); + private function filterPredicate(Predicate|Closure|callable $predicate): Predicate + { + return !$predicate instanceof Predicate ? 
Criteria::all($predicate) : $predicate; + } + + /** + * @throws SyntaxError + */ + private function prepareHeader(array $header): array + { + return match (true) { + [] === $header => $this->header, + $header !== array_filter($header, is_int(...), ARRAY_FILTER_USE_KEY) => throw new SyntaxError('The header must be a list of unique column names.'), + $header !== array_filter($header, is_string(...)) => throw SyntaxError::dueToInvalidHeaderColumnNames(), + $header !== array_unique($header) => throw SyntaxError::dueToDuplicateHeaderColumnNames($header), + default => $header, + }; + } - return Offset::filterOn(Comparison::In, $predicate); + private function rowToRecord(array $row, array $header): array + { + if ([] === $header) { + return $row; } - try { - return Criteria::all($predicate); /* @phpstan-ignore-line */ - } catch (Throwable $exception) { - throw new TypeError('The specified predicate is invalid.', previous: $exception); + $record = []; + foreach ($header as $offset => $headerName) { + $record[$headerName] = $row[$offset] ?? 
null; } + + return $record; } } diff --git a/src/BufferBench.php b/src/BufferBench.php index bb2a8a3d..8c726aa0 100644 --- a/src/BufferBench.php +++ b/src/BufferBench.php @@ -23,7 +23,7 @@ final class BufferBench { #[Bench\OutputTimeUnit('seconds')] #[Bench\Assert('mode(variant.mem.peak) < 4700000'), Bench\Assert('mode(variant.time.avg) < 10000000')] - public function benchReading1MRowsCSVUsingSplFileObject(): void + public function benchLoadingRecordsUsingFromSplFileObject(): void { $path = dirname(__DIR__).'/test_files/prenoms.csv'; @@ -32,7 +32,7 @@ public function benchReading1MRowsCSVUsingSplFileObject(): void #[Bench\OutputTimeUnit('seconds')] #[Bench\Assert('mode(variant.mem.peak) < 4700000'), Bench\Assert('mode(variant.time.avg) < 10000000')] - public function benchReading1MRowsCSVUsingStream(): void + public function benchLoadingRecordsUsingFromStreamResource(): void { $path = dirname(__DIR__).'/test_files/prenoms.csv'; @@ -40,18 +40,28 @@ public function benchReading1MRowsCSVUsingStream(): void } #[Bench\OutputTimeUnit('seconds')] - #[Bench\Assert('mode(variant.mem.peak) < 56000000'), Bench\Assert('mode(variant.time.avg) < 10000000')] + #[Bench\Assert('mode(variant.mem.peak) < 40000000'), Bench\Assert('mode(variant.time.avg) < 10000000')] public function benchWritingAndDeletingEntries(): void { - $numRows = 100_000; - $writer = new Buffer(header: ['foo', 'bar', 'baz']); + $numRows = 10_000; + $buffer = new Buffer(header: ['foo', 'bar', 'baz']); for ($i = 1; $i <= $numRows; ++$i) { - $writer->insertOne(["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"]); + $buffer->insert( + ["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"], + ["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"], + ["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"], + ["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"], + ["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"], + ["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"], + ["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"], + ["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"], + ["csv--{$i}1", "csv--{$i}2", 
"csv--{$i}3"], + ["csv--{$i}1", "csv--{$i}2", "csv--{$i}3"], + ); } - assert($numRows === $writer->recordCount()); - - $writer->delete(fn (array $row, int $offset) => ($offset % 2) === 0); + $buffer->delete(fn (array $row, int $offset) => ($offset % 2) === 0); + assert(50_000 === $buffer->recordCount()); } } diff --git a/src/BufferTest.php b/src/BufferTest.php index 328be8a7..47e1bbaf 100644 --- a/src/BufferTest.php +++ b/src/BufferTest.php @@ -13,7 +13,10 @@ use League\Csv\Buffer; use League\Csv\CannotInsertRecord; +use League\Csv\InvalidArgument; +use League\Csv\Query\Constraint\Column; use League\Csv\Query\Constraint\Offset; +use League\Csv\Reader; use League\Csv\Writer; use PHPUnit\Framework\Attributes\CoversClass; use PHPUnit\Framework\Attributes\Test; @@ -26,23 +29,24 @@ final class BufferTest extends TestCase public function it_will_create_a_datable_with_a_header(): void { $header = ['date', 'temperature', 'place']; - $dataTable = new Buffer([ + $buffer = new Buffer($header); + $buffer->insert( ['2011-01-01', '1', 'Galway'], ['2011-01-02', '-1', 'Galway'], ['2011-01-03', '0', 'Galway'], ['2011-01-01', '6', 'Berkeley'], ['2011-01-02', '8', 'Berkeley'], ['2011-01-03', '5', 'Berkeley'], - ], $header); + ); - self::assertSame($header, $dataTable->getHeader()); + self::assertSame($header, $buffer->getHeader()); self::assertSame( ['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Galway'], - $dataTable->nth(0) + $buffer->nth(0) ); - self::assertSame(6, $dataTable->recordCount()); - self::assertTrue($dataTable->includeHeader()); - self::assertFalse($dataTable->isEmpty()); + self::assertSame(6, $buffer->recordCount()); + self::assertTrue($buffer->hasHeader()); + self::assertFalse($buffer->isEmpty()); $weather = new class (new DateTimeImmutable(), 6, 'Brussels') { public function __construct( @@ -53,29 +57,42 @@ public function __construct( } }; - $obj = $dataTable->nthAsObject(0, $weather::class); - self::assertInstanceOf($weather::class, $obj); - 
self::assertSame('2011-01-01', $obj->date->format('Y-m-d')); + $objStart = $buffer->nthAsObject(0, $weather::class); + $objEnd = $buffer->nthAsObject($buffer->recordCount() - 1, $weather::class); + $objFirst = $buffer->firstAsObject($weather::class); + $objLast = $buffer->lastAsObject($weather::class); + + self::assertInstanceOf($weather::class, $objStart); + self::assertInstanceOf($weather::class, $objFirst); + self::assertInstanceOf($weather::class, $objLast); + self::assertSame('2011-01-01', $objStart->date->format('Y-m-d')); + self::assertEquals($objFirst, $objStart); + self::assertEquals($objLast, $objEnd); } #[Test] public function it_will_create_a_datable_without_a_header(): void { - $dataTable = new Buffer([ + $buffer = new Buffer(); + self::assertSame([], $buffer->last()); + self::assertSame([], $buffer->first()); + self::assertNull($buffer->nthAsObject(23, stdClass::class)); + + $buffer->insert( ['2011-01-01', '1', 'Galway'], ['2011-01-02', '-1', 'Galway'], ['2011-01-03', '0', 'Galway'], ['2011-01-01', '6', 'Berkeley'], ['2011-01-02', '8', 'Berkeley'], ['2011-01-03', '5', 'Berkeley'], - ]); + ); - self::assertSame([], $dataTable->getHeader()); - self::assertFalse($dataTable->includeHeader()); - self::assertFalse($dataTable->isEmpty()); - self::assertSame(['2011-01-01', '1', 'Galway'], $dataTable->nth(0)); - self::assertSame([], $dataTable->nth(42)); - self::assertSame(6, $dataTable->recordCount()); + self::assertSame([], $buffer->getHeader()); + self::assertFalse($buffer->hasHeader()); + self::assertFalse($buffer->isEmpty()); + self::assertSame(['2011-01-01', '1', 'Galway'], $buffer->nth(0)); + self::assertSame([], $buffer->nth(42)); + self::assertSame(6, $buffer->recordCount()); $weather = new class (new DateTimeImmutable(), 6, 'Brussels') { public function __construct( @@ -86,17 +103,17 @@ public function __construct( } }; - $obj = $dataTable->nthAsObject(0, $weather::class, ['date', 'temperature', 'place']); + $obj = 
$buffer->firstAsObject($weather::class, ['date', 'temperature', 'place']); self::assertInstanceOf($weather::class, $obj); self::assertSame('2011-01-01', $obj->date->format('Y-m-d')); - self::assertNull($dataTable->nthAsObject(42, $weather::class, ['date', 'temperature', 'place'])); + self::assertNull($buffer->nthAsObject(42, $weather::class, ['date', 'temperature', 'place'])); - $collection = $dataTable->getRecordsAsObject($weather::class, ['date', 'temperature', 'place']); + $collection = $buffer->getRecordsAsObject($weather::class, ['date', 'temperature', 'place']); $collection = iterator_to_array($collection); self::assertInstanceOf($weather::class, $collection[0]); self::assertSame('2011-01-01', $collection[0]->date->format('Y-m-d')); - $mappedCollection = $dataTable->map( + $mappedCollection = $buffer->map( fn (array $item) => new ($weather::class)(new DateTimeImmutable($item[0]), (int) $item[1], (string) $item[2]) ); @@ -109,43 +126,47 @@ public function __construct( public function it_will_only_consider_header_content_and_not_the_record_keys_and_values_1(): void { $header = ['date', 'temperature']; - $dataTable = new Buffer([ + $buffer = new Buffer($header); + $buffer->insert( ['date' => '2011-01-01', 'temperature' => '1'], ['date' => '2011-01-02', 'temperature' => '-1'], ['date' => '2011-01-03', 'temperature' => '0'], ['date' => '2011-01-01', 'temperature' => '6'], ['date' => '2011-01-02', 'temperature' => '8'], ['date' => '2011-01-03', 'temperature' => '5'], - ], $header); + ); - self::assertSame(['date' => '2011-01-01', 'temperature' => '1'], $dataTable->nth(0)); - self::assertSame($header, $dataTable->getHeader()); - self::assertSame(6, $dataTable->recordCount()); + self::assertSame(['date' => '2011-01-01', 'temperature' => '1'], $buffer->nth(0)); + self::assertSame($header, $buffer->getHeader()); + self::assertSame(6, $buffer->recordCount()); + self::assertSame(['date' => '2011-01-03', 'temperature' => '5'], $buffer->last()); + self::assertSame(['date' 
=> '2011-01-01', 'temperature' => '1'], $buffer->first()); } #[Test] public function it_will_only_consider_header_content_and_not_the_record_keys_and_values_2(): void { $header = ['date', 'temperature', 'place']; - $dataTable = new Buffer([ + $buffer = new Buffer($header); + $buffer->insert( ['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley'], ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Berkeley'], ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Berkeley'], ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Berkeley'], ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Berkeley'], ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Berkeley'], - ], $header); + ); - self::assertSame(['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley'], $dataTable->nth(0)); - self::assertSame($header, $dataTable->getHeader()); - self::assertSame(6, $dataTable->recordCount()); + self::assertSame(['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley'], $buffer->nth(0)); + self::assertSame($header, $buffer->getHeader()); + self::assertSame(6, $buffer->recordCount()); } #[Test] public function it_will_return_no_rows_if_non_rows_are_supplied(): void { $emptyDataTable = new Buffer(); - self::assertFalse($emptyDataTable->includeHeader()); + self::assertFalse($emptyDataTable->hasHeader()); self::assertTrue($emptyDataTable->isEmpty()); self::assertSame([], $emptyDataTable->getHeader()); self::assertSame(0, $emptyDataTable->recordCount()); @@ -161,140 +182,139 @@ public function it_will_return_no_rows_if_non_rows_are_supplied(): void public function it_can_be_filled_by_inserting_new_records(): void { $header = ['date', 'temperature', 'place']; - $dataTable = new Buffer(header: $header); - self::assertSame(2, $dataTable->insertAll([ + $buffer = new Buffer($header); + self::assertSame(2, $buffer->insert( ['2011-01-01', '1', 'Galway'], ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 
'Berkeley'], - ])); + )); - self::assertSame($header, $dataTable->getHeader()); - self::assertSame(2, $dataTable->recordCount()); - self::assertSame(['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Galway'], $dataTable->nth(0)); + self::assertSame($header, $buffer->getHeader()); + self::assertSame(2, $buffer->recordCount()); + self::assertSame(['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Galway'], $buffer->nth(0)); } #[Test] public function it_can_not_be_filled_by_inserting_new_invalid_list_due_to_missing_fields(): void { $header = ['date', 'temperature', 'place']; - $dataTable = new Buffer(header: $header); + $buffer = new Buffer($header); - $this->expectException(\ValueError::class); - $dataTable->insertOne(['2011-01-01', '1']); + $this->expectException(CannotInsertRecord::class); + $buffer->insert(['2011-01-01', '1']); } #[Test] public function it_can_not_be_filled_by_inserting_new_invalid_record_with_missing_fields(): void { $header = ['date', 'temperature', 'place']; - $dataTable = new Buffer(header: $header); + $buffer = new Buffer($header); - $this->expectException(\ValueError::class); - $dataTable->insertOne(['date' => '2011-01-01', 'temperature' => '1']); + $this->expectException(CannotInsertRecord::class); + $buffer->insert(['date' => '2011-01-01', 'temperature' => '1']); } #[Test] public function it_can_not_be_filled_by_inserting_new_invalid_record_with_unknown_fields(): void { $header = ['date', 'temperature', 'place']; - $dataTable = new Buffer(header: $header); + $buffer = new Buffer($header); - $this->expectException(\ValueError::class); - $dataTable->insertOne(['date' => '2011-01-01', 'temperature' => '1', 'location' => 'Berkeley']); + $this->expectException(CannotInsertRecord::class); + $buffer->insert(['date' => '2011-01-01', 'temperature' => '1', 'location' => 'Berkeley']); } #[Test] public function it_can_not_be_filled_by_inserting_new_invalid_record_with_extra_fields(): void { $header = ['date', 'temperature', 'place']; 
- $dataTable = new Buffer(header: $header); + $buffer = new Buffer($header); - $this->expectException(\ValueError::class); - $dataTable->insertOne(['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley', 'origin' => 'station']); + $this->expectException(CannotInsertRecord::class); + $buffer->insert(['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley', 'origin' => 'station']); } #[Test] public function it_can_not_be_filled_by_inserting_new_invalid_record_with_mixed_fields(): void { $header = ['date', 'temperature', 'place']; - $dataTable = new Buffer(header: $header); + $buffer = new Buffer($header); - $this->expectException(\ValueError::class); - $dataTable->insertOne(['date' => '2011-01-01', '1', 'place' => 'Berkeley', 'origin' => 'station']); + $this->expectException(CannotInsertRecord::class); + $buffer->insert(['date' => '2011-01-01', '1', 'place' => 'Berkeley', 'origin' => 'station']); } #[Test] public function it_can_not_be_filled_by_inserting_new_invalid_record_on_buffer_without_header(): void { - $dataTable = new Buffer(); - $this->expectException(\ValueError::class); - $dataTable->insertAll([['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley', 'origin' => 'station']]); + $buffer = new Buffer(); + $buffer->insert(['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley', 'origin' => 'station']); + + self::assertSame(1, $buffer->recordCount()); + self::assertSame(['2011-01-01', '1', 'Berkeley', 'station'], $buffer->nth(0)); } #[Test] public function it_can_be_filled_by_inserting_new_record_on_buffer_without_header(): void { - $dataTable = new Buffer(); - $dataTable->insertOne(['2011-01-01', '1', 'Berkeley', 'station']); - self::assertSame(1, $dataTable->recordCount()); + $buffer = new Buffer(); + $buffer->insert(['2011-01-01', '1', 'Berkeley', 'station']); + self::assertSame(1, $buffer->recordCount()); } #[Test] public function it_can_be_updated_by_replacing_or_removing_records_using_its_offset(): void 
{ $header = ['date', 'temperature', 'place']; - $dataTable = new Buffer([ + $buffer = new Buffer($header); + $buffer->insert( ['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley'], ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Berkeley'], ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Berkeley'], + ); + $buffer->insert( ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Galway'], ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Galway'], ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Galway'], - ], $header); + ); - self::assertSame(1, $dataTable->update(0, ['date' => '2025-02-08', 'temperature' => '3'])); - self::assertSame(2, $dataTable->delete([4, 5, 4])); + self::assertSame(1, $buffer->update(fn (array $row, int $offset): bool => 0 === $offset, ['date' => '2025-02-08', 'temperature' => '3'])); + self::assertSame(2, $buffer->delete(fn (array $row, int $offset): bool => in_array($offset, [4, 5, 4], true))); - self::assertSame($header, $dataTable->getHeader()); - self::assertSame(4, $dataTable->recordCount()); - self::assertSame(['date' => '2025-02-08', 'temperature' => '3', 'place' => 'Berkeley'], $dataTable->nth(0)); + self::assertSame($header, $buffer->getHeader()); + self::assertSame(4, $buffer->recordCount()); + self::assertSame(['date' => '2025-02-08', 'temperature' => '3', 'place' => 'Berkeley'], $buffer->nth(0)); } #[Test] - public function it_can_not_push_invalid_records(): void + public function it_can_not_insert_invalid_records(): void { $header = ['date', 'temperature', 'place']; - $dataTable = new Buffer(header: $header); + $buffer = new Buffer($header); - $this->expectException(ValueError::class); + $this->expectException(CannotInsertRecord::class); - $dataTable->insertAll([['foo' => 'bar']]); + $buffer->insert(['foo' => 'bar']); } #[Test] - public function it_can_not_unshift_invalid_records(): void + public function it_can_not_insert_without_records(): void { $header = ['date', 
'temperature', 'place']; - $dataTable = new Buffer(header: $header); + $buffer = new Buffer($header); - $this->expectException(ValueError::class); - - $dataTable->insertAll([['foo' => 'bar']]); + $this->expectException(CannotInsertRecord::class); + $buffer->insert(); } #[Test] - public function it_can_allow_removing_no_records_or_all_records(): void + public function it_can_not_unshift_invalid_records(): void { - $dataTable = new Buffer([ - ['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Galway'], - ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Galway'], - ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Galway'], - ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Berkeley'], - ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Berkeley'], - ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Berkeley'], - ], ['date', 'temperature', 'place']); + $header = ['date', 'temperature', 'place']; + $buffer = new Buffer($header); + + $this->expectException(CannotInsertRecord::class); - $this->expectException(OutOfBoundsException::class); - $dataTable->delete(42); + $buffer->insert(['foo' => 'bar']); } #[Test] @@ -463,145 +483,181 @@ public function it_can_be_used_with_pdo_without_storing_the_header(): void self::assertSame([1, 'Ronnie', 'ronnie@example.com'], $tabularData->nth(0)); } - #[Test] - public function it_will_fail_with_bogus_predicate(): void - { - $this->expectException(TypeError::class); - - (new Buffer())->delete(['foo' => 'bar']); - } - #[Test] public function it_implements_array_access_getter_methods_with_header(): void { - $dataTable = new Buffer([ + $buffer = new Buffer(['date', 'temperature', 'place']); + $buffer->insert( ['date' => '2011-01-01', 'temperature' => '1', 'place' => null], ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Berkeley'], ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Berkeley'], ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Galway'], 
['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Galway'], ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Galway'], - ], ['date', 'temperature', 'place']); + ); - self::assertSame(['date' => '2011-01-01', 'temperature' => '1', 'place' => null], $dataTable->nth(0)); - self::assertNotEmpty($dataTable->nth(5)); - self::assertEmpty($dataTable->nth(42)); + self::assertSame(['date' => '2011-01-01', 'temperature' => '1', 'place' => null], $buffer->nth(0)); + self::assertNotEmpty($buffer->nth(5)); + self::assertEmpty($buffer->nth(42)); + } + + #[Test] + public function it_throws_an_exception_on_negative_offset(): void + { + $buffer = new Buffer(['date', 'temperature', 'place']); + $this->expectException(InvalidArgument::class); + + $buffer->nth(-1); /* @phpstan-ignore-line */ + } + + #[Test] + public function it_returns_an_empty_array_if_no_record_is_present(): void + { + $buffer = new Buffer(['date', 'temperature', 'place']); + + self::assertSame([], $buffer->nth(0)); + self::assertSame([], $buffer->nth(42)); } #[Test] public function it_implements_array_access_getter_methods_without_header(): void { - $dataTable = new Buffer([ + $buffer = new Buffer(); + $buffer->insert( ['2011-01-01', '1'], ['2011-01-02', '-1'], ['2011-01-03', '0'], ['2011-01-01', '6'], ['2011-01-02', '8'], ['2011-01-03', '5'], - ]); + ); - self::assertSame(['2011-01-01', '1'], $dataTable->nth(0)); - self::assertNotEmpty($dataTable->nth(5)); - self::assertEmpty($dataTable->nth(42)); + self::assertSame(['2011-01-01', '1'], $buffer->nth(0)); + self::assertNotEmpty($buffer->nth(5)); + self::assertEmpty($buffer->nth(42)); } #[Test] public function it_can_update_the_buffer_using_a_record_as_a_list(): void { - $dataTable = new Buffer([ + $buffer = new Buffer(['date', 'temperature', 'place']); + $buffer->insert( ['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Galway'], ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Berkeley'], ['date' => '2011-01-03', 'temperature' 
=> '0', 'place' => 'Galway'],
             ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Berkeley'],
             ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Galway'],
             ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Berkeley'],
-        ], ['date', 'temperature', 'place']);
+        );
+
+        self::assertSame(0, $buffer->update(Offset::filterOn('=', 42), ['2011-01-01', '1', 'bujumbura']));
+        self::assertSame(1, $buffer->update(Offset::filterOn('=', 0), ['2011-01-01', '1', 'bujumbura']));
+        self::assertSame(['date' => '2011-01-01', 'temperature' => '1', 'place' => 'bujumbura'], $buffer->nth(0));
+    }
+
+    #[Test]
+    public function it_can_update_the_buffer_using_a_record_as_a_list_on_a_buffer_without_header(): void
+    {
+        $buffer = new Buffer();
+        $buffer->insert(
+            ['2011-01-01', '1', 'Galway'],
+            ['2011-01-02', '-1', 'Berkeley'],
+            ['2011-01-03', '0', 'Galway'],
+            ['2011-01-01', '6', 'Berkeley'],
+            ['2011-01-02', '8', 'Galway'],
+            ['2011-01-03', '5', 'Berkeley'],
+        );

-        self::assertSame(0, $dataTable->update(Offset::filterOn('=', 42), ['2011-01-01', '1', 'bujumbura']));
-        self::assertSame(1, $dataTable->update(0, ['2011-01-01', '1', 'bujumbura']));
-        self::assertSame(['date' => '2011-01-01', 'temperature' => '1', 'place' => 'bujumbura'], $dataTable->nth(0));
+        self::assertSame(0, $buffer->update(Offset::filterOn('=', 42), ['2011-01-01', '1', 'bujumbura']));
+        self::assertSame(1, $buffer->update(Offset::filterOn('=', 0), [2 => 'bujumbura']));
+        self::assertSame(['2011-01-01', '1', 'bujumbura'], $buffer->nth(0));
     }

     #[Test]
     public function it_can_delete_the_buffer_records_using_a_closure(): void
     {
-        $dataTable = new Buffer([
+        $buffer = new Buffer(['date', 'temperature', 'place']);
+        $buffer->insert(
             ['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley'],
             ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Galway'],
             ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Berkeley'],
             ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Galway'],
             ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Berkeley'],
             ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Galway'],
-        ], ['date', 'temperature', 'place']);
+        );

-        self::assertSame(3, $dataTable->delete(fn (array $record, int $offset): bool => 0 === $offset % 2));
-        self::assertSame(3, $dataTable->recordCount());
+        self::assertSame(3, $buffer->delete(fn (array $record, int $offset): bool => 0 === $offset % 2));
+        self::assertSame(3, $buffer->recordCount());
     }

     #[Test]
     public function it_can_return_a_column_by_name(): void
     {
-        $dataTable = new Buffer([
+        $buffer = new Buffer(['date', 'temperature', 'place']);
+        $buffer->insert(
             ['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley'],
             ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Galway'],
             ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Galway'],
             ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Galway'],
             ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Berkeley'],
             ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Berkeley'],
-        ], ['date', 'temperature', 'place']);
+        );

         self::assertSame(
             ['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-01', '2011-01-02', '2011-01-03'],
-            iterator_to_array($dataTable->fetchColumn('date'))
+            iterator_to_array($buffer->fetchColumn('date'))
         );
     }

     #[Test]
     public function it_can_return_a_column_by_offset(): void
     {
-        $dataTable = new Buffer([
+        $buffer = new Buffer();
+        $buffer->insert(
             ['2011-01-01', '1'],
             ['2011-01-02', '-1'],
             ['2011-01-03', '0'],
             ['2011-01-01', '6'],
             ['2011-01-02', '8'],
             ['2011-01-03', '5'],
-        ]);
+        );

         self::assertSame(
             ['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-01', '2011-01-02', '2011-01-03'],
-            iterator_to_array($dataTable->fetchColumn())
+            iterator_to_array($buffer->fetchColumn())
         );
     }

     #[Test]
     public function it_can_return_a_column_by_offset_even_when_theres_a_header(): void
     {
-        $dataTable = new Buffer([
+        $buffer = new Buffer(['date', 'temperature', 'place']);
+        $buffer->insert(
             ['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley'],
             ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Berkeley'],
             ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Galway'],
             ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Berkeley'],
             ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Galway'],
             ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Berkeley'],
-        ], ['date', 'temperature', 'place']);
+        );

         self::assertSame(
             ['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-01', '2011-01-02', '2011-01-03'],
-            iterator_to_array($dataTable->fetchColumn())
+            iterator_to_array($buffer->fetchColumn())
         );
     }

     #[Test]
     public function it_will_store_the_buffer_with_its_header(): void
     {
-        $dataTable = new Buffer([
+        $buffer = new Buffer(['date', 'temperature']);
+        $buffer->insert(
             ['2011-01-01', '1'],
             ['2011-01-02', '-1'],
-        ], ['date', 'temperature']);
+        );

         $writer = Writer::createFromString();
-        $res = $dataTable->to($writer);
+        $res = $buffer->to($writer);

         self::assertSame(44, $res);
         self::assertSame("date,temperature\n2011-01-01,1\n2011-01-02,-1\n", $writer->toString());
@@ -610,13 +666,14 @@ public function it_will_store_the_buffer_with_its_header(): void
     #[Test]
     public function it_will_store_the_buffer_without_its_header(): void
     {
-        $dataTable = new Buffer([
+        $buffer = new Buffer(['date', 'temperature']);
+        $buffer->insert(
             ['2011-01-01', '1'],
             ['2011-01-02', '-1'],
-        ], ['date', 'temperature']);
+        );

         $writer = Writer::createFromString();
-        $res = $dataTable->to($writer, Buffer::EXCLUDE_HEADER);
+        $res = $buffer->to($writer, Buffer::EXCLUDE_HEADER);

         self::assertSame(27, $res);
         self::assertSame("2011-01-01,1\n2011-01-02,-1\n", $writer->toString());
@@ -625,90 +682,146 @@ public function it_will_store_the_buffer_without_its_header(): void
     #[Test]
     public function it_will_store_the_buffer_without_its_header_if_none_exists(): void
     {
-        $dataTable = new Buffer([
+        $buffer = new Buffer();
+        $buffer->insert(
             ['2011-01-01', '1'],
             ['2011-01-02', '-1'],
-        ]);
+        );

         $writer = Writer::createFromString();
-        $res = $dataTable->to($writer);
+        $res = $buffer->to($writer);

         self::assertSame(27, $res);
         self::assertSame("2011-01-01,1\n2011-01-02,-1\n", $writer->toString());
     }

     #[Test]
-    public function it_can_return_the_column_pairs_when_the_buffer_has_a_header(): void
+    public function it_can_validate_a_record_on_insertion(): void
     {
-        $dataTable = new Buffer([
-            ['date' => '2011-01-01', 'temperature' => '1', 'place' => 'Berkeley'],
-            ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Berkeley'],
-            ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Galway'],
-            ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Berkeley'],
-            ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Galway'],
-            ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Berkeley'],
-        ], ['date', 'temperature', 'place']);
-
-        $expected = [
-            '1' => 'Berkeley',
-            '-1' => 'Berkeley',
-            '0' => 'Galway',
-            '6' => 'Berkeley',
-            '8' => 'Galway',
-            '5' => 'Berkeley',
-        ];
+        $buffer = new Buffer();
+        $buffer->addValidator(fn (array $row): bool => $row[1] >= 0, 'func1');

-        self::assertSame($expected, iterator_to_array($dataTable->fetchPairs(1, 2)));
-        self::assertSame($expected, iterator_to_array($dataTable->fetchPairs('temperature', 'place')));
-        self::assertSame($expected, iterator_to_array($dataTable->fetchPairs(1, 'place')));
-        self::assertSame($expected, iterator_to_array($dataTable->fetchPairs('temperature', 2)));
+        $this->expectExceptionObject(CannotInsertRecord::triggerOnValidation('func1', ['column1', -1]));
+
+        $buffer->insert(['column1', 1]);
+        $buffer->insert(['column1', -1]);
     }

     #[Test]
-    public function it_can_return_the_column_pairs_when_the_buffer_has_no_header(): void
+    public function it_can_validate_a_record_on_update(): void
     {
-        $dataTable = new Buffer([
-            ['2011-01-01', '1', 'Berkeley'],
-            ['2011-01-02', '-1', 'Berkeley'],
-            ['2011-01-03', '0', 'Galway'],
-            ['2011-01-01', '6', 'Berkeley'],
-            ['2011-01-02', '8', 'Galway'],
-            ['2011-01-03', '5', 'Berkeley'],
-        ]);
-
-        $expected = [
-            '1' => 'Berkeley',
-            '-1' => 'Berkeley',
-            '0' => 'Galway',
-            '6' => 'Berkeley',
-            '8' => 'Galway',
-            '5' => 'Berkeley',
-        ];
+        $buffer = new Buffer();
+        $buffer->addValidator(fn (array $row): bool => $row[1] >= 0, 'func1');
+
+        $this->expectExceptionObject(CannotInsertRecord::triggerOnValidation('func1', ['column1', -1]));

-        self::assertSame($expected, iterator_to_array($dataTable->fetchPairs(1, 2)));
+        $buffer->insert(['column1', 1]);
+        $buffer->update(fn (array $row, int $offset): bool => 0 === $offset, [1 => -1]);
     }

     #[Test]
-    public function it_can_validate_a_record_on_insertion(): void
+    public function it_can_re_order_the_fields_using_the_header(): void
     {
-        $dataTable = new Buffer();
-        $dataTable->addValidator(fn (array $row): bool => $row[1] >= 0, 'func1');
+        $buffer = new Buffer();
+        $buffer->insert(
+            ['moko', 'mibalé', 'misató'],
+            ['un', 'deux', 'trois'],
+            ['one', 'two', 'three'],
+            ['unos', 'dos', 'tres'],
+        );

-        $this->expectExceptionObject(CannotInsertRecord::triggerOnValidation('func1', ['column1', -1]));
+        $res = iterator_to_array($buffer->getRecords([2 => 'column 1', 1 => 'column 2', 0 => 'column 3']));
+        self::assertSame(['column 1' => 'misató', 'column 2' => 'mibalé', 'column 3' => 'moko'], $res[0]);
+    }
+
+    #[Test]
+    public function it_fails_to_update_a_record_with_no_record(): void
+    {
+        $buffer = new Buffer();
+        $buffer->insert(
+            ['moko', 'mibalé', 'misató'],
+            ['un', 'deux', 'trois'],
+            ['one', 'two', 'three'],
+            ['unos', 'dos', 'tres'],
+        );

-        $dataTable->insertOne(['column1', 1]);
-        $dataTable->insertOne(['column1', -1]);
+        $this->expectException(CannotInsertRecord::class);
+        $buffer->update(fn (array $row, int $offset): bool => 1 === $offset, []);
     }

     #[Test]
-    public function it_can_validate_a_record_on_update(): void
+    public function it_can_format_the_data_inserted(): void
     {
-        $dataTable = new Buffer();
-        $dataTable->addValidator(fn (array $row): bool => $row[1] >= 0, 'func1');
+        $buffer = new Buffer();
+        $buffer->addFormatter(fn (array $row): array => array_map(strtoupper(...), $row));
+        $buffer->insert(['jane', 'doe']);
+        self::assertSame(['JANE', 'DOE'], $buffer->nth(0));
+    }

-        $this->expectExceptionObject(CannotInsertRecord::triggerOnValidation('func1', ['column1', -1]));
+    #[Test]
+    public function it_formats_the_data_before_validation(): void
+    {
+        $buffer = new Buffer();
+        $buffer->addFormatter(fn (array $row): array => array_map(strtoupper(...), $row));
+        $buffer->addValidator(fn (array $row): bool => strtolower($row[1]) === $row[1], 'func1');
+
+        $this->expectException(CannotInsertRecord::class);
+        $buffer->insert(['jane', 'doe']);
+    }
+
+    #[Test]
+    public function it_makes_a_difference_between_record_offset_first_and_last(): void
+    {
+        $csv = <<setHeaderOffset(0);
+
+        $buffer = Buffer::from($document->slice(2, 3));
+        $bufferAsArray = iterator_to_array($buffer->getRecords());
+        $firstRecord = ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Galway'];
+        $lastRecord = ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Berkeley'];
+
+        self::assertSame($firstRecord, $buffer->first());
+        //offset 3 based on the original CSV document
+        self::assertSame($firstRecord, $bufferAsArray[$buffer->firstOffset()]);
+        self::assertSame($lastRecord, $buffer->last());
+        //offset 5 based on the original CSV document
+        self::assertSame($lastRecord, $bufferAsArray[$buffer->lastOffset()]);
+
+        //delete all records except the first record!
+        $buffer->delete(Column::filterOn('temperature', '<>', '0'));
+        self::assertSame($firstRecord, $bufferAsArray[$buffer->firstOffset()]);
+        self::assertSame($firstRecord, $bufferAsArray[$buffer->lastOffset()]);
+    }
+
+    #[Test]
+    public function it_implements_the_truncate_method(): void
+    {
+        $buffer = new Buffer(['date', 'temperature', 'place']);
+        $buffer->insert(
+            ['date' => '2011-01-01', 'temperature' => '1', 'place' => null],
+            ['date' => '2011-01-02', 'temperature' => '-1', 'place' => 'Berkeley'],
+            ['date' => '2011-01-03', 'temperature' => '0', 'place' => 'Berkeley'],
+            ['date' => '2011-01-01', 'temperature' => '6', 'place' => 'Galway'],
+            ['date' => '2011-01-02', 'temperature' => '8', 'place' => 'Galway'],
+            ['date' => '2011-01-03', 'temperature' => '5', 'place' => 'Galway'],
+        );
+
+        self::assertFalse($buffer->isEmpty());
+        self::assertTrue($buffer->hasHeader());
+
+        $buffer->truncate();

-        $dataTable->insertOne(['column1', 1]);
-        $dataTable->update(0, [1 => -1]);
+        self::assertTrue($buffer->isEmpty());
+        self::assertTrue($buffer->hasHeader());
     }
 }
diff --git a/src/RdbmsResult.php b/src/RdbmsResult.php
index 3fb2c3de..c76aa727 100644
--- a/src/RdbmsResult.php
+++ b/src/RdbmsResult.php
@@ -13,7 +13,7 @@

 namespace League\Csv;

-use Iterator;
+use Generator;
 use mysqli_result;
 use PDO;
 use PDOStatement;
@@ -40,52 +40,44 @@ final class RdbmsResult
     public static function columnNames(PDOStatement|Result|mysqli_result|SQLite3Result $result): array
     {
         return match (true) {
+            $result instanceof PDOStatement => array_map(fn (int $index): string => $result->getColumnMeta($index)['name'] ?? throw new RuntimeException('Unable to get metadata for column '.$index), range(0, $result->columnCount() - 1)),
             $result instanceof mysqli_result => array_column($result->fetch_fields(), 'name'),
             $result instanceof Result => array_map(fn (int $index) => pg_field_name($result, $index), range(0, pg_num_fields($result) - 1)),
             $result instanceof SQLite3Result => array_map($result->columnName(...), range(0, $result->numColumns() - 1)),
-            $result instanceof PDOStatement => array_map(fn (int $index): string => $result->getColumnMeta($index)['name'] ?? throw new RuntimeException('Unable to get metadata for column '.$index), range(0, $result->columnCount() - 1)),
         };
     }

     /**
-     * @return array>
+     * @return Generator>
      */
-    public static function rows(PDOStatement|Result|mysqli_result|SQLite3Result $result): array
+    public static function rows(PDOStatement|Result|mysqli_result|SQLite3Result $result): Generator
     {
         if ($result instanceof PDOStatement) {
-            return $result->fetchAll(PDO::FETCH_ASSOC);
-        }
-
-        /** @var array> $records */
-        $records = [];
-        if ($result instanceof mysqli_result) {
-            while ($record = $result->fetch_assoc()) {
-                $records[] = $record;
+            while ($row = $result->fetch(PDO::FETCH_ASSOC)) {
+                yield $row; /* @phpstan-ignore-line */
             }

-            return $records;
+            return;
         }

         if ($result instanceof Result) {
-            while ($record = pg_fetch_assoc($result)) {
-                $records[] = $record;
+            while ($row = pg_fetch_assoc($result)) {
+                yield $row;
             }

-            return $records;
+            return;
         }

-        while ($record = $result->fetchArray(SQLITE3_ASSOC)) {
-            $records[] = $record;
-        }
+        if ($result instanceof mysqli_result) {
+            while ($row = $result->fetch_assoc()) {
+                yield $row;
+            }

-        return $records;
-    }
+            return;
+        }

-    /**
-     * @return Iterator>
-     */
-    public static function iteratorRows(PDOStatement|Result|mysqli_result|SQLite3Result $result): Iterator
-    {
-        return MapIterator::toIterator(self::rows($result));
+        while ($row = $result->fetchArray(SQLITE3_ASSOC)) {
+            yield $row;
+        }
     }
 }
diff --git a/src/RdbmsResultTest.php b/src/RdbmsResultTest.php
index 4909a3cd..fa08ac7f 100644
--- a/src/RdbmsResultTest.php
+++ b/src/RdbmsResultTest.php
@@ -21,6 +21,8 @@
 use SQLite3Result;
 use SQLite3Stmt;

+use function iterator_count;
+
 final class RdbmsResultTest extends TestCase
 {
     #[Test]
@@ -63,7 +65,7 @@ public function it_can_be_used_with_sqlite3(): void
         $tabularData = ResultSet::tryFrom($result);

         self::assertSame(['id', 'name', 'email'], $tabularData->getHeader());
-        self::assertSame(6, $tabularData->recordCount());
+        self::assertSame(6, iterator_count($tabularData->getRecords()));
         self::assertSame(
             ['id' => 1, 'name' => 'Ronnie', 'email' => 'ronnie@example.com'],
             ResultSet::from($tabularData)->first()
@@ -104,7 +106,7 @@ public function it_can_be_used_with_pdo(): void
         $tabularData = ResultSet::tryFrom($stmt);

         self::assertSame(['id', 'name', 'email'], $tabularData->getHeader());
-        self::assertSame(6, $tabularData->recordCount());
+        self::assertSame(6, iterator_count($tabularData->getRecords()));
         self::assertSame(
             ['id' => 1, 'name' => 'Ronnie', 'email' => 'ronnie@example.com'],
             ResultSet::from($tabularData)->first()
diff --git a/src/Reader.php b/src/Reader.php
index 3c9cc5fd..df4f756a 100644
--- a/src/Reader.php
+++ b/src/Reader.php
@@ -292,7 +292,7 @@ public function fetchPairs(string|int $offset_index = 0, string|int $value_index
     /**
      * @throws Exception
      */
-    public function recordCount(): int
+    public function count(): int
     {
         if (-1 === $this->nb_records) {
             $this->nb_records = iterator_count($this->getRecords());
@@ -301,14 +301,6 @@ public function recordCount(): int
         return $this->nb_records;
     }

-    /**
-     * @throws Exception
-     */
-    public function count(): int
-    {
-        return $this->recordCount();
-    }
-
     /**
      * @throws Exception
      */
diff --git a/src/ResultSet.php b/src/ResultSet.php
index 205559a1..a3986467 100644
--- a/src/ResultSet.php
+++ b/src/ResultSet.php
@@ -109,7 +109,13 @@ public static function tryFrom(PDOStatement|Result|mysqli_result|SQLite3Result|T
     public static function from(PDOStatement|Result|mysqli_result|SQLite3Result|TabularData $tabularData): self
     {
         if (!$tabularData instanceof TabularData) {
-            return new self(RdbmsResult::iteratorRows($tabularData), RdbmsResult::columnNames($tabularData));
+            /** @var ArrayIterator> $data */
+            $data = new ArrayIterator();
+            foreach (RdbmsResult::rows($tabularData) as $offset => $row) {
+                $data[$offset] = $row;
+            }
+
+            return new self($data, RdbmsResult::columnNames($tabularData));
         }

         return new self($tabularData->getRecords(), $tabularData->getHeader());
@@ -477,14 +483,9 @@ protected function combineHeader(array $header): Iterator
         };
     }

-    public function recordCount(): int
-    {
-        return iterator_count($this->records);
-    }
-
     public function count(): int
     {
-        return $this->recordCount();
+        return iterator_count($this->records);
     }

     public function jsonSerialize(): array
diff --git a/src/ResultSetTest.php b/src/ResultSetTest.php
index 408b947c..0947f92d 100644
--- a/src/ResultSetTest.php
+++ b/src/ResultSetTest.php
@@ -409,11 +409,11 @@ public function testHeaderMapperOnResultSetAlwaysIgnoreTheColumnName(): void
 CSV;
         $reader = Reader::createFromString($csv)
             ->setHeaderOffset(0);
-        $this->expectException(SyntaxError::class);
+        $this->expectException(SyntaxError::class);

         [...(new Statement())
             ->process($reader)
-            ->getRecords(['lastname' => 'nom de famille', 'firstname' => 'prenom', 'e-mail' => 'e-mail'])];
+            ->getRecords(['lastname' => 'nom de famille', 'firstname' => 'prenom', 'e-mail' => 'e-mail'])]; /* @phpstan-ignore-line */
     }

     public function testChunkByIssue524(): void
diff --git a/src/TabularData.php b/src/TabularData.php
index 65eb88c4..aa1cda21 100644
--- a/src/TabularData.php
+++ b/src/TabularData.php
@@ -45,7 +45,7 @@ public function getHeader(): array;
      * filled with null values while extra record fields are strip from
      * the returned object.
      *
-     * @param array $header an optional header mapper to use instead of the CSV document header
+     * @param array $header an optional header mapper to use instead of the tabular data header
      *
      * @return Iterator>
      */
@@ -63,25 +63,4 @@ public function getRecords(array $header = []): Iterator;
      * @return Iterator
      */
     public function fetchColumn(string|int $index = 0): Iterator;
-
-    /**
-     * Returns the next key-value pairs from the tabular data (first
-     * column is the key, second column is the value).
-     *
-     * By default, if no column index is provided:
-     * - the first column is used to provide the keys
-     * - the second column is used to provide the value
-     *
-     * @param string|int $offset_index The column index to serve as offset
-     * @param string|int $value_index The column index to serve as value
-     *
-     * @throws UnableToProcessCsv if the column index is invalid or not found
-     */
-    public function fetchPairs(string|int $offset_index = 0, string|int $value_index = 1): Iterator;
-
-    /**
-     * Returns the number of records contained in the tabular data structure
-     * excluding the header record.
-     */
-    public function recordCount(): int;
 }
diff --git a/src/TabularDataReader.php b/src/TabularDataReader.php
index 906f2a62..94146db2 100644
--- a/src/TabularDataReader.php
+++ b/src/TabularDataReader.php
@@ -62,6 +62,27 @@ interface TabularDataReader extends TabularData, IteratorAggregate, Countable
      */
     public function getIterator(): Iterator;

+    /**
+     * Returns the number of records contained in the tabular data structure
+     * excluding the header record.
+     */
+    public function count(): int;
+
+    /**
+     * Returns the next key-value pairs from the tabular data (first
+     * column is the key, second column is the value).
+     *
+     * By default, if no column index is provided:
+     * - the first column is used to provide the keys
+     * - the second column is used to provide the value
+     *
+     * @param string|int $offset_index The column index to serve as offset
+     * @param string|int $value_index The column index to serve as value
+     *
+     * @throws UnableToProcessCsv if the column index is invalid or not found
+     */
+    public function fetchPairs(string|int $offset_index = 0, string|int $value_index = 1): Iterator;
+
     /**
      * DEPRECATION WARNING! This method will be removed in the next major point release.
      *
@@ -77,10 +98,4 @@ public function getIterator(): Iterator;
      */
     #[Deprecated(message:'use League\Csv\TabularDataReader::nth() instead', since:'league/csv:9.9.0')]
     public function fetchOne(int $nth_record = 0): array;
-
-    /**
-     * Returns the number of records contained in the tabular data structure
-     * excluding the header record.
-     */
-    public function count(): int;
 }