Skip to content

Commit e941469

Browse files
committed
Add suggestions from code review
1 parent e27fd60 commit e941469

File tree

2 files changed

+47
-40
lines changed

2 files changed

+47
-40
lines changed

tests/AggregationBlogPost/FuelExample1Test.php

-1
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,6 @@
22

33
namespace AggregationBlogPost;
44

5-
use MongoDB\BSON\Type;
65
use MongoDB\Builder\Accumulator;
76
use MongoDB\Builder\Expression;
87
use MongoDB\Builder\Pipeline;

tests/AggregationBlogPost/article.md

+47-39
Original file line numberDiff line numberDiff line change
@@ -121,12 +121,12 @@ pipeline aggregates all of these documents, producing a document for each month:
121121
Without going into more details on this, even if we were to comment on parts of
122122
the aggregation pipeline to explain what it does, there will still be a high
123123
cognitive load when going through the aggregation pipeline. One reason for this
124-
is that any PHP editor will not know that this is an aggregation pipeline, and
125-
thus can't provide much help beyond syntax highlighting (e.g. "this is a string
126-
in an array"). Couple that with a few levels of nesting, and you've got yourself
127-
this magical kind of code that you can write, but not read. We can of course
128-
refactor this code, but before we get into that, we want to move away from these
129-
array structures.
124+
is that the only way to express the aggregation framework domain-specific
125+
language (DSL) is through untyped arrays, and any PHP editor can't provide much
126+
help beyond syntax highlighting. Pair that with a few levels of nesting, and
127+
you've got yourself the kind of code that you can write, but not read. We could
128+
start off by refactoring the code, but instead let's try to move away from array
129+
structures and use a better solution.
130130

131131
## Introducing the Aggregation Pipeline Builder
132132

@@ -137,6 +137,12 @@ and operators. Here is that same pipeline as we had before, this time written
137137
with the aggregation pipeline builder:
138138

139139
```php
140+
use MongoDB\Builder\Accumulator;
141+
use MongoDB\Builder\Expression;
142+
use MongoDB\Builder\Pipeline;
143+
use MongoDB\Builder\Stage;
144+
use function MongoDB\object;
145+
140146
$pipeline = new Pipeline(
141147
Stage::group(
142148
_id: object(
@@ -209,7 +215,9 @@ that can receive an aggregation pipeline, such as `Collection::aggregate` or
209215
`Collection::watch`. In addition, methods like `Collection::updateMany` and
210216
`Collection::findOneAndUpdate` can receive a `Pipeline` instance to run an
211217
update. Keep in mind that you won't be able to use all available aggregation
212-
pipeline stages in update operations.
218+
pipeline stages in update operations. Since a pipeline instance could be used
219+
everywhere, the builder will allow you to add stages to a pipeline that aren't
220+
supported in updates.
213221

214222
### Builder Design
215223

@@ -223,7 +231,8 @@ expression that resolves to a date, timestamp, or ObjectId. While we could use
223231
`$reportDate` to reference the `reportDate` field of the document being evaluated,
224232
`dateFieldPath` is more expressive and shows intent of receiving a date field.
225233
This also allows IDEs like PhpStorm to make better suggestions when offering
226-
code completion.
234+
code completion, as well as a certain amount of type checking to ensure you're
235+
not creating pipelines that would result in server errors.
227236

228237
For all expressions, there are factory classes with methods to create the
229238
expression objects. The use of static methods makes the code a little more
@@ -314,17 +323,17 @@ to since they are only used once, but you may want to do so in favor of consiste
314323
In complex pipelines, you'll often find comments explaining what a certain
315324
pipeline stage or segment does. You should definitely include comments like
316325
that, but you can also consider extracting parts of a pipeline to your own
317-
builder method. If you choose a descriptive method name, this can already
318-
explain what the stage or segment does, without the reader having to see the
319-
internal workings.
326+
builder method. Choosing a descriptive method name that explains what the stage
327+
or segment does can also remove the need for a comment and spare the reader from
328+
having to dive into the internal workings.
320329

321330
```php
322331
public static function groupAndComputeStatistics(
323-
stdClass $_id,
332+
stdClass $groupBy,
324333
Expression\ResolvesToDouble $price,
325334
): GroupStage {
326335
return Stage::group(
327-
_id: $_id,
336+
_id: $groupBy,
328337
lowest: Accumulator::min($price),
329338
highest: Accumulator::max($price),
330339
average: Accumulator::avg($price),
@@ -342,7 +351,7 @@ $reportDate = Expression::dateFieldPath('reportDate');
342351
$price = Expression::doubleFieldPath('price');
343352

344353
self::groupAndComputeStatistics(
345-
_id: object(
354+
groupBy: object(
346355
year: Expression::year($reportDate),
347356
month: Expression::month($reportDate),
348357
fuelType: Expression::fieldPath('fuelType'),
@@ -366,7 +375,7 @@ return a `Pipeline` instance:
366375

367376
```php
368377
public static function groupAndAssembleFuelTypePriceObject(
369-
stdClass $_id,
378+
stdClass $groupBy,
370379
Expression\ResolvesToString $fuelType,
371380
Expression\ResolvesToInt $count,
372381
Expression\ResolvesToDouble $lowest,
@@ -375,7 +384,7 @@ public static function groupAndAssembleFuelTypePriceObject(
375384
): Pipeline {
376385
return new Pipeline(
377386
Stage::group(
378-
_id: $_id,
387+
_id: $groupBy,
379388
count: Accumulator::sum($count),
380389
prices: Accumulator::push(
381390
object(
@@ -409,7 +418,7 @@ the pipeline:
409418

410419
```php
411420
self::groupAndAssembleFuelTypePriceObject(
412-
_id: object(
421+
groupBy: object(
413422
year: Expression::fieldPath('_id.year'),
414423
month: Expression::fieldPath('_id.month'),
415424
brand: Expression::fieldPath('_id.brand'),
@@ -432,7 +441,7 @@ $price = Expression::doubleFieldPath('price');
432441

433442
$pipeline = new Pipeline(
434443
self::groupAndComputeStatistics(
435-
_id: object(
444+
groupBy: object(
436445
year: Expression::year($reportDate),
437446
month: Expression::month($reportDate),
438447
fuelType: Expression::fieldPath('fuelType'),
@@ -441,7 +450,7 @@ $pipeline = new Pipeline(
441450
price: $price,
442451
),
443452
self::groupAndAssembleFuelTypePriceObject(
444-
_id: object(
453+
groupBy: object(
445454
year: Expression::fieldPath('_id.year'),
446455
month: Expression::fieldPath('_id.month'),
447456
brand: Expression::fieldPath('_id.brand'),
@@ -453,7 +462,7 @@ $pipeline = new Pipeline(
453462
average: Expression::fieldPath('average'),
454463
),
455464
self::groupBrandsAndSort(
456-
_id: object(
465+
groupBy: object(
457466
year: Expression::fieldPath('_id.year'),
458467
month: Expression::fieldPath('_id.month'),
459468
),
@@ -588,16 +597,16 @@ public static function computeElapsedSecondsOnDay(
588597
* argument.
589598
*/
590599
public static function computeDurationBetweenDates(
591-
Expression\ResolvesToDate $previousReportDate,
592-
Expression\ResolvesToDate $reportDate,
600+
Expression\ResolvesToDate $startDate,
601+
Expression\ResolvesToDate $endDate,
593602
): Expression\ResolvesToInt {
594603
return Expression::min(
595604
Expression::dateDiff(
596-
startDate: $previousReportDate,
597-
endDate: $reportDate,
605+
startDate: $startDate,
606+
endDate: $endDate,
598607
unit: 'second',
599608
),
600-
self::computeElapsedSecondsOnDay($reportDate),
609+
self::computeElapsedSecondsOnDay($endDate),
601610
);
602611
}
603612
```
@@ -638,7 +647,7 @@ This would include all the logic included in the pipeline. By extracting complex
638647
parts into separate methods, you can test those in isolation under controlled
639648
conditions, knowing that they behave as you intended. Looking at the last
640649
example where we've extracted the expression that computes the duration between
641-
two dates, you can now easily verify that all the individual parts of that
650+
two dates, you can now better verify that all the individual parts of that
642651
complex expression work as expected.
643652

644653
Just beware of premature abstractions: extract complex logic with the goal of
@@ -656,12 +665,12 @@ but they allow us to know what types an expression will resolve to.
656665
For an example, let's take the `$hour` operator. When you use
657666
`Expression::hour`, you will receive an instance of an `HourOperator` class.
658667
This class implements an `OperatorInterface`, telling the builder that this is
659-
an operator that can be used in aggregation pipeline. It also implements a
660-
`ResolvesToInt` interface, as we always know that evaluating the expression
661-
results in an integer value. The required `date` parameter of the operator is a
662-
date, which in the builder is one of many things. It could be a BSON
663-
`UTCDateTime` instance, but it could also be the result of any operator that
664-
returns a date, e.g. `$dateFromString`.
668+
an operator that can be used in an aggregation pipeline stage. It also
669+
implements a `ResolvesToInt` interface, as we always know that evaluating the
670+
expression results in an integer value. The required `date` parameter of the
671+
operator is a date, which in the builder is one of many things. It could be a
672+
`MongoDB\BSON\UTCDateTime` instance, but it could also be the result of any
673+
operator that returns a date, e.g. `$dateFromString`.
665674

666675
Now that we know about these value holder objects, we still need to make sure
667676
the server knows what we're talking about. When you call `Collection::aggregate`
@@ -674,8 +683,8 @@ representations.
674683
This allows us to keep the logic customizable. For example, the Doctrine MongoDB
675684
ODM allows users to specify different names for fields in the database. In turn,
676685
it needs to convert the name of a property in the mapped class to the name of
677-
the field in the database. Such a feature could easily be built by creating a
678-
custom encoder for all `fieldPath` expressions, and changing the field path
686+
the field in the database. Such a feature could be built by creating a custom
687+
encoder for all `fieldPath` expressions, and changing the field path
679688
accordingly.
680689

681690
When creating a `MongoDB\Client` instance, you can now pass an additional
@@ -690,7 +699,7 @@ With factories, value holders, and encoders, we wanted to ensure that creating
690699
the builder does not turn into a repetitive chore. As you can imagine, many
691700
operators will mostly consist of the same logic, resulting in tons of code
692701
duplication. To make matters worse, every new server version may introduce new
693-
operators and stages, so we wanted to make sure that we can easily expand
702+
operators and stages, so we wanted to make sure that we can quickly expand
694703
the builder.
695704

696705
We could try to rely on generative AI to help us with this, but this only goes
@@ -706,7 +715,6 @@ before and adds them to the tests. In these tests, we manually write the builder
706715
code that would generate the pipeline or expression from the example. This
707716
ensures that the generated logic behaves as we expect, is protected from
708717
regression should we make any changes to the generator, and it allows us to use
709-
the builder and feel what it's like. To top it all off, we could add this code
710-
to the server documentation, similar to how we add other language-specific code
711-
snippets. We're not quite there yet, but we'd love to include this in the
712-
documentation.
718+
the builder and feel what it's like. Ultimately, we love to add this builder
719+
code to the server documentation, similar to how we add other language-specific
720+
code snippets, but we're not quite there yet.

0 commit comments

Comments
 (0)