28 changes: 28 additions & 0 deletions build/render_hook_docs/DECISION_TREE_FORMAT.md
@@ -49,6 +49,7 @@ questions:
- **`scope`** (required): Category or domain this tree applies to (e.g., `documents`, `collections`, `sequences`). Helps AI agents understand the tree's purpose and applicability.
- **`rootQuestion`** (required): The ID of the starting question
- **`questions`** (required): Object containing all questions, keyed by ID
- **`indentWidth`** (optional): Horizontal spacing in pixels between parent and child nodes. Default: 40. Use smaller values (e.g., 20-30) for deeply nested trees to reduce overall width. This is a rendering preference and does not affect the semantic metadata.
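
As a rough illustration of how these top-level fields fit together, the sketch below follows answers from `rootQuestion` through `questions` until it reaches an outcome. It is not the renderer's code: `walkTree`, `sampleTree`, the question IDs, and the question texts are hypothetical, and the tree is a plain object standing in for parsed YAML.

```javascript
// Hypothetical helper: follow answers from rootQuestion until an outcome is reached.
function walkTree(tree, answersByQuestion) {
  let questionId = tree.rootQuestion;
  const visited = new Set();
  while (questionId) {
    if (visited.has(questionId)) {
      throw new Error(`Cycle detected at question "${questionId}"`);
    }
    visited.add(questionId);
    const question = tree.questions[questionId];
    if (!question) {
      throw new Error(`Unknown question ID "${questionId}"`);
    }
    const choice = answersByQuestion[questionId]; // expected: "yes" or "no"
    const answer = question.answers[choice];
    if (!answer) {
      throw new Error(`No "${choice}" answer for question "${questionId}"`);
    }
    if (answer.outcome) {
      return answer.outcome; // leaf: a recommendation
    }
    questionId = answer.nextQuestion; // branch: continue to the next question
  }
  throw new Error('Tree ended without an outcome');
}

// Illustrative tree only; the texts and the stringOutcome ID are made up.
const sampleTree = {
  id: 'sample-tree',
  scope: 'documents',
  rootQuestion: 'nestedQuestion',
  questions: {
    nestedQuestion: {
      text: 'Do you need nested data?',
      answers: {
        yes: { value: 'Yes', outcome: { label: 'Use JSON', id: 'jsonOutcome' } },
        no: { value: 'No', nextQuestion: 'fieldsQuestion' },
      },
    },
    fieldsQuestion: {
      text: 'Do you need field-level access?',
      answers: {
        yes: { value: 'Yes', outcome: { label: 'Use Hash', id: 'hashOutcome' } },
        no: { value: 'No', outcome: { label: 'Use String', id: 'stringOutcome' } },
      },
    },
  },
};

const result = walkTree(sampleTree, { nestedQuestion: 'no', fieldsQuestion: 'yes' });
console.log(result.id); // hashOutcome
```

Note that `indentWidth` plays no part in the traversal, which is consistent with it being a rendering preference rather than semantic metadata.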

### Question Object

@@ -72,6 +73,29 @@ Each answer (`yes` or `no`) contains:

- **`label`** (required): The recommendation text
- **`id`** (required): Unique identifier (e.g., `jsonOutcome`, `hashOutcome`)
- **`sentiment`** (optional): Indicates the nature of the outcome for visual styling
  - `"positive"`: Renders with green styling (e.g., "Use this option")
  - `"negative"`: Renders with red styling (e.g., "Don't use this option")
  - Omitted: Defaults to red (neutral/warning styling)

**Note**: The `sentiment` field is particularly useful for **suitability trees** (where outcomes are binary: suitable vs. unsuitable) as opposed to **selection trees** (where all outcomes are valid options). See [Tree Types](#tree-types) below.

## Tree Types

Decision trees can serve different purposes, which affects how you structure outcomes:

### Selection Trees
All paths lead to valid recommendations. Users choose between options.
- **Example**: "Which data type should I use?" → JSON, Hash, or String (all valid)
- **Outcome styling**: Typically all outcomes keep the default styling (no `sentiment` field needed)
- **Use case**: Helping users choose among alternatives

### Suitability Trees
Paths lead to binary outcomes: suitable or unsuitable for the use case.
- **Example**: "Should I use RDI?" → Yes (good fit) or No (various reasons why not)
- **Outcome styling**: Use `sentiment: "positive"` for suitable outcomes and `sentiment: "negative"` for unsuitable ones
- **Use case**: Determining if a technology/approach is appropriate
- **Benefit**: Visual distinction (green vs. red) helps users quickly understand if something is recommended

## Multi-line Text

@@ -112,6 +136,10 @@ scope: documents
5. **Consistent naming**: Use camelCase for IDs, end question IDs with "Question"
6. **Match fence and YAML IDs**: The `id` in the code block fence should match the `id` field in the YAML for consistency
7. **Use meaningful scopes**: Choose scope values that clearly indicate the tree's domain (e.g., `documents`, `collections`, `sequences`)
8. **Add sentiment for suitability trees**: If your tree determines whether something is suitable (not just choosing between options), use `sentiment: "positive"` and `sentiment: "negative"` to provide visual feedback
9. **Be consistent with sentiment**: In a suitability tree, ensure all positive outcomes have `sentiment: "positive"` and all negative outcomes have `sentiment: "negative"` for clarity
10. **Control answer order**: The order of `yes` and `no` in the YAML controls the visual layout. For early rejection patterns, put `no` first so negative outcomes appear on the left side of the diagram
11. **Adjust indent width for deeply nested trees**: If your tree has many levels and becomes too wide, set `indentWidth: 25` (or lower) at the top level of the YAML to reduce horizontal spacing between parent and child nodes
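
Practices 8 and 9 could be checked mechanically. The following is a minimal sketch, assuming the tree has already been parsed into a plain object with the `questions` / `answers` / `outcome` shape described above. The function name `findSentimentProblems` and the sample data are hypothetical and are not part of the render hook.

```javascript
// Hypothetical lint: in a suitability tree, every outcome should carry an explicit sentiment.
const ALLOWED_SENTIMENTS = new Set(['positive', 'negative']);

function findSentimentProblems(tree) {
  const problems = [];
  for (const [questionId, question] of Object.entries(tree.questions)) {
    for (const [answerKey, answer] of Object.entries(question.answers)) {
      const outcome = answer.outcome;
      if (!outcome) continue; // this answer leads to another question
      if (!ALLOWED_SENTIMENTS.has(outcome.sentiment)) {
        problems.push(`${questionId}.${answerKey}: outcome "${outcome.id}" has no valid sentiment`);
      }
    }
  }
  return problems;
}

// Illustrative suitability tree with one outcome that forgot its sentiment.
const suitabilityTree = {
  rootQuestion: 'firstQuestion',
  questions: {
    firstQuestion: {
      answers: {
        no: { value: 'No', outcome: { label: 'Not suitable', id: 'rejected' } },
        yes: { value: 'Yes', nextQuestion: 'secondQuestion' },
      },
    },
    secondQuestion: {
      answers: {
        no: { value: 'No', outcome: { label: 'Not suitable', id: 'rejectedLate', sentiment: 'negative' } },
        yes: { value: 'Yes', outcome: { label: 'Good fit', id: 'goodFit', sentiment: 'positive' } },
      },
    },
  },
};

const sentimentProblems = findSentimentProblems(suitabilityTree);
console.log(sentimentProblems);
// [ 'firstQuestion.no: outcome "rejected" has no valid sentiment' ]
```

A check like this only makes sense for suitability trees; selection trees can legitimately omit `sentiment` on every outcome.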

## Example: Redis Data Type Selection

114 changes: 114 additions & 0 deletions build/render_hook_docs/DECISION_TREE_IMPLEMENTATION_NOTES.md
@@ -140,3 +140,117 @@ const maxCharsPerLine = Math.floor(maxBoxWidth / charWidth);

**Benefit**: Helps future implementers and enables AI agents to understand the format.

## 11. Sentiment-Based Styling for Suitability Trees

**Discovery**: Not all decision trees are "selection trees" (choose between options). Some are "suitability trees" (determine if something is appropriate).

**Problem**: Selection trees and suitability trees have fundamentally different semantics:
- **Selection trees**: All outcomes are valid recommendations (e.g., "Use JSON" vs. "Use Hash" vs. "Use String")
- **Suitability trees**: Outcomes are binary (suitable vs. unsuitable) (e.g., "RDI is a good fit" vs. "RDI won't work")

**Solution**: Add optional `sentiment` field to outcomes:
```yaml
outcome:
    label: "✅ RDI is a good fit for your use case"
    id: goodFit
    sentiment: "positive"  # Green styling
```

**Implementation Details**:
- Extract `sentiment` field during YAML parsing in JavaScript
- Apply conditional styling in SVG rendering:
  - `sentiment: "positive"` → Green background (`#0fa869`) and border
  - `sentiment: "negative"` → Red background (`#d9534f`) and border
  - No sentiment → Red (default, maintains backward compatibility)

**Key Insight**: Explicit metadata is better than heuristics. Don't try to infer sentiment from emoji (✅/❌) or label text. Use explicit fields for reliability and AI agent compatibility.

**Backward Compatibility**: Existing trees without sentiment fields continue to work with default red styling. This allows gradual adoption.
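
The styling rule can be sketched as a small lookup. This is an illustration rather than the actual rendering code: the names `SENTIMENT_COLORS` and `outcomeFill` are hypothetical, and only the two hex colors and the red default are taken from the notes above.

```javascript
// Colors from the notes above; the constant and function names are hypothetical.
const SENTIMENT_COLORS = {
  positive: '#0fa869', // green
  negative: '#d9534f', // red
};

// Outcomes without a recognized sentiment fall back to red (backward compatible).
const DEFAULT_OUTCOME_COLOR = SENTIMENT_COLORS.negative;

function outcomeFill(outcome) {
  const sentiment = outcome ? outcome.sentiment : undefined;
  // Use an own-property check so values like "toString" are not treated as sentiments.
  return Object.prototype.hasOwnProperty.call(SENTIMENT_COLORS, sentiment)
    ? SENTIMENT_COLORS[sentiment]
    : DEFAULT_OUTCOME_COLOR;
}

console.log(outcomeFill({ id: 'goodFit', sentiment: 'positive' })); // #0fa869
console.log(outcomeFill({ id: 'adminRejected', sentiment: 'negative' })); // #d9534f
console.log(outcomeFill({ id: 'jsonOutcome' })); // #d9534f (no sentiment, default)
```

The lookup reads only the explicit `sentiment` field and never inspects the label or its emoji, in line with the key insight above.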

## 12. Answer Order Respects YAML Structure

**Discovery**: The JavaScript had two issues preventing YAML answer order from being respected:
1. The `flattenDecisionTree()` function was hardcoded to process "yes" first, then "no"
2. The tree line drawing code was deriving Yes/No labels from position (first child = Yes, others = No) instead of using the actual answer value

**Problem**: This prevented authors from controlling the visual layout of the tree. If you wanted "no" outcomes to appear first (for early rejection patterns), the diagram would still show "yes" first.

**Solution**:
1. Modified `flattenDecisionTree()` to iterate through answer keys in the order they appear in the YAML
2. Modified `drawTreeLines()` to use the actual `answer` value stored in each item instead of deriving it from position

```javascript
// In flattenDecisionTree():
const answerKeys = Object.keys(question.answers);
answerKeys.forEach(answerKey => {
    const answer = question.answers[answerKey];
    // Process in order, storing answer.value in the item
});

// In drawTreeLines():
answerLabel = item.answer || 'Yes'; // Use stored value, not position
```

**Benefit**: Authors can now control tree layout by ordering answers in the YAML:
- Put `no` first for early rejection patterns (negative outcomes appear left)
- Put `yes` first for positive-path-first patterns (positive outcomes appear left)

**Key Insight**: The parsed YAML becomes a JavaScript object, and JavaScript objects keep insertion order for non-integer string keys such as `yes` and `no` (since ES2015). The renderer now respects both that order and the actual answer values, making the layout fully author-controlled.
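
To make the ordering behavior concrete, here is a self-contained sketch of an order-preserving flatten. It is a simplified stand-in, not the real `flattenDecisionTree()`: the function name `flattenInYamlOrder`, the item shape, and the shortened two-question tree are assumptions for illustration, and it has no cycle protection.

```javascript
// Hypothetical, simplified flatten: visits answers in the order their keys appear.
function flattenInYamlOrder(tree, questionId = tree.rootQuestion, depth = 0, answerValue = null, items = []) {
  const question = tree.questions[questionId];
  items.push({ type: 'question', id: questionId, depth, answer: answerValue });
  for (const answerKey of Object.keys(question.answers)) {
    const answer = question.answers[answerKey];
    if (answer.outcome) {
      // Store the actual answer value so labels do not depend on position.
      items.push({ type: 'outcome', id: answer.outcome.id, depth: depth + 1, answer: answer.value });
    } else if (answer.nextQuestion) {
      flattenInYamlOrder(tree, answer.nextQuestion, depth + 1, answer.value, items);
    }
  }
  return items;
}

// Shortened tree for illustration: "no" is listed first (early rejection pattern).
const rejectionFirstTree = {
  rootQuestion: 'cacheTarget',
  questions: {
    cacheTarget: {
      answers: {
        no: { value: 'No', outcome: { id: 'noRedisCache' } },
        yes: { value: 'Yes', nextQuestion: 'singleSource' },
      },
    },
    singleSource: {
      answers: {
        no: { value: 'No', outcome: { id: 'multipleSourcesOrActiveActive' } },
        yes: { value: 'Yes', outcome: { id: 'goodFit' } },
      },
    },
  },
};

const flat = flattenInYamlOrder(rejectionFirstTree);
console.log(flat.map(item => `${item.answer ?? 'root'}:${item.id}`));
// [ 'root:cacheTarget', 'No:noRedisCache', 'Yes:singleSource',
//   'No:multipleSourcesOrActiveActive', 'Yes:goodFit' ]
```

One caveat: keys that look like non-negative integers (for example `1` or `2`) are enumerated first in ascending order regardless of insertion order. The `yes` and `no` keys are unaffected by this.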

## 13. Configurable Indent Width for Deeply Nested Trees

**Problem**: Deeply nested decision trees (with many levels of questions) can become too wide to fit on the page, requiring horizontal scrolling.

**Solution**: Added optional `indentWidth` parameter to the YAML root object that controls the horizontal spacing between parent and child nodes:

```yaml
id: when-to-use-rdi
scope: rdi
indentWidth: 25 # Reduce from default 40 to make tree narrower
rootQuestion: cacheTarget
questions:
    # ...
```

**Implementation**:
In `renderDecisionTree()`, the indent width is read from `treeData.indentWidth` with a sensible default:
```javascript
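const parsedIndent = parseInt(treeData.indentWidth, 10); // explicit radix
const indentWidth = Number.isFinite(parsedIndent) && parsedIndent > 0 ? parsedIndent : 40; // default on missing or invalid values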
```

**Design Rationale**: While `indentWidth` is a rendering preference, it's included in the YAML because:
1. Hugo's Goldmark attribute parsing doesn't reliably expose custom attributes from the code block info string to the render hook
2. Including it in the YAML keeps all tree configuration in one place
3. AI agents can still access the semantic metadata (id, scope, questions) separately from rendering preferences

**Benefit**: Authors can now control tree width by adjusting `indentWidth`:
- Default (40): Comfortable spacing for shallow trees
- Reduced (20-30): Compact layout for deeply nested trees
- The SVG width is calculated as: `leftMargin + (maxDepth + 1) * indentWidth + maxBoxWidth + 40`

**Recommendation**: For trees with 8+ levels of nesting, try `indentWidth: 25` or lower to keep the diagram readable without horizontal scrolling.
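
To see how much width a smaller indent saves, the formula above can be evaluated directly. The sketch below is illustrative only: `svgWidth` is a hypothetical helper, and the `leftMargin`, `maxBoxWidth`, and `maxDepth` values are assumed numbers, not values taken from the renderer.

```javascript
// Width formula from the notes: leftMargin + (maxDepth + 1) * indentWidth + maxBoxWidth + 40
function svgWidth({ leftMargin, maxDepth, indentWidth = 40, maxBoxWidth }) {
  return leftMargin + (maxDepth + 1) * indentWidth + maxBoxWidth + 40;
}

// Assumed layout values for a deeply nested tree (not taken from the real code).
const layout = { leftMargin: 20, maxDepth: 10, maxBoxWidth: 300 };

const defaultWidth = svgWidth(layout); // 20 + 11 * 40 + 300 + 40
const compactWidth = svgWidth({ ...layout, indentWidth: 25 }); // 20 + 11 * 25 + 300 + 40

console.log(defaultWidth); // 800
console.log(compactWidth); // 635
console.log(defaultWidth - compactWidth); // 165
```

The saving is `(maxDepth + 1) * (40 - indentWidth)`, so it grows linearly with depth, which is why reducing the indent matters most for deeply nested trees.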

## 14. Improved Label Visibility with Reduced Indent Width

**Problem**: When using reduced `indentWidth` values, the Yes/No labels on the connecting lines were being covered by the node boxes they referred to.

**Solution**:
1. Increased the vertical offset of labels from `y + 10` to `y + 16` pixels
2. Added a white background rectangle behind each label to ensure visibility even when overlapping with boxes

**Implementation**:
```javascript
const labelY = y + 16; // Increased offset

// Add white background rectangle behind label
const labelBg = document.createElementNS('http://www.w3.org/2000/svg', 'rect');
labelBg.setAttribute('x', labelX - 12);
labelBg.setAttribute('y', labelY - 9);
labelBg.setAttribute('width', '24');
labelBg.setAttribute('height', '12');
labelBg.setAttribute('fill', 'white');
svg.appendChild(labelBg);
```

**Benefit**: Labels remain readable regardless of indent width or tree density.

223 changes: 223 additions & 0 deletions content/embeds/rdi-when-to-use.md
@@ -0,0 +1,223 @@
### When to use RDI

RDI is a good fit when:

- You want to use Redis as the target database for caching data.
- You want to transfer data to Redis from a *single* source database.
- You must use a slow database as the system of record for the app.
- The app must always *write* its data to the slow database.
- Your app can tolerate *eventual* consistency of data in the Redis cache.
- You want a self-managed or AWS-based solution.
- The source data changes frequently in small increments.
- There are no more than 10K changes per second in the source database.
- The total data size is not larger than 100GB.
- RDI throughput during
[full sync]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}}) would not exceed 30K records per second and during
[CDC]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}})
would not exceed 10K records per second.
- You don’t need to perform join operations on the data from several tables
into a [nested Redis JSON object]({{< relref "/integrate/redis-data-integration/data-pipelines/data-denormalization#joining-one-to-many-relationships" >}}).
- RDI supports the [data transformations]({{< relref "/integrate/redis-data-integration/data-pipelines/transform-examples" >}}) you need for your app.
- Your data caching needs are too complex or demanding to implement and maintain yourself.
- Your database administrator has reviewed RDI's requirements for the source database and
confirmed that they are acceptable.

### When not to use RDI

RDI is not a good fit when:

- You are migrating an existing data set into Redis only once.
- Your app needs *immediate* cache consistency (or a hard limit on latency) rather
than *eventual* consistency.
- You need *transactional* consistency between the source and target databases.
- The data is ingested from two replicas of an Active-Active database at the same time.
- The app must *write* data to the Redis cache, which then updates the source database.
- Your data set will only ever be small.
- Your data is updated by a batch or ETL process with long and large transactions.
  RDI will fail to process these changes.
- You need complex stream processing of data (aggregations, sliding window processing, complex
custom logic).
- You need to write data to multiple targets from the same pipeline (Redis supports other
  ways to replicate data across Redis databases, such as `replicaOf` and Active-Active).
- Your database administrator has rejected RDI's requirements for the source database.

### Decision tree for using RDI

Use the decision tree below to determine whether RDI is a good fit for your architecture:

```decision-tree {id="when-to-use-rdi"}
id: when-to-use-rdi
scope: rdi
indentWidth: 25
rootQuestion: cacheTarget
questions:
    cacheTarget:
        text: |
            Do you want to use Redis as the target database for caching data?
        whyAsk: |
            RDI is specifically designed to keep Redis in sync with a primary database. If you don't need Redis as a cache, RDI is not the right tool.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI only works with Redis as the target database"
                    id: noRedisCache
                    sentiment: "negative"
            yes:
                value: "Yes"
                nextQuestion: singleSource
    singleSource:
        text: |
            Are you transferring data from a single source database?
        whyAsk: |
            RDI is designed to work with a single source database. Multiple sources or Active-Active replicas create conflicting change events.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI won't work with multiple source databases"
                    id: multipleSourcesOrActiveActive
                    sentiment: "negative"
            yes:
                value: "Yes"
                nextQuestion: systemOfRecord
    systemOfRecord:
        text: |
            Does your app always *write* to the source database and not to Redis?
        whyAsk: |
            RDI requires the source database to be the authoritative source of truth. If your app writes to Redis first, RDI won't work.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI doesn't support syncing data from Redis back to the source database"
                    id: notSystemOfRecord
                    sentiment: "negative"
            yes:
                value: "Yes"
                nextQuestion: consistency
    consistency:
        text: |
            Can your app tolerate eventual consistency in the Redis cache?
        whyAsk: |
            RDI provides eventual consistency, not immediate consistency. If your app needs real-time cache consistency or hard latency limits, RDI is not suitable.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI does not provide immediate cache consistency"
                    id: needsImmediate
                    sentiment: "negative"
            yes:
                value: "Yes"
                nextQuestion: deployment
    deployment:
        text: |
            Do you want a self-managed solution or an AWS-based solution?
        whyAsk: |
            RDI is available as a self-managed solution or as an AWS-based managed service. If you need a different deployment model, RDI may not be suitable.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI may not be suitable - check deployment options"
                    id: deploymentMismatch
                    sentiment: "negative"
            yes:
                value: "Yes"
                nextQuestion: dataChangePattern
    dataChangePattern:
        text: |
            Does your source data change frequently in small increments?
        whyAsk: |
            RDI captures changes from the database transaction log. Large batch transactions or ETL processes can cause RDI to fail.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI will fail with batch/ETL processes and large transactions"
                    id: batchProcessing
                    sentiment: "negative"
            yes:
                value: "Yes"
                nextQuestion: changeRate
    changeRate:
        text: |
            Are there fewer than 10K changes per second in the source database?
        whyAsk: |
            RDI has throughput limits. Exceeding these limits will cause processing failures and data loss.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI throughput limits will be exceeded"
                    id: exceedsChangeRate
                    sentiment: "negative"
            yes:
                value: "Yes"
                nextQuestion: dataSize
    dataSize:
        text: |
            Is your total data size smaller than 100GB?
        whyAsk: |
            RDI has practical limits on the total data size it can manage. Very large datasets may exceed these limits.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI may not be suitable - your data set is probably too large"
                    id: dataTooLarge
                    sentiment: "negative"
            yes:
                value: "Yes"
                nextQuestion: joins
    joins:
        text: |
            Do you need to perform join operations on data from several tables into a nested Redis JSON object?
        whyAsk: |
            RDI has limitations with complex join operations. If you need to combine data from multiple tables into nested structures, you may need custom transformations.
        answers:
            yes:
                value: "Yes"
                outcome:
                    label: "❌ RDI may not be suitable - complex joins are not well supported"
                    id: complexJoins
                    sentiment: "negative"
            no:
                value: "No"
                nextQuestion: transformations
    transformations:
        text: |
            Does RDI support the data transformations you need for your app?
        whyAsk: |
            RDI provides built-in transformations, but if you need custom logic beyond what RDI supports, you may need a different approach.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI may not be able to perform the required data transformations"
                    id: unsupportedTransformations
                    sentiment: "negative"
            yes:
                value: "Yes"
                nextQuestion: adminReview
    adminReview:
        text: |
            Has your database administrator reviewed RDI's requirements for the source database
            and confirmed they are acceptable?
        whyAsk: |
            RDI has specific requirements for the source database (binary logging, permissions, etc.). Your DBA must confirm these are acceptable before proceeding.
        answers:
            no:
                value: "No"
                outcome:
                    label: "❌ RDI requirements for the source database can't be met"
                    id: adminRejected
                    sentiment: "negative"
            yes:
                value: "Yes"
                outcome:
                    label: "✅ RDI is a good fit for your use case"
                    id: goodFit
                    sentiment: "positive"
```