Prevent separator characters from corrupting long-term memory tag values #214

@tylerhutcherson

Problem

AMS stores long-term memory topics, entities, and extracted_from as comma-delimited TAG strings in
Redis.

That means if a single tag value itself contains a comma, it can be split incorrectly when read back.

Example:

["cooking, italian style", "recipes"]

can be decoded as:

["cooking", "italian style", "recipes"]

instead of the original 2 values.
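The round trip can be reproduced with a minimal sketch. The helper names `encode_tags` and `decode_tags` are hypothetical, not the actual AMS functions, and the sketch assumes decoding splits on the comma and strips surrounding whitespace:

```python
# Hypothetical stand-ins for the AMS encode/decode path, for illustration only.
SEPARATOR = ","  # assumed to match the TAG separator in the Redis index


def encode_tags(values: list[str]) -> str:
    """Join tag values into one delimited string, as stored in a TAG field."""
    return SEPARATOR.join(values)


def decode_tags(raw: str) -> list[str]:
    """Split a stored TAG string back into values (assumes whitespace is stripped)."""
    return [part.strip() for part in raw.split(SEPARATOR) if part.strip()]


original = ["cooking, italian style", "recipes"]
stored = encode_tags(original)   # "cooking, italian style,recipes"
decoded = decode_tags(stored)    # the embedded comma is treated as a separator

print(decoded)
```

The stored string carries no information about which comma was part of a value and which was a separator, so the original list cannot be recovered on read.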

Decision

We should reject commas in submitted tag values rather than add custom escaping/encoding logic.

Why this is the right choice

  • simple to implement
  • easy to explain
  • avoids silent corruption
  • matches the TAG separator defined in the Redis index
  • avoids introducing a custom encoding format into AMS

What we should do

  • validate topics, entities, and extracted_from on write/update paths
  • reject values containing commas
  • return a clear validation error
  • document the restriction
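The validation described above might look roughly like the following sketch. `TagValidationError` and `validate_tag_fields` are hypothetical names, and the plain-dict payload is an assumption; the real write/update paths may use different models and error types:

```python
# A sketch of write-path validation; names and payload shape are assumptions.
TAG_SEPARATOR = ","
TAG_FIELDS = ("topics", "entities", "extracted_from")


class TagValidationError(ValueError):
    """Raised when a submitted tag value contains the TAG separator."""


def validate_tag_fields(payload: dict) -> None:
    """Reject any tag value that contains the separator character."""
    for field in TAG_FIELDS:
        # A missing or None field is treated as having no tag values.
        for value in payload.get(field) or []:
            if TAG_SEPARATOR in value:
                raise TagValidationError(
                    f"{field} value {value!r} contains a comma, "
                    "which is reserved as the tag separator"
                )


# Accepted: no value contains a comma.
validate_tag_fields({"topics": ["cooking", "recipes"], "entities": None})

# Rejected: the first topic would be split on read.
try:
    validate_tag_fields({"topics": ["cooking, italian style", "recipes"]})
except TagValidationError as exc:
    error_message = str(exc)
```

Naming the field and the offending value in the error gives the caller enough to fix the request without guessing which tag was rejected.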

What we should not do right now

We should not add a custom escaping or sentinel encoding scheme unless we decide we truly need arbitrary tag
strings.

That would require coordinated changes across:

  • storage encoding
  • decoding
  • filter/query generation
  • migrations/backfills
  • tests

Risk if we do nothing

Comma-containing tag values can be silently corrupted and later read back incorrectly.
