From 5d5c85546932481fc513d4a0ee97d40c620cc1a3 Mon Sep 17 00:00:00 2001
From: Eliah Kagan
Date: Mon, 24 Jul 2023 01:37:21 -0400
Subject: [PATCH] Add Tiktokenizer link in "How to count tokens"

This adds a link to the Tiktokenizer webapp as another tool, in addition
to the OpenAI Tokenizer.
---
 examples/How_to_count_tokens_with_tiktoken.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/How_to_count_tokens_with_tiktoken.ipynb b/examples/How_to_count_tokens_with_tiktoken.ipynb
index ecf3c55aad..9a221acd90 100644
--- a/examples/How_to_count_tokens_with_tiktoken.ipynb
+++ b/examples/How_to_count_tokens_with_tiktoken.ipynb
@@ -52,7 +52,7 @@
     "\n",
     "## How strings are typically tokenized\n",
     "\n",
-    "In English, tokens commonly range in length from one character to one word (e.g., `\"t\"` or `\" great\"`), though in some languages tokens can be shorter than one character or longer than one word. Spaces are usually grouped with the starts of words (e.g., `\" is\"` instead of `\"is \"` or `\" \"`+`\"is\"`). You can quickly check how a string is tokenized at the [OpenAI Tokenizer](https://beta.openai.com/tokenizer)."
+    "In English, tokens commonly range in length from one character to one word (e.g., `\"t\"` or `\" great\"`), though in some languages tokens can be shorter than one character or longer than one word. Spaces are usually grouped with the starts of words (e.g., `\" is\"` instead of `\"is \"` or `\" \"`+`\"is\"`). You can quickly check how a string is tokenized at the [OpenAI Tokenizer](https://beta.openai.com/tokenizer), or the third-party [Tiktokenizer](https://tiktokenizer.vercel.app/) webapp."
    ]
   },
   {
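
The behavior described in the changed cell, spaces grouping with the starts
of words, can also be checked programmatically with tiktoken rather than
either web tool. A minimal sketch, assuming tiktoken is installed
(pip install tiktoken) and using its cl100k_base encoding:

    # Inspect how a string is tokenized, as an offline alternative to the
    # OpenAI Tokenizer or Tiktokenizer web tools linked in the notebook.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    token_ids = enc.encode("tiktoken is great!")
    print(token_ids)  # [83, 1609, 5963, 374, 2294, 0]

    # Decoding each token id individually shows spaces attaching to word
    # starts: b't', b'ik', b'token', b' is', b' great', b'!'
    print([enc.decode_single_token_bytes(t) for t in token_ids])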