Both text and chat completion requests accept a parameter called n that sets how many choices are returned in the response.
Once this parameter is added, please ensure that request_max_num_generation_tokens behaves as expected, and add appropriate tests to verify this.
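
A minimal sketch of what such a test could look like. It assumes that request_max_num_generation_tokens records, per request, the largest number of generated tokens across the n returned choices. That semantics is an assumption and should be checked against the actual implementation. `Choice` and `max_num_generation_tokens` are hypothetical stand-ins for illustration, not existing project code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Choice:
    """One of the n choices returned for a request (hypothetical stand-in)."""
    token_ids: List[int]


def max_num_generation_tokens(choices: List[Choice]) -> int:
    """Assumed semantics: largest generated-token count across a request's choices."""
    if not choices:
        raise ValueError("a request must return at least one choice")
    return max(len(choice.token_ids) for choice in choices)


def test_max_over_n_choices() -> None:
    # n = 3 with choices of different lengths: the longest one should be recorded.
    choices = [Choice([1, 2]), Choice([1, 2, 3, 4, 5]), Choice([1])]
    assert max_num_generation_tokens(choices) == 5


def test_single_choice() -> None:
    # n = 1 must keep behaving as before: the value is that choice's length.
    choices = [Choice([7, 8, 9])]
    assert max_num_generation_tokens(choices) == 3
```

A real test would replace the stand-in helper with a request sent with n > 1 and then read the recorded value back from the metric. The two cases above only pin down the expected arithmetic under the stated assumption.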