[Tracking Issue] Translation #5

Open
3 tasks
rylev opened this issue Jul 30, 2021 · 6 comments

Comments

@rylev
Member

rylev commented Jul 30, 2021

In an ideal world, the survey would be available in all languages. Unfortunately this is not practical in reality.

For the 2021 survey we need to decide which languages we will translate the survey (authored in English) into. Translations can be essential for ensuring that we're capturing the feedback from community members around the world and not just those who happen to be comfortable with taking surveys in English.

The Cost

Translations do represent an additional cost, so while in an ideal world we would have the survey in any language the user desires, we need to limit the selection to the languages with the largest share of users who are either unwilling or unable to take the survey in English, or less comfortable taking the survey in English than they would be in another language (and thus less likely to share relevant information).

The costs of translation include:

  • Translating the questions
  • Translating free-form answers
  • Lining up and matching answers in one language with answers from another
  • Managing the translation process (which grows in complexity fairly linearly with each additional language)
  • Creating the different surveys in the survey platform

Which languages?

The 2020 survey was administered in many languages:

  • English: 75.0%
  • Simplified Chinese: 5.4%
  • Russian: 5.3%
  • German: 4.0%
  • French: 2.7%
  • Japanese: 2.2%
  • Korean: 1.2%
  • Traditional Chinese: 1.1%
  • Spanish: 1.0%
  • Portuguese: 0.7%
  • Italian: 0.6%
  • Swedish: 0.5%
  • Vietnamese: 0.1%
  • Polish: 0.1%

These languages were simply the languages for which there were volunteers who were willing to do the translation. While this is one way of deciding which languages to choose, it does ignore the other costs of adding another language besides that of the actual translation.

The answer to the question of which languages are worth the administrative overhead of translation is not straightforward. For example, if 5% of responses were in language A and 1% of responses were in language B, what percentage of those respondents would be uncomfortable or unwilling to take the survey if they could only do so in English? If we could only translate one, would it make more sense to translate language B if we could determine that more of those respondents were uncomfortable in English than those who answered in language A?
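One rough way to frame the language A versus language B comparison is to multiply each language's share of responses by the fraction of those respondents who would not take an English-only survey. The sketch below does that arithmetic. The total response count and both fractions are made-up placeholders, not survey data, and `respondents_at_risk` is a hypothetical helper name.

```python
# Hypothetical illustration: every number here is a placeholder, not survey data.

def respondents_at_risk(total_responses: int, share: float, unwilling_fraction: float) -> float:
    """Estimate how many respondents would be lost without a translation.

    share: fraction of all responses given in this language.
    unwilling_fraction: assumed fraction of those respondents who would
    skip the survey if it were only available in English.
    """
    return total_responses * share * unwilling_fraction


TOTAL = 10_000  # assumed total number of responses

# Language A: 5% of responses, assume 10% would skip an English-only survey.
lang_a = respondents_at_risk(TOTAL, 0.05, 0.10)

# Language B: 1% of responses, assume 80% would skip an English-only survey.
lang_b = respondents_at_risk(TOTAL, 0.01, 0.80)

print(f"Language A: about {lang_a:.0f} respondents at risk")
print(f"Language B: about {lang_b:.0f} respondents at risk")
```

Under these assumed fractions, the smaller language would lose more respondents, which is why the raw response share alone may not be the right criterion.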

Currently, the only other large-scale translation effort we have in Rust is rust-lang.org. This differs from the survey in that the survey is a one-time artifact. Translating the website into a new language continues to pay dividends, while translating the survey into a new language only pays off once.

What is the cutoff for how many respondents will choose a particular language for that language to be worth the overhead? It seems obvious that accommodating 1 additional person at the fairly large cost of translating the survey into a new language is unlikely to be worth the overhead, but is it worth it only starting at 10 people, 100, 500?

Gathering more info

These questions would be much easier to answer if we knew the answer to the following question: of all people who are otherwise willing to take the survey, how many feel uncomfortable with the survey, and how many are completely unlikely to take it, if it is not translated into a given language?

Is there some way to find this out?

TODOs

  • Decide on likely list of must have translations
  • Decide on framework for deciding if the administrative overhead of translation is worth it for a particular language
  • Decide how we can improve this process for next year so that we have a better idea of which languages to translate.
@poliorcetics

poliorcetics commented Jul 30, 2021

I don't have any answers for the questions asked here, but I want to make a remark: if we decide to translate the survey into language B, please inform possible translators quickly. Last year we missed errors in the French one (and maybe others), which may have affected results.

@qrnch-jan

Regarding the Swedish survey: Several of the answers were written in English, so if the goal is to find out how many people use translations because it's the only way they can participate then that 0.5% is an over-estimate.

@llogiq

llogiq commented Oct 28, 2021

I have talked to a number of German Rustaceans, and though most if not all of them speak English, all appreciated the effort, and some would have found it harder to answer the survey in English.

@dynaxis

dynaxis commented Oct 31, 2021

I'm running a poll on FB for 5 days to ask if Korean Rustaceans would like to have a Korean translation this year. 31 people answered so far:

  1. I can answer the English version of the survey, but it would be good if a Korean translation is available: 28 votes
  2. No Korean translation is necessary. I will answer the English version of the survey: 5 votes
  3. Without a Korean translation, I'm unlikely to answer the survey: 3 votes

So overall, they are rather positive on having a Korean translation.

<Updated the voting result on Nov. 10, 2021>
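As a quick tally of the updated vote counts above, the following sketch computes each option's share of the votes. The option labels are shortened paraphrases of the poll choices. The updated counts sum to 36, which is more than the "31 people answered so far" figure from before the update.

```python
# Vote counts copied from the poll results above; labels are shortened paraphrases.
votes = {
    "can answer in English, but a Korean translation would be good": 28,
    "no translation necessary": 5,
    "unlikely to answer without a translation": 3,
}

total = sum(votes.values())

# Share of all votes for each option, as a fraction between 0 and 1.
shares = {option: count / total for option, count in votes.items()}

for option, share in shares.items():
    print(f"{option}: {votes[option]} votes ({share:.1%})")
```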

@apiraino
Contributor

After reviewing this issue in 2023, I decided to leave it open because the fundamental question about the translation workflow is still unresolved: the 3 TODOs in the first comment do not yet have a satisfactory response. I'll leave some thoughts here:

Decide on likely list of must have translations

We can relate "must have translations" to "how many people we can find to do them". It's a community effort, so I'd love the community to be more involved in the process. Our job is to coordinate this effort.

Decide on framework for deciding if the administrative overhead of translation is worth it for a particular language

The workflow we currently use is a bit kludgy. We can improve on that, but we need more time to find and experiment with alternatives. Until then, it will do.

Decide how we can improve this process for next year so that we have a better idea of which languages to translate.

I'd personally love to have a more text-based workflow. I think the main crux at this point is how to import/export the dataset of questions (in all the languages) into the survey.

@Kobzol
Contributor

Kobzol commented Aug 24, 2024

Some data on last year's usage of individual languages (total response count was 11950):

  • English: 9492 (79.5%)
  • Simplified Chinese: 734 (6%)
  • German: 446 (3.7%)
  • French: 437 (3.6%)
  • Russian: 392 (3.2%)
  • Japanese: 274 (2.2%)
  • Spanish: 163 (1.3%)

Seems quite similar to the numbers from 2020.
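To make the comparison with 2020 concrete, here is a small sketch that recomputes each language's share from the counts above and shows the change in percentage points against the 2020 figures from the first comment. The recomputed shares may differ slightly from the rounded percentages quoted above, and `percent` is just an illustrative helper.

```python
TOTAL_RESPONSES = 11950  # total response count quoted above

# Response counts per language, as listed above.
counts = {
    "English": 9492,
    "Simplified Chinese": 734,
    "German": 446,
    "French": 437,
    "Russian": 392,
    "Japanese": 274,
    "Spanish": 163,
}

# 2020 shares in percent, from the first comment in this issue.
shares_2020 = {
    "English": 75.0,
    "Simplified Chinese": 5.4,
    "German": 4.0,
    "French": 2.7,
    "Russian": 5.3,
    "Japanese": 2.2,
    "Spanish": 1.0,
}


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100 * count / total


# Change in percentage points between 2020 and the counts above.
diffs = {
    lang: percent(count, TOTAL_RESPONSES) - shares_2020[lang]
    for lang, count in counts.items()
}

for lang, diff in diffs.items():
    share = percent(counts[lang], TOTAL_RESPONSES)
    print(f"{lang}: {share:.1f}% ({diff:+.1f} points vs 2020)")
```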
