object comments may be prohibited in favour of LLM-generated docs #3993
Comments
@maxonfjvipon @volodya-lombrozo thoughts?
@yegor256 so the source code will not contain any documentation at all? Where should I go then if I want to know what a specific object does?
@yegor256 @maxonfjvipon how about we allow comments (EODocs) at the syntax level, but make lints warn about them? If a programmer still wants to write a comment, he can.
@maxonfjvipon we can automatically generate comments inside objectionary/home. Also, we can make our IDE plugin generate them on the fly.
@yegor256 I strongly disagree. In most cases, objects should be self-explanatory; this is true. However, quite often we need to write not about an object's functionality but to answer the question of why an object was implemented this way or why it even exists. None of the proposed solutions will answer this question or generate appropriate documentation. Moreover, some comments might clarify the implementation, for example a human-readable comment attached to code like the following:
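(The original snippet is not preserved here; the Rust sketch below is a hypothetical illustration of the kind of code and comment meant, based on the "Euclidean distance" example referred to later in the thread.)

```rust
/// Returns the Euclidean distance between two points on a plane:
/// sqrt((x1 - x2)^2 + (y1 - y2)^2).
fn distance(x1: f64, y1: f64, x2: f64, y2: f64) -> f64 {
    // Without the doc comment above, the arithmetic only says *how*,
    // not *what* it computes or *why* this formula was chosen.
    ((x1 - x2).powi(2) + (y1 - y2).powi(2)).sqrt()
}
```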
Btw, what about PDD puzzles?
@yegor256 Can we close this one?
@volodya-lombrozo the future is coming: LLMs will be as smart as (or smarter than) people. They will be able to understand the code and write even better comments than the one in your example (with the "Euclidean distance"). We can ask programmers to write code that is clean enough for an LLM to understand it and write proper text for it. In other words, we'll restrict programmers for the sake of higher readability. PDD puzzles we can move to metas.
WDYT?
@yegor256 I'm afraid this future is still far from reality and might never happen, actually. Comments aren't only about the inability to express a developer's thoughts through code; sometimes, they explain why this code even exists.
Code Tells You How, Comments Tell You Why
I suggest waiting until AI is able to generate this comment, for example. If it can do that, then this feature will be totally reasonable. What do you think?
@volodya-lombrozo how about we invite others to this discussion: https://t.me/eolang_org/1549
@yegor256 I personally don't believe LLMs are able to write good comments in ALL situations, so I agree with @volodya-lombrozo about letting users write comments but warning them that the language doesn't like human-written comments. However, I think you can prove that we are wrong! You can create a repo with some really hard-to-understand code (for example, complicated mathematical formulas, optimizations targeting some weird hardware, etc.) and let LLMs write comments for it. Once you put in complex enough code and LLMs generate good enough comments for it, that will be a nice proof that LLMs are smart enough to ban humans from writing comments. For code snippets to test LLM commenting skills, I have this in mind:
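(The original snippet is not preserved; the Rust sketch below is a hypothetical stand-in for the kind of code meant: a function that deliberately defeats compiler optimizations, with nothing in the code itself saying why. The function name and constants are made up.)

```rust
use std::hint::black_box;

// Deliberately prevent inlining and constant folding; the code alone gives
// no hint that the target zk-VM needs the unoptimized instruction sequence.
#[inline(never)]
pub fn mix_state(state: [u64; 3]) -> [u64; 3] {
    let a = black_box(state[0]).wrapping_mul(0x9E37_79B9_7F4A_7C15);
    let b = black_box(state[1]) ^ a.rotate_left(23);
    let c = black_box(state[2]).wrapping_add(b);
    [a, b, c]
}
```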
I think you can put some general design docs near the file with this code snippet (but remember, the zk-VM we are targeting is so brand new that an LLM has no idea what it is). I highly doubt an LLM will write a good comment explaining why we want to disable compiler optimizations in this function. Also, I believe there will be engineers who need to write such a function and disable optimizations only in it. P.S. I think some day your idea will be possible: a next-level LLM that scans all Slack messages, all issue and PR discussions, and all Zoom meetings will probably be able to write a good comment explaining why we want to disable compiler optimizations for this function, but LLMs definitely can't do it today. So if you disable comments now, I think you will make EO a bad choice for anything that has complicated and not-ideal parts. P.P.S. However, banning non-LLM-written comments can be a good marketing move for EO, since LLMs are a popular topic today, so... it can attract some "vibe-coders" to EO =)
One more example of a comment that is hard for an LLM to generate:
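(The original snippet is not preserved; the Rust sketch below is a hypothetical stand-in, and the bytes are placeholders, not a real key.)

```rust
// A compressed public key hard-coded as raw bytes; nothing in the code
// explains where the constant comes from or why it is safe to use.
const NUMS: [u8; 33] = [0x02; 33]; // placeholder bytes, not a real key
```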
I think it will be quite hard for an LLM to explain this hard-coded Bitcoin public key. You can argue that once I write it not as a hard-coded value but properly, like this:
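(Again a hypothetical Rust sketch, assuming the sha2 crate: the key's x-coordinate is derived from a fixed, publicly known string, the usual "nothing up my sleeve" construction.)

```rust
use sha2::{Digest, Sha256};

/// Derives the x-coordinate of the NUMS point from a fixed public string,
/// so anyone can verify that no hidden private key was used to pick it.
fn nums_x_coordinate() -> [u8; 32] {
    let digest = Sha256::digest(b"nothing up my sleeve");
    digest.as_slice().try_into().expect("SHA-256 output is 32 bytes")
}
```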
Then an LLM can add a comment making the code easy to understand. But imagine this is a library and you don't want to add sha256 to your dependency list. Here, too, I think current-state LLMs will not be able to generate a comment based on just the NUMS variable name (btw, NUMS is a real and quite popular concept in Bitcoin; you can try LLMs on it).
I believe comments play a crucial role in communicating the current knowledge to future readers. Arguably, the programmer knows better than a current LLM what information is worth putting into comments. It'd be nice if comments of any length and format were allowed. Moreover, you may want to implement a feature that is currently missing in the VSCode extension for Haskell: footnotes and their preview on hover. Example:
@MCJOHN974 this is what Claude.ai Sonnet 3.7 gave me:
Wouldn't this comment push the programmer to make the function code better? My point is that people usually write imperfect code and then make it "a bit better" via comments. If we prohibit comments, we force people to write perfect code. They will have to write it in a way that an LLM perfectly understands their intent. Until then, they will have to improve the code, again and again.
this feels completely redundant, and potentially even harmful |
@yegor256 Glad to see the invitation for discussion from the EO channel, thanks. Points so far from the opposing opinions, as I see them:
And the last one is the question: should a programming language by itself restrict such behaviour of the developer UX? We can say that the language provides instruments, but the programmer and the project decide which ones to choose, e.g. whether to allow comments or not. If the language itself restricts such small things, it makes them big and strongly influential, so the audience would shrink down to a very narrow group of people worldwide who love EOLang and its positions.
Yeah, I see that I forgot to change the function name to something more sensible. However, I think this comment from Claude actually proves my point: an LLM for sure CAN see and highlight in comments that we are disabling compiler optimizations here. But engineers can probably also see that from the code; what they expect from comments is the answer to why we are disabling these optimizations. An LLM can answer that question too, but to generate such an answer it needs a bigger scope: it should follow all the Slack discussions, all the Zoom calls, etc. I don't know of any tool that can do that. And even when such a tool is invented, it will probably be behind a paywall, so it would be polite to wait until it becomes a standard developer tool, like git or GitHub, before forcing EO users to use it.
If you allow everything, you will probably reinvent C++, but we already have one. I think it is OK when a language forces you to avoid some bad practices; I just think there are a lot of cases where an LLM simply can't generate a good comment because it doesn't know everything it should.
@yegor256 A few more questions about AI-generated comments. In such a scenario we probably want to make these comments stable, in the sense that a comment generated locally on my machine and one generated in CI will be the same. Is that possible with current LLMs? What if, for example, I have open-source code but a closed Telegram/Slack discussion about the project? Then I have two options: don't give the AI access to such chats (and then comments generated by external contributors and by internal project maintainers will differ), or don't allow the AI to see anything an external contributor can't see. If you allow an AI-generated comment to differ depending on who generated it, then it will be quite hard to check in CI that a comment was not hand-written. And even without external contributors: I can have local branches I haven't pushed to GitHub yet, or code I haven't even added to git; all of this changes the state of my local LLM, and a difference between my local LLM's state and the CI one can lead to different comments being generated.
What is good about EO here is that it is a brand new language, and it doesn't have huge codebases with bad code quality that still need to be maintained because rewriting them from scratch is too expensive. Thus, while EO is still under development, it is possible to force all EO projects to start with certain quality standards and to force people to maintain those standards.
That's true regarding the "allow everything" part, but what I considered is not that; it's the question of what should lie on the shoulders of developers (the final users) and what on the tool itself. For example, in Python there is the ability (and a proposal in the PEPs, if I'm not mistaken) to write doc-comments on the first line after a function definition, and even without that the code may still be self-explanatory in many projects. I see this complete prohibition of comments working only in such a manner:
And here, at point 3, we already ensure, through a well-known process of strong code review, that doc-comments contain exactly what we need and have been agreed upon, so there would be no need for developers to write comments before the first code review.
Because if an LLM doesn't know any of your Zoom/Slack discussions, there are a lot of cases where it can't write a good comment. Imagine you developed a brand new algorithm that didn't exist when the LLM was trained. Your code is a lot of arithmetic operations that look weird at first glance. How is an LLM supposed to write a comment about what all these operations do and why they lead to the result you are aiming for? How can an LLM explain the tradeoffs, and the decisions you made about them after looking at Grafana and discussing them in a Zoom meeting? If there are already some human-written comments explaining it, then an LLM can do it, but AFAIU @yegor256 wants to ban humans from writing comments completely.
Tbh, I didn't understand this. But my point was that if CI wants to check whether a comment was LLM-generated, it needs to know the LLM's state. If we go back to the previous question, we see that the LLM's state should contain some information about Slack messages and Zoom calls. And there really are a ton of projects where Zoom and Slack discussions are private but the repo is public. And yes, you can run this CI check privately, but it will be a pain for external contributors.
And this is a good question =) Maybe that can work, and it is possible to create a pipeline where a human writes code and pushes it, and then some LLM generates comments and pushes them to the same branch. But... it sounds like "you cannot write 100% ready code locally", which to me sounds really weird; I don't have constructive arguments against it atm, so maybe yes, it can work this way.
This pipeline sounds like something that can work, and like something that beats my arguments. However, I still believe that with the current state of LLMs there will be cases where an LLM fails to write a comment and this "review of LLM-written comments" never leads to a merge. Also, it is sometimes hard to review code without comments; you can argue that this is only because the code is bad, but I would not agree. At the very least, I would like to see things like PDD comments and todos during review.
How about we prohibit comments entirely, except for atoms? Objects must be self-explanatory. Later, we can generate documentation for them with the help of LLMs, either live (in the IDE) or statically (via the `eoc docs` CLI command). Atoms, on the other hand, may have comments, because they don't have code sources.