[GH-ISSUE #344] Why are there long underscores before the names in the output code? #67

Open
opened 2026-03-03 13:52:44 +03:00 by kerem · 4 comments

Originally created by @lissettecarlr on GitHub (Feb 25, 2025).
Original GitHub issue: https://github.com/jehna/humanify/issues/344

Input segment:

```
{feed:l,reset:o}
```

Output segment:

```
feed: ________________________________________________________________handleTextInput,
reset: initializeInputState,
```

![Image](https://github.com/user-attachments/assets/feae8b4b-9fc1-44a3-b7b4-f54a8eba95b8)

@0xdevalias commented on GitHub (Feb 26, 2025):

I suspect there were a lot of name clashes, and that this safe-rename code is what's doing it:

https://github.com/jehna/humanify/blob/7b85f9de6c61afed42f5eebb3a1fefc104af8f2c/src/plugins/local-llm-rename/visit-all-identifiers.ts#L42-L54

Some past discussion related to this feature:

  • https://github.com/jehna/humanify/issues/117#issuecomment-2381046017

Instead of just prefixing with `_` each time there is a clash, it might be better to generate some kind of suffix like `foo$1`, `foo$2`, etc., which has also been mentioned in the past:

  • https://github.com/jehna/humanify/issues/67#issuecomment-2422878669

And some other related issues/PRs:

  • https://github.com/jehna/humanify/pull/164

Also, off the top of my head: since we're tracking `renames` against the entire program rather than per scope, a similarly named variable anywhere in the program makes the name disallowed, so it gets an extra `_` prefix.

https://github.com/jehna/humanify/blob/7b85f9de6c61afed42f5eebb3a1fefc104af8f2c/src/plugins/local-llm-rename/visit-all-identifiers.ts#L20

https://github.com/jehna/humanify/blob/7b85f9de6c61afed42f5eebb3a1fefc104af8f2c/src/plugins/local-llm-rename/visit-all-identifiers.ts#L51

While this makes it easier to skim a file without having to think about scope rules or shadowing, in cases like this it clearly makes things worse. This issue is tangentially related to that area of the code:

  • https://github.com/jehna/humanify/issues/330
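To make the difference between the two strategies concrete, here is a minimal sketch. It is not humanify's actual code: `prefixUntilFree` and `suffixUntilFree` are hypothetical helpers, and the only assumption carried over from the discussion above is that a clash is resolved by changing the name until it is no longer in a program-wide set of taken names.

```typescript
// Hypothetical helpers for illustration; neither is humanify's real implementation.

// Prefix strategy: prepend "_" until the name is free.
// The name grows by one character for every earlier clash.
function prefixUntilFree(name: string, taken: Set<string>): string {
  let candidate = name;
  while (taken.has(candidate)) {
    candidate = `_${candidate}`;
  }
  taken.add(candidate);
  return candidate;
}

// Suffix strategy: append $1, $2, ... until the name is free.
// The name grows only with the number of digits in the counter.
function suffixUntilFree(name: string, taken: Set<string>): string {
  let candidate = name;
  for (let n = 1; taken.has(candidate); n++) {
    candidate = `${name}$${n}`;
  }
  taken.add(candidate);
  return candidate;
}

// Ask for the same name four times with each strategy.
const prefixTaken = new Set<string>();
const suffixTaken = new Set<string>();
const prefixed: string[] = [];
const suffixed: string[] = [];
for (let i = 0; i < 4; i++) {
  prefixed.push(prefixUntilFree("foo", prefixTaken));
  suffixed.push(suffixUntilFree("foo", suffixTaken));
}

console.log(prefixed); // ["foo", "_foo", "__foo", "___foo"]
console.log(suffixed); // ["foo", "foo$1", "foo$2", "foo$3"]
```

With 64 earlier clashes the prefix strategy would yield a name with 64 leading underscores, like the one reported in this issue, while the suffix strategy would yield something like `foo$64`.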

@lissettecarlr commented on GitHub (Feb 26, 2025):

Thank you for your reply.
I checked the source code and did find many variables with the same names. The main issue is that the single-page code is so long that the underscores keep stacking up.
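As a rough illustration of why a long file with many same-named variables stacks up prefixes, the sketch below contrasts a program-wide set of taken names with a per-scope check. Everything here is hypothetical: the `Scope` type is a toy model, not Babel's scope API, and it deliberately ignores harder cases such as an outer rename capturing a reference from an inner scope.

```typescript
// Toy scope model (hypothetical; real tooling tracks bindings and references).
interface Scope {
  names: Set<string>;
  parent?: Scope;
}

// True when `name` is already bound in `scope` or any enclosing scope.
function isVisible(name: string, scope: Scope | undefined): boolean {
  for (let s = scope; s !== undefined; s = s.parent) {
    if (s.names.has(name)) return true;
  }
  return false;
}

// Program-wide tracking: any earlier use of the name anywhere counts as a clash.
function renameProgramWide(name: string, taken: Set<string>): string {
  let candidate = name;
  while (taken.has(candidate)) candidate = `_${candidate}`;
  taken.add(candidate);
  return candidate;
}

// Per-scope tracking: only names visible from this scope count as a clash.
function renamePerScope(name: string, scope: Scope): string {
  let candidate = name;
  while (isVisible(candidate, scope)) candidate = `_${candidate}`;
  scope.names.add(candidate);
  return candidate;
}

// Two sibling function scopes that both want the same name.
const root: Scope = { names: new Set<string>() };
const fnA: Scope = { names: new Set<string>(), parent: root };
const fnB: Scope = { names: new Set<string>(), parent: root };

const programTaken = new Set<string>();
const programWide = [
  renameProgramWide("handleTextInput", programTaken),
  renameProgramWide("handleTextInput", programTaken),
];
const perScope = [
  renamePerScope("handleTextInput", fnA),
  renamePerScope("handleTextInput", fnB),
];

console.log(programWide); // ["handleTextInput", "_handleTextInput"]
console.log(perScope); // ["handleTextInput", "handleTextInput"]
```

In this toy model the sibling scopes never see each other's names, so no prefix is needed. The trade-off is that the same name can then appear several times in one file.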

@0xdevalias commented on GitHub (Feb 26, 2025):

It might be worth leaving this issue open as a reminder that we could improve this based on some of the notes above.

@0xdevalias commented on GitHub (Mar 12, 2025):

Related older issue:

  • https://github.com/jehna/humanify/issues/181
    • Repeated variable names
      In large code bases with many variables, humanify frequently assigns the same name to different variables, leading to confusion and reduced readability - especially when multiple variables are renamed in generic terms such as `_______variable`.

      We should introduce a check to prevent duplication of variable names in the same scope. If a name has already been used, the system could ask for confirmation, ensuring that names remain unique and clear.

Which is then also tangentially related to this one:

  • https://github.com/jehna/humanify/issues/147
    • instead of just forcing an invalid suggestion to be valid (with `toIdentifier`), we could detect that it's invalid (with `isValidIdentifier`) and then provide that feedback to the LLM, asking it to give a new suggestion; probably with some max retry limit; after which we could fall back to using the invalid suggestion run through `toIdentifier`, or log a warning and leave it un-renamed or similar.

Also, this upstream `webcrack` issue/PR may improve this situation somewhat too:

  • https://github.com/j4k0xb/webcrack/issues/154
  • https://github.com/j4k0xb/webcrack/pull/155
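The retry idea quoted from the two issues above could look roughly like the sketch below. It is only an illustration under stated assumptions: `looksLikeIdentifier` and `forceIdentifier` are simplified stand-ins for Babel's `isValidIdentifier` and `toIdentifier` (the real helpers also handle reserved words and Unicode), and `Suggest` is a hypothetical synchronous callback standing in for the LLM call, which in practice would be asynchronous. Treating an already-used name as a reason to retry is borrowed from the #181 suggestion.

```typescript
// Simplified stand-ins for Babel's isValidIdentifier / toIdentifier (assumption).
const looksLikeIdentifier = (s: string): boolean => /^[A-Za-z_$][\w$]*$/.test(s);
const forceIdentifier = (s: string): string => {
  const cleaned = s.replace(/[^\w$]+/g, "_");
  return /^[A-Za-z_$]/.test(cleaned) ? cleaned : `_${cleaned}`;
};

// Hypothetical stand-in for the LLM: receives feedback about the previous
// rejected suggestion (if any) and returns a new one.
type Suggest = (feedback?: string) => string;

function suggestWithRetry(
  suggest: Suggest,
  taken: Set<string>,
  maxRetries = 3,
): string {
  let feedback: string | undefined;
  let last = "";
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    last = suggest(feedback);
    if (!looksLikeIdentifier(last)) {
      feedback = `"${last}" is not a valid identifier; suggest another name.`;
    } else if (taken.has(last)) {
      feedback = `"${last}" is already in use; suggest another name.`;
    } else {
      taken.add(last);
      return last;
    }
  }
  // Retries exhausted: coerce the last suggestion and disambiguate with a suffix.
  const base = forceIdentifier(last);
  let fallback = base;
  for (let n = 1; taken.has(fallback); n++) {
    fallback = `${base}$${n}`;
  }
  taken.add(fallback);
  return fallback;
}

// Scripted "LLM" that improves its answer after each round of feedback.
const answers = ["handle input", "reset", "handleInput"];
const scripted: Suggest = () => answers.shift() ?? "fallbackName";

console.log(suggestWithRetry(scripted, new Set(["reset"]))); // "handleInput"
```

The retry limit bounds the extra LLM calls per identifier. Logging a warning and leaving the identifier un-renamed, as the quoted comment also mentions, would be an alternative to the fallback branch.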
Reference
starred/humanify#67