[GH-ISSUE #190] Improper quoting of wget useragent flag #131

Closed
opened 2026-03-01 14:40:52 +03:00 by kerem · 2 comments
Owner

Originally created by @n0ncetonic on GitHub (Mar 22, 2019).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/190

Describe the bug

The generated wget command appears to have a slight typo, specifically in how the --user-agent flag is quoted.

Steps to reproduce

I observed the following when a page returned a 404 response (not a fault of ArchiveBox; the page was legitimately not found). Any method of running ArchiveBox that displays the generated commands would show this as well.

Screenshots or log output

The wget command outputs the following as part of its run flags.

"--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36".

Is this done intentionally, or is the intended output --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"?

Software versions

  • OS: macOS 10.14
  • ArchiveBox version: 0075daa
  • Python version: Python 3.7.2
  • Chrome version: N/A
kerem closed this issue 2026-03-01 14:40:52 +03:00
Author
Owner

@n0ncetonic commented on GitHub (Mar 22, 2019):

The same quoting appears in the Chrome flags as well:

/Applications/Chromium.app/Contents/MacOS/Chromium --headless --disable-web-security --ignore-certificate-errors "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36" --window-size=1440,900 --timeout=40000 "--user-data-dir=/Users/user/Library/Application Support/Chromium" --dump-dom https://url.com

Author
Owner

@pirate commented on GitHub (Mar 22, 2019):

It's intentional and harmless. That's only how it's printed for the user to copy, not actually how it's run.

def log_archive_method_finished(result):
    """quote the argument with whitespace in a command so the user can 
       copy-paste the outputted string directly to run the cmd
    """
    ...

    # Prettify CMD string and make it safe to copy-paste by quoting arguments
    quoted_cmd = ' '.join(
        '"{}"'.format(arg) if ' ' in arg else arg
        for arg in result['cmd']
    )
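
As a self-contained illustration of the display-only quoting above, here is a minimal sketch. The function name `quote_cmd_for_display` and the shortened sample `cmd` list are hypothetical and not taken from ArchiveBox; only the join expression mirrors the snippet. It also checks, using the standard library's `shlex.split` (POSIX-style parsing), that the two spellings from the original question parse to the same argument list.

```python
import shlex


def quote_cmd_for_display(cmd):
    """Join an argv list into a printable string, wrapping any argument
    that contains a space in double quotes (for display only)."""
    return ' '.join(
        '"{}"'.format(arg) if ' ' in arg else arg
        for arg in cmd
    )


# Hypothetical argv list, shortened for illustration
cmd = [
    'wget',
    '--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    'https://url.com',
]

display = quote_cmd_for_display(cmd)
print(display)

# Quoting the whole argument or only the value yields the same argv
# once a POSIX shell-style parser has split the line.
whole_arg_quoted = shlex.split(display)
value_only_quoted = shlex.split(
    'wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64)" https://url.com'
)
print(whole_arg_quoted == value_only_quoted == cmd)
```

Note that this simple rule only looks for spaces; arguments containing double quotes or other shell metacharacters would need something like `shlex.quote` to be safely copy-pasteable.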

When it's actually run, each argument reaches the program intact, because we pass the command as an argument list to subprocess.run rather than as a shell string:

def run(*popenargs, input=None, ...):
    with Popen(*popenargs, ...) as process:
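
To make that concrete, here is a small sketch (not ArchiveBox code) that passes an argument list to `subprocess.run` and has the child process echo back what it received. The child is just the current Python interpreter running a one-liner; the user-agent value is shortened and the URL is the placeholder from the earlier comment.

```python
import subprocess
import sys

# An argument containing spaces, with no quotes added
arg = '--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64)'

# The child prints each argument it received on its own line
child_code = 'import sys; print("\\n".join(sys.argv[1:]))'

result = subprocess.run(
    [sys.executable, '-c', child_code, arg, 'https://url.com'],
    stdout=subprocess.PIPE,
    universal_newlines=True,
)

received = result.stdout.splitlines()
print(received)
```

The child sees the user-agent flag as a single argument even though the parent never quoted it, which is why the quotes only matter in the printed, copy-pasteable form.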