[GH-ISSUE #1390] Bug: Enter a valid URL. #2356

Closed
opened 2026-03-01 17:58:26 +03:00 by kerem · 2 comments
Owner

Originally created by @raikiriww on GitHub (Mar 27, 2024).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1390

Describe the bug

I use ArchiveBox (Docker image: archivebox/archivebox:dev) to archive "http://scz.617.cn:8/body/200004171952.txt". Archiving works, but when I try to change the snapshot's Title, an error message is displayed: "Please correct the error below.". The field with the error is Url: "Enter a valid URL.". But the URL is valid; it is "http://scz.617.cn:8/body/200004171952.txt", and I can open it in my browser.

Steps to reproduce

Archive "http://scz.617.cn:8/body/200004171952.txt", then edit one of the snapshot's fields, such as "Title", and save.
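For anyone triaging this, a minimal standard-library sketch of what makes this URL unusual: it splits the URL into its components and shows that it is structurally well-formed but uses a single-digit port. The `describe` helper is a hypothetical name introduced for illustration; it is not part of ArchiveBox or Django and does not reproduce the admin form's validation.

```python
from urllib.parse import urlsplit

URL = "http://scz.617.cn:8/body/200004171952.txt"


def describe(url: str) -> dict:
    """Hypothetical helper: return the URL components a validator would inspect."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname,
        "port": parts.port,  # parsed as an int, or None when no port is given
        "path": parts.path,
    }


info = describe(URL)
print(info)
```

The standard library parses the URL without complaint, which suggests the rejection comes from a stricter rule in the form's validator rather than from a malformed URL.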

Screenshots or log output

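Screenshot: https://github.com/ArchiveBox/ArchiveBox/assets/37997482/3939c17c-5f13-40ab-b19b-3a5903d7b139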

ArchiveBox version

0.7.3
ArchiveBox v0.7.3+editable COMMIT_HASH=0872c84 BUILD_TIME=2024-03-15 03:23:01 1710472981
IN_DOCKER=True IN_QEMU=False ARCH=x86_64 OS=Linux PLATFORM=Linux-5.15.0-97-generic-x86_64-with-glibc2.36 PYTHON=Cpython
FS_ATOMIC=True FS_REMOTE=True FS_USER=0:0 FS_PERMS=644
DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND=sonic LDAP=False

[i] Dependency versions:
 √  PYTHON_BINARY         v3.11.8         valid     /usr/local/bin/python3.11                                                   
 √  SQLITE_BINARY         v2.6.0          valid     /usr/local/lib/python3.11/sqlite3/dbapi2.py                                 
 √  DJANGO_BINARY         v3.1.14         valid     /usr/local/lib/python3.11/site-packages/django/__init__.py                  
 √  ARCHIVEBOX_BINARY     v0.7.3          valid     /usr/local/bin/archivebox                                                   

 √  CURL_BINARY           v8.5.0          valid     /usr/bin/curl                                                               
 √  WGET_BINARY           v1.21.3         valid     /usr/bin/wget                                                               
 √  NODE_BINARY           v20.11.1        valid     /usr/bin/node                                                               
 √  SINGLEFILE_BINARY     v1.1.46         valid     /app/node_modules/single-file-cli/single-file                               
 √  READABILITY_BINARY    v0.0.11         valid     /app/node_modules/readability-extractor/readability-extractor               
 √  MERCURY_BINARY        v1.0.0          valid     /app/node_modules/@postlight/parser/cli.js                                  
 √  GIT_BINARY            v2.39.2         valid     /usr/bin/git                                                                
 √  YOUTUBEDL_BINARY      v2023.12.30     valid     /usr/local/bin/yt-dlp                                                       
 √  CHROME_BINARY         v123.0.6312.4   valid     /usr/bin/chromium-browser                                                   
 √  RIPGREP_BINARY        v13.0.0         valid     /usr/bin/rg                                                                 

[i] Source-code locations:
 √  PACKAGE_DIR           24 files        valid     /app/archivebox                                                             
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates                                                   
 -  CUSTOM_TEMPLATES_DIR  -               disabled  None                                                                        

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled  None                                                                        
 -  COOKIES_FILE          -               disabled  None                                                                        

[i] Data locations:
 √  OUTPUT_DIR            6 files @       valid     /data                                                                       
 √  SOURCES_DIR           26 files        valid     ./sources                                                                   
 √  LOGS_DIR              1 files         valid     ./logs                                                                      
 √  ARCHIVE_DIR           6 files         valid     ./archive                                                                   
 √  CONFIG_FILE           81.0 Bytes      valid     ./ArchiveBox.conf                                                           
 √  SQL_INDEX             248.0 KB        valid     ./index.sqlite3 
Author
Owner

@raikiriww commented on GitHub (Mar 27, 2024):

I have tested it on my own computer. With Django 3.1.14, the same version as in the Docker image, the URL "http://scz.617.cn:8/body/200004171952.txt" is considered invalid.

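Screenshot: https://github.com/ArchiveBox/ArchiveBox/assets/37997482/bbf73d4d-1ef7-4238-83e6-90d4e80bd62f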

Then I discovered that a pull request (#1388) had already been merged, so I tested it on django-4.2.11 and the URL passed.

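Screenshot: https://github.com/ArchiveBox/ArchiveBox/assets/37997482/6c02b6b5-3cec-448d-a8c9-cc51af539fda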

It seems that this issue has already been resolved.
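One plausible explanation, which this thread does not confirm, is that the older validator's port pattern required at least two digits, so a single-digit port such as `:8` was rejected. The sketch below illustrates that idea with two deliberately simplified patterns. `STRICT`, `LENIENT`, and `accepts` are hypothetical names, and neither pattern is Django's actual `URLValidator` regex.

```python
import re

# Simplified, hypothetical patterns for illustration only (not Django's real regex).
HOST = r"[a-z0-9.-]+"
# Assumed older behavior: the port, if present, must have 2 to 5 digits.
STRICT = re.compile(rf"^https?://{HOST}(?::\d{{2,5}})?(?:/\S*)?$", re.IGNORECASE)
# Assumed newer behavior: the port, if present, may have 1 to 5 digits.
LENIENT = re.compile(rf"^https?://{HOST}(?::\d{{1,5}})?(?:/\S*)?$", re.IGNORECASE)


def accepts(pattern: re.Pattern, url: str) -> bool:
    """Return True when the whole URL matches the given pattern."""
    return pattern.match(url) is not None


single_digit_port = "http://scz.617.cn:8/body/200004171952.txt"
two_digit_port = "http://scz.617.cn:80/body/200004171952.txt"

print(accepts(STRICT, single_digit_port))   # False
print(accepts(LENIENT, single_digit_port))  # True
print(accepts(STRICT, two_digit_port))      # True
```

If this assumption holds, only URLs with a one-digit port would be affected, which would match the report that the URL opens fine in a browser yet fails form validation.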

Author
Owner

@pirate commented on GitHub (Mar 28, 2024):

Yup, you beat me to it @raikiriww. The newer Django version's URLField should now accept those URLs.

The latest v0.8.0-rc is currently in the alpha testing stage and isn't ready for general use yet, but keep an eye on our releases page to see when it's out! https://github.com/ArchiveBox/ArchiveBox/releases/tag/v0.8.0-rc
