[GH-ISSUE #456] Support for network drives or filesystems that don't implement FSYNC #303

Closed
opened 2026-03-01 14:42:15 +03:00 by kerem · 16 comments

Originally created by @blackberryoctopus on GitHub (Aug 20, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/456

Describe the bug

The issue occurs when attempting to initialize an archivebox directory on a mounted network drive, and it is reproducible in multiple directories on the NAS. Initializing an archivebox directory on the machine's local drive succeeds. Additionally, if a successfully initialized directory is copied to the NAS, future archivebox operations on that copied directory fail. Based on the traceback, the write of the config file appears to fail.

I investigated the trace output and added print statements to the atomicwrites function in which the failure occurs, at line 46 of atomicwrites __init__.py.
Additionally, I uncommented the print statement at line 41 of archivebox system.py for further debug info.

Steps to reproduce

mkdir archivebox; cd archivebox; archivebox init

Screenshots or log output

OUTPUT:

$ archivebox init
[i] [2020-08-20 19:55:28] ArchiveBox v0.4.21: archivebox init
     /Volumes/Public/data/archivebox

[+] Initializing a new ArchiveBox collection in this folder...
    /Volumes/Public/data/archivebox
------------------------------------------------------------------

[+] Building archive folder structure...
    √ /Volumes/Public/data/archivebox/sources
    √ /Volumes/Public/data/archivebox/archive
    √ /Volumes/Public/data/archivebox/logs

 Atomic Write: w /Volumes/Public/data/archivebox/ArchiveBox.conf 437 overwrite=True
fd is: 3
fcntl.F_FULLFSYNC is 51
fd is: 3
fcntl.F_FULLFSYNC is 51
Traceback (most recent call last):
  File "/Users/user123/.pyenv/versions/3.8.5/bin/archivebox", line 8, in <module>
    sys.exit(main())
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/cli/__init__.py", line 122, in main
    run_subcommand(
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/cli/__init__.py", line 62, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/cli/archivebox_init.py", line 33, in main
    init(
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/util.py", line 111, in typechecked_function
    return func(*args, **kwargs)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/main.py", line 294, in init
    write_config_file({}, out_dir=out_dir)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/config/__init__.py", line 376, in write_config_file
    atomic_write(config_path, CONFIG_HEADER)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/util.py", line 111, in typechecked_function
    return func(*args, **kwargs)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/system.py", line 46, in atomic_write
    f.write(contents)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/atomicwrites/__init__.py", line 171, in _open
    self.commit(f)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/atomicwrites/__init__.py", line 204, in commit
    replace_atomic(f.name, self._path)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/atomicwrites/__init__.py", line 101, in replace_atomic
    return _replace_atomic(src, dst)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/atomicwrites/__init__.py", line 58, in _replace_atomic
    _sync_directory(os.path.normpath(os.path.dirname(dst)))
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/atomicwrites/__init__.py", line 52, in _sync_directory
    _proper_fsync(fd)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/atomicwrites/__init__.py", line 46, in _proper_fsync
    fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
FileNotFoundError: [Errno 2] No such file or directory

Here is the output of mount for the two disks:
SUCCESS DRIVE:
/dev/disk1s5 on /System/Volumes/Data (apfs, local, journaled, nobrowse)
FAIL DRIVE:
//;AUTH=No%20User%20Authent@Drobo-5N2-RYE._afpovertcp._tcp.local/Public on /Volumes/Public (afpfs, nodev, nosuid, mounted by user123)

Here are the file permissions of the FAIL directory
drwxrwxrwx 1 user123 staff 264 Aug 20 15:55 archivebox

Software versions

  • OS: MacOS 10.15.6 (19G73)
  • ArchiveBox version: ArchiveBox v0.4.21
  • Python version: Python 3.8.5
kerem closed this issue 2026-03-01 14:42:15 +03:00

@pirate commented on GitHub (Aug 20, 2020):

It looks like your AFP server might not support AFP command 78 (FPSyncDir via fcntl(F_FULLFSYNC)). In the past, ArchiveBox's lack of atomic writes has caused painful corruption issues when multiple processes tried to write to the same index files, so we fixed it by enforcing atomic writes everywhere with fsyncs and file renaming. A consequence is that if your underlying filesystem ignores/skips fsyncs, ArchiveBox will be completely unable to run.

Is there any chance you're able to mount the network drive over SMB/NFS instead to test whether it's an issue with the network storage protocol or the filesystem or something else? Or maybe put your archive index.{json|sqlite3|html} and config file on a local drive and only put the data/archive/ subfolder on the network drive (which contains the actual page content)?

Relevant:

  • https://bugzilla.samba.org/show_bug.cgi?id=12380
  • https://lists.apple.com/archives/filesystem-dev/2008/Apr/msg00031.html
  • https://github.com/untitaker/python-atomicwrites/issues/17
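
For readers unfamiliar with the pattern described in this comment, here is a minimal sketch of an atomic write built from a temporary file, an fsync, and a rename. It is not ArchiveBox's or atomicwrites' actual code: the function name `atomic_write_sketch` is hypothetical, and it uses plain `os.fsync` for portability, whereas the traceback shows `fcntl.fcntl(fd, fcntl.F_FULLFSYNC)` being called on macOS. It assumes a POSIX system.

```python
import os
import tempfile


def atomic_write_sketch(path: str, contents: str) -> None:
    """Write contents to path via a temp file, an fsync, and a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(contents)
            f.flush()
            os.fsync(f.fileno())  # raises OSError if the filesystem rejects the sync
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Remove the temp file if anything failed before the rename completed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the containing directory so the rename itself is flushed to disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

On a filesystem that rejects sync calls, the two `os.fsync` lines are where an `OSError` like the one in the traceback would surface: the first on the file, the second on the directory.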

@blackberryoctopus commented on GitHub (Aug 21, 2020):

@pirate
A different error occurs when mounting the NAS over smb, but the error appears to happen at the same point in the code execution.

Here is the new mount output for the drive:
//GUEST:@169.254.8.156/Public on /Volumes/Public (smbfs, nodev, nosuid, noowners, mounted by user123)

Here is the traceback

$ archivebox init
[i] [2020-08-21 15:17:34] ArchiveBox v0.4.21: archivebox init
     /Volumes/Public/data/archivebox

[+] Initializing a new ArchiveBox collection in this folder...
    /Volumes/Public/data/archivebox
------------------------------------------------------------------

[+] Building archive folder structure...
    √ /Volumes/Public/data/archivebox/sources
    √ /Volumes/Public/data/archivebox/archive
    √ /Volumes/Public/data/archivebox/logs

 Atomic Write: w /Volumes/Public/data/archivebox/ArchiveBox.conf 437 overwrite=True
fd is: 3
fcntl.F_FULLFSYNC is 51
Traceback (most recent call last):
  File "/Users/user123/.pyenv/versions/3.8.5/bin/archivebox", line 8, in <module>
    sys.exit(main())
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/cli/__init__.py", line 122, in main
    run_subcommand(
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/cli/__init__.py", line 62, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/cli/archivebox_init.py", line 33, in main
    init(
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/util.py", line 111, in typechecked_function
    return func(*args, **kwargs)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/main.py", line 294, in init
    write_config_file({}, out_dir=out_dir)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/config/__init__.py", line 376, in write_config_file
    atomic_write(config_path, CONFIG_HEADER)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/util.py", line 111, in typechecked_function
    return func(*args, **kwargs)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/archivebox/system.py", line 46, in atomic_write
    f.write(contents)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/atomicwrites/__init__.py", line 170, in _open
    self.sync(f)
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/atomicwrites/__init__.py", line 199, in sync
    _proper_fsync(f.fileno())
  File "/Users/user123/.pyenv/versions/3.8.5/lib/python3.8/site-packages/atomicwrites/__init__.py", line 46, in _proper_fsync
    fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
OSError: [Errno 45] Operation not supported

@pirate commented on GitHub (Aug 21, 2020):

Ah yeah, unfortunately if F_FULLFSYNC is not supported then you're sort of screwed. I don't want to allow non-fsync'ed writes to disk in the codebase; they cause too many headaches.

SMBv4 definitely supports FSYNC when configured to do so. Can you check your NAS and see if there are any options you can tweak to allow fsyncs?


@blackberryoctopus commented on GitHub (Aug 21, 2020):

@pirate
Thanks for your help with the investigation.

I have one suggestion/question:
Is it possible to improve the initialization routine so that it outputs more descriptive debug info when an incompatible destination drive is used?


@blackberryoctopus commented on GitHub (Nov 6, 2020):

@pirate I'm wondering if there's any update on this issue? If not, what changes need to be made to better inform users when they attempt archiving on unsupported drive volumes?


@pirate commented on GitHub (Nov 10, 2020):

Added an explicit error: https://github.com/pirate/ArchiveBox/commit/fbd9a7caa6c227a59c16028cd00b059d60cba0a7

This can maybe be removed in the future if we fully move to SQLite for everything (including config). But that's far in the future, so I'm closing this for now with the error msg.
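
One way to surface this kind of failure early is to probe the target directory before writing anything. The sketch below is an illustration only and is not the code from the commit linked in this comment: `require_fsync`, `_sync`, and `UnsupportedFilesystemError` are hypothetical names. It repeats the two kinds of sync call seen in the tracebacks (one on a file, one on the containing directory) and converts any `OSError` into a descriptive message. It assumes a POSIX system where the `fcntl` module is available.

```python
import errno
import fcntl
import os
import tempfile


class UnsupportedFilesystemError(RuntimeError):
    """Hypothetical error type for directories that reject sync calls."""


def _sync(fd: int) -> None:
    # Use F_FULLFSYNC where the platform defines it (macOS), else plain fsync.
    if hasattr(fcntl, "F_FULLFSYNC"):
        fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
    else:
        os.fsync(fd)


def require_fsync(directory: str) -> None:
    """Raise UnsupportedFilesystemError if a file or the directory cannot be synced."""
    try:
        # Probe 1: sync a scratch file, as done before the rename.
        with tempfile.NamedTemporaryFile(dir=directory) as f:
            f.write(b"probe")
            f.flush()
            _sync(f.fileno())
        # Probe 2: sync the directory itself, as done after the rename.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            _sync(dir_fd)
        finally:
            os.close(dir_fd)
    except OSError as err:
        code = errno.errorcode.get(err.errno, str(err.errno))
        raise UnsupportedFilesystemError(
            f"{directory} does not appear to support synced writes "
            f"({code}: {err.strerror}). Keep the index and config files "
            "on a filesystem that supports fsync."
        ) from err
```

Calling `require_fsync(out_dir)` at the start of an init routine would turn a long traceback into a single message naming the directory and the errno.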


@blackberryoctopus commented on GitHub (Nov 10, 2020):

Thank you!


@lylebrown commented on GitHub (Jan 28, 2021):

I'm attempting to set up SMB storage myself. I don't mind using local storage for the application/config files; only the archive (which will take up more space) needs to be on a network drive.

You mentioned putting only data/archive on the SMB share. I tried that and I get 400 errors every time I attempt to archive a URL. Is that still not supported either? Just trying to understand if that's still an option or not. Here's the relevant section of my docker-compose.yml.

volumes:
      - ./data/archivebox:/data
      - /media/smb/archivebox/archivebox:/data/archive

@pirate commented on GitHub (Jan 28, 2021):

It should be supported, not sure why it's failing. Do you mind running the server with archivebox server --debug ... or setting archivebox config --set DEBUG=True and posting the verbatim output / screenshots of those 400 errors? That will help narrow down what the root cause is.


@lylebrown commented on GitHub (Jan 28, 2021):

So there were some permissions errors at first that I had to fix. But now I believe my issue is with atomic writes based on the error I'm getting. It's strange, because it did write the index.json file and it appears to be complete. Let me know if you need the full trace.


Request Method: POST
Request URL: http://archivebox.{mysite}/add/
Django Version: 3.1.3
Exception Type: PermissionError
Exception Value: [Errno 1] Operation not permitted: '/data/archive/1611858110.058611/index.json'
Exception Location: /app/archivebox/system.py, line 53, in atomic_write
Python Executable: /usr/local/bin/python
Python Version: 3.9.1
Python Path: ['/usr/local/bin',  '/usr/local/lib/python39.zip',  '/usr/local/lib/python3.9',  '/usr/local/lib/python3.9/lib-dynload',  '/usr/local/lib/python3.9/site-packages',  '/app',  '/data/node_modules/.bin',  '/app/archivebox',  '/data/node_modules/.bin']
Server time: Thu, 28 Jan 2021 18:21:50 +0000

@pirate commented on GitHub (Jan 28, 2021):

I need the full trace to know what part of the code called atomic_write, as the filesystem behavior may be different depending on where it's being called.


@lylebrown commented on GitHub (Jan 28, 2021):

Environment:


Request Method: POST
Request URL: http://archivebox.{mysite}/add/

Django Version: 3.1.3
Python Version: 3.9.1
Installed Applications:
['django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'django.contrib.admin',
 'core',
 'django_extensions']
Installed Middleware:
['django.middleware.security.SecurityMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware']



Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/usr/local/lib/python3.9/site-packages/django/core/handlers/base.py", line 179, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/views/generic/base.py", line 70, in view
    return self.dispatch(request, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/contrib/auth/mixins.py", line 109, in dispatch
    return super().dispatch(request, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/views/generic/base.py", line 98, in dispatch
    return handler(request, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/views/generic/edit.py", line 142, in post
    return self.form_valid(form)
  File "/app/archivebox/core/views.py", line 164, in form_valid
    add(**input_kwargs)
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/main.py", line 593, in add
    archive_links(new_links, overwrite=False, **archive_kwargs)
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/extractors/__init__.py", line 173, in archive_links
    archive_link(to_archive, overwrite=overwrite, methods=methods, out_dir=Path(link.link_dir))
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/extractors/__init__.py", line 95, in archive_link
    write_link_details(link, out_dir=out_dir, skip_sql_index=False)
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/__init__.py", line 332, in write_link_details
    write_json_link_details(link, out_dir=out_dir)
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/json.py", line 89, in write_json_link_details
    atomic_write(str(path), link._asdict(extended=True))
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/system.py", line 53, in atomic_write
    os.chmod(path, int(OUTPUT_PERMISSIONS, base=8))

Exception Type: PermissionError at /add/
Exception Value: [Errno 1] Operation not permitted: '/data/archive/1611858110.058611/index.json'

@pirate commented on GitHub (Jan 28, 2021):

Oh, it's just the chmod failing, not the actual write. What are the permissions and ownership on this dir /data/archive/1611858110.058611, and what user are you running archivebox as?


@lylebrown commented on GitHub (Jan 29, 2021):

That was the hint I needed, thanks! I may not be doing things in the ideal way with my smb share, but I'm using file_mode=0777,dir_mode=0777 as options in my fstab, and I needed to add the noperm option for it to be able to respond to chmod calls with success (even if they aren't actually making changes).
I'm open to suggestions to better manage permissions, but I don't think I have many options over smb.
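
The `noperm` mount option works around the problem at the filesystem layer. For comparison, the same tolerance could be expressed in application code, as in the minimal sketch below. This is not how ArchiveBox behaves (the traceback shows it calling `os.chmod` directly and letting the error propagate); the helper name `chmod_if_supported` and the default mode are assumptions made for illustration.

```python
import os
import warnings


def chmod_if_supported(path: str, mode: int = 0o644) -> bool:
    """Try to chmod path; return False instead of raising if the mount refuses."""
    try:
        os.chmod(path, mode)
    except PermissionError as err:
        # Some network mounts reject chmod even when the write itself succeeded.
        warnings.warn(f"Could not chmod {path}: {err}")
        return False
    return True
```

The trade-off is that a swallowed chmod failure can hide a real permissions problem, which is one reason to prefer fixing the mount options as described in this comment.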


@pirate commented on GitHub (Jan 29, 2021):

Yeah haha I also run my smb shares with forced file_mode=0777,dir_mode=0777, too many hours of my life wasted fighting with permissions on shared network drives. I implement my file access permissions at other layers.


@pirate commented on GitHub (Apr 12, 2022):

Note I've added a new DB/filesystem troubleshooting area to the wiki that may help people arriving here from Google: https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives#database-troubleshooting

Contributions/suggestions welcome there.
