mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 05:16:00 +03:00
[GH-ISSUE #617] Apache2 access.log and error.log mounted on s3 bucket problem. #349
Originally created by @andr3i00 on GitHub (Jun 6, 2017).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/617
Hello guys,
I have a little issue. I've been following your tutorial and succeeded in mounting an S3 bucket, but when I try to save the access.log and error.log files to the bucket, the files are created but remain empty until I open them with "tail -f access.log". After that command, data is saved to access.log.
Can you help me with this problem?
Best regards,
Adrian
@sqlbot commented on GitHub (Jun 8, 2017):
Continually-growing log files are not a valid use case for s3fs.
S3 objects are atomic. They can only be written or overwritten en bloc -- S3 itself has no capability that allows appending to an existing object.
Appending essentially requires downloading the entire object, appending the new data, and uploading a new object with the same name. However, S3's eventual consistency on overwrites of existing objects makes even that cycle unreliable: if I upload version 1 of an object, then download it, append some new data, and upload version 2, a later download made with the intention of producing version 3 may return version 2, but it may also still return version 1, because there is no guarantee that a read after an overwrite immediately returns the latest version uploaded. It eventually will, but not necessarily right away.
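The download-append-reupload cycle described above can be sketched with a small mock. This is illustrative only: the `WholeObjectStore` class and helper names are invented for the example (they are not part of s3fs or the AWS SDK), and the in-memory dict is always consistent, so it does not model the stale-read risk the comment warns about.

```python
class WholeObjectStore:
    """Stand-in for an S3 bucket: objects can only be read or
    (over)written as a whole -- there is no append operation."""

    def __init__(self):
        self._objects = {}

    def put_object(self, key, body: bytes):
        # Overwrites the previous version en bloc.
        self._objects[key] = body

    def get_object(self, key) -> bytes:
        return self._objects[key]


def append_to_object(store, key, new_data: bytes):
    """Emulated append: fetch the whole object, concatenate the new
    data, and upload the whole thing again under the same key."""
    try:
        current = store.get_object(key)
    except KeyError:
        current = b""  # object does not exist yet
    store.put_object(key, current + new_data)


store = WholeObjectStore()
append_to_object(store, "access.log", b"GET / 200\n")
append_to_object(store, "access.log", b"GET /favicon.ico 404\n")
```

Against real (2017-era) S3, the `get_object` step could return a stale version between the two appends, silently losing the first log line; the mock cannot show that failure mode.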
This is an inherent part of the design of S3.
@andr3i00 commented on GitHub (Jun 8, 2017):
Well, that's not the answer I expected, but is there another way to mount an S3 bucket on a Linux (Ubuntu) machine that can do these things natively?
@sqlbot commented on GitHub (Jun 9, 2017):
I don't speak authoritatively for the s3fs project, so there may be a more official answer forthcoming, but I don't expect it could be any different.
I have been working directly with the S3 API for a number of years, and this is really not a suitable use case for S3 itself.
S3 is (of course) excellent for archiving logs, but not for streaming them. Files that grow over time are not suited for S3.
Another example of something you can't do with S3 (and thus can't do with s3fs) is to use it as a backing store for a working, live database like MySQL or Sqlite.
It is actually possible to -- sort of -- stream data into S3, but it's unsuited to log files: the stream data must be buffered somewhere (memory or local disk) as it is collected, then committed to S3 in chunks no smaller than 5 MB each, with a maximum of 10,000 chunks, and the entire upload is invisible in the bucket until the write operation is finalized. These constraints are part of the design of S3. As you can probably see, this could be useful for writing large tarballs, but it is terrible for logs.
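The buffering behaviour described above can be sketched as follows. This is a toy model, not the real S3 multipart API (boto3's create_multipart_upload / upload_part / complete_multipart_upload): the `MultipartUpload` class and the plain-dict "bucket" are invented for the example, but the two limits match the ones stated in the comment.

```python
MIN_PART_SIZE = 5 * 1024 * 1024   # S3 minimum part size (except the last part)
MAX_PARTS = 10_000                # S3 maximum number of parts per upload


class MultipartUpload:
    def __init__(self, store, key):
        self.store, self.key = store, key
        self.buffer = bytearray()   # data must be buffered locally
        self.parts = []

    def write(self, data: bytes):
        self.buffer.extend(data)
        # A part can only be committed once the minimum size is reached.
        while len(self.buffer) >= MIN_PART_SIZE:
            if len(self.parts) >= MAX_PARTS:
                raise RuntimeError("exceeded the 10,000-part limit")
            self.parts.append(bytes(self.buffer[:MIN_PART_SIZE]))
            del self.buffer[:MIN_PART_SIZE]

    def complete(self):
        # Only the final part may be smaller than 5 MiB.
        if self.buffer:
            self.parts.append(bytes(self.buffer))
        # Only now does the object become visible in the "bucket".
        self.store[self.key] = b"".join(self.parts)


bucket = {}
up = MultipartUpload(bucket, "backup.tar")
up.write(b"x" * (6 * 1024 * 1024))   # one 5 MiB part committed, 1 MiB buffered
assert "backup.tar" not in bucket    # invisible until completed
up.complete()
```

Note why this fits tarballs but not logs: a log writer that crashes, or simply appends slowly, never reaches `complete()`, so nothing it wrote is ever visible.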
Fundamentally, Amazon S3 is not a filesystem. S3 is an object store. It has no notion of append, no concept of random writes. It deals with atomic operations on whole files. The s3fs project, and others like it, try to bridge the difference by emulating filesystem semantics for S3 object operations, but the filesystem/object store functionality difference is actually quite large. We are bridging a very large "impedance mismatch" and there are things that are inevitably either impossible at worst, or a Very Bad Idea at best.
@gaul commented on GitHub (Oct 10, 2020):
As @sqlbot said,
tail -f on log files is not a good use case for s3fs. That being said, s3fs could periodically sync file uploads, either after a threshold of bytes written or after a number of seconds since the last sync. #1257 discusses this.
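The periodic-sync idea above can be sketched as a simple flush policy. The `SyncPolicy` class and its thresholds are hypothetical, written only to illustrate the "bytes written or seconds since last sync" trigger; what s3fs would actually implement is the subject of #1257.

```python
import time


class SyncPolicy:
    """Decide when a dirty file should be flushed to S3: after a
    threshold of bytes written, or after a maximum age since the
    last sync, whichever comes first. Thresholds are illustrative."""

    def __init__(self, max_dirty_bytes=1024 * 1024, max_age_seconds=5.0):
        self.max_dirty_bytes = max_dirty_bytes
        self.max_age_seconds = max_age_seconds
        self.dirty_bytes = 0
        self.last_sync = time.monotonic()

    def record_write(self, nbytes: int):
        self.dirty_bytes += nbytes

    def should_sync(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        return (self.dirty_bytes >= self.max_dirty_bytes
                or now - self.last_sync >= self.max_age_seconds)

    def mark_synced(self, now=None):
        self.dirty_bytes = 0
        self.last_sync = time.monotonic() if now is None else now


policy = SyncPolicy(max_dirty_bytes=100, max_age_seconds=5.0)
policy.record_write(40)
print(policy.should_sync(now=policy.last_sync + 1))   # False: under both thresholds
policy.record_write(80)
print(policy.should_sync(now=policy.last_sync + 1))   # True: 120 bytes >= 100
```

Each sync would still be a whole-object (or multipart) re-upload, so this bounds data loss for slowly growing files rather than making appends efficient.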