[GH-ISSUE #617] Apache2 access.log and error.log mounted on s3 bucket problem. #349

Open
opened 2026-03-04 01:44:38 +03:00 by kerem · 4 comments

Originally created by @andr3i00 on GitHub (Jun 6, 2017).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/617

Hello guys,
I have a little issue. I've been following your tutorial and I've succeeded in mounting an S3 bucket, but when I try to save the access.log and error.log files on the bucket, the files are created but stay empty until I open them with "tail -f access.log". After that command, data start showing up in access.log.
Can you help me with this problem?

Best regards,

Adrian

@sqlbot commented on GitHub (Jun 8, 2017):

Continually-growing log files are not a valid use case for s3fs.

S3 objects are atomic. They can only be written or overwritten en bloc -- S3 itself has no capability that allows appending to an existing object.

Appending essentially requires that the entire object be downloaded, the new data appended, and a new object (with the same name) be uploaded. However, S3's eventual consistency on overwrites of existing objects may mean that if I upload version 1 of an object, then download it and append some new data and upload version 2 of the object, then download it with the intention of appending some new data and uploading version 3 of the object... I have a potential problem: the download after I upload version 2 may return version 2, but it may return version 1, because there is no guarantee that read-after-overwrite will immediately return the latest version uploaded. It eventually will, but not necessarily right away.

This is an inherent part of the design of S3.
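To make that concrete, here is a minimal sketch (using boto3; the bucket name, key, and log line are placeholders) of what every logical "append" to an S3 object has to do behind the scenes, and roughly what any S3 client, s3fs included, must emulate when it flushes a changed file:

```python
# Minimal sketch of emulating "append" on S3 with boto3.
# Bucket, key, and data below are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")

def append_to_object(bucket: str, key: str, new_data: bytes) -> None:
    """Emulate append: download the whole object, add data, re-upload it."""
    try:
        existing = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    except s3.exceptions.NoSuchKey:
        existing = b""  # object does not exist yet
    # The entire object is rewritten; S3 offers no partial or append write.
    s3.put_object(Bucket=bucket, Key=key, Body=existing + new_data)

append_to_object("my-log-bucket", "apache2/access.log", b"127.0.0.1 - GET /\n")
```

For a log file that grows by a few bytes at a time, this means re-transferring the whole file on every flush, which is exactly why it is a poor fit.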

@andr3i00 commented on GitHub (Jun 8, 2017):

Well, it's not the answer I expected, but is there another way to mount an S3 bucket on a Linux (Ubuntu) machine that can do these things "naturally"?

@sqlbot commented on GitHub (Jun 9, 2017):

I don't speak authoritatively for the s3fs project, so there may be a more official answer forthcoming, but I don't expect it could be any different.

I have been working directly with the S3 API for a number of years, and this is really not a suitable use case for S3 itself.

S3 is (of course) excellent for archiving logs, but not for streaming them. Files that grow over time are not suited for S3.

Another example of something you can't do with S3 (and thus can't do with s3fs) is to use it as a backing store for a working, live database like MySQL or Sqlite.

It is actually possible to -- sort of -- stream data into S3, but it's unsuited to log files: the stream data must be buffered somewhere (memory or local disk) as it is collected, and then committed to S3 in parts not smaller than 5 MB each, with a maximum of 10,000 parts, and the entire upload is invisible in the bucket until the write operation is finalized. These constraints are part of the design of S3. As you can probably see, this could be useful for writing large tarballs, but terrible for logs.
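For reference, a hedged boto3 sketch of that multipart mechanism (bucket, key, and file names are illustrative only): each part is buffered locally, non-final parts must be at least 5 MB, and the object only appears in the bucket once the upload is completed.

```python
# Illustrative multipart upload with boto3; names are placeholders.
import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "my-archive-bucket", "backups/site.tar"
PART_SIZE = 5 * 1024 * 1024  # 5 MB minimum for every part except the last

upload = s3.create_multipart_upload(Bucket=BUCKET, Key=KEY)
parts = []
with open("site.tar", "rb") as src:
    part_number = 1
    while True:
        chunk = src.read(PART_SIZE)  # buffer a chunk locally before upload
        if not chunk:
            break
        resp = s3.upload_part(Bucket=BUCKET, Key=KEY,
                              UploadId=upload["UploadId"],
                              PartNumber=part_number, Body=chunk)
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1  # at most 10,000 parts are allowed

# Only now does the object become visible in the bucket, as one atomic whole.
s3.complete_multipart_upload(Bucket=BUCKET, Key=KEY,
                             UploadId=upload["UploadId"],
                             MultipartUpload={"Parts": parts})
```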

Fundamentally, Amazon S3 is not a filesystem. S3 is an object store. It has no notion of append, no concept of random writes. It deals with atomic operations on whole files. The s3fs project, and others like it, try to bridge the difference by emulating filesystem semantics for S3 object operations, but the filesystem/object store functionality difference is actually quite large. We are bridging a very large "impedance mismatch" and there are things that are inevitably either impossible at worst, or a Very Bad Idea at best.

@gaul commented on GitHub (Oct 10, 2020):

As @sqlbot said, tail -f on log files is not a good use case for s3fs. That being said, s3fs could periodically sync file uploads, either by a threshold of bytes written or by seconds since the last sync. #1257 discusses this.
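As an illustration of that idea only (a sketch of the proposal, not existing s3fs behavior; both thresholds are hypothetical):

```python
# Sketch of the periodic-sync decision: flush a dirty file to S3 once
# either threshold is crossed. Thresholds here are made-up examples.
import time

SYNC_BYTES = 1 * 1024 * 1024   # hypothetical: flush after 1 MB of new writes
SYNC_SECONDS = 60              # hypothetical: or after 60 s since last flush

def should_flush(bytes_since_flush: int, last_flush_time: float) -> bool:
    return (bytes_since_flush >= SYNC_BYTES
            or time.monotonic() - last_flush_time >= SYNC_SECONDS)
```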
