[GH-ISSUE #35] Feature request: S3 crawler #35

Closed
opened 2026-02-27 15:54:36 +03:00 by kerem · 6 comments

Originally created by @jamesinc on GitHub (May 16, 2017).
Original GitHub issue: https://github.com/RD17/ambar/issues/35

I'd really like to be able to crawl an S3 bucket. I feel sketchy serving web traffic out of my Dropbox!


@sochix commented on GitHub (May 16, 2017):

Hi! Would the FTP or SMB crawler not work for you?


@jamesinc commented on GitHub (May 17, 2017):

It would be possible to work around it using [s3fs](https://github.com/s3fs-fuse/s3fs-fuse). I was just hoping for a native solution that didn't require mounting a bucket into a directory. Mounting would work, but it is not ideal, because s3fs would want to cache the contents of the bucket on the local disk, and I'm trying to avoid needing lots of storage on the instance running Ambar.

@jmgilman commented on GitHub (May 18, 2017):

+1


@sochix commented on GitHub (Apr 19, 2018):

In the latest release, you can mount an S3 folder to the local crawler.


@AyKarsi commented on GitHub (Sep 10, 2018):

How can I mount an S3 folder to the local crawler?


@sochix commented on GitHub (Sep 10, 2018):

@AyKarsi just search for how to map an S3 folder to a local folder, and then map that local folder to the crawler.
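
For anyone landing here with the same question, below is a minimal sketch of the first step only (exposing the bucket as a local folder), assuming s3fs-fuse, the tool mentioned earlier in this thread, is installed. The bucket name, mount point, and credentials file path are hypothetical placeholders, and the helper names `build_s3fs_command` and `mount_bucket` are made up for illustration. The command shape follows the basic usage in the s3fs-fuse README; pointing the Ambar local crawler at the resulting folder is not shown, since that depends on your Ambar configuration.

```python
import shlex
import subprocess
from pathlib import Path


def build_s3fs_command(bucket, mount_point, passwd_file):
    """Return the argv list that mounts `bucket` at `mount_point` with s3fs."""
    return [
        "s3fs",
        bucket,
        str(mount_point),
        "-o",
        f"passwd_file={passwd_file}",
    ]


def mount_bucket(bucket, mount_point, passwd_file):
    """Create the mount point if needed, then run s3fs.

    Requires s3fs-fuse to be installed and the credentials file to exist.
    """
    Path(mount_point).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_s3fs_command(bucket, mount_point, passwd_file), check=True)


if __name__ == "__main__":
    # Hypothetical values: replace with your own bucket, folder, and credentials file.
    cmd = build_s3fs_command("my-bucket", "/mnt/s3-docs", "/etc/passwd-s3fs")
    # Only print the command here; call mount_bucket(...) to actually mount.
    print(shlex.join(cmd))
```

Once the bucket is mounted, the mount point (`/mnt/s3-docs` in this sketch) is the local folder you would then map to the crawler.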
