[GH-ISSUE #50] SMB crawler not working, share verified working #49

Closed
opened 2026-02-27 15:54:41 +03:00 by kerem · 15 comments
Owner

Originally created by @effnorwood on GitHub (Jul 14, 2017).
Original GitHub issue: https://github.com/RD17/ambar/issues/50

Installed today on a clean Ubuntu 16.04 system. I verified that I can connect to the share from both Windows and Linux (using mount -t cifs). Crawler config:

{
  "id": "data",
  "uid": "data_d033e22ae348aeb5660fc2140aec35850c4da997",
  "description": "nas crawler",
  "type": "smb",
  "locations": [
    {
      "host_name": "nas",
      "ip_address": "10.0.0.100",
      "location": "data"
    }
  ],
  "file_regex": "(\\.doc[a-z]*$)|(\\.xls[a-z]*$)|(\\.txt$)|(\\.csv$)|(\\.htm[a-z]*$)|(\\.ppt[a-z]*$)|(\\.pdf$)|(\\.msg$)|(\\.eml$)|(\\.rtf$)|(\\.md$)|(\\.png$)|(\\.bmp$)|(\\.tif[f]*$)|(\\.jp[e]*g$)|(\\.hwp$)",
  "credentials": {
    "auth_type": "ntlm",
    "login": "jes",
    "password": "******",
    "token": ""
  },
  "schedule": {
    "is_active": true,
    "cron_schedule": "*/15 * * * *"
  },
  "max_file_size_bytes": 30000000,
  "verbose": true
}

Error:
2017-07-14 11:15:00.688: [info] filecrawler initialized
2017-07-14 11:15:00.695: [error]
2017-07-14 11:15:00.700: [error] error connecting to Smb share on nas

Notice that the first [error] line carries no message at all.

Also, how do I get to the logs for this system? I looked at docker logs but they said nothing about this issue. Thank you.
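
Editorial note: one quick sanity check for a config like the one above is to confirm that it parses as JSON and that `file_regex` matches the file names you expect. The sketch below is only an illustration. It uses a trimmed copy of the config, Python's `re` module as a stand-in for whatever regex engine the crawler actually uses, and a hypothetical helper named `would_be_crawled`.

```python
import json
import re

# Trimmed copy of the crawler config from the post; only the fields used here are kept.
# The raw string keeps the doubled backslashes exactly as they appear in the JSON file.
config_text = r'''
{
  "type": "smb",
  "file_regex": "(\\.doc[a-z]*$)|(\\.xls[a-z]*$)|(\\.txt$)|(\\.pdf$)|(\\.jp[e]*g$)",
  "schedule": {"is_active": true, "cron_schedule": "*/15 * * * *"}
}
'''

# json.loads raises json.JSONDecodeError if the text is not valid JSON.
config = json.loads(config_text)

# After JSON decoding, "\\." becomes the regex escape "\." (a literal dot).
pattern = re.compile(config["file_regex"])


def would_be_crawled(file_name):
    """Hypothetical helper: True if file_name matches the configured file_regex."""
    return pattern.search(file_name) is not None


for name in ["report.docx", "photo.jpeg", "archive.zip", "notes.txt.bak"]:
    print(name, would_be_crawled(name))
```

Because every alternative ends in `$`, a name such as `notes.txt.bak` is not matched even though it contains `.txt`.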

kerem closed this issue 2026-02-27 15:54:41 +03:00

@sochix commented on GitHub (Jul 19, 2017):

Maybe you didn't escape correctly some chars in the password?


@effnorwood commented on GitHub (Jul 20, 2017):

Password is just all lower case and kind of like 'heythisismypasswordbutitslongsoicanrememberit'. No special characters.


@sochix commented on GitHub (Jul 24, 2017):

OK, can you please run `docker logs ambar_crawler_c0` and paste the results here?


@sochix commented on GitHub (Aug 4, 2017):

@effnorwood any news?


@effnorwood commented on GitHub (Aug 4, 2017):

Just saw this, will do and get back. Thanks!


@sochix commented on GitHub (Aug 18, 2017):

@effnorwood any news?


@sochix commented on GitHub (Aug 31, 2017):

No news for 2 weeks. Closing


@yuergen commented on GitHub (Nov 9, 2017):

I have the same issue. I can ping the SMB host from a freshly installed virtual machine running Ubuntu 16.04. `docker logs` does not provide any output at all:

root@banane /o/ambar# docker logs ambar_crawler_c0
root@banane /o/ambar# 

To mount the share with `mount -t cifs`, I need to supply `domain="DOMAINNAME"` as a mount option in addition to the credentials.


@sochix commented on GitHub (Nov 9, 2017):

@yuergen please share with us your crawler config


@yuergen commented on GitHub (Nov 9, 2017):

@sochix yes, sure. Here it is:

{
  "id": "daten",
  "description": "daten",
  "type": "smb",
  "locations": [
    {
      "host_name": "10.0.0.3",
      "ip_address": "10.0.0.3",
      "location": "daten"
    }
  ],
  "file_regex": "(\\.od[a-z]*$)|(\\.doc[a-z]*$)|(\\.xls[a-z]*$)|(\\.txt$)|(\\.csv$)|(\\.htm[a-z]*$)|(\\.ppt[a-z]*$)|(\\.pdf$)|(\\.msg$)|(\\.zip$)|(\\.eml$)|(\\.rtf$)|(\\.md$)|(\\.png$)|(\\.bmp$)|(\\.tif[f]*$)|(\\.jp[e]*g$)|(\\.hwp$)",
  "credentials": {
    "auth_type": "ntlm",
    "login": "js",
    "password": "******",
    "token": ""
  },
  "schedule": {
    "is_active": true,
    "cron_schedule": "24 02 * * *"
  },
  "max_file_size_bytes": 30000000,
  "verbose": true
}

@sochix commented on GitHub (Nov 16, 2017):

@yuergen `host_name` should be the name of the PC, not its IP address.
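
Editorial note: the check suggested above can be automated before a config is submitted. The following is a minimal sketch, not part of Ambar. The helpers `looks_like_ip` and `check_location` are hypothetical names, and the only rule encoded is the one stated in this comment (`host_name` should be a machine name, while `ip_address` holds the address).

```python
import ipaddress


def looks_like_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def check_location(location):
    """Hypothetical helper: return warnings for one entry of the "locations" list."""
    warnings = []
    if looks_like_ip(location.get("host_name", "")):
        warnings.append("host_name looks like an IP address; use the machine name")
    if not looks_like_ip(location.get("ip_address", "")):
        warnings.append("ip_address does not parse as an IP address")
    return warnings


# The location entry from the config posted in this thread.
print(check_location({"host_name": "10.0.0.3", "ip_address": "10.0.0.3", "location": "daten"}))
```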


@yuergen commented on GitHub (Nov 18, 2017):

I have found a workaround that can be used for testing: enabling `map untrusted to domain` in the Samba server's configuration allows Ambar to mount the share. Crawling then works.

What is the correct way to supply the samba domain name via the crawler configuration of ambar?


@sochix commented on GitHub (Nov 19, 2017):

Use the domain name in your login, e.g. `domain\user_name`.


@yuergen commented on GitHub (Nov 22, 2017):

@sochix I will test it as soon as possible and give you feedback if it works. Thank you.


@yuergen commented on GitHub (Nov 26, 2017):

@sochix, your suggestion works. Thank you!

For other users it might be important to know that the backslash must be escaped. This results in a config line similar to `"login": "DOMAIN\\user",`.
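
Editorial note: the need for the double backslash follows from JSON string escaping rather than from anything specific to Ambar. The sketch below illustrates this with Python's standard `json` module. The credential values are placeholders and `parses` is a hypothetical helper.

```python
import json


def parses(text):
    """Hypothetical helper: True if text is valid JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# A single backslash starts a JSON escape sequence, so "\u" followed by "ser" is rejected.
print(parses(r'{"login": "DOMAIN\user"}'))
# With the backslash doubled, the document is valid.
print(parses(r'{"login": "DOMAIN\\user"}'))

# Some names are worse: "\b" is a valid escape (backspace), so this parses to the wrong login.
silently_wrong = json.loads(r'{"login": "DOMAIN\bob"}')["login"]

# Building the value in code and serializing it avoids hand-escaping.
# In Python source, "DOMAIN\\user" is a string containing ONE backslash.
credentials = {"auth_type": "ntlm", "login": "DOMAIN\\user", "password": "******", "token": ""}
serialized = json.dumps(credentials)
print(serialized)

# Reading the config back yields the single-backslash login again.
domain, user = json.loads(serialized)["login"].split("\\", 1)
```

The `DOMAIN\bob` case is the one to watch for: it raises no error, yet the decoded login contains a backspace character instead of `\b`.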
