[GH-ISSUE #213] Not crawling updated data source #208
Originally created by @s1rk1t on GitHub (Jan 15, 2019).
Original GitHub issue: https://github.com/RD17/ambar/issues/213
I uploaded a bunch of new data to be crawled, but after I changed the data path in the compose file, the crawler no longer sees the data. Is there something else I need to do to point the crawler at the right spot?
Here's the code from my compose file:
```yaml
newcrawler:
  depends_on:
    serviceapi:
      condition: service_healthy
  image: ambar/ambar-local-crawler
  restart: always
  networks:
    - internal_network
  expose:
    - "8082"
  environment:
    - name=newcrawler
  volumes:
    - /home/ec2-user/ambar/AMBAR/newData:/usr/data
```
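One possible explanation, assuming nothing else in the stack changed: editing a `volumes` entry only updates the service definition, and a plain restart can leave the old container, with its old mount, still running. A minimal sketch of forcing Docker Compose to recreate the container and then checking what the crawler actually sees (the service name `newcrawler` is taken from the snippet above):

```sh
# Recreate the crawler container so the edited volume mapping takes effect;
# "restart" alone reuses the existing container and its old mount.
docker-compose up -d --force-recreate newcrawler

# Confirm the new data is visible at the path the crawler watches
# inside the container.
docker-compose exec newcrawler ls /usr/data
```

If `/usr/data` lists the new files but Ambar still doesn't index them, the problem is likely elsewhere in the pipeline.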
I'm running Ambar on an EC2 instance, if that matters. I'm trying to crawl around 500 files, none of them very large.
Update: I downloaded a fresh copy of Ambar and it sees the data like it should, so I'm retracing my steps to see where I messed things up.
Any help would be greatly appreciated.
Update: got it working, but I had to download a fresh installation to do it. After downloading, I included the updated data source in the initial build, then moved the resulting db, es, and rabbit files, along with the new data, over to my older build, which was then able to include the new data in its search. This doesn't seem like the optimal way to do it, given that I should be able to just update the data source in the .yml file and have it crawl the new data.
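For comparison, a sketch of the workflow that one would expect to suffice, per the last paragraph: change only the host side of the volume mapping and recreate just the crawler container (the stop/rm/up sequence below is the long-hand equivalent of `--force-recreate`). Whether the crawler also keeps server-side state tied to its `name=` value, which might explain why only a fresh install behaved correctly, is not confirmed anywhere in this thread:

```sh
# Hypothetical minimal workflow (not verified against Ambar internals).
# 1. In the compose file, repoint the host side of the mapping; the
#    container path /usr/data stays fixed:
#      volumes:
#        - /home/ec2-user/ambar/AMBAR/newData:/usr/data
# 2. Recreate only the crawler container so the new mount is used.
docker-compose stop newcrawler
docker-compose rm -f newcrawler
docker-compose up -d newcrawler
```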