mirror of
https://github.com/lldap/lldap.git
synced 2026-04-26 00:36:01 +03:00
[GH-ISSUE #847] [INTEGRATION] Creating a DB with k8s setup fails #303
Originally created by @zelogik on GitHub (Feb 23, 2024).
Original GitHub issue: https://github.com/lldap/lldap/issues/847
Describe the bug
On k8s, set up with the help of the Evantage-WS/lldap-kubernetes repo.
At first start, I got the error:
To Reproduce
Set or don't set LLDAP_LDAP_USER_DN or LLDAP_LDAP_USER_EMAIL.
The filesystem has been tested with both Longhorn and hostPath.
Expected behavior
Creation of first admin account.
Logs
Additional context
Add any other context about the problem here.
@nitnelave commented on GitHub (Feb 23, 2024):
It looks like your db already contains a user with no email address. If you don't have anything important in there, can you delete the DB? And make sure to grab the logs when you restart LLDAP, the first run logs might be able to tell us how we got there.
@zelogik commented on GitHub (Feb 23, 2024):
I have removed the volume, deleted the manifest and reapplied it (I even changed the name of my PVC):
same problem.
@nitnelave commented on GitHub (Feb 23, 2024):
Either your database still exists, or it's not the very first run of LLDAP (do you have something that auto-restarts it?).
You can see that because it's reading the current DB schema version and getting version 9 (instead of no version for an empty DB).
@nitnelave commented on GitHub (Feb 23, 2024):
Where is the file "/data/users.db" from, and can you delete it?
@zelogik commented on GitHub (Feb 23, 2024):
And I have deleted the PVC...
I understand what you mean, but I don't know where /data/users.db comes from, since I created a fresh PV...
Last edit: `k get persistentvolume -A` returns no lldap-data-pvc.
@zelogik commented on GitHub (Feb 23, 2024):
For testing I use lldap/lldap:latest, and I have checked the Dockerfile and entry-points.sh; if I'm right, they normally just check permissions.
@nitnelave commented on GitHub (Feb 23, 2024):
Sorry, I don't know enough about Kubernetes to help you... You can try changing the database path (change users.db to something else)
@zelogik commented on GitHub (Feb 23, 2024):
I have set LLDAP_DATABASE_URL (i.e. `database_url` in the config) to `sqlite:///data/users-dev.db?mode=rwc`.
And I have exactly the same problem...
But thanks for your help, and for your software, which "looks" really good and light (compared to FreeIPA/OpenLDAP...).
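For reference, an override like the one above can also be set as an environment variable in the Deployment's container spec. A minimal sketch, not taken from the actual manifest (the container name and volume name are assumptions):

```yaml
# Excerpt of a k8s Deployment container spec; only the relevant parts shown.
containers:
  - name: lldap                      # name assumed for illustration
    image: lldap/lldap:latest
    env:
      # mode=rwc tells SQLite to create the file if it does not exist yet.
      - name: LLDAP_DATABASE_URL
        value: "sqlite:///data/users-dev.db?mode=rwc"
    volumeMounts:
      - name: lldap-data             # assumed volume name
        mountPath: /data
```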
@martadinata666 commented on GitHub (Feb 23, 2024):
I think this is some incompatibility between the storage type and SQLite: something like NFS can't be used with an SQLite database. Taking https://github.com/Evantage-WS/lldap-kubernetes/blob/main/lldap-persistentvolumeclaim.yaml as a reference, it uses local-path instead of Longhorn, which is networked storage, I guess?
@zelogik commented on GitHub (Feb 23, 2024):
Yes, I was thinking about that, but Longhorn is not NFS, and I haven't seen a "bug" with SQLite on Longhorn storage.
I need to recheck with hostPath to verify whether it's longhorn/sqlite or lldap...
@zelogik commented on GitHub (Feb 23, 2024):
I have replaced the Longhorn storage with hostPath storage (more or less the same as Docker),
made sure that the directory has no config.toml or users.db (users-dev.db in my case),
applied the manifest,
and ran a simple `ls /tmp/data` on the k8s node where lldap is running.
users-dev.db is created... but same error. I'm pulling my hair out...
@zelogik commented on GitHub (Feb 23, 2024):
Got news!
Tested:
So it seems there is a "feature"/bug introduced between 0.4.3 and 0.5.0.
Edit: upgraded 0.4.3 to 2023-02-08-alpine and got:
@nitnelave commented on GitHub (Feb 26, 2024):
Sorry to push back again on this issue, but as long as we don't understand what's going on with your setup, we can't debug the issue in LLDAP. In particular, as I mentioned earlier, these logs cannot be the logs for a first start of LLDAP with an empty database. We should at the very least see DB migration messages, and user/group creations for the built-in users (admin, admin groups and so on).
@zelogik commented on GitHub (Feb 26, 2024):
Yes, I understand the problem, and I'm trying to get the same result as:
docker compose up
But you can note that the k8s deployment doesn't work up to lldap v4.5.
@nitnelave commented on GitHub (Feb 26, 2024):
Alright, until proof of the contrary, I'll assume the fault is in the k8s setup rather than in LLDAP itself, so downgrading from bug to integration+documentation.
@onedr0p commented on GitHub (Mar 15, 2024):
@zelogik maybe you need to set up a startup probe, because Kubernetes kills the pod before the DB migrations have completed? If that's the case, the pod would restart, which might lead to the issue you are seeing, since the migration never completely finishes.
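A startup probe along those lines could look like the sketch below. This is not from the thread: the port assumes LLDAP's default web port 17170, and the timings are arbitrary.

```yaml
# Allow up to failureThreshold * periodSeconds = 150s for the first start
# (DB migrations included) before liveness/readiness checks take over.
startupProbe:
  httpGet:
    path: /
    port: 17170          # LLDAP's default HTTP port (assumption)
  periodSeconds: 5
  failureThreshold: 30
```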
@zelogik commented on GitHub (Mar 18, 2024):
@onedr0p, thanks for the suggestion. I have already done that, and even tested with an initContainer, with the same problem.
One test I haven't done, since the lldap Docker image had a build error 2 weeks ago, was to remove the line
HEALTHCHECK CMD ["/app/lldap", "healthcheck", "--config-file", "/data/lldap_config.toml"] from the Dockerfile.
I don't know whether that could be the problem with k8s (i.e. a race condition with the CMD [run....]).
Regards
@nitnelave commented on GitHub (Mar 18, 2024):
The healthcheck shouldn't affect anything: it's essentially sending a ping on the HTTP/LDAP(S) interfaces to see if they're up. It wouldn't set up the DB, for instance. The interfaces only start listening after everything else is set up, including the DB.
@zelogik commented on GitHub (Mar 18, 2024):
...OK :/
I'm looking everywhere I can for a race condition.
And I don't know how the HEALTHCHECK cmd is processed in k8s.
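One detail worth noting: Kubernetes ignores the Dockerfile HEALTHCHECK instruction entirely; health checking in k8s only happens through probes declared in the pod spec. The HEALTHCHECK line quoted earlier, expressed as a liveness probe, would be roughly (timings are assumptions):

```yaml
livenessProbe:
  exec:
    # Same command as the Dockerfile HEALTHCHECK quoted above.
    command:
      - /app/lldap
      - healthcheck
      - --config-file
      - /data/lldap_config.toml
  periodSeconds: 30
  timeoutSeconds: 5
```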
@zelogik commented on GitHub (Mar 18, 2024):
My bad...
It seems to be working now.
I have changed two things: the latest version, 2024-03-07-alpine, vs the "old" 2024-02-08-alpine,
but I have also increased the resource limits, from 100m to 4000m CPU and from 50M to 500M memory...
I'll check whether the resource limit was the problem, and close the issue.
Sorry for that...
@zelogik commented on GitHub (Mar 18, 2024):
So the memory limit was the problem: with 50M the app "crashes" at init without saying anything useful. With 100M of RAM it seems to work well.
and when running:
@nitnelave
I think I can close the issue?
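For reference, the limits discussed in this thread would translate to something like the sketch below (values taken from the comments above: 50M was too little, 100M appeared to be enough, 500M is comfortable):

```yaml
resources:
  requests:
    cpu: 100m
    memory: 100Mi    # below ~100M, password hashing OOMs the pod at init
  limits:
    cpu: 4000m
    memory: 500Mi
```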
@nitnelave commented on GitHub (Mar 18, 2024):
Yes, that sounds like the culprit. We need more RAM (by design) when setting/checking a password ("hashing" the password is intentionally resource-intensive).
We can close this.
Maybe you want to add to the LLDAP K8s docs a note about the minimum resources required?
@zelogik commented on GitHub (Mar 18, 2024):
@nitnelave: it's not really docs, but rather a more recent, working and sane k8s manifest than the good base from Evantage-WS/lldap-kubernetes.
I don't know whether we want to create new k8s-specific documentation or update Evantage-WS/lldap-kubernetes.
A working, simple k8s manifest as an example (requiring ingress-nginx + longhorn):
@zelogik commented on GitHub (Mar 18, 2024):
The problem is not really the "design" but the logging: even verbose mode didn't say anything except the strange "[debug]: | return: Some(SchemaVersion(9))" on the first run.
Done (too early?)
It's not really k8s-specific after all: any production server running docker/k8s/a distro with allocation limits (cpu/ram/...) could have this problem, no?
And @nitnelave thanks for the work on lldap!
@martadinata666 commented on GitHub (Mar 18, 2024):
It's just hard to say. As the container is terminated directly, even when there are logs for the OOM, they won't show. Any program with an allocation limit will act the same; for example nodejs, known as a memory hog: once the program reaches the allocation limit, the container is terminated without any hint of an OOM, it's just dead.
Edit: on Docker this is usually indicated by exit code 137, which could be OOM or some other issue. Still unclear; essentially it's just "container died with a non-zero exit".
@nitnelave commented on GitHub (Mar 18, 2024):
You should probably get some logs about the OOM from k8s, no? Maybe it should be more visible.
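To check whether k8s recorded an OOM kill, one can inspect the pod's last terminated state (a sketch; the pod and namespace names are assumptions):

```shell
# An OOM-killed container shows Reason: OOMKilled and exit code 137.
kubectl describe pod lldap-0 -n lldap | grep -A 5 'Last State'

# Or read the reason directly from the pod status:
kubectl get pod lldap-0 -n lldap \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```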