[GH-ISSUE #401] Mailpit in crashloopbackoff error in Kubernetes #260

Closed
opened 2026-03-15 13:29:45 +03:00 by kerem · 18 comments
Owner

Originally created by @wali97 on GitHub (Dec 9, 2024).
Original GitHub issue: https://github.com/axllent/mailpit/issues/401

Hi,
I've deployed Mailpit as a Deployment in my GKE cluster with the following configuration; no authentication or anything else is enabled:

 env:
        - name: MP_DATABASE
          value: /data/mailpit.db
        - name: MP_MAX_MESSAGES
          value: '5000'
        - name: MP_SMTP_AUTH_ACCEPT_ANY
          value: '1'
        - name: MP_SMTP_AUTH_ALLOW_INSECURE
          value: '1'

It runs for some time, but then the Mailpit pod goes into CrashLoopBackOff with this error:
level=fatal msg="database is locked (5) (SQLITE_BUSY)"

I tried three different Mailpit Docker image versions (v1.20.7, v1.21.3, v1.21.5), but all result in the same error.
Can you please help me understand why I keep getting this error?

kerem closed this issue 2026-03-15 13:29:51 +03:00

@axllent commented on GitHub (Dec 9, 2024):

Hi @wali97. Internally Mailpit enforces a [single connection](https://github.com/axllent/mailpit/blob/develop/internal/storage/database.go#L98-L100) to the database to prevent this very issue, so this should not happen unless:

  1. You have another process sharing / accessing the database (eg: another instance of Mailpit or script reading/writing to the database)
  2. The database is stored on a network filesystem such as SMB or NFS (SQLite is designed for local storage)

Can you confirm you don't have 1 or 2 above, and share any other details about your setup that may be relevant? Are you receiving a very high load of emails (not that it should matter, I've tested with a constant stream of 130k emails without issues)?
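
For readers who want to see case 1 in isolation, here is a minimal sketch using Python's standard `sqlite3` module (not Mailpit's own Go code). The function name `second_writer_error` and the `mail` table are hypothetical. It opens two connections to the same local file, holds a write lock on the first, and captures the error the second one gets:

```python
import os
import sqlite3
import tempfile


def second_writer_error(db_path: str) -> str:
    """Return the error a second connection gets while the first holds the write lock."""
    # isolation_level=None puts the connection in autocommit mode so that
    # transactions are controlled explicitly with BEGIN below.
    first = sqlite3.connect(db_path, isolation_level=None)
    # timeout=0 makes the second connection fail immediately instead of waiting.
    second = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        first.execute("CREATE TABLE IF NOT EXISTS mail (id INTEGER PRIMARY KEY, body TEXT)")
        first.execute("BEGIN IMMEDIATE")  # take the write lock and keep it
        first.execute("INSERT INTO mail (body) VALUES ('hello')")
        try:
            second.execute("INSERT INTO mail (body) VALUES ('world')")
        except sqlite3.OperationalError as exc:
            return str(exc)
        return ""
    finally:
        first.close()  # closing discards the uncommitted transaction
        second.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(second_writer_error(os.path.join(tmp, "demo.db")))
```

The message returned contains `database is locked`, the same SQLITE_BUSY condition reported in the pod logs. This only illustrates the two-writer case on a local disk; it says nothing about how a network filesystem handles the underlying lock requests.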


@wali97 commented on GitHub (Dec 9, 2024):

Hi @axllent.
No other process is sharing or accessing the database. I have other instances of Mailpit running, but they're in a different namespace.
However, I'm using an NFS server as persistent storage for the data at path /data.


@axllent commented on GitHub (Dec 9, 2024):

I'm not sure what to say. SQLite over NFS is known for file locking issues. It may work for a while, but it goes wrong eventually, as you've discovered, not to mention the poor performance.

You could investigate NFS mount options (not sure if there are actual working configurations), but it's best to avoid NFS if possible. Not sure what other options you have though... I would suggest using an [rqlite database](https://mailpit.axllent.org/docs/configuration/email-storage/#remote-storage-rqlite), however it also uses SQLite, so it sounds like you may just be shifting the problem.


@axllent commented on GitHub (Dec 9, 2024):

I see GKE supports [persistent volumes](https://cloud.google.com/kubernetes-engine/docs/concepts/persistent-volumes) which may solve your issue (by not using NFS).


@axllent commented on GitHub (Dec 10, 2024):

I have [updated the documentation](https://mailpit.axllent.org/docs/configuration/email-storage/) to emphasize (more) the fact that the database should be local. Sorry, I wish I could help but there isn't anything else I can do here, so I am closing this ticket.


@boomam commented on GitHub (Apr 1, 2025):

Seeing this issue as well.
Is there a workaround for this, to have mailpit ignore the storage type in use?

Tested with "MP_DISABLE_WAL" set to true, but the issue persists.


@axllent commented on GitHub (Apr 1, 2025):

@boomam Can you please confirm your Mailpit version is >= v1.23.0? SQLite is not designed for NFS (it will always be prone to network delays etc), however it should work if WAL support is disabled.


@boomam commented on GitHub (Apr 1, 2025):

Hi,
Running latest release as of today.
I'm aware of the documented limitations of SQLite and network share protocols, but there are use cases.

The WAL option didn't make a difference, error persisted.

Thanks.


@axllent commented on GitHub (Apr 2, 2025):

Unfortunately I'm not sure there is anything I can do about this. The NFS issue is documented and is a known limitation of any SQLite over NFS (this is definitely not limited to Mailpit). A potential work-around has been suggested (MP_DISABLE_WAL=true) or using persistent volumes (I never got a response so I don't know 100% if that works), and lastly an alternative approach by using rqlite (although that may fall into the same trap if using NFS).

Unless someone knowledgeable of the issue and a working solution can assist me, I don't know what else I can do, sorry.


@boomam commented on GitHub (Apr 2, 2025):

I think there may be a misunderstanding as to what I'm asking / reporting.

To simplify - the function to disable WAL does not work.

This is not a NFS/SMB issue, nor a platform issue, nor a SQLite issue.

But an issue with Mailpit reporting locks that don't exist, due to what is being described as a WAL issue.


@axllent commented on GitHub (Apr 2, 2025):

> To simplify - the function to disable WAL does not work.

Are you sure it is not disabling WAL though? I can confirm that `MP_DISABLE_WAL=true mailpit` does completely disable WAL here, which specifically [sets the SQLite journal](https://github.com/axllent/mailpit/blob/develop/internal/storage/database.go#L122-L133) to `DELETE` instead of `WAL`. It should be easy to detect simply by inspecting the path where the database is being written to: there should not be either a `*.db-shm` or `*.db-wal` file alongside the Mailpit db file (if there is then you may be working with a corrupted database from a previous attempt). Can you please confirm that your database path does not contain either of these files while Mailpit is running?

Can you please also tell me what NFS protocol your mount is using (the vers= part in nfsstat -m)?
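
The file check described above could be scripted roughly as follows. This is a sketch using Python's standard library, not part of Mailpit; `wal_sidecars` and `journal_mode` are hypothetical helper names. Note that `journal_mode` opens the database itself, so it is safer to run it against a copy or while Mailpit is stopped, since a second process touching the file is one of the lock causes discussed earlier in the thread.

```python
import os
import sqlite3


def wal_sidecars(db_path: str) -> list:
    """List the WAL side files (<db>-wal and <db>-shm) that exist next to the database."""
    return [db_path + suffix for suffix in ("-wal", "-shm") if os.path.exists(db_path + suffix)]


def journal_mode(db_path: str) -> str:
    """Ask SQLite which journal mode the database file currently uses."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA journal_mode").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    path = "/data/messages.db"  # the path used in this thread; adjust for your deployment
    if os.path.exists(path):
        print("sidecar files:", wal_sidecars(path))
        print("journal mode:", journal_mode(path))
```

With WAL disabled, the expectation from the comment above is an empty sidecar list and a journal mode of `delete`.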


@wali97 commented on GitHub (Apr 2, 2025):

Hi @axllent @boomam
I got Mailpit v1.21.3 running without any persistent storage (NFS or PVC). The only thing I changed in my setup was deploying Mailpit on non-preemptible k8s nodes to avoid frequent restarts; that's all. I haven't faced any issue since then.


@boomam commented on GitHub (Apr 2, 2025):

> > To simplify - the function to disable WAL does not work.
>
> Are you sure it is not disabling WAL though? I can confirm that MP_DISABLE_WAL=true mailpit does completely disable WAL here which specifically sets the SQLite journal to DELETE instead of WAL. It should be easy to detect simply by inspecting the path where the database is being written to: there should not be either a *.db-shm or *.db-wal file alongside the Mailpit db file (if there is then you may be working with a corrupted database from a previous attempt). Can you please confirm that your database path does not contain either of these files while Mailpit is running?
>
> Can you please also tell me what NFS protocol your mount is using (the vers= part in nfsstat -m)?

I'm basing the claim that WAL is not being disabled on your comments, and on the documentation's notes that disabling it turns off the checks for file locking.
 
Here is what i have just done to test -
 

Part 1

  1. shut down Mailpit (scaled to 0).
  2. went to the PV that mounts at /data and removed any file or folder that wasn't one of my certificate files.
  3. restarted Mailpit without specifying a database (MP_DATABASE not set).

This was done to ensure that mailpit was working, and that no files were created - confirmed.
 

Part 2

  1. updated deployment to add in MP_DATABASE=/data/messages.db & MP_DISABLE_WAL=true
  2. deployed and checked PV for content, i can see a newly created messages.db
  3. checked log and app functionality - fails as before with error -
time="2025/04/02 12:32:46" level=fatal msg="database is locked (5) (SQLITE_BUSY)"

 
RE: NFS version
I am not using NFS, i am using SMB.
 
This suggests that WAL may be getting disabled, if there are no shm or wal files in the PV, but it does not explain why Mailpit still complains about a database lock.
How does it determine that a database is locked?
 

> Hi @axllent @boomam I got the mailpit v1.21.3 running without any storage type(NFS or PVC). The only which i did in my setup was i deployed mailpit on nonpreemptive k8s nodes so as to avoid frequent restarts,that's all ....didnt face any issue since then.

Hi,
Getting it running without storage is the easy bit; it works fine when no database is in play (which I understand to mean it just puts the data in RAM, so it's ephemeral), so it's not really a solution when I want to keep short-term storage of mails passing through Mailpit. For short-term testing it's fine, though.

Thanks though!


@axllent commented on GitHub (Apr 3, 2025):

Thank you for that detailed info @boomam. Unfortunately there is general confusion as to what NFS actually means - more specifically whether it is in actual fact NFS itself, or whether it's implying any Network Filesystem (eg: Samba). I'm not 100% sure what the right answer is here when it comes to SQLite (yet). I have tested here in my own network with both NFS and Samba, and I don't get issues on either (with or without the MP_DISABLE_WAL=true), so this is tough for me to debug. Let's start with what I know:

  1. Mailpit sets the maximum connections to a database to 1 - so at any point Mailpit should only have one "open connection" to the database. SQLite in general is designed this way (single-application use). This in itself should prevent unnecessary file locking errors when run locally. This, together with the MP_DISABLE_WAL=true should prevent file locking from happening at all, that is unless something else is locking the file (see next point). SQLite does however lock internally for some actions, such as when dealing with rollback journals.
  2. Samba (CIFS) supports native file locking as far as I understand (as part of the client/server communication), which is where I suspect your issue may originate from. It is also heavily dependent on both the server config as well as the client. I see that I use `nobrl` in my client config (which according to the [CIFS documentation](https://www.samba.org/~ab/output/htmldocs/manpages-3/mount.cifs.8.html) is "do not send byte range lock requests to the server"), which may be a clue and an option you could try. Sorry, I don't recall why I use this option as it's been years since I last had to tinker with my configs.

The last point I wanted to clarify is that Mailpit, when run without specifying a database, actually still saves everything to a local randomly-named database in the (typically) /tmp directory, which gets deleted again on termination. I'm just stating this to clarify that Mailpit does not store messages in memory, not that it is related to your issue ;-)

If you wouldn't mind adding the nobrl option to your mount flags to test, then we can hopefully either rule that out or identify the Samba lock as the cause.
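
Before changing any mount flags, it may help to confirm what the database path is actually mounted on. The sketch below is an illustration, not a Mailpit feature: it parses text in the format of Linux's `/proc/mounts` and reports the filesystem type and whether `nobrl` appears among the options. The helper names (`find_mount`, `storage_report`) and the set of network filesystem type names are assumptions, and it assumes the kernel lists `nobrl` in the mount options when it is active.

```python
from typing import Optional, Tuple

# Filesystem type names treated as "network" here (an assumption, not an exhaustive list).
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smb3"}


def find_mount(mounts_text: str, path: str) -> Optional[Tuple[str, str, list]]:
    """Return (mount_point, fs_type, options) for the longest mount point containing path."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type, options)
    return best


def storage_report(mounts_text: str, path: str) -> dict:
    """Summarise the mount backing path: type, whether it is a network FS, nobrl flag."""
    mount = find_mount(mounts_text, path)
    if mount is None:
        return {"mount_point": None, "fs_type": None, "network": False, "nobrl": False}
    mount_point, fs_type, options = mount
    return {
        "mount_point": mount_point,
        "fs_type": fs_type,
        "network": fs_type in NETWORK_FS_TYPES,
        "nobrl": "nobrl" in options,
    }


if __name__ == "__main__":
    with open("/proc/mounts") as handle:  # Linux only
        print(storage_report(handle.read(), "/data/messages.db"))
```

Inside a Kubernetes pod this would be run from within the container (for example via `kubectl exec`), since the mount table is per container. Mount points containing spaces are escaped in `/proc/mounts` and are not handled by this sketch.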


@boomam commented on GitHub (Apr 3, 2025):

RE: NFS confusion
It's its own protocol; it should not be confused with SMB, linguistically or technically.
Are you saying that people are confusing the two with Mailpit?

Re: Max connections
Is there a way, for testing, to set the connection max higher in Mailpit?

Re: mount flag "nobrl"
This isn't something i can test - Changing the entire storage layer on a running system just to test one app is not prudent.
When I look at other apps on the same infrastructure that use a Persistent Volume presented over SMB and use SQLite for a database of some sort, they show zero issues with file locking.


@axllent commented on GitHub (Apr 4, 2025):

NFS: I think the confusion arises when official documentation refers to issues while running SQLite on a "network filesystem", ie: is that NFS or SMB - or any network filesystem. It does not specifically name either, unless of course "network filesystem" implies NFS (which is technically "Network File System").

Max connections: You could modify the source code, recompile and run that, however I can tell you now that it is hardcoded to a single thread for a very good reason (and not relating to network filesystems) as this would cause race issues when handling heavy SMTP traffic. I hit this issue early in development.

nobrl: As I mentioned before, I can't test your environment, however I was able to replicate the same issue here by removing the nobrl flag in my SMB mount options:

mailpit -d /mnt/smb/mailpit.db
FATA[2025/04/04 16:45:13] database is locked (5) (SQLITE_BUSY)

After reverting the fstab change and remounting the volume, Mailpit worked exactly as expected (with and without `MP_DISABLE_WAL=true`). So I'll double down on my earlier comment that you will need the `nobrl` mount flag in order to disable the Samba byte locking. This is not a Mailpit limitation; this is a conflict between SMB and SQLite (as well as other databases such as PostgreSQL when storing data over SMB). If I recall correctly, I had this file locking issue when just writing LibreOffice documents over SMB. Searching Google reveals this is a well-documented issue with SMB and various apps, especially SQLite. If you haven't experienced this with your other SQLite apps, then it's likely just a matter of time before they hit the same issue while doing a moderate level of reading and writing.
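
To test a mount's locking behaviour without involving Mailpit, a rough probe like the one below could be tried. It is a sketch under several assumptions: a POSIX system (it uses `fcntl`), and the byte offsets SQLite documents for its pending, reserved, and shared locks. `probe_byte_range_locks` is a hypothetical helper. It only shows whether non-blocking byte-range lock requests on a scratch file are granted; a pass does not prove SQLite will run reliably on that mount.

```python
import fcntl
import os
import tempfile

# Lock byte offsets from SQLite's documented file locking scheme.
PENDING_BYTE = 0x40000000
RESERVED_BYTE = PENDING_BYTE + 1
SHARED_FIRST = PENDING_BYTE + 2
SHARED_SIZE = 510


def probe_byte_range_locks(directory: str) -> bool:
    """Try to take and release SQLite-style byte-range locks on a scratch file."""
    fd, path = tempfile.mkstemp(dir=directory, suffix=".lockprobe")
    try:
        ranges = ((RESERVED_BYTE, 1), (PENDING_BYTE, 1), (SHARED_FIRST, SHARED_SIZE))
        for start, length in ranges:
            try:
                # Non-blocking exclusive lock on the byte range, then release it.
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)
                fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
            except OSError:
                return False  # the lock request was refused or failed
        return True
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # /mnt/smb is the mount point from the reproduction above; adjust as needed.
    print(probe_byte_range_locks("/mnt/smb"))
```

Comparing the result with and without `nobrl` on the same share would be one way to check whether the byte-range lock requests themselves are what fails.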


@boomam commented on GitHub (Apr 4, 2025):

I wonder how other SQLite apps are getting around this without the nobrl...
Anyway, could it be worth adding that particular note/workaround to the docs?

I'm gonna experiment with some CSI driver options, will note back once tested.


@axllent commented on GitHub (Apr 4, 2025):

I suspect other apps read & write far less to the db, or potentially don't use SQL transactions - I don't know for sure.

I'll definitely add it to the documentation once you can confirm the option resolves the issue, thanks.
