Mirror of https://github.com/axllent/mailpit.git (synced 2026-04-26 08:45:54 +03:00)
[GH-ISSUE #426] API concurrency? #276
Originally created by @baiomys on GitHub (Jan 21, 2025).
Original GitHub issue: https://github.com/axllent/mailpit/issues/426
Hi, I finally started some load testing. How many API operations can take place simultaneously?
It seems that I can't delete one message while downloading attachments from another.
Currently I am using FastAPI + Uvicorn + Cython.
Webhook processing from MP is blazing fast.
Tested at 30+ msgs per second, while Telegram allows sending no more than 20-60 per minute, depending on the target chat/group/channel.
Perfect!
But when it comes to API calls via aiohttp, everything starts to look a bit sluggish.
MP database is nearly empty. All logic at my side is completely async.
@axllent commented on GitHub (Jan 21, 2025):
Hi @baiomys.
I think what you are describing is (to me) expected behaviour, depending on what you mean by "can't". If you mean they don't happen at the same time, then yes.
Message deletions include three separate delete operations (the message summary, raw data & message tags), which are wrapped in a single SQL transaction. SQL transactions are "blocking" queries due to their rollback nature: if the delete fails, the database is rolled back (the delete is "undone"). This prevents a situation where we could end up with unreferenced data in the database, and also prevents a simultaneous request from accessing that message's data (eg: an attachment) while it is being deleted.
Message deletion should only lock the database for a matter of milliseconds though, so it would be great if you could provide some benchmarking to define "sluggish". Yes, I would definitely expect a slight slowdown of other operations during the delete, but not one that should greatly impact an application.
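The three-deletes-in-one-transaction behaviour described above can be sketched in a few lines of SQLite. This is a hedged illustration: the table and column names are hypothetical stand-ins, not Mailpit's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mailbox (id TEXT PRIMARY KEY, subject TEXT);
    CREATE TABLE mailbox_data (id TEXT PRIMARY KEY, raw BLOB);
    CREATE TABLE message_tags (id TEXT, tag TEXT);
    INSERT INTO mailbox VALUES ('m1', 'hi');
    INSERT INTO mailbox_data VALUES ('m1', x'00');
    INSERT INTO message_tags VALUES ('m1', 'spam');
""")

def delete_message(conn, msg_id):
    """All three deletes succeed together or not at all."""
    try:
        with conn:  # wraps the block in BEGIN ... COMMIT, rolls back on error
            conn.execute("DELETE FROM mailbox WHERE id = ?", (msg_id,))
            conn.execute("DELETE FROM mailbox_data WHERE id = ?", (msg_id,))
            conn.execute("DELETE FROM message_tags WHERE id = ?", (msg_id,))
    except sqlite3.Error:
        return False  # rolled back: no table is left half-deleted
    return True
```

While the transaction is open, other writers must wait, which is the brief "blocking" window discussed here.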
@baiomys commented on GitHub (Jan 21, 2025):
Thanks for detailed answer.
It seems that you are using WAL mode in the SQLite database, so concurrency should not be a problem, even with delete operations. On the other hand, WAL only works for local databases, and your application is able to use a remote database.
So do you use WAL, and if so, why are delete operations blocking?
Do you plan to implement some kind of "soft delete", using for example a hidden tag?
And a vacuum operation for major cleanup?
P.S. Just tested MP on a Ryzen 3900 + NVMe and it is way more responsive, even on a single core. Looks like MP is a disk-intensive application. Gonna test it on a RAMDISK.
=)
@axllent commented on GitHub (Jan 21, 2025):
There is already automated vacuuming implemented, which runs after 5 minutes of database inactivity and only when enough messages have been deleted (ie: "saving" >= 1% of the total database size). The reason for this is that a VACUUM operation is 100% blocking and can take minutes to complete when a database is several GB in size (the time taken is directly linked to the database size and, of course, CPU & disk speed).
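The "vacuum only when it saves >= 1%" idea can be approximated against SQLite's own page counters. A minimal sketch, assuming plain SQLite: the `page_count` and `freelist_count` PRAGMAs are standard SQLite, but the threshold logic here is an illustration, not Mailpit's actual implementation.

```python
import sqlite3

def maybe_vacuum(conn, min_saving=0.01):
    """Run VACUUM only when freed pages would reclaim at least
    min_saving (1% by default) of the database file."""
    total = conn.execute("PRAGMA page_count").fetchone()[0]
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    if total and free / total >= min_saving:
        conn.execute("VACUUM")  # fully blocking until it completes
        return True
    return False
```

The check itself is cheap; the expensive, blocking part is the VACUUM, which is why it is worth gating behind both an inactivity window and a minimum-saving threshold.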
Soft deletion (marking a message as deleted rather than physically deleting it) would likely solve some of the "problem" you describe, however it would require fairly significant changes throughout Mailpit (every SQL query) to accommodate that approach, and it would introduce some additional overhead too. My question still stands though:
There is very little point changing how Mailpit works without first understanding the problem it is supposed to solve, and whether it is actually a problem at all. You have already indicated that the Mailpit API currently handles 60x more requests than you can process via Telegram anyway (if I understand you correctly), so I'm very curious as to how much slowdown you get when doing simultaneous deletes, and whether this would make any meaningful difference to your integration.
@baiomys commented on GitHub (Jan 21, 2025):
My API calls delete on every "wrong" undeliverable message without sending Telegram notification.
So if we have 1000 spam messages none of them will be sent to Telegram, BUT delete will be called 1000 times.
And if it is blocking, it is a serious problem.
I think tagging can help me to mark and later remove such messages.
Your app seems to be "sluggish" on slow drives. Some investigation is needed.
@axllent commented on GitHub (Jan 21, 2025):
Each delete API call is blocking, but that does not mean the entire application "stops" until all 1000 API requests are finished; all subsequent read/write requests are queued. This means that if you called a delete and immediately made another API call to fetch data, that fetch would happen immediately after the delete, so milliseconds later.
There are two ways of improving your performance:
Grouping IDs for deletion (option 2) is definitely the better option.
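The batching approach can be sketched against Mailpit's `DELETE /api/v1/messages` endpoint, which accepts a JSON body listing the message IDs to delete. This uses the stdlib `urllib` rather than aiohttp so the sketch stays self-contained; the base URL and batch size are assumptions, and `delete_messages` is only run against a live Mailpit instance.

```python
import json
from urllib import request

MAILPIT = "http://localhost:8025"  # assumed local Mailpit instance


def chunked(ids, size=100):
    """Split a list of message IDs into batches of at most `size`."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def delete_messages(ids):
    """One API call (and one database transaction) per batch,
    instead of one per message."""
    req = request.Request(
        f"{MAILPIT}/api/v1/messages",
        data=json.dumps({"IDs": ids}).encode(),
        headers={"Content-Type": "application/json"},
        method="DELETE",
    )
    with request.urlopen(req) as resp:
        return resp.status


# Usage (requires a running Mailpit):
# for batch in chunked(spam_ids):
#     delete_messages(batch)
```

With 1000 spam messages and batches of 100, this turns 1000 blocking transactions into 10.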
I don't understand how you can accuse any application (which relies heavily on disk I/O for database read/writes) of being sluggish when you are using slow drives. That seems obvious to me.
@baiomys commented on GitHub (Jan 21, 2025):
Well, the drive is not really too "slow", it's SATA at 500 MB/sec.
But it is definitely not enough to be happy with.
Don't take it personally.
Your app is great!
@axllent commented on GitHub (Jan 21, 2025):
Yes, Mailpit uses WAL to allow async read/write access. As I explained earlier, deletions involve separate delete queries to multiple tables, and these tables must remain "in sync" to ensure message data consistency. If any of the deletions fail, then the data is rolled back, and so this happens in an SQL transaction (which is blocking to ensure consistency). It's not just deletions either; it also includes new messages being written to the database (for the exact same reason). Despite this, I get 100-200 SMTP transactions (new messages) per second while also being able to use the UI, so I do not think this is an issue.
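The WAL behaviour described here (readers keep working from the last committed snapshot while a write transaction is open) can be demonstrated with plain SQLite. This is a generic sketch, not Mailpit's schema; `BEGIN IMMEDIATE` takes the write lock up front.

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "demo.db")  # WAL needs a file, not :memory:

# autocommit connection so we can issue BEGIN/COMMIT ourselves
writer = sqlite3.connect(db, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE msg (id INTEGER PRIMARY KEY, subject TEXT)")
writer.execute("INSERT INTO msg VALUES (1, 'hello')")

reader = sqlite3.connect(db)  # a second, concurrent connection

writer.execute("BEGIN IMMEDIATE")             # hold the write lock
writer.execute("DELETE FROM msg WHERE id = 1")  # uncommitted delete
# The reader is not blocked: it still sees the last committed snapshot.
seen = reader.execute("SELECT COUNT(*) FROM msg").fetchone()[0]
writer.execute("COMMIT")                      # delete now visible to new reads
```

So reads proceed concurrently under WAL; it is only a second *writer* that must wait for the open transaction to finish.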
I believe that's the maximum data transfer rate (eg: a large file being streamed). A SATA disk is still a spinning disk with a needle, which is far slower than an SSD, especially when it comes to seek actions (finding the data position on the drive to read from or write to). I won't however pretend to be a hardware expert, I'm not, but for any production-ready application these days that relies on disk I/O, you should be using (at the minimum) an SSD of sorts if you need performance.
It's not so much taking it personally as getting frustrated when someone says something like that. It's like saying "the download speed of your server is terrible when I'm on dialup internet" ;-)
@baiomys commented on GitHub (Jan 21, 2025):
SATA is not a spinning disk, it is just interface like SCSI or SAS.
My SATA disk is SSD.
=)
@axllent commented on GitHub (Jan 21, 2025):
Sorry, then I misinterpreted what you meant by SATA and "slow drives". In this case it should not matter too much. NVMe will always win against a SATA SSD, though it shouldn't be necessary. I use an SSD over SATA and it performs pretty well with my 3.9GB database, but the bottleneck I have is the CPU. It doesn't matter though; this is a personal Mailpit instance, not a high-usage production instance.
@baiomys commented on GitHub (Jan 22, 2025):
Thanks for nice conversation, I appreciate your support.