mirror of
https://github.com/Pro/dkim-exchange.git
synced 2026-04-25 08:55:52 +03:00
[GH-ISSUE #68] CompactWhitespaces poor performance causes problems with big attachments #54
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @Jay1980 on GitHub (Oct 24, 2014).
Original GitHub issue: https://github.com/Pro/dkim-exchange/issues/68
Originally assigned to: @Pro on GitHub.
Hi,
Bit of an odd one here. DKim signer has been working well for us but today noticed that our email had ground to a halt. Discovered that for certain emails they seem to get stuck in the submissions queue and the EdgeTransport starts using excessive CPU and memory. Disabling the DKim Signer agent and restarting the MSTransportAgent allows the messages to leave the server and everything returns to normal.
I'm also periodically seeing the following in the event viewer: (source: MSExchange Extensibility)
"The execution time of agent 'Exchange DkimSigner' exceeded 90000 milliseconds while handling event 'OnCategorizedMessage' for message with InternetMessageId: 'Not Available'. This is an unusual amount of time for an agent to process a single event. However, Transport will continue processing this message."
So far it seems to occur with emails with attachments / over a certain size. Smaller emails seem to go through ok.
We're a design agency so some of our emails are proofs of 6-10mb in size.
Any ideas?
James.
@Pro commented on GitHub (Oct 24, 2014):
90 seconds to handle a message is quite long, this shouldn't happen.
I think this is related to calculating the body hash of the message: if the message is too big, the body hash function takes too long and Exchange stops submitting the message.
Currently I can't investigate the problem but will look into it in a week or so.
You can try to send multiple emails with different attachment sizes (1MB, 2MB, .... up to 20MB) and tell me which one don't go through, this would speedup my work. Thanks
@Jay1980 commented on GitHub (Oct 24, 2014):
Thanks for the quick response.
So far I can see there are emails stuck in the queue with the following sizes:
20715KB
6950KB
I've managed to send emails of the following sizes (larger take longer as I'd expect from your suggestion about calculating the body hash):
937KB
2106KB
3111KB
5390KB
@Jay1980 commented on GitHub (Oct 24, 2014):
Interesting development....
after looking at the code and seeing the section in GetBodyHash about relaxed canonicalisation I decided to set my configuration to simple as a test.... the emails go through fine!
Looks like the bottleneck could be in the "CompactWhitespaces" function?
@Pro commented on GitHub (Oct 24, 2014):
Yes, that is most probably the problem, this code part has some quite costly function calls and loops.
This should be replaced with a RegEx since those have a much higher performance.
Thanks for your info.
@AlexLaroche commented on GitHub (Nov 13, 2014):
The problem could be with this function too. It's really delicate to change one of this two functions. :/
May be we could find some inspiration here : https://github.com/dmcgiv/DKIM.Net
@AlexLaroche commented on GitHub (Nov 13, 2014):
@Pro : I have some problems with my developpement environnement. Can you identify with one of this two functions is in problem?
@Pro commented on GitHub (Nov 13, 2014):
The DKIM.Net project uses the following function for compact whitespaces:
github.com/dmcgiv/DKIM.Net@120ada02a7/src/DKIM.Net/WhiteSpace.cs (L33)I've also found an interesting discussion: http://stackoverflow.com/questions/6442421/c-sharp-fastest-way-to-remove-extra-white-spaces
I still think that using a compiled RegEx would be the fastest way for CompactWhitespaces. Maybe we can implement both suggestes ways (from the Stackoverflow page) and then do some performance measurements.
Regarding the Header parsing, maybe we can find a good performing MIME parser which we can integrate to get all the headers of the mail:
http://stackoverflow.com/questions/1669797/are-there-net-framework-methods-to-parse-an-email-mime
http://www.lessanvaezi.com/email-headers-from-outlook-mailitem/
http://jeffreystedfast.blogspot.de/2013/09/time-for-rant-on-mime-parsers.html
@AlexLaroche We should implement the changes in another branch and test them thourugly before adding them back to the master branch. I'll do some tests to find out which function causes the long processing time.
@AlexLaroche commented on GitHub (Nov 13, 2014):
@Pro : An another way to do the implementation is to make a wrapper to the c library libopendkim (http://www.opendkim.org/libopendkim/) that we call for signing. This library is part of OpenDKIM project (http://www.opendkim.org/) and OpenDKIM project seem to be a reference for DKIM implementation. The current library isn't made to be build on Windows officially but we could try to made a easy adaptation.
@Pro commented on GitHub (Nov 13, 2014):
Yes, this would also be a good option. But this still needs a parser for header fields. Do you want to investigate if it's possible to use libopendkim in our code?
@AlexLaroche commented on GitHub (Nov 13, 2014):
I think if we could use libopendkim, the code no more need to make header parsing... We just past the parameters (SigningAlgorithm, HeaderCanonicalization, BodyCanonicalization, HeadersToSign, Private Key) to the wrapper. The wrapper will past all the stuff to libopendkim and exposed some methods of the library. We just have to let the library do all the signing job after.
@AlexLaroche commented on GitHub (Nov 13, 2014):
I will investigate the use of libopendkim in our code.
@Pro commented on GitHub (Nov 13, 2014):
Yep, but looking at http://www.opendkim.org/libopendkim/overview.html, you need to add each header separately with http://www.opendkim.org/libopendkim/dkim_header.html
Maybe this can be used: http://www.opendkim.org/libopendkim/dkim_mail_parse.html
@AlexLaroche commented on GitHub (Nov 13, 2014):
@AlexLaroche commented on GitHub (Nov 13, 2014):
Just remove the header relaxed canonicalization case should probably fix the function GetCanonicalizedHeaders(). I don't think the problem is with parsing header. We don't have the problem in simple canonicalization mode.
@Pro commented on GitHub (Nov 13, 2014):
Yes, this should be fine to use for libopendkim since it does the canonicalization by itself.
@AlexLaroche commented on GitHub (Nov 14, 2014):
Do you have identify with one of this two functions is in problem?
@AlexLaroche commented on GitHub (Nov 24, 2014):
Do you have identify with one of this two functions is in problem? It will be to much work to try to use libopendkim. Libopendkim need to be port to Windows and a c++/cli wrapper should be develop.
@Pro commented on GitHub (Nov 27, 2014):
I investigated the performance of the two functions. I tested with a 1MB file attachment.
It took in total 61584ms (1 Minute). Here's the detailed performance data:
So the bottleneck is definetly the CompactWhitespaces method. I'll check if this function can be rewritten for better performance.
@Pro commented on GitHub (Nov 27, 2014):
Well, the problem was that we added each line directly to the string variable, e.g.
bodyText += "This is a string"The performance of this operation is very poor since the string is recreated in the memory.
Replacing it with a StringBuilder (see commit) increased the performance significantly:
1MB requires now 500ms, 6MB 2900ms
@AlexLaroche commented on GitHub (Nov 27, 2014):
Very small change but a big performance improvement. Good job!