[GH-ISSUE #1446] Make NormalizerFormatter::addJsonEncodeOption public #609

Closed
opened 2026-03-04 02:16:22 +03:00 by kerem · 4 comments
Owner

Originally created by @zerkms on GitHub (Apr 3, 2020).
Original GitHub issue: https://github.com/Seldaek/monolog/issues/1446

At this point NormalizerFormatter::addJsonEncodeOption function is protected, and is not used anywhere (no wondering).

In my code base I wanted to JSON_INVALID_UTF8_SUBSTITUTE so that Monolog could log broken UTF8 as well (at the moment it simply would return null).

To do that I need to $formatter->addJsonEncodeOption(JSON_INVALID_UTF8_SUBSTITUTE) from my code, which is impossible.

Originally created by @zerkms on GitHub (Apr 3, 2020). Original GitHub issue: https://github.com/Seldaek/monolog/issues/1446 At this point `NormalizerFormatter::addJsonEncodeOption` function is protected, and is not used anywhere (no wondering). In my code base I wanted to `JSON_INVALID_UTF8_SUBSTITUTE` so that Monolog could log broken UTF8 as well (at the moment it simply would return `null`). To do that I need to `$formatter->addJsonEncodeOption(JSON_INVALID_UTF8_SUBSTITUTE)` from my code, which is impossible.
kerem 2026-03-04 02:16:22 +03:00
  • closed this issue
  • added the
    Feature
    label
Author
Owner

@zerkms commented on GitHub (Apr 4, 2020):

I just realised that alternatively JSON_INVALID_UTF8_SUBSTITUTE can be set as default for the json encoder for php versions that support it.

It will cause php < 7.2 and >= 7.2 produce different results, but null for any data makes really very little sense to keep and call it as a BC.

In either case, please let's discuss and I can contribute either.

<!-- gh-comment-id:608963560 --> @zerkms commented on GitHub (Apr 4, 2020): I just realised that alternatively `JSON_INVALID_UTF8_SUBSTITUTE` can be set as default for the json encoder for php versions that support it. It will cause php < 7.2 and >= 7.2 produce different results, but `null` for any data makes really very little sense to keep and call it as a BC. In either case, please let's discuss and I can contribute either.
Author
Owner

@Seldaek commented on GitHub (May 11, 2020):

Oh this is great, I wasn't aware of this new constant, thanks for pointing it out. Added by default, and made the methods public too. Monolog 2 supports only PHP 7.2+ so this is a no brainer there IMO.

<!-- gh-comment-id:626922197 --> @Seldaek commented on GitHub (May 11, 2020): Oh this is great, I wasn't aware of this new constant, thanks for pointing it out. Added by default, and made the methods public too. Monolog 2 supports only PHP 7.2+ so this is a no brainer there IMO.
Author
Owner

@Seldaek commented on GitHub (May 11, 2020):

cc @bd808 essentially this will by default disable the conversion to ISO-8859-15 you added in #683 - is this likely to be a problem for you you think? github.com/Seldaek/monolog@f279285232 shows the change in output.

<!-- gh-comment-id:626938621 --> @Seldaek commented on GitHub (May 11, 2020): cc @bd808 essentially this will by default disable the conversion to ISO-8859-15 you added in #683 - is this likely to be a problem for you you think? https://github.com/Seldaek/monolog/commit/f2792852325a68f6a197403077bf6975be186acf shows the change in output.
Author
Owner

@bd808 commented on GitHub (May 11, 2020):

cc @bd808 essentially this will by default disable the conversion to ISO-8859-15 you added in #683 - is this likely to be a problem for you you think? f279285 shows the change in output.

@Seldaek thank you very much for thinking of this and pinging me!

I think that the JSON_INVALID_UTF8_SUBSTITUTE change at least preserves most of the intent of my prior patch. For the @wikimedia use case it may result in some loss of readable data in our log events as the diff you link shows. We have strings (mostly "titles" of articles, but also User-Agent headers) which are not UTF-8 and are instead encoded in a variety of different ways depending on the origin of the string. These strings show up in log events in ways that are not entirely predictable, often embedded in SQL statement trace logging or stack dumps. It should not bring back the fatal error bug that we saw in #545 which was fixed by #683. We still are using a layer of WhatFailureGroupHandler wrappers too as belt & suspenders protection.

We actually use a local subclass of LogstashFormatter which is itself a subclass of NormalizerFormatter. I will open a task on our side flagging this behavior change and let folks decide if they want to start calling removeJsonEncodeOption(JSON_INVALID_UTF8_SUBSTITUTE) as part of our setup or not when we finally get to Monolog 2.x.

<!-- gh-comment-id:626967375 --> @bd808 commented on GitHub (May 11, 2020): > cc @bd808 essentially this will by default disable the conversion to ISO-8859-15 you added in #683 - is this likely to be a problem for you you think? [f279285](https://github.com/Seldaek/monolog/commit/f2792852325a68f6a197403077bf6975be186acf) shows the change in output. @Seldaek thank you very much for thinking of this and pinging me! I think that the JSON_INVALID_UTF8_SUBSTITUTE change at least preserves most of the intent of my prior patch. For the @wikimedia use case it may result in some loss of readable data in our log events as the diff you link shows. We have strings (mostly "titles" of articles, but also User-Agent headers) which are not UTF-8 and are instead encoded in a variety of different ways depending on the origin of the string. These strings show up in log events in ways that are not entirely predictable, often embedded in SQL statement trace logging or stack dumps. It should not bring back the fatal error bug that we saw in #545 which was fixed by #683. We still are using a layer of WhatFailureGroupHandler wrappers too as belt & suspenders protection. We actually use a local subclass of LogstashFormatter which is itself a subclass of NormalizerFormatter. I will open a task on our side flagging this behavior change and let folks decide if they want to start calling `removeJsonEncodeOption(JSON_INVALID_UTF8_SUBSTITUTE)` as part of our setup or not when we finally get to Monolog 2.x.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/monolog#609
No description provided.