mirror of
https://github.com/spamscanner/spamscanner.git
synced 2026-04-27 12:45:50 +03:00
[GH-ISSUE #4] How to train Naive Bayes Classifier ? #5
Originally created by @JQuags on GitHub (Aug 22, 2020).
Original GitHub issue: https://github.com/spamscanner/spamscanner/issues/4
Is there more information on how to train the classifier?
I see in the source that classifier.json is currently private, which explains the broken links on the site.
The source indicates that removing classifier.json and setting SPAM_CATEGORY and SCAN_DIRECTOR should be all that is needed to train. Is that all, and then I feed it a directory of spam or ham in EML or ARF format?
@wis commented on GitHub (Sep 13, 2020):
I thought you provided a well-trained classifier.json. The link in the README returns a 404; why was it removed? @niftylettuce
@JQuags commented on GitHub (Sep 14, 2020):
I suspect it has never been provided, and there may be a privacy reason.
@niftylettuce commented on GitHub (Sep 14, 2020):
I should have this published in the near future; for now I have had to put my focus on something else. This is not a privacy concern anymore, though, as I have sha256 hashed all the tokens.
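As a rough illustration of what hashing the tokens means, here is a minimal Python sketch. It is not taken from the spamscanner source: the tokenizer and the function names (`tokenize`, `hash_token`, `hashed_tokens`) are assumptions made up for this example, and only the use of SHA-256 comes from the comment above.

```python
import hashlib
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens.

    Hypothetical tokenizer for illustration; the real project may tokenize differently.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def hash_token(token: str) -> str:
    """Return the SHA-256 hex digest of a token so the raw word is not stored."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def hashed_tokens(text: str) -> list[str]:
    """Tokenize a message and replace every token with its digest."""
    return [hash_token(token) for token in tokenize(text)]


tokens = hashed_tokens("Win a FREE prize, free entry")
```

Equal tokens always map to equal digests, so per-token counts (which are all a Naive Bayes classifier needs) survive the hashing step while the original words are no longer stored in the model file.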
@wis commented on GitHub (Sep 16, 2020):
Good! Can we contribute to the training data by forwarding spam emails from our inboxes to an email address you set up?
@niftylettuce commented on GitHub (Sep 16, 2020):
abuse@forwardemail.net works
@titanism commented on GitHub (Dec 22, 2025):
see https://github.com/spamscanner/spamscanner?tab=readme-ov-file#custom-classifier
you'd just write the JSON file you train to classifier.json and then load it basically
you can also make it do sha256 hashing (customizable)
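The train, write, and load cycle described here could look roughly like the sketch below. This is a generic Naive Bayes illustration in Python, not the spamscanner API: the JSON layout, the function names (`train`, `save`, `load`, `classify`), and the sample data are all assumptions, and the real classifier.json format is defined by the project README linked above.

```python
import json
import math
import os
import tempfile
from collections import Counter


def train(samples):
    """Count documents and token frequencies per category from (tokens, category) pairs."""
    model = {"docs": Counter(), "tokens": {}}
    for tokens, category in samples:
        model["docs"][category] += 1
        model["tokens"].setdefault(category, Counter()).update(tokens)
    return model


def save(model, path):
    """Write the trained counts to a JSON file (hypothetical layout)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(model, fh)


def load(path):
    """Read a previously saved model back as plain dictionaries."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def classify(model, tokens):
    """Return the category with the highest log-probability, using add-one smoothing."""
    vocab = {t for counts in model["tokens"].values() for t in counts}
    total_docs = sum(model["docs"].values())
    scores = {}
    for category, counts in model["tokens"].items():
        total = sum(counts.values())
        score = math.log(model["docs"][category] / total_docs)
        for token in tokens:
            score += math.log((counts.get(token, 0) + 1) / (total + len(vocab)))
        scores[category] = score
    return max(scores, key=scores.get)


# Made-up training data; a real run would feed tokens extracted from emails.
samples = [
    ("win free prize now".split(), "spam"),
    ("free money win".split(), "spam"),
    ("meeting notes attached".split(), "ham"),
    ("lunch meeting tomorrow".split(), "ham"),
]

# Written to a temporary directory so the sketch does not overwrite a real file.
path = os.path.join(tempfile.mkdtemp(), "classifier.json")
save(train(samples), path)
model = load(path)
print(classify(model, "free prize".split()))  # prints "spam"
```

If the tokens are hashed before training, the same hashing has to be applied to incoming messages before classification, otherwise none of the tokens will match the stored counts.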
v6 is released. We will update classifier.json (there's one published now with sha256) after the @fwdemail integration (we're on the older v5). The current classifier.json is not that accurate, but we will improve it after the integration (since we process millions of emails daily, it'll be very accurate soon enough).
https://github.com/spamscanner/spamscanner
https://github.com/spamscanner/spamscanner/releases
X post/announcement @ https://x.com/fwdemail/status/2002872581402063281
we also support TypeScript now in the project (thx to AI, we despise TS internally tho)