Auto-Learn spam/ham installer

This bash script should set up Dovecot and SpamAssassin to auto-learn ham/spam when a message is moved into or out of the spam folder, except for moves to Trash. HestiaCP can use this to help add the feature if wanted. I know a lot of users want auto-learn, so I figured I’d post it. If anyone can take a few minutes to check it out and let me know if I missed anything, that would be great, but it seems to be working.

#!/bin/bash

cat <<EOF | sudo tee /etc/dovecot/conf.d/05-spamham.conf
# Dovecot IMAPSieve settings for spam/ham auto-learning (config file, not a shell script)

mail_attribute_dict = file:%Lh/mail/%d/dovecot-attributes

plugin {
    sieve_plugins              = sieve_imapsieve sieve_extprograms
    imapsieve_url              = sieve://127.0.0.1:4190

    #                          From elsewhere to Junk folder
    imapsieve_mailbox1_name    = Junk
    imapsieve_mailbox1_causes  = COPY APPEND
    imapsieve_mailbox1_before  = file:/var/mail/sieve/report_spam.sieve

    #                          From Junk folder to elsewhere
    imapsieve_mailbox2_name    = *
    imapsieve_mailbox2_from    = Junk
    imapsieve_mailbox2_causes  = COPY
    imapsieve_mailbox2_before  = file:/var/mail/sieve/report_ham.sieve

    sieve_pipe_bin_dir         = /etc/dovecot/sieve/pipe

    sieve_global_extensions    = +vnd.dovecot.pipe +vnd.dovecot.environment

}
EOF

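mkdir -p /etc/dovecot/sieve/pipe
# The report_*.sieve files below are written here, so the directory must exist first.
mkdir -p /var/mail/sieve
mkdir -p /var/mail/imapsieve_copy
chown dovecot:mail /var/mail/imapsieve_copy
chmod 0700 /var/mail/imapsieve_copy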

cat <<EOF | sudo tee /var/mail/sieve/report_spam.sieve
require ["vnd.dovecot.pipe", "copy", "imapsieve", "environment", "variables"];

if environment :matches "imap.user" "*" {
    set "username" "${1}";
}

pipe :copy "imapsieve_copy" [ "${username}", "spam" ];
EOF

cat <<EOF | sudo tee /var/mail/sieve/report_ham.sieve
require ["vnd.dovecot.pipe", "copy", "imapsieve", "environment", "variables"];

if environment :matches "imap.mailbox" "*" {
    set "mailbox" "${1}";
}

if string "${mailbox}" "Trash" {
    stop;
}

if environment :matches "imap.user" "*" {
    set "username" "${1}";
}

pipe :copy "imapsieve_copy" [ "${username}", "ham" ];
EOF

cat <<EOF | sudo tee /etc/dovecot/sieve/pipe/imapsieve_copy
#!/usr/bin/env bash
# Usage: bash imapsieve_copy <email> <spam|ham>

export USER="$1"
export MSG_TYPE="$2"

export OUTPUT_BASE_DIR="/var/mail/imapsieve_copy"
export OUTPUT_DIR="${OUTPUT_BASE_DIR}/${MSG_TYPE}"
export FILE="${OUTPUT_DIR}/${USER}-$(date +%Y%m%d%H%M%S)-${RANDOM}${RANDOM}.eml"

export OWNER="dovecot"
export GROUP="mail"

for dir in "${OUTPUT_BASE_DIR}" "${OUTPUT_DIR}"; do
    if [[ ! -d ${dir} ]]; then
        mkdir -p ${dir}
        chown ${OWNER}:${GROUP} ${dir}
        chmod 0700 ${dir}
    fi
done

# Save the message piped in on stdin.
cat > "${FILE}"

# Logging
#export LOG='logger -p local5.info -t imapsieve_copy'
#[[ $? == 0 ]] && ${LOG} "Copied one ${MSG_TYPE} email reported by ${USER}: ${FILE}"
EOF

chown dovecot:mail /var/mail/sieve/report_spam.sieve \
    /var/mail/sieve/report_ham.sieve \
    /etc/dovecot/sieve/pipe/imapsieve_copy

chmod 0700 /var/mail/sieve/report_spam.sieve \
    /var/mail/sieve/report_ham.sieve \
    /etc/dovecot/sieve/pipe/imapsieve_copy

service dovecot restart

# su -s /bin/bash amavis -c "sa-learn --dump magic"

cat <<EOF | sudo tee /etc/dovecot/sieve/scan_reported_mails.sh
#!/usr/bin/env bash
# Author: Zhang Huangbin
# Purpose: Copy spam/ham to another directory and call sa-learn to learn.

# Paths to find program.
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH"

export OWNER="dovecot"
export GROUP="mail"

# The spamd daemon user.
export SPAMD_USER='debian-spamd'
export SPAMD_USER_HOMEDIR="$(eval echo ~${SPAMD_USER})"

# Kernel name, in upper cases.
export KERNEL_NAME="$(uname -s | tr '[a-z]' '[A-Z]')"

# A temporary lock file. It should be removed after the messages have been examined successfully.
export LOCK_FILE='/tmp/scan_reported_mails.lock'

# Logging to syslog with 'logger' command.
export LOG='logger -p local5.info -t scan_reported_mails'

# `sa-learn` command, with optional arguments.
export SA_LEARN="sa-learn -u ${SPAMD_USER} --dbpath ${SPAMD_USER_HOMEDIR}/.spamassassin"

# Spool directory.
# Must be owned by dovecot:mail.
export SPOOL_DIR='/var/mail/imapsieve_copy'

# Directories which store spam and ham emails.
# These two should be created when the SPAMD antispam plugin is set up.
export SPOOL_SPAM_DIR="${SPOOL_DIR}/spam"
export SPOOL_HAM_DIR="${SPOOL_DIR}/ham"

# Directory used to store emails we're going to process.
# We will copy new spam/ham messages to these directories, scan them, then
# remove them.
export SPOOL_LEARN_SPAM_DIR="${SPOOL_DIR}/processing/spam"
export SPOOL_LEARN_HAM_DIR="${SPOOL_DIR}/processing/ham"

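if [ -e "${LOCK_FILE}" ]; then
    # Remove a stale lock (older than one day); otherwise assume another run is active.
    if [ -n "$(find "${LOCK_FILE}" -maxdepth 0 -mtime +0 2>/dev/null)" ]; then
        rm -f "${LOCK_FILE}" >/dev/null 2>&1
    else
        ${LOG} "Lock file exists (${LOCK_FILE}), abort."
        exit
    fi
fi

# Create the lock for this run; it is removed at the end of the script.
touch "${LOCK_FILE}"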

for dir in "${SPOOL_DIR}" "${SPOOL_LEARN_SPAM_DIR}" "${SPOOL_LEARN_HAM_DIR}"; do
    if [[ ! -d ${dir} ]]; then
        mkdir -p ${dir}
    fi

    chown ${OWNER}:${GROUP} ${dir}
    chmod 0700 ${dir}
done

# If there are a lot of files, a direct `mv` command may fail with an error like
# `argument list too long`, so we use `find` instead.
if [[ X"${KERNEL_NAME}" == X'OPENBSD' ]] || [[ X"${KERNEL_NAME}" == X'FREEBSD' ]]; then
    [[ -d ${SPOOL_SPAM_DIR} ]] && find ${SPOOL_SPAM_DIR} -name '*.eml' -exec mv {} ${SPOOL_LEARN_SPAM_DIR}/ \;
    [[ -d ${SPOOL_HAM_DIR} ]]  && find ${SPOOL_HAM_DIR}  -name '*.eml' -exec mv {} ${SPOOL_LEARN_HAM_DIR}/  \;
else
    [[ -d ${SPOOL_SPAM_DIR} ]] && find ${SPOOL_SPAM_DIR} -name '*.eml' -exec mv -t ${SPOOL_LEARN_SPAM_DIR}/ {} +
    [[ -d ${SPOOL_HAM_DIR} ]]  && find ${SPOOL_HAM_DIR}  -name '*.eml' -exec mv -t ${SPOOL_LEARN_HAM_DIR}/  {} +
fi

# Try to delete empty directory, if failed, that means we have some messages to
# scan.
rmdir ${SPOOL_LEARN_SPAM_DIR} &>/dev/null
if [[ X"$?" != X'0' ]]; then
    output="$(${SA_LEARN} --spam ${SPOOL_LEARN_SPAM_DIR})"
    rm -rf ${SPOOL_LEARN_SPAM_DIR} &>/dev/null
    ${LOG} '[SPAM]' ${output}
fi

rmdir ${SPOOL_LEARN_HAM_DIR} &>/dev/null
if [[ X"$?" != X'0' ]]; then
    output="$(${SA_LEARN} --ham ${SPOOL_LEARN_HAM_DIR})"
    rm -rf ${SPOOL_LEARN_HAM_DIR} &>/dev/null
    ${LOG} '[CLEAN]' ${output}
fi

rm -f ${LOCK_FILE} &>/dev/null
EOF

cat <<EOF | sudo tee /etc/cron.daily/spamham
#!/usr/bin/env bash
bash /etc/dovecot/sieve/scan_reported_mails.sh
EOF

chmod 755 /etc/cron.daily/spamham && chown root:root /etc/cron.daily/spamham

If you like, you can send us a PR on GitHub with this feature. We would love to review and implement it. If you need additional help, you can reach us on Discord.

I found a few corrections I need to make. I am not sure how to do a PR on GitHub. I’ve only ever used it for versioning and backing things up.

The way I created the files made the shell expand the variables while the installer was running, instead of writing them literally into the generated files. Do you know a way to avoid that, other than cat <<EOF | sudo tee?
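
For reference, one common way to get that behaviour is to quote the heredoc delimiter, which tells bash to write the body without expanding anything. The sketch below is only an illustration: it writes to a temporary directory instead of /var/mail/sieve, the file names are made up, and the username value is a stand-in for whatever the installer's shell happens to have set (in the real installer it is usually empty).

```shell
#!/usr/bin/env bash
# Sketch only: writes into a throwaway directory, not into /etc or /var/mail.
tmpdir="$(mktemp -d)"

# Hypothetical stand-in for a variable that exists in the installer's shell.
username="value-from-installer-shell"

# Unquoted delimiter: bash expands ${username} before the file is written,
# so the Sieve variable reference is lost.
cat <<EOF > "${tmpdir}/unquoted.sieve"
pipe :copy "imapsieve_copy" [ "${username}", "spam" ];
EOF

# Quoted delimiter: the body is written exactly as typed, so ${username}
# reaches the Sieve script untouched.
cat <<'EOF' > "${tmpdir}/quoted.sieve"
pipe :copy "imapsieve_copy" [ "${username}", "spam" ];
EOF

unquoted="$(cat "${tmpdir}/unquoted.sieve")"
quoted="$(cat "${tmpdir}/quoted.sieve")"
printf 'unquoted: %s\nquoted:   %s\n' "${unquoted}" "${quoted}"
```

The same quoting should work with the sudo tee form as well, for example cat <<'EOF' | sudo tee /path/to/file, since only the delimiter changes.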

You can pull any file from /usr/local/hestia/install/deb

Just drop the file in:

Then it will be installed non default if sieve has been selected

I just need to update it for a few permission errors I fixed and figure out a replacement for the cat <<EOF | sudo tee command.

OK, so I had to change it from a one-file script, so I put it in a repository for now for anyone who wants to use it. It’s working 100% for me currently. There is most likely a better way to implement this, but it works if anyone wants auto-learning. The only thing not included is the SpamAssassin config. Make sure the learning settings are set.
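
As a rough way to check the SpamAssassin side, the sketch below looks for an explicit use_bayes 0 line, which would switch the Bayes classifier off and make sa-learn training pointless. It assumes that "learning settings" refers to the Bayes options and that the config lives at /etc/spamassassin/local.cf (the usual Debian/Ubuntu location); neither is stated above. The function name bayes_disabled and the SA_LOCAL_CF override are made up for this example.

```shell
#!/usr/bin/env bash
# bayes_disabled FILE: succeeds if FILE contains an uncommented "use_bayes 0".
bayes_disabled() {
    local cf="$1"
    grep -Eq '^[[:space:]]*use_bayes[[:space:]]+0([[:space:]]|$)' "${cf}"
}

# Assumed default path; override with SA_LOCAL_CF if yours differs.
cf="${SA_LOCAL_CF:-/etc/spamassassin/local.cf}"
if [ -r "${cf}" ]; then
    if bayes_disabled "${cf}"; then
        echo "use_bayes is set to 0 in ${cf}: learned spam/ham will not be used"
    else
        echo "no explicit 'use_bayes 0' found in ${cf}"
    fi
fi
```

This only catches the explicit off switch in that one file; other config files loaded by SpamAssassin could still override it.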

I have no clue about the GitHub thing. I had to set it up as the needed files and directories plus a bash script that puts things where they need to be and sets permissions. I don’t know how to PR that onto the Hestia repo. If anyone wants to, they are welcome to take this and do that.
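
For anyone curious what a "put files in place and set permissions" step can look like, here is a minimal sketch using the standard install command. It is not the script from the repository: the place_file helper and the demo file are hypothetical, and it runs against a temporary directory so it can be tried without root.

```shell
#!/usr/bin/env bash
# place_file SRC DEST MODE: copy SRC to DEST with the given permission bits,
# creating DEST's parent directories first.
place_file() {
    local src="$1" dest="$2" mode="$3"
    mkdir -p "$(dirname "${dest}")" && install -m "${mode}" "${src}" "${dest}"
}

# Demo in a throwaway root instead of the real filesystem.
demo_root="$(mktemp -d)"
printf '#!/usr/bin/env bash\ncat > /dev/null\n' > "${demo_root}/imapsieve_copy.src"

place_file "${demo_root}/imapsieve_copy.src" \
    "${demo_root}/etc/dovecot/sieve/pipe/imapsieve_copy" 0700

ls -l "${demo_root}/etc/dovecot/sieve/pipe/imapsieve_copy"
```

On a real system the install call would also need the owner and group, for example -o dovecot -g mail as used in the script above, which requires running as root.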

@djav1985 Does your version work with the current HestiaCP version? For me, emails always end up in the spam folder, even though they are not spam xD

BTW: I rarely get false positives. Only emails from one provider are marked as spam.

Yeah, it’s still working for me.