One of our partners is having difficulty automating email attachment processing. Basically, the attachments of each email on a secured POP3 server must be downloaded and saved to a certain folder on a server. In addition, all attachments must be converted to TIFF. This can be done easily with a cron job, fetchmail and munpack. What you need:
- a Linux install. I’m using Ubuntu 14.04 for this, since CentOS has an outdated and buggy mpack package
- fetchmail, to fetch the email from the POP3 server
- procmail. Since munpack works on one message per file, which is how the Maildir format stores mail, we will be using procmail to deliver the email to a Maildir
- mpack, for unpacking the attachments
- imagemagick, for converting JPEGs and PNGs to TIFF. It can also do PDF, but the result is horrible at best
- ghostscript, for PDF to TIFF conversion
Prepare your Linux box properly; a vanilla install without a desktop environment should suffice. Install the required packages:
sudo apt-get install fetchmail procmail mpack imagemagick
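Before going further, it can help to confirm that every tool the rest of this post relies on is actually on the PATH. The snippet below is only a sketch; missing_tools is a name I made up for the example, and the command names are the binaries I expect those packages to provide (munpack from mpack, convert from imagemagick, gs from ghostscript).

```shell
#!/bin/bash
# Print every command from the argument list that is not on PATH.
# Returns non-zero if at least one of them is missing.
missing_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the binaries used later in this post.
missing_tools fetchmail procmail munpack convert gs \
    || echo "install the packages listed above first"
```

If nothing is printed, all five commands were found.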
ghostscript should be installed by default on your distro. If you’re using a distribution other than Ubuntu and its derivatives, please make sure that you’re using the latest version of mpack. The next steps should be performed with a non-root account. Let’s start with fetchmail. Create the fetchmail config file by doing
nano ~/.fetchmailrc
For this walkthrough I’m using Gmail to simulate POP3 server access:
poll pop.gmail.com protocol pop3 timeout 300 port 995
    username "sovereign.khan@gmail.com" password "IsThisSparta?"
    keep
    mimedecode
    ssl
    sslcertck
    sslproto TLS1
    mda "/usr/bin/procmail -m '/home/ikhsan/.procmailrc'"
Replace the Gmail POP server address with yours. Since the file will contain the mailbox password in clear text, it must be secured. Do
chmod 700 ~/.fetchmailrc
…so that only you can open and read the file. Note that the last line of the config file contains a hook that calls procmail with its corresponding config file. Next, set up procmail. Create its config file where your fetchmail hook can find it.
nano ~/.procmailrc
The file should look like this
LOGFILE=/home/ikhsan/.procmail.log
MAILDIR=/home/ikhsan/
VERBOSE=on

:0
Maildir/
This will set /home/ikhsan/Maildir as the mail directory, and new mails will be delivered there. Now let’s create the folders that we will use to process the attachments:
mkdir ~/Maildir/process
mkdir ~/Maildir/process/landing
mkdir ~/Maildir/process/extract
mkdir ~/Maildir/process/store
mkdir ~/Maildir/process/archive
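Once the procmail recipe is saved, delivery can be checked by hand before fetchmail gets involved. The sketch below builds a tiny multipart message carrying one base64 text attachment. Everything in it is made up for the test: the function name make_test_mail, the addresses, the subject and the hello.txt file name.

```shell
#!/bin/bash
# Write a minimal multipart/mixed message with one base64 attachment to stdout.
# All header values and the attachment name are placeholders for testing.
make_test_mail() {
    local boundary="=_test_boundary_$$"
    printf 'From: sender@example.com\n'
    printf 'To: receiver@example.com\n'
    printf 'Subject: attachment test\n'
    printf 'MIME-Version: 1.0\n'
    printf 'Content-Type: multipart/mixed; boundary="%s"\n\n' "$boundary"
    # First part: a plain text body.
    printf '%s\n' "--$boundary"
    printf 'Content-Type: text/plain\n\nSee attachment.\n'
    # Second part: the attachment, base64 encoded.
    printf '%s\n' "--$boundary"
    printf 'Content-Type: text/plain; name="hello.txt"\n'
    printf 'Content-Transfer-Encoding: base64\n'
    printf 'Content-Disposition: attachment; filename="hello.txt"\n\n'
    printf 'hello from the test message\n' | base64
    # Closing boundary.
    printf '%s\n' "--$boundary--"
}
```

Piping it through the same command that the fetchmail hook uses, for example make_test_mail | /usr/bin/procmail -m "$HOME/.procmailrc", should leave one new file under ~/Maildir/new if the recipe works.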
A bit of explanation for the folders:
- landing is where we first move new mails from procmail’s Maildir prior to extracting the attachments
- extract is where we will perform attachment extraction
- store is the final destination of the attachments
- archive is where the mail files are stored after the process is done. If you want to reprocess certain files, just move them back to landing
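Moving a mail back from archive to landing can be wrapped in a small helper. This is only a sketch; reprocess is a name I picked for the example, and it assumes the folder layout described above.

```shell
#!/bin/bash
# Move one archived message back to landing so the next run picks it up.
# $1 = the process directory, $2 = message file name inside archive/
reprocess() {
    local base="$1" name="$2"
    if [ ! -f "$base/archive/$name" ]; then
        echo "no such archived message: $name" >&2
        return 1
    fi
    mv -- "$base/archive/$name" "$base/landing/"
}

# Example call (the second argument is whatever file name sits in archive/):
# reprocess "$HOME/Maildir/process" "<message file name>"
```

If the same mail was already processed once, its old folder under store probably needs to be moved aside first, since mv will not replace a non-empty directory.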
And now, for the script. Create it wherever you like; I personally keep all of my scripts in one place
nano ~/scripts/getmail.sh
The script is very simple and should be self-explanatory:
#!/bin/bash
# Fetch new mail, unpack the attachments and convert them to TIFF.
DIR=/home/ikhsan/Maildir
LOG=/home/ikhsan/Maildir/getmail.log

date +%r-%-d/%-m/%-y >> "$LOG"
fetchmail

# With nullglob, a pattern that matches nothing expands to nothing,
# so the loops below are simply skipped when there is no work to do.
shopt -s nullglob
for m in "$DIR"/new/*
do
    mv "$m" "$DIR/process/landing/"
done

cd "$DIR/process/landing/" || exit 1
for i in *
do
    echo "processing $i" >> "$LOG"
    mkdir "$DIR/process/extract/$i"
    cp "$i" "$DIR/process/extract/$i/"
    echo "saving backup $i to archive" >> "$LOG"
    mv "$i" "$DIR/process/archive"
    echo "unpacking $i" >> "$LOG"
    munpack -C "$DIR/process/extract/$i" -q "$DIR/process/extract/$i/$i"

    echo "converting pdf.." >> "$LOG"
    for x in "$DIR/process/extract/$i"/*.pdf
    do
        ranx=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 5 | head -n 1)
        gs -sDEVICE=tiff24nc -dNOPAUSE -r300x300 -sOutputFile="$DIR/process/extract/$i/$i-$ranx.tiff" -- "$x"
        rm "$x"
    done

    echo "next, the jpegs.." >> "$LOG"
    for y in "$DIR/process/extract/$i"/*.jpg
    do
        rany=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 5 | head -n 1)
        convert "$y" "$DIR/process/extract/$i/$i-$rany.tiff"
        rm "$y"
    done

    echo "last, the pngs.." >> "$LOG"
    for z in "$DIR/process/extract/$i"/*.png
    do
        ranz=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 5 | head -n 1)
        convert "$z" "$DIR/process/extract/$i/$i-$ranz.tiff"
        rm "$z"
    done
done

echo "finishing.." >> "$LOG"
for d in "$DIR"/process/extract/*
do
    mv "$d" "$DIR/process/store/"
done
shopt -u nullglob
echo "done!" >> "$LOG"
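The script names every converted file after the mail file plus five random characters, so the original attachment name is lost. If you would rather keep it, one possible variation is sketched below. This is not what the script above does; rand_suffix and tiff_target are names I made up, and the random part is kept only to avoid clashes between attachments that share a name.

```shell
#!/bin/bash
# Five random alphanumeric characters, the same idea as the pipeline in the script.
rand_suffix() {
    LC_ALL=C tr -dc 'a-zA-Z0-9' < /dev/urandom | head -c 5
}

# Build an output path that keeps the attachment's base name.
# $1 = extract folder, $2 = path of the attachment to convert
tiff_target() {
    local dir="$1" src="$2" base
    base=$(basename "$src")
    base="${base%.*}"    # drop the extension
    printf '%s/%s-%s.tiff\n' "$dir" "$base" "$(rand_suffix)"
}

# Possible use inside one of the conversion loops:
# convert "$y" "$(tiff_target "$DIR/process/extract/$i" "$y")"
```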
Each set of attachments is kept in a separate folder named after the mail file, whose name includes the delivery timestamp. Ghostscript is used to convert PDF to TIFF, while ImageMagick’s convert handles the JPEG and PNG conversions. The script will not preserve the names of the attachments. Make the script executable (chmod +x ~/scripts/getmail.sh), then call it with a cron job:
crontab -e
To have the script check for new mail every minute, add
*/1 * * * * /home/ikhsan/scripts/getmail.sh
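With a one-minute interval, a slow run (a few large PDFs, for example) could still be working when cron starts the next one. A minimal guard against that is sketched below, using the fact that mkdir either creates the directory or fails, never both. run_locked and the lock path in the example call are names I chose, not part of any of the tools above.

```shell
#!/bin/bash
# Run a command only if no other run holds the lock directory.
# $1 = lock directory, remaining arguments = the command to run
run_locked() {
    local lockdir="$1" rc
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another run is still active, skipping" >&2
        return 1
    fi
    "$@"
    rc=$?
    rmdir "$lockdir"
    return "$rc"
}

# Example call, e.g. from a small wrapper script that cron starts instead:
# run_locked /tmp/getmail.lock /home/ikhsan/scripts/getmail.sh
```

One caveat: if the machine goes down in the middle of a run, the lock directory stays behind and has to be removed by hand.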
Please be conservative with the schedule and consult the mail server admin, since some servers may treat periodic access at short intervals as an attempted denial-of-service attack. ..And we’re done 🙂
Thank you for providing this! Very helpful. This was the best source of info that I found that explains bringing fetchmail, procmail, and munpack together to retrieve/save emails in scripting.
You’re welcome 😀
Cool... Pak Ikhsan, I see you still enjoy tinkering...
I once had a case like this too: download the attachments, then upload them to FTP. Right now I use a PowerShell script with EWS, because the POP service on our Exchange isn’t enabled, so the script still runs on Windows.
If you have ever tried EWS on Linux, maybe you could share it. That would be really helpful 🙂
Thank you
Hey Phil 😀
Unfortunately I never have, but basically the method above can still be used by adding a bridge to OWA/EWS in front of it. For that you can use DavMail, which exposes SMTP and IMAP.
So it would be set up like this:
OWA-[EWS]->Davmail-[IMAP]->Fetchmail+Procmail-[MailDir]->MPack. After that, just add a script for ftp/sftp
Some reading on DavMail:
https://www.digitalocean.com/community/tutorials/how-to-setup-a-davmail-exchange-gateway-on-a-debian-7-vps
If I get curious I’ll give it a try later 😀