I have always regarded forums as a way for one person's experience to save another person the struggle. When I started this blog, I set out to share my own experiences and fixes to help others who may run into the same problems. So I hope this reaches someone out there and is useful to them.
Last week my IT team spent several hours attempting to resolve an issue where one of our MX servers would return 0 new messages when there were clearly over 60,000 new emails between /home/vmail/%mailbox%/MailDir/cur and new. Now, we have a complex mail setup that pulls email from a Dovecot mailbox and inserts it into a database for the application. There are several layers involved, so troubleshooting a system like that can take several hours of going through each layer to determine whether the fault lies with the program that picks up the mail or with the Dovecot server itself.
Below are the steps I took to troubleshoot the issue.
- Log into the MX server and run [ # tail -f /var/log/dovecot.log ]
- Open the email client, or start the email client program, to fetch new mail.
- Watch the Dovecot logs to determine whether another program is picking up the mail before our mail program is able to fetch it.
- Rule out another process picking up the mail.
- Watch /home/vmail/%mailbox%/MailDir/new for new incoming mail.
- Once mail is in the new queue, run the mail client to fetch new mail.
- At this point I noticed mail being moved from “new” to “cur” while the mail client still reported no new mail to pick up.
- Next I looked at the mail client to see when the last new message was picked up. I made note of the date stamp on the last mail message that was picked up.
- I then went back to the Dovecot MX server and ran [ # ls -l /home/vmail/%mailbox%/MailDir/ ]
- This returns a few directories and a few key files that are the focus of this blog post: dovecot.index, dovecot.index.cache, and dovecot.index.log. These files are what Dovecot uses to make loading the mail in the mailbox faster. They are an index of the transactions on that particular mailbox. What I noticed was that the dovecot.index file had the same timestamp as the last email the client had successfully received. Even though new mail was obviously coming in, the Dovecot index was not updating.
- I then renamed all three files to dovecot.index.old, dovecot.index.cache.old, and dovecot.index.log.old.
- I proceeded to restart the Dovecot service by running [ # service dovecot restart ]
- I verified that dovecot.index, dovecot.index.cache, and dovecot.index.log were rebuilt after the service restarted.
- Then I restarted my mail client and fetched new mail.
- BAM! The 60,000 new messages are now seen and fetched.
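For anyone who would rather script the check and the rename, here is a rough sketch of the steps above. It assumes the index files sit directly inside the Maildir, as they did on our server. The function names index_is_stale and set_aside_indexes are made up for this example and are not part of Dovecot. Treat it as a starting point, not a tested tool.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical helpers for this post.
# Assumes dovecot.index* live directly in the Maildir passed as $1.

# Succeeds (exit 0) when new/ or cur/ holds a message newer than
# dovecot.index, which is the symptom described above.
index_is_stale() {
    maildir=$1
    index="$maildir/dovecot.index"
    # No index file means there is nothing to compare against.
    [ -f "$index" ] || return 1
    newer=$(find "$maildir/new" "$maildir/cur" -type f -newer "$index" 2>/dev/null | head -n 1)
    [ -n "$newer" ]
}

# Renames the three index files to *.old so they can be rebuilt.
set_aside_indexes() {
    maildir=$1
    for f in dovecot.index dovecot.index.cache dovecot.index.log; do
        if [ -f "$maildir/$f" ]; then
            mv "$maildir/$f" "$maildir/$f.old"
        fi
    done
}

# Example use (substitute your own mailbox path):
#   if index_is_stale "/home/vmail/%mailbox%/MailDir"; then
#       set_aside_indexes "/home/vmail/%mailbox%/MailDir"
#       service dovecot restart
#   fi
```

The script only defines the two functions, so nothing is renamed until you call them yourself. Renaming rather than deleting keeps the old files around in case you want to look at them later.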
I hope this is useful to anyone out there having the same issue. Feel free to leave your comments or questions anytime! Good luck!