CDR Tickets

Issue Number: 3781
Summary: Modify Gen. Prof. Publication Notification program to report failed emails
Created: 2014-07-08 20:54:40
Issue Type: New Feature
Submitted By: Osei-Poku, William (NIH/NCI) [C]
Assigned To: Kline, Bob (NIH/NCI) [C]
Status: Closed
Resolved: 2014-10-30 10:55:59
Resolution: Won't Fix
Path: /home/bkline/backups/jira/ocecdr/issue.134701
Description

We discussed this a while back. When the Genetics Professional Publication Notification program fails, the emails that could not be generated are not logged or reported, so there's no way to follow up with the professional. We briefly discussed changing the error log to list the failed email addresses.

Comment entered 2014-10-02 17:08:39 by Kline, Bob (NIH/NCI) [C]

Just to be clear about the scope of this enhancement: we're talking about failures that happen while the CGI script is running, right? So if I can't connect to the NIH mail server (for example) you would expect me to log the failure (I assume you wouldn't expect me to send you an email notification if I can't connect to a mail server). On the other hand, you're not expecting the software to know that the email message eventually bounced because the NIH mail server gave up after trying for several days to deliver the message, right?

Comment entered 2014-10-02 19:01:01 by Osei-Poku, William (NIH/NCI) [C]

>>Just to be clear about the scope of this enhancement: we're talking about failures that happen while the CGI script is running, right?
Yes.

>>So if I can't connect to the NIH mail server (for example) you would expect me to log the failure (I assume you wouldn't expect me to send you an email notification if I can't connect to a mail server).

Yes, log the failure but also inform me so that I can run the mailer job again at a later time or report the issue to be investigated.

>> On the other hand, you're not expecting the software to know that the email message eventually bounced because the NIH mail server gave up after trying for several days to deliver the message, right?

That is correct.
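
The scope agreed on above (log failures that happen while the script is running, and report them so the job can be rerun) could be sketched roughly as follows in Python 3. This is only an illustration: `notify_all`, `connect`, and `send_one` are hypothetical names, not taken from the CDR code.

```python
import logging
import smtplib

log = logging.getLogger("gp-notification")


def notify_all(addresses, connect, send_one):
    """Try to send one notification per address.

    `connect` returns a server object and `send_one` takes
    (server, address); both are hypothetical hooks, not CDR functions.
    Returns a list of (address, reason) pairs for every failure.
    """
    failures = []
    try:
        server = connect()
    except (OSError, smtplib.SMTPException) as exc:
        # No connection at all: nothing was sent, so report every address.
        for address in addresses:
            log.error("not sent to %s: %s", address, exc)
            failures.append((address, str(exc)))
        return failures
    for address in addresses:
        try:
            send_one(server, address)
        except (OSError, smtplib.SMTPException) as exc:
            # Failure during the session for this one recipient.
            log.error("not sent to %s: %s", address, exc)
            failures.append((address, str(exc)))
    return failures
```

The returned list is what the script would show to the operator; a message which bounces days later never appears in it, which matches the scope described in this exchange.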

Comment entered 2014-10-03 08:28:51 by Kline, Bob (NIH/NCI) [C]

I've gone through the code and it looks like it's already letting you know which GP docs failed when trying to send out the notification. Can you give me an example of one which failed and the web page which came back didn't say so?

Comment entered 2014-10-03 09:13:27 by Kline, Bob (NIH/NCI) [C]

I've added a query to https://cdr.cancer.gov/cgi-bin/cdr/CdrQueries.py called Published GPs Needing Notification, which might be helpful. Again, it won't show you anything about messages which we successfully handed off to the NIH mail server, but which that server was unable to deliver.

As a side note, JIRA doesn't give you as much help as Bugzilla did (no surprise), but if you prefix a quoted paragraph with bq., you can mark it in a way which isn't thwarted when the paragraph wraps across multiple lines. When quoting a passage which spans multiple paragraphs, wrap it with {quote} on each side; click the yellow circle below with the Help question mark for more info. For example:

Just to be clear about the scope of this enhancement: we're talking about failures that happen while the CGI script is running, right?

Yes.

So if I can't connect to the NIH mail server (for example) you would expect me to log the failure (I assume you wouldn't expect me to send you an email notification if I can't connect to a mail server).

Yes, log the failure but also inform me so that I can run the mailer job again at a later time or report the issue to be investigated.

On the other hand, you're not expecting the software to know that the email message eventually bounced because the NIH mail server gave up after trying for several days to deliver the message, right?

That is correct.

Comment entered 2014-10-03 10:23:31 by Kline, Bob (NIH/NCI) [C]

... click the yellow circle below with the Help question mark for more info ....

Oops! That icon only appears when you're actually creating/editing a comment. The blue framed rectangular icon to its left toggles between a preview and editing of the comment.

Comment entered 2014-10-03 10:25:45 by Osei-Poku, William (NIH/NCI) [C]

Thanks for the Jira tips. I will give it a try:

>>Can you give me an example of one which failed and the web page which came back didn't say so?

How is it reported to me? Through email or on the HTML page?

Comment entered 2014-10-03 10:39:11 by Kline, Bob (NIH/NCI) [C]

>>How is it reported to me? Through email or on the HTML page?

On the HTML page which comes back. That page contains a table, with one row for each GP for which an attempt was made to send a notification. Rows for notifications which succeed get four cells, for CDR ID, Name, Email Address, and Mailer Tracking Doc. Rows for notifications which fail get a single cell with an error message.
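
As a rough illustration of that table layout (Python 3, not the actual CGI code), a row renderer might look like the sketch below. The function name `result_row`, the dict keys, and the `colspan` attribute on the error cell are all assumptions made for the example.

```python
import html


def result_row(result):
    """Render one table row for a notification attempt.

    `result` is a hypothetical dict: successful attempts carry
    "cdr_id", "name", "email" and "tracker"; failed attempts carry
    only "error".
    """
    if "error" in result:
        # Failure: a single cell holding the error message.
        return '<tr><td colspan="4">%s</td></tr>' % html.escape(result["error"])
    # Success: CDR ID, Name, Email Address, Mailer Tracking Doc.
    cells = (result["cdr_id"], result["name"], result["email"], result["tracker"])
    return "<tr>%s</tr>" % "".join(
        "<td>%s</td>" % html.escape(str(cell)) for cell in cells
    )
```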

I suspect what you're dealing with are failures which occur after the script has finished sending out the emails.

Comment entered 2014-10-03 10:57:28 by Osei-Poku, William (NIH/NCI) [C]

Whether or not a mailer job completes successfully, I get the "CBIIT-PROD: CDR Mailer Job Status" email with a link to the report. It is the same page that is displayed after running the mailer job.

https://cdr.cancer.gov/cgi-bin/cdr/PubStatus.py?id=11952

I do not remember getting any other report or page. I will check to see if a page besides this is displayed the next time I run a mailer job.

Comment entered 2014-10-03 11:15:11 by Kline, Bob (NIH/NCI) [C]

I did some low-level testing to confirm that the NIH mail server is deferring any attempts to deliver the messages, which means that the script won't know about addresses which are stale or otherwise undeliverable. Some servers try to deliver the mail right away (that's how things worked when I was running my own mail server), but NIH's servers queue the mail up for later delivery.

>>> import smtplib
>>> import cdr
>>> server = smtplib.SMTP(cdr.SMTP_RELAY)
>>> server.set_debuglevel(1)
>>> sender = "bkline@rksystems.com"
>>> recipients = ["bkline@mail.nih.gov", "klem@kadiddle.com"]
>>> message = """\
... From: bkline@rksystems.com
... To: bkline@mail.nih.gov, klem@kadiddle.com
... Subject: Test for mail failure
...
... This is the body of the test message.
... """
>>> undeliverable = server.sendmail(sender, recipients, message)
send: 'ehlo nciws-d141-v.nci.nih.gov\r\n'
reply: '250-mailfwd.nih.gov Hello [156.40.114.87]\r\n'
reply: '250-SIZE 52428800\r\n'
reply: '250-PIPELINING\r\n'
reply: '250-DSN\r\n'
reply: '250-ENHANCEDSTATUSCODES\r\n'
reply: '250-AUTH\r\n'
reply: '250-8BITMIME\r\n'
reply: '250-BINARYMIME\r\n'
reply: '250-CHUNKING\r\n'
reply: '250 XEXCH50\r\n'
reply: retcode (250); Msg: mailfwd.nih.gov Hello [156.40.114.87]
SIZE 52428800
PIPELINING
DSN
ENHANCEDSTATUSCODES
AUTH
8BITMIME
BINARYMIME
CHUNKING
XEXCH50
send: 'mail FROM:<bkline@rksystems.com> size=140\r\n'
reply: '250 2.1.0 Sender OK\r\n'
reply: retcode (250); Msg: 2.1.0 Sender OK
send: 'rcpt TO:<bkline@mail.nih.gov>\r\n'
reply: '250 2.1.5 Recipient OK\r\n'
reply: retcode (250); Msg: 2.1.5 Recipient OK
send: 'rcpt TO:<klem@kadiddle.com>\r\n'
reply: '250 2.1.5 Recipient OK\r\n'
reply: retcode (250); Msg: 2.1.5 Recipient OK
send: 'data\r\n'
reply: '354 Start mail input; end with <CRLF>.<CRLF>\r\n'
reply: retcode (354); Msg: Start mail input; end with <CRLF>.<CRLF>
data: (354, 'Start mail input; end with <CRLF>.<CRLF>')
send: 'From: bkline@rksystems.com\r\nTo: bkline@mail.nih.gov, klem@kadiddle.com\r\nSubject: Test for mail failure\r\n\r\nThis is the body of the test message.\r\n.\r\n'
reply: '250 2.6.0 <437be8d6-aaed-4ff2-97bf-482964f4883a@CESEDGE01.nih.gov> [InternalId=35624061] Queued mail for delivery\r\n'
reply: retcode (250); Msg: 2.6.0 <437be8d6-aaed-4ff2-97bf-482964f4883a@CESEDGE01.nih.gov> [InternalId=35624061] Queued mail for delivery
data: (250, '2.6.0 <437be8d6-aaed-4ff2-97bf-482964f4883a@CESEDGE01.nih.gov> [InternalId=35624061] Queued mail for delivery')
>>> print undeliverable
{}

As you can see, there's no clue that the message won't ever get delivered to Klem.
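
For reference, here is a small Python 3 sketch of how the result of `smtplib.SMTP.sendmail` can be read. `sendmail` returns a dict with one entry per refused recipient when at least one recipient was accepted, and raises `SMTPRecipientsRefused` when all of them were refused. The helper name `relay_rejections` is hypothetical.

```python
import smtplib


def relay_rejections(server, sender, recipients, message):
    """Return {address: (code, reply)} for recipients the relay refused.

    An empty dict means only that the relay accepted every recipient
    for queueing; it says nothing about final delivery.
    """
    try:
        # One entry per refused recipient when at least one was accepted.
        return dict(server.sendmail(sender, recipients, message))
    except smtplib.SMTPRecipientsRefused as exc:
        # Raised instead when every recipient was refused.
        return dict(exc.recipients)
```

With a relay which queues everything, as in the session above, this returns `{}` even for an address which will never be delivered.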

Comment entered 2014-10-30 10:55:21 by Osei-Poku, William (NIH/NCI) [C]

We decided to close this issue and will re-open it if the problem happens again and can be fixed.
