CDR Tickets

Issue Number 4216
Summary Modify URL Check Report to exclude redirected links
Created 2017-01-17 12:59:21
Issue Type Improvement
Submitted By Osei-Poku, William (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2017-01-31 11:27:00
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.201516
Description

The URL Check report is displaying all the glossary terms (Related External Refs to glossary terms) as 302 redirection errors or warnings. This creates a lot of noise in the report. Since we are not going to do a global change to update the protocols, is there a way you can suppress these 302 errors so that they don't get reported?

To reproduce the problem, you may select the following search criteria:

Doc Type: Glossary Term Concept
Audience: Patient
Language: English
Type of Report: URL errors

Comment entered 2017-01-24 15:57:29 by Kline, Bob (NIH/NCI) [C]

Here are some ideas about how this report could be made more useful.

  1. Use some reasonable timeouts to speed up the processing

  2. Support the ability to upload an Excel workbook seeded with URL information we already have, and return a fresh workbook which contains a copy of the uploaded data, as well as URLs which were not in the original workbook and which did not result in a 200 ("OK") code. Each document type would be in a separate sheet, and the script would only check the document types corresponding to the names of the sheets in the uploaded workbook.

An example worksheet would have a table like this:

CDR ID | Path | URL | Code | Description
666 | /Person/PersonLocations/PrivatePractice/PrivatePracticeLocation/WebSite/@cdr:xref | http://www.endicottmdpa.com/ | None | Unable to connect
1390 | /Person/PersonLocations/PrivatePractice/PrivatePracticeLocation/WebSite/@cdr:xref | http://www.drwag.com/ | 503 | Service Unavailable
1447 | /Person/PersonLocations/OtherPracticeLocation/SpecificWebSite/@cdr:xref | http://www.cucnj.urologydomain.com/handler.cfm?event=practice,main&nid=274 | 503 | Service Unavailable
3638 | /Person/PersonLocations/OtherPracticeLocation/SpecificWebSite/@cdr:xref | http://www.davidcaplinmd.com | 403 | Forbidden
4140 | /Person/PersonLocations/PrivatePractice/PrivatePracticeLocation/WebSite/@cdr:xref | http://bdwilsonmd.com/index.html | 404 | Not Found
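
The merge step in idea 2 could be sketched roughly as follows. This is only an illustration, not the report's actual code: the Row tuple, the merge_rows name, and the choice to identify a link by (CDR ID, path, URL) rather than by URL alone are all assumptions, and reading or writing the Excel workbook itself is left out.

```python
from collections import namedtuple

# Hypothetical row layout mirroring the columns of the example worksheet.
Row = namedtuple("Row", "cdr_id path url code description")


def merge_rows(uploaded, checked):
    """Return the uploaded rows plus newly found problem URLs.

    uploaded -- rows read from one sheet of the uploaded workbook
    checked  -- rows produced by a fresh check of the same document type
    """
    # Assumption: a link is identified by document, location, and URL.
    known = {(row.cdr_id, row.path, row.url) for row in uploaded}
    merged = list(uploaded)  # copy of the uploaded data, unchanged
    for row in checked:
        key = (row.cdr_id, row.path, row.url)
        # Add only links we have not seen before and that are not 200 ("OK").
        if key not in known and row.code != 200:
            merged.append(row)
            known.add(key)
    return merged
```

Under these assumptions, rows already in the uploaded sheet are carried over as they are, and a fresh result is appended only when it is both new and not OK.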

What do you think?

Comment entered 2017-01-26 14:51:13 by Osei-Poku, William (NIH/NCI) [C]

Bob will see if the "requests" package supports controlling whether or not redirects are followed. If so, we will let the user choose whether to include redirect codes.

If we find out this is not possible, we'll make a decision at that point as to whether to go back to using the older package.

Comment entered 2017-01-26 17:13:28 by Kline, Bob (NIH/NCI) [C]

The package does support controlling whether redirects are followed. We will let users control this option. What would you like the default to be?
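
For reference, a minimal sketch of how the option might be wired up, assuming the report checks each link with requests.get. The allow_redirects and timeout keyword arguments are part of the requests API; the check_url name, the defaults shown, and the injectable get parameter (there only so the function can be exercised without network access) are assumptions for illustration.

```python
def check_url(url, follow_redirects=False, timeout=5, get=None):
    """Return (status_code, reason) for url, or (None, message) on failure."""
    if get is None:
        import requests  # third-party package; imported only when needed
        get = requests.get
    try:
        # With allow_redirects=False a redirect is reported as 3xx instead of
        # being followed; timeout (in seconds) keeps a dead host from stalling
        # the whole report.
        response = get(url, allow_redirects=follow_redirects, timeout=timeout)
    except OSError:
        # requests.RequestException derives from IOError (an alias of OSError),
        # so this covers connection failures and timeouts.
        return None, "Unable to connect"
    return response.status_code, response.reason
```

With follow_redirects set to True, a link that redirects to a working page would come back as 200 and drop out of the error report; with False, the 302 itself is reported.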

Comment entered 2017-01-31 11:27:00 by Kline, Bob (NIH/NCI) [C]

Implemented on DEV (along with OCECDR-4219)

Comment entered 2017-02-16 18:31:53 by Osei-Poku, William (NIH/NCI) [C]

Verified on DEV.

Comment entered 2017-02-28 10:05:04 by Osei-Poku, William (NIH/NCI) [C]

Verified on QA.
