CDR Tickets

Issue Number 3694
Summary Update Filter Tools to adjust to CBIIT Environment
Created 2013-12-19 18:17:24
Issue Type Task
Submitted By Englisch, Volker (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2014-01-13 15:32:20
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.116341
Description

The InstallFilter.py and UpdateFilter.py tools need to be modified in order to work within the CBIIT environment.

Comment entered 2013-12-19 18:19:37 by Englisch, Volker (NIH/NCI) [C]

Please also refer to the issue OCECDR-3599 for the already modified tool CreateNewFilter.py.

Comment entered 2013-12-20 11:08:59 by Kline, Bob (NIH/NCI) [C]

I have done the modifications to UpdateFilter.py, tested them on PROD (from the bastion host), and checked in the changes to Subversion.

  • R12187 trunk/Bin/UpdateFilter.py

I thought about stopping there, and just using CreateNewFilter.py on the lower tiers to replace InstallFilter.py. Using this approach, you'd run CreateNewFilter.py on PROD, which gives you the stub filter file, named using the CDR ID assigned by the production server for the new document. This file gets checked into trunk/Filters. Then you'd do the same thing on DEV and/or QA, which would result in a new document on the lower tier server(s), with (almost certainly) different document ID(s). You'd make a note of the ID(s) for the lower tier(s), but discard the file written by the script in these lower-tier runs. You'd have to make sure you were in a directory other than your sandbox trunk/Filters, to avoid the (admittedly low) risk that the file written by the script would overwrite the file for another filter which happened to have the same ID in production as your new lower-tier document. Then you'd edit the file you checked into version control, and use UpdateFilter.py to store the changes on the local tier's server. Eventually, your testing on the lower tiers would convince you that the filter is ready to be promoted, and you'd run UpdateFilter.py on PROD.

However, I'm inclined to think that's too kludgey, even for the CBIIT environment, and I'm not comfortable even with the small risk described above. So I'm thinking of going ahead with updating InstallFilter.py so it can be used on the lower tiers in CBIIT's world. You'll have to give the script the filter title, since the script will no longer be able to look up the title on the production server itself.

Here's how I picture the life cycle of a filter in CBIIT Land:

  1. [PROD BASTION]\sandbox\trunk\Filters> CreateNewFilter.py UID PWD "TITLE OF THE FILTER"

  2. [PROD BASTION]\sandbox\trunk\Filters> svn add CDR0000999999.xml

  3. [PROD BASTION]\sandbox\trunk\Filters> svn ci -m "WONDERFUL COMMENT" CDR0000999999.xml

  4. [DEV]\sandbox\trunk\Filters> svn up

  5. [DEV]\sandbox\trunk\Filters> InstallFilter.py UID PWD "TITLE OF THE FILTER" CDR0000999999.xml

  6. Note down CDR ID of created document (e.g., CDR0000888888)

  7. Edit CDR0000999999.xml in DEV sandbox

  8. [DEV]\sandbox\trunk\Filters> UpdateFilter.py -i 888888 -p Y -c "ANOTHER WONDERFUL COMMENT" CDR0000999999.xml

  9. Test the filter on DEV, repeating previous two steps as needed

  10. [DEV]\sandbox\trunk\Filters> svn ci -m "SUBVERSION COMMENT" CDR0000999999.xml

  11. Optionally repeat DEV steps on QA and/or STAGE

  12. [PROD BASTION]\sandbox\trunk\Filters> svn up

  13. [PROD BASTION]\sandbox\trunk\Filters> UpdateFilter.py -p Y -c "SUPERLATIVE COMMENT" CDR0000999999.xml

If you want, there's nothing to prevent editing CDR0000999999.xml before installing it on DEV (between steps 4 and 5).
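
The steps above refer to the same document both as a bare number (the -i 888888 argument) and as a file name (CDR0000888888.xml). A minimal sketch of converting between the two forms, assuming the pattern visible in these examples ("CDR" followed by a ten-digit zero-padded number) holds in general; the function names are hypothetical and are not part of the CDR tools:

```python
import re

def normalize_cdr_id(value):
    """Return the integer form of an ID given as 888888, "888888" or "CDR0000888888"."""
    match = re.fullmatch(r"(?:CDR)?0*(\d+)", str(value).strip(), re.IGNORECASE)
    if not match:
        raise ValueError(f"not a CDR ID: {value!r}")
    return int(match.group(1))

def filter_filename(doc_id):
    """Build the file name used in trunk/Filters (assumed ten-digit padding)."""
    return f"CDR{normalize_cdr_id(doc_id):010d}.xml"
```

For example, filter_filename(999999) yields the file name used in steps 2 through 13.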

I'll hold off on doing any work on InstallFilter.py until Volker chimes in with any feedback.

Comment entered 2013-12-20 11:15:14 by Kline, Bob (NIH/NCI) [C]

Alan:

I should have added you before posting that previous comment. You might have some thoughts about the approach described above.

There's no doubt that the approach I'm describing is more cumbersome than what we had in the OCE hosting environment. Not much we can do about that, as far as I can tell, given the restrictions which prevent the CDR servers on the different tiers from talking to each other. It would be possible, I suppose, to cobble together something along the lines of what I've done for the filter diff program, using local copies of the filter information from the production repository to do the title/id lookups, but we'd have to be pretty assiduous about keeping that local information current, and I'm not inclined to think it would be worth it.

Your thoughts?

Comment entered 2013-12-20 15:36:42 by Englisch, Volker (NIH/NCI) [C]

My picture was a little different. We haven't created a new filter yet, but these are the steps I imagined:

  1. [DEV] CreateNewFilter.py UID PWD "Weisse Weihnacht in Hawaii" --> creating CDR0000777777.xml

  2. [DEV] Lots of filter editing and testing. When done ...

  3. [DEV] svn add CDR0000777777.xml; svn commit CDR0000777777.xml

  4. [PROD] Aarghhh!!!! Now I see a problem

We want to version the same file name that's being created on PROD which means we would need to create a new filter on PROD (CDR0000666666.xml) and copy CDR0000777777.xml to CDR0000666666.xml on DEV before we version the filter.

I think I need to think some more.

Comment entered 2013-12-20 16:25:55 by Kline, Bob (NIH/NCI) [C]

You're right that we need to create the filter first on PROD, because the name of the file we'll work on (and store in Subversion) needs to be based on the CDR ID on the production server. In fact, that part of the process (a new filter gets created first as a stub on PROD) was true even before the migration to CBIIT hosting. There's no need to copy CDR<PROD-ID>.xml to CDR<DEV-ID>.xml, though. We'll always work with CDR<PROD-ID>.xml (which gets copied to DEV with the "svn up ..." command above in my step 4). The only time we use the CDR ID for the document on DEV is when we run UpdateFilter.py on DEV, to which we provide the -i command-line argument so the script knows which document to update on DEV. It's an annoyance to write down what the separate ID is on DEV, when the script could figure that out for us by matching the title of the filter on DEV with the corresponding filter title on PROD, but CBIIT has taken that convenience away from us.

Read over my steps carefully, and see if they don't make sense.

Comment entered 2013-12-20 17:00:19 by Englisch, Volker (NIH/NCI) [C]

"Read over my steps carefully, and see if they don't make sense."

There is nothing wrong with your steps - they do make sense. I was just hoping we could use fewer steps overall. I'm not comfortable with versioning a filter on PROD but this seems to be the only way to create the correct file name in SVN and CDR-ID and it's the most convenient way (via subversion) to get the filters to the other tiers. As usual, it took a few more cycles for my brain to see that what your brain came up with makes the most sense. (I'm easily convinced :-) )

Comment entered 2013-12-20 17:39:41 by Kline, Bob (NIH/NCI) [C]

I wouldn't worry too much about creating extra stub filters on PROD - we can always make them "disappear" with the CdrDelDoc command. :-)

Comment entered 2014-01-08 14:28:16 by Kline, Bob (NIH/NCI) [C]

I'm going to follow up on a suggestion Alan made a few days ago of embedding in the filter documents themselves what we need for matching up IDs for the filter documents across tiers. Here's what I plan to do.

Step 1: normalize the storage approach for the filter files in Subversion

For this I'm going to write a script which identifies the filters which are still stored in version control the old way, with the CdrDoc wrapper around the real filter document stored as a CDATA section. For each such filter, I'll create a new version of the filter, stored without the CdrDoc wrapper. I'll check these new versions into version control.
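
A rough sketch of the unwrapping described in Step 1, not the actual script. It assumes the old-style file has a CdrDoc root element whose CdrDocXml child holds the filter as a CDATA section; the CdrDocXml element name and the function name are assumptions made for illustration:

```python
import xml.etree.ElementTree as ET

def unwrap_filter(text):
    """Return the bare filter XML, stripping a CdrDoc wrapper if one is present."""
    root = ET.fromstring(text.encode("utf-8"))
    if root.tag != "CdrDoc":
        return text  # already stored without the wrapper
    payload = root.find("CdrDocXml")  # assumed name of the element holding the CDATA
    if payload is None or not (payload.text or "").strip():
        raise ValueError("CdrDoc wrapper has no CdrDocXml payload")
    # The parser hands back the CDATA content as ordinary text.
    return payload.text.strip() + "\n"
```

A file that is already stored the new way is returned unchanged, so the function can be run over every file in the sandbox.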

Step 2: inject the filter title into the filter document

I'll create a second script, which I'll run on the PROD bastion host, to ask the production database for the titles of each filter stored in version control. I'll create a new top-level variable element to store that title in the filter document. For example:

<?xml version="1.0"?>
<xsl:transform           xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
                           version = "1.0"
                         xmlns:cdr = "cips.nci.nih.gov/cdr">
    
 <xsl:variable                name = "cdr-filter-title"
                            select = "'Denormalization Filter: Summary'"/>
 .
 .
 .
</xsl:transform>

Step 3: modify CreateNewFilter.py

The script for creating a new filter (always on production) will take the title given on the command line and use it to create the cdr-filter-title variable in the new filter document.

Step 4: modify InstallFilter.py

This script will not use the title argument described in the earlier comments above. Instead it will use the cdr-filter-title variable found in the document whose file is named on the command line. The invocation of the script will look like this:

[DEV]\sandbox\trunk\Filters> InstallFilter.py UID PWD CDR0000999999.xml

If the cdr-filter-title variable is not found in the document, the script will refuse to do anything except print an error message. You'll have to fix the filter file first.

Step 5: modify UpdateFilter.py

The document ID argument (-i) will no longer be used by this program. Instead it will pull the filter title from the cdr-filter-title variable found in the document whose file is named on the command line. If that variable isn't found, or doesn't match any filter documents on the tier where the script is being run, an error message will be printed. Otherwise, the filter document represented by that title will be updated from the file named on the command line. Invoked like this:

[ANY TIER]\sandbox\trunk\Filters> UpdateFilter.py [options] UID PWD CDR0000999999.xml

Options can be supplied (as before) asking for a publishable version or supplying a version comment.
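
The title-based lookup in Step 5 might look roughly like the sketch below. The dictionary stands in for whatever query the real script runs against the tier's database, and the function name is hypothetical:

```python
def find_filter_id(title, filters_on_tier):
    """Return the ID of the filter whose title matches exactly.

    filters_on_tier maps document IDs to filter titles for the current tier.
    """
    matches = [doc_id for doc_id, doc_title in filters_on_tier.items()
               if doc_title == title]
    if not matches:
        raise LookupError(f"no filter titled {title!r} on this tier")
    if len(matches) > 1:
        raise LookupError(f"title {title!r} is ambiguous: {matches}")
    return matches[0]
```

Because the title is the only key, the same file can be applied on any tier without knowing that tier's document ID in advance.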

Step 6: create new script ModifyFilterTitle.py

In the rare case when you decide that a filter title needs to change (usually not a good idea, as once a filter is in production there's usually software that expects to find it by name), you'll need to modify the cdr-filter-title variable in the filter document to set the new title, then run this script once on each tier where the filter has been installed. Usage:

[ANY TIER]\sandbox\trunk\Filters> ModifyFilterTitle.py [options] UID PWD CDRID CDR0000999999.xml

You can optionally supply a comment, which will default to "Filter title changed."

Now's the time to stop me if you don't think this is a good approach. Your thoughts?

Comment entered 2014-01-08 14:44:46 by Englisch, Volker (NIH/NCI) [C]

"Now's the time to stop me if you don't think this is a good approach. Your thoughts?"

Can I get a little time to think about this? I need time to digest and compare.

Comment entered 2014-01-08 14:56:59 by Kline, Bob (NIH/NCI) [C]

I can hold off until tomorrow, if you want to discuss the proposal then.

Comment entered 2014-01-08 16:45:41 by Kline, Bob (NIH/NCI) [C]

While we're at it, I propose marking the filter documents which have been deleted from version control (e.g., the old Summary denormalization filters) as deleted (active_status="D"); any objections?

Comment entered 2014-01-08 17:30:44 by Englisch, Volker (NIH/NCI) [C]

"any objections?"

Not at all. I like a clean house.

Comment entered 2014-01-08 18:02:05 by Kline, Bob (NIH/NCI) [C]

Similarly, I plan to push into the version control attic (svn rm obsolete-filter.xml) filter documents which have been marked "deleted" in the CDR.

Comment entered 2014-01-08 18:32:26 by Englisch, Volker (NIH/NCI) [C]

"Now's the time to stop me if you don't think this is a good approach. Your thoughts?"

I like it.

Remind me again where the current filter title is stored. That's the title value coming from the document table, right? In this case I'm assuming that - unless the filter has been renamed - the value of the cdr-filter-title variable and the title value of the filter would always be the same.
Will we keep these identical? If you're using a filter by name this might get tricky if those values aren't identical.

Comment entered 2014-01-08 21:31:04 by Kline, Bob (NIH/NCI) [C]

Yes, the filter title is stored in the title column of the document table.

"Will we keep these identical?"

Yes. The weakest link in the chain for the approach I'm describing is that there's nothing to prevent you from mucking with the cdr-filter-title variable inside the document when you're editing that document. The only time you should touch that variable is when you're deliberately preparing to use ModifyFilterTitle.py. If you change the variable to a title that doesn't match a CDR filter and try to use UpdateFilter.py, I'll just complain. If you change the variable to a title which matches another CDR filter, I'll overwrite that filter with what's in the file you just modified, which probably isn't what you want.

On the other hand, I don't think this dependency on filter title integrity is any more fragile than the system we were using pre-CBIIT.

Comment entered 2014-01-10 08:38:23 by Kline, Bob (NIH/NCI) [C]

When I woke up this morning, I realized that I can't use <xsl:variable/> elements to store the filter titles, because XSL/T doesn't permit two global variables with the same name and precedence (we wouldn't have this problem if we used xsl:import directives where we use xsl:include, because imports introduce different precedence levels). So I'm going to use a comment instead.

<?xml version = "1.0" encoding="utf-8"?>
<!-- Filter title: DocTitle for Summary -->

We've already got such a comment in many of the filters. I will modify my script to inject the comment (where it's not already present) instead of the xsl:variable element, and to ensure that the filter title comments that are present match the title in the CDR. Same caveat will apply as given above for the xsl:variable element - be careful to leave this comment alone unless you are preparing to invoke ModifyFilterTitle.py on all the tiers where the filter exists.

Sound reasonable?
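
A minimal sketch of reading and injecting the title comment described above, assuming the comment always has the form shown ("Filter title:" followed by the title). The function names are hypothetical and this is not the script referred to in the ticket:

```python
import re

TITLE_COMMENT = re.compile(r"<!--\s*Filter title:\s*(.*?)\s*-->", re.IGNORECASE)

def get_filter_title(xml_text):
    """Return the title from the first title comment, or None if absent."""
    match = TITLE_COMMENT.search(xml_text)
    return match.group(1) if match else None

def ensure_filter_title(xml_text, title):
    """Insert the title comment, or correct an existing one, to match title."""
    if "--" in title:
        raise ValueError("an XML comment cannot contain '--'")
    comment = f"<!-- Filter title: {title} -->"
    if TITLE_COMMENT.search(xml_text):
        return TITLE_COMMENT.sub(lambda m: comment, xml_text, count=1)
    # No comment yet: place it right after the XML declaration, if any.
    decl = re.match(r"\s*<\?xml[^>]*\?>[ \t]*\r?\n?", xml_text)
    pos = decl.end() if decl else 0
    return xml_text[:pos] + comment + "\n" + xml_text[pos:]
```

Since a comment is not part of the stylesheet's set of declarations, an including filter and an included filter can each carry one.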

Comment entered 2014-01-10 13:54:41 by alan

I can't think of any downside of doing it this way. It seems fine to me.

Comment entered 2014-01-10 15:05:08 by Englisch, Volker (NIH/NCI) [C]

"When I woke up this morning, I realized that I can't use <xsl:variable/> elements ..."

I'm sorry to hear you're waking up with a nightmare. :-)

I'm guessing that many (maybe all?) filters that are stored in the new format already include this "Filter title" comment. I don't see a problem with using this approach instead but I also don't understand the problem. Are we already using the variable "cdr-filter-title" anywhere else? Under what circumstances would we run into a conflict?

Comment entered 2014-01-10 15:23:40 by Kline, Bob (NIH/NCI) [C]

Filter A includes filter B. Both filters have a "cdr-filter-title" variable. After the inclusion takes place, we now have two variables with the same name within the same scope. The XSL/T engine will complain.
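
The collision can be illustrated without running an XSL/T engine. The sketch below only collects the top-level variable names of a stylesheet and of everything it pulls in with xsl:include, then reports names declared more than once. The dictionary of stylesheets keyed by href is a stand-in for however the includes are really resolved, and the function names are hypothetical:

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

def global_variables(name, stylesheets, seen=None):
    """List (variable name, declaring stylesheet) pairs, following xsl:include."""
    seen = set() if seen is None else seen
    if name in seen:
        return []
    seen.add(name)
    root = ET.fromstring(stylesheets[name])
    # findall with a bare tag looks at direct children only, i.e. globals.
    found = [(var.get("name"), name) for var in root.findall(XSL + "variable")]
    for inc in root.findall(XSL + "include"):
        found += global_variables(inc.get("href"), stylesheets, seen)
    return found

def duplicate_globals(name, stylesheets):
    """Map each global variable declared more than once to its declaring files."""
    owners = {}
    for var, source in global_variables(name, stylesheets):
        owners.setdefault(var, []).append(source)
    return {var: srcs for var, srcs in owners.items() if len(srcs) > 1}
```

Included declarations share the including stylesheet's import precedence, so any name this reports would be declared twice at the same level whether or not the variable is ever referenced.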

Comment entered 2014-01-10 15:32:26 by Englisch, Volker (NIH/NCI) [C]

So, the XSL/T engine will complain about the existence of the variable. I thought since we're not actually using it anywhere it should be OK. I guess not.

Comment entered 2014-01-13 10:49:35 by Kline, Bob (NIH/NCI) [C]

Steps 1 and 2 have been done. You'll want to refresh your filter sandboxes before doing further work on any of the filters.

Comment entered 2014-01-13 12:20:28 by Englisch, Volker (NIH/NCI) [C]

Done. My directories are so empty now. I like it.

Comment entered 2014-01-13 15:32:20 by Kline, Bob (NIH/NCI) [C]

I have finished the work on the tool adjustments. Please test. A good test plan would include:

  • create one or more test filters on PROD (CreateNewFilter.py)

  • check it/them into Subversion

  • install it/them on lower tiers (InstallFilter.py)

  • edit the filter doc in your sandbox

  • use UpdateFilter.py to install the changes (on each tier)

  • change the filter title (ModifyFilterTitle.py, all tiers)

  • remove the test filters from subversion

  • remove the test filters from the CDR servers (https://cdr.cancer.gov/cgi-bin/cdr/del-some-docs.py)

Here are the tools:

  • R12278 trunk/Bin/CreateNewFilter.py

  • R12278 trunk/Bin/InstallFilter.py

  • R12278 trunk/Bin/UpdateFilter.py

  • R12278 trunk/Bin/ModifyFilterTitle.py

If your testing comes out OK, we can install the new versions in d:\cdr\Bin on the lower tiers (C:\cdr\Bin on the bastion host for the upper tiers); meanwhile, it's best to invoke the new versions with specific paths to your sandbox.

Comment entered 2014-01-14 17:54:49 by Englisch, Volker (NIH/NCI) [C]

I've created a new filter on PROD, moved it to DEV, installed it, updated it, and modified its name and everything worked as expected (and documented).

Comment entered 2014-01-14 17:55:59 by Englisch, Volker (NIH/NCI) [C]

Verified on DEV/PROD

Comment entered 2014-01-23 08:35:29 by Kline, Bob (NIH/NCI) [C]

The tools have been installed in \cdr\bin on all four tiers. Closing ticket.
