Issue Number | 3733 |
---|---|
Summary | Update scripts to preserve data on DEV after a database refresh |
Created | 2014-02-25 11:24:14 |
Issue Type | Improvement |
Submitted By | Kline, Bob (NIH/NCI) [C] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2014-03-17 12:36:06 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.118811 |
PullDevDocs.py -> PullDevData.py
PushDevDocs.py -> PushDevData.py
CheckDevData.py (new)
I've added a test document type (DadaForVolker) to DEV, so we can test restoration of a document type that doesn't exist at all in the PROD database, along with the new document type's documents. Here's an updated report on what's been preserved, including the new doctype.
Here's the documentation for the module which implements the supporting classes for the program whose source code we walked through yesterday. I've also posted the pydocs for the program itself. The source for the cdr_dev_data module is in Subversion:
R12392 trunk/lib/Python/cdr_dev_data.py
One advantage of the approach taken by the rewritten scripts is that the script to restore the documents can be run more than once, because the new PullDevData script captures all of the documents for each of the document types to be preserved (as well as all of the rows in the control tables). If, for example, the portion of that script which restores a document type which wasn't on PROD (along with its documents) fails (and that's the part most likely to have bugs, since it's the trickiest and uses relatively untested logic), the failure will occur after the restoration of the other (control) documents, and the script can be run again once the problem is fixed. The previous approach tried to determine at capture time which changes would be needed to restore the pre-refresh control documents. The new approach makes that determination at restoration time, by examining the current state of the DEV database and comparing it with the state we're trying to restore (as captured in the file system). This seems like a more robust approach. Another advantage of the new approach is that it eliminates a fragile assumption: that the state of the PROD database on which the capture script based its decisions is identical to the state represented in the PROD database backup which gets used for the refresh of DEV.
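As a rough illustration of the restoration-time comparison described above, here is a minimal sketch in Python. It is not the code in cdr_dev_data.py: the function names (`plan_restore`, `apply_plan`) and the use of plain dictionaries keyed by document title are assumptions made purely for the example, standing in for the files captured by PullDevData and the current DEV database.

```python
def plan_restore(captured, current):
    """Work out what must change so that `current` matches `captured`.

    Both arguments are hypothetical stand-ins: dictionaries mapping a
    document title to its XML, for the captured snapshot and for the
    database as it is right now.
    """
    # Documents in the snapshot that the refreshed database lacks.
    to_add = sorted(k for k in captured if k not in current)
    # Documents present in both, but whose content differs.
    to_update = sorted(
        k for k in captured if k in current and captured[k] != current[k]
    )
    return {"add": to_add, "update": to_update}


def apply_plan(captured, current, plan):
    """Apply the plan by copying captured content into `current`."""
    for key in plan["add"] + plan["update"]:
        current[key] = captured[key]
    return current


if __name__ == "__main__":
    # Made-up sample data, not real CDR documents.
    captured = {"Some Filter": "<filter v='dev'/>", "Dada doc": "<dada/>"}
    current = {"Some Filter": "<filter v='prod'/>"}

    plan = plan_restore(captured, current)
    print(plan)
    apply_plan(captured, current, plan)
    # A second comparison finds nothing left to do, which is what makes
    # it safe to run the restore again after a partial failure.
    print(plan_restore(captured, current))
```

Because the plan is computed from whatever state the database is in when the restore runs, a second run after a partial failure only picks up the work that is still outstanding.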
I think the only remaining window for unpleasant surprises would be the following sequence of events:
1. We capture the state of DEV we want to preserve
2. A developer changes that state on DEV
3. CBIIT replaces the database on DEV with the backup from PROD
The solution is to avoid that sequence (that is, make sure no changes are made to the state of the DEV database between the capture from which we will restore and the DB refresh from PROD). The fallback if we make a mistake is to find the developer's local copy of the change that didn't get preserved and manually restore it.
Does this all make sense?
Makes sense to me. I think this is an advance on what we had before.
Not sure how this got assigned to Erika, but I'm going to work on this task, so I reassigned it to myself.
I'm guessing Erika moved the task around in the Agile view, which tends to do funny things with the assignments, or maybe Erika really wanted to do some programming herself. :-)
The restore script has been run successfully on DEV. I have asked Volker and Alan to let me know if they see anything unexpected.
R12456 /trunk/DevTools/Utilities/PushDevData.py
File Name | Posted | User |
---|---|---|
cdr_dev_data.html | 2014-02-28 10:15:48 | |
DevDataCheck-20140226.html | 2014-02-26 16:17:38 | |
PushDevData.html | 2014-02-28 09:39:30 | |