CDR Tickets

Issue Number 3622
Summary (4.11.1) CDR Archival Service Script
Created 2012-10-01 10:54:22
Issue Type Task
Submitted By alan
Assigned To alan
Status Closed
Resolved 2014-04-08 11:12:26
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.81021
Description

Migrate Archival Service Script from CTB to CBIIT servers

Comment entered 2013-07-10 11:39:11 by chengep

Hi Volker, I am doing some clean-up with all the CDR data center migration stories. Is this the story where we said we can take care of after we finish migrating to CBIIT prod (which we just about have)? Thanks, Erika.

Comment entered 2013-07-10 11:47:42 by Englisch, Volker (NIH/NCI) [C]

Yes it is and the best person to work on this would be Alan.

Comment entered 2013-07-10 11:56:16 by chengep

Alan, we can strategize about how to best address this (whether it has to go into a "release" or can be done separately). Thanks, Erika

Comment entered 2013-07-10 12:20:39 by chengep

Changed fix version from CDR CBIIT DEV to CDR Final, and took off "(CBIIT DEV)" from the summary field.

Comment entered 2013-07-15 13:34:12 by alan

The script migrates archival documents, i.e., previous versions of documents which are never allowed to change from the actively updated "cdr" database to the "cdr_archived_versions" database - which does not change until the next time the script is run.

This is something that is done periodically, usually after at least a year has passed from the last time. The main consequence of not doing it is that backup times gradually get longer and the time taken to refresh the database on DEV or QA from PROD takes longer. However there is no impact one way or the other on users.

It's not part of a "build" in that the software that does this doesn't go into regular production, unless we count once a year as "regular".

The scripts that do the work were written in a very different environment from CBIIT and would have to be carefully tested. I therefore think we should only work on this after we think the more important user oriented issues have settled down.

Comment entered 2013-09-23 11:52:05 by Kline, Bob (NIH/NCI) [C]

I've raised the priority on this issue, as we haven't done this cleanup since 2011.

Comment entered 2014-02-18 23:53:56 by alan
I've made what I think are all of the required modifications to
ArchiveDocVersionXML.py, to wit:

    Eliminated testing of SQL Server database and log file sizes.
    That can't be done in the CBIIT environment.

    Commented out the CDR login requirement.  This was never
    actually used for any purpose except to make sure that the
    person running the script was a valid CDR user.  That may not
    be the case if CBIIT ever runs the script.

    Changed the default from single-user to multi-user database
    mode.  I don't know if we or the CBIIT web team even have the
    rights to put the database in single user mode, and I fear
    that the side effects are much less controllable here.  I
    haven't removed the code for single user mode, but the
    command for invoking it is not on the help screen.  Only a
    programmer reading the code would find it.

    Modified the logfile naming so that a user can run the
    program on a bastion host without needing access to the
    d:\cdr\log
    
I've tested the new version and it appears to work.  The new code
is in svn.

I ran a reportcopy command on Dev and got the following results:

-------------- Dev ---------------
Program parameters:
    command       = reportcopy
    cfgBatchSize  = 1000
    cfgBatchCount = 999999999
    cfgSingleUser = False
    logFullName   = d:\home\alan\ArchiveDocVersionXML.log

Tue Feb 18 22:30:01 2014: Will execute query:
    SELECT COUNT(DISTINCT v.id), COUNT (v.num)
      FROM all_doc_versions v
     WHERE v.xml IS NOT NULL
       AND NOT EXISTS (
         SELECT a2.id, a2.num
           FROM cdr_archived_versions..doc_version_xml a2
          WHERE a2.id = v.id
            AND a2.num = v.num
       )

Tue Feb 18 22:30:01 2014: Starting query
Tue Feb 18 22:42:12 2014: Finished query
Tue Feb 18 22:42:12 2014: Total rows that would be affected:
        Unique documents:    67172
         Unique versions:   309034
Tue Feb 18 22:42:12 2014: Completed without errors
-----------------------------------------

Here's a corresponding run from OCE Mahler:
-------------- OCE Mahler ---------------
Program parameters:
    user          = ahm
    command       = reportcopy
    cfgBatchSize  = 1000
    cfgBatchCount = 999999999
    cfgSingleUser = False

Fri May 07 00:19:12 2010: Will execute query:
    SELECT COUNT(DISTINCT v.id), COUNT (v.num)
      FROM all_doc_versions v
     WHERE v.xml IS NOT NULL
       AND NOT EXISTS (
         SELECT a2.id, a2.num
           FROM cdr_archived_versions..doc_version_xml a2
          WHERE a2.id = v.id
            AND a2.num = v.num
       )

Fri May 07 00:19:12 2010: Starting query
Fri May 07 00:26:42 2010: Finished query
Fri May 07 00:26:42 2010: Total rows that would be affected:
        Unique documents:    80658
         Unique versions:   360400
-----------------------------------------

Dev took 12 minutes to do less work than was done in 7 minutes on
Mahler.
Comment entered 2014-02-19 13:04:28 by Englisch, Volker (NIH/NCI) [C]

Dev took 12 minutes to do less work than was done in 7 minutes on Mahler.

I would be interested to know how long it will take on PROD. I bet it's much faster than 7 minutes.

Comment entered 2014-02-25 17:56:26 by alan
Nana Asare of the CBIIT DBA team refreshed the cdr database on QA (see
JIRA DBATEAM-850.)  He restored onto QA a backup of the Prod database
that was made yesterday, Feb 24, 2014 - which should work well for us.

The reportcopy query took only 3-3/4 minutes on the refreshed QA, faster
than OCE Prod and faster than OCE Dev.  It reported that the number of 
documents to be copied was:

        Unique documents:    79324
         Unique versions:   385432

These numbers look right.  A similar run on OCE Bach in 2011 produced:

        Unique documents:    78628
         Unique versions:   290213

I ran the full "copy" part of the archiving script on QA.  That copied
the 385,432 document versions from the cdr database to the
cdr_archived_versions database.  The time taken was 1:26:00 (one hour
and twenty-six minutes).

The run on OCE Bach to copy 290,213 document versions took 4:55:18.

It appears that any slowness we have seen on QA in the past has either
been corrected, or it was not due to a slow database server.
Comment entered 2014-04-08 11:12:26 by alan

All servers were updated and everything has been running fine for a while.

Comment entered 2014-05-01 15:16:15 by alan

Everything has worked fine for some time now.

Elapsed: 0:00:00.001564