Issue Number | 743 |
---|---|
Summary | Librarian Queue - Review decisions not getting recorded after adding topics/tags from a 2K+ list of sorted citations |
Created | 2023-03-02 17:54:49 |
Issue Type | Bug |
Submitted By | Boggess, Cynthia (NIH/NCI) [C] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2023-03-06 13:59:50 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/oceebms/issue.339868 |
The first thing I do when I start reviewing each month is to sort the entire queue by core journals so that I can review these citations first. Today while doing this with 2528 citations in my queue, I experienced a loss of decisions when trying to submit a page of 10 citations which had topics added during my review along with my review decisions for each. After clicking submit, the queue reloaded with the exact same citations – the topics I had added were there but my review decisions had not been recorded because all 10 of the citations were still in my queue.
I experienced this 3 times and one of those times was a set of 10 where I had added a topic to one citation and an internal tag to another. Pages where I was just recording my review decisions and had not added any tags or topics were being submitted and recorded with no issues.
This reviewing took place between 10am-2pm today.
After discussing with William, we decided to try to recreate and test this out on QA. I first imported a bunch of extra citations so that I could get a queue of at least 2500 citations to match the number in the live queue and because I am thinking that the size of the queue is a factor.
I was able to recreate the issue on QA.
Repro Instructions– go to the librarian queue, under Display options sort by core journals, keeping the default of 10 per page and click filter. Add a topic to 4 or 5 citations and add review decisions to all 10 citations. Make a note of the total number of citations in the queue and the citations you added topics to before you click submit. While the page is processing you can scroll and see that the decisions have already changed back to “none” and when the page fully loads, the number of citations should be the same because no decisions were recorded but the topics you added will be displaying for those citations.
The same test but adding 4 or 5 tags or internal tags instead of topics or a combo of both will also result in a loss of decisions.
Interestingly, if you do this same test but add a filter to Cancer Genetics (which will reduce the queue to about 700 or so citations) everything will get recorded as expected.
And also interesting – if you add 4 or 5 topics or tags to citations in the queue without any display options or filter options, everything will get recorded as expected.
Happy to demo if needed.
Note: Because this is how I always start my reviewing (with a sort by core journals of the entire queue), I specifically tested this on erc, dev and qa before deployment and everything was working. But during the testing phase, I don’t remember the review queue ever having more than maybe 1200-1500 citations at any given time which is why I think we did not see this issue. A starting queue in the 2500s is actually well below average prior to 2022. In previous years our average was over 3400 and never had issues like this in Prod when sorting by core journals. It did not occur to me that the size of the queue would be an issue. I assumed that if it worked with 1500 it would work with 2500 or even 3500 but might just be slower.
I'm investigating to try and track this down. It's unfortunate that we removed the summary display at the top of the queue showing the current decisions, as that would have helped detect and diagnose this problem more easily.
Well, that was quite an adventure! 😛
First of all, let me say again: excellent bug report. This is a model for how to do it right!
It turns out that the software is doing the right thing, it's just doing it too slowly (which, I'll be the first to admit, is in this case right next door to (if not indistinguishable from) "not doing the right thing").
Warning: this explanation is going to get more than a little geeky, so feel free to jump to the TL;DR at the bottom.
The code which updates the database each time you click a decision button is handled by an AJAX callback, which means a JavaScript function which is invoked asynchronously. The "asynchronous" part means that it's churning away in the background, not blocking you from proceeding with other actions. In the context of Drupal 9 the only way to handle this step of recording the decision in the database (which we do because of the requirement that the decisions you make are persistent, even as you page through the queue) is as part of the code which assembles and draws the form (which in this case includes the queue itself). It doesn't happen all the time, but using logic which is not really apparent Drupal decides that sometimes everything which is going to be drawn on the page has to be recalculated. I tried to find a way to call back to the server code directly from my own JavaScript but as far as I've been able to determine, Drupal 9 doesn't allow that. Because we have jettisoned the display of the decisions you have made for the current queue, you no longer have any visible way to tell when this background processing has completed. If you were to wait long enough after making your decisions (about 40 seconds or so on one of the slowest servers—for example, QA—in the worst case of a really big queue with a really slow sort) your decisions would have been saved permanently. But again, you have no way to know when that point has been reached.
The reason the problem didn't surface with most of the sort options is that they don't take very long. However the two sort options involving journals do take quite a long time, especially for larger queues. I was able to speed up the sort by journal title on the QA server sufficiently by adding an index that the problem reported here should be avoided. However, there's no reasonable way to sort efficiently on journal title with the core journals coming before the non-core journals. (I say "no reasonable way" because we could make such a sort efficient by storing the flag for whether the article is in a core journal directly in the record for the article, but that would mean maintaining that information in multiple places, with the extra overhead whenever changes are made to the journals, where we'd have to go through every one of the 900,000 articles to adjust that flag, and risk having the multiple storage places for this information get out of sync with each other.)
So I changed the "core journals sort" so that was really a filter, only showing articles which are in core journals, sorted by journal title, with a sub-sort by article title. That runs right away. My recommendation would therefore be to replace the "core journals sort" with a checkbox to indicate that only articles published in core journals should be in the queue. This would have the added benefit that it could be applied in combination with any of the other sorts.
So now all the sort options work fast enough on QA, even though it's a pretty sluggish server, that I can't reproduce the problem reported here on that tier. The changes I made also mean that coming back from the Add Topic or Add Article Tag pages no longer takes any appreciable time.
We can fix this problem by
adding an index on the journal title to the
ebms_article
table
replacing the "core journals sort" with a filter to restrict the queue to articles in core journals
In the meantime you have the workarounds you have identified (smaller queues or faster sorts).
Sounds good, I think a core journal filter to limit the queue will work.
However, is it possible to have both options available? because the BMs who work with smaller queues (and may not encounter the same issue reported here) may still want the option to sort by core journals.
The primary reason I would hesitate to do that is that it's dangerous. It relies on all users always remembering never to use that option for large queues (where the definition of "large" would depend on the load on the database from other applications as well as the web server at any given moment) because of the risk that they would not be watching as carefully as you were and therefore miss the fact that decisions they thought they had recorded have quietly vanished.
ok excellent point
Using the core journal sort this morning in prod and experiencing this issue again with a queue of about 2080:
In one case it took me 5 tries to get a citation with an added topic to record the review decision in a core journal sorted list even after adding a limit to board. The decision was finally recorded after logging out and back in and trying again.
AND I also noticed that decisions were not getting recorded with sets of citations (sorted by core journals) where only review decisions were being submitted. Unlike last week when I thought it was only happening when topics or tags were added along with review decisions.
AND lastly I confirmed that the green banner stating queued decisions applied is not displaying when this loss of decisions happens.
Fixed on https://ebms.rksystems.com and https://ebms4-dev.nci.nih.gov.
index on journal title added to article table
sort by core journals replaced by filtering option
Looks good on dev.
I added more core journal citations to the queue on dev before testing this.
I was unable to recreate the same issue I was having on prod using this new core journal filter option on dev. The new core journal filter option reduces the number in the queue and as a result it does not take as long to process adding topics/tags and returning to the queue. All decisions were recorded successfully even when I added a sort by author or journal.
One suggestion to save some space would be to remove the text "Filtering by journal" and "Only include articles published in core journals" and just have "Core Journals Only" next to the checkbox. But the BMs may want to give their input on this because this will be on their queue page as well.
~vshields want to weigh in on Cynthia's suggestions about scaling back the text around the new checkbox?
I'm looking at the text on DEV. Please update it to say "Core journals only" and add a period to the end of the explanation sentence, "Only include articles published in core journals." to be consistent with the other instructions. That's 2 "onlys" in a row, but we want the button's action to be clear. Thanks!
The latest wording changes for filtering by journal have been installed on DEV.
https://github.com/NCIOCPL/ebms/commit/f609507
Verified on QA
New "Core journals only filter" is working as expected and I have not been able to recreate decision loss while using it.
Elapsed: 0:00:00.000434