Issue Number | 330 |
---|---|
Summary | [Accounts] Migrate to NIH Federated Login (OpenID) |
Created | 2015-10-05 13:51:40 |
Issue Type | Improvement |
Submitted By | henryec |
Assigned To | alan |
Status | Closed |
Resolved | 2016-04-07 14:57:04 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/oceebms/issue.171420 |
Migrate off NCI's eDIR to use NIH Federated Login. Will need more discussion about how this will change the login screen, and which OpenID methods we want to allow (e.g., Google and/or PayPal).
Documentation:
Non-Technical Overview for Application Owners: http://isc.nih.gov/ISCSupport/Login/NIHLoginOverView.ppt
Technical Overview for Application Developers / System Administrators: http://isc.nih.gov/ISCSupport/Login/NIHLoginTechOverView.ppt
Information about Headers / Use of User Attributes for Authentication & Authorization Purposes & Determining Level of Assurance: http://isc.nih.gov/ISCSupport/Login/UserAttributesforNIHApplications.docx
Information about Integrating with PIV cards: http://isc.nih.gov/ISCSupport/Login/NIHLoginPIVCards.ppt
I've been going over the documentation given to us. The docs, especially the PowerPoint, strike me as very high level but quite good. I've also skimmed through some of the SiteMinder docs I found on the Net, and the MediaWiki docs for an open source PHP plugin that, hopefully, will work for us.
In addition to us and the NIH Login support group, it looks like significant work will need to be done by CBIIT. They will have to install and configure the SiteMinder "Web Agent" and the PHP plugin on each of our four tiers, and open the network ports needed for it to communicate with the NIH SiteMinder service. They'll have to work with NIH to get the configuration right and maybe test that it works.
When that's done, the next step might be for us to build a proof-of-concept test program. Its only purpose would be to experiment with the techniques we have to use to get what we need from NIH SiteMinder. Once we have a working model of each piece of functionality we need (forward requests to the Web Agent, get objects back for successful and failed logins, query the objects for the info we want, etc.), then we can build a module to use in the EBMS, hopefully in a way that can be easily applied to our other Drupal systems in the future.
I have lots of questions but, unless I can start asking them of the NIH Login support group, I don't think there's much more for me to do at this time.
The documentation attached to this issue that explains "User Attributes for NIH Applications" appears to be out of date. Some of the headers described in that document published in 2009 are not found when I examine what I get in EBMS. I've therefore relied on what I found rather than that document.
The NIH SSO login offers exactly two login options: NIH and Google. The NIH login works as before. The Google login is the one that I've experimented with. It seems to me that we can use the following headers to find out what kind of login we are dealing with and who the user is.
A number of headers can tell us which kind of ID we have. These include SM_USERDN, SM_AUTHDIRNAM, USER_AUTH_TYPE, USER_AUTHN_SOURCE, and others.
The one I like best is HTTP_USER_ORG. I've found two values there:
"NIH"
"Google"
USER_AUTHN_SOURCE or USER_AUTH_TYPE can tell us whether we are getting information from NIH or non-NIH, but HTTP_USER_ORG gives us that information and also names the actual source of the federated login, in this case "Google". If NIH ever supports additional sources, I would hope that the test for equality with "NIH" will continue to work, and HTTP_USER_ORG will also tell us when the actual source is something other than Google, which we need to know and don't get from the other headers.
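A minimal Python sketch of the HTTP_USER_ORG test described above. The function name `classify_login` and the plain-dict view of the request headers are assumptions for illustration only; the real check would live in the PHP nci_SSO module.

```python
def classify_login(headers):
    """Return (is_nih, provider) based on the HTTP_USER_ORG header.

    `headers` is assumed to be a dict of the request's server variables.
    """
    org = headers.get("HTTP_USER_ORG", "").strip()
    if not org:
        raise ValueError("HTTP_USER_ORG header is missing or empty")
    # Equality with "NIH" identifies an NIH login; any other value names
    # the federated provider (only "Google" has been observed so far).
    return org == "NIH", org
```

Returning the provider name along with the NIH flag keeps the check usable if NIH ever adds sources other than Google.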
In addition to knowing the organization providing authentication, we also need to know the identity of the user. Not surprisingly, NIH has no mapping of Google IDs to NIH IDs, so what we get from the federated source is something else.
I found two useful IDs in the Google derived headers. They are:
HTTP_USER_EMAIL
Contains my NIH "alan@mail.nih.gov" email address when logging in to NIH, and my Google account email address when logging in via Google OpenID. Users can establish a Google account address with any email they wish. It doesn't have to be Gmail. Mine is actually a Yahoo address.
HTTP_SM_USER
Contains "alan" for an NIH login. Contains "google_123456789012345678901@nih.gov" for a Google login. (I presume that the actual 21 digits are different for each user, but they seem to be identical between invocations for the same user.)
If we are going to build our own map of user identities to Drupal user records, I think the HTTP_USER_EMAIL is best. Most users who have a Google account will probably know the email address they provided to Google and can tell us what it is. The other identifier will be opaque to everyone and would be very difficult for us to find, even if it's more stable than the email address - which it may or may not be.
So, the two new headers we need to look for are:
HTTP_USER_ORG
HTTP_USER_EMAIL
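As a rough illustration, the lookup key for our own user map could be chosen like this. This is only a sketch: `resolve_identity` is a hypothetical name, and keying NIH logins on HTTP_SM_USER is an assumption based on the values observed above, not something taken from the module.

```python
def resolve_identity(headers):
    """Return (org, identifier) to look up in our own user map."""
    org = headers.get("HTTP_USER_ORG", "").strip()
    if org == "NIH":
        # Assumption: NIH logins keep using the NIH account name.
        return org, headers.get("HTTP_SM_USER", "").strip()
    # Federated logins: use the email address the user gave the provider,
    # since the opaque "google_...@nih.gov" value is unknown to everyone.
    return org, headers.get("HTTP_USER_EMAIL", "").strip()
```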
I have attached a text file "UserPerspective.txt" explaining how the new OpenID login is projected to work for board member users and for OCE staff who register and maintain user accounts.
If there is anything untoward in this text, please let us know.
Attaching revised draft 2 of the user perspective document incorporating user feedback and adding more detail.
A couple of comments on the user perspective document.
Not all users will be prompted for their Google credentials. If you're already logged into your Google account for some other purpose that step will be skipped (but not the one-time step of granting NIH permission to use the basic account information).
I'd be inclined to continue with just a single field for the authname value (with surrounding description to explain that it can contain either an NIH domain account name or an external account ID).
A couple additional comments/questions:
1. I think the link to add a new user should say "+ NIH or OpenID SSO user" so that it is obvious that it covers more than just NIH users.
2. Would HHS (but non-NIH) employees use the SSO login or OpenID? Right now, these users (for example, a Board member who works at AHRQ) use eDir.
Answer to #2: OpenID.
OK. Is it possible for the menu a user gets to have options for "NIH Login" and "OpenID Login" then? Right now, the options are listed as "HHS Login" and "Social Login/Open ID".
Alan can ask them, but I won't be surprised if they say they won't change it. Can't hurt to ask, though.
I can check but, like Bob, I wouldn't expect much. I think the same login page is intended for many different systems, not just ours.
~BKline
I'd be inclined to continue with just a single field for the authname value (with surrounding description to explain that it can contain either an NIH domain account name or an external account ID).
I thought about that and see some advantages to it, but chose the two field approach because it allows more specialized validation of the OpenID. However it's not hard to change it to a single field. Let's discuss it on Tuesday morning and I'll change it if we think that's best.
I have attached an Excel spreadsheet "BoardMembersOpenIDs.xls" that can be used for entering all of the Google Account email addresses for OpenID logins. The list of members is current from the production database as of today.
We should enter the undecorated email addresses, e.g., "joe.smith@somemail.com". I'll add a "mail:" prefix that we are using to each one programmatically, when I process the data to enter into the database.
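The decoration step amounts to something like the following sketch. The helper name `decorate` is hypothetical, and leaving already-prefixed values alone is an added safety assumption, not a description of the actual script.

```python
MAIL_PREFIX = "mail:"


def decorate(email):
    """Add the "mail:" prefix to an undecorated email address."""
    email = email.strip()
    if email.startswith(MAIL_PREFIX):
        # Assumption: tolerate values that were already decorated.
        return email
    return MAIL_PREFIX + email
```

For example, `decorate("joe.smith@somemail.com")` gives `"mail:joe.smith@somemail.com"`.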
I tested the revised nci_SSO.module containing the new OpenID functionality on NINR and MyNCI. I couldn't find any change in behavior between the old and new modules when logging in as myself.
AccrualNet doesn't use nci_SSO, so it necessarily has no effect there.
I've left the new, 1.4 version of nci_SSO.module on DEV in the sites/all... directory. I renamed the one I found there as nci_SSO.module.ok.
I created a new Board manager account tied to a Google login and it's working great so far!
I've written a Python script to read the spreadsheet after it's filled out and post the Google account email addresses into the database. I probably went overboard on error checking to reduce the possibility of corrupting anything.
I tested it successfully on DEV by putting in an email address in one of the rows and some clearly wrong information in other rows. The wrong rows were rejected. The rows without email addresses were rejected. The correctly formatted row was accepted and the change went into the database.
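The accept/reject behavior described above can be sketched roughly as follows. This is not the actual script: the row layout (user ID, email), the function name `split_rows`, and the deliberately loose email pattern are all assumptions, and reading the .xls file itself is left out.

```python
import re

# Loose sanity check only: something@something.something, no whitespace.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def split_rows(rows):
    """Split (user_id, email) rows into accepted and rejected lists."""
    accepted, rejected = [], []
    for row_number, (user_id, email) in enumerate(rows, start=1):
        email = str(email or "").strip()
        if not isinstance(user_id, int) or user_id <= 0:
            rejected.append((row_number, "bad user id"))
        elif not email:
            rejected.append((row_number, "no email address"))
        elif not EMAIL_PATTERN.match(email):
            rejected.append((row_number, "malformed email address"))
        else:
            accepted.append((user_id, email))
    return accepted, rejected
```

Collecting the rejected rows with a reason, rather than stopping at the first bad row, makes it possible to report every problem in the spreadsheet in one pass before anything is written to the database.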
When we get the data spreadsheet we'll be ready for a bigger test.
Bob/Alan, would it be possible to upload a new version of the OpenID spreadsheet that has each Board on a separate sheet? It will be easier to divide up the work that way. Thanks!
New workbook with separate sheets for the boards attached.
Sorry Bob, one more request. Would it be possible to limit the spreadsheet to active, non-SSO Board members? We can filter out the SSOs if need be - there aren't many of them. But there are quite a few inactive Board members on the list so it would be helpful to remove them. Thanks.
Just to make sure we're on the same page: we're talking about the status column of the users table, right?
You're not asking me to exclude users who haven't logged on to the system in the past few months (or ever), are you?
That's correct - users who are marked as inactive in that table. Thanks!
Here you go.
While testing the new SSO/OpenID software on the new QA I ran into a bug in my software.
I entered the same email address twice, attempting to create two users with one OpenID - something I had never tried before. The system created the new user, notified the admin user that the new user had been created, and sent mail to the user explaining how to login (there may be some issues with that too), and then crashed with a PDOException on violating an integrity constraint that authnames in the authmap table must be unique.
There are now two users in the users table, "Alan OpenID" and "Alan OpenID2", but only the first one has a user id entry in the authmap table.
The program has to be revised to check for possible errors BEFORE taking further action. I'm not going to work on this tonight, but I'm recording it here in order to be sure everyone is aware of the problem. I hope to fix the problem soon.
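One way to check before acting, sketched in Python with an in-memory SQLite database rather than the real Drupal/PDO code. The two-table schema is a simplified stand-in for Drupal's users and authmap tables, and `create_user_with_authname` and the example email address are hypothetical.

```python
import sqlite3


def create_user_with_authname(conn, name, authname):
    """Create a user and its authmap row together, or do nothing at all."""
    taken = conn.execute(
        "SELECT 1 FROM authmap WHERE authname = ?", (authname,)
    ).fetchone()
    if taken:
        return None  # report the duplicate before anything is created
    with conn:  # commits on success, rolls back if either insert fails
        cursor = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        uid = cursor.lastrowid
        conn.execute(
            "INSERT INTO authmap (uid, authname) VALUES (?, ?)",
            (uid, authname),
        )
    return uid


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE authmap (uid INTEGER, authname TEXT UNIQUE)")

first = create_user_with_authname(conn, "Alan OpenID", "mail:alan@example.com")
second = create_user_with_authname(conn, "Alan OpenID2", "mail:alan@example.com")
```

Here the second call returns None and leaves no orphaned row in the users table. Any notification mail would be sent only after a user ID comes back.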
... and sent mail to the user explaining how to login (there may be some issues with that too)
The issue in this procedure occurs if the admin user checks the box that says "Notify user of new account". The notification comes from the Drupal user module and gives instructions to the user that don't apply and won't work in our SSO, eDir, and OpenID logins.
I don't know if this should be called a bug or not. Our EBMS Site Managers probably already know about this and know not to check that box. I'm not sure any harm is done to the database if the box is checked. It may just be that an inappropriate and confusing email goes out. I don't know how difficult it would be to remove the checkbox. I won't look at this again unless someone decides it's an issue that needs to be addressed.
I've attached a spreadsheet containing the Google-association e-mail accounts that we've garnered so far.
Some questions:
1. Some members on this spreadsheet are no longer on a Board. I can mark them as inactive in the system, but my question is do you want them removed from the spreadsheet?
2. Some new members have been added. Do you want to handle them as part of the cutover or add them separately after we switch to OpenID? My sense is that it may be more straightforward to add them after we make the switch since they do not have NCI network accounts.
~JutheR Answers to questions:
I went ahead and created a copy of the data in the spreadsheet, removing all users who were marked as NIH, NCI employees, N/A, Off Board, or New, or for whom there was no Google account email listed. We decided at our status meeting today that we would handle the new users by hand and all of the others will not, initially, have OpenID access. I also fixed a couple of minor character problems.
The total of OpenID user/email combinations remaining is 73. That's out of a total of 120 users marked as active and currently using eDir authentication. We know that not all of those marked active in the database are actually still active, so the count is not as bad as it might seem.
The plan is to convert all of them during the switchover next Wednesday.
I tested this on old QA and it's working well - I set up an OpenID account for myself and then logged in successfully. Thank you!
I have updated the nci_SSO module in subversion:
Created a tag "1.3" for the Apache 2.4 header modifications made by Dan.
Merged branch 1.4 into 1.3.
Merged branch 1.3 (now including 1.4) into trunk.
Created a tag "apache2.4_OpenID" for the latest trunk with latest merges.
Deleted branch 1.4
Deleted branch 1.3
I've copied all the latest code to /local/drupal/sites/all/modules/Custom/nci_SSO. The only differences between it and what was there are the svn $Id tags.
Verified on PROD. This is working great and our Board members love it!
File Name | Posted | User |
---|---|---|
board-member.xls | 2016-01-15 09:57:54 | Kline, Bob (NIH/NCI) [C] |
Board Member Google OpenIDs.xls | 2016-02-04 13:23:46 | Juthe, Robin (NIH/NCI) [E] |
board-members.xls | 2016-01-04 14:47:45 | Kline, Bob (NIH/NCI) [C] |
BoardMembersOpenIDs.xls | 2015-12-11 00:14:25 | |
UserPerspective.txt | 2015-12-01 17:24:49 | |
UserPerspective02.txt | 2015-12-03 20:07:19 | |