Automating Form Login
By Greg_Gutkin on Dec 13, 2012
A common task in configuring a web application for proxying in Pagelet Producer is setting up form autologin. PP provides a wizard-like tool for detecting the login form fields, but this is usually only the first step in configuring this feature. If the generated configuration doesn't seem to work, some additional manual modifications will be needed to complete the setup. This article will try to guide you through this process while steering you away from common pitfalls.
For the purposes of this article, let's assume the following characteristics about your environment:
- Web Application Base URL: http://host/app (configured as Resource Source URL in PP)
- Pagelet Producer Base URL: http://pp/pagelets
Form Field Auto-Detection
Form Autologin is configured in the PP Admin UI under resource_name/Autologin/Form Login.
First, you'll enter the URL to the login form under "Login Form Identification". This will enable the admin wizard to connect to and display the login page.
Make sure the entered URL matches what you see in the browser's address bar when the application login page is displayed. For example, even though you may be able to reach the login page by simply typing http://host/app, the URL you end up on may change to http://host/app/login via browser redirect(s). The second URL is the one you will want to use.
Caution: External Login Servers
The login page may actually come from a different server than the application you are trying to proxy. For example, you may notice that the login page URL changes to http://hostB/appB. This is common when external SSO products are involved. There are two ways of dealing with this situation. One is to configure Pagelet Producer to participate in SSO. This approach is outside the scope of this article and is discussed in a separate whitepaper (TODO add link). The second approach is to use the autologin feature to provide stored credentials to the SSO login form. Since the login form URL is not an extension of the application base URL (PP resource URL), you will need to add a new PP resource for the SSO server and configure the login form on that resource instead of the original application resource. One side benefit of this additional resource is that it can be reused for other applications relying on the same SSO server for login.
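To make the "extension of the application base URL" condition concrete, here is a minimal sketch of the check you would otherwise do by eye. The class and method names are hypothetical and are not part of Pagelet Producer; the sketch only compares scheme, host, port, and path prefix.

```java
import java.net.URI;

// Illustrative helper, not a Pagelet Producer API: tells you whether a login
// page URL lives under a resource's base URL or needs a resource of its own.
final class LoginUrlScope {

    static boolean isUnderResource(String resourceBaseUrl, String loginPageUrl) {
        URI base = URI.create(resourceBaseUrl);
        URI login = URI.create(loginPageUrl);

        boolean sameServer = base.getScheme().equalsIgnoreCase(login.getScheme())
                && base.getHost().equalsIgnoreCase(login.getHost())
                && base.getPort() == login.getPort();

        // Compare whole path segments so that /application does not match /app.
        String basePath = base.getPath().endsWith("/") ? base.getPath() : base.getPath() + "/";
        String loginPath = login.getPath() + "/";

        return sameServer && loginPath.startsWith(basePath);
    }

    public static void main(String[] args) {
        // Login page served by the proxied application itself.
        System.out.println(isUnderResource("http://host/app", "http://host/app/login")); // true
        // Login page served by an external SSO server.
        System.out.println(isUnderResource("http://host/app", "http://hostB/appB"));     // false
    }
}
```

If the check returns false for the URL you see in the address bar, the login form belongs on a separate resource, as described above.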
After entering the login page URL (make sure the dropdown says "URL"), click "Automatically Detect Form Fields". This will bring up the web app's login page in a new browser window. Fill it out and submit it as you normally would. If everything goes right, Pagelet Producer will intercept the submitted values and fill out all the needed configuration data in the Admin UI. If the login form window doesn't close or the configuration data doesn't get filled in, you may not have entered the login page URL correctly. Review the two cautionary notes above and make any necessary changes.
If the form fields got filled automatically, it's time to save the configuration and test it out. If you can access a protected area of the backend application via a proxied PP URL without filling out its login form, then you are pretty much done with login form configuration. The only other step you will need to complete before declaring this aspect of configuration production ready is configuring form field source. You may skip to that section below.
Manual Login Form Identification
Let's take a closer look at Login Form Identification. This determines how Pagelet Producer recognizes login forms as such.
URL
The most efficient way of detecting login forms is by looking at the page URL. This method can only be used under the following conditions:
- Login page URL must be different from the post login application URLs.
- Login page URL must stay constant regardless of the path it takes to reach the page. For example, reaching the login page by going to the application base URL or to a specific protected URL must result in a redirect to the same login page URL (query string excluded). If only the query string parameters change, just leave out the query string from the configured login page URL.
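The two conditions above amount to comparing URLs with the query string left out. The following is a minimal sketch of that comparison, assuming a plain string match; the article does not describe how Pagelet Producer performs the match internally, and the class and method names are hypothetical.

```java
// Illustrative only: shows why a login URL that differs solely in its query
// string can still be identified by a single configured URL.
final class LoginUrlMatcher {

    // Drops everything from the first '?' onward.
    static String stripQuery(String url) {
        int q = url.indexOf('?');
        return q >= 0 ? url.substring(0, q) : url;
    }

    static boolean isLoginPage(String configuredLoginUrl, String requestedUrl) {
        return stripQuery(configuredLoginUrl).equals(stripQuery(requestedUrl));
    }

    public static void main(String[] args) {
        String configured = "http://host/app/login";
        // Same page reached from a protected URL: only the query string differs.
        System.out.println(isLoginPage(configured, "http://host/app/login?from=pageA")); // true
        // A post-login application page.
        System.out.println(isLoginPage(configured, "http://host/app/home"));             // false
    }
}
```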
RegEx
If the login page URL is not uniform enough across all scenarios or is indistinguishable from other page locations, PP can be configured to recognize it by looking at the page markup itself. This is accomplished by changing the dropdown to "RegEx". If regular expressions scare you, take comfort from the fact that in most cases you won't need to enter any special regex characters. Let's look at an example. Say you have a login form that looks like this:
<form id='loginForm' action='login?from=pageA' > <input id='user'> <input id='pass'> </form>
Since this form has an id attribute, you can be reasonably sure that this login form can be uniquely identified across the web application by this snippet: "id='loginForm'". (Unless, of course, your backend web application contains login forms for other apps.) Since no wildcards are needed to find this snippet, you can just enter it as is into the RegEx field - no special regular expression characters needed!
If the web developer who created the form wasn't kind enough to provide a unique id, you will need to look for other snippets of the page to uniquely identify it. It could be the action URL, an input field id, or some other markup fragment. You should abstain from using UI text as an identifier, because it may change in translated versions of the page and prevent the login page logic from working for international users. You may need to turn to regular expression wildcard syntax if no simple matches work. For more information on regular expressions, refer to the Resources section.
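You can try a candidate snippet against saved page markup before entering it in the Admin UI. The sketch below uses the standard java.util.regex classes; the class and method names are hypothetical, and the second markup string is an invented example of a non-login page.

```java
import java.util.regex.Pattern;

// Illustrative check: does a page contain the snippet that identifies the login form?
final class LoginFormDetector {

    static boolean containsLoginForm(String regex, String pageMarkup) {
        // find() looks for the pattern anywhere in the markup, so a plain
        // literal snippet works without any wildcard characters.
        return Pattern.compile(regex).matcher(pageMarkup).find();
    }

    public static void main(String[] args) {
        String loginPage = "<form id='loginForm' action='login?from=pageA' > "
                + "<input id='user'> <input id='pass'> </form>";
        String otherPage = "<form id='searchForm' action='search'> <input id='q'> </form>";

        System.out.println(containsLoginForm("id='loginForm'", loginPage)); // true
        System.out.println(containsLoginForm("id='loginForm'", otherPage)); // false
    }
}
```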
Form Submit Location
Now we'll look at the form submit location. If the captured URL contains query string parameters that will likely change from one form submission to the next, you will need to change its type to RegEx. This type will tell Pagelet Producer to parse the login page for the action URL and submit to the value found. The regular expression needs to point at the actual action URL with its first grouping expression.
Taking the example form definition above, the form submit location regex would be:
The parentheses are used to identify the actual action URL, while the rest of the expression provides the context for finding it. Expression .*? is a so-called reluctant wildcard: it matches as few characters as possible, so the match stops at the first single quote that follows. See Resources section below for further information on regular expressions.
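The following sketch shows how a first-group match pulls the action URL out of the example form. The pattern action='(.*?)' is an assumption chosen to fit the description above (parentheses around the URL, a reluctant wildcard, a closing single quote); it is not copied from the Admin UI. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: extracts the form's action URL using the first capturing group.
final class ActionUrlExtractor {

    // Assumed pattern; adjust the surrounding context to match your own markup.
    static final String SUBMIT_LOCATION_REGEX = "action='(.*?)'";

    static Optional<String> extractAction(String regex, String pageMarkup) {
        Matcher m = Pattern.compile(regex).matcher(pageMarkup);
        // group(1) is the text matched inside the first pair of parentheses.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String loginPage = "<form id='loginForm' action='login?from=pageA' > "
                + "<input id='user'> <input id='pass'> </form>";
        System.out.println(extractAction(SUBMIT_LOCATION_REGEX, loginPage).orElse("(no match)"));
        // prints: login?from=pageA
    }
}
```

With a greedy .* in place of .*?, the group would run on to the last single quote in the markup, which is why the reluctant form matters here.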
Manual Form Field Detection
If the Admin UI form field detection wizard fails to populate the login form configuration page, you will have to enter the fields by hand. Use a built-in browser developer tool or add-on (e.g. Firebug) to inspect the form element and its child input elements. For each input element (including hidden elements), create an entry under Form Fields. Change its Source according to the next section.
Form Field Source
Change the source of any of the fields not exposed to the users of the login form (i.e. hidden fields) to "Generated". This means Pagelet Producer will just use the values returned by the web app rather than supplying values it stored. For fields that contain sensitive data or vary from user to user (e.g. username & password), change the source to User (Credential) Vault.
User Vault Considerations
In order to store credentials tied to a particular user (User Vault option), the user must be authenticated with Pagelet Producer. There are several ways to establish a user session with PP.
The simplest way, which can be used for testing, is to log in to the PP Admin UI prior to rendering a particular pagelet. This obviously isn't realistic for actual deployments, since most pagelet users will not have access to the Admin UI.
A more realistic way to authenticate a user is to protect the PP Resource with a role-based Policy. In the PP Admin UI, navigate to the Resource / Policy and click 'Create'. Then enter the role(s) you want to restrict access to. These roles must correspond to roles defined by your J2EE Container (WLS/WAS). After configuring the Policy, whenever an unauthenticated user requests a pagelet on that resource, he/she will be presented with a login form forcing container-based authentication.
The final authentication option is to use an SSO provider, such as OAM.
User and shared credentials are stored in the Credential Store under the map "ensemble". The credential key is formed by combining the username of the user it belongs to (blank if shared) and the key used in the PP Admin UI.
If you see an error message stating "Cannot retrieve credential vault data for Anonymous user.", this indicates that the user hasn't authenticated with PP and so can't access the credential vault.
Let's look at an example of login form configuration for Atlassian Confluence Wiki.
The values provided by users for os_username and os_password are saved in the Credential Vault using keys 'wiki_user' and 'wiki_pass' respectively. The key names are arbitrary. The value for os_destination is marked Generated, which means it is taken directly from the markup returned by the backend server.
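The configuration just described can be pictured as a small table of field name, source, and vault key. The sketch below models that table and shows how the submitted values would be assembled from the two sources. It is an illustration of the idea only: the types, the method, and the sample values (jdoe, secret, /index.action) are hypothetical, and Pagelet Producer's actual implementation is not shown in this article.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model of the Confluence form field configuration.
final class ConfluenceLoginSketch {

    enum Source { GENERATED, USER_VAULT }

    static final class Field {
        final String name;      // name of the form input
        final Source source;    // where the submitted value comes from
        final String vaultKey;  // only meaningful for USER_VAULT fields

        Field(String name, Source source, String vaultKey) {
            this.name = name;
            this.source = source;
            this.vaultKey = vaultKey;
        }
    }

    // Field names and vault keys as given in the example above.
    static final List<Field> FIELDS = List.of(
            new Field("os_username", Source.USER_VAULT, "wiki_user"),
            new Field("os_password", Source.USER_VAULT, "wiki_pass"),
            new Field("os_destination", Source.GENERATED, null));

    static Map<String, String> buildSubmission(List<Field> fields,
                                               Map<String, String> valuesFromMarkup,
                                               Map<String, String> userVault) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Field f : fields) {
            // Generated fields reuse what the backend sent; vault fields use stored credentials.
            String value = f.source == Source.GENERATED
                    ? valuesFromMarkup.get(f.name)
                    : userVault.get(f.vaultKey);
            out.put(f.name, value);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> fromMarkup = Map.of("os_destination", "/index.action"); // hypothetical value
        Map<String, String> vault = Map.of("wiki_user", "jdoe", "wiki_pass", "secret"); // hypothetical values
        System.out.println(buildSubmission(FIELDS, fromMarkup, vault));
    }
}
```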
Here's how the field keys are used in the Credential Store in Enterprise Manager:
The entered keys are combined with the captured username (in this case, 'weblogic') and stored under the 'ensemble' map.
Logging Support
To help you troubleshoot your autologin configuration, PP provides some useful logging support. To turn on detailed logging for the autologin feature, navigate to Settings in the Admin UI. Under Logging, change the log level for AutoLogin to Finest.
Resources
RegEx
RegEx Reference from Java
RegEx Test Tool