Proposal for using microformats to auto-fill forms
Current auto-fill browser extensions use regular expressions or previous user choices to work out how to fill in common registration forms automatically. What we really need is a way to mark up forms semantically so that we can reliably auto-fill them using user-selected structured data. Apart from the obvious reliability issues, these applications also do not deal well with repeating patterns, such as loading lists of friends or multiple email addresses.
As I worked on the draggables project and started to explore the possibilities of the W3C Contacts API, this issue seemed to be a hidden blocker to a number of interesting UX possibilities. I spent a week reviewing the common patterns of interaction and came up with a raw concept of how microformats could be extended to help with this problem. The proposal below was posted on the microformats wiki and I hope to carry the idea forward in the future.
Design goals for “input microformats”
Use the current microformat authoring conventions and schemas where possible. This goal should be understood in the light of the fact that most microformats are designed as hierarchical structures, whereas forms are not. Some breaking changes may be required to deal with this fundamental difference, but these should be kept to a minimum.
Where possible any differences in authoring should be dealt with by creating a common superset of additional classes for auto-filling applications. This approach will limit the cognitive load on authors and allow the reuse of current parsers.
A primary design goal must be the consideration of i18n. This mostly affects dates and durations. Individual implementations may allow for language-specific formatting, but it should not be part of this specification.
It should be reasonably easy for an author to add classes to a pre-existing form without having to change its data structure.
Auto-fill application vs generalist parsing documentation
Auto-filling a form and parsing its contents are two different operations, although they share the same conventions and schemas. The rules for how a microformat parser should extract values from form elements can differ from what an application needs in order to auto-fill a form using microformat data.
The input classname
To aid discovery and to differentiate microformats intended for auto-fill applications, we should use an input classname in conjunction with the root microformat classname. The current suggestion of appending input to the root classname, i.e. vcard-input, would break all current parsers. To add the demarcation we should use a new, separate class: input.
Proposed markup:
This would allow authors and applications to determine the intended use case i.e. the microformat mark-up is for form auto-fill. It would also be a simple task to update the current generalist parsers to ignore the mark-up if they were looking for content only. By default, most of the current parsers would already ignore an empty form marked-up with microformats as the required properties such as fn in hCard would be blank.
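As a rough sketch of how a parser or auto-fill application might tell the two use cases apart, the function below classifies an element from its class attribute. The function name, the return labels and the short list of root classnames are my own assumptions for illustration; the only rule taken from the proposal is that input sits next to the root classname rather than being fused with it.

```typescript
// Root classnames of some existing microformats: hCard, hCalendar event, hReview, hResume.
// The list is illustrative, not exhaustive.
const ROOT_CLASSNAMES: string[] = ["vcard", "vevent", "hreview", "hresume"];

// Hypothetical labels for the three possible outcomes.
type MarkupUse = "auto-fill" | "content" | "none";

// Classify an element by the value of its class attribute.
function classifyMarkup(classAttribute: string): MarkupUse {
  const classes = classAttribute.split(/\s+/).filter((c) => c !== "");
  let hasRoot = false;
  for (const root of ROOT_CLASSNAMES) {
    if (classes.indexOf(root) !== -1) hasRoot = true;
  }
  if (!hasRoot) return "none";
  // A separate "input" class marks the microformat as a target for auto-fill.
  return classes.indexOf("input") !== -1 ? "auto-fill" : "content";
}

const formUse = classifyMarkup("vcard input");   // auto-fill target
const fusedUse = classifyMarkup("vcard-input");  // no root classname is recognised
```

A content-only parser could skip anything classified as auto-fill, while the fused vcard-input form is not recognised as a microformat at all, which is the breakage described above.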
Form fields should use the classname attribute
It is tempting to consider using the form field name attribute as an alternative to the classname attribute, but this would break the current authoring conventions and all current parsers. We should confine the definition of microformats to the classname and rel attributes.
Proposed markup:
[sourcecode language="html"]
Text inputs and textareas
The mapping of content into text inputs and textareas is relatively straightforward: matches are made on the classname attribute and the whole value is used to fill in the form field.
Proposed markup:
Multiple form fields for plural microformat properties
Where a microformat property such as street-address in hCard can contain an array of values, these values will be added in order into the collection of form fields with the same classname.
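The ordering rule can be sketched as follows. This is only an illustration: FieldSlot is a hypothetical stand-in for a real form field, and leaving surplus values unused is my assumption rather than part of the proposal.

```typescript
// Hypothetical stand-in for a form field; a browser implementation would use DOM inputs.
interface FieldSlot {
  className: string;
  value: string;
}

// Fill the fields that carry `property` in their class list, in document order,
// with the values of a plural property. Surplus values are left unused (an assumption).
function fillInOrder(fields: FieldSlot[], property: string, values: string[]): void {
  let next = 0;
  for (const field of fields) {
    const classes = field.className.split(/\s+/);
    if (classes.indexOf(property) === -1) continue;
    const value = values[next];
    if (value === undefined) break; // ran out of values, leave the rest blank
    field.value = value;
    next += 1;
  }
}

const addressFields: FieldSlot[] = [
  { className: "street-address", value: "" },
  { className: "street-address", value: "" },
  { className: "locality", value: "" },
];
fillInOrder(addressFields, "street-address", ["Flat 2", "10 High Street"]);
// The two street-address fields now hold "Flat 2" and "10 High Street"; locality is untouched.
```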
Proposed markup:
Selects
Values can be passed into selects where a match can be found against the content or value of an option. The example below shows how a select could be used to define the type of a tel. The auto-fill application should check the content of the option first, then its value attribute. Where a property such as a tel type is plural and the select element has the multiple attribute, an auto-fill application should set multiple values in the select.
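A minimal sketch of that matching order, with options modelled as plain objects rather than DOM nodes. OptionSlot and both function names are hypothetical, and comparing case-insensitively after trimming is my assumption; the proposal only says that the content is checked before the value.

```typescript
// Hypothetical stand-in for an option element.
interface OptionSlot {
  text: string;     // the visible content of the option
  value: string;    // its value attribute
  selected: boolean;
}

function normaliseToken(raw: string): string {
  return raw.trim().toLowerCase();
}

// Look for a match on option content first, and only then on the value attribute.
function findOption(options: OptionSlot[], wanted: string): OptionSlot | null {
  const target = normaliseToken(wanted);
  for (const option of options) {
    if (normaliseToken(option.text) === target) return option;
  }
  for (const option of options) {
    if (normaliseToken(option.value) === target) return option;
  }
  return null;
}

// Select every matching option when `multiple` is true, otherwise only the first match.
function fillSelect(options: OptionSlot[], values: string[], multiple: boolean): void {
  for (const wanted of values) {
    const match = findOption(options, wanted);
    if (match === null) continue;
    match.selected = true;
    if (!multiple) return;
  }
}

const telTypes: OptionSlot[] = [
  { text: "Home", value: "home", selected: false },
  { text: "Work", value: "work", selected: false },
  { text: "Mobile", value: "cell", selected: false },
];
fillSelect(telTypes, ["cell", "work"], true);
// "Mobile" is matched through its value attribute, "Work" through its content.
```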
Proposed markup:
Checkbox inputs
Checkbox inputs can be used much like a multiple select. The values have to match the controlled vocabulary of the microformat property being targeted.
Proposed markup:
Home
Work
etc…
Radio inputs
Radio inputs can be used to force a single selection from a controlled vocabulary. There are very few single-value controlled vocabularies in the microformats schemas, but rating in hReview would be a good example.
Proposed markup:
5
4
3
etc…
Hidden input fields
In general, the use of hidden fields should be discouraged, but they are useful for the post-processing of language-specific formatting. A good example of this would be hResume's use of duration to define how long someone has had a job. This is most often done with month/year drop-down selects, as on linkedin.com.
Proposed markup:
In the example above the duration value is passed into a hidden input, which will be submitted with the form, allowing the page author to build custom i18n formatting through JavaScript. The alternative would be to display an ISO duration such as “P3Y4M”. This type of interaction design could be achieved using JavaScript hijacking techniques to aid clients that do not support scripts.
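On the page author's side, the script that turns the hidden value into year and month selections could start from something like the sketch below. It is an assumption on my part that only the year and month designators matter for this use case; anything else, including weeks, days and time components, is rejected. The type and function names are hypothetical.

```typescript
// Hypothetical result type: the two numbers a pair of year/month selects would need.
interface YearsMonths {
  years: number;
  months: number;
}

// Parse an ISO 8601 duration that uses only the Y and M designators, e.g. "P3Y4M".
// Returns null for anything else, so the selects can be left untouched.
function parseYearMonthDuration(iso: string): YearsMonths | null {
  const parts = /^P(?:(\d+)Y)?(?:(\d+)M)?$/.exec(iso.trim());
  if (parts === null) return null;
  const yearText = parts[1];
  const monthText = parts[2];
  if (yearText === undefined && monthText === undefined) return null; // a bare "P"
  return {
    years: yearText === undefined ? 0 : parseInt(yearText, 10),
    months: monthText === undefined ? 0 : parseInt(monthText, 10),
  };
}

const jobLength = parseYearMonthDuration("P3Y4M"); // three years and four months
```

How those two numbers are displayed (labels, ordering, language) stays entirely with the page author, which is the point of keeping i18n formatting out of the specification.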
HTML5 input types
Although HTML5 has new semantic input types such as search and url, their primary purpose is to describe data types rather than the context of data use. We may know that the input in the example below is a date, but we only know that it is the start date of an event by the use of the classname attribute dtstart.
Proposed markup:
For clarity and consistency in authoring rules, an auto-fill application should not try to infer context from the input type. All the input types that are not date or number based (search, email, url and tel) should be treated as if they were a simple text input.
The color input is not supported, as it is not yet used as a data type in the current or proposed schemas.
DateTime, Number and Range input types
The new HTML5 DateTime input types still do not give the data context, but they do inform us of specific formatting requirements. Where possible the auto-fill application should format the DateTime value as per the type. If the passed data does not contain the DateTime fragment required for the specific format, the input should be left blank.
Proposed markup:
1996-12-19
1996-12
17:39:57
1996-12-19T16:39:57-08:00
1996-12-19T16:39:57
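The formatting rule could be sketched as below, using the sample values above as a guide. The function name and the two patterns are my own; the sketch handles the date, month, time, datetime and datetime-local types only, does not attempt week, and does not check that the date is a real calendar date. An empty string stands for "leave the input blank".

```typescript
// Year and month are required; day, time and zone are optional and nested in that order.
const ISO_DATETIME = /^(\d{4})-(\d{2})(?:-(\d{2})(?:T(\d{2}:\d{2}(?::\d{2})?)(Z|[+-]\d{2}:\d{2})?)?)?$/;
const ISO_TIME_ONLY = /^\d{2}:\d{2}(?::\d{2})?$/;

// Return the fragment of `isoValue` that the given input type needs,
// or "" when the value does not contain that fragment.
function formatForInputType(isoValue: string, inputType: string): string {
  const text = isoValue.trim();
  if (ISO_TIME_ONLY.test(text)) return inputType === "time" ? text : "";
  const parts = ISO_DATETIME.exec(text);
  if (parts === null) return "";
  const year = parts[1];
  const month = parts[2];
  const day = parts[3];   // undefined at runtime when the value has no day
  const time = parts[4];  // undefined when there is no time fragment
  const zone = parts[5];  // undefined when there is no time zone
  switch (inputType) {
    case "month":
      return year + "-" + month;
    case "date":
      return day ? year + "-" + month + "-" + day : "";
    case "time":
      return time ? time : "";
    case "datetime-local":
      return day && time ? year + "-" + month + "-" + day + "T" + time : "";
    case "datetime":
      return day && time && zone ? year + "-" + month + "-" + day + "T" + time + zone : "";
    default:
      return "";
  }
}

const eventStart = "1996-12-19T16:39:57-08:00";
const asDate = formatForInputType(eventStart, "date");           // "1996-12-19"
const asMonth = formatForInputType(eventStart, "month");         // "1996-12"
const monthAsDate = formatForInputType("1996-12", "date");       // "" because there is no day
```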
The number and range inputs need to be type checked before data is passed into the form fields. No invalid dates or numbers should be passed into the form fields.
Min and max values should also be checked on form fields. If a min or max attribute exists for a number or datetime input, the value being passed needs to be checked to make sure it is in range before the input is updated.
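For number and range inputs, the type and range checks might look like the sketch below. It covers numbers only, not datetime values; the pattern for what counts as a number and the choice to ignore a min or max attribute that is not itself numeric are my assumptions. An empty string again means the field is left alone.

```typescript
// A plain decimal number with an optional exponent, e.g. "4", "-3.5", ".5", "1e3".
const PLAIN_NUMBER = /^-?(\d+(\.\d+)?|\.\d+)([eE][+-]?\d+)?$/;

// Read a min or max attribute; a missing or non-numeric bound is ignored (an assumption).
function parseBound(attribute: string | undefined): number | null {
  if (attribute === undefined) return null;
  const text = attribute.trim();
  return PLAIN_NUMBER.test(text) ? Number(text) : null;
}

// Return the value to place in the input, or "" if it fails the type or range check.
function checkedNumber(raw: string, min?: string, max?: string): string {
  const text = raw.trim();
  if (!PLAIN_NUMBER.test(text)) return "";
  const parsed = Number(text);
  if (!isFinite(parsed)) return "";
  const lower = parseBound(min);
  const upper = parseBound(max);
  if (lower !== null && parsed < lower) return "";
  if (upper !== null && parsed > upper) return "";
  return text;
}

const ratingInRange = checkedNumber("4", "1", "5");    // "4"
const ratingTooHigh = checkedNumber("7", "1", "5");    // "" because 7 is above max
```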
Repeating microformat properties
There are a number of plural properties in microformats that allow multiple values. In hCard the commonly used ones are tel, email and url. To allow a form to extend to receive an unknown number of values, auto-fill applications need to support a repeating pattern. This can be achieved with a new instructional classname, “repeat”, which the author adds alongside a microformat property to tell the application when to perform a repeat.
Proposed markup:
An auto-fill application working in a browser would duplicate the whole DOM node and append it as a sibling. All element ids and form field names would have an index number appended to keep the HTML valid. The label's for attribute would also be updated to match any changed id reference.
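The renaming step can be sketched without a DOM by treating a repeated block as a small record. RepeatedField and cloneWithIndex are hypothetical names, and the hyphen used as a separator is my assumption; the proposal only says that an index number is appended.

```typescript
// Hypothetical summary of the attributes that must stay unique or consistent
// when a block marked with "repeat" is duplicated.
interface RepeatedField {
  id: string;       // id of the form field
  name: string;     // name submitted with the form
  labelFor: string; // value of the associated label's for attribute
}

// Build the attributes for the copy numbered `index`.
function cloneWithIndex(original: RepeatedField, index: number): RepeatedField {
  const newId = original.id + "-" + index;
  return {
    id: newId,
    name: original.name + "-" + index,
    // Only re-point the label if it referred to the field that was renamed.
    labelFor: original.labelFor === original.id ? newId : original.labelFor,
  };
}

const emailField: RepeatedField = { id: "email", name: "email", labelFor: "email" };
const secondEmail = cloneWithIndex(emailField, 2);
// secondEmail is { id: "email-2", name: "email-2", labelFor: "email-2" }
```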
Repeating microformat structures within one form
As well as repeating microformat properties, whole microformats could be repeated within one form. Again, ids and form field names would have an index number appended.
Defining types in the classname attribute
Most forms do not use hierarchical structures, and as such the type/value structure used in microformats is less common in forms. Using the classname and type together allows the author to target a specific type/value, e.g. a mobile telephone number or preferred email address.
Proposed markup:
String Concatenation
There are a number of circumstances where a plural microformat property needs to be concatenated into a single string. The most common string concatenations involve a combination of spaces and/or commas. Auto-fill applications should support three concatenation patterns: comma-space-delimited, comma-delimited and space-delimited. These format operators have to be placed in the same classname attribute as the microformat property name. The concatenated string should be trimmed, with no trailing spaces or commas.
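The three patterns can be sketched as a single function. The style labels below are placeholders of my own, not the operator classnames from the proposal, and dropping empty values before joining is an assumption.

```typescript
// Placeholder labels for the three concatenation patterns.
type JoinStyle = "comma-space" | "comma" | "space";

// Join the values of a plural property into one string for a single form field.
function concatenateValues(values: string[], style: JoinStyle): string {
  const separator = style === "comma-space" ? ", " : style === "comma" ? "," : " ";
  const parts: string[] = [];
  for (const value of values) {
    // Strip spaces and commas from both ends of each value so none can trail in the result.
    const cleaned = value.replace(/^[\s,]+|[\s,]+$/g, "");
    if (cleaned !== "") parts.push(cleaned); // empty values are skipped (an assumption)
  }
  return parts.join(separator);
}

const streetLines = ["Flat 2 ", "", " 10 High Street,"];
const oneLine = concatenateValues(streetLines, "comma-space"); // "Flat 2, 10 High Street"
```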
Proposed markup:
[/sourcecode]
Logical operators “or”
There are a number of circumstances where an “or” operator would be useful. If a classname attribute contains more than one microformat property together with the “or” operator, the auto-fill application will make a selection between the properties. The first non-null value will be used. Where the microformat property has multiple values, all the values of the first property are used before those of any subsequent properties.
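One way to read this rule is as an ordered list of candidate values: a single field takes the first entry, and a set of repeated fields works through the list in order. That reading, the PropertyValues shape and the function name are all my assumptions; the sketch below is not a definitive implementation.

```typescript
// Hypothetical shape for the user's structured data: property name to list of values.
interface PropertyValues {
  [property: string]: string[];
}

// Collect the non-empty values of each candidate property, in the order the
// properties are listed. Element 0 is the "first non-null value".
function resolveOr(data: PropertyValues, candidates: string[]): string[] {
  const resolved: string[] = [];
  for (const property of candidates) {
    if (!Object.prototype.hasOwnProperty.call(data, property)) continue;
    const values = data[property];
    if (values === undefined) continue;
    for (const value of values) {
      if (value.trim() !== "") resolved.push(value);
    }
  }
  return resolved;
}

const contact: PropertyValues = {
  nickname: [],
  fn: ["Jane Doe"],
  email: ["a@example.com", "b@example.com"],
  url: ["http://example.com/"],
};
const displayName = resolveOr(contact, ["nickname", "fn"]); // ["Jane Doe"]
const contactPoints = resolveOr(contact, ["email", "url"]); // both emails, then the url
```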
Proposed markup:
Related microformat wiki documents