A ux.stackexchange question prompted me to consider how one might implement a more permissive type of input validation. It’s not rare for a form to punish the user if they add an extra space before typing in a date, or accidentally use a comma instead of a period when typing in an IP address. After all, we employ strict validation to keep the data correct.

Garbage in, garbage out. It rings true, but taken too literally it pushes us toward strict validation and a no-exceptions policy for rebels. We punish a user for typing ‘12’ instead of the fully qualified ‘2012’... why? Either it’s our thoughtlessness or it’s the very unlikely (depending on context) possibility that the user did in fact mean the year ‘1912’ or ‘1812’ or ‘1012’.

If we start down the road of permissive input validation then we need to also explore input correction. We can’t allow a rogue comma to slip in and not correct it. It’s probably best to correct it straight away (not too soon — possibly on blur) so that the actual data stored conforms to the correct format.
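
As a rough sketch of correct-on-blur, using the IP address case from earlier: the helper names `correctIpAddress` and `attachCorrection` are made up for this illustration, and the rules (four dot-separated parts, each 0 to 255) are an assumption about what counts as a valid address here.

```javascript
// Hypothetical helper: tolerate stray whitespace and commas in an IPv4 address.
// Returns the corrected string, or false if the input cannot be salvaged.
function correctIpAddress(raw) {
  // Treat commas as periods and drop any whitespace around the separators
  var cleaned = String(raw).trim().replace(/\s*[,.]\s*/g, '.');
  var parts = cleaned.split('.');
  if (parts.length !== 4) return false;
  for (var i = 0; i < parts.length; i++) {
    // Each part must be one to three digits and no larger than 255
    if (!/^\d{1,3}$/.test(parts[i]) || Number(parts[i]) > 255) return false;
  }
  return parts.join('.');
}

// Hypothetical helper: write the corrected value back when the field loses focus.
// If the value cannot be corrected it is left alone for other UI to flag.
function attachCorrection(input, corrector) {
  input.addEventListener('blur', function () {
    var corrected = corrector(input.value);
    if (corrected !== false) input.value = corrected;
  });
}

// In a browser (the '#ip' selector is an assumption):
// attachCorrection(document.querySelector('#ip'), correctIpAddress);
```

Correcting on blur rather than on every keystroke means the user is never fighting the field while they are still typing.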

William Hudson ran a date survey in 2009 to discover all the various ways American users like to enter dates. The results show that users enter dates in a variety of formats. It makes perfect sense to accept all these variants and let the computer figure out what is what.

For the specific problem of entering dates, I would like to recommend Date.js, because it can successfully parse most of those variants. However, there is a big caveat when it comes to dates, especially on international forms. The American style of entering a date, MM/DD/YY, is technically impossible to differentiate from the other standard of DD/MM/YY, unless the DD portion happens to be above 12. For this reason I guess it’s best to tailor the accepted formats to your users’ locale as far as you can.
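
To make the ambiguity concrete, here is a small sketch. `resolveDayMonth` and its `preferredOrder` argument ('MDY' or 'DMY') are hypothetical names for this illustration and have nothing to do with Date.js. It only falls back on the locale preference when the numbers themselves can’t settle the question.

```javascript
// Hypothetical helper: decide which of the first two numbers in "A/B/YY"
// is the day and which is the month.
// preferredOrder: 'DMY' for day-first locales, anything else is treated as 'MDY'.
function resolveDayMonth(first, second, preferredOrder) {
  var a = Number(first);
  var b = Number(second);
  var plausible = function (n) {
    return n >= 1 && n <= 31 && Math.floor(n) === n;
  };
  if (!plausible(a) || !plausible(b)) return false;

  // Neither number can be a month
  if (a > 12 && b > 12) return false;

  // Only one reading is possible when one number is above 12
  if (a > 12) return { day: a, month: b, ambiguous: false };
  if (b > 12) return { day: b, month: a, ambiguous: false };

  // Both could be months: fall back on the locale preference.
  // If the two numbers are equal, both readings give the same date.
  var ambiguous = a !== b;
  return preferredOrder === 'DMY'
    ? { day: a, month: b, ambiguous: ambiguous }
    : { day: b, month: a, ambiguous: ambiguous };
}

resolveDayMonth('25', '12', 'MDY'); // day 25, month 12, not ambiguous
resolveDayMonth('3', '4', 'MDY');   // month 3, day 4, flagged as ambiguous
resolveDayMonth('3', '4', 'DMY');   // day 3, month 4, flagged as ambiguous
```

The `ambiguous` flag gives the form a chance to echo the interpreted date back to the user instead of guessing silently.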

An alternative is to retain rigidity in your validation but allow for some minor mistakes. For example, insist upon the ISO format of YYYY-MM-DD but don’t make a fuss if the user separates with a slash or a space (or heck, anything) instead of a dash.
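
A minimal sketch of that idea in plain JavaScript, assuming a four-digit year is required and any run of non-digit characters counts as a separator. The function name `normalizeIsoDate` is made up for this illustration.

```javascript
// Hypothetical helper: insist on year, month, day order with a four-digit
// year, but accept any non-digit separator. Returns 'YYYY-MM-DD' or false.
function normalizeIsoDate(raw) {
  var m = /^\s*(\d{4})\D+(\d{1,2})\D+(\d{1,2})\s*$/.exec(String(raw));
  if (!m) return false;

  var year = Number(m[1]);
  var month = Number(m[2]);
  var day = Number(m[3]);
  if (month < 1 || month > 12) return false;

  // Gregorian leap-year rule
  var leap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  var daysInMonth = [31, leap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31][month - 1];
  if (day < 1 || day > daysInMonth) return false;

  // Left-pad month and day to two digits
  return m[1] + '-' + ('0' + month).slice(-2) + '-' + ('0' + day).slice(-2);
}

normalizeIsoDate('2012/3/4');     // => '2012-03-04'
normalizeIsoDate(' 2012 12 31 '); // => '2012-12-31'
normalizeIsoDate('2012-02-30');   // => false
normalizeIsoDate('12-3-4');       // => false (year must be four digits)
```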

My point is: Maybe formal validation with permissive aspects mixed in gives us the best of both worlds. We don’t punish the user for minor mistakes, and we don’t end up with ambiguous data.

In an attempt to practice this technique of mixing rigidity with leniency, I created vic.js.

Currently validation in JavaScript can be quite an ugly affair, plagued with remnants of DHTML and overly invasive input masks. It’s not uncommon to see stuff like this:

someInput.onkeyup = function() {
  if (!this.value.match(/some rigid regex/)) {
    alert('Enter the right value, you fool');
  }
};

Typically the rules are strict, the characters non-negotiable, the regular expression unyielding, and the presented invalidation UI annoying.

vic.js (a.k.a. Vic, VIC) allows you to define a lenient regular expression, and it expects you to extract your important data from the captured groups.

Vic’s signature goes something like this:

vic(
  LENIENT_PATTERN_WITH_CAPTURED_GROUPS,
  PER_GROUP_PROCESSOR,
  POST_PROCESSOR
);

The simple example would be a ‘year’ field:

var yearVic = vic(
  /^\s*(\d{1,4})\s*$/,
  function(year) {
    // Let's assume anything between 14 and 99 is from the 1900s:
    return vic.pad(year > 13 && year <= 99 ? '1900' : '2000' )(year);
  },
  Number // cast full output to a Number
);
 
yearVic('2012');   // => 2012
yearVic('01');     // => 2001
yearVic('hd2kd9'); // => false
yearVic('20021');  // => false
yearVic('96');     // => 1996
yearVic('  4');    // => 2004
yearVic('113');    // => 2113

The regex used for the year example, /^\s*(\d{1,4})\s*$/, is lenient in that it allows whitespace at the beginning and end, and doesn’t mind if the user enters one, two, three or four digits for the year. For years greater than 13 and less than 100 we assume the user is referring to the previous century, so we apply ‘1900’ as padding; otherwise we pad with ‘2000’.

Vic offers a couple of helpers for basic tasks like padding, applying lower/upper case, etc. I’ll probably be adding to these as I think of more common use-cases for vic.
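
To show the kind of padding the examples rely on, here is a guess at the behaviour, written as a standalone function. `padWith` is a hypothetical name and this is not Vic’s actual source; it simply reproduces the outputs shown above by overlaying the value onto the right-hand side of a template string.

```javascript
// Hypothetical stand-in for a padding helper (not the vic.js implementation).
// padWith('2000')('4') keeps the left part of the template and lets the
// value fill the right-hand side.
function padWith(template) {
  return function (value) {
    var s = String(value);
    // Values as long as the template (or longer) are returned unchanged
    if (s.length >= template.length) return s;
    return template.slice(0, template.length - s.length) + s;
  };
}

padWith('1900')('96');  // => '1996'
padWith('2000')('4');   // => '2004'
padWith('2000')('113'); // => '2113'
padWith('00')('7');     // => '07'
```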

Vic allows more atomized per-group processing too. In this example we’ll validate a date in the form YYYY-MM-DD, but we’ll allow any one of ./,:- (plus spaces) as separators, and we’ll validate the component numbers and pad them too:

var vicDate = vic(/^\s*(\d{1,4})[.\/,: -](\d{1,2})[.\/,: -](\d{1,2})\s*$/, {
    1: function(year) {
      // Year between 50 and 99 assumed to be '19YY', otherwise presumed after 2000
      return vic.pad(year >= 50 && year <= 99 ? '1900' : '2000' )(year);
    },
    2: function(month) {
      return month >= 1 && month <= 12 && vic.pad('00')(month);
    },
    3: function(day, i, all) {
      // Day 0 of the following month is the last day of the entered month,
      // so getDate() gives the number of days to check the entered day against:
      return day > 0 &&
        day <= new Date(all[1], all[2], 0).getDate() &&
        vic.pad('00')(day);
    }
}, function(v) {
  return v.join('-');
});
 
vicDate('111');       // => false
vicDate('2/3/4/5');   // => false
vicDate('16.332.2');  // => false
vicDate('20  1  20'); // => false
vicDate(' 1999.7.0'); // => false
vicDate('1999.0.1');  // => false
 
vicDate('1999.9.32'); // => false (no 32 in Sept)
vicDate('1999.2.28'); // => '1999-02-28'
vicDate('1999.2.31'); // => false (no 31 in Feb)
 
vicDate('1.1.1');     // => '2001-01-01'
vicDate('1956.3.2');  // => '1956-03-02'
vicDate('16.03-2');   // => '2016-03-02'
vicDate(' 20 1 20 '); // => '2020-01-20'
vicDate('1999.7.31'); // => '1999-07-31'

What we’ve done above is execute a rigid validation of the data that’s important to us (YYYY, MM and DD) while letting the user mess with the non-important stuff to their heart’s content (whitespace & separators).

Vic is simple. It’s not a high-level abstraction, and it’s not complex either. It’s a few lines of code.

The fact is: you could easily integrate this methodology into your own validation utilities. The basic principle is to extract the important data, validate it, but allow the user some flexibility in how they give you the important data.
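
To illustrate how little is needed, here is a minimal sketch of the same principle. It is a hypothetical re-implementation written for this explanation, not the vic.js source; the name `lenient` and the details of how processors are called (group value, 1-based index, full match array) are assumptions.

```javascript
// Hypothetical sketch of the approach (NOT the actual vic.js source).
// pattern:     a lenient RegExp with capturing groups (without the g flag)
// processors:  one function for every group, or an object keyed by
//              1-based group number; returning false rejects the input
// postProcess: optional function that receives the array of processed groups
function lenient(pattern, processors, postProcess) {
  return function (input) {
    var match = pattern.exec(String(input));
    if (!match) return false;

    var out = [];
    for (var i = 1; i < match.length; i++) {
      var fn = typeof processors === 'function' ? processors : processors[i];
      // Groups without a processor pass through untouched
      var result = fn ? fn(match[i], i, match) : match[i];
      if (result === false) return false;
      out.push(result);
    }
    return postProcess ? postProcess(out) : out;
  };
}

// Usage: a time field that tolerates odd separators and missing zeros
var lenientTime = lenient(/^\s*(\d{1,2})[.:, -](\d{1,2})\s*$/, {
  1: function (h) { return Number(h) <= 23 && ('0' + h).slice(-2); },
  2: function (m) { return Number(m) <= 59 && ('0' + m).slice(-2); }
}, function (parts) {
  return parts.join(':');
});

lenientTime(' 9.5 ');  // => '09:05'
lenientTime('7,30');   // => '07:30'
lenientTime('24:00');  // => false
```

The pattern decides what the user may get away with, the processors decide what the data must look like, and the post-processor produces the one format that gets stored.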

Check out vic.js on GitHub.