Name Validation For Developers 101

When developers don’t have a value available for processing, they tend to fall back on “null”. The practice may seem innocuous, but not to people like Jennifer Null, a British customer who was perpetually denied online purchases by digital systems and endured tax and billing problems.

Names are a crucial piece of our identities, and many software problems occur due to a fundamental conflict between the two key aspects of names. On one hand, names are personal, so they carry a lot of cultural and family heritage, tracing back to a time long before computers. Specific spellings, accents, and parts of names have a meaning and can’t just be simplified or changed to make processing easier.

On the other hand, useful computer identifiers need to be standardised and easy to store and process. Software developers must look for a good way of modelling and standardising this information to avoid common biases, or they risk creating horribly expensive mistakes.

For example, developers in the Western world are very likely to underestimate how long some real names can get. In 2013, Janice Keihanaikukauakahihuliheʻekahaunaele started a campaign against the Hawaii Department of Transportation for not being able to print her name on her driving licence. The licences had space for only 34 characters, one fewer than Janice’s last name. The driving licences also could not show the ʻokina, the Hawaiian accent that looks like an apostrophe, despite the fact that the native name of Hawaii, Hawaiʻi, also includes the ʻokina.

Journalists from KHON2, a TV station in Honolulu, picked up Janice’s story. After public pressure, the government caved in and changed the policy to allow up to 40 characters in a name on a driving licence. This required an upgrade to computer systems across the state.

There’s no universally accepted standard for maximum name length. The International Civil Aviation Organization (ICAO) allows up to 64 characters per name. Many governments today limit registered baby names to those that fit on a passport, which may be up to 40 letters. People born before machine-readable passports weren’t subject to that restriction. Lest you think that 40 letters covers everything, according to the Guinness Book of World Records, the current longest name belongs to Rhoshandiatellyneshiaunneveshenk Koyaanisquatsiuth Williams, a girl born in Texas in 1984, whose actual first name is 1,019 letters long.

On the flip side, it’s also common to overestimate how long names have to be. Single-letter names are perfectly possible, so it’s bad practice to use length checks to prevent people from entering initials. Take, for example, the story of Stephen O, a South Korean living in Virginia, USA. Several of Stephen’s credit card applications were rejected because local banking systems could not record a single-letter last name, presumably to prevent people from just entering initials in the registration form.

Stephen was only able to get a driving licence under the surname OO. This caused problems when he tried to get car insurance because the credit agency couldn’t match any of the records. The credit agency computers recorded Stephen’s last name as Ostephen, presumably ‘upgrading’ the name from Korean to Irish. In one database, after a lengthy search, he was found under blank-blank-O. Stephen finally gave up fighting computers and changed his name to Oh.
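The two length mistakes above can be avoided with a check that rejects only empty input and input beyond a generous storage limit. The following is a minimal sketch, not a standard: the function name `is_acceptable_length` and the 2,000-character default are assumptions chosen for illustration.

```python
def is_acceptable_length(name: str, max_length: int = 2000) -> bool:
    """Accept any non-blank name up to a generous storage limit.

    max_length is a hypothetical storage bound picked for this sketch;
    there is no universal standard for maximum name length.
    """
    trimmed = name.strip()
    # No minimum beyond "not blank": single-letter names such as "O" are real.
    # len() counts Unicode code points, so the okina counts as one character.
    return 1 <= len(trimmed) <= max_length
```

With these assumptions, both `O` and Keihanaikukauakahihuliheʻekahaunaele pass, while blank input is rejected.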

Another common category of biases relates to the number of names. People don’t always have a given name and a surname. In some East Asian cultures, it’s perfectly fine to have just a single name (a mononym), but modern software systems are uniquely unqualified to deal with that. Many Western government systems require first and last names to be recorded separately, and some will set a mononym as the given name, some as the surname. Some countries use markers such as FNU, LNU or XXX for the other name when recording mononymic people.

Sadly, these workarounds sometimes start to replace a person’s real name, especially if the name is difficult for westerners to pronounce. This was the case with Naqibullah, a former interpreter for the US Armed Forces in Afghanistan, who moved to Houston, USA.

When The Wall Street Journal wrote about his case, Naqibullah worked as a taxi driver, using a popular ride-hailing app. Unfortunately for him, the application only shows the driver’s first name to customers. So, pretty much everyone calls him FNU. The newspaper article also mentioned Neli from Indonesia, who moved to North Dakota. She is just Neli on her social security card, FNU Neli on her US visa, but Neli Neli for her bank. Computer systems were also a big problem for No Name Given Sandhya.

Of course, people don’t always have just one or two given names and surnames.  A nice example is Rosalind Arusha Arkadina Altalune Florence Thurman-Busson, the daughter of famous actress Uma Thurman. A more complex example is Tracy Nelson from Chesterfield, who actually has 139 middle names.

A really fun example is David Fearn from Walsall, who changed his name to a combination of all his favourite movies. He legally became known as ‘James Dr No From Russia with Love Goldfinger Thunderball You Only Live Twice On Her Majesty’s Secret Service Diamonds Are Forever Live and Let Die The Man with the Golden Gun The Spy Who Loved Me Moonraker For Your Eyes Only Octopussy A View to a Kill The Living Daylights Licence to Kill Golden Eye Tomorrow Never Dies The World Is Not Enough Die Another Day Casino Royale Bond’.

The third common group of bad assumptions relates to the valid characters for a name, often leading to people complaining about registration forms rejecting seemingly invalid words. Apart from letters and common punctuation marks (such as an apostrophe), names can contain a wide range of symbols. A US animal rights activist changed her name to Goveg.com, and some creative parents in New Zealand named their baby Number 16 Bus Shelter in 2008. Taking inspiration from software releases, a US father named his son Jon Blake Cusack 2.0 in 2004. Trying to come up with a unique identifier for their baby, a Chinese couple even tried to name their child ‘@’.
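One way to avoid rejecting names like these is to turn the check around: instead of listing allowed characters, reject only what cannot be part of any name. The sketch below is one possible approach under that assumption. The function name `is_plausible_name` is hypothetical, and treating only control characters as invalid is a design choice for illustration, not an established rule.

```python
import unicodedata


def is_plausible_name(raw: str) -> bool:
    """Reject only blank input and control characters.

    Letters from any script, digits, dots, apostrophes, the okina and
    symbols such as '@' all pass, because real names contain them.
    """
    name = raw.strip()
    if not name:
        return False
    # "Cc" is the Unicode general category for control characters
    # (for example NUL, tab and newline).
    return all(unicodedata.category(ch) != "Cc" for ch in name)
```

Under this sketch, Goveg.com, Number 16 Bus Shelter, Jon Blake Cusack 2.0 and Hawaiʻi are all accepted.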

Developers should avoid asking for first and last names separately. Instead, ask people for their full name and, optionally, a short name that confirms how they would like to be addressed. When communicating with external systems that split a name into a given name and a surname, make sure you can handle cases in which one of those two fields is missing.
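As an illustration of this advice, a name can be modelled as one required full name plus an optional short name. This is a minimal sketch: `PersonName`, `display_name` and `from_external_record` are hypothetical names, and joining the given name before the surname is an assumption that will not suit every culture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonName:
    full_name: str                    # exactly as the person entered it
    short_name: Optional[str] = None  # how they would like to be addressed

    def display_name(self) -> str:
        # Fall back to the full name when no short name was provided.
        return self.short_name or self.full_name


def from_external_record(given: Optional[str], surname: Optional[str]) -> PersonName:
    """Build a PersonName from a two-field external record.

    Either field may be missing or blank, as happens with mononyms.
    The given-then-surname order is an assumption of this sketch.
    """
    parts = [p.strip() for p in (given, surname) if p and p.strip()]
    if not parts:
        raise ValueError("at least one name part is required")
    return PersonName(full_name=" ".join(parts))
```

With this model, a record that carries only `Neli` in either field produces the full name `Neli`, with no placeholder standing in for the missing part.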

Lastly, common computing words can be perfectly valid names, such as in the case of Null. Even more importantly, common testing phrases are also valid names. Just because a user’s surname is Test doesn’t mean that it’s actually a test account. Avoid using specific names to mark example data, because real users might get caught by this as well. A notable example is Jeff Sample, who got stuck in digital hell at the Buenos Aires airport when the airline for his connecting flight could not find his ticket.
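Both problems come from letting a name's text carry a second meaning. A minimal sketch of the alternative is shown below, assuming a hypothetical `Account` record: a missing surname is represented by `None` rather than by the string "null", and test accounts are marked with an explicit flag rather than inferred from the name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Account:
    surname: Optional[str]         # None means "not provided"; "Null" is a real surname
    is_test_account: bool = False  # explicit marker, never derived from the name


def has_surname(account: Account) -> bool:
    # Compare against None, never against strings such as "null" or "Null".
    return account.surname is not None
```

Under these assumptions, `Account("Null")` has a surname, and `Account("Test")` is an ordinary account unless the flag is set.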

Gojko Adzic is a partner at Neuri Consulting LLP, winner of the 2016 European Software Testing Outstanding Achievement Award, and the 2011 Most Influential Agile Testing Professional Award. Gojko's book Specification by Example won the Jolt Award for the best book of 2012, and his blog won the UK Agile Award for the best online publication in 2010. Gojko is a frequent keynote speaker at leading software development conferences and one of the authors of MindMup and Claudia.js. As a consultant, Gojko has helped companies around the world improve their software delivery, from some of the largest financial institutions to small innovative startups.