I’ve written here before about the common practice of using “security questions” as a backup means of user identification (in case the user forgets his password, for example). Research has shown that many of the answers to these questions are relatively easy to guess (“What’s your favorite color?”) or to obtain from published sources (“Where did you go to high school?”). The security question mechanism is often just a way of providing a secondary password that is considerably less secure than the main password itself.
The New Scientist has an article presenting some new research on the topic. In a paper [PDF] presented at the Financial Cryptography conference last week, Joseph Bonneau of the University of Cambridge, and Mike Just and Greg Matthews of the University of Edinburgh, report on a statistical analysis that they carried out to assess the security of these questions. They focus on the class of security questions that essentially ask for names (e.g., mother’s maiden name, best friend’s name, pet’s name). As many of us have suspected, the security is more illusory than real.
Using data from sources such as national censuses and pet registries, the team calculated that if allowed three guesses, the norm for many websites, an attacker could correctly guess 1 in 80 answers.
Now, getting 1 out of 80 right is not particularly good if you are planning a targeted attack against a specific individual, but it will work out just fine if you are doing what the authors call a trawling attack: trying many accounts in order to find some that you can crack. Some sites do have obstacles to automated guessing attacks built in, but these are not insurmountable:
Both Gmail and Yahoo require users to solve a CAPTCHA – blurred text designed to foil automated attacks – when recovering a password. However, a motivated hacker could work through 1000 in an hour, says Bonneau, enough to allow secret-question guessing software to break into around 12 accounts.
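The arithmetic behind that "around 12 accounts" figure is easy to check. Here is a minimal sketch in Python, assuming (my assumption, not something the article spells out) that each solved CAPTCHA buys one account's worth of three guesses, and that each account falls independently with probability 1 in 80. The function names are mine, purely for illustration.

```python
from fractions import Fraction


def expected_compromises(accounts_tried: int, success_rate: Fraction) -> Fraction:
    """Expected number of accounts cracked in a trawling attack."""
    return accounts_tried * success_rate


def prob_at_least_one(accounts_tried: int, success_rate: Fraction) -> float:
    """Chance that at least one account falls, assuming independent attempts."""
    return 1.0 - (1.0 - float(success_rate)) ** accounts_tried


if __name__ == "__main__":
    rate = Fraction(1, 80)  # 1 in 80 answers guessed within three tries
    # 1000 CAPTCHAs solved in an hour, one account attempted per CAPTCHA (assumed)
    print(float(expected_compromises(1000, rate)))
    # Even a small batch of 80 accounts is more likely than not to yield a hit
    print(round(prob_at_least_one(80, rate), 2))
```

The point is that the per-account odds matter much less than the number of accounts the attacker can afford to try.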
Moreover, if you know something about the account owners’ geographic or cultural backgrounds, you can improve the odds, since the set of common names, for example, is highly dependent on the locality. (Hernandez is a much more common surname in Mexico than in Japan.) There are also cross-cultural differences in the dispersion of the distribution of names. Perhaps not surprisingly, names tend to be very diverse in the USA — we are, after all, a nation of immigrants. On the other hand, names, especially surnames, are much less diverse in South Korea, for example. When considering name combinations, there are obviously differences in likelihood: Juan Hernandez is a much more likely name than Mohammed Cohen. And there is one small oddity: pet names tend to be more diverse than people’s names. (I doubt this has any real significance, but it’s kind of fun.)
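To see why the dispersion of the name distribution matters, consider an attacker who simply tries the three most common answers. The sketch below computes the share of a population covered by those guesses. The two sets of counts are invented purely for illustration; they are not census figures and not the paper's data, and the function name is my own.

```python
from typing import Mapping


def top_k_success(counts: Mapping[str, int], k: int = 3) -> float:
    """Fraction of the population whose answer is among the k most common ones."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    most_common = sorted(counts.values(), reverse=True)[:k]
    return sum(most_common) / total


# Hypothetical counts: a population where a few names dominate...
concentrated = {"A": 40, "B": 30, "C": 20, "D": 5, "E": 5}
# ...and one where 100 different names are all equally common.
diverse = {f"name{i}": 1 for i in range(100)}

if __name__ == "__main__":
    print(top_k_success(concentrated))  # three guesses cover most of the population
    print(top_k_success(diverse))       # three guesses cover very little
```

With the same three guesses, the attacker does far better against the concentrated population than the diverse one, which is the effect the authors describe.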
The authors also explore some ways in which the odds can be changed to favor security. One technique is to disallow certain question/answer pairs if they are deemed too easy to guess. (If your mother’s maiden name was Smith or Jones, you might have to pick another question.) But other approaches would probably be better:
But Bonneau says that websites should consider abandoning secret questions altogether. One option, already offered by Google, is for users to provide a cellphone number when registering an account. Passwords can then only be reset using a code sent to that number.
This is a good idea. It essentially adds “something you have” (the cellphone) to “something you know”. Another intriguing idea is what Bonneau calls “social back-up”. With this scheme, you pick five trusted friends, and supply their names and addresses. When you ask to have your password reset, each of these friends is sent a one-time code; you must be able to supply three of the five codes in order to reset your password. As with the cellphone method, anything that requires an attacker to intercept more than one communication channel is a significant help to security.
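The three-of-five rule is simple enough to sketch. What follows is my own illustration of the idea, not the scheme from the paper: the function names are hypothetical, and a real system would also have to deliver the codes, expire them, and make sure each is used only once.

```python
import hmac
import secrets
from typing import Dict, Iterable, List


def issue_codes(friends: Iterable[str]) -> Dict[str, str]:
    """Generate one random one-time code per trusted friend."""
    return {friend: secrets.token_hex(8) for friend in friends}


def can_reset(issued: Dict[str, str], supplied: List[str], threshold: int = 3) -> bool:
    """Allow a reset only if codes from at least `threshold` distinct friends match."""
    matched = set()
    for friend, code in issued.items():
        for candidate in supplied:
            # Constant-time comparison; the codes here are ASCII hex strings
            if hmac.compare_digest(code, candidate):
                matched.add(friend)
    return len(matched) >= threshold


if __name__ == "__main__":
    codes = issue_codes(["ann", "bob", "cho", "dee", "eli"])
    collected = [codes["ann"], codes["cho"], codes["eli"]]
    print(can_reset(codes, collected))      # three distinct friends: allowed
    print(can_reset(codes, collected[:2]))  # only two: refused
```

Counting distinct friends rather than submitted codes means that entering the same code three times does not help an attacker who has intercepted only one message.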