Can CAPTCHA Be More Accessible?

Brian Dobreski

As users of the web, all of us have likely confronted a CAPTCHA test at some point: type the letters shown above, select all images with a bicycle, click to confirm you are not a robot. Such tests are a familiar part of end user web experiences these days. In fact, the Information Matters site itself relies on them for certain functions. Here and elsewhere on the web, CAPTCHAs serve as a kind of Turing Test: they are used to distinguish intended, human users from bots and other automated approaches that seek to exploit or disrupt services on the web. While the term CAPTCHA has come to be used colloquially to refer to any such test, the original CAPTCHA, or Completely Automated Public Turing Test to Tell Computers and Humans Apart, was first developed at Carnegie Mellon University (CMU) in 2000. This initial CAPTCHA test relied on humans’ abilities to visually interpret obscured and distorted texts in a way that bots cannot. Other, newer approaches similarly rely on human senses to solve visual, audio, or logic problems. While designed to exclude bots, the interactive nature of CAPTCHA tests also serves to exclude many human users with disabilities, as well as non-English speakers.

Can CAPTCHA practices be made more accessible? That’s the question explored in a recent publication by the W3C’s Accessible Platform Architectures Working Group. Titled Inaccessibility of CAPTCHA: Alternatives to Visual Turing Tests on the Web, this document was originally published in 2019, with a new draft issued in December 2021. In the current draft, the Working Group examines a number of different approaches that allow systems to tell humans and bots apart, and comments on the extent to which each of these can (or cannot) accommodate people with disabilities.

Roughly speaking, CAPTCHAs can be broken down into three main approaches. The first group, referred to as interactive stand-alone approaches, is perhaps the most recognizable to many of us. These present the user with some kind of test, such as reading distorted text or identifying particular images. In general, interactive stand-alone approaches pose the most obvious accessibility issues for persons with sensory impairments, persons with learning or cognitive disabilities, and non-English speaking persons.

The second type comprises the non-interactive stand-alone approaches. These CAPTCHAs are less intrusive and usually do not rely on sense-based interactions. For example, one such approach, known as the honeypot technique, employs hidden fields or fields marked for users to leave alone. Bots tend to interact with these while human users do not, thus allowing the system to tell the difference. While non-interactive approaches are more accessible, they may pose other problems. Limited-use accounts, for instance, set limits on the amount of activity a single user account can undertake. This can help address bots that repeatedly buy tickets or that engage in (or overwhelm) other services. Determining how much interaction is out of the ordinary for a human, however, can be difficult. Overall, while non-interactive approaches may be more accessible, they may also have higher margins of error.
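To make these two non-interactive ideas more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not code from the Working Group's document: the field name `website`, the function `looks_like_bot`, and the class `ActivityLimiter` are all hypothetical, and the thresholds are arbitrary.

```python
from collections import defaultdict

# Hypothetical name of a form field that is hidden from human visitors.
# A bot that fills in every field it finds will usually fill this one too.
HONEYPOT_FIELD = "website"


def looks_like_bot(form_data: dict) -> bool:
    """Honeypot check: flag the submission if the hidden field has content."""
    value = form_data.get(HONEYPOT_FIELD, "")
    return bool(value.strip())


class ActivityLimiter:
    """Limited-use accounts: cap the actions one account may take per window."""

    def __init__(self, max_actions: int, window_seconds: float):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._timestamps = defaultdict(list)  # account id -> recent action times

    def allow(self, account_id: str, now: float) -> bool:
        """Record the action and return True unless the account is over its cap."""
        # Keep only the actions that still fall inside the time window.
        recent = [
            t for t in self._timestamps[account_id]
            if now - t < self.window_seconds
        ]
        allowed = len(recent) < self.max_actions
        if allowed:
            recent.append(now)
        self._timestamps[account_id] = recent
        return allowed
```

In this sketch, a limiter created with `ActivityLimiter(max_actions=2, window_seconds=60.0)` accepts two actions from an account and refuses a third until the earlier ones age out of the window. Choosing those numbers is exactly the hard part noted above: a cap that is too low blocks legitimate heavy users, while one that is too high lets bots through. A honeypot field also needs care in practice, since a field hidden only visually may still be announced to users of assistive technology.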

Finally, there are what the Working Group refers to as multi-party approaches. Google’s reCAPTCHA v3 is perhaps the best known of these. In a multi-party approach, an unrelated third party may be used to verify that a user is indeed human. This can be done through the use of cookies, certificates, or tokens. While these approaches can avoid accessibility issues common to more traditional CAPTCHAs, they pose problems of their own. Certificates may disclose a user’s identity, and worse, their disabilities, to a service provider. Token-based approaches may not pose the same privacy risk, but questions remain regarding who should be authorized to issue such tokens.
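The general shape of a token-based exchange can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the scheme used by any particular provider: the functions `issue_token` and `verify_token` are hypothetical, and the sketch assumes the issuing third party and the website share a secret key, which real systems would likely replace with stronger cryptography. The point it illustrates is that the token carries only a random value and a signature, with no information about the user.

```python
import hashlib
import hmac
import secrets


def issue_token(key: bytes) -> str:
    """Third party: hand out a signed random value after verifying a human."""
    nonce = secrets.token_hex(16)  # random, carries no identity information
    signature = hmac.new(key, nonce.encode(), hashlib.sha256).hexdigest()
    return f"{nonce}.{signature}"


def verify_token(token: str, key: bytes, spent: set) -> bool:
    """Website: accept a token once if its signature matches the shared key."""
    nonce, separator, signature = token.partition(".")
    if not separator:
        return False  # malformed token
    expected = hmac.new(key, nonce.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    if nonce in spent:
        return False  # each token may be redeemed only once
    spent.add(nonce)
    return True
```

Here the website learns only that some trusted issuer vouched for the visitor. The sketch leaves open the governance question raised above, namely who gets to hold the key and issue tokens in the first place.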

In summary, most accessibility issues related to CAPTCHAs are associated with earlier, interactive stand-alone approaches, which still tend to be quite common on the web. Newer approaches often trade inaccessibility for other logistical, security, or privacy issues. Looking forward, the Working Group seems most optimistic about token-based, multi-party approaches, calling for the development of a third party capable of supporting identity management on the web. For now, the Working Group recommends that non-interactive approaches be used where possible. In cases where these are infeasible or pose too much security risk, another option is for web developers to give users a choice of interactive CAPTCHA tasks, better accommodating differences in ability and language.

With this draft report, the Working Group stresses that all persons need to be able to confirm that they are indeed human with minimal intrusion and interruption as they interact across the web. After a period of public feedback and comment, Inaccessibility of CAPTCHA is currently undergoing revisions, with a new draft expected to be released in the future. For now, the current draft can be viewed at: https://www.w3.org/TR/turingtest/

This and other documents, standards, and initiatives for accessibility on the web were explored at the recent ASIS&T Annual Meeting 2022 by members of the ASIS&T Standards Committee. This is just one of many areas that the Standards Committee follows closely. To find out more about the work of this committee or contact us, please visit: https://www.asist.org/about/committees-and-task-forces/

Cite this article in APA as: Dobreski, B. (2022, November 15). Can CAPTCHA be more accessible? Information Matters, Vol. 2, Issue 11. https://informationmatters.org/2022/11/can-captcha-be-more-accessible/

Brian Dobreski

Brian Dobreski is an Assistant Professor in the School of Information Sciences at the University of Tennessee-Knoxville. His research focuses on the social implications of metadata, resource description, and other knowledge organization practices, as well as the concepts of personhood and personal identity in information. Brian received his Ph.D. in information science from Syracuse University. He has authored works in publications including Journal of Documentation, Knowledge Organization, Cataloging & Classification Quarterly, Education for Information, International Journal of Digital Curation, and Journal of Education for Library and Information Science. He currently serves as President of the Canada & U.S. Chapter of the International Society for Knowledge Organization.