Failing early. Lessons learnt from usability testing

When I was asked to evaluate a government website, I encountered a discrepancy between government stakeholder assumptions and the needs of the non-profit sector.

The challenge

  • Evaluate a design in late-stage development
  • Improve services for government and non-profit organisations
  • Test the assumptions used to discover requirements
  • Gather feedback from community organisation fundraisers

I love working with teams who have a drive to make things better. But strong motivation to solve problems must be balanced with an understanding of which problems to solve.

I was brought into the team to check that a web application was accessible. It was the last development sprint.

The team had built a website to help community organisations and non-profit fundraisers comply with government legislative requirements and make the formerly paper-based process easier.

I needed to become familiar with the complexities of the product, understand the needs of community fundraisers, then plan and run research activities to evaluate the proposed design.

Interviewing internal stakeholders revealed that subject matter experts had completed requirements discovery using expert-level knowledge of backstage processes and legislation. Community fundraisers had not seen the design.

Non-government organisation (NGO) fundraiser needs

  • Community NGO operating models are distinct from the big industry leaders
  • Community NGOs rely on older volunteers who are familiar with previous models
  • Organisations need to proactively support adoption of new models
  • Government needs to support compliance via organisations and volunteers

Testing

I observed two user groups with a think-aloud usability test protocol. Group one was a mix of staff administrators who oversee regulation and support backstage processes, and group two was made up of community organisation fundraisers who need to comply with regulation.

Observing staff subject matter experts helped me understand how internal stakeholders saw the process fitting the legislative model. Observing community user workflows helped me understand how the system fitted real-world use.

I had a hunch that comparing internal and external perspectives would help build empathy. By including both staff and community users in the research, I hoped to understand how well the design was aligned with the needs of both community users and staff.

Staff user group

Staff subject matter experts were familiar with the backstage processes.

Staff participants said that the design was clean and simple, and that filling in the forms online was much easier than the paper-based version.

Staff participants were observed completing tasks with a reasonably high success rate, and subjectively rated the system as performing well.

  • “It’s simple and basic, nothing that I think needs to be changed.”
  • “Easy to use overall.”
  • “That’s the way our process works.”

After the first few tests it looked like the new system was meeting staff expectations.

Community fundraisers group

Fundraisers were asked to use the new system to complete their workflows the way they normally would on paper. I asked them to talk to me about how and why they work that way.

Community users became disoriented by a process flow which didn’t match the way they needed to collect information.

Specialist jargon in the interface instructions caused confusion and led to incorrect information being supplied. These errors would require call-backs to resolve, increasing the support burden for both users and the service desk.

Community users became increasingly frustrated with poorly sequenced navigation that appeared to take them backwards.

I observed them failing key tasks, losing work because there was no save function, or giving up after being unable to complete tasks.

  • “I’m lost – haven’t I been here already and already answered these questions?”
  • “I’ve had this screen before. We’re going around in circles. I thought I’d gone back to the previous screen…”
  • “Where am I now? Are these traps that you set for us?”
  • “I’m really angry now. This is the worst thing I’ve seen in years”
  • “Where’s the exit? I would get out of the whole thing. This is disgusting. I’d complain to the Minister”
  • “I find this impossible… I’d rather print the form off and spend two hours completing it on paper”

Areas for improvement

Underlying problems and themes became clear throughout the test.

  • Process flows needed to match the way users gather information in the real world
  • Users needed to know what information they needed at hand before they started
  • Inaccessible overlays needed to be replaced with straightforward linear flows
  • With multiple business touch points, users still needed one point of contact
  • The form needed to stop asking for the same information twice, even when two different business units required it
  • With multiple steps to complete, a save function was needed to retain work done
  • Internal business vocabulary and jargon needed to be simpler plain language
  • A clear progress bar was needed so that process steps could be understood

Measuring the difference

With hundreds of insights, I was confident that analysis would demonstrate a measurable difference in how the system matched the needs of staff and the needs of the community.

By correlating participant task success rates with questionnaire responses, along with in-depth voice-of-the-community feedback, I reported on the overall similarities and differences between the two groups and the challenges they faced.
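
To illustrate the kind of comparison described above, the sketch below shows one way to correlate task success with questionnaire ratings and contrast the two groups. It is a minimal example in Python, assuming a hypothetical per-participant results table with columns group, task_success and satisfaction; the values shown are illustrative only and are not the study data.

  # Minimal sketch: compare staff and community results (illustrative data only)
  import pandas as pd
  from scipy import stats

  results = pd.DataFrame({
      "group":        ["staff"] * 5 + ["community"] * 5,
      "task_success": [0.9, 0.8, 1.0, 0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.4],  # proportion of tasks completed
      "satisfaction": [4, 5, 4, 4, 5, 2, 1, 2, 1, 2],                      # questionnaire rating, 1-5
  })

  # Average task success and satisfaction per group
  print(results.groupby("group")[["task_success", "satisfaction"]].mean())

  # Correlation between task success and questionnaire rating across all participants
  rho, p_corr = stats.spearmanr(results["task_success"], results["satisfaction"])
  print(f"Spearman rho = {rho:.2f}, p = {p_corr:.3f}")

  # Non-parametric comparison of satisfaction ratings between the two groups
  staff = results.loc[results["group"] == "staff", "satisfaction"]
  community = results.loc[results["group"] == "community", "satisfaction"]
  u, p_diff = stats.mannwhitneyu(staff, community, alternative="two-sided")
  print(f"Mann-Whitney U = {u:.1f}, p = {p_diff:.3f}")

The same structure works with whatever per-participant measures a study records; the point is simply that putting the two groups side by side makes the gap between staff and community results visible.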

Comparing staff and community user feedback

Staff participants rated the system significantly higher than community participants for satisfaction and effectiveness. Staff reported far fewer issues, and the problems they did encounter were less severe than those observed with community users. Staff used their knowledge of the backstage process to recover from errors.

Staff participants were far more forgiving, and important design problems were not represented in their feedback.

Actions taken

Findings were written up in a detailed report and distributed for the team to consider. A summary of findings was presented in a follow-up workshop to prioritise improvement opportunities.

After commitments were made to address critical issues, the development team needed to estimate the cost of refactoring, because many of the features that tested poorly had already been built.

Key takeaways

  • Staff expert users are not representative of all users
  • Use experts to inform initial assumptions
  • Test early with external stakeholders
  • Problems are expensive to fix after development