In a Wizard of Oz test, users believe they are interacting with an automated system, but a human manually produces the outputs in real time. It allows you to test the value of AI or automation before building the actual system.
Use it to test AI or automation features where building the real system is expensive. It is ideal for validating whether users will trust and act on AI-generated output.
- Design the interface or output format users will see — exactly as the automated system would show it
- Recruit users to interact with the 'system' in a realistic test scenario
- Behind the scenes, a human (the 'wizard') produces the outputs manually in real time
- Users should not know a human is involved during the test
- Measure: do they use the output? Trust it? Does it change their behaviour?
- Debrief honestly after the test and gather detailed feedback
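The three questions in the "Measure" step can be tallied with a few lines of code once sessions are logged. The following is a minimal sketch, not a prescribed method: the `Session` fields, the 1-to-5 trust scale, and the sample data are all assumptions made up for illustration, so substitute whatever your test actually records.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    # Hypothetical fields; adapt them to what your test captures.
    user_id: str
    used_output: bool        # did the participant act on the wizard's output?
    trust_rating: int        # self-reported trust, assumed scale 1 (low) to 5 (high)
    changed_behaviour: bool  # did they do something differently afterwards?


def summarise(sessions: list[Session]) -> dict[str, float]:
    """Return usage rate, mean trust, and behaviour-change rate."""
    if not sessions:
        raise ValueError("no sessions recorded")
    n = len(sessions)
    return {
        "usage_rate": sum(s.used_output for s in sessions) / n,
        "mean_trust": mean(s.trust_rating for s in sessions),
        "behaviour_change_rate": sum(s.changed_behaviour for s in sessions) / n,
    }


# Made-up sample data, for illustration only.
sessions = [
    Session("u1", True, 4, True),
    Session("u2", True, 5, False),
    Session("u3", False, 2, False),
    Session("u4", True, 3, True),
]
summary = summarise(sessions)
print(summary)
```

With a handful of participants these figures are directional at best, so read them alongside the qualitative feedback from the debrief rather than as statistical evidence.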
Testing an 'AI playlist creator': the UI shows a text input labelled 'Describe your perfect playlist.' Behind the scenes, a human curator builds the playlist manually in real time. Users who get a great playlist try the feature again 3x more often. Learning: the quality threshold for AI playlist generation is much higher than initially assumed.
Please contact the author for more information on these examples at linkedin.com/in/kshitijrege
- Letting the wizard produce higher-quality output than the real AI will be able to; keep output quality realistic
- Forgetting to debrief honestly — users deserve to know what they participated in
- Running it too long — 1-2 week tests maximum
- Testing Business Ideas — David Bland