While prompting and steering techniques have been actively developed for general-purpose generative AI, few such resources exist for assistive visual question answering (VQA) systems and blind users; interfaces follow rigid interaction patterns with limited opportunities for customization. We invited 11 blind users to customize their interactions with a conversational VQA system. Drawing on 418 interactions, reflections, and post-study interviews, we analyze the prompting-based techniques participants adopted, including those introduced in the study and those they developed independently in real-world settings. VQA interactions were often lengthy: participants averaged 3 turns per interaction, sometimes up to 21, and their input text was typically about one tenth the length of the responses they heard. Even assistive applications built on state-of-the-art LLMs often lacked verbosity controls, relied on inaccessible image framing, and offered no camera guidance. We show how customization techniques such as prompt engineering can help participants work around these limitations. Alongside a new publicly available dataset, we offer insights for interaction design at both the query and system levels.
Each participant folder contains two folders organized as follows:
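As a quick way to inspect this layout, the following is a minimal Python sketch that walks the per-participant folders and reports their contents. The root path "dataset" is an assumption; adjust it to wherever the archive is unpacked. The script makes no assumptions about the names of the two subfolders inside each participant folder.

# Minimal sketch for inspecting the dataset layout; "dataset" is a placeholder
# for the unpacked archive location, not the actual distribution path.
from pathlib import Path

DATASET_ROOT = Path("dataset")  # adjust to the local copy of the dataset

for participant_dir in sorted(p for p in DATASET_ROOT.iterdir() if p.is_dir()):
    # Each participant folder is expected to contain two subfolders;
    # list whichever subfolders are present and count the files in each.
    for subfolder in sorted(p for p in participant_dir.iterdir() if p.is_dir()):
        n_files = sum(1 for f in subfolder.rglob("*") if f.is_file())
        print(f"{participant_dir.name}/{subfolder.name}: {n_files} files")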
The contents of this paper were developed under a grant from the National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR Grant No. 90REGE0024). This material is based upon work partially supported by the NSF under Grant No. 2229885 (NSF Institute for Trustworthy AI in Law and Society, TRAILS). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Please cite our corresponding paper if you find our dataset useful. The following is the BibTeX entry for our paper:
@inproceedings{..
}