Note: Rich image descriptions in Narrator are available in preview through the Windows Insider Program.
Image descriptions in Narrator provide detailed descriptions of visual content such as images, charts, graphs, diagrams, and unlabeled buttons. Rich image descriptions give blind users the detailed context they need to understand visual content. This feature is currently available on Snapdragon-powered Copilot+ PCs in the Windows Insider Program. Other Windows devices will continue to use the standard image description experience, which relies solely on online services.
Narrator uses AI models to provide detailed textual descriptions of images, charts, and graphs. When Narrator is turned on, you can press Narrator key+Ctrl+D to get a description of the image or item you are focused on.
For example, the description of an image of a nursery would be:
The image depicts a large organized arrangement of small green leafy plants which are likely sprouts or seedlings arranged in a neat dense grid pattern. Each plant is contained within a small shallow black container suggesting a nursery or a planting setup. The plants are evenly spaced creating a uniform and orderly appearance which may symbolize growth organization or a collection. The black containers provide a stark contrast to the green sprouts highlighting the focus on the plants.
Image descriptions in Narrator are designed to provide text descriptions of visual content for individuals who are blind or have low vision. The descriptions are intended to improve your understanding of images, charts, and graphs, and to support accessibility. You can regenerate an image description and copy it for future reference.
To ensure the quality of descriptions generated by Narrator, a data set including various types of images was created. These images included natural photographs, charts, graphs, screenshots, and app user interfaces. The generated descriptions were evaluated for accuracy, completeness, relevance, and usefulness. Several evaluation methods, including human expert judgments and LLM-assisted scoring, were used to find areas for improving the quality of generated descriptions.
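To illustrate the evaluation described above, the sketch below shows one simplified way that ratings along those dimensions (accuracy, completeness, relevance, and usefulness) from human experts and an LLM-assisted judge could be aggregated to flag areas for improvement. The rater labels, the 1-5 scale, and the review threshold are illustrative assumptions; this is not Microsoft's actual evaluation pipeline.

```python
from dataclasses import dataclass
from statistics import mean

# Dimensions named in this article's description of the evaluation.
DIMENSIONS = ("accuracy", "completeness", "relevance", "usefulness")

@dataclass
class Rating:
    """One rater's scores for a single generated image description."""
    rater: str              # hypothetical label, e.g. "human_expert" or "llm_judge"
    scores: dict[str, int]  # dimension -> score on an assumed 1-5 scale

def aggregate(ratings: list[Rating]) -> dict[str, float]:
    """Average each dimension across all raters."""
    return {dim: mean(r.scores[dim] for r in ratings) for dim in DIMENSIONS}

def needs_improvement(summary: dict[str, float], threshold: float = 3.5) -> list[str]:
    """Return dimensions whose average falls below an assumed quality threshold."""
    return [dim for dim, score in summary.items() if score < threshold]

if __name__ == "__main__":
    ratings = [
        Rating("human_expert", {"accuracy": 4, "completeness": 3, "relevance": 5, "usefulness": 4}),
        Rating("llm_judge",    {"accuracy": 4, "completeness": 2, "relevance": 5, "usefulness": 4}),
    ]
    summary = aggregate(ratings)
    print(summary)                      # e.g. {'accuracy': 4.0, 'completeness': 2.5, ...}
    print(needs_improvement(summary))   # ['completeness'] -> an area to improve
```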
Microsoft is committed to creating responsible AI by design. Our work is guided by a core set of principles: fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability.
Narrator may provide inaccurate image descriptions, inaccurate data from charts or graphs, or inaccurate emotional inferences. This may lead to incorrect assumptions about an image, or about the intent of visual content, based on the generated description. We continue to work on the models Narrator uses to improve the quality of the image descriptions they provide. You can submit feedback using any of the methods described in the feedback section below.
This feature should not be used to:
- Generate descriptions for medical or health-related images that could be misinterpreted as medical advice. Incorrect descriptions could lead to misinformation and potentially harmful decisions by users.
- Generate descriptions for images in legal or financial documents where accuracy is critical. Misinterpretation of such images could lead to legal disputes or financial losses.
- Generate descriptions for images containing cultural or religious symbols without proper context. Misinterpretation could lead to cultural insensitivity or offense.
- Generate descriptions for images containing maps, flags, or globes. Misinterpretation of these images could lead to misinformation about international or geopolitical matters.
To get an image description when Narrator is on, press Narrator key+Ctrl+D while focusing on visual content. To turn off image descriptions in Narrator, go to Settings > Accessibility > Narrator > Get image descriptions, page titles, and popular links and select the toggle switch.
How do I provide feedback on image descriptions in Narrator?
There could be inaccuracies in the descriptions Narrator provides. To improve the quality of descriptions, you can provide feedback by:
- Selecting the thumbs-up or thumbs-down icon on an image description in the Narrator user interface.
- Responding to occasional prompts from Windows asking you to rate or provide written feedback about the products or services you use.
- Opening Feedback Hub to find similar feedback to upvote, or to give new feedback by filling out the form.
Microsoft’s commitment to responsible AI and privacy
Microsoft has been working to advance AI responsibly since 2017, when we first defined our AI principles and later operationalized our approach through our Responsible AI Standard. Privacy and security are core principles as we develop and deploy AI systems. We work to help our customers use our AI products responsibly, share our learnings, and build trust-based partnerships. For more about our responsible AI efforts, the principles that guide us, and the tools and capabilities we've created to ensure that we develop AI technology responsibly, see Responsible AI.
Rich image description in Narrator is designed to improve accessibility for blind and low-vision users and is not intended for a wider audience. The AI models for this feature use contextual cues from the entire image, including people or entities in the background, which is how the models can still associate an image with an individual or describe emotions. Rich image descriptions in Narrator allow for emotional inferences, but they do not use biometric data. Any result that identifies an individual or infers an individual's emotion is not produced by processing the face itself, such as through facial recognition or the generation and comparison of facial templates. For example, if an image contains a photo of a popular athlete wearing their team's jersey and their specific number, the models may still return a result that identifies the individual based on those contextual cues.
This feature should not be used to infer or deduce the emotions of natural persons in the workplace or in educational institutions (for example, employees or students). Image description in Narrator can provide detailed text descriptions related to the perceived emotions of people in images. The processes underlying human emotion are complex, and there are cultural, geographical, and individual differences that influence how we perceive, experience, and express emotions. Responses related to the emotions of people in images are based on how those people appear and may not accurately reflect their internal state.
Published: February 11, 2025
Last updated: February 11, 2025