Last updated: April 2025
What is a Transparency Note?
An AI system includes not only the technology, but also the people who will use it, the people who will be affected by it, and the environment in which it is deployed. Microsoft's Transparency Notes are intended to help you understand how the AI technology behind Copilot works, the choices we made that influence the system's performance and behavior, and the importance of thinking about the whole system, so that users of Copilot can take control of their own experiences and understand the steps we are taking to provide a safe and secure product.
Microsoft's Transparency Notes are part of a broader effort at Microsoft to put our AI Principles into practice. To find out more, see the Microsoft AI Principles.
The basics of Microsoft Copilot
Introduction
Copilot is an AI-powered experience that helps provide users with the information they're seeking and supports them in answering a wide range of questions, regardless of the situation or topic. The refreshed Copilot goes beyond answering basic information retrieval queries and focuses on generating content to offer more proactive support to users when completing tasks. Our growing understanding of how AI can help people learn, discover, and be more creative has required us to build a different type of product. The new Copilot seeks to be an open-ended, dynamic experience that addresses user needs in a more intuitive way.
At Microsoft, we take our commitment to responsible AI seriously. The updated Copilot experience has been developed in line with Microsoft's AI Principles, Microsoft's Responsible AI Standard, and in partnership with responsible AI experts across the company, including Microsoft's Office of Responsible AI, our engineering teams, Microsoft Research, and Aether. You can learn more about responsible AI at Microsoft here.
In this document, we describe our approach to responsible AI for Copilot. Ahead of release, we leveraged Microsoft's state-of-the-art methods to map, measure, and manage potential risks and misuse of the system and to secure its benefits for users. As we have continued to evolve Copilot, we have also continued to learn and improve our responsible AI efforts. This document will be updated periodically to communicate our evolving processes and methods.
Key terms
Classifiers Machine learning models that help to sort data into labeled classes or categories of information. In the updated Copilot experience, one way in which we use classifiers is to help detect potentially harmful content submitted by users or generated by the system, so that we can mitigate generation of that content and misuse or abuse of the system.
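To make the role of classifiers more concrete, below is a minimal, hypothetical sketch in Python of a classifier gate applied to text. It is not Copilot's implementation: the harm categories, threshold, and scoring function are illustrative stand-ins for trained machine learning models.

```python
# Minimal illustrative sketch of a classifier gate for prompts and responses.
# The categories, threshold, and scoring function below are hypothetical
# stand-ins for trained ML models; they are not Copilot's actual classifiers.

from dataclasses import dataclass

HARM_CATEGORIES = ("hate", "violence", "sexual", "self-harm")  # hypothetical labels


@dataclass
class ClassifierResult:
    category: str
    score: float  # 0.0 (benign) to 1.0 (high confidence of harm)


def score_text(text: str) -> list[ClassifierResult]:
    """Stand-in for trained classifiers; a real system calls ML models here."""
    lowered = text.lower()
    # Toy keyword heuristic, purely for illustration.
    return [
        ClassifierResult(category, 0.9 if category in lowered else 0.0)
        for category in HARM_CATEGORIES
    ]


def is_blocked(text: str, threshold: float = 0.8) -> bool:
    """Flag content if any harm category meets the (hypothetical) threshold."""
    return any(result.score >= threshold for result in score_text(text))


# The same gate can screen a user prompt (input) or a generated reply (output).
print(is_blocked("Tell me a joke about cats"))  # False
```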
Grounding For certain conversations where users are seeking information, Copilot is grounded in web search results. This means that Copilot centers its response on high-ranking content from the web and provides hyperlinked citations following generated text responses. Note that, at this time, user prompts in voice mode will not trigger a web search, so those responses will not include citations.
Large language models (LLMs) Large language models (LLMs) in this context are AI models that are trained on large amounts of text data to predict words in sequences. LLMs can perform a variety of tasks, such as text generation, summarization, translation, classification, and more.
Mitigation A method or combination of methods designed to reduce potential risks that may arise from using the AI features within Copilot.
Multi-modal models (MMMs) Multi-modal models (MMMs) are AI models that are trained on different types of data, such as text, images, or audio. These models can perform a variety of tasks, such as writing text, describing images, recognizing speech, and finding information across different types of data.
Prompts Inputs in the form of text, images, and/or audio that a user sends to Copilot to interact with the AI features within Copilot.
Red teaming Techniques used by experts to assess the limitations and vulnerabilities of a system and to test the effectiveness of planned mitigations. Red team testing involves testers adopting both benign and adversarial personas to identify potential risks and is distinct from systematic measurement of risks.
Responses Text, images, or audio that Copilot outputs in response to a prompt or as part of the back and forth with the user. Synonyms for "response" include "completion," "generation," and "answer."
Small language models (SLMs) Small language models (SLMs) in this context are AI models that are trained on smaller, more focused amounts of data compared to large language models. Despite their smaller size, SLMs can perform a variety of tasks, such as text generation, summarization, translation, and classification. While they may not match the extensive capabilities of LLMs, SLMs are often more resource efficient and can be highly effective for specific, targeted applications.
System Message The system message (sometimes referred to as a "metaprompt") is a program that serves to guide the system's behavior. Parts of the system message help align system behavior with Microsoft AI Principles and user expectations. For example, a system message may include a line such as "do not provide information or create content that could cause physical, emotional, or financial harm."
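As a rough illustration of how a system message is used, here is a hypothetical sketch in Python that prepends a system message to a conversation before it is sent to a model. The wording and structure shown are assumptions, apart from the example line quoted in the definition above; Copilot's actual system message is not reproduced here.

```python
# Illustrative sketch only: prepending a system message to a conversation.
# The wording below is hypothetical, except for the example line quoted in
# the definition above; this is not Copilot's actual system message.

SYSTEM_MESSAGE = (
    "You are a helpful AI assistant.\n"
    "Do not provide information or create content that could cause "
    "physical, emotional, or financial harm.\n"
    "Decline requests that violate these rules and explain why."
)


def build_request(conversation: list[dict], user_prompt: str) -> list[dict]:
    """Combine the system message, prior turns, and the new user prompt."""
    return (
        [{"role": "system", "content": SYSTEM_MESSAGE}]
        + conversation
        + [{"role": "user", "content": user_prompt}]
    )


print(build_request([], "What's the weather like in Seattle?"))
```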
Capabilities
System behavior
With Copilot, we've developed an innovative approach to bring users a more personalized, engaging AI experience that can help them with a variety of tasks. This approach leverages a variety of advanced technologies, such as language and multi-modal models from Microsoft, OpenAI, and other model developers. Prior to public release, we worked on implementing safety techniques for the models underlying the new Copilot experience to develop a customized set of capabilities and behaviors that provide an enhanced Copilot experience. In the updated Copilot, users can send prompts in natural language text or voice. Responses are presented to users in several different formats, such as chat responses in text form (with traditional links to web content as necessary) and images (if an image request was made as part of the prompt). If users send prompts in natural language voice within Copilot Voice mode, they will receive audio responses.
When a user enters a prompt in Copilot, the prompt, conversation history, and the system message are sent through several input classifiers to help filter out harmful or inappropriate content. This is a crucial first step for helping to improve model performance and mitigate situations in which users might attempt to prompt the model in a way that could be unsafe. Once the prompt passes through the input classifiers, it is sent to an SLM to determine whether the request requires grounding data from the web and which language model should respond to the request. All models generate a response using the user's prompt and recent conversation history to contextualize the request, the system message to align responses with Microsoft AI Principles and user expectations, and, if appropriate, search results to ground responses in existing, high-ranking content from the web.
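The sketch below restates this flow in Python under the assumption that each step is a separate service. Every function is a hypothetical placeholder standing in for a real component (input/output classifiers, an SLM router, web search, and the selected language model); it is not Copilot's actual implementation.

```python
# Sketch of the orchestration flow described above. All functions are
# hypothetical placeholders, not Copilot's actual APIs or services.

def run_input_classifiers(prompt: str, history: list[str]) -> bool:
    return False  # placeholder: real classifiers flag harmful content


def route_request(prompt: str, history: list[str]) -> dict:
    # Placeholder: a small language model decides whether web grounding is
    # needed and which underlying model should answer.
    needs_web = "latest" in prompt.lower() or "today" in prompt.lower()
    return {"ground": needs_web, "model": "chat-model"}


def web_search(query: str) -> list[str]:
    return [f"top-ranked result for: {query}"]  # placeholder search results


def generate(model: str, system_message: str, history: list[str],
             prompt: str, search_results: list[str] | None) -> str:
    return f"[{model}] response to '{prompt}'"  # placeholder generation


def run_output_classifiers(response: str) -> bool:
    return False  # placeholder: screens the generated response


def respond(prompt: str, history: list[str], system_message: str) -> str:
    if run_input_classifiers(prompt, history):                  # 1. screen the input
        return "This request can't be completed."
    route = route_request(prompt, history)                      # 2. SLM routing
    results = web_search(prompt) if route["ground"] else None   # 3. optional grounding
    response = generate(route["model"], system_message, history, prompt, results)
    if run_output_classifiers(response):                        # 4. screen the output
        return "Response filtered."
    return response


print(respond("What's the latest on renewable energy?", [], "Be helpful and safe."))
```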
Responses are presented to users in several different formats, such as chat responses in text form, traditional links to web content, images, and audio responses. When responses are provided in the form of text—and the responses are grounded in data from the web—the output contains hyperlinked citations listed below the text so users can access the website(s) that were used to ground the response and learn more about the topic from there.
Copilot also helps users to create new stories, poems, song lyrics, and images. When Copilot detects user intent to generate creative content (such as a user prompt that begins with "write me a …"), the system will, in most cases, generate content responsive to the user's prompt. Similarly, when Copilot detects user intent to generate an image (such as a user prompt that begins with "draw me a …"), Copilot will, in most cases, generate an image responsive to the user's prompt. When Copilot detects user intent to modify an uploaded image (such as a user prompt that begins with "add a …"), Copilot will, in most cases, modify an image responsive to the user's prompt. Copilot may not respond with creative content when the user prompt contains certain terms that could result in problematic content.
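To illustrate the kind of intent routing described above, here is a toy Python sketch. The keyword checks and handler names are hypothetical; Copilot's actual intent detection is model-based, not a simple prefix match.

```python
# Toy sketch of routing by detected creative intent. The prefix checks and
# handler names are hypothetical; real intent detection is model-based.

def detect_intent(prompt: str) -> str:
    p = prompt.lower().strip()
    if p.startswith("draw me"):
        return "generate_image"
    if p.startswith("add a"):
        return "edit_uploaded_image"
    if p.startswith("write me"):
        return "generate_text"
    return "chat"


def handle(prompt: str) -> str:
    intent = detect_intent(prompt)
    # Safety checks still apply: prompts containing certain terms that could
    # produce problematic content may not receive creative output at all.
    return f"route '{prompt}' to the {intent} handler"


print(handle("Write me a poem about autumn"))
```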
Users with Microsoft accounts (MSA) now also have an option to subscribe to Copilot Pro, which offers an enhanced experience, including accelerated performance, use of Copilot Voice capabilities for longer periods of time, and in some cases, access to new, experimental features. Copilot Pro is currently available in a limited number of countries, and we plan to make Copilot Pro available in more markets soon.
Intended Safety Behavior
Our goal for Copilot is to be helpful to users. By leveraging best practices from other Microsoft generative AI products and services, we aim to limit Copilot from generating problematic content and increase the likelihood of a safe and positive user experience. While we have taken steps to mitigate risks, generative AI models like those behind Copilot are probabilistic and can make mistakes, meaning mitigations may occasionally fail to block harmful user prompts or AI-generated responses. If you encounter harmful or unexpected content while using Copilot, let us know by providing feedback so we can continue to improve the experience.
Use cases
Intended uses
Copilot is intended to support users in answering a wide range of questions regardless of the situation or topic. Users can interact with Copilot using text, image, and audio inputs where interactions are intended to feel more like natural conversations with an AI system. Additionally, if users are interacting with Copilot via text to seek out specific information on topics where Copilot might require more information to produce a more accurate answer, the experience is intended to connect users with relevant search results, review results from across the web, and summarize information users are looking for. In Copilot, users can:
- Summarize real-time information when chatting via text. When users interact with Copilot via text, the system will perform web searches if it needs more information and will use the top web search results to generate a summary of the information to present to users. These summaries include citations to webpages to help users see and easily access the sources for search results that helped ground Copilot's summary. Users can click on these links to go straight to the source if they want to learn more; a minimal sketch of this citation format appears after this list.
- Chat with an AI system using text. Users can chat with Copilot via text and ask follow-up questions to find new information and receive support across a wide variety of topics.
- Interface with AI using voice. Copilot can not only receive audio input, but also produce audio output in one of four voices selected by users. Audio-to-audio capabilities enable users to interact with Copilot in a more natural and fluid manner.
- Receive digestible news content. Users can use Copilot to receive a summary of news, weather, and other updates based on selected topic areas via the Copilot Daily feature and listen to these briefings in a podcast-like format. This feature will pull content from authorized sources that have agreements with Microsoft.
- Get help generating new ideas. Each time users interact with the Copilot experience, they will see a set of cards that they can click to start chatting with Copilot about useful and interesting topics. If users have interacted with other Microsoft consumer services, the cards will be personalized, in line with our privacy policies. Over time, cards in Copilot may be personalized based on a user's chat history. Users can opt out of personalization at any time in settings, and we are still exploring options for offering personalization to users in the EEA (European Economic Area) and the UK at a later date.
- Generate creative content. When chatting with Copilot, users can create new poems, jokes, stories, images, and other content with help from the Copilot experience. Copilot can also edit images uploaded by users if requested.
- Carry out tasks on Android. Users can interact with Copilot through the Android platform via voice to carry out certain tasks. These tasks are setting timers and alarms, making phone calls, sending SMS messages, and ordering an Uber. Users must confirm the phone call, SMS message, and Uber order before the tasks are completed.
- Assist with research tasks. Copilot can carry out research tasks by surfacing in-depth resources, offering detailed breakdowns of topics, and linking to sources to help users go beyond quick answers for more complex queries.
Considerations when choosing other use cases
We encourage users to review all content before making decisions or acting based on Copilot's responses, as AI can make mistakes. Additionally, there are certain scenarios that we recommend avoiding or that go against our Terms of Use. For example, Microsoft does not allow Copilot to be used in connection with illegal activities or for any purpose intended to promote illegal activity.
Limitations
The language, image, and audio models that underlie the Copilot experience may include training data that can reflect societal biases, which in turn can potentially cause Copilot to behave in ways that are perceived as unfair, unreliable, or offensive. Despite our intensive model training and safety fine-tuning, as well as the implementation of the responsible AI controls and safety systems that we place on training data, user prompts, and model outputs, AI-driven services are fallible and probabilistic. This makes it challenging to comprehensively block all inappropriate content, leading to the risk that potential biases, stereotypes, ungroundedness, or other types of harm could appear in AI-generated content. Some of the ways those limitations might manifest in the Copilot experience are listed here.
- Stereotyping: The Copilot experience could potentially reinforce stereotypes. For example, when translating "He is a nurse" and "She is a doctor" into a genderless language such as Turkish and then back into English, Copilot might inadvertently yield the stereotypical (and incorrect) results of "She is a nurse" and "He is a doctor." Another example is when generating an image based on the prompt "Fatherless children," the system could generate images of children from only one race or ethnicity, reinforcing harmful stereotypes that might exist in publicly available images used to train the underlying models. Copilot might also reinforce stereotypes based on the contents in the user's input image by relying on components of the image and making assumptions that might not be true. We have implemented mitigations to reduce the risk of content that contains offensive stereotypes, including input and output classifiers, fine-tuned models, and system messages.
- Overrepresentation and underrepresentation: Copilot could potentially over- or under-represent groups of people, or even not represent them at all, in its responses. For example, if text prompts that contain the word "gay" are detected as potentially harmful or offensive, this could lead to the underrepresentation of legitimate generations about the LGBTQIA+ community. In addition to including input and output classifiers, fine-tuned models, as well as system messages, we use prompt enrichment in Designer as one of several mitigations to reduce the risk of content that over- or under-represents groups of people.
- Inappropriate or offensive content: The Copilot experience can potentially produce other types of inappropriate or offensive content. Examples include the ability to generate content in one modality (e.g., audio) that is inappropriate in the context of its prompt or when compared to the same output in a different modality (e.g., text). Other examples include AI-generated images that potentially contain harmful artifacts such as hate symbols, content that relates to contested, controversial, or ideologically polarizing topics, and sexually charged content that evades sexual-related content filters. We have put in place mitigations to reduce the risk of generations that contain inappropriate or offensive content, such as input and output classifiers, fine-tuned models, and system messages.
- Information reliability: While Copilot aims to respond with reliable sources where necessary, AI can make mistakes. It could potentially generate nonsensical content or fabricate content that might sound reasonable but is factually inaccurate. Even when drawing responses from high-authority web data, responses might misrepresent that content in a way that might not be completely accurate or reliable. We remind users through the user interface and in documentation like this that Copilot can make mistakes. We also continue to educate users on the limitations of AI, such as encouraging them to double-check facts before making decisions or acting based on Copilot's responses. When users are interacting with Copilot via text, it will attempt to ground itself in high-quality web data to reduce the risk that generations are ungrounded.
- Multilingual performance: There could be variations in performance across languages, with English performing best at the time of releasing the updated Copilot. Improving performance across languages is a key investment area, and recent models have led to improved performance.
- Audio limitations: Audio models might introduce other limitations. Broadly speaking, the acoustic quality of the speech input, non-speech noise, vocabulary, accents, and insertion errors might also affect whether Copilot processes and responds to a user's audio input in a satisfactory way. Moreover, because user prompts in Copilot Voice do not trigger web searches, Copilot may not be able to respond to current events in voice mode.
- Dependence on Internet connectivity: The updated Copilot experience relies on Internet connectivity to function. Disruptions in connectivity can have an impact on the availability and performance of the service.
System performance
In many AI systems, performance is often defined in relation to accuracy (i.e., how often the AI system offers a correct prediction or output). With Copilot, we're focused on Copilot as an AI-powered assistant that reflects the user's preferences. Therefore, two different users might look at the same output and have different opinions of how useful or relevant it is to their unique situation and expectations, which means that performance for these systems must be defined more flexibly. We broadly consider performance to mean that the application performs as users expect.
Best practices for improving system performance
Interact with the interface using natural, conversational language. Interacting with Copilot in a way that is comfortable for the user is key to getting better outcomes through the experience. Just as people adopt techniques to communicate effectively in their day-to-day lives, interacting with Copilot as an AI-powered assistant through text or speech that feels familiar to the user may help elicit better results.
User experience and adoption. Effective use of Copilot requires users to understand its capabilities and limitations. There might be a learning curve, and users might wish to reference various Copilot resources (e.g., this document and our Copilot FAQs) to effectively interact with and benefit from the service.
Mapping, measuring, and managing risks
Like other transformational technologies, harnessing the benefits of AI is not risk-free, and a core part of Microsoft's Responsible AI program is designed to identify and map potential risks, measure those risks, and manage them by building mitigations and continually improving Copilot over time. In the sections below we describe our iterative approach to map, measure, and manage potential risks.
Map: Careful planning and pre-deployment adversarial testing, such as red teaming, helps us map potential risks. The underlying models that power the Copilot experience went through red team testing from testers who represent multidisciplinary perspectives across relevant topic areas. This testing was designed to assess how the latest technology would work both with and without any additional safeguards applied to it. The intention of these exercises at the model level is to produce harmful responses, surface potential avenues for misuse, and identify capabilities and limitations.
Before making the Copilot experience publicly available in a limited release preview, we also conducted red teaming at the application level to evaluate Copilot for shortcomings and vulnerabilities. This process helped us better understand how the system could be utilized by a wide variety of users and helped us improve our mitigations.
Measure: In addition to evaluating Copilot against our existing safety evaluations, the use of red teaming described above helped us to develop evaluations and responsible AI metrics corresponding to identified potential risks, such as jailbreaks, harmful content, and ungrounded content.
We collected conversational data targeting these risks, using a combination of human participants and an automated conversation generation pipeline. Each evaluation is then scored by either a pool of trained human annotators or an automated annotation pipeline. Each time the product changes, existing mitigations are updated, or new mitigations are proposed, we update our evaluation pipelines to assess both product performance and the responsible AI metrics. These automated evaluation pipelines draw on a combination of conversations collected with human evaluators and synthetic conversations generated with LLMs prompted to test policies in an adversarial fashion. Each of these safety evaluations is automatically scored with LLMs. For newly developed evaluations, each evaluation is initially scored by human labelers who read the text content or listen to the audio output, and then converted to automatic LLM-based evaluations.
The intended behavior of our models, in combination with our evaluation pipelines—both human and automated—enables us to rapidly perform measurement for potential risks at scale. As we identify new issues over time, we continue to expand the measurement sets to assess additional risks.
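The sketch below illustrates the shape of such an evaluation loop in Python: each collected or synthetic conversation is scored against a policy and defect rates are aggregated per risk area. The judge function is a trivial placeholder; in practice, scoring is done by trained human annotators or by an LLM prompted with a grading rubric, and the risk areas and rubric shown are assumptions drawn from the text above.

```python
# Minimal sketch of an automated safety evaluation loop. The judge is a
# placeholder for human annotators or an LLM-based grader; risk areas and
# the rubric are illustrative.

from dataclasses import dataclass


@dataclass
class Conversation:
    risk_area: str          # e.g. "jailbreak", "harmful content", "ungrounded content"
    transcript: list[str]


def llm_judge(transcript: list[str], rubric: str) -> int:
    """Placeholder for a grader returning a severity score (0 = no violation)."""
    return 0


def evaluate(conversations: list[Conversation], rubric: str) -> dict[str, float]:
    """Aggregate defect rates per risk area across the evaluation set."""
    totals: dict[str, list[int]] = {}
    for conv in conversations:
        totals.setdefault(conv.risk_area, []).append(
            1 if llm_judge(conv.transcript, rubric) > 0 else 0
        )
    return {area: sum(flags) / len(flags) for area, flags in totals.items()}


metrics = evaluate(
    [Conversation("jailbreak", ["user: ...", "assistant: ..."])],
    rubric="Does the assistant comply with the safety policy?",
)
print(metrics)  # e.g. {'jailbreak': 0.0}
```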
Manage: As we identified potential risks and misuse through red teaming and measured them with the approaches described above, we developed additional mitigations that are specific to the Copilot experience. Below, we describe some of those mitigations. We will continue monitoring the Copilot experience to improve product performance and our risk mitigation approach.
- Phased release plans and continual evaluation. We are committed to learning and improving our responsible AI approach continuously as our technologies and user behavior evolve. Our incremental release strategy has been a core part of how we move our technology safely from the lab into the world, and we're committed to a deliberate, thoughtful process to secure the benefits of the Copilot experience. We are making changes to Copilot regularly to improve product performance and existing mitigations, and implement new mitigations in response to our learnings.
- Leveraging classifiers and the system message to mitigate potential risks or misuse. In response to user prompts, LLMs may produce problematic content. We discussed types of content that we try to limit in the System Behavior and Limitations sections above. Classifiers and the system message are two examples of mitigations that have been implemented in Copilot to help reduce the risk of these types of content. Classifiers classify text to flag potentially harmful content in user prompts or generated responses. We also utilize existing best practices for leveraging the system message, which involves giving instructions to the model to align its behavior with Microsoft's AI principles and with user expectations.
- Consenting to Copilot image uploads. The first time a user uploads an image containing faces to Copilot, they will be asked to provide their consent to having their biometric data uploaded to Copilot. If a user does not opt in, the image won't be sent to Copilot. All images, regardless of whether they contain faces or not, are deleted within 30 days after the conversation ends. A hypothetical sketch of this consent gate appears after this list.
- AI disclosure. Copilot is also designed to inform people that they are interacting with an AI system. As users engage with Copilot, we offer various touchpoints designed to help them understand the capabilities of the system, disclose to them that Copilot is powered by AI, and communicate limitations. The experience is designed in this way to help users get the most out of Copilot and minimize the risk of overreliance. Disclosures also help users better understand Copilot and their interactions with it.
- Media provenance. When Copilot generates an image, we have enabled a "Content Credentials" feature, which uses cryptographic methods to mark the source, or "provenance," of all AI-generated images created using Copilot. This technology employs standards set by the Coalition for Content Provenance and Authenticity (C2PA) to add an extra layer of trust and transparency for AI-generated images.
- Automated content detection. When users upload images as part of their chat prompt, Copilot deploys tools to detect child sexual exploitation and abuse imagery (CSEAI). Microsoft reports all apparent CSEAI to the National Center for Missing and Exploited Children (NCMEC), as required by US law. When users upload files to analyze or process, Copilot deploys automated scanning to detect content that could lead to risks or misuse, such as text that could relate to illegal activities or malicious code.
- Terms of Use and Code of Conduct. Users should abide by Copilot's applicable Terms of Use and Code of Conduct, the Microsoft Services Agreement, and the Microsoft Privacy Statement, which, among other things, inform them of permissible and impermissible uses and the consequences of violating the terms. The Terms of Use also provides additional disclosures for users and serves as a reference for users to learn about Copilot. Users who commit serious or repeated violations may be temporarily or permanently suspended from the service.
- Feedback, monitoring, and oversight. The Copilot experience builds on existing tooling that allows users to submit feedback, which is reviewed by Microsoft's operations teams. Furthermore, our approach to mapping, measuring, and managing risks will continue to evolve as we learn more, and we are already making improvements based on feedback gathered during preview periods.
Learn more about responsible AI
Microsoft Responsible AI Transparency Report
Microsoft AI Principles
Learn more about Microsoft Copilot
About this document
© 2024 Microsoft Corporation. All rights reserved. This document is provided "as-is" and for informational purposes only. Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it. Some examples are for illustration only and are fictitious. No real association is intended or inferred.
This document is not intended to be, and should not be construed as providing, legal advice. The jurisdiction in which you're operating may have various regulatory or legal requirements that apply to your AI system. Consult a legal specialist if you are uncertain about laws or regulations that might apply to your system, especially if you think those might impact these recommendations. Be aware that not all of these recommendations and resources will be appropriate for every scenario, and conversely, these recommendations and resources may be insufficient for some scenarios.
Published: 10/01/2024
Last updated: 10/01/2024