What Data Does Microsoft Collect

Microsoft has once again confirmed that it collects Windows 10 diagnostic data in order to provide users with relevant recommendations based on their experiences. Collecting diagnostic data also helps Windows 10 operate properly and stay up to date. Well, we are thrilled to learn that Microsoft helps us by gathering basic information about our devices and our Internet usage, including security information, all of it aimed at helping Microsoft fix our problems.

Does Microsoft Spy On You?

But what exactly does Microsoft collect? What data does it consider useful or necessary for its purposes? Here is the answer. Please note that other major companies, including Google and Apple, gather more or less the same data at the same level:

Common Data: Device ID – not the user-provided device name, but the ID that is unique to the device. HTTP header information and IP address (not the IP address of the device itself, but the source address in the network packet header). OS name, version, language preferences, Microsoft User ID (if the user has a Microsoft account), Xbox UserID. The ID of any application the user is logged in to is also tracked, along with the various IDs used to tie related events together and the times at which those events occur. Device class – Mobile, Laptop, Desktop, Server and so on.

Device Properties: Model, serial number, OS. Firmware/BIOS (type, manufacturer, model, version). In addition to OS name, version, and genuine status, Microsoft gathers installation type, subscription status, processor architecture, number of cores, speed, model, and the manufacturer of the device. Device identifier and Xbox serial number are included in the report. Memory and storage data: total memory, memory speed, available memory, total storage capacity and disk type. Battery capacity and charge capacity. InstantOn support. Hardware chassis type, color (!), form factor. It also determines whether the device is a virtual machine.

Device Capabilities: How many cameras the device has, and whether it has only a front-facing camera, only a rear-facing camera, or both. Whether the device has a touch screen, how many hardware touch points are supported, and whether this function is on or off. Processor capabilities. Whether the device includes a Trusted Platform Module and, if so, which TPM version is used. Whether virtualization hardware is enabled in the firmware – IOMMU, SLAT support. Whether voice interaction is supported, and how many active microphones are installed. How many displays are connected to the device, and their resolution and DPI.

Network, Connectivity and Configuration Data: Connectivity and configuration capabilities of the device and its status. Whether the device has wireless capabilities, OEM or platform face detection, OEM or platform video stabilization, quality level set, advanced Camera Capture mode, HDR probability, Low Light probability, OEM vs. Platform implementation. Network system capabilities, local or Internet connectivity status, proxy and gateway, DHCP, DNS addresses, other details. It determines whether the network is free or paid, and whether the wireless driver is emulated or not. Access point: mode capable, manufacturer, model, MAC address. WDI Version. Name of networking driver service. Wi-Fi device hardware ID and its manufacturer, as well as scan attempt counts and item counts. Microsoft wants to learn whether MAC randomization is supported and whether it is enabled. How many spatial streams and channel frequencies are supported, and whether Manual or Auto Connect mode is used. Microsoft counts the time and result of each connection attempt. It knows whether the device is in Airplane mode, for how long, and the number of attempts. Interface description provided by the manufacturer, data transfer rates, cipher algorithm, IMEI (Mobile Equipment ID) and MCCO (Mobile Country Code). Name of the mobile operator and service provider. Available SSIDs and BSSIDs. IP address type – IPv4 or IPv6. Signal quality percentage and changes. Hotspot presence detection. Hotspot connections and success rate. TCP connection performance, Miracast device names, hashed IP address. Login success or failure, login sessions (beginning and end).

Preferences and Settings: All user settings: device, system, time, user-provided device name, network, Internet, personalization, Cortana use, apps installed and activated, history of gaming, ease of access (if any), privacy settings, update and security info. Information about the device's status on the network: whether it is domain-joined or cloud-domain joined (for example, if the device is part of a company-managed network), a hashed representation of the domain name, etc. In the case of a company, Microsoft knows the Enterprise Organization ID and Commercial ID. Mobile Device Management enrollment settings and status. Encryption settings, BitLocker, Secure Boot, other systems and their status. All Windows Update history, settings, status. Developer Unlock settings and status. Default browser choice and default apps when the device is turned on, as well as default language settings for apps, input, speech, keyboard, display. All App store update settings.
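To illustrate what a "hashed representation" of a domain name (or a "hashed IP address") means in practice, here is a minimal Python sketch. It rests on assumptions made purely for illustration: the list above does not say which hash function Microsoft uses, so SHA-256, the function name hash_identifier, and the domain corp.example.com are all hypothetical choices, not Microsoft's actual method.

```python
import hashlib


def hash_identifier(value: str) -> str:
    """Return a hex digest that stands in for the raw value.

    Hypothetical helper: SHA-256 is an assumed choice for illustration.
    """
    # Trim whitespace and lower-case the input so that
    # "Corp.Example.com" and "corp.example.com" produce the same digest.
    normalised = value.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # "corp.example.com" is a made-up domain used only for illustration.
    print(hash_identifier("corp.example.com"))
```

The same input always yields the same digest, so hashed values can still be matched against each other and grouped, even though the original name is not sent in readable form.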

Peripherals: Every peripheral's name, class, model, manufacturer, and description, including its state, install state, checksum, driver name, package name, versions, driver state, problem code, whether the driver runs in kernel mode, whether the driver is signed, and image size. Apart from the manufacturer, Microsoft wants to know the peripheral’s defined ID in order to match a device to a driver INF file.

Product and Service Usage Data, including OS, Apps and Services: OS component and app feature usage, user navigation and interaction with app and Windows features, including user input (such as the name of a new alarm set, user menu choices, user favorites). Time and count of app/component launches, duration of use, session GUID, process ID. App time and status – whether apps are running in the foreground or background, and whether they are sleeping or receiving active user interaction. Microsoft also finds out the user interaction method and duration (whether the user uses the keyboard, pen, mouse, touch screen, speech, or game controller, and for how long). Cortana launch entry point and reason. Notification delivery requests and status. Which apps are used to edit images and videos. Statistics about SMS, MMS, vCard, and broadcast message usage (on the primary or secondary line). Everything about incoming and outgoing calls, Voicemail usage statistics on the primary or secondary line, and whether emergency alerts are received or displayed. Content searches within an app. Reading activity, including bookmarking used, print used, layout changed and more.

General Product State: All information about Windows and application state, including Start Menu, Taskbar pins, online/offline periods of time and status. Personalization impressions delivered, and whether the user clicks or hovers on UI controls or hotspots. App launch state with a deep link (such as Groove launched with an audio track to play) or a share contract (such as MMS launched to share a picture). User feedback: Like or Dislike, rating and more. Caret location or position within documents and media files (for example, how much of the text in a book has been read in a single session or how much of a song has been listened to, even when the user reads or listens to the music offline).

Device and Service Performance Data, including details about the health of the hardware and software: Error codes, error messages, name and ID of the app that fails to respond or misbehaves, process reporting the error. DLL predicted to be the source of the error – xyz.dll. System-generated files (product and app logs, files traced to diagnose a crash or hang). Registry keys and other system settings. User-generated files, such as .doc, .ppt, .csv – when they are indicated as a potential cause of a crash or hang. All the details and counts of abnormal shutdowns, hangs, crashes. All the crash failure data, such as OS, OS component, device, driver, manufacturer (1st or 3rd party app data). Crash and hang dumps: state of the working memory at the point of the crash; memory in use by the kernel at the point of the crash; memory in use by the application at the point of the crash; all the physical memory used by Windows at the point of the crash; class and function name within the module that failed.

Performance of the Device and Its Software, and Reliability Data: User Interface interaction duration, such as Start Menu display times, browser tab switch times, app launch and switch times, Cortana and search performance, reliability. Device on/off performance, such as boot, shutdown, power on/off, lock/unlock times, user authentication times (face recognition or fingerprint duration). In-app responsiveness, such as time to set an alarm, time to fully render in-app navigation menus, time to sync the reading list, to start GPS navigation, time to attach a picture to an MMS, time to complete a Windows Store transaction. User input responsiveness – onscreen keyboard invocation times for different languages, time to show auto-complete words, touch or pen latency, latency for handwriting recognition to words, Narrator screen reader responsiveness, CPU score. UI and media performance and glitches/smoothness – video playback frame rate, audio glitches, animation glitches (stutter when bringing up Start), graphics score, time to first frame, play/pause/stop/seek responsiveness, time to render a PDF, performance of dynamic video streaming from OneDrive. Disk footprint – free disk space, out-of-memory conditions, disk score, etc. Excessive resource utilization, such as components impacting performance or battery life through high CPU usage during different screen and power states. Background task performance, such as download times, Windows Update scan, Windows Defender Antivirus scan times, disk defrag times, mail fetch times, service startup and state transition times, time to index on-device files for search results. Peripherals and devices, such as USB device connection times or time needed to connect to a wireless display, Wi-Fi connection time, time to get an IP address from DHCP, printing times, network availability. Smart card authentication times, automatic brightness environmental response times. Device setup, such as first setup, time to install updates or apps, etc. Time needed to recognize connected devices, such as printers or monitors. Time the device needs to set up a Microsoft Account. Battery and power life, such as power draw by component (process/CPU/GPU/Display), hours of screen time, sleep state transition details, temperature and thermal throttling, battery drain in a power state (screen off or screen on), components and processes requesting power use during screen off, auto-brightness details. Time the device is plugged into AC vs. running on battery, battery state transitions. Service responsiveness (operation, URI, latency, service success/error codes, protocol). Diagnostic heartbeat – a regular signal to validate the health of the diagnostics system.

Software Installation, Update Information, Inventory: Microsoft collects data about apps, drivers, update packages, and OS components installed on the device, including each component’s name, ID, package family name, product, SKU, availability, catalog, content, bundle IDs, app or driver publisher, version and type (Win32 or UWP). Installation type – clean install, repair, restore, OEM, retail, upgrade, update. Naturally, Microsoft gathers data about install date, method of installation, install directory, count of install attempts. MSI package code, product code. Original OS version at install time, and whether the installation or update was initiated by the user, initiated by an administrator, or mandatory.

Device Update Information: Update Readiness analysis of device hardware, OS components, apps. Information about drivers, such as status, progress, results. Number of applicable updates, their importance and type. Update download size and source (CDN or LAN peers). Delay upgrade status and configuration. OS uninstall and rollback status and count. Windows Update server and service URL, as well as Windows Update machine ID and Windows Insider build details.

Content Consumption Data (diagnostic details about Microsoft applications that provide media consumption functionality, such as Groove Music): This data is not intended to capture user viewing, listening or reading habits (see points above): Information about movie consumption functionality on the device, such as video width, height, color palette, compression type, encryption type. Instructions for streaming content for the user (smooth streaming manifest of chunks of content files that must be pieced together to stream the content based on screen resolution and bandwidth). URL for a specific two-second chunk of content if there is an error. Full-screen viewing mode details. Information about reading consumption functionality, such as app accessing content and status; options used to open a Windows Store book; language of the book; time spent reading text; content type and size details. Information about music and TV consumption, such as the service URL of the song being downloaded from the music service (this data is collected when an error occurs to facilitate restoration of service); content type (video, audio, surround audio); local media library collection statistics (number of purchased tracks, number of playlists); region mismatch (user OS region and Xbox Live region). Information about photos usage, such as file source data (local, SD card, network device, OneDrive), image resolution, video length, file sizes, types and encoding. Collection view or full-screen viewer and duration of view.

Browsing, Search and Query Data (includes activity in the Microsoft browsers, Cortana, local file searches on the device): Information about Microsoft browser data – text typed in the address bar and search box; text selected for Ask Cortana search; service response time; auto-completed text if there was an auto-complete; navigation suggestions provided based on local history and favorites; browser ID; URLs (which may include search terms); page title. On-device file query, including such information as the kind of query issued and index type (ConstraintIndex, SystemIndex); number of items requested and retrieved; file extension of the search result the user interacted with; launched item kind, file extension, index of origin, and the App ID of the opening app; name of the process calling the indexer and time to service the query; a hash of the search scope, such as file, Outlook, OneNote, IE history; the state of the indices (fully or partially optimized, being built).

Inking, Typing and Speech Utterance Data, which covers details about the voice, inking, and typing input features: type of pen used (highlighter, ball point, pencil), pen color, stroke height and width, time of use. Pen gestures (click, double click, pan, zoom, rotate). Palm Touch x,y coordinates. Input latency, missed pen signals, number of frames, strokes, first frame commit time, sample rate. Ink strokes written, text before and after the ink insertion point, recognized text entered, and input language, all processed to remove identifiers, sequencing information, and other data (such as names, email addresses, and numeric values) that could be used to reconstruct the original content or associate the input with the user. Text of speech recognition results – result codes and recognized text. Model of the recognizer. System Speech language. App ID using speech features, whether the user is known to be a child. Confidence and success/failure of speech recognition.

Licensing and Purchase Data, with diagnostic details about the entitlement activity on the device: Information about purchase history, product ID, edition ID, product URI, price, order requested date/time, store client type (Web or native client), purchase quantity, payment type (credit card type or PayPal). Entitlements: service subscription status and errors, DRM and license rights details – Groove subscription or OS volume license, entitlement ID, lease ID, package ID of the install package. Entitlement revocation. License type (trial, offline vs. online), duration. License usage session.

