Full Report
Microsoft is investigating a new Microsoft 365 outage that is affecting Teams customers and causing call failures. [...]
Analysis Summary
# Incident Report: Microsoft 365 Service Outage Affecting Teams and Core Services
## Executive Summary
Microsoft experienced a significant service outage affecting Microsoft 365 customers, manifesting primarily as call failures in Microsoft Teams. Although the incident was initially reported as affecting only Teams, user reports indicated a broader disruption that also involved Outlook, OneDrive, and Exchange. At the time of reporting, the root cause and full scope were still under investigation, with Microsoft analyzing service telemetry.
## Incident Details
- Discovery Date: Not stated. The source describes an ongoing incident ("Since the incident started more than one hour ago").
- Incident Date: Undisclosed; the outage was ongoing at the time of reporting.
- Affected Organization: Microsoft (as the service provider).
- Sector: Technology/Cloud Services (SaaS).
- Geography: Global (regions impacted not specified in detail).
## Timeline of Events
### Initial Access
- Date/Time: Not applicable (this is a service outage, not a cyberattack).
- Vector: Internal service issue or coding error (inferred from previous related incidents; the root cause was still under investigation).
- Details: Microsoft acknowledged the outage, specifically noting that users "may not be able to receive calls placed through Microsoft Teams-provisioned auto attendants and call queues."
### Lateral Movement
- Not applicable (Internal service instability/failure).
### Data Exfiltration/Impact
- Data Exfiltration: None reported; the impact is on service availability.
- Impact: Users experienced call failures in Teams, and broader issues with authentication, Outlook, OneDrive, and Exchange access.
### Detection & Response
- Detection: Outage monitoring service Downdetector received hundreds of user reports in the period of more than an hour since the incident began.
- Response Actions: Microsoft logged service alert TM1022107 in the Microsoft 365 admin center and began analyzing service telemetry and call metadata to determine next steps.
## Attack Methodology
*Note: As this appears to be a service stability/internal failure incident rather than a malicious cyber intrusion, standard MITRE ATT&CK categories are not fully applicable.*
- Initial Access: Service Degradation/Failure (Internal trigger).
- Persistence: N/A
- Privilege Escalation: N/A
- Defense Evasion: N/A
- Credential Access: Authentication problems reported, but no evidence of theft.
- Discovery: Microsoft analyzing service telemetry.
- Lateral Movement: N/A
- Collection: N/A
- Exfiltration: N/A
- Impact: Service unavailability/degraded functionality across multiple M365 services.
## Impact Assessment
- Financial: Undisclosed business productivity loss due to downtime.
- Data Breach: No data breach reported.
- Operational: Significant disruption to communication (Teams call failures) and core productivity suite access (Outlook, Exchange, potentially SharePoint/Bing).
- Reputational: Negative impact, compounded by the frequency of outages and by recent related incidents.
## Indicators of Compromise
*Note: As a service outage, traditional Indicators of Compromise (IOCs) related to malware or intrusion tools are not available.*
- Network indicators: N/A
- File indicators: N/A
- Behavioral indicators: Failed Teams call establishment, inability to authenticate to M365 services, and differences in access between mobile and native apps.
## Response Actions
- Containment measures: None detailed; Microsoft was analyzing service telemetry and call metadata to determine next steps.
- Eradication steps: Not yet detailed, pending root cause analysis.
- Recovery actions: Ongoing investigation to restore full functionality across affected services.
## Lessons Learned
- Repeated service instability: Microsoft is facing recurring major outages, which suggests systemic fragility in recent updates (the source links this to a previous buggy update affecting authentication and to DNS changes).
- Inconsistent impact reporting: Initial assessment (Teams only) did not match user experience (multiple services affected), highlighting a need for rapid, accurate scope assessment.
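
The scope-assessment point above can be illustrated with a minimal sketch. It assumes user reports arrive tagged with a service name and that a per-service baseline of normal report volume is available. The function name `assess_scope`, the `factor` threshold, and all sample numbers are hypothetical and are not taken from Microsoft's tooling or from the incident data.

```python
from collections import Counter


def assess_scope(reports, baseline, factor=3.0):
    """Return the services whose report volume is at least `factor` times baseline.

    reports:  iterable of dicts with a "service" key (hypothetical report format)
    baseline: dict mapping service name to its normal report count per window
    """
    counts = Counter(r["service"] for r in reports)
    affected = []
    for service, count in counts.items():
        # Services with no known baseline default to 1 so any burst is flagged.
        expected = baseline.get(service, 1)
        if count >= factor * expected:
            affected.append(service)
    return sorted(affected)


# Made-up illustrative data, not real incident figures.
reports = (
    [{"service": "Teams"}] * 40
    + [{"service": "Outlook"}] * 12
    + [{"service": "OneDrive"}] * 2
)
baseline = {"Teams": 5, "Outlook": 3, "OneDrive": 2}

print(assess_scope(reports, baseline))  # ['Outlook', 'Teams']
```

Comparing every service against its own baseline, rather than looking only at the service named in the first alert, is what lets a check like this surface a wider scope early.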
## Recommendations
- Conduct a forensic review of recent updates to M365 authentication systems (including the one linked to the previous weekend's outage) to isolate and fix any systemic coding deficiencies.
- Improve the speed and accuracy of service telemetry analysis during major incidents to immediately identify the scope across all potentially affected M365 components (Teams, Exchange, SharePoint, etc.).
- Enhance change management procedures, particularly around DNS changes and authentication updates, to include more robust rollback or canary testing mechanisms.
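
As a rough illustration of the canary-testing recommendation, the sketch below compares the error rate of a small canary deployment against the existing baseline and decides whether to promote the change, roll it back, or wait for more traffic. The function `canary_decision` and its thresholds (`max_ratio`, `min_requests`, `floor`) are assumptions made for this example and do not describe Microsoft's actual change management process.

```python
def canary_decision(baseline_errors, baseline_total, canary_errors, canary_total,
                    max_ratio=2.0, min_requests=100, floor=0.01):
    """Decide what to do with a canary rollout based on relative error rates.

    Returns "wait" if the canary has too little traffic to judge, "rollback" if
    its error rate exceeds the allowed limit, and "promote" otherwise.
    """
    if canary_total < min_requests:
        return "wait"

    baseline_rate = baseline_errors / baseline_total if baseline_total else 0.0
    canary_rate = canary_errors / canary_total

    # Allow up to max_ratio times the baseline error rate, but never less than
    # `floor`, so a near-zero baseline does not trigger rollbacks on noise.
    allowed = max(baseline_rate * max_ratio, floor)
    return "rollback" if canary_rate > allowed else "promote"


# Made-up illustrative numbers.
print(canary_decision(10, 10_000, 50, 1_000))  # rollback
print(canary_decision(10, 10_000, 1, 1_000))   # promote
print(canary_decision(10, 10_000, 0, 50))      # wait
```

A gate of this kind would run before a DNS or authentication change is widened beyond the canary population, so that a faulty update is reverted while its impact is still small.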