You can configure log sources to ingest directly into the data lake tier without mirroring to the analytics tier. Direct ingestion is useful when you have high-volume log sources that you want to retain for hunting and forensic purposes but don't need for real-time alerting.
By ingesting these sources directly into the data lake tier, you avoid analytics tier ingestion costs while still making the data available for KQL queries, Spark notebooks, and long-term analysis. This article helps you determine which log sources are good candidates for direct data lake ingestion based on their value for detection, hunting, and investigation workloads.
Configure direct data lake ingestion from the connector setup pages or the Table management page in the Microsoft Defender portal. For more information, see Configure table settings in Microsoft Sentinel.
Which logs should you ingest into the data lake?
After you onboard to Microsoft Sentinel data lake, you can choose which log sources to send to the data lake tier, the analytics tier, or both. The analytics tier is optimized for real-time detection and alerting, while the data lake tier is optimized for cost-effective long-term retention and hunting. Use the following guidance to determine which log sources to ingest into each tier based on their value for different security workloads.
Analytics tier use cases
You use the analytics tier to ingest log data into Microsoft Sentinel workspaces, where you can run analytics rules, custom detections, and live queries. Ingest log sources into the analytics tier when you need:
- Real-time detection and correlation: Alert on critical events from endpoints, identity systems, cloud security controls, and network perimeter devices.
- Active incident investigation: Run live queries against current data during incident response.
- High-fidelity signals: Ingest sources with direct detection value, such as EDR alerts, privileged access logs, authentication events, and threat intelligence indicators.
Data lake tier use cases
You use the data lake tier to store logs at lower cost for workloads that don't require real-time alerting. Ingest log sources into the data lake tier when you need:
- High-volume log retention: Store sources that are useful for forensic analysis or periodic threat hunts but are too costly to retain in the analytics tier.
- Threat hunting: Run cross-log searches, trend analysis, and historical queries to identify patterns across extended time ranges.
- Batch analytics and summarization: Use Spark notebooks, KQL, or similar tools to enrich, correlate, or summarize data, then forward only high-priority signals to the analytics tier.
- Machine learning and advanced analytics: Apply big data techniques to identify complex relationships and anomalies in historical data.
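The batch summarization pattern in the list above can be illustrated with a minimal Python sketch. It doesn't use any Microsoft Sentinel API. The record fields (`src_ip`, `action`), the function name `summarize_denied_connections`, and the threshold are all assumptions made for illustration. In practice, the raw records would come from a data lake query or a Spark notebook, and the summarized rows would be written to a table in the analytics tier.

```python
from collections import Counter
from typing import Iterable


def summarize_denied_connections(events: Iterable[dict], threshold: int) -> list[dict]:
    """Count denied connections per source IP; keep sources at or above the threshold."""
    counts = Counter(
        event["src_ip"] for event in events if event.get("action") == "deny"
    )
    # Only these summarized rows would be forwarded to the analytics tier.
    return [
        {"src_ip": ip, "denied_count": count}
        for ip, count in sorted(counts.items())
        if count >= threshold
    ]


# Hypothetical raw firewall traffic records held in the data lake tier.
raw_events = [
    {"src_ip": "10.0.0.5", "action": "deny"},
    {"src_ip": "10.0.0.5", "action": "deny"},
    {"src_ip": "10.0.0.5", "action": "deny"},
    {"src_ip": "10.0.0.7", "action": "deny"},
    {"src_ip": "10.0.0.9", "action": "allow"},
]

high_priority = summarize_denied_connections(raw_events, threshold=3)
print(high_priority)  # [{'src_ip': '10.0.0.5', 'denied_count': 3}]
```

The high-volume raw data stays in the lower-cost tier, and only a small, detection-relevant summary reaches the tier where analytics rules can run.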
Analytics rules and custom detections
You can't run analytics rules or custom detections on data in the data lake tier. If you ingest logs only into the data lake tier, those logs don't generate alerts. To maintain real-time detection coverage, keep time-sensitive, high-fidelity log sources in the analytics tier.
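One way to guard against this kind of coverage gap is to compare the tables that your detections query against your tier configuration before you change a table's settings. The following Python sketch assumes you maintain both mappings yourself. The tier labels (`"analytics"`, `"lake"`), the rule names, and the function `find_coverage_gaps` are hypothetical and aren't part of any Microsoft Sentinel API. The table names are used only as examples.

```python
def find_coverage_gaps(
    table_tiers: dict[str, str], rule_tables: dict[str, set[str]]
) -> dict[str, list[str]]:
    """Return, per rule, the tables it queries that are ingested only into the data lake tier."""
    gaps: dict[str, list[str]] = {}
    for rule_name, tables in rule_tables.items():
        lake_only = sorted(t for t in tables if table_tiers.get(t) == "lake")
        if lake_only:
            gaps[rule_name] = lake_only
    return gaps


# Hypothetical tier configuration, maintained by hand for this sketch.
table_tiers = {
    "SigninLogs": "analytics",
    "CommonSecurityLog": "lake",
    "DnsEvents": "lake",
}

# Hypothetical mapping of detection rules to the tables they query.
rule_tables = {
    "Impossible travel sign-in": {"SigninLogs"},
    "Beaconing to rare domain": {"DnsEvents", "CommonSecurityLog"},
}

gaps = find_coverage_gaps(table_tiers, rule_tables)
print(gaps)  # {'Beaconing to rare domain': ['CommonSecurityLog', 'DnsEvents']}
```

Any rule that appears in the result depends on data that can't generate alerts in its current tier.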
Choose an ingestion tier by log source type
Use the following table as a general guide to decide where to ingest each log source type. Assess your own workloads, alerting requirements, and risk tolerance when you configure log ingestion. Some log sources have dedicated Microsoft Sentinel connectors, while others might require Syslog, CEF, API-based, or custom connectors for ingestion.
| Log source type | Typical log volume | Value for real-time threat detection and alerting | Value for threat hunting | Value for incident investigation and forensics | Data lake only ingestion fit |
|---|---|---|---|---|---|
| AAA (TACACS/Radius) | Medium | High | High | High | Poor fit |
| Active Directory (on-premises) | High | High | High | High | Poor fit |
| Application Logs | High | Medium | Medium | High | Suitable fit |
| AV Logs (Windows Events 5000s & 3rd party) | Medium | High | High | High | Poor fit |
| Azure Activity | Medium | High | High | High | Poor fit |
| Biometric Access System Logs | Low | Medium | Low | High | Suitable fit |
| Building Security System Logs | Low | Low | Low | Medium | Suitable fit |
| Call Center/VoIP Logs | Medium | Low | Low | Medium | Suitable fit |
| CASB | High | High | High | High | Suitable fit |
| Citrix/Horizon/ALBs | Medium | Medium | Medium | High | Suitable fit |
| Cloud IAM | Medium | High | High | High | Poor fit |
| Cloud PaaS | High | High | High | High | Suitable fit |
| Cloud Security Controls | Medium | High | Medium | High | Poor fit |
| Cloud Storage (S3, Blob, etc.) Logs | High | Low | High | High | Poor fit |
| CRM Audit Logs | Low-Medium | Low | Low | Medium | Poor fit |
| Database Audit Tools | Medium | High | High | High | Suitable fit |
| DHCP Logs | Medium | Medium | Medium | High | Suitable fit |
| DLP Alerts | Low | High | High | High | Suitable fit |
| DNS Logs | High | High | High | High | Suitable fit |
| Endpoint Detection and Response (EDR) (Alerts) | Medium | High | High | High | Poor fit |
| Endpoint Detection and Response (EDR) (Raw) | High | High | High | High | Suitable fit |
| Email Security (3rd party alerts) | Medium | High | Medium | High | Poor fit |
| ERP Audit Logs | Low-Medium | Low | Low | Medium | Suitable fit |
| File Integrity | Low | Medium | Medium | High | Suitable fit |
| Firewall Threat/Malware/IPS/IDS | High | High | High | High | Poor fit |
| Firewall Traffic Logs | High | High | High | High | Suitable fit |
| GitHub/GitLab/Code Repo Logs | Low-Medium | Medium | Medium | High | Suitable fit |
| Google Workspace Logs | Medium | Medium | Medium | High | Suitable fit |
| Identity (Microsoft Entra ID, Okta, LDAP) | Medium | High | High | High | Poor fit |
| IIS/Apache Logs | Medium | High | High | High | Suitable fit |
| IoT Device Logs | High | Medium | Medium | Medium | Suitable fit |
| Kubernetes/Container Logs (alerts, critical) | High | High | High | High | Poor fit |
| Kubernetes/Container Logs (raw logs) | High | High | High | High | Suitable fit |
| LAN/WAN Router Switch | High | Medium | Medium | Medium | Suitable fit |
| Linux Server AuditD | Medium | High | High | High | Poor fit |
| Mobile Device Management (Microsoft Intune) | Medium | Medium | Medium | Medium | Suitable fit |
| Microsoft Office Logs (Teams, Office, SharePoint) | Medium | Medium | Medium | High | Poor fit |
| Microsoft XDR Alerts (Defender: Office, Identity, Endpoint, CloudApp) | Medium | High | High | High | Poor fit |
| Multifactor authentication (MFA) | Medium | High | Medium | High | Poor fit |
| Netflow | High | Medium | High | Medium | Suitable fit |
| Network Detection (Corelight, Vectra, Darktrace) | High | High | High | High | Poor fit |
| OT/ICS System Logs | Medium | High | High | High | Suitable fit |
| PAM (Privileged Access Management) | Low | High | High | High | Poor fit |
| PIM (Privileged Identity Management) | Low | High | High | High | Poor fit |
| POS System Logs | High | High | High | High | Suitable fit |
| Proxy Logging (URL filtering) | High | High | High | High | Suitable fit |
| Salesforce Audit Logs | Medium | Medium | Medium | High | Suitable fit |
| SD-WAN | Medium | Medium | Medium | Medium | Suitable fit |
| ServiceNow Audit Logs | Low | Low | Low | Medium | Suitable fit |
| SIEM/SOAR Platform Logs | Medium | High | High | High | Not recommended |
| Slack/Teams Collaboration Logs | Medium | Low | Medium | Medium | Suitable fit |
| Sysmon (Endpoint, for EDR complement) | Medium | High | High | High | Suitable fit |
| Threat Intelligence Indicators | Low | High | High | High | Not recommended |
| VDI Logs | Medium | Medium | Medium | High | Suitable fit |
| VPN | Medium | High | High | High | Not recommended |
| Vulnerability Scanning | Low | Medium | Medium | Medium | Suitable fit |
| Web Application Firewall (WAF) Logs | Medium | High | High | High | Suitable fit |
| Windows Server Events | High | High | High | High | Not recommended |
| XDR Source Logs (Defender: Office, Identity, Endpoint, CloudApp) | Medium | High | High | High | Not recommended |
| Zoom Meeting Logs | Low-Medium | Low | Low | Medium | Suitable fit |
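
To apply this guidance to your own inventory, you could encode the relevant columns and filter for candidates. The following Python sketch copies six rows from the preceding table. The `LogSource` class and the `lake_only_candidates` function are illustrative names, and the filter (suitable fit plus a given volume) is only one possible way to shortlist sources, not an official selection method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSource:
    name: str
    volume: str
    detection_value: str
    lake_only_fit: str


# Rows copied from the table above (volume, detection value, and fit columns only).
SOURCES = [
    LogSource("DNS Logs", "High", "High", "Suitable fit"),
    LogSource("Firewall Traffic Logs", "High", "High", "Suitable fit"),
    LogSource("Netflow", "High", "Medium", "Suitable fit"),
    LogSource("PAM (Privileged Access Management)", "Low", "High", "Poor fit"),
    LogSource("Threat Intelligence Indicators", "Low", "High", "Not recommended"),
    LogSource("Vulnerability Scanning", "Low", "Medium", "Suitable fit"),
]


def lake_only_candidates(sources: list[LogSource], volume: str = "High") -> list[str]:
    """Names of sources rated 'Suitable fit' for data lake only ingestion at the given volume."""
    return [
        s.name
        for s in sources
        if s.lake_only_fit == "Suitable fit" and s.volume == volume
    ]


candidates = lake_only_candidates(SOURCES)
print(candidates)  # ['DNS Logs', 'Firewall Traffic Logs', 'Netflow']
```

High-volume sources rated as a suitable fit are usually where direct data lake ingestion saves the most, so they're a reasonable place to start your review.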