Filtering Common Event Format at source for Microsoft Sentinel

This post provides an rsyslog template for parsing Common Event Format (CEF) messages and writing the output to a dedicated log file. The resulting log file (JSON format) can then be transformed by the Azure Monitor Agent (AMA) for ingestion into Log Analytics.

The benefit of using rsyslog templates for filtering CEF messages is a much greater level of control over which messages are ingested than the default CommonSecurityLog filtering on a Linux system provides. CEF log filtering is a more complex version of the syslog filtering previously described here: https://laurierhodes.info/node/150.


Rsyslog Overview

Rsyslog is the service that processes event messages, streaming data received on port 514 to different log files according to the rules in /etc/rsyslog.conf.

The config specifies which files will be appended to by each of the different facilities.

Also note that any number of application specific config files will be included in the config as long as they reside in the /etc/rsyslog.d/ directory.
 

#                                                         
# Include all config files in /etc/rsyslog.d/             
#                                                         
$IncludeConfig /etc/rsyslog.d/*.conf                      
                                                          
                                                          
###############                                           
#### RULES ####                                           
###############                                           
                                                          
#                                                         
# First some standard log files.  Log by facility.        
#                                                         
auth,authpriv.*                 /var/log/auth.log         
*.*;auth,authpriv.none          -/var/log/syslog          
#cron.*                         /var/log/cron.log         
daemon.*                        -/var/log/daemon.log      
kern.*                          -/var/log/kern.log        
lpr.*                           -/var/log/lpr.log         
mail.*                          -/var/log/mail.log        
user.*                          -/var/log/user.log        

As an example, a basic Linux system with the Azure Monitor Agent installed has separate config files for different applications:

root@myserver:~# ls /etc/rsyslog.d/
10-azuremonitoragent-omfwd.conf  openmediavault-pamfaillock.conf  postfix.conf


By looking at the Azure Monitor Agent config we can see how the service works with Linux.
 

# Azure Monitor Agent configuration: forward logs to azuremonitoragent

template(name="AMA_RSYSLOG_TraditionalForwardFormat" type="string" string="<%PRI%>%TIMESTAMP% %HOSTNAME% %syslogtag%%msg:::sp-if-no-1st-sp%%msg%")

# Forwarding all events through TCP port
*.* action(type="omfwd"
template="AMA_RSYSLOG_TraditionalForwardFormat"
queue.type="LinkedList"
action.resumeRetryCount="-1"
queue.size="50000"
queue.saveonshutdown="on"
target="127.0.0.1" 
Port="28330" 
Protocol="tcp"
)

The template declaration specifies which parts of a message are forwarded to Microsoft's custom AMA service, and how they are formatted. The percent symbols delimit a property of the message. PRI represents the facility and alert priority of the message as a single integer. The property that carries the Common Event Format data is 'msg'.
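
To make the PRI value concrete: syslog encodes it as facility * 8 + severity, so it can be split back into its two parts with integer division. The following is a minimal Python sketch of that arithmetic only. The function name decode_pri is my own, and the names used for facilities 12 to 15 vary between implementations, so treat those four as an assumption.

```python
# Severity and facility names in numeric order (list index == syslog code).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]
FACILITIES = [
    "kern", "user", "mail", "daemon", "auth", "syslog", "lpr", "news",
    "uucp", "cron", "authpriv", "ftp",
    "ntp", "audit", "alert", "clock",  # assumption: naming of 12-15 varies
    "local0", "local1", "local2", "local3",
    "local4", "local5", "local6", "local7",
]


def decode_pri(pri):
    """Split a syslog PRI integer into (facility name, severity name)."""
    facility, severity = divmod(pri, 8)
    return FACILITIES[facility], SEVERITIES[severity]


# <33> is facility 4 (auth) and severity 1 (alert): 4 * 8 + 1 = 33
print(decode_pri(33))
```

A message arriving with <33> at the front is therefore an auth/alert message, which is the same facility and severity pair used in the filter at the end of the rsyslog configuration in this post.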

Rsyslog has different output modules that allow the formatted message to be sent somewhere (see https://www.rsyslog.com/doc/v8-stable/configuration/modules/idx_output.html).  

Microsoft use the "forward" output module (omfwd), which here forwards the message over TCP to the local host on port 28330. Note that the maximum in-memory queue size is 50,000 messages, and if the Microsoft service can't keep up, messages will start getting dropped. This service (mdsd) parses messages into a stream that can be forwarded by the Azure Monitor Agent to Sentinel.

root@myserver:~# netstat -lnp | grep '28330'
tcp        0      0 127.0.0.1:28330         0.0.0.0:*               LISTEN      1049/mdsd

Common Event Format

Syslog is an RFC-defined format for messaging from Linux servers. It has a very small number of components: the timestamp received from the host, the hostname, the message etc. PRI is a number which states what facility is being written to and the alert priority.

It’s an old format that was added to by the “Common Event Format”, which imposes a standard structure within the syslog message:

CEF:Version|Device Vendor|Device Product|Device Version|Device Event Class ID|Name|Severity|[Extension]

The first 7 fields of the CEF message are separated with a pipe and are mandatory. The final element (the extension) is made up of any combination of 130-odd key-value pairs.
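
The split between the seven pipe-separated header fields and the extension can be illustrated in a few lines of Python. This is only a sketch: the sample message is made up for illustration, the function name split_cef is hypothetical, and the code deliberately ignores escaped pipes inside header values.

```python
HEADER_FIELDS = [
    "Version", "DeviceVendor", "DeviceProduct", "DeviceVersion",
    "DeviceEventClassID", "Name", "Severity",
]


def split_cef(message):
    """Return (header dict, raw extension string) for a CEF message."""
    # Drop any syslog header that precedes the CEF payload.
    payload = message[message.index("CEF:"):]
    # Split on the first 7 pipes only: 7 header fields, the rest is the extension.
    parts = payload.split("|", 7)
    header = dict(zip(HEADER_FIELDS, parts[:7]))
    header["Version"] = header["Version"].replace("CEF:", "", 1)
    extension = parts[7] if len(parts) > 7 else ""
    return header, extension


# Illustrative message only, not taken from a real device.
sample = ("Jan 18 11:07:53 server1234 CEF:0|Security|threatmanager|1.0|100|"
          "worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232")
header, extension = split_cef(sample)
print(header)
print(extension)
```

Splitting on the pipe produces eight pieces, with the extension as the eighth. That is the same numbering the mmfields module uses later in this post, where field 8 holds the key-value pairs.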

  

Microsoft Syslog forwarder quirks with Common Event Format

Common Event Format doesn’t have to use syslog. It’s primarily a structure for the data that comes in a message, so you can take CEF lines from a text file just as they are. This seems to have influenced Microsoft’s thinking: the CEF format itself (which they put in a dedicated table, CommonSecurityLog) doesn’t specify a timestamp, and only CEF messages that are delivered over syslog will have one (which is everything the SOC will be looking at with Sentinel). This is why the time column in Sentinel’s CommonSecurityLog table is titled “TimeGenerated” and there is no EventTime column as exists in the Syslog table.

The result is that the TimeGenerated field loses accuracy, especially for any systems that submit Common Event Format messages using a high-precision timestamp (using milliseconds). This also means that a syslog forwarder under load will start seeing a greater variance between TimeGenerated and the initial event time and, if the load is great enough, messages will start being dropped.

Granular filtering issues

Using the Azure Monitor Agent with syslog and CEF collection does provide filtering on facilities, but with multiple systems directing logs to the same facilities, that depth of granularity is not ideal when Sentinel's data ingestion is charged by the GB.

AMA does have the ability to filter logs further using KQL, but that has both charging implications and a compute impact on the syslog forwarder.

Ideally, I'd prefer to be highly specific in collecting a minimal amount of data for alerting, while retaining the ability to stream a broader amount of CEF data to cheaper storage for hunting.

Log Filtering with Rsyslog 

Rsyslog itself is designed for filtering messages. As it is coded in C with POSIX support, it's extremely fast. It provides an opportunity to filter messages before they are received by the Azure Monitor Agent.

There are some advantages to using rsyslog for filtering:

  • Logs can be filtered by source machine, so templates / configurations can be dedicated to specific systems. This means we can have a dedicated template for Symantec Endpoint Protection, or a template for Cisco ASA etc. Managing event collection specifically for a system allows very granular filtering to occur prior to log ingestion into Sentinel.
  • The EventTime property can be put back into the CEF data for greater precision with threat hunting. Note that a separate "custom log" table needs to be created to take the place of CommonSecurityLog. It may even be advantageous to have separate, dedicated tables for different products, with parsers built on top of them.
  • We can look at sending specific alert logs to Sentinel and broader hunting data (like auth) to cheaper storage.

Common Event Format parsing with rsyslog

The benefits of parsing Common Event Format messages with rsyslog prior to AMA ingestion are potentially significant, but there are difficulties.

CEF logs can pass 130-odd key-value pairs in the message field of the syslog message. These pairs use whitespace as a separator, but whitespace can also be part of the key name or the value. Rsyslog does support regex, but only a POSIX-compliant version that doesn't allow patterns to "look ahead" (the "(?=...)" construct) into the incoming stream. That makes a simple regex query unworkable given the potential variations in CEF data.
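
To show what is being missed, here is a short Python sketch of the kind of look-ahead split that POSIX regex cannot express. Python's re module is used purely for illustration (it is not what rsyslog runs), and the sample extension string is made up.

```python
import re

# Hypothetical extension: the msg value contains spaces, so splitting on
# whitespace alone would cut the value into pieces.
extension = "src=10.0.0.1 msg=Login failed for admin dst=10.0.0.2"

# Split only on whitespace that is immediately followed by "key=".
# The (?=...) group is the look-ahead: it is checked but not consumed.
pairs = re.split(r"\s+(?=\w+=)", extension)
print(pairs)
```

Each element of pairs is one complete key=value item with the spaces inside the msg value left intact. Even this pattern is only an approximation, as a value that itself contains something like "x=" would still be split in the wrong place.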

Without taking on the task of coding a C module for dealing with this problem, a somewhat lengthy rsyslog config can achieve the same result.

As it turns out, Microsoft uses different field names from the standard Common Event Format keys. All of those property names need to be changed to their Microsoft Sentinel equivalents as part of the message transformation process. Knowing all the key names means it's possible to parse the message and carve out all the keys by inserting quotes, commas and colons to shape the msg into valid JSON. It's not spectacular in elegance but it works.
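
The carving approach is easier to follow on a small example. The Python sketch below imitates the chain of rsyslog replace() and wrap() calls used in the configuration further down, with only four keys. It is an illustration of the idea rather than the configuration itself, and the function name and sample data are hypothetical.

```python
import json

# A small subset of the CEF key -> Sentinel column mapping, for illustration.
KEY_MAP = [
    ("src=", "SourceIP"),
    ("dst=", "DestinationIP"),
    ("spt=", "SourcePort"),
    ("msg=", "Message"),
]


def extension_to_json_fragment(extension):
    """Turn 'key=value key=value' into '"Column":"value" ,"Column":"value"'."""
    text = extension
    for cef_key, column in KEY_MAP:
        # Close the previous value, add a comma, then open the new pair.
        text = text.replace(cef_key, '" , "' + column + '":"')
    text = '"' + text + '"'           # same effect as wrap(): quote both ends
    text = text.replace('"" , ', "")  # drop the empty pair created at the start
    text = text.replace(' "', '"')    # strip the space left before closing quotes
    return text


fragment = extension_to_json_fragment(
    "src=10.0.0.1 dst=2.1.2.2 spt=1232 msg=worm stopped")
print(fragment)
record = json.loads("{" + fragment + "}")
print(record)
```

Wrapping the fragment in braces gives a JSON object that json.loads accepts, with every value as a string. The same limits apply as in the rsyslog version: a value containing a double quote, or text that looks like one of the keys, will break the shaping.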

Some notes for use with the configuration:

  • This puts back the EventTime property as a high-precision timestamp.
  • All standard Microsoft-supported CEF properties are included, but if a vendor uses other custom properties they will need to be added, otherwise the shaping of the msg field into JSON will fail.
  • No vendor will use all properties. To save CPU and parsing time, comment out the CEF properties that the vendor doesn't use.
  • Ensure there is filtering of some form enabled rather than trying to ingest everything at debug level.

All the compute involved with this configuration goes into pulling apart the msg to create a JSON version of the key-value pairs in the CEF message. The actual logic for writing the data to a dedicated log is contained in a simple if statement. At the start of the script there is also an if statement based on the machine sending the logs, which identifies the application. Try combining if statements and keeping conditionals together where it keeps administration simple, and simply duplicate the if statement if other facilities need to be added to a log.
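
When adding vendor-specific keys, it helps to remember that the replacements are plain substring matches applied in order. A short key such as in= also matches the tail of a longer key such as sourceDnsDomain=, so the longer key has to be replaced first. The Python sketch below is a hypothetical helper, not part of the rsyslog configuration, that checks an ordered key list for this kind of collision.

```python
def find_shadowed_keys(ordered_keys):
    """Return (short, longer) pairs where `short` is replaced before a
    longer key that contains it, which would corrupt the longer key."""
    problems = []
    for position, short in enumerate(ordered_keys):
        for longer in ordered_keys[position + 1:]:
            if longer != short and short in longer:
                problems.append((short, longer))
    return problems


# Keys listed in the order their replace() statements would run.
keys = ["in=", "msg=", "sourceDnsDomain=", "rt=", "sourceTranslatedPort="]
collisions = find_shadowed_keys(keys)
print(collisions)
```

An empty result means no key in the list is processed ahead of a longer key that contains it. The match is case sensitive, in line with how the keys are written in the configuration.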

 
if ( ($syslogfacility-text == 'auth') and ($syslogseverity-text == 'alert') ) then {
    action(type="omfile" file="/var/log/myapplication-cef-messages.log" template="SentinelCEFFormat")
}
 
Device Product, Device Vendor and Device Version are redundant and should be commented out.  These values are added in the template as properties exposed by the mmfields module - they will never be part of the data section of CEF.  They are left in for completeness as Microsoft supported properties.

 

#------------------
# Modules
#------------------
# mmfields splits each message into fields ($!f1, $!f2, ...) on a separator
 
module(load="mmfields")
 
#------------------
# Define Templates
#------------------
 
# Use a template for constructing a UTC date time format for the
# originating message
template(
    name = "Syslog_DateFormat" type = "list") {
    property(name="timestamp" dateformat="year" date.inUTC="on")
      constant(value="-")
    property(name="timestamp" dateformat="month" date.inUTC="on")
      constant(value="-")
    property(name="timestamp" dateformat="day" date.inUTC="on")
      constant(value="T")
    property(name="timestamp" dateformat="hour" date.inUTC="on")
      constant(value=":")
    property(name="timestamp" dateformat="minute" date.inUTC="on")
      constant(value=":")
    property(name="timestamp" dateformat="second" date.inUTC="on")
      constant(value=".")
    property(name="timestamp" dateformat="subseconds" date.inUTC="on")
      constant(value="Z")
}
      
    # Sentinel CEF Format is a JSON representation of Sentinel's CommonSecurity table (CEF)
    template(name="SentinelCEFFormat" type="list" option.jsonf="on") {
        property(outname="DeviceVendor" name="$!f2" format="jsonf")
        property(outname="DeviceProduct" name="$!f3" format="jsonf")        
        property(outname="DeviceVersion" name="$!f4" format="jsonf")
        property(outname="DeviceEventClassID" name="$!f5" format="jsonf")
        property(outname="Name" name="$!f6" format="jsonf")
        property(outname="Severity" name="$!f7" format="jsonf")
        property(outname="EventTime" name="$!sntdate" format="jsonf")
        property(outname="Computer" name="hostname" format="jsonf")
        property(outname="ProcessID" name="procid" format="jsonf")
        property(outname="ProcessName" name="syslogtag" format="jsonf")
        property(outname="Facility" name="syslogfacility-text" format="jsonf" caseConversion="upper")
        property(outname="SeverityLevel" name="syslogseverity-text" format="jsonf")
        property(name="$.mymsg")  
    }
   
# Choose the machine(s) we want logs from.  The if statement
# can be extended with 'or' conditions for multiple servers being part of the same system
if ($hostname contains 'server1234') then {
 
  # Only act on CEF messages  
  if (($rawmsg contains "CEF:") or ($rawmsg contains "%ASA-")) then {
 
    # Use mmfields for parsing the CEF message
 
    action(type="mmfields" separator="|" )
   
    # Construct the GMT date format from the message
    set $!sntdate = exec_template("Syslog_DateFormat");
   
    # Field 8 is the extended list of CEF properties (from mmfields)
    set $.mymsg = $!f8;

 

    # Standard CEF properties not mapped to a Sentinel column - leave at start
    # ("start=" must be replaced before the shorter "rt=" key further down)
    set $.mymsg = replace($.mymsg,"end=", "\" , \"end\":\"");
    set $.mymsg = replace($.mymsg,"start=", "\" , \"start\":\"");
 
    # Standard CEF properties renamed to their Microsoft Sentinel equivalents
    set $.mymsg = replace($.mymsg,"act=", "\" , \"DeviceAction\":\"");
    set $.mymsg = replace($.mymsg,"app=", "\" , \"ApplicationProtocol\":\"");
    set $.mymsg = replace($.mymsg,"c6a1=", "\" , \"DeviceCustomIPv6Address1\":\"");
    set $.mymsg = replace($.mymsg,"c6a1Label=", "\" , \"DeviceCustomIPv6Address1Label\":\"");
    set $.mymsg = replace($.mymsg,"c6a2=", "\" , \"DeviceCustomIPv6Address2\":\"");
    set $.mymsg = replace($.mymsg,"c6a2Label=", "\" , \"DeviceCustomIPv6Address2Label\":\"");
    set $.mymsg = replace($.mymsg,"c6a3=", "\" , \"DeviceCustomIPv6Address3\":\"");
    set $.mymsg = replace($.mymsg,"c6a3Label=", "\" , \"DeviceCustomIPv6Address3Label\":\"");
    set $.mymsg = replace($.mymsg,"c6a4=", "\" , \"DeviceCustomIPv6Address4\":\"");
    set $.mymsg = replace($.mymsg,"c6a4Label=", "\" , \"DeviceCustomIPv6Address4Label\":\"");
    set $.mymsg = replace($.mymsg,"cat=", "\" , \"DeviceEventCategory\":\"");
    set $.mymsg = replace($.mymsg,"cfp1=", "\" , \"DeviceCustomFloatingPoint1\":\"");
    set $.mymsg = replace($.mymsg,"cfp1Label=", "\" , \"deviceCustomFloatingPoint1Label\":\"");
    set $.mymsg = replace($.mymsg,"cfp2=", "\" , \"DeviceCustomFloatingPoint2\":\"");
    set $.mymsg = replace($.mymsg,"cfp2Label=", "\" , \"deviceCustomFloatingPoint2Label\":\"");
    set $.mymsg = replace($.mymsg,"cfp3=", "\" , \"DeviceCustomFloatingPoint3\":\"");
    set $.mymsg = replace($.mymsg,"cfp3Label=", "\" , \"deviceCustomFloatingPoint3Label\":\"");
    set $.mymsg = replace($.mymsg,"cfp4=", "\" , \"DeviceCustomFloatingPoint4\":\"");
    set $.mymsg = replace($.mymsg,"cfp4Label=", "\" , \"deviceCustomFloatingPoint4Label\":\"");
    set $.mymsg = replace($.mymsg,"cn1=", "\" , \"DeviceCustomNumber1\":\"");
    set $.mymsg = replace($.mymsg,"cn1Label=", "\" , \"DeviceCustomNumber1Label\":\"");
    set $.mymsg = replace($.mymsg,"cn2=", "\" , \"DeviceCustomNumber2\":\"");
    set $.mymsg = replace($.mymsg,"cn2Label=", "\" , \"DeviceCustomNumber2Label\":\"");
    set $.mymsg = replace($.mymsg,"cn3=", "\" , \"DeviceCustomNumber3\":\"");
    set $.mymsg = replace($.mymsg,"cn3Label=", "\" , \"DeviceCustomNumber3Label\":\"");
    set $.mymsg = replace($.mymsg,"cnt=", "\" , \"EventCount\":\"");
    set $.mymsg = replace($.mymsg,"cs1=", "\" , \"DeviceCustomString1\":\"");
    set $.mymsg = replace($.mymsg,"cs1Label=", "\" , \"DeviceCustomString1Label\":\"");
    set $.mymsg = replace($.mymsg,"cs2=", "\" , \"DeviceCustomString2\":\"");
    set $.mymsg = replace($.mymsg,"cs2Label=", "\" , \"DeviceCustomString2Label\":\"");
    set $.mymsg = replace($.mymsg,"cs3=", "\" , \"DeviceCustomString3\":\"");
    set $.mymsg = replace($.mymsg,"cs3Label=", "\" , \"DeviceCustomString3Label\":\"");
    set $.mymsg = replace($.mymsg,"cs4=", "\" , \"DeviceCustomString4\":\"");
    set $.mymsg = replace($.mymsg,"cs4Label=", "\" , \"DeviceCustomString4Label\":\"");
    set $.mymsg = replace($.mymsg,"cs5=", "\" , \"DeviceCustomString5\":\"");
    set $.mymsg = replace($.mymsg,"cs5Label=", "\" , \"DeviceCustomString5Label\":\"");
    set $.mymsg = replace($.mymsg,"cs6=", "\" , \"DeviceCustomString6\":\"");
    set $.mymsg = replace($.mymsg,"cs6Label=", "\" , \"DeviceCustomString6Label\":\"");
    set $.mymsg = replace($.mymsg,"destinationDnsDomain=", "\" , \"DestinationDnsDomain\":\"");
    set $.mymsg = replace($.mymsg,"destinationServiceName=", "\" , \"DestinationServiceName\":\"");
    set $.mymsg = replace($.mymsg,"destinationTranslatedAddress=", "\" , \"DestinationTranslatedAddress\":\"");
    set $.mymsg = replace($.mymsg,"destinationTranslatedPort=", "\" , \"DestinationTranslatedPort\":\"");
    set $.mymsg = replace($.mymsg,"Device Product=", "\" , \"DeviceProduct\":\"");
    set $.mymsg = replace($.mymsg,"Device Vendor=", "\" , \"DeviceVendor\":\"");
    set $.mymsg = replace($.mymsg,"Device Version=", "\" , \"DeviceVersion\":\"");
    set $.mymsg = replace($.mymsg,"deviceCustomDate1=", "\" , \"DeviceCustomDate1\":\"");
    set $.mymsg = replace($.mymsg,"deviceCustomDate1Label=", "\" , \"DeviceCustomDate1Label\":\"");
    set $.mymsg = replace($.mymsg,"deviceCustomDate2=", "\" , \"DeviceCustomDate2\":\"");
    set $.mymsg = replace($.mymsg,"deviceCustomDate2Label=", "\" , \"DeviceCustomDate2Label\":\"");
    set $.mymsg = replace($.mymsg,"deviceDirection=", "\" , \"CommunicationDirection\":\"");
    set $.mymsg = replace($.mymsg,"deviceDnsDomain=", "\" , \"DeviceDnsDomain\":\"");
    set $.mymsg = replace($.mymsg,"DeviceEventClassID=", "\" , \"DeviceEventClassID\":\"");
    set $.mymsg = replace($.mymsg,"deviceExternalId=", "\" , \"deviceExternalId\":\"");
    set $.mymsg = replace($.mymsg,"deviceFacility=", "\" , \"DeviceFacility\":\"");
    set $.mymsg = replace($.mymsg,"deviceInboundInterface=", "\" , \"DeviceInboundInterface\":\"");
    set $.mymsg = replace($.mymsg,"deviceNtDomain=", "\" , \"DeviceNtDomain\":\"");
    set $.mymsg = replace($.mymsg,"deviceOutboundInterface=", "\" , \"DeviceOutboundInterface\":\"");
    set $.mymsg = replace($.mymsg,"devicePayloadId=", "\" , \"DevicePayloadId\":\"");
    set $.mymsg = replace($.mymsg,"deviceProcessName=", "\" , \"ProcessName\":\"");
    set $.mymsg = replace($.mymsg,"deviceTranslatedAddress=", "\" , \"DeviceTranslatedAddress\":\"");
    set $.mymsg = replace($.mymsg,"dhost=", "\" , \"DestinationHostName\":\"");
    set $.mymsg = replace($.mymsg,"dmac=", "\" , \"DestinationMacAddress\":\"");
    set $.mymsg = replace($.mymsg,"dntdom=", "\" , \"DestinationNTDomain\":\"");
    set $.mymsg = replace($.mymsg,"dpid=", "\" , \"DestinationProcessId\":\"");
    set $.mymsg = replace($.mymsg,"dpriv=", "\" , \"DestinationUserPrivileges\":\"");
    set $.mymsg = replace($.mymsg,"dproc=", "\" , \"DestinationProcessName\":\"");
    set $.mymsg = replace($.mymsg,"dpt=", "\" , \"DestinationPort\":\"");
    set $.mymsg = replace($.mymsg,"dst=", "\" , \"DestinationIP\":\"");
    set $.mymsg = replace($.mymsg,"dtz=", "\" , \"DeviceTimeZone\":\"");
    set $.mymsg = replace($.mymsg,"duid=", "\" , \"DestinationUserId\":\"");
    set $.mymsg = replace($.mymsg,"duser=", "\" , \"DestinationUserName\":\"");
    set $.mymsg = replace($.mymsg,"dvc=", "\" , \"DeviceAddress\":\"");
    set $.mymsg = replace($.mymsg,"dvchost=", "\" , \"DeviceName\":\"");
    set $.mymsg = replace($.mymsg,"dvcmac=", "\" , \"DeviceMacAddress\":\"");
    set $.mymsg = replace($.mymsg,"dvcpid=", "\" , \"Process ID\":\"");
    set $.mymsg = replace($.mymsg,"externalId=", "\" , \"ExternalID\":\"");
    set $.mymsg = replace($.mymsg,"fileCreateTime=", "\" , \"FileCreateTime\":\"");
    set $.mymsg = replace($.mymsg,"fileHash=", "\" , \"FileHash\":\"");
    set $.mymsg = replace($.mymsg,"fileId=", "\" , \"FileID\":\"");
    set $.mymsg = replace($.mymsg,"fileModificationTime=", "\" , \"FileModificationTime\":\"");
    set $.mymsg = replace($.mymsg,"filePath=", "\" , \"FilePath\":\"");
    set $.mymsg = replace($.mymsg,"filePermission=", "\" , \"FilePermission\":\"");
    set $.mymsg = replace($.mymsg,"fileType=", "\" , \"FileType\":\"");
    set $.mymsg = replace($.mymsg,"flexDate1=", "\" , \"FlexDate1\":\"");
    set $.mymsg = replace($.mymsg,"flexDate1Label=", "\" , \"FlexDate1Label\":\"");
    set $.mymsg = replace($.mymsg,"flexNumber1=", "\" , \"FlexNumber1\":\"");
    set $.mymsg = replace($.mymsg,"flexNumber1Label=", "\" , \"FlexNumber1Label\":\"");
    set $.mymsg = replace($.mymsg,"flexNumber2=", "\" , \"FlexNumber2\":\"");
    set $.mymsg = replace($.mymsg,"flexNumber2Label=", "\" , \"FlexNumber2Label\":\"");
    set $.mymsg = replace($.mymsg,"flexString1=", "\" , \"FlexString1\":\"");
    set $.mymsg = replace($.mymsg,"flexString1Label=", "\" , \"FlexString1Label\":\"");
    set $.mymsg = replace($.mymsg,"flexString2=", "\" , \"FlexString2\":\"");
    set $.mymsg = replace($.mymsg,"flexString2Label=", "\" , \"FlexString2Label\":\"");
    set $.mymsg = replace($.mymsg,"fname=", "\" , \"FileName\":\"");
    set $.mymsg = replace($.mymsg,"fsize=", "\" , \"FileSize\":\"");
    set $.mymsg = replace($.mymsg,"Host=", "\" , \"Computer\":\"");
    set $.mymsg = replace($.mymsg,"msg=", "\" , \"Message\":\"");
    set $.mymsg = replace($.mymsg,"oldFileCreateTime=", "\" , \"OldFileCreateTime\":\"");
    set $.mymsg = replace($.mymsg,"oldFileHash=", "\" , \"OldFileHash\":\"");
    set $.mymsg = replace($.mymsg,"oldFileId=", "\" , \"OldFileId\":\"");
    set $.mymsg = replace($.mymsg,"oldFileModificationTime=", "\" , \"OldFileModificationTime\":\"");
    set $.mymsg = replace($.mymsg,"oldFileName=", "\" , \"OldFileName\":\"");
    set $.mymsg = replace($.mymsg,"oldFilePath=", "\" , \"OldFilePath\":\"");
    set $.mymsg = replace($.mymsg,"oldFilePermission=", "\" , \"OldFilePermission\":\"");
    set $.mymsg = replace($.mymsg,"oldFileSize=", "\" , \"OldFileSize\":\"");
    set $.mymsg = replace($.mymsg,"oldFileType=", "\" , \"OldFileType\":\"");
    set $.mymsg = replace($.mymsg,"out=", "\" , \"SentBytes\":\"");
    set $.mymsg = replace($.mymsg,"outcome=", "\" , \"EventOutcome\":\"");
    set $.mymsg = replace($.mymsg,"proto=", "\" , \"Protocol\":\"");
    set $.mymsg = replace($.mymsg,"reason=", "\" , \"Reason\":\"");
    set $.mymsg = replace($.mymsg,"Request=", "\" , \"RequestURL\":\"");
    set $.mymsg = replace($.mymsg,"requestClientApplication=", "\" , \"RequestClientApplication\":\"");
    set $.mymsg = replace($.mymsg,"requestContext=", "\" , \"RequestContext\":\"");
    set $.mymsg = replace($.mymsg,"requestCookies=", "\" , \"RequestCookies\":\"");
    set $.mymsg = replace($.mymsg,"requestMethod=", "\" , \"RequestMethod\":\"");
    set $.mymsg = replace($.mymsg,"Severity=", "\" , \"LogSeverity\":\"");
    set $.mymsg = replace($.mymsg,"shost=", "\" , \"SourceHostName\":\"");
    set $.mymsg = replace($.mymsg,"smac=", "\" , \"SourceMacAddress\":\"");
    set $.mymsg = replace($.mymsg,"sntdom=", "\" , \"SourceNTDomain\":\"");
    set $.mymsg = replace($.mymsg,"sourceDnsDomain=", "\" , \"SourceDnsDomain\":\"");
    set $.mymsg = replace($.mymsg,"sourceServiceName=", "\" , \"SourceServiceName\":\"");
    set $.mymsg = replace($.mymsg,"sourceTranslatedAddress=", "\" , \"SourceTranslatedAddress\":\"");
    set $.mymsg = replace($.mymsg,"sourceTranslatedPort=", "\" , \"SourceTranslatedPort\":\"");
    # These short keys also match the tail of longer keys (sourceDnsDomain=, oldFileName=,
    # sourceServiceName=, sourceTranslatedPort=) so they must run after those keys
    set $.mymsg = replace($.mymsg,"in=", "\" , \"ReceivedBytes\":\"");
    set $.mymsg = replace($.mymsg,"Name=", "\" , \"Activity\":\"");
    set $.mymsg = replace($.mymsg,"rt=", "\" , \"ReceiptTime\":\"");
    set $.mymsg = replace($.mymsg,"spid=", "\" , \"SourceProcessId\":\"");
    set $.mymsg = replace($.mymsg,"spriv=", "\" , \"SourceUserPrivileges\":\"");
    set $.mymsg = replace($.mymsg,"sproc=", "\" , \"SourceProcessName\":\"");
    set $.mymsg = replace($.mymsg,"spt=", "\" , \"SourcePort\":\"");
    set $.mymsg = replace($.mymsg,"src=", "\" , \"SourceIP\":\"");
    set $.mymsg = replace($.mymsg,"suid=", "\" , \"SourceUserID\":\"");
    set $.mymsg = replace($.mymsg,"suser=", "\" , \"SourceUserName\":\"");
    set $.mymsg = replace($.mymsg,"type=", "\" , \"EventType\":\"");
   
 
    # Logic will need a closing quote mark & this will allow for a doubled double quote for removal
    # at the start of the string
 
    set $.mymsg = wrap( $.mymsg, "\"");
 
    
    # Clean up formatting at the start of the string
    set $.mymsg = replace($.mymsg,"\"\" , ","");
 
    # Remove the trailing whitespace left at the end of each value
    set $.mymsg = replace($.mymsg," \"","\"");
 
    # I can now filter by facility and severity.
    # Logs will be written in a format that the Azure Monitor Agent can accept.
    # Change the output file location below after 'file'.
 
    if ( ($syslogfacility-text == 'auth') and ($syslogseverity-text == 'alert') ) then {
      action(type="omfile" file="/var/log/myapplication-cef-messages.log" template="SentinelCEFFormat")
    }
  }
}
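
Before pointing a Data Collection Rule at the output file, it can be worth confirming that every line rsyslog wrote is a valid JSON object, since a vendor key missing from the replace list will leave a line malformed. The following Python sketch is a hypothetical checker and not part of the rsyslog configuration. It is shown against in-memory sample lines, with the file-based call left as a comment.

```python
import json


def check_lines(lines):
    """Return (line number, error text) for every line that is not a JSON object."""
    failures = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            value = json.loads(line)
        except json.JSONDecodeError as exc:
            failures.append((number, str(exc)))
            continue
        if not isinstance(value, dict):
            failures.append((number, "not a JSON object"))
    return failures


# Illustrative lines only; the second one is missing a comma.
sample_lines = [
    '{"DeviceVendor":"Security", "SourceIP":"10.0.0.1"}',
    '{"DeviceVendor":"Security" "SourceIP":"10.0.0.1"}',
    '',
]
failures = check_lines(sample_lines)
print(failures)

# To check the real output file instead:
# with open("/var/log/myapplication-cef-messages.log") as handle:
#     print(check_lines(handle))
```

Any line number reported here points at a message whose extension contained something the replace list did not anticipate, which is usually a custom vendor key.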


 

 

When an Azure Monitor Agent Data Collection Rule is created to monitor the custom CEF file, a transform needs to be added, replacing the original "source".

Note that I am sending my CEF logs to a dedicated custom table rather than the default CommonSecurityLog table used by Sentinel.  This will let me use the default table for cheaper, non-alertable storage while I can configure analytics rules against my filtered source.


The transform is below.

source | extend d=todynamic(RawData) | project TimeGenerated, EventTime_CF=todatetime(d.EventTime), Activity=tostring(d.Name), ApplicationProtocol=tostring(d.ApplicationProtocol), CommunicationDirection=tostring(d.CommunicationDirection), Computer=tostring(d.Computer), DestinationDnsDomain=tostring(d.DestinationDnsDomain), DestinationHostName=tostring(d.DestinationHostName), DestinationIP=tostring(d.DestinationIP), DestinationMacAddress=tostring(d.DestinationMacAddress), DestinationNTDomain=tostring(d.DestinationNTDomain), DestinationPort=toint(d.DestinationPort), DestinationProcessId=toint(d.DestinationProcessId), DestinationProcessName=tostring(d.DestinationProcessName), DestinationServiceName=tostring(d.DestinationServiceName), DestinationTranslatedAddress=tostring(d.DestinationTranslatedAddress), DestinationTranslatedPort=toint(d.DestinationTranslatedPort), DestinationUserId=tostring(d.DestinationUserId), DestinationUserName=tostring(d.DestinationUserName), DestinationUserPrivileges=tostring(d.DestinationUserPrivileges), DeviceAction=tostring(d.DeviceAction), DeviceAddress=tostring(d.DeviceAddress), DeviceCustomDate1=tostring(d.DeviceCustomDate1), DeviceCustomDate1Label=tostring(d.DeviceCustomDate1Label), DeviceCustomDate2=tostring(d.DeviceCustomDate2), DeviceCustomDate2Label=tostring(d.DeviceCustomDate2Label), DeviceCustomFloatingPoint1=todouble(d.DeviceCustomFloatingPoint1), DeviceCustomFloatingPoint1Label=tostring(d.deviceCustomFloatingPoint1Label), DeviceCustomFloatingPoint2=todouble(d.DeviceCustomFloatingPoint2), DeviceCustomFloatingPoint2Label=tostring(d.deviceCustomFloatingPoint2Label), DeviceCustomFloatingPoint3=todouble(d.DeviceCustomFloatingPoint3), DeviceCustomFloatingPoint3Label=tostring(d.deviceCustomFloatingPoint3Label), DeviceCustomFloatingPoint4=todouble(d.DeviceCustomFloatingPoint4), DeviceCustomFloatingPoint4Label=tostring(d.deviceCustomFloatingPoint4Label), DeviceCustomIPv6Address1=tostring(d.DeviceCustomIPv6Address1),
DeviceCustomIPv6Address1Label=tostring(d.DeviceCustomIPv6Address1Label), DeviceCustomIPv6Address2=tostring(d.DeviceCustomIPv6Address2), DeviceCustomIPv6Address2Label=tostring(d.DeviceCustomIPv6Address2Label), DeviceCustomIPv6Address3=tostring(d.DeviceCustomIPv6Address3), DeviceCustomIPv6Address3Label=tostring(d.DeviceCustomIPv6Address3Label), DeviceCustomIPv6Address4=tostring(d.DeviceCustomIPv6Address4), DeviceCustomIPv6Address4Label=tostring(d.DeviceCustomIPv6Address4Label), DeviceCustomNumber1=toint(d.DeviceCustomNumber1), DeviceCustomNumber1Label=tostring(d.DeviceCustomNumber1Label), DeviceCustomNumber2=toint(d.DeviceCustomNumber2), DeviceCustomNumber2Label=tostring(d.DeviceCustomNumber2Label), DeviceCustomNumber3=toint(d.DeviceCustomNumber3), DeviceCustomNumber3Label=tostring(d.DeviceCustomNumber3Label), DeviceEventCategory=tostring(d.DeviceEventCategory), DeviceCustomString1=tostring(d.DeviceCustomString1), DeviceCustomString1Label=tostring(d.DeviceCustomString1Label), DeviceCustomString2=tostring(d.DeviceCustomString2), DeviceCustomString2Label=tostring(d.DeviceCustomString2Label), DeviceCustomString3=tostring(d.DeviceCustomString3), DeviceCustomString3Label=tostring(d.DeviceCustomString3Label), DeviceCustomString4=tostring(d.DeviceCustomString4), DeviceCustomString4Label=tostring(d.DeviceCustomString4Label), DeviceCustomString5=tostring(d.DeviceCustomString5), DeviceCustomString5Label=tostring(d.DeviceCustomString5Label), DeviceCustomString6=tostring(d.DeviceCustomString6), DeviceCustomString6Label=tostring(d.DeviceCustomString6Label), DeviceDnsDomain=tostring(d.DeviceDnsDomain), DeviceEventClassID=tostring(d.DeviceEventClassID), DeviceExternalID=tostring(d.deviceExternalId), DeviceFacility=tostring(d.DeviceFacility), DeviceInboundInterface=tostring(d.DeviceInboundInterface), DeviceMacAddress=tostring(d.DeviceMacAddress), DeviceName=tostring(d.DeviceName), DeviceNtDomain=tostring(d.DeviceNtDomain), DeviceOutboundInterface=tostring(d.DeviceOutboundInterface), DevicePayloadId=tostring(d.DevicePayloadId),
DeviceProduct=tostring(d.DeviceProduct), DeviceTimeZone=tostring(d.DeviceTimeZone), DeviceTranslatedAddress=tostring(d.DeviceTranslatedAddress), DeviceVendor=tostring(d.DeviceVendor), DeviceVersion=tostring(d.DeviceVersion), EventCount=toint(d.EventCount), EventOutcome=tostring(d.EventOutcome), EventType=toint(d.EventType), ExternalID=toint(d.ExternalID), FileCreateTime=tostring(d.FileCreateTime), FileHash=tostring(d.FileHash), FileID=tostring(d.FileID), FileModificationTime=tostring(d.FileModificationTime), FileName=tostring(d.FileName), FilePath=tostring(d.FilePath), FilePermission=tostring(d.FilePermission), FileSize=toint(d.FileSize), FileType=tostring(d.FileType), FlexDate1=tostring(d.FlexDate1), FlexDate1Label=tostring(d.FlexDate1Label), FlexNumber1=toint(d.FlexNumber1), FlexNumber1Label=tostring(d.FlexNumber1Label), FlexNumber2=toint(d.FlexNumber2), FlexNumber2Label=tostring(d.FlexNumber2Label), FlexString1=tostring(d.FlexString1), FlexString1Label=tostring(d.FlexString1Label), FlexString2=tostring(d.FlexString2), FlexString2Label=tostring(d.FlexString2Label), LogSeverity=tostring(d.Severity), Message=tostring(d.Message), OldFileCreateTime=tostring(d.OldFileCreateTime), OldFileHash=tostring(d.OldFileHash), OldFileId=tostring(d.OldFileId), OldFileModificationTime=tostring(d.OldFileModificationTime), OldFileName=tostring(d.OldFileName), OldFilePath=tostring(d.OldFilePath), OldFilePermission=tostring(d.OldFilePermission), OldFileSize=toint(d.OldFileSize), OldFileType=tostring(d.OldFileType), ["Process ID"]=tostring(d.ProcessID), ProcessName=tostring(d.ProcessName), Protocol=tostring(d.Protocol), Reason=tostring(d.Reason), ReceivedBytes=tolong(d.ReceivedBytes), ReceiptTime=tostring(d.ReceiptTime), RequestClientApplication=tostring(d.RequestClientApplication), RequestContext=tostring(d.RequestContext), RequestCookies=tostring(d.RequestCookies), RequestMethod=tostring(d.RequestMethod), RequestURL=tostring(d.RequestURL), SentBytes=tolong(d.SentBytes),
SourceDnsDomain=tostring(d.SourceDnsDomain), SourceHostName=tostring(d.SourceHostName), SourceIP=tostring(d.SourceIP), SourceMacAddress=tostring(d.SourceMacAddress), SourceNTDomain=tostring(d.SourceNTDomain), SourcePort=toint(d.SourcePort), SourceProcessId=toint(d.SourceProcessId), SourceProcessName=tostring(d.SourceProcessName), SourceServiceName=tostring(d.SourceServiceName), SourceTranslatedAddress=tostring(d.SourceTranslatedAddress), SourceTranslatedPort=toint(d.SourceTranslatedPort), SourceUserID=tostring(d.SourceUserID), SourceUserName=tostring(d.SourceUserName), SourceUserPrivileges=tostring(d.SourceUserPrivileges)
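
A transform of this length is easy to mistype, and because KQL property names are case sensitive, a key that doesn't exactly match the JSON written by rsyslog simply comes back empty. One option is to generate the projection from a single list of columns. The Python sketch below is a hypothetical helper showing the idea with three columns. The column list and function name are my own, and the JSON keys assume the outname values used in the SentinelCEFFormat template.

```python
# (output column, JSON key written by rsyslog, KQL conversion function)
COLUMNS = [
    ("EventTime_CF", "EventTime", "todatetime"),
    ("SourceIP", "SourceIP", "tostring"),
    ("SourcePort", "SourcePort", "toint"),
]


def build_transform(columns):
    """Build a transform string from a list of column definitions."""
    projections = ["TimeGenerated"]
    for column, key, convert in columns:
        projections.append(f"{column}={convert}(d.{key})")
    return ("source | extend d=todynamic(RawData) | project "
            + ", ".join(projections))


transform = build_transform(COLUMNS)
print(transform)
```

Extending COLUMNS to the full property list keeps the rsyslog key names and the transform in one place, so a renamed key only has to be changed once.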