ELK + Palo Alto Networks Part 2 (URL and Custom Logs)

Part 1: https://anderikistan.com/2016/03/26/elk-palo-alto-networks/

To recap part 1 we did the following:

  1. Set up syslog-ng to read in logs from a Palo Alto Networks firewall
  2. Set up some syslog profiles on our firewall and forwarded traffic logs
  3. Installed ELK stack and gained the ability to visualize our logs

In our ELK instance we have traffic logs and are able to get a lot of great information including the very cool GeoHash stuff. Now we want to take a better look at web traffic and help identify risky users or see what kind of stuff people are doing in our environment. For this tutorial we are going to do the following:

  1. Set up custom threat logs on our Palo Alto Networks firewall
  2. Ensure syslog-ng is working properly with our new syslog feed
  3. Update the Logstash configuration file
  4. Tweak our Elasticsearch index mapping
  5. Build out some visualizations

This tutorial assumes you have gotten through the previous tutorial, as we will reference files and techniques discussed there. Let’s get started.

Custom Threat Logs with Palo Alto Networks

Before we do anything let’s create a new syslog object on our firewall. In case you forgot go to Device Tab > Server Profiles > Syslog. Click (+ Add) in the bottom left and use the following options:

  • Name: URLSyslog
  • Syslog Server: The same as your traffic syslog
  • Transport: UDP
  • Port: 514
  • Format: BSD
  • Facility: LOG_LOCAL1

So hopefully you picked up on the threat part of Custom Threat Logs. Palo Alto Networks doesn’t ship URLs in syslogs by default so we have to build our own custom threat log type in order to get the fields we want. To create the custom syslog do the following:

In the popup box click on the box called “Custom Log Format”.

Here you will be presented with 5 different Log Type options. We will select threat and we will be prompted with another popup box.



The beauty of these custom logs is we can refine them down to only what we are able to ingest and index in our ELK deployment. Here we are going to grab the source IP, destination IP, app-id, category, and the URL itself. The PAN firewalls use the $misc field as the URL field. For more detail check out this article:


Now add your newly created URL syslog object to the Log Forwarding object created in the previous tutorial:


Don’t forget to commit the firewall configuration. With this done, let’s move over to our ELK server and get these logs coming in.

Tinker with syslog-ng

To simplify having multiple syslog streams let’s use syslog facilities. Syslog facilities allow syslog-ng to differentiate between what type of logs it is dealing with. We actually did all this work in the previous lesson but maybe a quick review will help. Let’s look at the config file.

sudo vi /etc/syslog-ng/syslog-ng.conf

Below is the “meat” of the config:

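source s_netsyslog {
        # 0.0.0.0 (listen on all interfaces) is an assumption; the address was missing here
        udp(ip(0.0.0.0) port(514) flags(no-hostname));
        tcp(ip(0.0.0.0) port(514) flags(no-hostname));
};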

destination d_netsyslog { file("/var/log/network.log" owner("root") group("root") perm(0644)); };

destination d_urlsyslog { file("/var/log/urllogs.log" owner("root") group("root") perm(0644)); };

log { source(s_netsyslog); filter(f_traffic); destination(d_netsyslog); };

log { source(s_netsyslog); filter(f_threat); destination(d_urlsyslog); };

filter f_traffic { facility(local0); };
filter f_threat { facility(local1); };

We have one “source” which is a listener on port 514, the default port for syslog. Next we define a couple of options for destinations. Look near the bottom and you can see we define some filters based on facility. That’s why proper facility configuration on the firewall is so important. Lastly in the middle you can see the log function which defines the source, filter, and destination. With all this set up and a restart of the syslog-ng service:

sudo service syslog-ng restart

We should now start seeing traffic:

sudo tail -f /var/log/urllogs.log

Your logs should look something like this:

2016-04-24T21:59:27-05:00 WOPR ssl business-and-economy "e.crashlytics.com/"
2016-04-24T21:59:28-05:00 WOPR ms-onedrive-base online-personal-storage "bn1304.storage.live.com/"
2016-04-24T21:59:28-05:00 WOPR ms-onedrive-base online-personal-storage "bn1304.storage.live.com/"

Congrats. Your server is now reading in our custom URL syslog feed.
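
If nothing shows up in the file, it can help to take the firewall out of the picture and send a test message yourself. The facility rides along in the syslog priority header, which is computed as facility * 8 + severity, with local0 being 16 and local1 being 17. Below is a minimal Python sketch of that idea. The function names, the "info" severity, and the message body are all made up for illustration, and it assumes the syslog-ng listener from above is reachable on UDP 514.

```python
import socket

# Standard syslog facility codes: local0 = 16, local1 = 17
FACILITIES = {"local0": 16, "local1": 17}
SEVERITY_INFO = 6  # hypothetical choice; any severity 0-7 works


def build_pri(facility: str, severity: int = SEVERITY_INFO) -> int:
    """Return the syslog PRI value: facility * 8 + severity."""
    return FACILITIES[facility] * 8 + severity


def build_message(facility: str, body: str) -> bytes:
    """Prefix the body with the <PRI> header that carries the facility."""
    return f"<{build_pri(facility)}>{body}".encode("utf-8")


def send_test_message(host: str, facility: str, body: str, port: int = 514) -> None:
    """Send one UDP datagram to the syslog listener (host is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_message(facility, body), (host, port))


if __name__ == "__main__":
    # Only builds and prints the message; call send_test_message() to actually send it.
    print(build_message("local1", "test message for the url feed"))
```

With the filters shown earlier, a message built with "local1" should match f_threat and a message built with "local0" should match f_traffic.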

Update Logstash Configuration

So we have configured Logstash once… how about we just repeat our steps? Well, that could work except there’s one big difference between our custom syslog and the default traffic syslog. Can you spot it?


2016-04-24T21:59:27-05:00 WOPR ssl business-and-economy "e.crashlytics.com/"
2016-04-24T21:59:28-05:00 WOPR ms-onedrive-base online-personal-storage "bn1304.storage.live.com/"
2016-04-24T21:59:28-05:00 WOPR ms-onedrive-base online-personal-storage "bn1304.storage.live.com/"


2016-04-24T15:09:09-05:00 WOPR 1,2016/04/24 15:09:08,001606001622,TRAFFIC,drop,1,2016/04/24 15:09:08,,,,,Deny Outbound,,,not-applicable,vsys1,Wired,Wireless,ethernet1/4,,Fowarder,2016/04/24 15:09:08,0,1,56843,53,0,0,0x0,udp,deny,68,68,0,1,2016/04/24 15:09:08,0,any,0,6725833,0x0,,,0,1,0,policy-deny,0,0,0,0,,WOPR,from-policy
2016-04-24T15:09:12-05:00 WOPR 1,2016/04/24 15:09:11,001606001622,TRAFFIC,drop,1,2016/04/24 15:09:11,,,,,Deny Outbound,,,not-applicable,vsys1,Wired,Wireless,ethernet1/4,,Fowarder,2016/04/24 15:09:11,0,1,56843,53,0,0,0x0,udp,deny,68,68,0,1,2016/04/24 15:09:12,0,any,0,6725834,0x0,,,0,1,0,policy-deny,0,0,0,0,,WOPR,from-policy

If you said the difference is commas then you would be correct. The problem is that without commas, the csv filter we used last time:

csv {
      source => "raw_message"
      columns => [ "PaloAltoDomain","ReceiveTime","SerialNum","Type","Threat-ContentType","ConfigVersion","GenerateTime","SourceAddress","DestinationAddress","NATSourceIP","NATDestinationIP","Rule","SourceUser","DestinationUser","Application","VirtualSystem","SourceZone","DestinationZone","InboundInterface","OutboundInterface","LogAction","TimeLogged","SessionID","RepeatCount","SourcePort","DestinationPort","NATSourcePort","NATDestinationPort","Flags","IPProtocol","Action","Bytes","BytesSent","BytesReceived","Packets","StartTime","ElapsedTimeInSec","Category","Padding","seqno","actionflags","SourceCountry","DestinationCountry","cpadding","pkts_sent","pkts_received" ]
}

… it won’t work. We will have to do something else, but first let’s cover how the config file will work. Logstash is very cool in how it allows you to make multiple config files which certainly helps with readability. The problem is that logstash, upon startup, builds its own configuration file where it merges all of your configuration files together. So for the most predictable results I have found it’s easiest to just go ahead and build one large config file.

As we start adding more inputs in we already run into our first problem which is how to make sure we can differentiate between logs. In this scenario we will assign “types” to our input. Example is below:

input {
  file {
        path => ["/var/log/network.log"]
        sincedb_path => "/var/log/logstash/sincedb"
        start_position => "beginning"
        type => "traffic"
  }

  file {
        path => ["/var/log/urllogs.log"]
        sincedb_path => "/var/log/logstash/urlsincedb"
        start_position => "beginning"
        type => "url"
  }
}

Here we have created two separate file methods and assigned each a “type”, traffic and url, which can be used throughout our config file so we can perform certain actions on certain syslog data. We will want to treat traffic logs differently than url logs and this allows us to do that. Additionally we defined a new sincedb instance.

As we go down our config file we will repeatedly use the “if” conditional:

if [type] == "url" {...}

We have “typed” our input and can refer to said type throughout our configuration. Let’s look at how the output section looks with both traffic and url.

output {
  if [type] == "traffic" {
    elasticsearch {
      index => "pan-traffic"
      hosts => ["localhost:9200"]
      template => "/opt/logstash/elasticsearch-template.json"
      template_overwrite => true
    }
  }

  if [type] == "url" {
    elasticsearch {
      index => "pan-url"
      hosts => ["localhost:9200"]
    }
  }
} #end output block

So within the one output method we have handled both our traffic and url syslog traffic that we are expecting. So we have gone over input and output but what about filter {}?

filter {
  if [type] == "url" {
    grok {
      # parses the custom URL syslog line into named fields
      #patterns_dir => "/opt/logstash/patterns"
      match => { "message" => '%{TIMESTAMP_ISO8601} %{IPV4:firewallIP} %{HOSTNAME:firewall} %{IPV4:sourceIP} %{IPV4:destinationIP} %{NOTSPACE:application} %{NOTSPACE:category} "%{URIHOST:URIHost}%{URIPATH:URIPath}"'  }
    }
  }
}

What is all this match business? Remember how our custom syslog does not have commas in it? This prevents us from using the very simple csv trick. We have to ramp up our effort a notch in order to get this working right. Enter the grok.

What’s the Deal with Groks?

Grok is a way to take unstructured data and parse it into something indexable and thus searchable. There’s a lot to this concept and grok is an incredibly powerful tool! For the sake of this tutorial we are going to keep it pretty high level. The goal is to teach Logstash how to parse out the good stuff by helping it identify patterns or have some sense of what to look for. There are probably dozens of ways to go about building your grok filter but the way that worked for me was to use the Grok Constructor: http://grokconstructor.appspot.com/do/construction.

To use the constructor start out with a line from your syslog file that you intend to match. For this example I chose the following:

2016-04-24T21:59:27-05:00 WOPR ssl business-and-economy "e.crashlytics.com/"

Copy and paste the log line into the text box and hit continue.


Once you hit Go! you will be presented with a bunch of different grok options and this is where the fun starts. For the first match we know we are looking at the syslog date time so the match will be something like:


Logstash is going to handle all the date/time stuff for us since the timestamp is determined to be compliant with ISO8601. Once we select TIMESTAMP_ISO8601 we will scroll back to the top of the page and hit Continue. When the page refreshes you will see the constructed regex for the grok match filter and below how much of the syslog line we have matched and what is remaining to be matched.
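
To see why an ISO8601 timestamp is so convenient: the UTC offset is part of the string itself, so any parser can normalize it without being told a timezone. Here is a small Python sketch of that using only the standard library. The function name is made up for illustration, and this is not what Logstash runs internally, just the same idea.

```python
from datetime import datetime, timezone


def to_utc(stamp: str) -> datetime:
    """Parse an ISO8601 timestamp that carries a UTC offset and normalize it to UTC."""
    return datetime.fromisoformat(stamp).astimezone(timezone.utc)


if __name__ == "__main__":
    # Timestamp taken from the sample syslog line used in this tutorial
    print(to_utc("2016-04-24T21:59:27-05:00").isoformat())
```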


At this point it’s a game of iteration.


We will actually define the “spaces” in the syslog as well. This is helpful for the grok constructor tool but as you will see in a moment we will be ditching that space business and keeping the good stuff. When all the matching is done in the tool you should see something like this:


We are almost there. There is a slight manual transformation that we need to make. First, we need to strip the explicit space patterns out of the match filter and leave plain spaces between the matches. Second, we need to give these matches names. We assign names by adding a : then a variable name. So %{IPV4} becomes %{IPV4:sourceIP}.



%{TIMESTAMP_ISO8601} %{IPV4:firewallIP} %{HOSTNAME:firewall} %{IPV4:sourceIP} %{IPV4:destinationIP} %{NOTSPACE:application} %{NOTSPACE:category} "%{URIHOST:URIHost}%{URIPATH:URIPath}"
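
If grok still feels like magic, it may help to know that each %{PATTERN:name} is just a named regular expression capture. The Python sketch below approximates the match above with plain regex named groups. The sub-patterns are simplified stand-ins for the real grok patterns, and the sample line uses made-up documentation IP addresses (192.0.2.x and 198.51.100.x) for the firewall, source, and destination.

```python
import re
from typing import Dict, Optional

# Simplified stand-in for the grok IPV4 pattern (does not validate octet ranges)
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# Rough regex equivalent of the grok match, one named group per grok field
URL_LOG = re.compile(
    r"(?P<timestamp>\S+) "
    rf"(?P<firewallIP>{IPV4}) "
    r"(?P<firewall>\S+) "
    rf"(?P<sourceIP>{IPV4}) "
    rf"(?P<destinationIP>{IPV4}) "
    r"(?P<application>\S+) "
    r"(?P<category>\S+) "
    r'"(?P<URIHost>[^/"\s]+)(?P<URIPath>/[^"]*)"'
)


def parse_url_log(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields for a URL syslog line, or None if it does not match."""
    match = URL_LOG.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical sample line; the IP addresses are illustrative only
    sample = ('2016-04-24T21:59:27-05:00 192.0.2.1 WOPR 192.0.2.50 198.51.100.7 '
              'ssl business-and-economy "e.crashlytics.com/"')
    print(parse_url_log(sample))
```

A line that does not fit the pattern simply returns None here, which is the regex cousin of a grok parse failure.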

In total what we add to the filter is:

  if [type] == "url" {
    grok {
      # parses the custom URL syslog line into named fields
      #patterns_dir => "/opt/logstash/patterns"
      match => { "message" => '%{TIMESTAMP_ISO8601} %{IPV4:firewallIP} %{HOSTNAME:firewall} %{IPV4:sourceIP} %{IPV4:destinationIP} %{NOTSPACE:application} %{NOTSPACE:category} "%{URIHOST:URIHost}%{URIPATH:URIPath}"'  }
    }

    date {
      timezone => "America/Chicago"
      match => [ "GenerateTime", "YYYY/MM/dd HH:mm:ss" ]
    }
    mutate {
        remove_field => ["message"]
    }
  }

A complete copy of the pan-traffic.conf file that you will need can be found at: https://onedrive.live.com/redir?resid=6595236A4C9AD71B!32560&authkey=!AHYSb4DufO-2aYk&ithint=folder%2cconf

Update Elasticsearch Index Mapping

Same exercise as before… We need to add "index":"not_analyzed" to the category and application fields. The reason we add this to the mapping is so Elasticsearch won’t split strings on the “-” character. Feel free to try it out without updating the mapping and you will see what I’m talking about. To update the mapping perform the following steps:

sudo service logstash stop
curl -X DELETE 'http://localhost:9200/pan-url'
You should get back either {"acknowledged":true} or a 404 error. Both typically mean you're good to go. With these two items done we will push the new mapping in.
curl -g -X PUT 'http://localhost:9200/pan-url' -d '{"mappings":{"url":{"properties":{"@timestamp":{"type":"date","format":"strict_date_optional_time||epoch_millis"},"@version":{"type":"string"},"URIHost":{"type":"string"},"URIPath":{"type":"string"},"application":{"type":"string","index":"not_analyzed"},"category":{"type":"string","index":"not_analyzed"},"destinationIP":{"type":"string"},"firewall":{"type":"string"},"firewallIP":{"type":"string"},"host":{"type":"string"},"path":{"type":"string"},"port":{"type":"string"},"sourceIP":{"type":"string"},"tags":{"type":"string"},"type":{"type":"string"}}}}}'
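
A one-line JSON body like this is easy to mistype, so here is an optional Python sketch that builds the same mapping from a list of field names and prints it as JSON. The helper name and the constants are made up for illustration; the field names and settings are the ones from the curl command in this section.

```python
import json

# Fields that must be indexed as whole strings rather than tokenized
NOT_ANALYZED = {"application", "category"}

# Every string field in the pan-url mapping
STRING_FIELDS = [
    "@version", "URIHost", "URIPath", "application", "category",
    "destinationIP", "firewall", "firewallIP", "host", "path",
    "port", "sourceIP", "tags", "type",
]


def build_mapping() -> dict:
    """Build the request body for the pan-url index mapping."""
    properties = {
        "@timestamp": {
            "type": "date",
            "format": "strict_date_optional_time||epoch_millis",
        }
    }
    for name in STRING_FIELDS:
        field = {"type": "string"}
        if name in NOT_ANALYZED:
            field["index"] = "not_analyzed"
        properties[name] = field
    return {"mappings": {"url": {"properties": properties}}}


if __name__ == "__main__":
    # The printed JSON can be passed to curl with -d
    print(json.dumps(build_mapping()))
```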

Assuming no errors, start Logstash back up.

sudo service logstash start
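
For anyone curious why the not_analyzed setting matters for category and application, here is a rough Python illustration. It is only an approximation of what an analyzed string field does (lowercase, then split on non-word characters) and not the actual Elasticsearch analyzer. The function names are made up for this example.

```python
import re
from typing import List


def analyzed_tokens(value: str) -> List[str]:
    """Rough stand-in for an analyzed string: lowercase and split on non-word characters."""
    return re.findall(r"\w+", value.lower())


def not_analyzed_tokens(value: str) -> List[str]:
    """A not_analyzed string is indexed as one single, unmodified term."""
    return [value]


if __name__ == "__main__":
    category = "online-personal-storage"
    print(analyzed_tokens(category))
    print(not_analyzed_tokens(category))
```

With the analyzed version, a terms visualization would count "online", "personal", and "storage" as separate buckets instead of one category, which is exactly what we want to avoid.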

Wrap Up

With the traffic logs and url logs it’s possible to make some really cool dashboards. I hope these have helped. If you have any questions shoot me an email at anderikistan@gmail.com.
You can find me on LinkedIn at: https://www.linkedin.com/in/iankanderson
or on Twitter @Anderikistan


Implications of Modern Malware


The owners of the ransomware noticed similar machines reporting in, and after basic footprinting realized the victim was a hospital. Knowing the hospital would face regulatory fines from losing patient information, the ransom was raised.

Randy is dead on with his assessment here. The next generation of ransomware will gain situational awareness about the environment it is in. Attackers know that some organizations are far more likely than most to just pay the ransom, and that they are willing to spend more to restore their data.


Trump Hotels Breached Again


Banking industry sources tell KrebsOnSecurity that the Trump Hotel Collection — a string of luxury properties tied to business magnate and Republican presidential candidate Donald Trump — appears to be dealing with another breach of its credit card systems. If confirmed, this would be the second such breach at the Trump properties in less than a year.

Maybe it’s his understanding of cyber security that’s obsolete.


Critical Security Controls – Version 6 [UPDATED]



In October of 2015 the Center for Internet Security (CIS) released version 6.0 of the Critical Security Control Framework. The SANS Institute defines this framework as “A recommended set of actions for cyber defense that provide specific and actionable ways to stop today’s most pervasive and dangerous attacks. A principal benefit of the Controls is that they prioritize and focus a smaller number of actions with high pay-off results.” Many organizations throughout the world deploy the Critical Security Control Framework as a way to best manage available resources, guide future projects, or even as a means to get rid of existing systems that pose a security risk or underperform for the intended function. Given the organizations backing this framework, SANS and CIS, an update to the framework is a major deal.

For anyone wanting to get started with the Critical Security Control Framework version 6.0 check out the CSC Starter Kit.

Version 6 of the Critical Security Control Framework does some really great things in terms of helping organizations move forward with what next generation networks will look like as security models get baked into the core of redesigns. Additionally Version 6 dropped a control that never felt like it did much and replaced it with a control that was sorely missing and vaguely wrapped up in other more generic controls. Version 6 is not without its problems however as a major function, categorization of controls, was inexplicably removed which we will get to in a bit. Overall the new version is really good. It feels more streamlined and removes some of the fluffy sub controls that never really felt good, and with that let’s get to the good:

The Good

Reordering the controls

Understanding how to read the CSC helps one understand the simplicity of their approach. The higher up on the control list a control is the more important it seems to be. Prioritization is one of the five critical tenets of the CSC and it’s clear that CIS continues to invest in their approach. So how have things changed between version 5 and version 6?

  • CSC 5: Controlled Use of Administrative Privileges (Up from Control 12)
  • CSC 6: Maintenance, Monitoring, and Analysis of Audit Logs (Up from Control 14)
  • CSC 7: Email and Web Browser Protections (New in Version 6.0)
  • CSC 8: Malware Defenses (Down from Control 5)
  • CSC 9: Limitation and Control of Network Ports, Protocols, and Services (Up from Control 11)

There is more reordering in Version 6 but by now you should get the point. The folks behind the Critical Security Controls aren’t afraid of reassessing the situation and making adjustments based on the most important critical tenet of them all… Offense Informs Defense. If you look at the attacks going on across the world it’s clear that some of the adjustments here are designed to meet the new trends. For instance, Wireless Access Control, which was at number 7 in version 5.0, has been dropped to number 15 in version 6.0. The most likely attack vector is not going to be through a compromise of a wireless access system (I’m saying unlikely, not impossible). Likewise Controlled Use of Administrative Privileges was promoted from 12 to 5 as attackers are most likely to try and take an administrative account to gain persistence.

Probably the most important takeaway from these changes is the power of the 20 controls. When you remain smaller and more focused you can change with much greater speed. This is something that NIST 800-53 can’t do the way the Critical Security Controls can. These changes are not drastic either, which allows previous CSC deployments to have a migration plan over to the newer version. As with software platforms, handling upgrade paths is a tricky thing, and version 6.0 of the framework appears to have considered that prior to publishing. Users of this framework should appreciate how the developers of the framework clearly consider how this framework has been and will be used.

So long Secure Engineering

In version 5.0 there was this lingering oddity… “CSC 19: Secure Network Engineering”. The previous security control had four sub-controls with the Quick Win sub-control being, “Design the network using a minimum of a three-tier architecture (DMZ, middleware, and private network).” This by itself seems to try to enforce an architecture in the environment and leaves questions like… how does the cloud work with this secure engineering? Anyway this control never really felt like it fit in because all the sub-controls were broad and unfocused, a clear deviation from the rest of the framework. Hardly any of the sub-controls were directly actionable either. Secure Engineering is a great piece of guidance for organizations building out a new network or going through a redesign initiative, but for an operational framework designed to gauge maturity and ability to prevent/respond to cyber attacks it just doesn’t fit. Another good win by the Critical Security Control Framework for understanding this and removing it.

Hello Email and Web Browser Protections

Browsers and Email systems provide direct access to the wild west of the Internet and until now the framework kind of ignored them or at best treated these pieces of software as anything else. The reality is that the browser and email are tools used by nearly every single individual in the world and thus attract a lot of attention from attackers. In version 5.0 there is a control CSC 6: Application Software Security. Until version 6.0 this was where we sought guidance for all software security standards. If you go look at the control you’ll see that outside of ensuring that the software is supported by its vendor there is really nothing for vended products, and unless you’re Google or Microsoft the chance is that your organization is not using a custom developed browser or email system. So there was clearly a gap. Enter version 6.0.

While not perfect the new CSC 7: Email and Web Browser Protections at least begins to address things such as browser and email system standardization, the use of plug-ins, and the allowance of various scripting languages on the browser itself. Additionally there is now more focus on email systems which in version 5.0 was relegated to a single blip in the CSC 13: Boundary Defense (version 5.0) calling for the use of SPF. One cool side effect of this work was that the old Application Software Security control could be moved down to control 18 from control 6 because we are meeting a lot of the threat head on with this new control.



The most glaring omission between version 5.0 and version 6.0 is that the official guidance does not appear to provide metric guidance. Why is this a big deal? Well one of the five central tenets of the framework is to provide metrics on effectiveness and automation of controls deployed in an organization. This reporting piece is what makes CSC such a valuable tool to cyber security managers. I am currently working on a document to add back in the effectiveness and automation metrics for the new version and hope to have those out soon.


So I spoke with the fine people at the Center for Internet Security and they kindly pointed out that I was the one with the oversight. There is an additional document, that I completely missed, called “The Measurement Companion” and it’s a take on version 5’s metrics and so much more. If you review the measurement companion document it not only gives you the testing methods but also some general levels that can help the security manager assess their risk thresholds. The reason for the separation was that the metrics guidance can and should be updated faster than the controls themselves. Decoupling this guidance from the master document makes these updates easier to push out. This is the long term support stuff that makes CSC a great framework. The measurement companion can be found in the starter kit link above.

Where did My Categories Go?

In version 5.0 there were these things called Categories. There were four types:

  • Quick Win – biggest bang for your buck
  • Configuration / Hygiene – reduce number and severity of vulnerabilities
  • Visibility / Attribution – monitor and identify security issues
  • Advanced – Complex actions that provide advanced security capabilities

In version 6.0 they have replaced Categories with something called “Family”. There are three families:

  • System
  • Application
  • Network

The benefit of the Categories was that you could show organizational leadership what types of functions the expenditures were going towards or show weaknesses / strengths in an environment. Families are nice as they help push security responsibilities to teams within the organization but at the same time why remove Categories? Families and categories can co-exist without issue especially when there is little guidance on the use of families.

So why not? Well first off there’s the issue with “Quick Wins”. A great concept that was also a bit misleading. Given the budget and the personnel, deploying the quick wins does provide you the biggest and best leap forward in how you secure your environment, but calling things like Application Whitelisting a “Quick Win” is a bit troubling. Application Whitelisting is something that only really dedicated or advanced organizations are capable of, and that dovetails into the rest of the categories… Not all environments are the same and so planning for the controls based on categories potentially led to an over investment in areas such as detection. I wonder if we were using categories the wrong way, as a metric rather than a classification. Either way I have a feeling that additional guidance is on the way from the Center for Internet Security as they understand that there is still work to be done in the prioritization department.


The Critical Security Controls remains the easiest framework for an organization to implement and communicate throughout the organization. The adjustments made to the ordering and removal of non-effective controls show how serious the Center for Internet Security is regarding their charge of guidance to organizations on best practices for a security program. While the loss of categories and the omission of metrics guidance do pose a problem for organizations deploying this tool it is certainly not a show stopper. Hopefully the Center for Internet Security will re-integrate this data in a minor version update to the framework.

Organizations are more effective when there is a clear strategy with data backing the use of the various resources available to a security program and the Critical Security Control Framework Version 6.0 is an incredibly valuable tool available for free to anyone willing to give it a try.

So my organization is just a user of this framework and certainly not the experts. I highly recommend anyone interested in learning more go directly to the source at: https://www.cisecurity.org/critical-controls/. I found the email address ControlsInfo [@] cisecurity.org to be super helpful as well.
