Finding the needles in the CloudWatch haystack

Fields, filters and recipes to find what you need from discrimiNAT’s config and flow logs in AWS

Fields

flow logs

Flow Logs can be found in CloudWatch under the discrimiNAT log group, in the flow log stream. A typical flow log looks like:

Here’s a summary of the possible fields:

dhost: destination hostname/FQDN
cat: packet origin – client or server
outcome: allowed or disallowed
src: source IP address
spt: source port
proto: tls, ssh or unknown protocol
proto_v: version of the identified protocol
dst: destination IP address
dpt: destination port
reason: reason for the outcome – such as the Security Group in which a matching protocol rule was found, or the protocol anomaly that led to the connection being disallowed

If a Security Group with a see-thru rule in it is found attached to the client application, the following fields will also be present:

see_thru_days_remaining: number of days for which the see-thru rule will remain non-blocking, as defined in the see-thru rule itself
see_thru_gid: the Security Group ID where the see-thru rule was found
see_thru_exerted: when true, the see-thru rule had to let this connection through and there was no protocol rule that would have allowed it; when false, there was a protocol rule that would have let the connection through anyway and its presence will be identified in the reason field

Config reference for the see-thru rules can be found here.
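Since the log lines are JSON-structured, they are straightforward to consume outside CloudWatch too. The following is a minimal Python sketch, not an official client: `FlowRecord` and `parse_flow_line` are hypothetical names, and the sample line is invented for illustration, so real lines may carry other keys or value formats.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class FlowRecord:
    """Flow log fields from the summary above; every field is treated as optional."""
    dhost: Optional[str] = None
    cat: Optional[str] = None
    outcome: Optional[str] = None
    src: Optional[str] = None
    spt: Optional[int] = None
    proto: Optional[str] = None
    proto_v: Optional[str] = None
    dst: Optional[str] = None
    dpt: Optional[int] = None
    reason: Optional[str] = None
    see_thru_days_remaining: Optional[int] = None
    see_thru_gid: Optional[str] = None
    see_thru_exerted: Optional[bool] = None


def parse_flow_line(line: str) -> FlowRecord:
    """Parse one JSON log line, keeping only the documented field names."""
    raw = json.loads(line)
    known_names = {f.name for f in fields(FlowRecord)}
    return FlowRecord(**{k: v for k, v in raw.items() if k in known_names})


# Hypothetical sample; the values are made up and not taken from a real log.
SAMPLE = json.dumps({
    "dhost": "api.github.com",
    "cat": "client",
    "outcome": "allowed",
    "src": "172.16.1.9",
    "spt": 49152,
    "proto": "tls",
    "dst": "192.0.2.10",
    "dpt": 443,
    "reason": "illustrative reason text",
})

record = parse_flow_line(SAMPLE)
print(record.dhost, record.outcome, record.dpt)
```

Fields absent from a line, such as the see-thru ones when no see-thru rule applies, simply stay `None` in this sketch.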

config logs

Config Logs can be found in CloudWatch under the discrimiNAT log group, in the config log stream. They form an audit trail of changes made to rules and Security Group attachments as clients come and go.

Here’s a summary of the possible fields:

cat: change category – fqdn, client or see-thru
outcome: accepted something new, or removed it
client: IP address of an affected client
gid: the Security Group ID that relates to this change
fqdn: the FQDN found in a Security Group rule
proto: tls or ssh protocol as identified in the rule

If a Security Group with a see-thru rule in it is found attached to a client application, the following field will also be present:

thru_date: the date until which the non-blocking see-thru mode should remain effective; past this date such a rule would lose effect
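To illustrate how this audit trail might be summarised, here is a small Python sketch that tallies config log events per Security Group. The function name `tally_by_group` is hypothetical, and the sample lines, including the literal `outcome` values, are assumptions made for the example rather than output copied from discrimiNAT.

```python
import json
from collections import Counter, defaultdict


def tally_by_group(lines):
    """Count (cat, outcome) pairs for each Security Group ID seen in config log lines."""
    tallies = defaultdict(Counter)
    for line in lines:
        event = json.loads(line)
        gid = event.get("gid")
        if gid is None:
            continue  # skip lines that do not relate to a Security Group
        tallies[gid][(event.get("cat"), event.get("outcome"))] += 1
    return tallies


# Hypothetical config log lines; field names follow the summary above,
# but the values (including "accepted" and "removed") are assumed.
SAMPLE_LINES = [
    '{"cat": "fqdn", "outcome": "accepted", "gid": "sg-00replaceme00", "fqdn": "api.github.com", "proto": "tls"}',
    '{"cat": "client", "outcome": "accepted", "gid": "sg-00replaceme00", "client": "172.16.1.9"}',
    '{"cat": "client", "outcome": "removed", "gid": "sg-00replaceme00", "client": "172.16.1.9"}',
]

for gid, counts in tally_by_group(SAMPLE_LINES).items():
    for (cat, outcome), n in sorted(counts.items()):
        print(gid, cat, outcome, n)
```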

Filters

The log lines are JSON-structured, so each field can be addressed individually. More of the CloudWatch syntax is described on its Filter and pattern syntax page.

disallowed connections

{ $.outcome = "disallowed" }

allowed but protocol is not TLS

{ $.outcome = "allowed" && $.proto != "tls" }

connections from a specific client where the destination host is not api.github.com

{ $.cat = "client" && $.src = "172.16.1.9" && $.dhost != "api.github.com" }

security groups where the see-thru mode exception has 2 or fewer days left

{ $.see_thru_days_remaining <= 2 }

connections where the see-thru mode had to be used, excluding destinations that are perhaps just telemetry

{ $.see_thru_exerted IS TRUE && $.dhost != "ec2-instance-connect.*" }
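For testing such conditions against exported log lines, the filters above can be approximated in plain Python. This is only a sketch: the function names and sample events are hypothetical, and CloudWatch may treat edge cases such as missing fields differently from these predicates.

```python
from fnmatch import fnmatchcase


def is_disallowed(e):
    return e.get("outcome") == "disallowed"


def allowed_non_tls(e):
    return e.get("outcome") == "allowed" and e.get("proto") != "tls"


def from_client_not_host(e, src="172.16.1.9", host="api.github.com"):
    return e.get("cat") == "client" and e.get("src") == src and e.get("dhost") != host


def see_thru_expiring(e, days=2):
    remaining = e.get("see_thru_days_remaining")
    return isinstance(remaining, (int, float)) and remaining <= days


def exerted_non_telemetry(e, pattern="ec2-instance-connect.*"):
    # fnmatchcase stands in for the CloudWatch "*" wildcard here.
    dhost = e.get("dhost") or ""
    return e.get("see_thru_exerted") is True and not fnmatchcase(dhost, pattern)


# Hypothetical parsed flow log events, invented for illustration.
EVENTS = [
    {"cat": "client", "outcome": "disallowed", "src": "172.16.1.9",
     "dhost": "example.org", "proto": "tls"},
    {"cat": "client", "outcome": "allowed", "src": "172.16.1.9",
     "dhost": "api.github.com", "proto": "ssh"},
    {"cat": "client", "outcome": "allowed", "src": "172.16.1.9",
     "dhost": "ec2-instance-connect.us-west-2.amazonaws.com", "proto": "tls",
     "see_thru_exerted": True, "see_thru_days_remaining": 2},
    {"cat": "client", "outcome": "allowed", "src": "172.16.1.10",
     "dhost": "example.net", "proto": "tls",
     "see_thru_exerted": True, "see_thru_days_remaining": 10},
]

for check in (is_disallowed, allowed_non_tls, from_client_not_host,
              see_thru_expiring, exerted_non_telemetry):
    matches = [e["dhost"] for e in EVENTS if check(e)]
    print(check.__name__, matches)
```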

Recipes

building an allowlist from scratch

  1. Create a new Security Group, with a see-thru rule in it. See the config reference and an example here. Give it a thru date far enough in the future to capture all stages of the application’s lifecycle, such as deployment, restart, the occasional uploading of reports, telemetry, monitoring, etc. Note the Security Group ID.

  2. Attach this Security Group to the application. In AWS this could be either attaching the Security Group to the EC2 instances in addition to other attachments, or in the case of serverless workloads, to the Network Interface that is in the VPC.

  3. Let the application follow its normal course of lifecycle. This could last a few hours, days or weeks depending on the application.

  4. Go to CloudWatch Logs Insights and select the discrimiNAT log group. Then enter the following query, select an appropriate time range, and run it. Note: You’ll have to replace the Security Group ID in this example with the one that was created above.

filter see_thru_exerted AND see_thru_gid = "sg-00replaceme00"
| stats count() by see_thru_exerted, see_thru_gid, dhost, proto, dpt

This will produce a table of results like:

  5. From the information in the results table, specific allowlist protocol rules can be created. For example:

discriminat:tls:ec2.us-west-2.amazonaws.com,ssm.us-west-2.amazonaws.com
and discriminat:tls:api.github.com with protocol TCP and destination port 443.

Full reference for creating these protocol rules is here.

  6. Once these protocol rules are attached to the application, run the full application lifecycle again, and then the query from step 4 to ensure that see_thru_exerted did not have to be used to let any disallowed traffic through. There may be cases where you choose to disallow certain destinations anyway, such as endpoints that just collect telemetry data.

  7. Detach or remove the see-thru rule from the application and give it another full lifecycle run just to ensure it works smoothly.
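The aggregation from step 4 and the drafting of protocol rules from its results can also be sketched offline in Python, for instance against exported log lines. This is an illustrative approximation, not the product's tooling: `summarise_exerted` and `candidate_rules` are hypothetical names, the sample events are invented, and only the `discriminat:tls:` rule form shown above is generated. Candidate rules should still be reviewed by hand before use.

```python
from collections import Counter, defaultdict


def summarise_exerted(events, gid):
    """Count connections the see-thru rule in `gid` had to let through,
    grouped by (dhost, proto, dpt), similar to the step 4 query."""
    counts = Counter()
    for e in events:
        if e.get("see_thru_exerted") is True and e.get("see_thru_gid") == gid:
            counts[(e.get("dhost"), e.get("proto"), e.get("dpt"))] += 1
    return counts


def candidate_rules(counts):
    """Draft one tls rule string per destination port from the summary."""
    hosts_by_port = defaultdict(set)
    for (dhost, proto, dpt) in counts:
        if dhost and proto == "tls":
            hosts_by_port[dpt].add(dhost)
    return {
        dpt: "discriminat:tls:" + ",".join(sorted(hosts))
        for dpt, hosts in hosts_by_port.items()
    }


GID = "sg-00replaceme00"  # placeholder, as in the query above

# Hypothetical parsed flow log events, invented for illustration.
EVENTS = [
    {"see_thru_exerted": True, "see_thru_gid": GID,
     "dhost": "ec2.us-west-2.amazonaws.com", "proto": "tls", "dpt": 443},
    {"see_thru_exerted": True, "see_thru_gid": GID,
     "dhost": "ec2.us-west-2.amazonaws.com", "proto": "tls", "dpt": 443},
    {"see_thru_exerted": True, "see_thru_gid": GID,
     "dhost": "ssm.us-west-2.amazonaws.com", "proto": "tls", "dpt": 443},
    {"see_thru_exerted": False, "see_thru_gid": GID,
     "dhost": "api.github.com", "proto": "tls", "dpt": 443},
]

summary = summarise_exerted(EVENTS, GID)
for port, rule in candidate_rules(summary).items():
    print(port, rule)
```

Grouping by destination port keeps each drafted rule compatible with a single protocol and port pairing on the Security Group rule it would be attached to.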