How We Used It to Meet FedRAMP Moderate Compliance Requirements
I have something to confess: I LOVE AWS STEP FUNCTIONS! They are probably the most powerful service AWS offers. But it wasn’t always like this. When I first learned of Step Functions, they scared me.
“What is a State Machine?”
“How would you use it?”
“Why so many Lambda Functions?”
They just seemed too complex for me to grasp. Then I started to wonder: how do you stitch two Lambda functions together, where one gets the result of the other? Then it all started to make sense! So let’s do a recap.
What is a State Machine?
A state machine is simply a logical flow of actions. It’s not much different from running a Bash or PowerShell script. For example: a script that runs daily, queries Active Directory for users who haven’t logged in for 90 days, disables any that meet that criterion, and sends an email report of all the newly disabled users. That is a very simple state machine.
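To make that idea concrete, here is a minimal sketch of the same flow written as an explicit state machine in Python. Everything in it is hypothetical (the function names, the `ctx` dictionary, the in-memory logon data standing in for Active Directory); it only illustrates that each step does some work and then names the step that should run next.

```python
from datetime import date, timedelta

# Each state does its work on the shared context and returns the next state's name.

def find_stale(ctx):
    cutoff = ctx["today"] - timedelta(days=90)
    # Hypothetical stand-in for an Active Directory query.
    ctx["stale"] = sorted(u for u, last in ctx["last_logon"].items() if last < cutoff)
    return "disable" if ctx["stale"] else "done"

def disable(ctx):
    # A real script would call the directory here; the sketch only records the names.
    ctx["disabled"] = list(ctx["stale"])
    return "report"

def report(ctx):
    ctx["email_body"] = "Disabled accounts: " + ", ".join(ctx["disabled"])
    return "done"

STATES = {"find_stale": find_stale, "disable": disable, "report": report}

def run(ctx, start="find_stale"):
    """Walk the states until one of them returns 'done'; return the path taken."""
    state, visited = start, []
    while state != "done":
        visited.append(state)
        state = STATES[state](ctx)
    return visited
```

Calling `run(ctx)` with a context that contains `today` and a `last_logon` mapping walks the three states in order, or stops after the first one when nobody is stale.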
Why can’t I just do that instead and forget Step Functions?
Running a script on a server as a cron job requires compute resources and an operating system. These are not without administrative overhead, especially if that script is resource intensive and/or critical for business operations. Step Functions lets you abstract away that administration layer while still being able to run long, complex flows/scripts. Additionally, Step Functions integrates with other AWS services such as EventBridge, so you can trigger your State Machines off of different event patterns or even use them as the logic tier behind API Gateway in a web application.
When I first started using Step Functions it was great, but it was a little tedious. I essentially had to write a Lambda function for every single action. This also meant crafting an IAM Role with least privilege for each Lambda function, and applying a resource policy to the Lambda to prevent privilege escalation by only allowing my State Machine to invoke it. Since I had to write code anyway, I didn’t find many of Step Functions’ built-in states and intrinsic functions very useful; I only used the Choice and Wait states with a CheckStatus Lambda for long asynchronous processes.
Direct API Integrations and Workflow Studio to the Rescue
Last year AWS announced not 1 but 2 total game changers for Step Functions:
- Direct API Integrations: you could now define AWS API calls directly in your State Machines, with no more stitching together simplistic Lambdas for each action. EVERY API ACTION! This really unlocked the power of the intrinsic functions, since I could now offload that logic to Step Functions. You only need to write a Lambda if you are using a custom library or need to do some complex data transformation.
- Workflow Studio: a super slick browser-based wizard where you can drag and drop to define your flow. This really reminded me (in a positive way!) of building SharePoint Workflows at my previous job.
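As a rough illustration of a direct API integration, the sketch below builds a one-state definition in Python and prints it as Amazon States Language JSON. The Task state calls DynamoDB's Scan action through the `aws-sdk` resource ARN pattern, with no Lambda in between. The helper `sdk_task`, the state name, and the table name `ssl-endpoints` are made up for the example.

```python
import json

def sdk_task(service, action, parameters, next_state=None):
    """Build a Task state that calls an AWS API action directly (no Lambda)."""
    state = {
        "Type": "Task",
        # AWS SDK integrations follow the pattern aws-sdk:<service>:<action>.
        "Resource": f"arn:aws:states:::aws-sdk:{service}:{action}",
        "Parameters": parameters,
    }
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state

definition = {
    "StartAt": "ScanEndpoints",
    "States": {
        # "ssl-endpoints" is a hypothetical table name.
        "ScanEndpoints": sdk_task("dynamodb", "scan", {"TableName": "ssl-endpoints"}),
    },
}

print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the code view of Workflow Studio, which is a handy way to switch between the drag-and-drop and text views of the same flow.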
Building an SSL Configuration Status Tracker
One of our FedRAMP clients is currently going through the cATO process (Continuous ATO: you do 1/3 of the audit every year, rather than a full audit every 3 years) with their Authorizing Agency. For those not familiar with the FedRAMP process, each Cloud Services Provider has to meet hundreds of different controls based on NIST 800-53, though FedRAMP sets hard requirements where NIST leaves many of those criteria “organization-defined”. Some of the requirements for all public endpoints used by the Federal Government are outlined in CISA’s Binding Operational Directive (BOD) 18-01, Enhance Email and Web Security. This requires DMARC, DKIM and SPF for all email applications as well as strong ciphers and HSTS in your web traffic’s TLS configuration. It is not sufficient to just use FIPS 140-2 validated cryptography (though this does not preclude it); your application must also comply with the following:
- 3DES is still acceptable according to NIST’s CAVP, but BOD 18-01 prohibits the use of 3DES in TLS. RC4 is also prohibited by BOD 18-01; however, the IETF already prohibited RC4 in TLS in 2015 via RFC 7465, so it is very rare to see in the wild these days.
- Must enforce HSTS (more on what this is below).
Enter ssllabs.com (provided by Qualys). SSL Labs is a FedRAMP authorized service which evaluates your web application’s SSL/TLS posture. Since we had 30–40 endpoints to evaluate, this was not something we wanted to do manually! Luckily, ssllabs.com has a public API. This seemed like a perfect time to use Step Functions!
I created a DynamoDB Table which stored all of the URLs that we had to track.
I then defined my Step Function like so:
1. Invoke the State Machine once per week via EventBridge.
2. Query DynamoDB for all the URLs I need to scan.
3. I used the Map state to tell Step Functions to loop through a list and perform a set of actions on each item in the list. Since the DynamoDB query had the list embedded in a JSON dictionary, I just had to specify where in the dictionary to find the list to iterate over.
4. Pass the URL to Lambda — I used this step to do some pre-parsing of the JSON from the previous step before passing it to Lambda.
5. Use Lambda to invoke the SSL Labs API. This was asynchronous, so rather than paying for Lambda to run until the job finished, I just had Step Functions check the status and wait until it was Ready. My Lambda function parsed out several attributes from the results. I will cover these later. Side note: I used the requests library for this. Since this is a fundamental Python library, I would like to see AWS include this as one of their managed Layers. I understand not baking it into the runtime, but it would be nice to be able to import it as a layer without having to upload the library manually.
6. Once Ready, I would upload the results to DynamoDB — the url was the partition key. Since I was using a direct API Integration to invoke DynamoDB:PutItem — I just defined the action like so.
7. Since I wanted to generate two different reports, I then ran two different branches:
- Report on Grade Status
- Report on Certificate Expiration Status (we weren’t using ACM since we had to use the Government’s CA)
8. Query DDB for all endpoints with grades that were not A+ and all endpoints with a certificate expiring in less than 30 days.
"FilterExpression": "Grade <> :val",
"ProjectionExpression": "endpoint, Grade"
Side note: take a look at the <> symbol; this is the expression for Not Equal To in DynamoDB Filter Expressions. I did not see this in the AWS Documentation, but did find it on Stack Overflow!
"FilterExpression": "Expiresin30Days = :val",
"ProjectionExpression": "endpoint, CertIssuer, ExpirationDate"
9. If there were any results (Count not equal to 0), it would send the JSON results to a Lambda function which used the json2html library to convert a JSON list into an HTML table. I then passed this HTML to the sesv2:SendEmail action to send the weekly report.
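For steps 8 and 9, here is a hedged sketch of what the two report lookups and the "any results?" check could look like when written out in full. It assumes the lookups are DynamoDB Scan calls (a filter on a non-key attribute such as Grade points that way), that the table is called `ssl-endpoints`, and that Expiresin30Days is stored as a boolean. None of those details are confirmed above, and the helper names are invented.

```python
def report_scan(filter_expression, projection, value, table="ssl-endpoints"):
    """Parameters for a dynamodb:scan call backing one report (table name is hypothetical)."""
    return {
        "TableName": table,
        "FilterExpression": filter_expression,
        "ProjectionExpression": projection,
        # :val is the placeholder referenced inside the filter expression.
        "ExpressionAttributeValues": {":val": value},
    }

# Every endpoint whose grade is anything other than A+.
grade_report = report_scan("Grade <> :val", "endpoint, Grade", {"S": "A+"})

# Every endpoint flagged as expiring soon (assumption: the flag is a boolean attribute).
expiry_report = report_scan(
    "Expiresin30Days = :val",
    "endpoint, CertIssuer, ExpirationDate",
    {"BOOL": True},
)

def has_results_choice(next_state, default_state):
    """Choice state that only continues to the report when Count is not 0."""
    return {
        "Type": "Choice",
        "Choices": [
            {"Not": {"Variable": "$.Count", "NumericEquals": 0}, "Next": next_state}
        ],
        "Default": default_state,
    }
```

Each dictionary would sit under the `Parameters` key of its own Task state, one per branch, followed by the Choice state so that an empty result never triggers an email.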
Here is the final flow:
With this, I have an automated, serverless SSL/TLS Configuration Status Tracker. If I get time, I will bundle this up in CDK and post it to GitHub.
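Until that CDK version exists, the scanning half of the flow can be sketched as an Amazon States Language definition built in Python. This is an approximation, not the actual state machine: the Lambda names, table name, state names, 60 second wait, and the `status`, `url` and `grade` fields in the Lambda output are all assumptions, and the pre-parsing Pass state is left out for brevity.

```python
import json

TABLE = "ssl-endpoints"            # hypothetical table name
SCAN_FN = "start-ssllabs-scan"     # hypothetical Lambda function names
CHECK_FN = "check-ssllabs-status"

def lambda_task(function_name, next_state):
    """Task state that invokes a Lambda and keeps only its returned payload."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {"FunctionName": function_name, "Payload.$": "$"},
        "OutputPath": "$.Payload",
        "Next": next_state,
    }

# States that run once per URL inside the Map state.
per_url_states = {
    "StartScan": lambda_task(SCAN_FN, "WaitForScan"),
    "WaitForScan": {"Type": "Wait", "Seconds": 60, "Next": "CheckStatus"},
    "CheckStatus": lambda_task(CHECK_FN, "IsReady"),
    "IsReady": {
        "Type": "Choice",
        "Choices": [
            {"Variable": "$.status", "StringEquals": "READY", "Next": "SaveResult"},
            {"Variable": "$.status", "StringEquals": "ERROR", "Next": "ScanFailed"},
        ],
        "Default": "WaitForScan",  # not finished yet: wait and poll again
    },
    "SaveResult": {
        "Type": "Task",
        "Resource": "arn:aws:states:::aws-sdk:dynamodb:putItem",
        "Parameters": {
            "TableName": TABLE,
            "Item": {"url": {"S.$": "$.url"}, "Grade": {"S.$": "$.grade"}},
        },
        "End": True,
    },
    "ScanFailed": {"Type": "Fail", "Error": "ScanError"},
}

definition = {
    "StartAt": "ListUrls",
    "States": {
        "ListUrls": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:dynamodb:scan",
            "Parameters": {"TableName": TABLE},
            "Next": "ScanEachUrl",
        },
        "ScanEachUrl": {
            "Type": "Map",
            "ItemsPath": "$.Items",  # where the list lives in the Scan output
            # Older definitions call this field "Iterator".
            "ItemProcessor": {"StartAt": "StartScan", "States": per_url_states},
            "End": True,
        },
    },
}

def dangling_targets(states):
    """Return transition targets that do not name a state at the same level."""
    targets = set()
    for state in states.values():
        targets.update(state[key] for key in ("Next", "Default") if key in state)
        targets.update(choice["Next"] for choice in state.get("Choices", []))
    return targets - set(states)
```

`json.dumps(definition, indent=2)` gives the JSON to paste into the console, and `dangling_targets` is a cheap sanity check that every transition points at a state that exists before you deploy.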
A few comments about SSL Labs
SSL Labs generates some very interesting data regarding your SSL/TLS configuration — which in and of itself could be a long and complex blog. Some of the attributes we were reporting were:
- Grade: the grade was based on many factors, some of which I cover below.
- Certificate Expiration Date: this was in UNIX time, so I had to use the datetime library to convert.
- Certificate Issuer: shows the certificate authority, which simplifies the renewal process.
- Certificate Body: I haven’t built this logic in yet with the DDB queries, but since we are using multi-SAN certificates, some of the URLs would have the same certificate, and would result in redundant entries in the expiration list. At some point in the future, I will add in logic to the queries to only output entries with unique Certificate Bodies.
- Whether it was vulnerable to several well-known exploits, like BEAST or Heartbleed.
- Whether it supports 3DES or RC4. RC4 is explicitly called out in the results. To determine if 3DES was supported, I had to parse all of the supported cipher suites, which luckily wasn’t too rough.
- PFS (Perfect Forward Secrecy): a concept that means every session uses its own unique session key for all encryption. This ensures that if a session key was ever leaked or stolen and someone had recorded the encrypted conversation, they would only be able to decrypt a single session. The solution is using an ephemeral key exchange. If you see DHE in the cipher suite, you are using PFS. This includes the elliptic-curve variant, ECDHE, which is more efficient than classic finite-field DHE.
- HSTS (HTTP Strict Transport Security): a header passed by your web server that tells the browser to use HTTPS for this domain moving forward, even if you enter http. Think of it as a client-side redirect for lack of a better term. Traditionally, you would use an HTTP to HTTPS redirect on your web server or load balancer. However, with a redirect the original request still goes out unencrypted, and it may have sensitive data in it; once the browser knows the HSTS policy, it upgrades to HTTPS before anything is sent. Best practice is setting a max-age of at least 1 year, and includeSubDomains, so all subdomains would enforce HTTPS the first time they were browsed. Also, you can include preload in the header, which signals that the domain may be added to the HSTS preload list that browsers such as Chrome and Firefox ship with, ensuring that it NEVER goes over HTTP. Otherwise, your browser has to get the HSTS response header for the domain you are browsing before it will know to enforce HTTPS. It’s not guaranteed the browser will include your domain for preload, but it’s worth a shot. With BOD 18-01, the Federal Government is requiring all public federal endpoints to enforce HSTS in order to meet FISMA, CMMC, or FedRAMP compliance requirements.
- Secure Renegotiation: this was included only because it was a bit infuriating. One of our endpoints did not support this and thus was graded A-. Secure Renegotiation is used when a client or server wants to switch encryption keys mid-session, usually after Client Authentication. It was odd that this was required for an A+ rating though. Renegotiation can expose your website to several attacks (DoS and MitM), which is why Secure Renegotiation was invented to help mitigate those. Additionally, if you are using ephemeral Diffie-Hellman, you have to ask why you need to renegotiate at all. Those keys are ephemeral and calculated at session initiation, so do you need to immediately turn around and renegotiate after obtaining a unique and secure session key? This might be why renegotiation was dropped from TLS 1.3, where PFS is explicitly enforced.
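A few of the attributes above boil down to small checks, sketched here in Python. These are illustrative helpers, not the code from my Lambda: the function names are invented, the expiration value is assumed to be epoch seconds (if the API hands you milliseconds, divide by 1000 first), and the cipher suite checks assume IANA-style names such as TLS_RSA_WITH_3DES_EDE_CBC_SHA.

```python
from datetime import datetime, timedelta, timezone

def cert_expiry(not_after_seconds):
    """Convert a UNIX timestamp (assumed to be in seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(not_after_seconds, tz=timezone.utc)

def expires_within(expiry, days, now):
    """True when the certificate expires in fewer than `days` days (or already has)."""
    return expiry - now < timedelta(days=days)

def supports_3des(suite_names):
    """True when any offered cipher suite name mentions 3DES."""
    return any("3DES" in name for name in suite_names)

def uses_pfs(suite_name):
    """The rule of thumb from above: DHE in the name (this also matches ECDHE).

    TLS 1.3 suite names omit the key exchange, so they need separate handling.
    """
    return "DHE" in suite_name

def hsts_ok(header, min_age=31536000):
    """Check a Strict-Transport-Security value for max-age >= 1 year and includeSubDomains."""
    directives = [d.strip().lower() for d in header.split(";") if d.strip()]
    max_age = 0
    for directive in directives:
        if directive.startswith("max-age="):
            value = directive.split("=", 1)[1].strip('"')
            if value.isdigit():
                max_age = int(value)
    return max_age >= min_age and "includesubdomains" in directives
```

For example, `hsts_ok("max-age=31536000; includeSubDomains; preload")` passes, while a header with a five minute max-age does not. The preload directive is deliberately not required here, since it is optional.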
Matthew is Senior Solutions Director at stackArmor, a leading AWS cyber security partner that designs custom solutions for customers looking to meet security requirements of compliance frameworks: FedRAMP, NIST 800 series, PCI-DSS, DoD SRG, CMMC, HIPAA, FISMA, FIPS 140–2 (and 3) and more. StackArmor offers an AWS vetted solution to accelerate and decrease the cost of your FedRAMP ATO by over 40%.