Skip to content

CloudFront CDN configuration and Varnish refactor#4207

Merged
wjordan merged 3 commits into
stagingfrom
cdo-cloudfront
Oct 23, 2015
Merged

CloudFront CDN configuration and Varnish refactor#4207
wjordan merged 3 commits into
stagingfrom
cdo-cloudfront

Conversation

@wjordan

@wjordan wjordan commented Sep 25, 2015

Copy link
Copy Markdown
Contributor
  1. Introduces an application-specific HTTP cache-configuration hash that will be used for provisioning/configuring our HTTP cache layers moving forward.
  2. Refactors the Varnish configuration to take advantage of the cache-configuration hash, which separates the application-specific configuration from the Varnish-specific VCL code. I've also taken this refactoring opportunity to upgrade Varnish to 4.0.3 which is the current-supported release (Varnish 3 was retired in April 2015).
  3. Introduces a CloudFront CDN configuration script which uses the HTTP cache-configuration hash to create/update an environment-specific CloudFront distribution using the AWS SDK for Ruby.
  4. Adds a set of Chef integration tests to the refactored cdo-varnish cookbook, to ensure that no regressions have been introduced during the refactoring.
  5. Rack::SslEnforcer middleware will be added to all environments including production, to ensure that the HTTP Strict-Transport-Security response header is set from the application layer rather than from the HTTP-cache layer (this was previously being served by Varnish in non-development environments). HTTPS redirection is still handled by the HTTP-cache layer (CloudFront or Varnish) in non-development environments.

Aside from the middleware change, none of these changes affect the web-application code directly. The CloudFront CDN and Varnish cookbook are two separate pieces that are deployed independently of our webapp-deployment process. Steps for each rollout are documented below:

CloudFront (wiki)

Setup steps for CloudFront CDN rollout across environments:

(In the below instructions, replace ENV with the environment name, e.g. ENV-studio.code.org is staging-studio.code.org in staging, with the exceptions of code.org and studio.code.org in production. [num]/[abc]/[xyz] are placeholders for arbitrary ID strings.)

  • Run the update_cloudfront script with RAILS_ENV and AWS credentials provided in the environment. This will create/update the environment-specific CloudFront distribution based on the latest HTTP cache configuration in the codebase.
  • In Route 53, add 2 new A ALIAS entries for the dashboard/pegasus ELB endpoints, that will serve as the origin domains for the CloudFront distributions:
ENV-dashboard.code.org A ALIAS dualstack.ENV-dashboard-[num].us-east-1.elb.amazonaws.com.
ENV-pegasus.code.org A ALIAS dualstack.ENV-pegasus-[num].us-east-1.elb.amazonaws.com.
  • Test that the new hostnames serve the correct content when loading ENV-dashboard.code.org and ENV-pegasus.code.org directly.
  • To switch on the CDN integration, update the A entries from the existing ELB endpoints to the corresponding CloudFront ones:
ENV-studio.code.org A ALIAS [abc].cloudfront.net.
ENV.code.org A ALIAS [xyz].cloudfront.net.

Note: When testing, it might be a good idea to set the TTL to a relatively low number of seconds, so that a rollback will be applied without too much delay if necessary.

  • To rollback the CDN integration, simply update the A entries back to point directly to the ELB:
ENV-studio.code.org A ALIAS dualstack.ENV-dashboard-[num].us-east-1.elb.amazonaws.com.
ENV.code.org A ALIAS dualstack.ENV-pegasus-[num].us-east-1.elb.amazonaws.com.

Varnish (wiki)

For the Varnish update rollout I plan to test out a new versioned-cookbook deployment process:

  • First, set the cookbook constraint to the existing cookbook version in all relevant environments one of these three ways:
    • Adjust the cookbook_versions property from knife environment edit [ENV];
    • Adjust the "cookbook constraints" section in the policy->environments section of the Chef management console;
    • Running knife cookbook upload [cookbook] -E [ENV] with the existing cookbook present.
  • Increase the cookbook version in metadata.rb, and make all cookbook changes.
  • Upload the newly-versioned cookbook and update the environment-specific cookbook constraints one at a time, through one of the three ways listed above.

@wjordan wjordan changed the title [WIP] CDO cloudfront config CloudFront CDN configuration Sep 30, 2015
@wjordan wjordan force-pushed the cdo-cloudfront branch 2 times, most recently from dc1cab9 to dd168e9 Compare September 30, 2015 21:51
@wjordan wjordan changed the title CloudFront CDN configuration CloudFront CDN configuration and Varnish refactor Sep 30, 2015
@philbogle

Copy link
Copy Markdown
Contributor

Could you link or copy the set up stepsin this pull request into the wiki to make sure they can be found? (If there not finalized, it's fine if this doesn't happen right away.)

@wjordan

wjordan commented Oct 1, 2015

Copy link
Copy Markdown
Contributor Author

I've made the CloudFront rollout steps also available at CDN Setup. The Varnish rollout steps will be integrated into Chef-managed Configuration Changes once they have been tested and are finalized.

Comment thread cookbooks/cdo-varnish/README.md Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you wrap these lines that don't fit in the github line length?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@wjordan

wjordan commented Oct 1, 2015

Copy link
Copy Markdown
Contributor Author

A CircleCI integration-test run including the cdo-varnish tests is available here (expand the ./test.sh section to see the full log).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document the valid path patterns and add unit tests that verify that allowed patterns are correctly translated and that disallowed patterns are rejected with an error. The CloudFront document mentioned earlier in this file allows '*' and '?' anywhere in the path, but this appears to be more restrictive in what it can handle.

@philbogle

Copy link
Copy Markdown
Contributor

High level comment: I do like what you are doing to create a cache configuration hash which isn't tied specifically to Varnish and generating the Varnish config from that!

Comment thread cookbooks/cdo-varnish/README.md Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the link to the Amazon documentation is potentially misleading. It includes example paths which I believe aren't supported here (e.g. /images/*.jpg and /a??.jpg)

@wjordan

wjordan commented Oct 2, 2015

Copy link
Copy Markdown
Contributor Author

@philbogle ready for the next review pass. I've refactored the various helper methods to reject invalid path patterns and added several unit tests to shared/test/test_varnish_helpers.rb, among a few other minor tweaks.

@philbogle

Copy link
Copy Markdown
Contributor

Thanks, will take a look.

On Thu, Oct 1, 2015 at 5:50 PM, Will Jordan notifications@github.com
wrote:

@philbogle https://github.com/philbogle ready for the next review pass.
I've refactored the various helper methods to reject invalid path patterns
and added several unit tests to shared/test/test_varnish_helpers.rb
https://github.com/code-dot-org/code-dot-org/pull/4207/files#diff-5fcdf436dc4538858904c459711574ef,
among a few other minor tweaks.


Reply to this email directly or view it on GitHub
#4207 (comment)
.

@philbogle

Copy link
Copy Markdown
Contributor

Looking now.

@wjordan

wjordan commented Oct 2, 2015

Copy link
Copy Markdown
Contributor Author

FYI, I've added a behavior_for_path helper method, which will be used by a cookie-whitelisting Rack middleware I'm currently working on implementing. This middleware will read the same HttpCache configuration to simulate cookie-whitelisting behavior in development.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about calling this hoc_cookies or whitelisted_cookies?

all_cookies is confusing when I read it below because it doesn't mean all cookies, it mean's certain whilisted cookies.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think RETURN_LANG is incorrect in the case when x is longer than LANG_MAXLEN, because strncpy won't add a terminating null in that case, resulting in a non-null-terminated string.

(I realize this a pre-existing issue but we might want to consider whether it needs to be fixed because it might lead to a varnish crash.)

From http://man7.org/linux/man-pages/man3/strcpy.3.html: "If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated."

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this code is safe as long as x is not longer than LANG_MAXLEN (8), then it should be safe as long as none of the keys in our list of whitelisted languages exceeds this length (max is currently 5), does that sound right?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't be certain that list won't change or that someone won't use this function in some new way.

It's easy to be safe in all cases by unconditionally adding the null termination ourselves after the strncpy, which is standard practice:
strncpy(lang, x, LANG_MAXLEN);
lang[LANG_MAXLEN - 1] = '\0' \

@philbogle

Copy link
Copy Markdown
Contributor

Done with main code review pass, thank you for the added tests!

@philbogle

Copy link
Copy Markdown
Contributor

LGTM after comments and merge conflicts are addressed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is another potential overflow.

strncat, unlike strncpy, always appends a terminate null character. So the maximum number of characters copies is LANG_MAXLEN +4

@wjordan

wjordan commented Oct 3, 2015

Copy link
Copy Markdown
Contributor Author

@philbogle thanks, I'll follow up on the remaining comments and then let you know.

Regarding the comments on accept-language.vcl.erb, I think we would best follow up on those upstream at varnish-accept-language, as I'm not knowledgeable enough about that low-level library to respond directly myself. Perhaps a look at the upstream tests will satisfy some of your concerns, or we could suggest additional tests to be added there (especially if there are failing tests we discover).

I think an upstream security-audit of this code would be valuable if we ever find the time for it, but I think it's also a quite separate task from the present work. The code was not substantially modified by this PR (I've merely extracted it into a separate file in order to ensure it will continue to be untouched), and it remains unchanged from the upstream codebase, so I think it's out of scope of this particular code review.

@wjordan

wjordan commented Oct 10, 2015

Copy link
Copy Markdown
Contributor Author

Rebase is failing tests because #4267 (session key refactoring which this branch depends on) was reverted in #4293 due to some session-cookie UI test failure issue that was not yet resolved.

@philbogle philbogle removed their assignment Oct 22, 2015
@philbogle

Copy link
Copy Markdown
Contributor

Clearing myself as assignee, please re-add me if you'd like me to take another look at this PR.

Add HTTP cache configuration shared by Varnish and CloudFront.
Varnish refactoring:
- Upgrade varnish, libvmod-cookie and libvmod-header to 4.0.3 packages.
- Refactor VCL to normalize/remove unused config and upgrade to Varnish 4 syntax.
- Use test-kitchen for chef integration testing cdo-varnish cookbook.
- Add Chef integration tests for cdo-varnish using WireMock.
- Run Chef integration tests on CircleCI.
- Bump `cdo-varnish` cookbook version for controlled deployment.
Adjust `Rack::SslEnforcer` configuration.
- Only Rack::SslEnforcer to redirect to HTTPS in development environment (otherwise, the HTTP-cache layer will handle it).
- HSTS header is still added to all HTTPS responses in all environments.
Cookie whitelist rack middleware.
@wjordan

wjordan commented Oct 23, 2015

Copy link
Copy Markdown
Contributor Author

Fully rebased, will merge once all tests pass.

@wjordan

wjordan commented Oct 23, 2015

Copy link
Copy Markdown
Contributor Author

circleci failures are unrelated to this PR.

wjordan added a commit that referenced this pull request Oct 23, 2015
CloudFront CDN configuration and Varnish refactor
@wjordan wjordan merged commit fb235aa into staging Oct 23, 2015
@davidsbailey davidsbailey deleted the cdo-cloudfront branch January 7, 2016 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants