Adding Takipi agent framework#472
Conversation
Added support for Takipi analytics and root cause analysis tool. The framework supports both local and remote collector approaches.
|
Hey chook! Thanks for submitting this pull request! All pull request submitters and commit authors must have a Contributor License Agreement (CLA) on-file with us. Please sign the appropriate CLA (individual or corporate). When sending signed CLA please provide your github username in case of individual CLA or the list of github usernames that can make pull requests on behalf of your organization. If you are confident that you're covered under a Corporate CLA, please make sure you've publicized your membership in the appropriate Github Org, per these instructions. Once you've publicized your membership, one of the owners of this repository can close and reopen this pull request, and dreddbot will take another look. |
|
signed and sent CLA doc. |
nebhale
left a comment
There was a problem hiding this comment.
This is a great submission and shouldn't require a huge amount of work to get it merged into the mainline. The one big change is that I'd like to have the framework trigger off the existence of a service who's credential payload exposes connection endpoints and authentication credentials (if necessary). Nearly every other framework could be used an example of how we do this.
I've also taken a look at the agent in your repository and noticed that it does not have a license within it. Anything distributed on the internet should probably have some sort of license that describes what the downloader is allows to do with it. Since there's no license of any kind in there at the moment, I could do pretty much anything with it legally speaking. In order to add the framework to the buildpack (and therefore distribute your agent as part of the offline buildpack), we require a distribution license to be packaged within the agent and liberal enough that all Cloud Foundry distributions can distribute your agent without any additional legal paperwork. As an example, you might look at the license that Contrast Security provides with their agent.
Again, a really great submission that I hope we can get into the buildpack quickly.
| @@ -0,0 +1,23 @@ | |||
| # Cloud Foundry Java Buildpack | |||
There was a problem hiding this comment.
This should be in config/ directory.
|
|
||
| ## Logs | ||
|
|
||
| Currently, you can get the Takipi agent logs using `cf files` command: |
There was a problem hiding this comment.
Given the ephemeral nature of the filesystem in a container, we recommend that you do not write to, or depend on files on, the filesystem. Any chance that the configuration of the agent when running on Cloud Foundry can be modified so that logs all go to stdout or stderr?
There was a problem hiding this comment.
I'll see what can be done. what about temp state files? we have some of those.
There was a problem hiding this comment.
Those should be fine as they only need to live for the lifetime of the container itself. Logs are particularly problematic because you typically need them after something has gone wrong. And when things go wrong, the container is recycled and the filesystem is disposed of. This is why all logging is collected from the console and streamed out to cf logs or a log warehouse like PaperTrail or Splunk.
There was a problem hiding this comment.
our logs are internal debug by nature and our users should not deal with them directly.
I opened an internal task to have some sort of "user facing logs" that can adhere to these requirements and live in the STDOUT/STDERR, but I hope this is not a blocker.
There was a problem hiding this comment.
Nope. I’ll leave it to you to issue an updated PR with the feature shows, but keep in mind that any debugging you’ll need to do for customers on CF likely won’t have logs.
There was a problem hiding this comment.
ok, i'm leaving it as is for now
|
|
||
| # (see JavaBuildpack::Component::VersionedDependencyComponent#supports?) | ||
| def supports? | ||
| @configuration['secret_key'] || @configuration['collector_host'] |
There was a problem hiding this comment.
Rather than doing this with explicit configuration, we would prefer to use services. The advantage of services is that they can communicate credentials to an application (and buildpack) without ever exposing them to users. So you'd expect a user to be able to cf create-service takipi && cf bind-service my-app takipi and everything would just configure itself transparently. In the case where a service broker doesn't yet exist, users would create the same service payload manually (cf create-user-provided-service takipi -p '{ "collector_host": "https://..." }' but to the buildpack this appears to be identical to the service broker approach.
There was a problem hiding this comment.
thanks for the detailed explanation. I did see all other agents do that but it kinda breached my cf knowledge so i brute forced my way in using config. I'll try and incorporate the services module and report back if i get stuck
There was a problem hiding this comment.
Absolutely. I've got a call with your team tomorrow, so if you come up against anything we can discuss there. Beyond that, I'm happy to answer any questions you have here or in the Cloud Foundry Slack #java-buildpack channel.
There was a problem hiding this comment.
No problem. Switched to services
| if @configuration['node_name_prefix'] && !@configuration['node_name_prefix'].empty? | ||
| "#{@configuration['node_name_prefix']}-$CF_INSTANCE_INDEX" | ||
| else | ||
| %q|$(ruby -rjson -e "puts JSON.parse(ENV['VCAP_APPLICATION'])['instance_id']")| |
There was a problem hiding this comment.
I'm not sure if this is really something you want to do. The instance_id is a UUID that is changed for each and every instance (it can also be more directly accessed as $CF_INSTANCE_GUID). This would mean that every instance in a multi-instance deployment would identify itself as a different node. In addition, instances come and go quite regularly (re-balancing, platform upgrades, etc.) and it's very common to see dozens of new instances in a day on many enterprise systems.
I think you might be better off coming with a default node_name_prefix instead and always having a <node_name_prefix>-$CF_INSTANCE_INDEX.
There was a problem hiding this comment.
When I previously dealt with PaaS it was Heroku, and there the "node name" was web1..web2 etc...
what is the equivalent CF way of achieving a reasonable name for each container?
There was a problem hiding this comment.
The numerical value is $CF-INSTANCE-INDEX and there isn’t really an equivalent to the web part. If you wanted to replicate that kind of behavior, I’d just make node_name_prefix default to web and allow the user to override it. Then with the concatenation, you’d see web-1, web-2 and if a user overrode it task-1, foo-1, etc. That seems reasonable.
There was a problem hiding this comment.
Vastly simplifies this code as well.
|
Hey chook! Thanks for submitting this pull request! I'm here to inform the recipients of the pull request that you and the commit authors have already signed the CLA. |
|
@chook And it looks like the CLA has gone through now as well. Progressing quite nicely. |
|
Awesome. I hope to push fixes later tonight. |
1. Config file placed correctly. 2. Moved from configuration to service provision 3. Updated tests 4. Added Takipi to list of frameworks in readme 5. Fixed node prefix definition
|
@nebhale 2 more issues on my plate, but I didn't want the review to wait:
|
|
@chook I'm going to squash all your commits, so if you want to do them before I get there, please do. The GH review system is good enough for me to compare my requested changes to the latest version of the code so there's no problem with you being aggressive about it. |
|
i committed my code changes. |
1. Readme updates now includes service 2. Change repository url
|
@chook Any news on getting a distribution license of some kind into the agent TARs? |
|
@nebhale yes. we've added a license as discussed. It should go live with the next release probably on Monday. I'll update |
|
Copy that. I look forward to getting this in. |
|
@nebhale see latest build (4.8.18) of the Takipi agent. It contains a license |
|
Looks good. I'll work on it now. |
This change adds the Takipi agent framework to the buildpack. This framework supports both the local and remote collector approaches for using the Takipi agent. [#472]
|
@chook All merged in. Go ahead and run some tests with it to ensure that you're happy and we'll cut a release late this week or early next week for you. |
|
LGTM! Thank you so much. |
|
Hi @nebhale was wondering when can we expect a release. |
|
There is not. The 3.x line is EOL at this time and all customers have been encouraged to upgrade to the 4.x line for the last couple of months. |
|
got it. and what about 4.5 release? any forecast? |
|
Historically we've released on about a three week cadence, but we don't have a fixed release schedule. At this point it's been two weeks since the last release, so some time in the next week is likely. |
This change adds the Takipi agent framework to the buildpack. This framework supports both the local and remote collector approaches for using the Takipi agent. [#472]
Added support for Takipi analytics and root cause analysis tool.
The framework supports both local and remote collector approaches.