I’ve recently started to play with more of the internals of Cloud Foundry than I’ve been used to. This has been made much easier by the advent of bosh-lite, a system for deploying all of Cloud Foundry’s components into a single virtual machine using the bosh continuous deployment and configuration tool. bosh-lite achieves this by using containers (Cloud Foundry’s own Warden container technology) to “emulate” the individual VMs where jobs would run in a full distributed topology.
bosh-lite has actually been around for a number of months now, but I’ve not had much of a chance to play with it until recently. This is partly down to other activities, and partly because my earlier attempts to get an environment up and running were hampered by a lack of memory. It should be possible to run bosh-lite with a Cloud Foundry deployment in 8GB of RAM, but given my laptop’s configuration and the amount of other stuff I’m usually running, that was never comfortable – now that I’m rocking 16GB in a MacBook Pro, things are running more smoothly.
I don’t intend to spend this post documenting how to install bosh-lite and get a running single-node Cloud Foundry system. I followed the instructions in the README and things went well on this occasion. One suggestion I’d make is to use VMware Fusion (assuming, like me, you’re on OS X) and the Vagrant provider for Fusion if you can – it seems quite a lot better than VirtualBox. If you do, don’t forget to pass the --provider=vmware_fusion
flag when you bring your Vagrant image up (that’s something I usually do forget). One other little thing to mention is that after I started the bosh deployment, the bosh CLI gem timed out and returned a REST error – but the deployment process itself continued without any issues, and I was able to use bosh tasks
to check in on the progress. If you are interested, I used cf-release-157 this time around.
Once I had my minty-fresh Cloud Foundry running, I deployed Matt Stine’s handy, simple Ruby scale demo app and pushed up the number of instances.
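To give a flavour of what that sort of app does (this is just a sketch of the idea, not Matt’s actual code): a scale demo is typically a tiny Rack app that reports which instance served the request, using the instance_index that Cloud Foundry puts into the VCAP_APPLICATION environment variable.

# config.ru – a minimal sketch of a “scale demo” style Rack app (not the real
# thing): it simply reports which application instance handled the request.
require "json"

run lambda { |env|
  vcap     = JSON.parse(ENV["VCAP_APPLICATION"] || "{}")
  instance = vcap["instance_index"]
  [200, { "Content-Type" => "text/plain" }, ["Served by instance #{instance}\n"]]
}

Refreshing the app’s route a few times should then show requests being spread across the instances.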
So what’s the point of this post? I want to mention two things…
Note: this is not about debugging applications on Cloud Foundry in general – a PaaS is an opinionated system and you generally shouldn’t need to poke around inside it like this. This is for debugging the Cloud Foundry runtime itself, or aspects that might run inside a container. Oh, and I’m sorry about the formatting of some of the shell output examples below!
Peeking at NATS traffic
NATS is the internal, lightweight message bus that Cloud Foundry components use to talk to one another. I’d read blog posts from Cornelia and from Dr Nic about digging into this before.
First of all, I used bosh ssh
to access the NATS host:
$ bosh ssh
1. ha_proxy_z1/0
2. nats_z1/0
3. postgres_z1/0
4. uaa_z1/0
5. login_z1/0
6. api_z1/0
7. clock_global/0
8. api_worker_z1/0
9. etcd_leader_z1/0
10. hm9000_z1/0
11. runner_z1/0
12. loggregator_z1/0
13. loggregator_trafficcontroller_z1/0
14. router_z1/0
Choose an instance: 2
Enter password (use it to sudo on remote host): ***
Target deployment is `cf-warden'
Setting up ssh artifacts
Director task 9
Task 9 done
Starting interactive shell on job nats_z1/0
So now I’m on the NATS host – now what? well, strictly speaking I didn’t need to login to that host / container, since of course, as a messaging system, the other hosts can connect to it anyway. The reason I wanted to login to it was to find out how NATS was configured.
$ ps -ef | grep nats
root 1470 1 0 12:09 ? 00:00:12 /var/vcap/packages/gnatsd/bin/gnatsd -V -D -c /var/vcap/jobs/nats/config/nats.conf
$ more /var/vcap/jobs/nats/config/nats.conf
net: "10.244.0.6"
port: 4222
pid_file: "/var/vcap/sys/run/nats/nats.pid"
log_file: "/var/vcap/sys/log/nats/nats.log"
authorization {
  user: "nats"
  password: "nats"
  timeout: 15
}
cluster {
  host: "10.244.0.6"
  port: 4223
  authorization {
    user: "nats"
    password: "nats"
    timeout: 15
  }
  routes = [
  ]
}
From this, I can see that NATS is listening on IP 10.244.0.6, port 4222 (the NATS default), and that it is configured for username/password authentication. Handy to know!
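Armed with that, it’s worth a quick sanity check that the address and credentials actually work before doing anything more interesting. Here’s a minimal sketch using the nats Ruby gem (the same client the script below uses) – the connectivity.check subject is just something made up for the test:

#!/usr/bin/env ruby
# Sanity check: connect to NATS with the credentials from nats.conf and
# round-trip a message on a made-up subject, then exit.
require "nats/client"

NATS.start(:uri => "nats://nats:nats@10.244.0.6:4222") do
  NATS.subscribe("connectivity.check") do |msg|
    puts "Round-trip OK: #{msg}"
    NATS.stop
  end
  NATS.publish("connectivity.check", "hello from bosh-lite")
end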
I borrowed a little script from Dr Nic, but needed to modify it slightly to talk to authenticated NATS (his original script assumed there was no auth in place):
#!/usr/bin/env ruby
require "nats/client"

NATS.start(:uri => "nats://nats:nats@10.244.0.6:4222") do
  NATS.subscribe('>') { |msg, reply, sub| puts "Msg received on [#{sub}] : '#{msg}'" }
end
[update – Dr Nic has provided a more convenient method to do this, in the comments below – check out nats-sub
– but this works, as well]
$ ./nats-all.sh Msg received on [router.register] : '{"host":"10.244.0.134","port":8080,"uris":["login.10.244.0.34.xip.io"],"tags":{"component":"login"},"index":0,"private_instance_id":"e6194fe8-4910-4cb1-9f7c-d5ee7ff3f36b"}' Msg received on [router.register] : '{"host":"10.244.0.130","port":8080,"uris":["uaa.10.244.0.34.xip.io"],"tags":{"component":"uaa"},"index":0,"private_instance_id":"7713dd5b-3613-41a6-9c67-c48f22a769b4"}' Msg received on [router.register] : '{"dea":"0-1ba3459ea4cd406db833c1d188a78c02","app":"b8550851-37a0-4bd5-bdce-1d787b087887","uris":["andyp.10.244.0.34.xip.io"],"host":"10.244.0.26","port":61021,"tags":{"component":"dea-0"},"private_instance_id":"b52dfd91d68144cabb14b6c7bae77daae8b493acf1354c99941d49772a1f61fb"}' Msg received on [router.register] : '{"dea":"0-1ba3459ea4cd406db833c1d188a78c02","app":"b8550851-37a0-4bd5-bdce-1d787b087887","uris":["andyp.10.244.0.34.xip.io"],"host":"10.244.0.26","port":61025,"tags":{"component":"dea-0"},"private_instance_id":"090f5c5aeee94fdfb4a4e0f0afde2553480dcd97c018431db37b4dffdc80fde4"}' Msg received on [router.register] : '{"dea":"0-1ba3459ea4cd406db833c1d188a78c02","app":"b8550851-37a0-4bd5-bdce-1d787b087887","uris":["andyp.10.244.0.34.xip.io"],"host":"10.244.0.26","port":61028,"tags":{"component":"dea-0"},"private_instance_id":"92e10af77b274836a3f54373c9b7feee025c5b72f41a4c4982bde97d241ebd5b"}' Msg received on [router.register] : '{"dea":"0-1ba3459ea4cd406db833c1d188a78c02","app":"b8550851-37a0-4bd5-bdce-1d787b087887","uris":["andyp.10.244.0.34.xip.io"],"host":"10.244.0.26","port":61039,"tags":{"component":"dea-0"},"private_instance_id":"86edf0c0a7f84f04b52693b489ad93b7f857f77271b84d568d8f5600b34f7054"}' Msg received on [router.register] : '{"host":"10.244.0.26","port":34567,"uris":["8b24c0a7d28f4e03aa028a3dc89fb8c3.10.244.0.34.xip.io"],"tags":{"component":"directory-server-0"}}' Msg received on [dea.advertise] : '{"id":"0-1ba3459ea4cd406db833c1d188a78c02","stacks":["lucid64"],"available_memory":23296,"available_disk":22528,"app_id_to_count":{"b8550851-37a0-4bd5-bdce-1d787b087887":10},"placement_properties":{"zone":"default"}}' Msg received on [staging.advertise] : '{"id":"0-1ba3459ea4cd406db833c1d188a78c02","stacks":["lucid64"],"available_memory":23296}' Msg received on [dea.heartbeat] : 
'{"droplets":[{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"d92d3c0c43ce4b6981e443e5c2064580","index":0,"state":"RUNNING","state_timestamp":1392639135.9526377},{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"898e632697e246de9cf6b7330444227c","index":1,"state":"RUNNING","state_timestamp":1392639136.3117783},{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"56d023e374aa49d88720daabac58e862","index":2,"state":"RUNNING","state_timestamp":1392639135.2225387},{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"f11d86a7f4ad47f1ad554ae1b087d5f6","index":3,"state":"RUNNING","state_timestamp":1392639136.1042},{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"c9e6de77f0484e6cae47f73ad6ca778a","index":4,"state":"RUNNING","state_timestamp":1392639135.9426212},{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"924c387fc33444289b2db2762eefac42","index":5,"state":"RUNNING","state_timestamp":1392639135.940636},{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"69866b260b1a49a09c03e178c4add2c5","index":6,"state":"RUNNING","state_timestamp":1392639135.944143},{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"94bc605505d94dc1832e55bf2f671a99","index":7,"state":"RUNNING","state_timestamp":1392639135.4456258},{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"8420df9bbe64456385dfa91285641ba4","index":8,"state":"RUNNING","state_timestamp":1392639135.9456131},{"cc_partition":"default","droplet":"b8550851-37a0-4bd5-bdce-1d787b087887","version":"a420d371-0816-4baf-9649-4e21255a66a4","instance":"ed9ad14f6599494c96f90296c59e6041","index":9,"state":"RUNNING","state_timestamp":1392639135.938359}],"dea":"0-1ba3459ea4cd406db833c1d188a78c02"}' Msg received on [router.register] : '{"host":"10.244.0.10","port":8080,"uris":["loggregator.10.244.0.34.xip.io"]}' Msg received on [router.register] : '{"host":"10.244.0.138","port":9022,"uris":["api.10.244.0.34.xip.io"],"tags":{"component":"CloudController"},"index":0,"private_instance_id":null}' Msg received on [router.register] : '{"host":"10.244.0.134","port":8080,"uris":["login.10.244.0.34.xip.io"],"tags":{"component":"login"},"index":0,"private_instance_id":"e6194fe8-4910-4cb1-9f7c-d5ee7ff3f36b"}'
Warden containers and shells
Cloud Foundry’s native container technology is called Warden. When an application is deployed, Cloud Foundry starts up a Warden container sized according to the limits assigned (memory and so on), and the application runs inside it. How can you get “inside” the container to see what is going on?
Well, there are a few techniques. Cloud Foundry’s Loggregator provides streaming access to the standard application logs (stdout/stderr) via the cf logs
command. Another option is James Bayer’s cool websocket-based method for getting access to the container. Yet another option is Warden’s own shell, wsh
. This does assume you can access the DEA machine with ssh, however.
wsh doesn’t seem to be very well documented, although I knew Cornelia had played around with it – see her excellent blog post on troubleshooting CF and applications, including a great flowchart / graphic suggesting different techniques.
Here’s the secret sauce:
1. Log in to the DEA VM (called “runner_z1/0” in the list provided by bosh ssh
).
2. Identify your Warden container… there are a lot showing below, but I happen to know that these are several instances of the same app. The important part is the instance-17ij46hadt2
– the second part of that value maps to the location of the container’s private space on disk.
$ ps -ef | grep warden root 49 42 1 11:41 ? 00:00:41 /var/vcap/bosh/bin/ruby /var/vcap/bosh/bin/bosh_agent -c -I warden -P ubuntu root 5390 32634 0 12:12 ? 00:00:00 /var/vcap/data/packages/warden/38.1/warden/src/oom/oom /tmp/warden/cgroup/memory/instance-17ij46hadss root 5503 32634 0 12:12 ? 00:00:00 /var/vcap/data/packages/warden/38.1/warden/src/oom/oom /tmp/warden/cgroup/memory/instance-17ij46hadsu root 5697 32634 0 12:12 ? 00:00:00 /var/vcap/data/packages/warden/38.1/warden/src/oom/oom /tmp/warden/cgroup/memory/instance-17ij46hadt3 root 6779 32634 0 12:12 ? 00:00:00 /var/vcap/data/warden/depot/17ij46hadsu/bin/iomux-spawn /var/vcap/data/warden/depot/17ij46hadsu/jobs/58 /var/vcap/data/warden/depot/17ij46hadsu/bin/wsh --socket /var/vcap/data/warden/depot/17ij46hadsu/run/wshd.sock --user vcap /bin/bash root 6780 6779 0 12:12 ? 00:00:00 /var/vcap/data/warden/depot/17ij46hadsu/bin/wsh --socket /var/vcap/data/warden/depot/17ij46hadsu/run/wshd.sock --user vcap /bin/bash root 6784 32634 0 12:12 ? 00:00:00 /var/vcap/data/warden/depot/17ij46hadsu/bin/iomux-link -w /var/vcap/data/warden/depot/17ij46hadsu/jobs/58/cursors /var/vcap/data/warden/depot/17ij46hadsu/jobs/58 root 6930 32634 0 12:12 ? 00:00:00 /var/vcap/data/warden/depot/17ij46hadss/bin/iomux-spawn /var/vcap/data/warden/depot/17ij46hadss/jobs/59 /var/vcap/data/warden/depot/17ij46hadss/bin/wsh --socket /var/vcap/data/warden/depot/17ij46hadss/run/wshd.sock --user vcap /bin/bash root 6931 6930 0 12:12 ? 00:00:00 /var/vcap/data/warden/depot/17ij46hadss/bin/wsh --socket /var/vcap/data/warden/depot/17ij46hadss/run/wshd.sock --user vcap /bin/bash root 6934 32634 0 12:12 ? 00:00:00 /var/vcap/data/warden/depot/17ij46hadss/bin/iomux-link -w /var/vcap/data/warden/depot/17ij46hadss/jobs/59/cursors /var/vcap/data/warden/depot/17ij46hadss/jobs/59 root 6950 32634 0 12:12 ? 00:00:00 /var/vcap/data/warden/depot/17ij46hadt3/bin/iomux-spawn /var/vcap/data/warden/depot/17ij46hadt3/jobs/60 /var/vcap/data/warden/depot/17ij46hadt3/bin/wsh --socket /var/vcap/data/warden/depot/17ij46hadt3/run/wshd.sock --user vcap /bin/bash root 6955 6950 0 12:12 ? 00:00:00 /var/vcap/data/warden/depot/17ij46hadt3/bin/wsh --socket /var/vcap/data/warden/depot/17ij46hadt3/run/wshd.sock --user vcap /bin/bash root 6960 32634 0 12:12 ? 00:00:00 /var/vcap/data/warden/depot/17ij46hadt3/bin/iomux-link -w /var/vcap/data/warden/depot/17ij46hadt3/jobs/60/cursors /var/vcap/data/warden/depot/17ij46hadt3/jobs/60 vcap 23713 16807 0 12:26 pts/0 00:00:00 grep --color=auto warden root 32634 1 0 11:52 ? 00:00:09 ruby /var/vcap/data/packages/warden/38.1/warden/vendor/bundle/ruby/1.9.1/bin/rake warden:start[/var/vcap/jobs/dea_next/config/warden.yml]
3. Head over to the directory for your chosen Warden instance:
$ cd /var/vcap/data/warden/depot/17ij46hadt2
4. Notice that the Warden containers are running as root. If you run wsh now as an unprivileged user, you’ll get a connect: Permission denied
error. Time to switch to root, then run wsh, specifying the command to run inside the container as a parameter:
$ sudo su -
# cd /var/vcap/data/warden/depot/17ij46hadt2
# bin/wsh /bin/bash
5. At this point, we’re inside the Warden container with a bash shell, and all commands are scoped inside it. So, let’s take a look at what is running:
root@17ij46hadt2:~# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 12:12 ? 00:00:00 wshd: 17ij46hadt2
vcap 29 1 0 12:12 ? 00:00:00 /bin/bash
vcap 31 29 0 12:12 ? 00:00:00 ruby /home/vcap/app/vendor/bundle/ruby/1.9.1/bin/rackup config.ru -p 61031
vcap 32 31 0 12:12 ? 00:00:00 /bin/bash
vcap 33 31 0 12:12 ? 00:00:00 /bin/bash
vcap 34 32 0 12:12 ? 00:00:00 tee /home/vcap/logs/stdout.log
vcap 35 33 0 12:12 ? 00:00:00 tee /home/vcap/logs/stderr.log
root 39 1 0 12:27 pts/0 00:00:00 /bin/bash
root 52 39 0 12:27 pts/0 00:00:00 ps -ef
This is our Ruby app, running on port 61031, and we can see the logs being written as well.
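While you’re in there, a quick way to prove that the rackup process really is serving on that port is to hit it from inside the container. This is just a sketch – 61031 is the port from the ps output above (yours will differ), and it assumes a ruby binary is reachable from the shell (the buildpack’s copy lives under /home/vcap/app if the system one isn’t on the PATH):

# check_app.rb – run from the shell inside the container
require "net/http"

response = Net::HTTP.get_response(URI("http://127.0.0.1:61031/"))
puts "#{response.code} #{response.message}"
puts response.body.to_s[0, 200]

You could of course just curl the app’s route from outside, but doing it from inside the container tells you the app instance itself is healthy, independently of the router.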
Hopefully this is useful information for folks wanting to dig around inside bosh-lite and a running Cloud Foundry system!
Comments

I’ve subsequently learned how to get nats-sub to work, rather than writing some bespoke ruby.
Try out:
nats-sub '>' -s nats://nats:nats@10.244.0.6:4222
nice write-up andy! the instances.json file on the DEA should help you identify which warden container handle belongs to which app easily. you should be able to find that easily once on the dea by using something like “find /var/vcap -name instances.json”.
Yup – that’s (as root)
/var/vcap/data/packages/nats/10.1/nats/vendor/bundle/ruby/1.9.1/bin/nats-sub '>' -s nats://nats:nats@10.244.0.6:4222
(and no smart quotes, correct those if WordPress messes them up)
Hi, I just have a question: can vmc-IronFoundry only be installed on Windows? I have made Micro Cloud Foundry and Micro Iron Foundry run on a physical machine with Ubuntu, and I tried to push ASP.NET applications with vmc-IronFoundry installed on Ubuntu, but it didn’t work.
I think the best way to find out would be to check with the Iron Foundry community.
There is a way to identify the exact warden container for a given app instance.
First get the GUID for the app, which can be seen by running “CF_TRACE=true cf app “. One of the requests should be something like “GET /v2/apps//summary”, where “” is the GUID for the app.
Once you have the app’s GUID, you can look in the file “/var/vcap/data/dea_next/db/instances.json” in the “runner” VM. This file contains the mapping of app instances to Warden containers. Look for the “application_id” and “instance_index” fields that match the app and instance you care about, then use the “warden_container_path” value in the same JSON hash.
WordPress mangled my comment. That should be:
“GET /v2/apps/[guid]/summary”, where “[guid]” is the GUID for the app
Thanks a lot!
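For anyone who wants to script that lookup, here’s a rough sketch. The field names are the ones described in the comment above; the top-level “instances” array is an assumption about the file’s layout, so treat it as illustrative rather than definitive:

#!/usr/bin/env ruby
# Usage (on the runner_z1/0 VM): ruby find_container.rb APP_GUID [INSTANCE_INDEX]
# Maps an app GUID and instance index to its Warden container path using the
# DEA's instances.json, as described in the comment above.
require "json"

app_guid = ARGV[0]
index    = (ARGV[1] || "0").to_i

snapshot  = JSON.parse(File.read("/var/vcap/data/dea_next/db/instances.json"))
instances = snapshot["instances"] || []   # assumed layout – adjust if yours differs

match = instances.find do |i|
  i["application_id"] == app_guid && i["instance_index"] == index
end

puts(match ? match["warden_container_path"] : "no matching instance found")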
This is an even (slightly) easier way to identify your Warden container:
$ ps -ef | grep warden/depot