The first part in this series is about logging. When you’re playing with your code at the REPL, you can immediately see what’s going on. When your code is running on a server in some data center, and you had a problem 6 hours ago, it’s too late for a REPL. You need logs.

Having good logs is an art, but adding logging to a Clojure project is straightforward. So this is where we will start.

This article will show you how to use clojure.tools.logging, and get it set up using the Log4j logging backend. I’ll explain some common logging concepts, cover how to get Log4j installed, configured and working in a Leiningen project, and as an aside explain how to read Clojure stack traces. Finally I’ll run through some code that’s useful for interacting with the logger configuration at run time.

I was chatting with @danoyoung on IRC the other day about embedding an nREPL server, and about using that as the basis for a talk at our local Clojure meetup.

I’ve also taken part in, and watched, conversations about all the effort and knowledge needed to turn some Clojure code into a “real” software product.

Along these lines of thinking, I have an idea for a series of posts about the kinds of things one does when putting Clojure into production. Each will build on the previous, hopefully resulting in a nice multipart article and corresponding repo on GitHub to help beginners see some patterns and ideas for building Clojure services.

The first post will be about logging. Updates coming soon…

Updates within…

I wanted to query Elasticsearch (ES) from Hive. They seem like really interesting, complementary tools.

Setup

Obviously I needed Elasticsearch and Hive to be running. The former is fairly straightforward, the latter requires a basic Hadoop stack.

I have been playing with the Cloudera distribution, and for a cluster it is great, but for this I don’t need the (significant) overhead.

So I found a Vagrant setup that did a bunch of what I needed, fixed a bunch of issues, added Elasticsearch, and pushed up hadoop-hive-elasticsearch-vm – a single-node ES and Hive cluster.

What I’m trying to accomplish

  • ES aggregations are cool, but I have definitely found use cases where they don’t do all that I need.
  • I’m trying to do things like calculating TF-IDF type statistics for documents and groups of documents.
  • The tricky thing with aggregations was stuff like summing TFs across a whole account.
  • These kinds of things are really easy with SQL, so it seems like ES + Hive (or later ES + Spark) would be a very powerful combination.
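
To make the statistic concrete, here is a minimal sketch in plain Python of summing term counts across a whole account and weighting them by inverse document frequency. It is not the Hive or ES query itself. The sample documents, account names, and function names are all hypothetical, and the IDF formula used, log(N / df), is just one common variant.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sample data: (account, tokens) pairs standing in for indexed documents.
DOCS = [
    ("acct-a", ["hive", "query", "hive"]),
    ("acct-a", ["elasticsearch", "query"]),
    ("acct-b", ["hive", "cluster"]),
]


def term_counts_by_account(docs):
    """Sum raw term counts across every document in each account."""
    totals = defaultdict(Counter)
    for account, tokens in docs:
        totals[account].update(tokens)
    return totals


def inverse_document_frequency(docs):
    """idf(t) = log(N / df(t)), where df(t) is the number of documents containing t."""
    n_docs = len(docs)
    doc_freq = Counter()
    for _, tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf_by_account(docs):
    """Weight each account's summed term counts by the corpus-wide IDF."""
    idf = inverse_document_frequency(docs)
    return {
        account: {term: count * idf[term] for term, count in counts.items()}
        for account, counts in term_counts_by_account(docs).items()
    }
```

In SQL terms, the per-account sum is a GROUP BY over account and term, and the IDF is a join against per-term document counts. That kind of grouping and joining is the shape of work the list above says is easy in SQL.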

Part one covered the types of things I’m considering when talking about “debugging”, and the process by which I debug code.

This part is about writing debuggable code, and the tools I use or have seen to debug code: the REPL, println debugging and better, tracing with Spyscope and clojure.tools.trace, and other assorted good things.

I read a few posts recently on r/clojure asking about Clojure debugging tools. It seemed that those asking the questions were looking for the kind of tools we have come to expect from IDEs such as IntelliJ, Eclipse, Visual Studio and their ilk. It’s a familiar question since I asked it myself when I started learning Clojure.

I think I have come to the same conclusion as other Clojure programmers in that my short answer to the question is “there isn’t one”, or “it’s the REPL”, and that my long answer is, well, long.

This cryptic answer needs expanding upon, because it’s not really true. The real answer is that, like in any other language, the best Clojure debugger is your brain.

This is no less cryptic.

In coming up with the long answer, I found it fits best into two parts. This first part covers what I’m defining as debugging, and my overall approach – my philosophy, perhaps. The second part will cover building debuggable systems, and finally get into the debugging tools themselves.

I recently spent some time researching and using Docker, specifically with a view to how it could help me develop a scalable SOA. I gave a couple of presentations on some of the basics I picked up.

Following the order of the talk, I concluded that Docker had some really interesting use cases:

For developers: Docker is great for pulling in application dependencies. For example, starting up an Elasticsearch server is as simple as two commands at the shell.

For devops: Docker is great for getting developers to package up apps with all their dependencies into a declaratively constructed, directly deployable, entirely repeatable artifact.

For architects: Docker is entirely in line with the tenets of Twelve Factor Apps, and can easily be used in various infrastructure scenarios.

I didn’t have time in the talk to cover some of the interesting details I discovered along the way, so I’ll cover them here instead:

  • Structure Dockerfiles hierarchically
  • Use Versions in Tags
  • Check Dockerfiles into GitHub
  • Running in the foreground
  • Inter-container communication and web services APIs

It’s clear that using wifi access points I don’t own is an increasingly risky proposition, and given I’m going to be using others’ wifi more soon, I wanted to put together some kind of VPN to help protect my network comms. Since I want all network comms to be routed through the VPN to a trusted access point, popular methods using Hamachi didn’t seem to go far enough. A VPN would make an interesting reason to get a Raspberry Pi, but I have an older laptop lying idle.

A lightweight Linux distro, an OpenVPN server, and the Tunnelblick OS X OpenVPN client made the VPN tunnel fairly straightforward to build and manage (after working through a couple of issues that would be obvious to a networking pro). With a DynDNS hostname configured, port forwarding on my home wireless router, and a little networking black magic, I now have secure access to my home network and internet connection from anywhere in the world.

I found plenty of well-written resources online for getting OpenVPN installed and running, but nothing quite fit my need to build a VPN to my home network internet connection. I don’t know that this post reflects the best way to do what I wanted to do, but I hope if you are after a similar setup it will save you some time digging.

In my experience, there’s a common set of types of tools that every project needs. When it’s time to start up a new project, here’s an outline of the things to consider when putting together a new tool chain, and some examples of good tools I’ve seen recently.

My old MacBook Pro (A1260/MacBookPro4,1/OS X 10.5.8 Leopard) might be getting a little ragged at the edges, might be needing some new memory, but it can still play with the cool kids after fixing a couple of small issues getting Leiningen (both 1 and 2) to work.  In short, using Java 1.6 fixed the problem.

I didn’t even realize there’s a set of command line tools for working with Amazon’s Web Services.  It only took a few mins to get started, but it wasn’t immediately apparent what needed to happen.

I’m using OSX Lion.  If you’re using another *nix this is all fairly easy to translate. I have no idea how this could be Windows-ified…