2012 – Mike's House

My foray into web development

I like browsing the web. I do it a lot throughout my day.

A lot of people work hard at making the web a cool-looking place. Some sites make simplicity look so easy that you’d never guess what’s under the hood: all chaos and destruction, folded and crunched together, all to present something really nice and smooth for the end-user.

I’m not a developer – much less a web dev. There’s a lot to know in any field of computing – and web is pretty much the most visible part of computing as a whole, since nearly anyone anywhere is going to use a web browser to view a site at some point.

I mean, sure, we all learned some HTML – hell, I wrote some sites back in the days of Geocities, and it was awesome to learn about tiling backgrounds of animated GIFs, and when CSS came around, minds blown!

And I left that field for the frontend developers, and went into infrastructure and operations.

And as time passes, you find yourself managing a variety of systems and knowledge, and at some point, you may say to yourself, “I wish I knew how to answer this question…”

And then you write some code to answer it. Voila! You’re a developer, of sorts.

I’m a huge fan of data visualization. Telling stories with pictures dates back millennia, and it’s very relatable to most people. Recently, I wrote a tool to help myself display the dependency complexity of Chef roles, and I found that, while being very useful, the output is very limited, as it’s a static generated image, whereas we live in a web-friendly world where everything is interactive and fun!

So when I came across another hard question I wanted to answer, I thought, “Why not make this a web application?”

This time, the question I wanted to answer was: As a GitHub Organization owner for my company, what human-to-team-to-software-repository relationships do we have, and are they secure?

If you’ve ever managed an Organization in GitHub, there are a few key elements.

An Organization can have many Repositories
An Organization can have many Teams
A Repository can have many Teams
A Team can have many members, but only one permission (read only, read/write, owner)

So sorting out who is on what Team, what access they have, across many repositories, can be a security nightmare. Especially when you have more than 4-5 repositories.
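To make the relationships above concrete, here is a minimal sketch in plain Ruby. The team names, members and repositories are made up for illustration, and nothing here talks to the GitHub API; it only shows how the "who can do what on this repository?" question falls out of the team data once it is in one place.

```ruby
# Hypothetical data: each team has exactly one permission,
# plus a list of members and the repositories it is attached to.
TEAMS = {
  'owners'      => { permission: :owner, members: %w[alice],     repos: %w[website api] },
  'developers'  => { permission: :write, members: %w[bob carol], repos: %w[api] },
  'contractors' => { permission: :read,  members: %w[dave bob],  repos: %w[website api] }
}.freeze

# Ordered weakest to strongest, so the highest level per person can be kept.
LEVELS = %i[read write owner].freeze

# Returns { member => strongest permission } for a single repository.
def access_for(repo, teams = TEAMS)
  teams.each_value.with_object({}) do |team, access|
    next unless team[:repos].include?(repo)

    team[:members].each do |member|
      current = access[member]
      if current.nil? || LEVELS.index(team[:permission]) > LEVELS.index(current)
        access[member] = team[:permission]
      end
    end
  end
end
```

With this sample data, `access_for('api')` reports bob as `:write` even though he is also on the read-only contractors team, which is exactly the kind of overlap that is hard to spot by eye.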

During my first foray into solving this, I cobbled together a command-line tool, using Ruby with the Graphviz library. I’ve liked Graphviz for years – it’s straightforward, as structured text gets rendered into a graph and then can be output to a file.

Very straightforward, has some limitations, but basically allows you to store graphs as text, and re-render them when changes happen. Basically, it’s like storing source code and not the binary output.

But there were some limitations, and I wanted the answer to this new question to be more than a command-line tool: something I could share with the world at large, without requiring any client-side installation of any tools or dependencies.

So I spent a lot of time hemming and hawing, looking at web frameworks and trying to figure out some of them, and “how does this work?” came up a lot.

Finally, yesterday I set out to sit down and accomplish this task. I sat in a Starbucks in New York City, and had a Venti. I started banging away at about 11:30. I took a break for a refill and a snack around 1:30, and when I sat down again, I kept hacking away until 9:30pm, when I deemed it complete.

The code was written, tested by me locally, pushed to GitHub, deployed to Heroku, DNS name wired up and all. As soon as I finished, I left Starbucks, and heaved a huge sigh – it was one hell of a mental high, I was in “the zone” and had been there for a long time.

You are more than welcome to browse the source code here and the finished project here. I call it the GitHub Organization Viewer, hence “GOVweb”.

I have a bunch of other ideas on how to make this better, how to model the data, which visual style to use, but I think for now, I’m going to leave it for a bit, and see what I think about it in a couple of months.

But all in all, this reinforced my opinion to never be afraid to try tackling a new idea, a new project, a new field you’re unfamiliar with – as long as you can read, comprehend and learn, the world is your oyster.

A picture is worth a (few) thousand bytes

(Context alert: Know Chef. If you don’t, it’s seriously worth looking into for any level of infrastructure management.)

TL;DR: I wrote a Knife plugin to visualize Chef Role dependencies. It’s here.

Recently, I needed to sort out a large number of roles and their dependencies, in order to simplify the lives of everyone using them.

It wasn’t easy to determine that changing one would affect many others, since it had become common practice to embed roles within other roles’ run_list, resulting in a tree of cross-dependency hell.
A node’s run_list would typically contain a single role-specific item, embedding the lower-level dependencies.

A sample may look like this:

node[web1] => run_list = role[webserver] => run_list = role[base], recipe[apache2], ...
node[db1] =>  run_list = role[database]  => run_list = role[base], recipe[mongodb], ...

Many of these roles had a fair amount of code duplication, and most were setting the same base role, as well as any role-specific recipes. Others were referencing the same recipes, so figuring out what to refactor and where, without breaking everything else, was more than challenging.

The approach I wanted to implement was to have a very generalized base role, apply that to every instance, then add any specific roles that should be applied to a given node as well.

After refactoring, a node’s run list would typically look like:

node[web1] => run_list = role[base], role[webserver]
node[db1] =>  run_list = role[base], role[database]

A bit simpler, right?

This removes the embedded dependency on role[base], since the assumption is that every node will have role[base] applied to it, unless I don’t want it to for some reason (some development environment for instance).
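One way to convince yourself that the flattened layout applies the same recipes as the nested one is to expand both run lists by hand. The sketch below is a simplified stand-in, not Chef's own expansion logic, and `recipe[ntp]` is a made-up placeholder for whatever role[base] really contains.

```ruby
# Recursively expand a run_list into the ordered list of recipes it applies.
# `roles` maps a role name to that role's own run_list.
# `seen` tracks roles on the current path so a cycle cannot recurse forever.
def expand_run_list(run_list, roles, seen = [])
  run_list.flat_map do |item|
    kind, name = item.match(/\A(role|recipe)\[(.+)\]\z/).captures
    if kind == 'recipe'
      [item]
    elsif seen.include?(name)
      [] # already expanding this role further up; skip it
    else
      expand_run_list(roles.fetch(name), roles, seen + [name])
    end
  end.uniq
end

# Before: role[webserver] embeds role[base].
roles_before = {
  'base'      => ['recipe[ntp]'], # hypothetical base contents
  'webserver' => ['role[base]', 'recipe[apache2]']
}
# After: the node lists role[base] itself, and role[webserver] no longer does.
roles_after = {
  'base'      => ['recipe[ntp]'],
  'webserver' => ['recipe[apache2]']
}

before = expand_run_list(['role[webserver]'], roles_before)
after  = expand_run_list(['role[base]', 'role[webserver]'], roles_after)
```

Both `before` and `after` come out as `["recipe[ntp]", "recipe[apache2]"]`, so in this toy case the refactor changes where the dependency is declared without changing what gets applied.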

Trying to refactor this was pretty tricky, so I wrote a visualizer to collect all the roles from a Chef repository’s role_path, parse them out, and create an image.

I’ve used Graphviz for a number of years now, and it’s pretty general-purpose when it comes to creating graphs of things (nodes), connecting them (edges), and rendering an output. So this was my go-to for this project.
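As a rough illustration of the nodes-and-edges idea, the sketch below turns a hash of roles into Graphviz DOT text. It builds the text by hand rather than using a Graphviz library, and it is not the plugin's actual code; the `roles_to_dot` name and the sample data are assumptions made for the example.

```ruby
# Build Graphviz DOT text from a hash of { role name => run_list }.
# Every run_list entry becomes an edge from the role to that entry.
def roles_to_dot(roles)
  lines = ['digraph roles {']
  roles.each do |role, run_list|
    run_list.each do |item|
      lines << %(  "role[#{role}]" -> "#{item}";)
    end
  end
  lines << '}'
  lines.join("\n")
end

sample_roles = {
  'webserver' => ['role[base]', 'recipe[apache2]'],
  'database'  => ['role[base]', 'recipe[mongodb]']
}
dot_text = roles_to_dot(sample_roles)
```

Saving `dot_text` to a file such as roles.dot and running `dot -Tpng roles.dot -o roles.png` renders it, assuming Graphviz is installed.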

Selling you on the power of visualizing data is beyond the scope of this post (and probably the author), but suffice it to say there are industries built around putting data into visual format for a variety of reasons, such as relative comparison, trending, etc.
In fact some buddies of mine have built an awesome product that does just that – visualizes data and events over time. Check them out at Datadog. (I’ve written other stuff for their platform before, it’s totally awesome.)

In my case, I wanted the story told by the image to:

1. Demonstrate the complexity of the connections between roles/recipes (aka spaghetti)
2. Point out if I have any cyclic dependencies (it’s possible!)
3. Let me focus on what to do next: untangle

Items 1 & 2 were pretty cool – my plugin spat out an increasingly complex graph, showing relationships that made sense for things to work, but also contained some items with 5-6 levels of inheritance that are easily muddled. I didn’t have any cyclic dependencies, so I created a sample one to see what it would look like. It looked like a circle.
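Spotting a cycle does not have to rely on eyeballing the picture. Here is a small sketch using Ruby's standard-library TSort module to list groups of roles that include each other; the `role_cycles` name and the sample roles are hypothetical, and the plugin itself may not do anything like this.

```ruby
require 'tsort'

# `roles` maps a role name to the names of the roles it includes.
# Returns every group of roles that forms a cycle, including a role
# that includes itself.
def role_cycles(roles)
  each_node  = ->(&block) { roles.each_key(&block) }
  each_child = ->(name, &block) { roles.fetch(name, []).each(&block) }

  TSort.strongly_connected_components(each_node, each_child).select do |group|
    group.size > 1 || roles.fetch(group.first, []).include?(group.first)
  end
end

sample = {
  'base'      => [],
  'webserver' => %w[base monitoring],
  'monitoring' => %w[webserver] # deliberately circular
}
cycles = role_cycles(sample)
```

Here `cycles` holds a single group containing `webserver` and `monitoring`, while `base` is left out because nothing it includes leads back to it.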

Item 3 was harder, as this meant that human intervention needed to take place. It was almost like deciding on which area of a StarCraft map you want to go after first. There’s plenty of mining to do, but which will pay off fastest? (geeky references, are you surprised?)

I decided on some of the smaller clusterings, and made some progress, changing where certain role statements lived and the node <=> role assignment to refactor a lot out.

My process of writing a plugin developed pretty much like this:

Have an idea of how I want to do this
Write some code that when executed manually, does what I want
Transform that code into a knife plugin, so it lives inside the Chef Ecosystem
Package said plugin as RubyGem, to make distribution easy for others
Test, test, test (more on this in a moment)
Document (readme only for now)
Add some features, rethink how certain things are done, refactor.
Test some more

Writing code, packaging and documentation are pretty standard practices (more or less), so I won’t go into those.

The more interesting part was figuring out how to plug into the Chef/Knife plugins architecture, and testing.

Thanks to Opscode, writing a plugin isn’t too hard: there’s a good wiki, and there are other plugins you can look at to get some ideas.

A couple of noteworthy items:

Figuring out how to provide command-line arguments to OptionParser was not easy, since there was no real intuitive way to do it. I spent about 2 hours researching why that wasn’t doing what I wanted, and finally figured out that "--flag" and "--flag " behave completely differently.
During my initial cut of the code, I used many statements to print output back to the user (puts "some message"). In the knife plugin world, one should use ui.info, ui.error and the like, as this makes the output much cleaner and consistent with other knife commands.
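The exact flags involved are not shown above, so the following is only one plausible reading: in Ruby's standard OptionParser, a switch declared without a placeholder is a plain boolean, while the same switch followed by a placeholder word expects a value. The flag names here are invented, and a knife plugin declares its options through knife's own option DSL rather than calling OptionParser directly, so treat this as an illustration of the underlying behavior only.

```ruby
require 'optparse'

# Parse an argument list and return the options that were set.
def parse_flags(argv)
  options = {}
  parser = OptionParser.new do |opts|
    # No placeholder: a boolean switch, the block receives true.
    opts.on('--verbose', 'Print extra output') { |v| options[:verbose] = v }
    # With a placeholder: the switch now requires a value.
    opts.on('--output FILE', 'Write the graph to FILE') { |v| options[:output] = v }
  end
  parser.parse(argv) # non-destructive; leaves argv untouched
  options
end

flags_only = parse_flags(%w[--verbose])
with_value = parse_flags(%w[--verbose --output graph.png])
```

`flags_only` is `{ verbose: true }` and `with_value` also carries `output: "graph.png"`. Passing `--output` with nothing after it raises `OptionParser::MissingArgument`, which is an easy thing to trip over if the placeholder was added or dropped by accident.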

Testing:

Since this is a command-line application plugin, it made sense to use a framework that can handle inputs and outputs, as that’s my primary concern.
With a background in systems administration and engineering, software testing has never been on the top of my to-learn list, so when the opportunity arose to write tests for another project I wrote, I turned to Cucumber, and the CLI extension Aruba.

Say what you will about unit tests vs integration tests vs functional tests – I got going relatively quickly writing tests in quasi-English.
I won’t say that it’s easy, but it definitely made me think about how the plugin will be used, how users may input commands differently, and what they can expect to happen when they run it.

Cucumber/Aruba also allowed me to split my tests in a way that I can grok: all the CLI-related commands, flags and options live in one test ‘feature’ file, whereas another feature file contains all the tests of reading the roles and graphing them in different formats.

Writing tests early on allowed me to continue to capture how I thought the plugin would be used, write that down in English, and think about it for a while.
Some things changed after I had written them down, and even then, after I figured out the tests, I decided that the behavior didn’t match what I thought would be most common.

Refactoring the code while running tests in between, to ensure that the behavior I wanted remained consistent, was very valuable. This isn’t news for any software engineers out there, but it might be useful for more systems people to learn about testing.

Another test I use is a style-checker called tailor – it measures up my code, and reports on things that may be malformed. This is the first test I run, because if the code is invalid (e.g. missing an end somewhere), it won’t pass this test.

Putting these into a test framework like Travis-CI is so very easy, especially since it’s a RubyGem, and I have set up environment variables to test against specific versions of Chef.
This provides the fast-feedback loop that tests my code against a matrix of Ruby & Chef versions.

So there you have it. A long explanation of why I wrote something. I had looked around, and there’s a knife crawl that is meant to walk a given role’s dependency tree and provide that, but that only worked for a single role, and wasn’t focused on visualizing.

So I wrote my own. Hope you like it, and happy to take pull requests that make sense, and bug reports for things that don’t.

You can find the gem on RubyGems.org – via gem install knife-role-spaghetti or on my GitHub account.

I’m very curious to know what other people’s role spaghetti looks like, so drop me a line, tweet, comment or such with your pictures!

Quick edit: A couple of examples, showing what this does.

Sample Roles

(full resolution here)

Running through the neato renderer (with the -N switch) produces this image:

Sample Roles Neato

(full resolution here)

Recruiting via LinkedIn – Don’t Do This!

I regularly get emails from recruiters all over the planet, telling me about their awesome new technology, latest and greatest ideas, and why I should work for them.

Most get ignored.

One came in this week that annoyed me, since it was from someone at a company that had sent me the exact same email six months ago.

I felt I had to respond:

Hi <recruiter name>,

I think I heard of <YourCompany> last year sometime from a friend.

I also received this same stock email from you on 8/22/11, and you had addressed it to “Pascal” – further evidence of a copy-and-paste.

It would behoove you to keep records of whom you contact, as well as reviewing the message you paste before clicking “Send”.

A stock recruiter email is not a very likely way to attract good recruits, especially if you’re listing a ton of things that are not particularly relevant or interesting in the realm of technology.

Asking me to send a resume, while being able to view my full LinkedIn profile also seems superfluous – here’s the information, you have supposedly read it, and that is what attracted you to my profile in the first place, rather than “someone who turned up in a keyword search”.

I wish you, and your company all the best, and hope that these recruiting tactics work for you.

All the best,
-M

I am very curious what kind of response, if any, I shall get.

Chatting with a robot

Here I am, sitting calmly, trying to figure out the reasoning for the universe, and I get a GChat notification that someone wants to chat with me.

Here’s the transcript:

10:29:12 AM caitlyn ball: hi
10:29:17 AM miketheman: hi
10:29:24 AM caitlyn ball: hey whats up? 22/F here. you?
[email protected] is now known as caitlyn ball. (10:29:27 AM)
10:29:41 AM miketheman: totally bored.
10:29:49 AM caitlyn ball: hmm. have we chatted before?
10:30:15 AM miketheman: probably not, since you just added me to your list
10:30:24 AM caitlyn ball: oh ok. i wasnt sure. anyways.. whats up?
10:30:35 AM miketheman: not much, working. you?
10:30:45 AM caitlyn ball: im like so boreddd…. there is nothing to do
10:31:00 AM caitlyn ball: ohhh wait! i got a great idea. have you ever watched a sexy girl like me strip live on a cam before?
10:31:18 AM miketheman: no, I don’t believe that I have.
10:31:25 AM miketheman: And that seems to be a great idea.
10:31:29 AM caitlyn ball: wellllll….. you could watch me strip if you would like?
10:31:43 AM miketheman: possibly.
10:31:56 AM miketheman: Or we could discuss the nature of the desire for people to watch other people remove their clothing
10:32:00 AM caitlyn ball: yeah? ok well my cam is setup through this website so that i cant be recorded so you have to signup there.
10:32:09 AM caitlyn ball: it only takes a minute and it is free. ok?
10:32:13 AM miketheman: That doesn’t seem likely.
10:32:28 AM caitlyn ball: http://<removed> go there then at the top of the page click on the goldish JOIN FREE button.
10:32:33 AM caitlyn ball: k?
10:33:00 AM miketheman: Are you sure you don’t want to debate the reasoning behind the attraction with exposed bodies?
10:33:16 AM caitlyn ball: also it does ask for a credit card but thats how they keep kids out. it does not charge the card. k?
10:33:34 AM miketheman: Of course it does. Are there wizards with hats on the site as well?
10:33:50 AM caitlyn ball: ok babe well hurry up and when u get logged in then u can view my cam and we can have some fun!
10:34:03 AM caitlyn ball: i also have some toys but u have to tip me some gold or join me in private to see those.
10:34:08 AM miketheman: Again, probably not going to happen.
10:34:19 AM caitlyn ball: hey lets talk on there babe. my messenger is messing up.
10:34:36 AM miketheman: I believe you completely.
10:34:52 AM miketheman: You must have reached the end of your loop.
10:35:03 AM miketheman: Bye!

So it was a fun little distraction, and the URL provided resolves to <obviously> girlcamz [net] – haven’t visited, since there’s no point, really.

I was hoping the bot would be a little better than a simple responder to the next input. But alas. Developers of sex marketing spam bots are probably less inclined to put some real engineering efforts into their crap.