The Freedom of AWS Lambda

AWS Lambda is hardly new these days, yet many people are only starting to explore it – after all, not all of us have the free time to dig into new tech or the latitude to experiment on the job, and many organizations simply can’t change on a dime to chase every cool technology they find.  There are growing pains here, to be sure, but AWS Lambda can give organizations the freedom to build products at a pace that can’t be achieved in more traditional ways, and at a fraction of the cost.

What is AWS Lambda?

We’re all familiar with containers – third-party software that sits on a server, and provides a set of services that our software can take advantage of.  Containers are everywhere, and we don’t blink an eye to use them.  No one builds a custom web server – they deploy their web app into a container, like Apache Web Server or Nginx.  No one builds a server-based Java fat client – they build their app to conform to the Servlet spec, and then deploy it into a Servlet Container, like Apache Tomcat or Jetty.  This tech has been around for years, and has made us significantly more productive, because the containers have abstracted out a significant layer of complexity that we no longer need to worry about – we just build our products according to the rules (i.e. – specifications and standards), and the container happily does its job.

AWS Lambda is simply the next iteration on this theme, and takes advantage of the advances in virtualization over the last decade or so.  With each of the examples above, it’s someone’s responsibility to stand up a server or two (or more), install the container, deploy your code into it, and then maintain those servers with upgrades, fixes, etc., for the lifetime of the service.  When traffic gets high, you might need to spin up a few new servers, and then hopefully remember to decommission them when it’s back down to normal.  As it turns out, AWS Lambda is an abstraction layer that handles exactly those tasks for you.

Serverless is NOT Scary

The term ‘serverless’ is a bit of a misnomer, as even Amazon’s CTO, Werner Vogels, admits.  Of course there is a server behind the scenes there somewhere – it’s simply that as a programmer, architect or systems operator, you don’t need to touch it.  This shouldn’t be a totally unfamiliar paradigm, since most of us have been working with virtual servers for the past 10-15 years anyway, and tech like Docker Containers more recently – we’ve simply shifted that responsibility to live behind Amazon’s walls.

This allows us to start thinking about our products not as applications, but as services.  It’s like an abstraction layer that takes care of the fact that code needs to be bundled and deployed, so the actual value proposition of our work can take center stage.  In this way, it can be extremely liberating, and we can suddenly go from idea-to-value with a few clicks.

Imagine that your customers are demanding that they need to be able to see and update their configuration in your API – with Lambda, you can provide this capability in at least pilot form in little more time than it takes you to implement it.
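To give a feel for how little code a pilot like that might need, here’s a rough sketch of a Lambda-style handler in Java.  It assumes the plain-method handler style, where Lambda hands the event to a method as a Map – the class name, the ‘customerId’ and ‘updates’ keys, and the in-memory store are all hypothetical stand-ins (a real function would persist to something like a database):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical handler: every name here is illustrative, not part of any real API.
// Declared without 'public' to keep the sketch compact; a deployed handler class would be public.
final class CustomerConfigHandler {
    // Stand-in for a real data store, kept in memory so the sketch is self-contained.
    private static final Map<String, Map<String, String>> STORE = new ConcurrentHashMap<>();

    // Entry point: applies any updates in the event, then returns the customer's current config.
    public Map<String, String> handleRequest(Map<String, Object> event) {
        Object id = event.get("customerId");
        if (id == null) {
            throw new IllegalArgumentException("customerId is required");
        }
        Map<String, String> config =
                STORE.computeIfAbsent(id.toString(), k -> new ConcurrentHashMap<>());

        // 'updates' is optional; when present, each entry overwrites the stored value.
        Object updates = event.get("updates");
        if (updates instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) updates).entrySet()) {
                config.put(String.valueOf(e.getKey()), String.valueOf(e.getValue()));
            }
        }
        // Return a copy so callers can't modify the stored config directly.
        return new HashMap<>(config);
    }
}
```

There’s no server setup, no servlet mapping and no deployment descriptor in there – just the piece that delivers value.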

Did you say something about cost?

Why yes, I did.  Picture this – you’re a scrappy startup, and you’ve finally got your first paying customer.  The deal is signed, and it’s time to deliver – so you fire up the AWS console, spin up a few servers (redundancy, don’t you know), set up a load balancer and configure DNS, and you’re in business.  The problem is, what happens if it takes weeks for your app to really get traction?  Or what if the nature of the app is that it’s only actively used a few hours a day?  You’re paying for 24 hours’ worth of availability for these services, even if your app only uses a fraction of that.

AWS Lambda has a very different pricing model, based on actual processing time, instead of theoretical availability.  This means that if your app only did 3 hours of total work, then you only pay for the 3 hours of time – what a phenomenal way for a scrappy startup to scale up!

Check out the pricing page for full details, of course – there is a $0.20 charge per million requests, the normal price for moving data out of the AWS cloud, a fee for storing the actual code, and charges for any other services you’re using – just like anything in AWS pricing, it’s a combination of many things, so read carefully!

Wait, that’s the wrong end of ‘Scale’!

Of course, no one really wants to talk about that end of the scalability model – we want to know what happens when our app hits the big time (does anyone get ‘Slashdotted’ anymore?).

There’s good news here, too – if you have more work than a single instance of your Lambda function can process, Amazon will just spin up more instances to handle the load.  So just like other AWS services like OpsWorks and Elastic Beanstalk, the auto-scaling comes with the service.

There’s a cost to this, of course.  Because they’re effectively giving you parallel processes, each instance will incur the usage cost simultaneously – so if 10 instances are spun up and continually working for an hour, you’ll incur 10 hours worth of cost.
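The arithmetic is simple enough to sketch.  The per-request price below is the $0.20 per million mentioned earlier; the hourly compute rate is purely a made-up placeholder parameter (actual compute pricing depends on how you configure the function, so check the pricing page), and the class and method names are mine:

```java
// A back-of-the-envelope estimator, not an official pricing formula.
final class LambdaCostSketch {
    // From the pricing note earlier in this post: $0.20 per million requests.
    static final double PRICE_PER_MILLION_REQUESTS = 0.20;

    // Billed compute time is summed across every instance running in parallel.
    static double billedHours(int instances, double hoursEach) {
        return instances * hoursEach;
    }

    // Request charge, pro-rated per million requests.
    static double requestCharge(long requests) {
        return requests / 1_000_000.0 * PRICE_PER_MILLION_REQUESTS;
    }

    // hourlyComputeRate is a hypothetical placeholder supplied by the caller.
    static double estimate(int instances, double hoursEach, long requests, double hourlyComputeRate) {
        return billedHours(instances, hoursEach) * hourlyComputeRate + requestCharge(requests);
    }
}
```

Calling billedHours(10, 1.0) gives the 10 hours from the example above – ten instances each working for one hour are billed as ten hours of compute, even though only one hour passed on the clock.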

All is not lost, though – Amazon provides the tools to limit how many instances can be running at any given time, so you have some control.  Of course, if you simply set up every workload to bring in more money than it costs, then you’re golden – the auto-scaling is effectively printing money (I’m still working on this myself).

Slow down there, Killer

Let’s take a step back and be realistic – AWS Lambda is not the panacea of computing.  As with any technology, there are trade-offs, drawbacks and side effects that you need to be aware of.  Among them are:

  • SLA – AWS does not currently provide an SLA for performance or reliability.  While this sounds terrible, it’s simply an important trade-off that needs to be made with design.  It might mean that you should have a backup plan for any important code that you deploy as a Lambda function.  It almost certainly means that if your code is mission critical (i.e. – your company goes out of business if it’s broken), or if lives are at stake, then AWS Lambda is not for you.  I’ll wager that the majority of code does not fall into that category, however, and can cleanly take advantage of the benefits provided.
  • Performance – Because Amazon takes care of the deployment for you, and only charges you for actual usage, and not for 24 hours’ worth of server availability, they reserve the right to reclaim resources if you’re not using them.  When this happens you will run into some latency as they spin up a new container – this is known as a ‘cold start’.  In practice, this seems to be a few hundred milliseconds for JavaScript functions, and a few seconds for Java functions, but the good news is that if your app is active, then this will be rare.  Consider this carefully when building your services – if you have a strong requirement for sub-second latency, you should plan accordingly.  For most ‘offline’, event-oriented or batch systems, however, this is likely not a problem.
  • Deployment infrastructure – Like most of AWS’ products, Lambda provides a web interface as well as an API and CloudFormation support, but because it’s still early days, the landscape for integrating AWS Lambda into your deployment pipelines is fairly sparse.  I’ve come to like the Serverless Framework as a nice infrastructure tool that allows me to define my Lambda functions and any dependent components like DynamoDB tables, API Gateway configurations, etc. in a single file that sits in my source control repo.  There aren’t a ton of other options out there, however, and as you scale your Lambda usage, you might need to do a bit of work here to create some templates and standards.

What You Should Be Doing – Today!

If you’re already working in the AWS environment, you should be considering Lambda as an option, but just like anything, you need to make the right decisions for your service, your organization and your customers.  Get creative – even if Lambda is not likely to be your deployment tech of choice, it may be a great place to experiment with prototypes.  It might just be a key piece of your infrastructure for internal tools.

The potential is huge here, as we truly get comfortable with the idea that virtualization is more than just a way to make more efficient use of hardware resources, but is truly a way to rethink how we approach problems in our IT world.

 


Empower Teams By Making The Right Choice The Easy Choice

My philosophy on leading a software team goes something like this:  Teams are empowered, and can be trusted to do the Right Thing when the Right Thing is also the path of least resistance.

From a manager’s perspective, this often means communication, tooling, budget, time, etc., but an architect can take a more technical, hands-on approach.  Let’s dive into some techniques that can help make it easier for your team to do the Right Thing.

Architectural Governance

Years ago, I used to think about this concept as Architectural Enforcement (I’m a hockey guy, what can I say), but a colleague of mine convinced me that Governance was a better term, even if it is more politically correct.

Architectural Governance is the use of tools and products to prevent programmers from ‘breaking the rules’ — or more appropriately, to cause a conversation to occur when breaking a rule might be ok.

Checkstyle and FindBugs are two examples here, and easy to implement on a new project – is it a pain in the ass when you commit code, and it fails to build in your pipeline because your if statement looks like this:  if(aThingIsTrue) instead of this: if (aThingIsTrue)?  Absolutely, but it’s hard to deny that it makes the code more readable, and over time, you’ll realize that you need to remember to actually build your code before pushing it to your shared repo, and no-one is going to argue that that’s a bad idea.
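To make that rule concrete, here’s a toy version of the check.  This is emphatically not how Checkstyle does it (Checkstyle works on the parsed source rather than on regular expressions) – it’s just a sketch, with a made-up class name, of what ‘an if keyword with no space before the parenthesis’ means:

```java
import java.util.regex.Pattern;

// Toy illustration only; a real style checker analyzes parsed code, not raw lines.
final class IfSpacingCheck {
    // Matches the keyword 'if' immediately followed by '(' with no whitespace in between.
    private static final Pattern IF_WITHOUT_SPACE = Pattern.compile("\\bif\\(");

    // Returns true when the line breaks the spacing rule.
    static boolean violates(String line) {
        return IF_WITHOUT_SPACE.matcher(line).find();
    }
}
```

The point isn’t the rule itself – it’s that the rule is checked by a machine on every build, instead of by a reviewer who has better things to do.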

These tools are also pretty pliable, so if you find that a default rule doesn’t sit well with your team, just turn it off – this isn’t about forcing everyone into a set of practices that they don’t like, it’s about a team making a decision that there are certain standards and guidelines that everyone should follow, and then putting the tools in place to actually make sure they’re followed.

Quick show of hands – how many of you have worked on a team or in a company that had a ‘Coding Style Guideline’ of some sort published on a team wiki, but a quick look at the code shows that it’s completely ignored?  Think a bit about who is actually reading that wiki.  In most cases, it’s going to be your new employees — do you really want to send the message to new employees in their first one or two days that the guidelines are just there to look pretty, but don’t worry about it, because no-one pays attention?

Other tools and concepts in this space include much of the Simian Army – what better way to ensure that your teams are properly handling failures than by causing failures to occur?  I’ve used simpler tools in the past like aspect oriented filters to ensure that a Controller only works with Services, and doesn’t directly access a Repository, or vice versa.  There are plenty of techniques to use – using these Governance tools can help ensure that your teams know what the Right Thing is, and can also help identify when the Right Thing might just need a little tweaking.
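As a flavor of how cheap a layering rule can be to automate, here’s a minimal sketch that uses plain reflection rather than the aspect oriented approach I mentioned – a swapped-in technique, chosen only because it needs no libraries.  All of the types are hypothetical stand-ins; in a real project a check like this would run as a unit test against your actual controller classes:

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical layering rule: controllers may hold Services, never Repositories.
final class LayeringCheck {
    interface Repository {}
    interface Service {}

    static class UserRepository implements Repository {}
    static class UserService implements Service { private UserRepository repository; }

    static class GoodController { private UserService service; }
    static class BadController { private UserRepository repository; }

    // Lists every controller field whose declared type is a Repository.
    static List<String> violations(Class<?>... controllers) {
        List<String> found = new ArrayList<>();
        for (Class<?> controller : controllers) {
            for (Field field : controller.getDeclaredFields()) {
                if (Repository.class.isAssignableFrom(field.getType())) {
                    found.add(controller.getSimpleName() + "." + field.getName());
                }
            }
        }
        return found;
    }
}
```

Fail the build when the list isn’t empty, and the conversation about whether breaking the rule is ok happens before the merge, not after.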

Deployment Pipelines

The term ‘build pipeline’ has been popular the last few years, since the publishing of Continuous Delivery (if you haven’t read it, do it.  Seriously, don’t wait – go do it!), but it’s not a new concept.  Build Pipelines are simply Continuous Integration processes taken to their logical conclusion: the realization that automating your Unit Tests or even your Regression tests doesn’t mean a lot if that code is left sitting around, or if the automation stops before the software is actually deployed.  What good are the hours spent testing, if the production software release process is completely different from the testing and preprod processes?  Are you prepared to tell your QA team to go home, because the testing they’re doing won’t be valid when it’s deployed to production?

My full set of thoughts on this topic would make this post far too long, so for now a few principles will have to do:

  • Start with a tool that allows you to put your configuration into Source Control – building your pipeline by hand on a project by project basis is a great way to make it unrepeatable.  The good news is that there are plenty of tools that can do this – GitLab Runner, Travis CI, and yes, Jenkins 2.0 are just a few options.
  • Build your artifacts once, and promote them as they move through the pipeline.  This will help you keep track of what builds are ‘approved’, and will eliminate any chance that the thing you tested is not the thing that you released.
  • If you have one team that develops the software, and another team that releases the software, it is wrong for either of these teams to build the pipelines in a vacuum.  Release Pipelines are a phenomenal tool to help your development teams and IT teams work more collaboratively – the term is ‘DevOps’ for a reason, not ‘NoOps’!
  • Optimize to fail fast.  If you have some tests that take time to run, execute them last, so you don’t have to wait 15 minutes to discover that you missed that Checkstyle issue mentioned above.
  • Categorize your test stages, and build them up over time.  This is about practicality – you should start by identifying the types of tests that you want in the pipeline (unit, integration, acceptance, performance, release tests, visual diff tests, etc), but if you refuse to use your pipeline until they’re all in place, you’ll never get there.  Instead, configure your pipeline to run your build, unit tests and release processes, but leave the final release processes to be manually triggered stages, rather than automated.  At the same time, make the decision about what tests you absolutely must have in place before you’re willing to automate that final release, and then add those steps to your project plan.  This will give you many of the early benefits of a pipeline, and it will give you a controlled release process – you can then make it more efficient as you go.
  • Recognize that 100%, hands off release automation is not necessarily the goal here.  Having a controlled release process that is agreed upon by development, IT and QA is.  If you still believe you need someone to hit the button before release, that’s fine, just recognize that each release will generally include larger change sets.  (While you’re at it, find some internal tools, or less ‘mission critical’ apps, and automate the crap out of them – this will help you gain a bit more confidence with the process.)
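The ‘fail fast’ principle from the list above can be sketched in a few lines.  This is a deliberately naive model – the Stage type, its fields and the idea of ordering purely by expected duration are my own simplifications, and it ignores the fact that real stages have dependencies (you can’t run tests before you compile):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BooleanSupplier;

// Naive fail-fast runner: cheapest checks first, stop at the first failure.
final class FailFastPipeline {
    static final class Stage {
        final String name;
        final int expectedSeconds;   // rough estimate of how long the stage takes
        final BooleanSupplier check; // returns true when the stage passes

        Stage(String name, int expectedSeconds, BooleanSupplier check) {
            this.name = name;
            this.expectedSeconds = expectedSeconds;
            this.check = check;
        }
    }

    // Returns the names of the stages that actually ran, in execution order.
    static List<String> run(List<Stage> stages) {
        List<Stage> ordered = new ArrayList<>(stages);
        ordered.sort(Comparator.comparingInt((Stage s) -> s.expectedSeconds));

        List<String> executed = new ArrayList<>();
        for (Stage stage : ordered) {
            executed.add(stage.name);
            if (!stage.check.getAsBoolean()) {
                break; // no point spending 15 minutes on tests after a 5 second check failed
            }
        }
        return executed;
    }
}
```

With a failing style check in the mix, only the quick stage runs, and the slow acceptance tests are never started.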

Project Archetypes

Maven Archetypes were one of the really valuable concepts that Maven brought to the Java world, but one that I haven’t yet found a really compelling implementation of – as good as Maven’s dependency mechanism and Archetype concept are, I never really liked using Maven as a tool on a day to day basis.  It had just as much of the XML ugliness as Ant, but because it was a declarative tool instead of a procedural tool, it always felt less readable.

The idea behind Archetypes was pretty simple, though – define a basic, empty structure of a ‘type’ of project, and provide the tool to allow a developer to recreate this in a single command.  Pretty straight forward, although somewhat limited for the time – we were building a lot more monoliths than micro-services back then, so the need to create a project from scratch was pretty infrequent.

Micro-services are the latest hotness these days, but rapidly developing cohesive, yet independent systems across an organization is hard.  Not only do you need to consider basic project structure, but you also need to worry about integrating with test frameworks, setting up circuit breakers, building a deployment pipeline, etc.  Thinking in terms of archetypes can give teams a head start, and help them do it the Right Way.

Unfortunately, I don’t know of a tool that does this really well.  Several tools will help move you in the right direction – tools like Gradle, Vagrant, and Serverless all have ‘init’ commands that will get you started, and the Spring Framework has the Spring Initializr, but none that I know of allows you to cleanly define your own (Gradle might be going in this direction with the InitBuild plugin, but it’s still in incubation and doesn’t have the option for a team to define their own project types).

So for the time being we might be stuck with ‘thinking’ in terms of Archetypes, but there’s still value here – creating a blank project template that defines folder structure, and includes configuration templates is easy, and can save a lot of time for teams that are building out their infrastructure.  It also serves as an effective way to communicate what the current best practices are, as they are discovered and added to the templates.
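Even a homegrown template can be tiny.  Here’s a minimal sketch of the idea using nothing but the JDK – the folder layout and the generated settings.gradle are just an assumed example of what a team might standardize on, and the class name is hypothetical:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Stamps out an empty project skeleton; the layout here is only an example.
final class ProjectTemplate {
    private static final List<String> DIRECTORIES = Arrays.asList(
            "src/main/java", "src/main/resources", "src/test/java");

    // Creates <parent>/<projectName> with the standard folders and a settings file.
    static Path create(Path parent, String projectName) throws IOException {
        Path root = parent.resolve(projectName);
        for (String dir : DIRECTORIES) {
            Files.createDirectories(root.resolve(dir));
        }
        // A one-line Gradle settings file, so the new project has a name from day one.
        String settings = "rootProject.name = '" + projectName + "'\n";
        Files.write(root.resolve("settings.gradle"), settings.getBytes(StandardCharsets.UTF_8));
        return root;
    }
}
```

Once something like this exists, adding a newly discovered best practice to every future project is a one-line change to the template.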

Still, there might just be an opportunity here somewhere…

Epilogue?

Of course there are more techniques here – this is only a start, and likely a topic that I’ll expand on in the future.  The key is about thinking in terms of making the Right Thing the Easy Thing – if the process of pushing out a hot fix is exactly the same as pushing out any other release, and if that process can run start to finish quickly, you will find yourself doing crazy things like opening a .jar file and replacing a .class file far less frequently.  Yes, many of us know how to do that, but most of us also know that it’s not ok.

BTW, if you couldn’t figure out the difference between those ‘if’ statements above, look for the space…

Sweet Simplicity of a REST App with Spring Boot

Did you know that you can build a fully featured REST app, right down to the database, with only three Java classes?  Yeah, you can, and it’s pretty sweet – here’s the skinny.

What you’re going to need

The ingredients list is pretty straight forward – you’ll be working with the following packages:

  • Spring Boot (you probably figured that out from the title)
  • The Spring Data REST package
  • All of the dependencies that these pull in – but you probably won’t care too much about these

That’s pretty much it – aside from Java, Gradle, and your favorite IDE.

What we’re going to build

We’re going to start off with something simple – say we need a micro-service to keep track of registered users in our system.  We need to be able to store user data, allow it to be retrieved, updated and deleted.  In other words, a typical CRUD interface.

The interface, of course, will be exposed as a REST service – in addition to the regular old GET, POST, PUT and DELETE methods, though, we’ll want to support some sort of discoverability in our API, so we’ll also be exposing a HATEOAS interface, to allow clients to discover, and dynamically adapt to, our service as we expand it in the future.

We’re already talking about a fair amount of functionality here, but trust me, we won’t be breaking the three class rule.

The Model

Our model starts with a simple User class:

public class User {
    private String firstName;
    private String lastName;
    private String email;

    //Getters, Setters, equals and hashCode methods removed
}

Simple enough place to start – I did remove some boiler plate code from that sample, but I literally used IntelliJ to generate them all, so they really weren’t that interesting.

The Repository

Our model, of course, is pretty much useless on its own – this is where the Spring Data project comes in.  Spring Data is a set of components that help make it easier to manage data – the range of functionality provided is huge, and best discovered on your own here.  For this tutorial, the short version is that we’ll be taking advantage of Spring Data’s ability to make it really easy to work with JPA objects.

JPA?  I’ve been duped!

Yeah, I know, I didn’t mention anything about JPA earlier, but the fact of the matter is that for the very basic functionality we’re talking about here, we just don’t need to stress out about it too much.  Here’s all you really need to know:

  • JPA will be storing our Model object in a single table named after the class (User)
  • The columns will be named after the fields, and the String fields will be typed as varchars
  • You won’t be writing any SQL
  • Hibernate will be doing the actual work for us
  • You can customize pretty much everything above, if you want
  • Oh, and in case anyone cares, JPA stands for Java Persistence API, and you can read about it here.

Carrying on

Now that we’re all more comfortable, the first thing we need to do is update our model a bit – we’re going to give it a db-generated primary key, and we’re going to annotate it as an Entity, so the system knows that it should care.

@Entity
public class User {
    @Id @GeneratedValue
    private long id;
    private String firstName;
    private String lastName;
    private String email;

    //Getters, Setters, equals and hashCode methods removed
}

Now that you’ve added a field, don’t forget to regenerate your equals and hashCode methods, and give the new field its own getter and setter.
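In case you’re curious what that regenerated boilerplate might look like, here’s a rough sketch.  It’s a plain-Java stand-in rather than the real entity – I’ve left off the JPA annotations so it compiles without any dependencies, renamed the class to UserSketch, added a convenience constructor, and only shown the accessors for the new id field:

```java
import java.util.Objects;

// Stand-in for the entity above; annotations omitted so the sketch has no dependencies.
final class UserSketch {
    private long id;
    private String firstName;
    private String lastName;
    private String email;

    UserSketch(long id, String firstName, String lastName, String email) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
    }

    public long getId() { return id; }
    public void setId(long id) { this.id = id; }

    // Two users are equal when every field, including the new id, matches.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        UserSketch other = (UserSketch) o;
        return id == other.id
                && Objects.equals(firstName, other.firstName)
                && Objects.equals(lastName, other.lastName)
                && Objects.equals(email, other.email);
    }

    // Must use the same fields as equals, so equal objects share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(id, firstName, lastName, email);
    }
}
```

The important part is that equals and hashCode stay in sync with the fields – which is exactly why letting the IDE regenerate them beats editing them by hand.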

We’re still on just a single class, of course – here comes our second.  Well, nearly – it’s an interface, actually:

@Repository
public interface UserRepository extends PagingAndSortingRepository<User, Long> {
    public User findByEmail(@Param("email") String email);
}

So what have we done here?  We’ve created the interface for our repository, extending the PagingAndSortingRepository that Spring Data provides – as you can probably guess, Spring Data assumes a lot for us here, since we didn’t need to declare any ‘save’, ‘update’, ‘delete’, or similar methods.  By extending this interface, we actually get a whole bunch of good stuff:

  • Full CRUD functionality, including the ability to load all entities, load a single entity by primary key, and of course save, update and delete entities.
  • The additional ability to page and sort our result sets, because nobody wants to load our entire database all at once.
  • A standard set of query extensions – as you can see here, we have a ‘findByEmail’ method that allows us to define queries in a really simple manner, based on the field names on the Entity.

And Finally – the Main Class

Of course, we need something to run – so here is our main class.  As you can see, we’ve annotated it with @SpringBootApplication, and our main method is calling SpringApplication.run:

@SpringBootApplication
public class Application {
    public static void main(String... args) {
        SpringApplication.run(Application.class, args);
    }
}

All we need now is a build script:

buildscript {
    ext {
        springBootVersion = '1.4.1.RELEASE'
    }
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}")
    }
}

apply plugin: 'java'
apply plugin: 'spring-boot'

jar {
    baseName = 'rest-in-three-classes'
    version =  '0.0.1'
}

repositories {
    mavenCentral()
}

sourceCompatibility = 1.8
targetCompatibility = 1.8

dependencies {
    compile('org.springframework.boot:spring-boot-devtools')
    compile('org.springframework.boot:spring-boot-starter-data-jpa')
    compile('org.springframework.boot:spring-boot-starter-data-rest')
    compile('com.h2database:h2')

    testCompile('org.springframework.boot:spring-boot-starter-test')
}

And away we go!

Shenanigans!  I Call Shenanigans!

Yes, really, that’s it.  Don’t believe me?  Build and run the thing with ‘gradle bootRun’, and then open http://localhost:8080/users in your favorite browser.  This is what you’ll see:

{
    "_embedded": {
        "users": []
    },
    "_links": {
        "self": {
            "href": "http://localhost:8080/users"
        },
        "profile": {
            "href": "http://localhost:8080/profile/users"
        },
        "search": {
            "href": "http://localhost:8080/users/search"
        }
    },
    "page": {
        "size": 20,
        "totalElements": 0,
        "totalPages": 0,
        "number": 0
    }
}

Instant REST service!  Go ahead, play around — send a POST to the same URL with this JSON to create a new user:

{
    "firstName": "Bing",
    "lastName": "Crosby",
    "email": "white@christmas.com"
}

List them, PUT them, DELETE them – it all works!
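If you’d rather poke at it from code than from a REST client, here’s a minimal sketch using only the JDK’s HttpURLConnection.  The class and method names are mine, the JSON is built by hand with no escaping (fine for these values, not for arbitrary input), and it assumes the app is running locally on port 8080 as above:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Bare-bones client sketch; a real client would use a JSON library and handle errors.
final class UserClientSketch {
    // Builds the request body by hand; no escaping is performed.
    static String userJson(String firstName, String lastName, String email) {
        return String.format("{\"firstName\":\"%s\",\"lastName\":\"%s\",\"email\":\"%s\"}",
                firstName, lastName, email);
    }

    // POSTs the JSON to the endpoint and returns the HTTP status code.
    static int createUser(String endpoint, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String json = userJson("Bing", "Crosby", "white@christmas.com");
        // Prints the status code returned by the service.
        System.out.println(createUser("http://localhost:8080/users", json));
    }
}
```

Run it while the service is up, then reload the /users URL in your browser to see the new entry in the list.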

But Where’s Everything Else?

I know, you really want a super complicated Spring configuration – sorry, not here.  You were really excited to implement that repository interface – nope, not today.  You even wanted to download and install your favorite Servlet engine, and configure it just so – sorry to disappoint!

There’s no sorcery here – this is Spring Boot in action, a set of libraries that make it very easy to build small, nimble, easy to extend and configure micro-services.  It’s no longer a hassle to setup a new project – you can literally do it in five minutes. But before we congratulate ourselves, let’s take a closer look at what’s going on here.

Simplified Configuration

I’ve been a fan of Spring for years, ever since I realized I could use it to make my code easier to read and more testable, and to get the transactionality of EJB without miles and miles of boilerplate code (anyone remember EJB 2.0?).  But Spring’s Achilles Heel has always been in the configuration – when it’s working, it’s magic, but when it’s not, it’s maddening.

Spring Boot (and, in fact, Spring 4 in general) addresses this with some simple and rather clever auto configuration.  With Spring Boot, all you need to enable a certain feature is to include a ‘starter library’ on the classpath.  There are a slew of these available, both from Spring, and from third parties – Spring has even provided what looks like a pretty comprehensive list of both.

So in our case, we’ve included ‘spring-boot-starter-data-rest’ on our classpath, which gives us a whole lot:

  1. It adds the Spring Data libraries to our classpath, obviously
  2. It includes the Tomcat Servlet engine on our classpath, and embeds it into the jar file.  Yes, this makes the jar file larger than it otherwise would be, but it tremendously simplifies the deployment of our app
  3. It includes a bootstrap library that ties everything together when executing the jar file
  4. Our Repository interface has a full REST web service defined and implemented automatically, complete with paging support, and a full HATEOAS design, all based on the definition of the Repository and the Entity class.  Even the findByEmail method is exposed as a search resource.
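To make the last item in the list above a bit more concrete, here’s a small sketch that builds the kind of URLs you’d call for the search resource and for a specific page of results.  The helper class is hypothetical, and the ‘page’, ‘size’ and ‘sort’ parameter names are the defaults as I understand them – follow the links the service itself returns if yours differ:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for building request URLs against the local service.
final class UserServiceUrls {
    private static final String BASE = "http://localhost:8080/users";

    // The findByEmail repository method, exposed under the search resource.
    static String findByEmail(String email) {
        return BASE + "/search/findByEmail?email=" + encode(email);
    }

    // A specific page of users, sorted by the given property.
    static String page(int page, int size, String sortProperty) {
        return BASE + "?page=" + page + "&size=" + size + "&sort=" + encode(sortProperty);
    }

    // URL-encodes a query parameter value.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

Paste the result of findByEmail("white@christmas.com") into your browser once you’ve created a user, and you should get that single user back.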

Repository Implementation

The ‘spring-boot-starter-data-jpa’ library is what ties Spring Data and Hibernate together – it takes our Repository interface, and provides a default implementation, meaning that we are free to focus our JPA efforts into the mapping, and we don’t need to touch the API.  While this doesn’t mean we can get by without understanding what’s going on behind the scenes, it allows us to simply eliminate an entire class of code that tends to include a lot of boiler plate, and can be error prone.

In addition, it tremendously simplifies the configuration – simply add a JDBC driver to the classpath and the connection info in an application.properties file, and you’re all set to use that database.  Heck, if you include the H2 in-memory JDBC driver, as we do above, it will start and stop the database for you with no further configuration at all – sweet for testing.

JPA is not my favorite library – it gives us Annotation overload at times, it’s tricky to work with complex object relationships, and it provides us with a query language that’s close enough to SQL to look familiar, but different enough to not work the way I usually think it should – but with a library like Spring Data, it’s hard to argue that this isn’t a great option.

The End

And that’s it – really.  Download the code from my GitHub repo, and please, poke around and find whatever else is interesting.

This was obviously only a taste of what you can do, but it shows that with the current state of tools, you can motivate your team to build small, independent micro-services without a lot of overhead.  This isn’t all there is to it, of course – good testing practices, simple deployment mechanisms, and solid discipline are still required for working with micro-services, but the bar is being lowered every day!

Hmm, this place is sort of familiar…

Wow, it’s been a while – I’ve got a lot of cleaning up to do, but you might just see me start to put some thoughts here again.  It’s been seven or eight years since I’ve done any writing, so I’ll call this all experimental, but I’ve got a few thoughts wrapped up in my head that I might be able to yank out.  What will it look like?  Who knows – one thing I will say is that it likely won’t be about any upcoming Java standard like my older posts – I had actually forgotten that I used to pay that much attention to that crap :).

Stick around, see what happens, and make sure to leave me a note – it’s always good to know what my audience looks like (i.e. – is there one!? 🙂 )

M

Java EE 6 – Who’s In?

Been a while since I’ve written anything, so I’ll ease into the waters with this one – it’s been over a year since Java EE 6 was released with some very cool updates that I’ve discussed here and here and here and here and here and here and here and here and here and here and here (dang, I was busy!). So I’m interested in hearing what kind of adoption it’s gotten so far. Anybody?

Now, I know that there still aren’t a lot of servers that support it — let’s see, there’s Glassfish, and then there’s… hmmm… well, I think Resin 4 has been released… JBoss 6 isn’t quite there yet, nor are any of the more expensive products, at least not to my knowledge (I’ll be perfectly honest – I don’t pay much attention to them!)

One that interests me is SIwpas – it’s a Web Profile implementation based on Tomcat, and apparently several other open source products, although I fear it suffers from AAS (Awful Acronym Syndrome!). But the question is, is anyone using it, or the other products? I’d love to know!

M

BTW – the last time I blogged about JBoss not having a server released after an extended period of time, they released it the very next day – if I were a bettin’ man, I’d put money on JBoss 6 going final tomorrow, but since I’m not, and since no one releases software on a Saturday, I’ll have to go with a firm guess that it’ll be out soon!


Organize Your Logs With a Cool Java EE 6 Trick

Picture this — it’s 9:00 Friday night, and you’ve just gotten a phone call asking why the hell a key part of your system is down… after verifying that something’s definitely busted, you open up the only resource you have — your system logs… it doesn’t take you long to find some exceptions, but they don’t tell you much of the story… pretty soon, you realize there are 5 or 6 different errors being thrown, plus messages from areas of the system that appear to be working fine… to boot, it’s the middle of your busiest time of the year, which means that you may have a few thousand users on the system at this very moment… yikes — how the heck do you make heads or tails of this mess?

Logging — no longer an afterthought

Ok, so four or five days later, when you finally sort out your issue, it’s time to make things better before that happens again… it’s time to actually put some thought behind your logging practices… first stop — learn how to log, and put some standards in place! I’m not going to elaborate on the details of that article, because I think the author does a fine job… frankly, I was hooked when he defined the logs as a ‘secondary interface’ of your system — your support staff (i.e. — you) can’t see what your customers are looking at in their browsers, so you need to make damn sure that you’re providing enough information in your logs for you to understand what’s going on!

Let’s be real, though — the traffic on your system hasn’t gone down any since that fateful Friday (luckily), and you don’t have the time to rework all of the logging in your system… there has to be a way to put some incremental improvements in here that will make your life easier the next time things catch fire, even if that’s tomorrow…

Adding the Context without the Pain — or a single change to your core code

Ultimately, you were able to make some sense out of that catastrophe by realizing your logging framework was providing you with a subtle piece of context — the thread name… seems innocuous, but in most Servlet containers, it’s enough to identify that each line in the log belonged to a particular thread — or request… It’s not perfect though — you didn’t have any messages in your logs that stated “Starting GET request for /shoppingcart/buySomething.html”, so you couldn’t tell exactly where each request started and ended… luckily, with Java EE 6 and a good logging framework, it’s not hard to get there…

Before I dig in, though, let’s get acquainted with the Mapped Diagnostic Context, or MDC… hopefully, your logging system supports it (log4j does, so most folks will be covered)… MDC lets you attach pieces of context to the thread of execution you’re in, and then include that info in every log message written from that thread…
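If the idea of 'context attached to a thread' feels a bit abstract, here's a minimal sketch of the concept in plain Java. To be clear, MiniMdc is a hypothetical stand-in I made up for illustration; it is not the SLF4J or log4j class, and the real implementations do more than this. It just shows the core trick: a per-thread map, plus a formatter that pulls a value out of it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a logging framework's MDC; not the real SLF4J class.
final class MiniMdc {
    // Each thread gets its own private map of context values.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    private MiniMdc() {
    }

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static String get(String key) {
        return CONTEXT.get().get(key);
    }

    static void remove(String key) {
        CONTEXT.get().remove(key);
    }

    // Roughly what a pattern flag does: prefix the message with the
    // current thread's value for the key (empty if nothing is set).
    static String format(String key, String message) {
        String value = get(key);
        return "[" + key + "=" + (value == null ? "" : value) + "] " + message;
    }
}
```

Since the map lives in a ThreadLocal, a value put on one request thread is invisible to every other thread, which is exactly why it has to be removed again before the thread goes back into the pool.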

The following example shows a piece of code that uses the MDC in SLF4J — a logging framework much like the Apache Commons Logging framework that can provide a single interface to multiple logging run-times — excellent for building libraries when you don’t want to impose a logging system on your users… Anyway, on to the show:

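import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;
import org.slf4j.MDC;

public class RequestLoggingContext implements Filter {
    private static final String SESSION_CONTEXT = "session-context";

    ...

    @Override
    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain) throws IOException, ServletException {
        // Declared out here so it is still in scope after the chain has run.
        HttpSession session = null;
        if (req instanceof HttpServletRequest) {
            HttpServletRequest httpRequest = (HttpServletRequest) req;
            // getSession(false) never creates a session; it returns null if there isn't one.
            session = httpRequest.getSession(false);

            if (session != null) {
                MDC.put(SESSION_CONTEXT, session.getId());
            }
        }

        try {
            chain.doFilter(req, resp);
        } finally {
            // Always clean up, even on an exception: container threads are pooled and reused.
            if (session != null) {
                MDC.remove(SESSION_CONTEXT);
            }
        }
    }
}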

Pretty simple… two static methods on the ‘MDC’ class, ‘put’ and ‘remove’… while I’m not a particular fan of the static API, this is about as simple as it gets (incidentally, this is the only ‘unfortunate’ use of static methods that I have seen in SLF4J… they use the standard method of having static factory classes, but that at least makes sense, and has precedent)… so what the heck did this do? Well, we now have the ability to refer to that “session-context” as a part of our logging ‘Pattern’, using the “%X{session-context}” flag, like so:

%d{HH:mm:ss.SSS} [session-context=%X{session-context}][%thread] %-5level %logger{36} - %msg%n

BTW, that is not a log4j config file — it’s a Logback config… Logback is the ‘native’ implementation of the SLF4J library that’s written by the same folks who brought you Log4J — kind of a ‘take two’, if you will… anyway, it should be obvious that it’s driven heavily from Log4J’s configuration 🙂

So we have now added context to our logging system — and all without disturbing a single line of code in our existing system… but wait, there’s more!

The Trick

One of the interesting additions to Java EE 6 is the combination of Servlet Annotations, and web fragments — this allows library authors to self configure the use of their library, where previously the end user would need to make additions to the web.xml… a great use of Convention Over Configuration, and very powerful, indeed!

So let’s take the above code sample and expand it to include a randomly generated context id for each HttpRequest, and some basic log messages to delineate the start and end of every request:

import java.io.IOException;
import javax.inject.Inject;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebFilter;
import javax.servlet.annotation.WebListener;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;
import javax.servlet.http.HttpSessionEvent;
import javax.servlet.http.HttpSessionListener;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

@WebFilter("/*")
@WebListener
public class RequestLoggingContext implements Filter, HttpSessionListener {
    private static final String REQUEST_CONTEXT = "request-context";
    private static final String SESSION_CONTEXT = "session-context";

    private final Logger log = LoggerFactory.getLogger(RequestLoggingContext.class);

    @Inject
    private ContextGenerator contextGenerator;

    @Override
    public void init(FilterConfig fc) throws ServletException {
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain) throws IOException, ServletException {
        // Tag everything logged on this thread with a fresh id for this request.
        MDC.put(REQUEST_CONTEXT, contextGenerator.generateContextId());

        StringBuilder msg = new StringBuilder();
        if (req instanceof HttpServletRequest) {
            HttpServletRequest httpRequest = (HttpServletRequest) req;
            HttpSession session = httpRequest.getSession(false);

            if (session != null) {
                MDC.put(SESSION_CONTEXT, session.getId());
            }

            // Build Detailed Message
            msg.append("Starting ");
            msg.append(httpRequest.getMethod());
            msg.append(" request for URL '");
            msg.append(httpRequest.getRequestURL());
            if (httpRequest.getMethod().equalsIgnoreCase("get") && httpRequest.getQueryString() != null) {
                msg.append('?');
                msg.append(httpRequest.getQueryString());
            }
            msg.append("'.");
        }

        if (msg.length() == 0) {
            // Not an HTTP request: fall back to a generic message.
            msg.append("Starting new request for Server '");
            msg.append(req.getScheme());
            msg.append("://");
            msg.append(req.getServerName());
            msg.append(':');
            msg.append(req.getServerPort());
            msg.append("/'.");
        }

        log.info(msg.toString());
        long startTime = System.currentTimeMillis();

        try {
            chain.doFilter(req, resp);

            msg.setLength(0);
            msg.append("Request processing complete. Time Elapsed -- ");
            msg.append(System.currentTimeMillis() - startTime);
            msg.append(" ms.");
            log.info(msg.toString());
        } finally {
            // Clean up even if the chain threw: container threads are pooled and reused.
            // Removing a key that was never set is harmless, so no session check
            // (and no unsafe cast to HttpServletRequest) is needed here.
            MDC.remove(SESSION_CONTEXT);
            MDC.remove(REQUEST_CONTEXT);
        }
    }

    @Override
    public void destroy() {
    }

    @Override
    public void sessionCreated(HttpSessionEvent hse) {
        // A session created mid-request gets picked up for the rest of that request.
        MDC.put(SESSION_CONTEXT, hse.getSession().getId());
    }

    @Override
    public void sessionDestroyed(HttpSessionEvent hse) {
    }
}

All that’s left is to literally throw that in its own .jar file, put it in your WEB-INF/lib folder, and add either or both of the ‘context’ keys to your logging config, and presto… you have logging context! (I have omitted the definition of the ContextGenerator class for brevity; it just generates a random string) Now your logs will look something like this:

INFO: 00:02:11.140 [request-context=sonqc52zbqia][http-thread-pool-8080-(1)] INFO  c.m.l.support.RequestLoggingContext - Starting GET request for URL 'http://localhost:8080/Test/'.
INFO: 00:02:12.156 [request-context=sonqc52zbqia][http-thread-pool-8080-(1)] INFO c.m.l.i.TimingLogInterceptor - Executing com.test.facade.LoadHomeFacade.loadData
INFO: 00:02:12.156 [request-context=sonqc52zbqia][http-thread-pool-8080-(1)] INFO c.m.l.i.TimingLogInterceptor - Doing something interesting.
INFO: 00:10:36.250 [request-context=sonqc52zbqia][http-thread-pool-8080-(1)] INFO c.m.l.support.RequestLoggingContext - Request processing complete. Time Elapsed -- 719 ms.
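For the curious, here's one way the omitted ContextGenerator could look. This is a hypothetical sketch, not the original class: the 12-character length and the lowercase-plus-digits alphabet are assumptions I picked to resemble the sample id in the output above, and SecureRandom is simply a safe default for the random source.

```java
import java.security.SecureRandom;

// Hypothetical implementation; the original class was omitted from the post.
final class ContextGenerator {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";
    // Assumed length, chosen to resemble the sample id in the log output.
    private static final int LENGTH = 12;

    private final SecureRandom random = new SecureRandom();

    // Returns a random id made of lowercase letters and digits.
    String generateContextId() {
        StringBuilder id = new StringBuilder(LENGTH);
        for (int i = 0; i < LENGTH; i++) {
            id.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        return id.toString();
    }
}
```

The ids only need to be unique enough to tell concurrent requests apart in a log file, so anything along these lines (or a UUID, if you don't mind the length) should do.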

So, without touching a single line of existing code or modifying a single class, we can now clearly associate any logging message in our system with other messages generated on that request, and we have clear delineation of where each request begins and ends, and how long it took to execute… pretty damn sweet! So now when your system blows up next Friday night, you’ll be a bit more prepared to sort things out before the weekend is over! (just don’t throw out those scripts that sort based on ‘request-context’!)
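Speaking of those scripts, here's a rough sketch of one in Java. LogGrouper is a hypothetical helper, and it assumes the '[request-context=...]' layout from the sample output above; if your pattern differs, the regex needs to change with it. It buckets log lines by request id, keeping the original order inside each bucket.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that untangles interleaved log lines by request id.
final class LogGrouper {
    // Assumes the "[request-context=<id>]" layout from the sample output.
    private static final Pattern REQUEST_CONTEXT =
            Pattern.compile("\\[request-context=([^\\]]*)\\]");

    private LogGrouper() {
    }

    // Groups lines by request id; lines with no id land under the empty key.
    static Map<String, List<String>> groupByRequest(List<String> lines) {
        // LinkedHashMap keeps requests in order of first appearance.
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher matcher = REQUEST_CONTEXT.matcher(line);
            String id = matcher.find() ? matcher.group(1) : "";
            grouped.computeIfAbsent(id, key -> new ArrayList<>()).add(line);
        }
        return grouped;
    }
}
```

Feed it the lines of a log file and each map entry is one request's story from start to finish, no matter how many other requests were interleaved with it.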

Final Word

Final word? I guess that means there’s more… three things, actually… first, there is absolutely nothing preventing you from putting the above in place if you’re on an earlier version of the Java EE spec (and let’s face it, that’s pretty much all of us!)… The only thing you lose is the self configuration, so you’ll need to add the appropriate <filter>, <filter-mapping>, and <listener> elements to your web.xml…

Second, if you’re on Java EE 6 (wow, that was fast!), and your application already makes use of Servlet Filters, whether they’re ‘self configured’ or not, you may need to do some configuration in your web.xml to provide an explicit ordering — note that this is not strictly required, although it is probably a good idea :)…

And finally, I mentioned above that Log4J users were in luck when it came to supporting MDC… unfortunately, the JDK Logging API doesn’t support MDC (come on! Why not! Am I the only one who seems to think they haven’t advanced this API in the last five years!?) — those users aren’t entirely out of luck, though… there is a way to ‘subclass’ the JDK Logger and add logging info to the front or end of any logging message, although it’s tricky — unfortunately, I don’t have this code handy anymore, but perhaps I’ll sit down and figure it out again if I’m so inclined one day (of course, if I get feedback to do this, it might make me more inclined 🙂 )
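In the meantime, here's a rough sketch of a different route to a similar result with the JDK Logging API. Note that this is not the Logger-subclassing approach I described; it swaps in a custom java.util.logging.Formatter that reads a ThreadLocal. ContextFormatter and its static helpers are hypothetical names, and the sketch assumes your handlers format records on the thread that logged them (true for the stock ConsoleHandler and FileHandler, not for anything asynchronous).

```java
import java.util.logging.Formatter;
import java.util.logging.LogRecord;

// Hypothetical formatter that prefixes each message with a per-thread request id.
final class ContextFormatter extends Formatter {
    // Poor man's MDC: one value per thread.
    private static final ThreadLocal<String> REQUEST_CONTEXT = new ThreadLocal<>();

    static void setRequestContext(String id) {
        REQUEST_CONTEXT.set(id);
    }

    static void clearRequestContext() {
        REQUEST_CONTEXT.remove();
    }

    @Override
    public String format(LogRecord record) {
        String id = REQUEST_CONTEXT.get();
        // formatMessage() applies any message parameters or resource bundle.
        return "[request-context=" + (id == null ? "" : id) + "] "
                + record.getLevel() + " " + formatMessage(record)
                + System.lineSeparator();
    }
}
```

To use it, you'd call setFormatter(new ContextFormatter()) on your handler, then set the context at the top of the Servlet Filter and clear it in a finally block, just like the MDC calls earlier.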

Now don’t forget to get back and add better logging messages to your code!

M


@DataSourceDefinition — A Hidden Gem from Java EE 6

In the old days, DataSources were configured — well, they were configured in lots of different ways… That’s because there was no ‘one way’ to do it — in JBoss, you created an XML file that ended in ‘-ds.xml’ and dumped it in the deploy folder… in Glassfish, you either use the admin console or muck with the domain.xml file… in WebLogic you used the web console… and this was all well and good — until I worked with an IT guy who told me just how much of a pain in the ass it was…

Up until then, it wasn’t such a big deal to me… I set it up once, and that was that… then I ran into this guy a few jobs ago who liked to bitch and complain about how much harder it was to deploy our application than the .NET or Ruby apps he was used to… he had to deploy our data source, then he had to deploy our JMS configurations, and only then would our application work… in the other platforms, that was all built into the app (I’ll have to take his word for it, since I haven’t actually deployed anything in either platform)… I was a bit surprised at first, and then I realized that maybe he had a point… nah, it couldn’t be, he must just be having a bad day (lots of us were having bad days back then 🙂 )…

Then I ran into Grails, which is dead simple — you have a Groovy configuration file that has your db info in it… you even have the ability to specify different ‘environments’, which can change depending on how you create your archives or run your app… pretty slick…

The Gem

Well, lo and behold, we now have something that’s nearly equivalent in Java EE 6: the @DataSourceDefinition annotation… it’s a new annotation that you can put on a class, and it provides a standard mechanism to configure a JDBC DataSource into JNDI… as expected, it can work with local JNDI scopes or the new global scope, meaning you can have an Environment Configuration that uses this annotation, making it shareable across your server… it works like this:


import javax.annotation.sql.DataSourceDefinition;
import org.jboss.seam.envconfig.Bind;
import org.jboss.seam.envconfig.EnvironmentBinding;

@DataSourceDefinition (
    className="org.apache.derby.jdbc.ClientDataSource",
    name="java:global/jdbc/AppDB",
    serverName="localhost",
    portNumber=1527,
    user="user",
    password="password",
    databaseName="dev-db"
)
public class Config {
    ...
}

As you would expect, that annotation will create a DataSource that will point to a local Derby db, and stick it into JNDI at the global address ‘java:global/jdbc/AppDB’, which your application, or other applications can refer to as needed… no separate deployment and no custom server-based implementation — this code should be portable across any Java EE 6 server (including the Web Profile!)…
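Referring to the DataSource from code that can't use injection is just a plain JNDI lookup. Here's a minimal sketch; DataSourceLookup is a hypothetical helper name, and returning an Optional instead of throwing is my own design choice for the example, not anything the spec asks for.

```java
import java.util.Optional;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical helper for looking up a DataSource by its JNDI name.
final class DataSourceLookup {
    private DataSourceLookup() {
    }

    // Returns the DataSource bound at jndiName, or empty if nothing usable is there.
    static Optional<DataSource> find(String jndiName) {
        try {
            Object found = new InitialContext().lookup(jndiName);
            return (found instanceof DataSource)
                    ? Optional.of((DataSource) found)
                    : Optional.empty();
        } catch (NamingException e) {
            // Nothing bound under that name, or no JNDI provider at all
            // (for example when running outside a container).
            return Optional.empty();
        }
    }
}
```

Inside the server, DataSourceLookup.find("java:global/jdbc/AppDB") should hand back the DataSource defined above; run it from a bare main method and you simply get an empty Optional.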

It’s almost perfect!

In typical Java EE style, there’s one thing that just doesn’t appear to be working the way I’d like it… it doesn’t appear to honor JCDI Alternatives (at least not in Glassfish)… Here’s what I’m thinking: we should be able to have a different Config class for each of our different environments… in other words, we’d have a QAConfig that pointed to a different Derby db, a StagingConfig that pointed to a MySQL db somewhere on another server, and a ProductionConfig that pointed to a kick-ass, clustered MySQL db… we could then use Alternatives to turn on the ones that we want in certain environments with a simple XML change, and not have to muck with code… unfortunately, it doesn’t appear to work… it appears that Glassfish is processing them in a nondeterministic order, with (presumably) the class that is processed last overwriting the others that came before it…

There is a solution, though, and it is on the lookup side of the equation — using JCDI Alternatives, we can selectively lookup the DataSource that we’re interested in, and then enable that Managed Bean in the beans.xml file… it’s definitely not ideal, since we need to actually inject all of our DataSources into JNDI in all scenarios, but it works, it’s something I can live with, and is probably easily fixed in a later Java EE release… Update: Looks like it’s in the plan, according to this link — thanks, Gavin 🙂

Here’s how it works — first the ‘common’ case, probably for a Development environment:


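import javax.annotation.Resource;
import javax.enterprise.context.RequestScoped;
import javax.sql.DataSource;

@RequestScoped
public class DSProvider {
    @Resource (lookup="java:global/jdbc/AppDB")
    private DataSource normal;

    public DataSource getDataSource() {
        return normal;
    }
}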

Simple enough — has a field that looks up ‘jdbc/AppDB’ from JNDI, and provides a getter… now for QA:


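import javax.annotation.Resource;
import javax.enterprise.context.RequestScoped;
import javax.enterprise.inject.Alternative;
import javax.sql.DataSource;

@RequestScoped @Alternative
public class QADSProvider extends DSProvider {
    @Resource (lookup="java:global/jdbc/AppQADB")
    private DataSource normal;

    @Override
    public DataSource getDataSource() {
        return normal;
    }
}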

Pretty much the same, except this does the lookup from ‘jdbc/AppQADB’, and it is annotated with @Alternative… so how do these things work together? Take a look:


import javax.inject.Inject;
import javax.inject.Named;

@Named
public class Test {
    @Inject
    private DSProvider dsProvider;

    ...
}

Again, simple — we’re injecting a DSProvider instance here, and presumably running a few fancy queries… Nothing Dev-ish or QA-ish here at all, which is the beauty of Alternatives… finally, when building the .war file for QA, we turn on our Alternative in the beans.xml, like so:




<beans>
    <alternatives>
        <class>com.mcorey.alternativedatasource.QADSProvider</class>
    </alternatives>
</beans>


You’ll notice that this solution requires us to rebuild our .war file for QA, which I obviously don’t like… not to worry, there will be support for this in the Seam 3 Environment Configuration Module, which will effectively create a binding by mapping from one JNDI key to another… I have no idea what the syntax will look like at this point, but it should be pretty straightforward, and will allow us to (you guessed it) build our .war once, and copy it from place to place without modification…

M
