The Freedom of AWS Lambda

AWS Lambda is hardly new these days, yet many people are only starting to explore it.  After all, not all of us have the free time to explore new tech or the room to experiment on the job, and many organizations simply can’t change on a dime to chase every cool technology they find.  There are growing pains here, to be sure, but AWS Lambda gives organizations the freedom to build products at a pace that can’t be achieved in more traditional ways, and at a fraction of the cost.

What is AWS Lambda?

We’re all familiar with containers – 3rd party software that sits on a server, and provides a set of services that our software can take advantage of.  Containers are everywhere, and we don’t blink an eye to use them.  No one builds a custom web server – they deploy their web app into a container, like Apache HTTP Server or Nginx.  No one builds a server-based Java fat client – they build their app to conform to the Servlet spec, and then deploy it into a Servlet Container, like Apache Tomcat or Jetty.  This tech has been around for years, and has made us significantly more productive, because the containers have abstracted out a significant layer of complexity that we no longer need to worry about – we just build our products according to the rules (i.e. specifications and standards), and the container happily does its job.

AWS Lambda is simply the next iteration on this theme, and takes advantage of the advances of virtualization over the last decade or so.  With each of the examples above, it’s someone’s responsibility to stand up a server or two (or more), install the container, deploy your code into it, and then maintain those servers with upgrades, fixes, etc, for the lifetime of the service.  When traffic gets high, you might need to spin up a few new servers, and then hopefully remember to decommission them when it’s back down to normal.  As it turns out, AWS Lambda is an abstraction layer that handles exactly those services for you.

Serverless is NOT Scary

The term ‘serverless’ is a bit of a misnomer, as even Amazon’s CTO, Werner Vogels, admits.  Of course there is a server behind the scenes there somewhere – it’s simply that as a programmer, architect or systems operator, you don’t need to touch it.  This shouldn’t be a totally unfamiliar paradigm, since most of us have been working with virtual servers for the past 10-15 years anyway, and tech like Docker Containers more recently – we’ve simply shifted that responsibility to live behind Amazon’s walls.

This allows us to start thinking about our products not as applications, but as services.  It’s like an abstraction layer that takes care of the fact that code needs to be bundled and deployed, so the actual value proposition of our work can take center stage.  In this way, it can be extremely liberating, and we can suddenly go from idea-to-value with a few clicks.

Imagine that your customers are demanding that they need to be able to see and update their configuration in your API – with Lambda, you can provide this capability in at least pilot form in little more time than it takes you to implement it.

Did you say something about cost?

Why yes, I did.  Picture this – you’re a scrappy startup, and you’ve finally got your first paying customer.  The deal is signed, and it’s time to deliver – so you fire up the AWS console, spin up a few servers (redundancy, don’t you know), set up a load balancer and configure DNS, and you’re in business.  The problem is, what happens if it takes weeks for your app to really get traction?  Or what if the nature of the app is that it’s only actively used a few hours a day?  You’re paying for 24 hours’ worth of availability for these services, even if your app only uses a fraction of that.

AWS Lambda has a very different pricing model, based on actual processing time, instead of theoretical availability.  This means that if your app only did 3 hours of total work, then you only pay for the 3 hours of time – what a phenomenal way for a scrappy startup to scale up!

Check out the pricing page for full details, of course – there is a $0.20 charge per million requests, the normal price involved for moving data out of the AWS cloud, a fee for storing the actual code, and any other services you’re using – just like anything in AWS pricing, it’s a combination of many things, so read carefully!

Wait, that’s the wrong end of ‘Scale’!

Of course, no one really wants to talk about that end of the scalability model – we want to know what happens when our app hits the big time (does anyone get ‘Slashdotted’ anymore?)

There’s good news here, too – if you have more work than your Lambda function can process, they’ll just spin up more to handle the load.  So just like other AWS services like OpsWorks and Elastic Beanstalk, the auto-scaling comes with the service.

There’s a cost to this, of course.  Because they’re effectively giving you parallel processes, each instance will incur the usage cost simultaneously – so if 10 instances are spun up and continually working for an hour, you’ll incur 10 hours’ worth of cost.
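To make that arithmetic concrete, here is a minimal sketch in Java.  The $0.20 per million requests figure comes from the pricing note above; the hourly compute rate is a placeholder I made up for illustration, not a real AWS price, and Lambda actually meters compute in much finer increments scaled by memory size, so treat this as a simplification.

```java
/** Back-of-the-envelope Lambda cost sketch; the compute rate is a made-up placeholder. */
class LambdaCostSketch {
    // From the pricing note above: $0.20 per million requests.
    static final double PRICE_PER_MILLION_REQUESTS = 0.20;

    /** Billed compute hours: every concurrently running instance accrues time on its own. */
    static double billedHours(int instances, double hoursEach) {
        return instances * hoursEach;
    }

    /** Total estimate = billed hours x hourly rate + request charge. */
    static double estimate(int instances, double hoursEach, double ratePerHour, long requests) {
        double compute = billedHours(instances, hoursEach) * ratePerHour;
        double requestCharge = (requests / 1_000_000.0) * PRICE_PER_MILLION_REQUESTS;
        return compute + requestCharge;
    }

    public static void main(String[] args) {
        double hypotheticalRatePerHour = 0.06; // assumption for illustration only
        System.out.println("Billed hours: " + billedHours(10, 1.0));
        System.out.printf("Estimated cost: $%.2f%n",
                estimate(10, 1.0, hypotheticalRatePerHour, 5_000_000L));
    }
}
```

Ten instances working for one hour each are billed as ten hours, which is the whole point: parallelism buys you throughput, not a discount.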

All is not lost, though – Amazon provides the tools to limit how many instances can be running at any given time, so you have some control.  Of course, if you simply set up every workload to bring in more money than it costs, then you’re golden – the auto-scaling is effectively printing money (I’m still working on this myself).

Slow down there, Killer

Let’s take a step back and be realistic – AWS Lambda is not the panacea of computing.  As with any technology, there are trade-offs, drawbacks and side effects that you need to be aware of.  Among them:

  • SLA – At the time of writing, AWS does not provide an SLA for performance or reliability.  While this sounds terrible, it’s simply an important trade-off that needs to be made with design.  It might mean that you should have a backup plan for any important code that you deploy as a Lambda function.  It almost certainly means that if your code is mission critical (i.e. your company goes out of business if it’s broken), or if lives are at stake, then AWS Lambda is not for you.  I’ll wager that the majority of code does not fall into that category, however, and can cleanly take advantage of the benefits provided.
  • Performance – Because Amazon takes care of the deployment for you, and only charges you for actual usage, and not for 24 hours’ worth of server availability, they reserve the right to reclaim resources if you’re not using them.  When this happens you will run into some latency as they spin up a new container – this is known as a ‘cold start’.  In practice, this seems to be a few hundred milliseconds for JavaScript functions, and a few seconds for Java functions, but the good news is that if your app is active, then this will be rare.  Consider this carefully when building your services – if you have a strong requirement for sub-second latency, you should plan accordingly.  For most ‘offline’, event oriented or batch systems, however, this is likely not a problem.
  • Deployment infrastructure – Like most AWS products, Lambda comes with a web interface as well as an API and CloudFormation support, but because it’s still early days, the landscape for integrating AWS Lambda into your deployment pipelines is fairly sparse.  I’ve come to like the Serverless Framework as a nice infrastructure tool that allows me to define my Lambda functions and any dependent components like DynamoDB tables, API Gateway configurations, etc. in a single file that sits in my source control repo.  There aren’t a ton of other options out there, however, and as you scale your Lambda usage, you might need to do a bit of work here to create some templates and standards.

What You Should Be Doing – Today!

If you’re already working in the AWS environment, you should be considering Lambda as an option, but just like anything, you need to make the right decisions for your service, your organization and your customers.  Get creative – even if Lambda is not likely to be your deployment tech of choice, it may be a great place to experiment with prototypes.  It might just be a key piece of your infrastructure for internal tools.

The potential is huge here, as we truly get comfortable with the idea that virtualization is more than just a way to make more efficient use of hardware resources, but is truly a way to rethink how we approach problems in our IT world.

 

Empower Teams By Making The Right Choice The Easy Choice

My philosophy on leading a software team goes something like this:  Teams are empowered, and can be trusted to do the Right Thing when the Right Thing is also the path of least resistance.

From a manager’s perspective, this often means communication, tooling, budget, time, etc, but an architect can take a more technical, hands on approach.  Let’s dive into some techniques that can help make it easier for your team to do the Right Thing.

Architectural Governance

Years ago, I used to think about this concept as Architectural Enforcement (I’m a hockey guy, what can I say), but a colleague of mine convinced me that Governance was a better term, even if it is more politically correct.

Architectural Governance is the use of tools and products to prevent programmers from ‘breaking the rules’ — or more appropriately, to cause a conversation to occur when breaking a rule might be ok.

Checkstyle and FindBugs are two examples here, and easy to implement on a new project – is it a pain in the ass when you commit code, and it fails to build in your pipeline because your if statement looks like this:  if(aThingIsTrue) instead of this: if (aThingIsTrue)?  Absolutely, but it’s hard to deny that it makes the code more readable, and over time, you’ll realize that you need to remember to actually build your code before pushing it to your shared repo, and no-one is going to argue that that’s a bad idea.

These tools are also pretty pliable, so if you find that a default rule doesn’t sit well with your team, just turn it off – this isn’t about forcing everyone into a set of practices that they don’t like, it’s about a team making a decision that there are certain standards and guidelines that everyone should follow, and then putting the tools in place to actually make sure they’re followed.

Quick show of hands – how many of you have worked on a team or in a company that had a ‘Coding Style Guideline’ of some sort published on a team wiki, but a quick look at the code shows that it’s completely ignored?  Think a bit about who is actually reading that wiki.  In most cases, it’s going to be your new employees — do you really want to send the message to new employees in their first one or two days that the guidelines are just there to look pretty, but don’t worry about it, because no-one pays attention?

Other tools and concepts in this space include much of the Simian Army – what better way to ensure that your teams are properly handling failures than by causing failures to occur?  I’ve used simpler tools in the past like aspect oriented filters to ensure that a Controller only works with Services, and doesn’t directly access a Repository, or vice versa.  There are plenty of techniques to use – using these Governance tools can help ensure that your teams know what the Right Thing is, and can also help identify when the Right Thing might just need a little tweaking.
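A layering rule like that can be checked without an aspect framework at all.  The sketch below swaps in plain Java reflection for the aspect oriented filter I described, so it is not the same technique, just the same idea: look at what a Controller holds on to, and complain if it is a Repository.  All of the class names are hypothetical, and the check relies purely on a naming convention.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Sketch of a layering check: controllers may hold services, but not repositories. */
class LayeringCheck {
    // Hypothetical application types, identified by naming convention only.
    static class UserRepository { }
    static class UserService { UserRepository repository = new UserRepository(); }
    static class GoodController { UserService service = new UserService(); }
    static class BadController { UserRepository repository = new UserRepository(); }

    /** Returns the names of fields whose declared type looks like a repository. */
    static List<String> repositoryFields(Class<?> controller) {
        List<String> violations = new ArrayList<>();
        for (Field field : controller.getDeclaredFields()) {
            if (field.getType().getSimpleName().endsWith("Repository")) {
                violations.add(field.getName());
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println("GoodController: " + repositoryFields(GoodController.class));
        System.out.println("BadController: " + repositoryFields(BadController.class));
    }
}
```

Dropped into a unit test that fails on a non-empty result, a check like this breaks the build and starts exactly the conversation that Governance is meant to cause.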

Deployment Pipelines

The term ‘build pipeline’ has been popular the last few years, since the publishing of Continuous Delivery (if you haven’t read it, do it.  Seriously, don’t wait – go do it!), but it’s not a new concept.  Build Pipelines are simply Continuous Integration processes taken to their logical conclusion: the realization that automating your Unit Tests or even your Regression tests doesn’t mean a lot if that code is left sitting around, or if the automation stops before the software is actually deployed.  What good are the hours spent testing, if the production software release process is completely different than the testing and preprod processes?  Are you prepared to tell your QA team to go home, because the testing they’re doing won’t be valid when it’s deployed to production?

My full set of thoughts on this topic would make this post far too long, so for now a few principles will have to do:

  • Start with a tool that allows you to put your configuration into Source Control – building your pipeline by hand on a project by project basis is a great way to make it unrepeatable.  The good news is that there are plenty of tools that can do this – GitLab Runner, Travis CI, and yes, Jenkins 2.0 are just a few options.
  • Build your artifacts once, and promote them as they move through the pipeline.  This will help you keep track of what builds are ‘approved’, and will eliminate any chance that the thing you tested is not the thing that you released.
  • If you have one team that develops the software, and another team that releases the software, it is wrong for either of these teams to build the pipelines in a vacuum.  Release Pipelines are a phenomenal tool to help your development teams and IT teams work more collaboratively – the term is ‘DevOps’ for a reason, not ‘NoOps’!
  • Optimize to fail fast.  If you have some tests that take time to run, execute them last, so you don’t have to wait 15 minutes to discover that you missed that Checkstyle issue mentioned above.
  • Categorize your test stages, and build them up over time.  This is about practicality – you should start by identifying the types of tests that you want in the pipeline (unit, integration, acceptance, performance, release tests, visual diff tests, etc), but if you refuse to use your pipeline until they’re all in place, you’ll never get there.  Instead, configure your pipeline to run your build, unit tests and release processes, but leave the final release processes to be manually triggered stages, rather than automated.  At the same time, make the decision about what tests you absolutely must have in place before you’re willing to automate that final release, and then add those steps to your project plan.  This will give you many of the early benefits of a pipeline, and it will give you a controlled release process – you can then make it more efficient as you go.
  • Recognize that 100%, hands off release automation is not necessarily the goal here.  Having a controlled release process that is agreed upon by development, IT and QA is.  If you still believe you need someone to hit the button before release, that’s fine, just recognize that each release will generally include larger change sets.  (While you’re at it, find some internal tools, or less ‘mission critical’ apps, and automate the crap out of them – this will help you gain a bit more confidence with the process.)
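The ‘fail fast’ principle from the list above is easy to sketch in code.  This is a toy model in Java, not how any particular CI tool works: the Stage type, the durations and the stage names are all invented for illustration, and ordering purely by expected duration ignores real dependencies between stages (you still have to build before you can test).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Sketch of a fail-fast pipeline: cheapest stages first, stop at the first failure. */
class FailFastPipeline {
    /** A stage has a name, a rough expected duration, and a check that passes or fails. */
    static class Stage {
        final String name;
        final int expectedSeconds;
        final BooleanSupplier check;

        Stage(String name, int expectedSeconds, BooleanSupplier check) {
            this.name = name;
            this.expectedSeconds = expectedSeconds;
            this.check = check;
        }
    }

    /** Runs stages in order of expected duration; returns the names of the stages that ran. */
    static List<String> run(List<Stage> stages) {
        List<Stage> ordered = new ArrayList<>(stages);
        ordered.sort(Comparator.comparingInt((Stage s) -> s.expectedSeconds));
        List<String> executed = new ArrayList<>();
        for (Stage stage : ordered) {
            executed.add(stage.name);
            if (!stage.check.getAsBoolean()) {
                break; // fail fast: never start the slower stages
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        List<Stage> stages = new ArrayList<>();
        stages.add(new Stage("acceptance tests", 900, () -> true));
        stages.add(new Stage("checkstyle", 10, () -> false)); // simulated style violation
        stages.add(new Stage("unit tests", 60, () -> true));
        System.out.println("Stages run: " + run(stages));
    }
}
```

With the simulated Checkstyle failure, only the ten second stage runs, and the fifteen minute acceptance suite is never started.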

Project Archetypes

Maven Archetypes were one of the really valuable concepts that Maven brought to the Java world, but one that I haven’t yet found a really compelling implementation of – as good as Maven’s dependency mechanism and Archetype concept are, I never really liked using Maven as a tool on a day to day basis.  It had just as much of the XML ugliness as Ant, but because it was a declarative tool instead of a procedural tool, it always felt less readable.

The idea behind Archetypes was pretty simple, though – define a basic, empty structure of a ‘type’ of project, and provide the tool to allow a developer to recreate this in a single command.  Pretty straightforward, although somewhat limited for the time – we were building a lot more monoliths than micro-services back then, so the need to create a project from scratch was pretty infrequent.

Micro-services is the latest hotness these days, but rapidly developing cohesive, yet independent systems across an organization is hard.  Not only do you need to consider basic project structure, but you also need to worry about integrating with test frameworks, setting up circuit breakers, building a deployment pipeline, etc.  Thinking in terms of archetypes can give teams a head start, and do it the Right Way.

Unfortunately, I don’t know of a tool that does this really well.  Several tools will help move you in the right direction – tools like Gradle, Vagrant, and Serverless all have ‘init’ commands that will get you started, and the Spring Framework has the Spring Initializr, but none that I know of allow you to cleanly define your own (Gradle might be going in this direction with the InitBuild plugin, but it’s still in incubation and doesn’t have the option for a team to define their own project types).

So for the time being we might be stuck with ‘thinking’ in terms of Archetypes, but there’s still value here – creating a blank project template that defines folder structure, and includes configuration templates is easy, and can save a lot of time for teams that are building out their infrastructure.  It also serves as an effective way to communicate what the current best practices are, as they are discovered and added to the templates.
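As a rough illustration of how little it takes to get started, here is a sketch of a home-grown template in Java that stamps out a blank project skeleton.  The folder layout, the generated build.gradle contents and the project name are all assumptions on my part; the point is only that the ‘archetype’ can be a few lines of code that your team owns and evolves.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch of a home-grown "archetype": stamp out a blank project skeleton. */
class ProjectTemplate {
    // Hypothetical layout; adjust to whatever your team standardizes on.
    static final String[] DIRECTORIES = {
        "src/main/java", "src/main/resources", "src/test/java"
    };

    /** Creates the skeleton under parent/projectName and returns the project root. */
    static Path create(Path parent, String projectName) throws IOException {
        Path root = parent.resolve(projectName);
        for (String dir : DIRECTORIES) {
            Files.createDirectories(root.resolve(dir));
        }
        // A deliberately tiny build script template; a real one would carry team defaults.
        String buildScript = "apply plugin: 'java'\n\njar {\n    baseName = '"
                + projectName + "'\n}\n";
        Files.write(root.resolve("build.gradle"), buildScript.getBytes(StandardCharsets.UTF_8));
        return root;
    }

    public static void main(String[] args) throws IOException {
        Path root = create(Files.createTempDirectory("archetype-demo"), "user-service");
        System.out.println("Created " + root);
    }
}
```

Because the template is just code in a repo, adding a newly discovered best practice is a normal commit that every future project picks up.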

Still, there might just be an opportunity here somewhere…

Epilogue?

Of course there are more techniques here – this is only a start, and likely a topic that I’ll expand on in the future.  The key is about thinking in terms of making the Right Thing the Easy Thing – if the process of pushing out a hot fix is exactly the same as pushing out any other release, and if that process can run start to finish quickly, you will find yourself doing crazy things like opening a .jar file and replacing a .class file far less frequently.  Yes, many of us know how to do that, but most of us also know that it’s not ok.

BTW, if you couldn’t figure out the difference between those ‘if’ statements above, look for the space…

Sweet Simplicity of a REST App with Spring Boot

Did you know that you can build a fully featured REST app, right down to the database, with only three Java classes?  Yeah, you can, and it’s pretty sweet – here’s the skinny.

What you’re going to need

The ingredients list is pretty straightforward – you’ll be working with the following packages:

  • Spring Boot (you probably figured that out from the title)
  • The Spring Data REST package
  • All of the dependencies that these pull in – but you probably won’t care too much about these

That’s pretty much it – aside from Java, Gradle, and your favorite IDE.

What we’re going to build

We’re going to start off with something simple – say we need a micro-service to keep track of registered users in our system.  We need to be able to store user data, allow it to be retrieved, updated and deleted.  In other words, a typical CRUD interface.

The interface, of course, will be exposed as a REST service – in addition to the regular old GET, POST, PUT and DELETE methods, though, we’ll want to support some sort of discoverability in our API, so we’ll also be exposing a HATEOAS interface, to allow clients to discover and dynamically adapt to our service as we expand it in the future.

We’re already talking about a fair amount of functionality here, but trust me, we won’t be breaking the three class rule.

The Model

Our model starts with a simple User class:

public class User {
    private String firstName;
    private String lastName;
    private String email;

    //Getters, Setters, equals and hashCode methods removed
}

Simple enough place to start – I did remove some boiler plate code from that sample, but I literally used IntelliJ to generate them all, so they really weren’t that interesting.

The Repository

Our model, of course, is pretty much useless on its own – this is where the Spring Data project comes in.  Spring Data is a set of components that help make it easier to manage data – the range of functionality provided is huge, and best discovered on your own here.  For this tutorial, the short version is that we’ll be taking advantage of Spring Data’s ability to make it really easy to work with JPA objects.

JPA?  I’ve been duped!

Yeah, I know, I didn’t mention anything about JPA earlier, but the fact of the matter is that for the very basic functionality we’re talking about here, we just don’t need to stress out about it too much.  Here’s all you really need to know:

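  • JPA will be storing our Model object in a single table named after the entity (User)
  • The columns will be named after the fields (Spring Boot’s default naming strategy converts camelCase to snake_case, so firstName becomes first_name), and will be typed as varchars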
  • You won’t be writing any SQL
  • Hibernate will be doing the actual work for us
  • You can customize pretty much everything above, if you want
  • Oh, and in case anyone cares, JPA stands for Java Persistence API, and you can read about it here.

Carrying on

Now that we’re all more comfortable, the first thing we need to do is update our model a bit – we’re going to give it a db-generated primary key, and we’re going to annotate it as an Entity, so the system knows that it should care.

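import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

@Entity
public class User {
    // Primary key, generated by the database
    @Id @GeneratedValue
    private long id;
    private String firstName;
    private String lastName;
    private String email;

    //Getters, Setters, equals and hashCode methods removed
}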

Now that you’ve added a field, don’t forget to regenerate your equals and hashCode methods, and give the new field its own getter and setter.

We’re still on just a single class, of course – here comes our second.  Well, nearly – it’s an interface, actually:

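import org.springframework.data.repository.PagingAndSortingRepository;
import org.springframework.data.repository.query.Param;
import org.springframework.stereotype.Repository;

@Repository
public interface UserRepository extends PagingAndSortingRepository<User, Long> {
    // Query derived from the method name: looks up a User by its email field
    User findByEmail(@Param("email") String email);
}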

So what have we done here?  We’ve created the interface for our repository, extending the PagingAndSortingRepository that Spring Data provides – as you can probably guess, Spring Data assumes a lot for us here, since we didn’t need to declare any ‘save’, ‘update’, ‘delete’, or similar methods.  By extending this interface, we actually get a whole bunch of good stuff:

  • Full CRUD functionality, including the ability to load all entities, load a single entity by primary key, and of course save, update and delete entities.
  • The additional ability to page and sort our result sets, because nobody wants to load our entire database all at once.
  • A standard set of query extensions – as you can see here, we have a ‘findByEmail’ method that allows us to define queries in a really simple manner, based on the field names on the Entity.

And Finally – the Main Class

Of course, we need something to run – so here is our main class.  As you can see, we’ve annotated it with @SpringBootApplication, and our main class is calling SpringApplication.run:

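import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class Application {
    public static void main(String... args) {
        // Boots the embedded server and wires up the auto-configured components
        SpringApplication.run(Application.class, args);
    }
}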

All we need now is a build script:

buildscript {
    ext {
        springBootVersion = '1.4.1.RELEASE'
    }
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}")
    }
}

apply plugin: 'java'
apply plugin: 'spring-boot'

jar {
    baseName = 'rest-in-three-classes'
    version =  '0.0.1'
}

repositories {
    mavenCentral()
}

sourceCompatibility = 1.8
targetCompatibility = 1.8

dependencies {
    compile('org.springframework.boot:spring-boot-devtools')
    compile('org.springframework.boot:spring-boot-starter-data-jpa')
    compile('org.springframework.boot:spring-boot-starter-data-rest')
    compile('com.h2database:h2')

    testCompile('org.springframework.boot:spring-boot-starter-test')
}

And away we go!

Shenanigans!  I Call Shenanigans!

Yes, really, that’s it.  Don’t believe me?  Build and run the thing with ‘gradle bootRun’, and then open http://localhost:8080/users in your favorite browser.  This is what you’ll see:

{
  "_embedded": {
    "users": []
  },
  "_links": {
    "self": {
      "href": "http://localhost:8080/users"
    },
    "profile": {
      "href": "http://localhost:8080/profile/users"
    },
    "search": {
      "href": "http://localhost:8080/users/search"
    }
  },
  "page": {
    "size": 20,
    "totalElements": 0,
    "totalPages": 0,
    "number": 0
  }
}

Instant REST service!  Go ahead, play around — send a POST to the same URL with this JSON to create a new user:

{
 "firstName":"Bing",
 "lastName":"Crosby",
 "email":"white@christmas.com"
}

List them, PUT them, DELETE them – it all works!
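If you’d rather do the POST from code than from a REST client, here is a minimal sketch using nothing but the JDK.  It assumes the app from this post is running locally on the default port, builds the JSON by hand (so it assumes the values need no escaping), and the class and method names are my own invention.  Spring Data REST should answer a successful create with a 201 status.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Sketch of a client that creates a user through the REST service. */
class CreateUserClient {
    /** Builds the JSON body by hand; assumes the values need no JSON escaping. */
    static String userJson(String firstName, String lastName, String email) {
        return "{\"firstName\":\"" + firstName + "\","
             + "\"lastName\":\"" + lastName + "\","
             + "\"email\":\"" + email + "\"}";
    }

    /** POSTs the JSON to the given URL and returns the HTTP status code. */
    static int post(String url, String json) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setDoOutput(true);
        try (OutputStream body = connection.getOutputStream()) {
            body.write(json.getBytes(StandardCharsets.UTF_8));
        }
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumes the service is listening on localhost:8080.
        String json = userJson("Bing", "Crosby", "white@christmas.com");
        System.out.println("HTTP status: " + post("http://localhost:8080/users", json));
    }
}
```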

But Where’s Everything Else?

I know, you really want a super complicated Spring configuration – sorry, not here.  You were really excited to implement that repository interface – nope, not today.  You even wanted to download and install your favorite Servlet engine, and configure it just so – sorry to disappoint!

There’s no sorcery here – this is Spring Boot in action, a set of libraries that make it very easy to build small, nimble, easy to extend and configure micro-services.  It’s no longer a hassle to set up a new project – you can literally do it in five minutes. But before we congratulate ourselves, let’s take a closer look at what’s going on here.

Simplified Configuration

I’ve been a fan of Spring for years, ever since I realized I could use it to make my code easier to read, more testable, and get me the transactionality of EJB without miles and miles of boilerplate code (anyone remember EJB 2.0?).  But Spring’s Achilles Heel has always been in the configuration – when it’s working, it’s magic, but when it’s not, it’s maddening.

Spring Boot (and, in fact, Spring 4 in general) addresses this with some simple and rather clever auto configuration.  With Spring Boot, all you need to enable a certain feature is to include a ‘starter library’ on the classpath.  There are a slew of these available, both from Spring, and from third parties – Spring has even provided what looks like a pretty comprehensive list of both.

So in our case, we’ve included ‘spring-boot-starter-data-rest’ on our classpath, which gives us a whole lot:

  1. It adds the Spring Data libraries to our class path, obviously
  2. It includes the Tomcat Servlet engine on our classpath, and embeds it into the jar file.  Yes, this makes the jar file larger than it otherwise would be, but it tremendously simplifies the deployment of our app
  3. It includes a bootstrap library that ties everything together when executing the jar file
  4. Our Repository interface has a full REST web service defined and implemented automatically, complete with paging support, and a full HATEOAS design, all based on the definition of the Repository and the Entity class.  Even the findByEmail method is exposed as a search resource.
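To see that search resource in action, you need the right URL.  The sketch below builds it in Java, following the Spring Data REST convention of exposing query methods under the repository’s search path, with the @Param name as the query parameter.  The base URL assumes the app is running locally, and the helper class is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch: build the URL for the findByEmail search resource. */
class SearchUrlSketch {
    /** Appends the search path and a URL-encoded email parameter to the base URL. */
    static String findByEmailUrl(String baseUrl, String email) throws UnsupportedEncodingException {
        return baseUrl + "/users/search/findByEmail?email=" + URLEncoder.encode(email, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Assumes the service is listening on localhost:8080.
        System.out.println(findByEmailUrl("http://localhost:8080", "white@christmas.com"));
    }
}
```

A GET against the printed URL should return the matching user, assuming you have created one first.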

Repository Implementation

The ‘spring-boot-starter-data-jpa’ library is what ties Spring Data and Hibernate together – it takes our Repository interface, and provides a default implementation, meaning that we are free to focus our JPA efforts into the mapping, and we don’t need to touch the API.  While this doesn’t mean we can get by without understanding what’s going on behind the scenes, it allows us to simply eliminate an entire class of code that tends to include a lot of boiler plate, and can be error prone.

In addition, it tremendously simplifies the configuration – simply add a JDBC driver to the classpath and the connection info in an application.properties file, and you’re all set to use that database.  Heck, if you include the H2 in-memory JDBC driver, as we do above, it will start and stop the database for you with no further configuration at all – sweet for testing.

JPA is not my favorite library – it gives us Annotation overload at times, it’s tricky to work with complex object relationships, and it provides us with a query language that’s close enough to SQL to look familiar, but different enough to not work the way I usually think it should – but with a library like Spring Data, it’s hard to argue that this isn’t a great option.

The End

And that’s it – really.  Download the code from my GitHub repo, and please, poke around and find whatever else is interesting.

This was obviously only a taste of what you can do, but it shows off that with the current state of tools, you can motivate your team to build small, independent micro-services without a lot of overhead.  This isn’t all there is to it, of course – good testing practices, simple deployment mechanisms, and solid discipline are still required for working with micro-services, but the bar is being lowered every day!