Recently at ThoughtWorks, my team had a requirement to run a small set of ETL-style tasks in response to a given domain event. We didn't want to concern ourselves with hosting, and subsequently infrastructure, so we decided to try out AWS Lambda.

The general idea is that we write our ETL function in Node.js, host it on Lambda, and forget all about the infrastructure. It works beautifully. However, the most challenging aspect of adopting AWS Lambda was getting succinct logs into our logging solution, SumoLogic.

We have a close working relationship with Sumo, who had been trying to tackle this problem themselves before we approached them. Working together, we arrived at a solution in which log lines are correlated by a requestId for each Lambda request and are visible in SumoLogic.

Overview of the Solution

The general premise is actually quite simple. The steps are:

  1. Configure an HTTP collector in SumoLogic.
  2. Create a new Lambda function to parse incoming CloudWatch logs and forward them to the HTTP collector.
  3. Configure your CloudWatch log group to forward logs to the new Lambda function.

Once you've done the above, you should see your logs appearing in SumoLogic.
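
To make the "parse incoming CloudWatch logs" part a little more concrete: CloudWatch hands a subscribed Lambda function its data under event.awslogs.data, base64-encoded and gzip-compressed. Below is a minimal sketch of decoding that payload. The encodeAwsLogs helper and the sample group and stream names are made up, and exist only so the snippet can run locally without AWS.

```javascript
var zlib = require('zlib');

// Decode the payload CloudWatch Logs passes to a subscribed function:
// event.awslogs.data is base64-encoded, gzip-compressed JSON.
function decodeAwsLogs(event) {
    var zipped = Buffer.from(event.awslogs.data, 'base64');
    return JSON.parse(zlib.gunzipSync(zipped).toString('utf8'));
}

// Hypothetical helper for local testing: wraps a payload in the same envelope.
function encodeAwsLogs(payload) {
    var zipped = zlib.gzipSync(Buffer.from(JSON.stringify(payload), 'utf8'));
    return { awslogs: { data: zipped.toString('base64') } };
}

var sample = {
    messageType: 'DATA_MESSAGE',
    logGroup: '/aws/lambda/my-etl-function',   // made-up name
    logStream: 'my-log-stream',                // made-up name
    logEvents: [{ id: '1', timestamp: 1451606400000, message: 'hello' }]
};

var decoded = decodeAwsLogs(encodeAwsLogs(sample));
console.log(decoded.logGroup, decoded.logEvents[0].message);
```

Round-tripping a sample like this is a cheap way to exercise the parsing logic before wiring anything up in AWS.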

Step 1: Configure an HTTP collector in SumoLogic

This article presumes some level of SumoLogic knowledge, but basically you want to add a new HTTP source to either a new or an existing collector.

Manage -> Collection -> Add Collector -> Hosted Collector

Then add a new HTTP source to that collector. This will give you a unique URL that you can send logs to. Save it somewhere, because you'll need it soon!
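
If you would rather pass the host and path to Node's https.request separately, the URL can be split with the built-in URL class. This is only a sketch: buildRequestOptions is a hypothetical helper name, and the URL is the same placeholder used elsewhere in this article, not a real endpoint.

```javascript
// Split an HTTP source URL into the hostname/path pair that
// https.request accepts in its options object.
function buildRequestOptions(collectorUrl) {
    var parsed = new URL(collectorUrl);
    return {
        hostname: parsed.hostname,
        path: parsed.pathname + parsed.search,
        method: 'POST'
    };
}

// Placeholder URL; substitute the one SumoLogic generated for your source.
var options = buildRequestOptions(
    'https://collectors.us2.sumologic.com/receiver/v1/http/blahblahblah'
);
console.log(options);
```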

Step 2: Create a Logging Lambda function

Create a new Lambda function with the code below. This is where you use the HTTP collector URL from Step 1: put that URL into the 'path' part of the options object.

[Image: sumo_role_config]

The role is down to your security configuration. The function doesn't need any specific permissions, so the Basic Execution Role will suffice.

var https = require('https');
var zlib = require('zlib');

exports.handler = function(event, context) {
    // Replace the hostname and path with the values for your own HTTP source (Step 1).
    var options = { 'hostname': 'collectors.us2.sumologic.com',
                    'path': 'https://collectors.us2.sumologic.com/receiver/v1/http/blahblahblah',
                    'method': 'POST'
    };
    // CloudWatch delivers log data base64-encoded and gzip-compressed.
    var zippedInput = Buffer.from(event.awslogs.data, 'base64');

    zlib.gunzip(zippedInput, function(e, buffer) {
        // Stop here on failure rather than trying to parse an undefined buffer.
        if (e) { return context.fail(e); }

        var awslogsData = JSON.parse(buffer.toString('utf8'));
        console.log(awslogsData);
        // Control messages carry no log events, so there is nothing to forward.
        if (awslogsData.messageType === "CONTROL_MESSAGE") {
            console.log("Control message");
            return context.succeed("Success");
        }
        var req = https.request(options, function(res) {
            var body = '';
            console.log('Status:', res.statusCode);
            res.setEncoding('utf8');
            res.on('data', function(chunk) {
                body += chunk;
            });
            res.on('end', function() {
                console.log('Successfully processed HTTPS response');
                context.succeed();
            });
        });
        req.on('error', function(err) { context.fail(err); });
        var stream = awslogsData.logStream;
        var group = awslogsData.logGroup;
        // Lines such as START carry the request ID; later lines inherit the most recent one.
        var curRequestID = null;
        var re = /RequestId: (\S+) /;
        awslogsData.logEvents.forEach(function(val) {
            val.logStream = stream;
            val.logGroup = group;
            var rs = re.exec(val.message);
            if (rs !== null) {
                curRequestID = rs[1];
            }
            val.requestID = curRequestID;
            // One JSON document per line in the request body.
            req.write(JSON.stringify(val) + '\n');
        });
        req.end();
    });
};
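
The request ID correlation is the part most worth checking on its own. The sketch below pulls it out into a standalone function using the same regular expression as the handler above. The name tagWithRequestId and the sample log lines are illustrative only; real Lambda request IDs are longer.

```javascript
// Same pattern as the handler: captures the ID from lines such as
// "START RequestId: <id> Version: $LATEST".
var REQUEST_ID_PATTERN = /RequestId: (\S+) /;

// Returns copies of the log events, each tagged with the most recently
// seen request ID (null until the first matching line appears).
function tagWithRequestId(logEvents) {
    var currentRequestId = null;
    return logEvents.map(function(logEvent) {
        var match = REQUEST_ID_PATTERN.exec(logEvent.message);
        if (match !== null) {
            currentRequestId = match[1];
        }
        return Object.assign({}, logEvent, { requestID: currentRequestId });
    });
}

// Made-up sample lines covering two consecutive invocations.
var sampleEvents = [
    { message: 'START RequestId: 1111-aaaa Version: $LATEST\n' },
    { message: 'Transforming record 42\n' },
    { message: 'START RequestId: 2222-bbbb Version: $LATEST\n' },
    { message: 'Transforming record 43\n' }
];

var tagged = tagWithRequestId(sampleEvents);
tagged.forEach(function(e) {
    console.log(e.requestID, JSON.stringify(e.message));
});
```

Because the function only looks at the message text, ordinary log lines pick up the ID of whichever START line preceded them in the same batch.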

Step 3: Consume your CloudWatch logs with your new function

By default, your Lambda logs stream to CloudWatch, so go into CloudWatch and find the log group for the function whose logs you want to forward. Another way is via the Lambda configuration page, as described in Amazon's documentation.

[Image: sumo_stream_config]

Simply click Stream to AWS Lambda and select the newly created function from Step 2.

Step 4: Viewing your logs in Sumo

There is a bit of a delay (up to a minute) before the logs appear in Sumo, but have faith, they will. They're sent in JSON format, so it's best to use the json auto operator to view them:

Search: _source=lambda_logs | json auto nodrop