My log saw something that night


If you are running any type of automated job in Python, you should probably be using logging instead of print. Benefits of logging include…

  1. Output is actually available after your job runs
  2. Timestamps are automatically recorded for each log event
  3. Logging levels (INFO, WARNING, etc.) make it easy to spot when something is wrong

There are a lot of ways to create a logger object. Here I’ll demonstrate how to build one directly in Python and explain some of its components.

Initializing Logger

The first step is actually importing the logging module. It’s really easy.

import logging

logging itself is actually the root logging module and can be configured and used directly. However, many third-party Python modules also emit log events that propagate up to the root logger, which can result in a lot of extraneous output. Because of this, I prefer to use the getLogger function to create a custom logger object.
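As a quick illustration (the logger name 'my_job' is just an example), getLogger returns a named logger that is separate from the root one, and calling it again with the same name hands back the very same object:

```python
import logging

# 'my_job' is an arbitrary example name; any string works
logger = logging.getLogger('my_job')

print(logger.name)  # my_job
# getLogger caches loggers by name, so repeated calls return the same object
print(logging.getLogger('my_job') is logger)  # True
```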

Logging and Logger Levels

Like I mentioned earlier, one of the benefits of logging is the ability to record your output at various “levels”. Levels are hierarchical, and each level corresponds to the severity of the event you are logging. The hierarchy is as follows …

DEBUG < INFO < WARNING < ERROR < CRITICAL

Once you get your logger object, logging events is simple. The logger object will have functions corresponding to each logging level - you simply call the relevant function and use it like print. Note - you will need to configure the logger object’s level, which is the lowest level in the hierarchy that will be logged. For example, if you set the level to INFO, then DEBUG events will not be logged, but everything else will be. The root logger’s default level is WARNING.
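One way to see why the hierarchy works this way: each level name is just an integer constant on the logging module, so a “lower” level really is numerically smaller. A quick sketch:

```python
import logging

# Each level is a plain integer, which is what makes the hierarchy comparable
for name in ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'):
    print(name, getattr(logging, name))
# DEBUG 10, INFO 20, WARNING 30, ERROR 40, CRITICAL 50
```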

Below are a few examples using the logging module directly with its basic configuration.

Note: you can parameterize your logging output using % like this - logger.warning('This %s broke', param)

import logging
import sys

# Note: basicConfig is a no-op once the root logger already has a handler set up!
logging.basicConfig(level=logging.INFO, stream=sys.stdout)

logging.info("That gum you like is going to come back in style")
>> INFO:root:That gum you like is going to come back in style

logging.warning("The owls are not what they seem")
>> WARNING:root:The owls are not what they seem
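To see the %-style parameterization from the note above in action, here is a small sketch that captures the output in a StringIO buffer (the logger name and message are invented for the demo):

```python
import io
import logging

buf = io.StringIO()
logger = logging.getLogger('format_demo')
logger.addHandler(logging.StreamHandler(buf))
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output off the root logger

param = 'log drive'
# the %s placeholder is filled in lazily, only if the event is actually emitted
logger.warning('This %s broke', param)

print(buf.getvalue().strip())  # This log drive broke
```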

Custom Logger Components

Next I’ll go over a custom logger object, generated by the below code.

import logging
import sys
from logging import StreamHandler
from logging.handlers import RotatingFileHandler

logger = logging.getLogger('logger_name')

logpath = '{path}/stream.log'

fh = RotatingFileHandler(logpath, maxBytes=10*1024*1024, backupCount=0)
sh = StreamHandler(stream=sys.stdout)

formatter = logging.Formatter('%(asctime)s | %(levelname)s | %(message)s')
fh.setFormatter(formatter)
sh.setFormatter(formatter)

logger.addHandler(fh)
logger.addHandler(sh)

logger.setLevel(logging.INFO)

First, you’ll notice that I am using the getLogger function to create a logger separate from the root logging module.

Next, I generate two Handler objects. These objects configure where the logging output goes.

Here I add two handlers to my custom logger object: RotatingFileHandler and StreamHandler. StreamHandler is the simpler of the two; it sends logger output directly to sys.stdout, just like in the root logger example. RotatingFileHandler writes log output to a file. Its arguments specify the log’s path on disk, the max file size in bytes (10MB in this case), and the number of backup logs that will be kept before the file starts overwriting itself.
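Here is a small sketch of that rotation behavior, using a temp directory and a deliberately tiny maxBytes so the rollover actually triggers (the logger name and sizes are made up for the demo):

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

tmpdir = tempfile.mkdtemp()
logpath = os.path.join(tmpdir, 'stream.log')

logger = logging.getLogger('rotation_demo')
logger.setLevel(logging.INFO)
# tiny maxBytes so we can watch the file roll over; backupCount=2 keeps two old logs
logger.addHandler(RotatingFileHandler(logpath, maxBytes=200, backupCount=2))

for i in range(50):
    logger.info('message number %d', i)

# the current log plus its two rotated backups survive; older ones are discarded
print(sorted(os.listdir(tmpdir)))  # ['stream.log', 'stream.log.1', 'stream.log.2']
```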

In addition, each Handler object has a Formatter object attached to it, which changes how the information is displayed in the log. My preferred formatter records a timestamp, level name, and custom message. The output will look like this.

logger.critical("Where does creamed corn figure into the workings of the universe?")
>> 2018-06-11 00:53:36,050 | CRITICAL | Where does creamed corn figure into the workings of the universe? 

Lastly, we need to set the level again. In this case, we can do so explicitly using the setLevel method. I typically prefer setting it to INFO, as DEBUG can lead to spammy output.

Parsing a Log File

An optional but nice touch is writing some code to parse your log output into a pandas dataframe. This is especially helpful if you are working in a Jupyter Notebook environment. Below is a sample of some code that I personally use to parse output in the format '%(asctime)s | %(levelname)s | %(message)s'.

Functionality includes an optional argument to only return log output from today, as well as parsing out an ID variable (this works when the ID appears in a standard format throughout the log, i.e. ID- followed by numbers).

import datetime
import pandas as pd
import re
from io import StringIO

def read_log(logpath, today=True):
    '''
    Parses log elements into a pandas dataframe

    Parameters
    ----------
    logpath: str
        log location on disk
    today: bool, optional
        if True, only return log entries from today
    '''
    with open(logpath) as f:
        log = f.read()

    # regex separator strips the spaces around each '|', so the timestamp parses cleanly
    df = pd.read_csv(StringIO(log), sep=r'\s*\|\s*', names=['Time', 'Level', 'Message'], engine='python')
    df['Time'] = pd.to_datetime(df.Time.str.replace(',', '.'), format='%Y-%m-%d %H:%M:%S.%f')
    df['ID'] = df['Message'].apply(lambda x: re.findall(r'ID-[0-9]+', x))
    df['ID'] = df['ID'].apply(lambda x: 'None' if x == [] else str(x).strip("[] | '"))

    if today:
        bix = df['Time'].apply(lambda x: x.date() == datetime.date.today())
        df = df[bix]

    df = df[['ID', 'Time', 'Level', 'Message']]

    return df
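As a quick usage sketch of the same parsing approach, here is a tiny in-memory log (the sample lines, timestamps, and IDs are invented for the demo) run through the same read_csv logic:

```python
from io import StringIO

import pandas as pd

# two invented log lines in the '%(asctime)s | %(levelname)s | %(message)s' format
sample = (
    '2018-06-11 00:53:36,050 | INFO | Starting job ID-123\n'
    '2018-06-11 00:53:37,101 | ERROR | Job ID-123 failed\n'
)

# regex separator strips the whitespace around each '|'
df = pd.read_csv(StringIO(sample), sep=r'\s*\|\s*',
                 names=['Time', 'Level', 'Message'], engine='python')
df['Time'] = pd.to_datetime(df.Time.str.replace(',', '.', regex=False),
                            format='%Y-%m-%d %H:%M:%S.%f')
# pull the ID-### token out of each message
df['ID'] = df['Message'].str.extract(r'(ID-[0-9]+)', expand=False)

print(df['Level'].tolist())  # ['INFO', 'ERROR']
print(df['ID'].tolist())     # ['ID-123', 'ID-123']
```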

That’s all there is to it. Thanks for reading!
