My log saw something that night
If you are running any type of automated job in Python, you should probably be using `logging` instead of `print`. Benefits of logging include…
- Output is actually available after your job is run
- Timestamps are automatically tracked for each log event
- Logging levels (INFO, WARNING, etc.) make it easy to see if something is wrong
There are a lot of ways to create a logger object. Here I’ll demonstrate how to build one directly in Python, and explain some of its components.
Initializing Logger
The first step is actually importing the logging module. It’s really easy.

```python
import logging
```

`logging` itself is actually the root logging module and can be configured and used directly. However, many Python modules are configured to use the root logger, which can result in a lot of extraneous output. Because of this, I prefer to use the `getLogger` function to create a custom logger object.
Logging and Logger Levels
Like I mentioned earlier, one of the benefits of logging is the ability to log your output at various “levels”. Levels are hierarchical, and each level corresponds to the severity of the event you are logging. The hierarchy is as follows…
DEBUG < INFO < WARNING < ERROR < CRITICAL
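Under the hood, each level name is just an integer constant in the `logging` module, which is what makes the hierarchy comparable:

```python
import logging

# Each level is an integer; a record is emitted only when its level
# is greater than or equal to the logger's configured level.
for name in ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'):
    print(name, getattr(logging, name))
# DEBUG 10, INFO 20, WARNING 30, ERROR 40, CRITICAL 50
```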
Once you get your logger object, logging events is simple. The logger object will have a method corresponding to each logging level - you simply call the relevant one and use it like `print`. Note - you will need to configure the logger object’s *level*, or the lowest event in the hierarchy that will be logged. For example, if you set the level to `INFO`, then `DEBUG` events will not be logged, but everything else will be. The root logger’s default level is `WARNING`.
Below are a few examples using the most basic default logging module. Note that the logging methods support lazy `%`-style parameter substitution, like this - `logger.warning('This %s broke', param)`.
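As a quick illustration of that lazy `%` formatting (the `param` value here is made up for the example), the logging call performs the substitution itself:

```python
import logging

logging.basicConfig(level=logging.INFO)

param = 'toaster'  # hypothetical value, just for the example
# The logging module substitutes %s itself, and only if the record
# is actually emitted - suppressed DEBUG calls skip the formatting work.
logging.warning('This %s broke', param)
# >> WARNING:root:This toaster broke
```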
```python
import logging
import sys

# Note: the root logger will no longer be configurable with basicConfig
# after an initial handler is set up!
logging.basicConfig(level=logging.INFO, stream=sys.stdout)

logging.info("That gum you like is going to come back in style")
# >> INFO:root:That gum you like is going to come back in style
logging.warning("The owls are not what they seem")
# >> WARNING:root:The owls are not what they seem
```
Custom Logger Components
Next I’ll go over a custom logger object, generated by the below code.
```python
import logging
import sys
from logging import StreamHandler
from logging.handlers import RotatingFileHandler

logger = logging.getLogger('logger_name')

logpath = '{path}/stream.log'
fh = RotatingFileHandler(logpath, maxBytes=10*1024*1024, backupCount=0)
sh = StreamHandler(stream=sys.stdout)

formatter = logging.Formatter('%(asctime)s | %(levelname)s | %(message)s')
fh.setFormatter(formatter)
sh.setFormatter(formatter)

logger.addHandler(fh)
logger.addHandler(sh)
logger.setLevel(logging.INFO)
```
First, you’ll notice that I am using the `getLogger` function to create a logger separate from the root logging module.
Next, I generate two `Handler` objects. These objects configure where the logging output goes. Here I add two handlers to my custom logger object - `RotatingFileHandler` and `StreamHandler`. `StreamHandler` is the simpler of the two, and sends logger output directly to `sys.stdout`, like in the root logger. The `RotatingFileHandler` writes log output to a file. Its arguments specify the log’s path on disk, max file size in bytes (10MB in this case), and the number of backup logs that will be maintained before it overwrites itself.
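To see the rotation in action, here is a small sketch using a deliberately tiny `maxBytes` and a temporary directory (the logger name and paths are made up for the example):

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

logdir = tempfile.mkdtemp()
path = os.path.join(logdir, 'demo.log')

logger = logging.getLogger('rotation_demo')
logger.propagate = False   # keep output out of the root logger
logger.setLevel(logging.INFO)
logger.addHandler(RotatingFileHandler(path, maxBytes=200, backupCount=2))

for i in range(20):
    logger.info('message number %d', i)

# Once demo.log would exceed 200 bytes, it is renamed to demo.log.1
# and a fresh demo.log is started (up to backupCount backups are kept).
print(sorted(os.listdir(logdir)))
```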
In addition, each `Handler` object has a `Formatter` object attached to it, which changes how the information is displayed in the log. My preferred formatter records a timestamp, level name, and custom message. The output will look like this.
```python
logger.critical("Where does creamed corn figure into the workings of the universe?")
# >> 2018-06-11 00:53:36,050 | CRITICAL | Where does creamed corn figure into the workings of the universe?
```
Lastly, we need to set the level again. In this case, we can do so explicitly using the `setLevel` method. I typically prefer setting it to `INFO`, as `DEBUG` can lead to spammy output.
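A minimal sketch of that filtering behavior (the logger name is arbitrary):

```python
import logging
import sys

logger = logging.getLogger('level_demo')
logger.addHandler(logging.StreamHandler(stream=sys.stdout))
logger.setLevel(logging.INFO)

logger.debug('below the configured level - never emitted')
logger.info('at the configured level - emitted')
print(logger.isEnabledFor(logging.DEBUG))   # False
```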
Parsing a Log File
An optional, but nice, touch is writing some code to parse your log output into a pandas dataframe. This is especially helpful if you are working in a Jupyter Notebook environment. Below is a sample of some code that I personally use to parse output in the format `'%(asctime)s | %(levelname)s | %(message)s'`.

Functionality includes an optional argument to only return log output from today, and parsing out an `ID` variable (this can be done if the ID appears in a standard format, i.e. `ID-` followed by numbers, throughout the log).
```python
import datetime
import re
from io import StringIO

import pandas as pd


def read_log(logpath, today=True):
    '''
    Parses log elements into a pandas dataframe

    Parameters
    ----------
    logpath: str
        log location on disk
    today: bool, optional
        True to only parse values from today, else False
    '''
    with open(logpath) as f:
        log = f.read()
    df = pd.read_csv(StringIO(log), sep='|',
                     names=['Time', 'Level', 'Message'], engine='python')
    # strip the padding around the '|' separator before parsing the timestamp
    df['Time'] = pd.to_datetime(df.Time.str.strip().str.replace(',', '.'),
                                format='%Y-%m-%d %H:%M:%S.%f')
    df['ID'] = df['Message'].apply(lambda x: re.findall(r'ID-[0-9]+', x))
    df['ID'] = df['ID'].apply(lambda x: 'None' if x == [] else str(x).strip("[] | '"))
    if today:
        bix = df['Time'].apply(lambda x: x.date() == datetime.date.today())
        df = df[bix]
    df = df[['ID', 'Time', 'Level', 'Message']]
    return df
```
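As a usage sketch, the same parsing steps can be tried on a small in-memory sample (the timestamps and messages below are made up) without touching a real log file:

```python
import re
from io import StringIO

import pandas as pd

# Two fake lines in the '%(asctime)s | %(levelname)s | %(message)s' format
sample = (
    '2018-06-11 00:53:36,050 | CRITICAL | ID-42 creamed corn question\n'
    '2018-06-11 00:53:37,100 | INFO | no id in this one\n'
)

df = pd.read_csv(StringIO(sample), sep='|',
                 names=['Time', 'Level', 'Message'], engine='python')
# strip the padding around the '|' separator, then convert the timestamp
df['Time'] = pd.to_datetime(df.Time.str.strip().str.replace(',', '.'),
                            format='%Y-%m-%d %H:%M:%S.%f')
df['ID'] = df['Message'].apply(lambda x: re.findall(r'ID-[0-9]+', x))
df['ID'] = df['ID'].apply(lambda x: 'None' if x == [] else str(x).strip("[] | '"))
print(df[['ID', 'Time', 'Level', 'Message']])
```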
That’s all there is to it. Thanks for reading!