Learning to include logging and testing in application code

Hello -

I’m trying to become a more well-rounded developer. I need to start including logging and testing in my code.

Can anyone help me with books, resources, or examples of apps on GitHub that implement logging? Or, if anyone has advice from experience, please share. I’m really interested in the principles and best practices of logging and testing.

I typically use Python and Ruby, but I’m familiar enough with programming that any language would work (JavaScript, Java, C#, etc.).


For my hobby projects that are only run on occasion, I generate a logfile named program-year-month-day-hour-minute.txt in the program’s folder, then have a header with the program name, version, and some other miscellaneous info to help match the log to a program should it end up in the wrong folder.

The log entries are then hour-minute-second (zero-padded), followed by a severity (info, warning, critical) and some text to specify what is going on.

The internal structure of the program to get this is that methods (or functions) that have something interesting to report shout over to the “logger” with the parameters of their name, the severity (as a String for flexibility), and what went wrong. try{}catch{} blocks in the methods are used as “sources” (the catch gets triggered and throws a warning or critical at the logger), and if(){}else{} in the “logger” is used to check the message importance against the desired log level (there is probably a better way to do it).
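
A rough Python sketch of that setup with the standard logging module might look like this (the program name, version, and filename pattern are just placeholders):

```python
import logging
from datetime import datetime

PROGRAM_NAME = "myprogram"   # placeholder
PROGRAM_VERSION = "0.1.0"    # placeholder

# One logfile per run, named program-year-month-day-hour-minute.txt
logfile = datetime.now().strftime(f"{PROGRAM_NAME}-%Y-%m-%d-%H-%M.txt")

logging.basicConfig(
    filename=logfile,
    level=logging.INFO,                        # the "desired log level" check
    format="%(asctime)s %(levelname)s %(message)s",
    datefmt="%H:%M:%S",                        # zero-padded hour-minute-second
)
log = logging.getLogger(PROGRAM_NAME)

# Header so the file can be matched back to the program later
log.info("=== %s v%s started ===", PROGRAM_NAME, PROGRAM_VERSION)

def load_config(path):
    try:
        with open(path) as f:
            return f.read()
    except OSError as exc:
        # the try/except acts as the "source" that reports to the logger
        log.warning("load_config failed for %s: %s", path, exc)
        return None

load_config("settings.ini")   # made-up filename, just to trigger a log entry
```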

Recently I’ve used pytest with Python and xUnit with .NET for testing.

Many languages have at least one popular library like that for unit testing. That’s where I’d start.
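
For example, a minimal pytest file could look like this (apply_discount is a made-up function just so there is something to test); run it with `pytest`:

```python
# test_pricing.py
import pytest

def apply_discount(price, percent):
    """Return the price reduced by the given percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

def test_apply_discount():
    assert apply_discount(100.0, 25) == 75.0

def test_apply_discount_rejects_bad_percent():
    with pytest.raises(ValueError):
        apply_discount(100.0, 150)
```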

For logging, you should take a look at structured logging. The idea is to not log just a raw message but to break it up into a template with placeholders and variables, so you can query them in your logging solution of choice and take action automatically if desired.
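
A rough Python illustration of the difference, using only the standard logging module (the field names are just examples; libraries such as structlog, or Serilog in C#, are built around the same idea):

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("orders")

order_id, user_id, total = "A-1042", 7, 99.95   # hypothetical values

# Unstructured: the variables are baked into one opaque string.
log.info(f"Processed order {order_id} for user {user_id} (total {total})")

# Structured: a stable message plus separate fields. A structured backend
# or JSON formatter can index the fields passed via `extra`, so you can
# query "all events where order_id == A-1042" later.
log.info(
    "order processed",
    extra={"order_id": order_id, "user_id": user_id, "total": total},
)
```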

Unfortunately, off the top of my head I cannot tell you of any on-prem solution to play around with. In Azure we’ve been using Application Insights. In C#, Serilog does quite a good job as a logging library and can log to basically anything, including Application Insights.

For testing, I’d say just get started with learning some basic unit test framework at first. You’ll probably quickly discover that it’s very easy to write near-untestable code, and it’s very easy to write tests that you cannot keep up with / maintain. You’ll really start seeing the cracks in your code design once you have to test it. It’s much easier to test your code when functions do only what they have to, aren’t tightly coupled, and dependencies are provided to them in a sane way that you can override in tests.
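
As a small Python sketch of the “dependencies provided in a sane way” point (RateClient and convert are made-up names):

```python
class RateClient:
    """Real dependency: would call an external service in production."""
    def get_rate(self, currency: str) -> float:
        raise NotImplementedError("network call would go here")

def convert(amount: float, currency: str, rates: RateClient) -> float:
    # The dependency is passed in rather than created inside the function,
    # so a test can hand in a fake instead of hitting the network.
    return amount * rates.get_rate(currency)

# --- test code ---
class FakeRateClient(RateClient):
    def get_rate(self, currency: str) -> float:
        return 2.0   # deterministic value for the test

def test_convert_uses_injected_rate():
    assert convert(10.0, "EUR", FakeRateClient()) == 20.0
```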

At work we differentiate between “debug logging”, which is mostly text (“did this and that, loaded this data, starting to shut down, etc.”), and “logging” or “event logging”, where you send a use-case-specific data structure somewhere that contains the request and everything important or relevant that happened as part of processing it, and then analyze this across time with “big data” tools.
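
A tiny Python illustration of that split (the event name and fields are hypothetical, and the print is just a stand-in for wherever the event record would actually be sent):

```python
import json
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("checkout")

def handle_checkout(request: dict) -> None:
    # "Debug logging": free-form text about what the code is doing.
    log.debug("loading cart for user %s", request["user_id"])

    # ... process the request ...

    # "Event logging": one structured record per request, containing
    # everything relevant, sent somewhere queryable for later analysis.
    event = {
        "event": "checkout_completed",
        "user_id": request["user_id"],
        "items": len(request["items"]),
        "total": request["total"],
    }
    print(json.dumps(event))

handle_checkout({"user_id": 7, "items": ["a", "b"], "total": 42.0})
```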


For Python testing, just look at mock and pytest, and maybe have a look at the Home Assistant sources.
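
For instance, a small pytest + unittest.mock sketch (WeatherApi is a made-up stand-in for something that would do network calls):

```python
from unittest import mock

class WeatherApi:
    """Pretend this class makes network calls in production."""
    def fetch_temperature(self, city: str) -> float:
        raise RuntimeError("would hit the network")

def describe_weather(api: WeatherApi, city: str) -> str:
    return "warm" if api.fetch_temperature(city) >= 20 else "cold"

def test_describe_weather_without_network():
    api = WeatherApi()
    # Replace the method with a canned return value for this test only.
    with mock.patch.object(api, "fetch_temperature", return_value=25.0):
        assert describe_weather(api, "Lisbon") == "warm"
```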

My journey of improving testing has been to occasionally follow Test-Driven Development (TDD) for some of my side projects.

In particular, I loved Quii’s Learn Go with Tests, but there are tons of TDD how-tos and approaches (my bread and butter is Go and Java).

In particular, during TDD’s refactor phase, I really started to ingrain design patterns (see “Software design pattern” on Wikipedia), and the testing typically gets easier, along with the flexibility and extensibility of your code.

I’m not saying that you should 100% live, eat, breathe TDD in your professional life (though I swear some of my colleagues do). I’m just saying it’s a useful kata to take yourself through for a while.
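
To give a flavor of the cycle, here’s a tiny red/green/refactor round in pytest (FizzBuzz is just a stand-in kata):

```python
# 1. Red: write the test first; it fails because fizzbuzz() doesn't exist yet.
def test_fizzbuzz():
    assert fizzbuzz(3) == "Fizz"
    assert fizzbuzz(5) == "Buzz"
    assert fizzbuzz(15) == "FizzBuzz"
    assert fizzbuzz(7) == "7"

# 2. Green: write the simplest implementation that makes the test pass.
# 3. Refactor: clean it up while the test keeps you honest.
def fizzbuzz(n: int) -> str:
    out = ("Fizz" if n % 3 == 0 else "") + ("Buzz" if n % 5 == 0 else "")
    return out or str(n)
```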


For logging, on the other hand, my main “north star” is: does my team have enough information to know where the issue is in production at 2 a.m., half awake, because something got borked?

Maybe someone in this thread can give me better guidance or a better approach to logging, but that’s been my “approach” for a while, and it’s been good enough.

And ironically, being able to locate the issue can sometimes mean removing stuff from the logs. That’s one good reason to clean up “warnings” vs “errors” in languages which classify things that way. A “warning” may not have any visible or functional impact on your application, but if it adds a ton of useless garbage to your logs that can really slow down the process of diagnosing a problem.
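
In Python’s standard logging that kind of cleanup can be as simple as raising the threshold of a noisy logger (the “chatty_http_client” name here is hypothetical):

```python
import logging

logging.basicConfig(level=logging.INFO)

# The noisy dependency's retry warnings drown out real problems,
# so raise its threshold while keeping your own logger at INFO.
logging.getLogger("chatty_http_client").setLevel(logging.ERROR)

log = logging.getLogger("myapp")
log.info("this still shows up")
logging.getLogger("chatty_http_client").warning("this no longer clutters the log")
```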

Unfortunately, we’re not all so lucky as to work only on applications where these sort of things have already been taken care of… :neutral_face:

Just grep through the logs and feel like a god, duh :stuck_out_tongue:

In seriousness, aggregation of JSON-formatted logs is a hell of a drug when tracing things. But that also means you have traceability, such that you can start to query your logs and actually get things out of them.
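
A minimal Python sketch of that, assuming a hand-rolled formatter and a made-up request_id field (real setups usually lean on a library or the aggregator’s own agent):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so an aggregator can index the fields."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Correlation id attached via `extra=`; lets you query every
            # line belonging to one request.
            "request_id": getattr(record, "request_id", None),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("api")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("payment accepted", extra={"request_id": "req-42"})
# {"ts": "...", "level": "INFO", "logger": "api", "message": "payment accepted", "request_id": "req-42"}
```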

But, to @cotton’s benefit, I think this starts to verge on me solutioning for work (i.e. a large enterprise) rather than “Hey, what’s a good thing to learn?”. Sadly, I feel like my logging approach is more something I grokked over the years rather than something that I explicitly learned.

Personally, I find testing is a much easier thing to learn than logging. Not that logging itself is hard; rather, I feel like the approach to logging is a bit dependent on the company you’re at (or maybe I’ve just had a string of companies who do things vastly differently. IDK, YMMV).

I’d agree with that. Writing good log messages and knowing what events are going to be useful for logging takes practice while testing is more procedural.


As someone who develops web apps for a living, logging is definitely something that takes focus and attention to get right. Usually developers have to “unlearn” using logging in lieu of a debugger, even though the debugger is faster, but then they have to re-learn logging because they can’t debug their code once the app is deployed. It’s certainly not something that I’m perfect at either.

I think what you should log is really going to depend on what the process is and what sort of visibility any issues would have. A small/tight CRUD app might not need as much logging, since there’s already a record of database and user interaction, but a scheduled process or coordination between multiple systems would definitely need multiple levels of logs if you ever need to debug an issue.

I have always looked at programming as basic functional I/O; in other words, what the input of any module is and what the output is. In my mind this also extrapolates to logging: in any process there will be specific phases that you’ll enter or leave. Based on your code’s branching, you may want to ensure you have at least one INFO output for each of its exit conditions or error states to track a record properly. Ensuring that at the process level means that when you come back, you can track the exact branch that a given input took even if you don’t have trace/debug enabled or debug statements. My workplace also emphasizes adding all the debug/trace you need, since it can be safely configured to be ignored at runtime. We’re not exactly developing around performance, though.
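
Something like this, say (import_record is a made-up processing step with one log line per exit path):

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("importer")

def import_record(record: dict) -> bool:
    """One INFO/WARNING per exit condition, so the exact branch a given
    input took can be reconstructed from the log after the fact."""
    if not record.get("id"):
        log.warning("rejected record: missing id (%r)", record)
        return False
    if record.get("status") == "duplicate":
        log.info("skipped record %s: already imported", record["id"])
        return False
    log.debug("raw record %s: %r", record["id"], record)   # ignored unless debug is enabled
    log.info("imported record %s", record["id"])
    return True

import_record({"id": "A-17"})
```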

For user-controlled processes I think it’s possibly safer to log even less, ironically - most CLIs effectively only log at a WARN level with maybe some additional INFO statements for exit cases. Some have multiple levels of verbose flags (-vvv).
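
For example, mapping repeated -v flags to log levels in Python might look like this (using the standard argparse and logging modules):

```python
import argparse
import logging

# Default is WARNING, -v bumps to INFO, -vv (or more) to DEBUG.
parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="count", default=0)
args = parser.parse_args()

level = {0: logging.WARNING, 1: logging.INFO}.get(args.verbose, logging.DEBUG)
logging.basicConfig(level=level, format="%(levelname)s %(message)s")

logging.warning("always shown")
logging.info("shown with -v")
logging.debug("shown with -vv or more")
```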

It’s also worth finding a good example source that is similar to what you’re developing that you can use as a guideline. It may take a while to find good examples though. Logging and testing are two features of development that often get left for last, and subsequently forgotten.

Makes a lot of sense. We also sample logs in Application Insights but occasionally configure it to bypass said sampling to troubleshoot an issue we don’t manage to see while debugging locally. It can be hard to tell what happened if you can only see every second log message in some cases, and logging everything all the time is more expensive than it needs to be. But depending on your logging framework, it might not make much difference in terms of programming. At least in Serilog you still always have the choice to log raw text if you want to, but if you just log raw text in code, that is all it can do for you.

After looking into testing and logging, you might also want to look into real-time monitoring metrics (counters, gauges, maps à la Prometheus), tracing (à la OpenTelemetry or Zipkin), profiling (à la pprof), and sanitizers like ASan, TSan, and MSan.

Thanks everyone for the input.

I was assigned two projects. On both I’ve been implementing structured logging from the start. Ultimately, my goal is to get the logs moved into a log ingestion service, so admins can use the service to review them rather than connecting to the systems directly.

Generally speaking, I default a log event to “debug” unless it needs to be elevated. Whereas I used to use “print” statements to debug, I now set the logger level to debug and trace through the logs instead.

This has changed how I write code: firstly because I’m now including logging in my code, and secondly because I try to set up logging in such a way that the output I use while writing and debugging is “masked” when it runs in production but available while developing.
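
In Python that “masking” can be as simple as driving the level from an environment variable (the APP_LOG_LEVEL name is just an example):

```python
import logging
import os

# DEBUG while developing, INFO or WARNING in production, without
# touching the source: export APP_LOG_LEVEL=DEBUG locally.
level_name = os.environ.get("APP_LOG_LEVEL", "INFO")
logging.basicConfig(
    level=getattr(logging, level_name.upper(), logging.INFO),
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
log = logging.getLogger("myapp")

log.debug("loaded %d rows from cache", 128)   # masked unless APP_LOG_LEVEL=DEBUG
log.info("sync finished")
```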

I added structured logging to my code and was able to get it hooked up to a cloud logging service.

I think I need to make a distinction between event logging and application logging.

For Python, you might want to check out “Python Testing with pytest” by Brian Okken. It’s a practical guide to testing in Python. Also, the “Python Logging” module documentation provides excellent insights into logging best practices.
