[GH-ISSUE #468] Switch to rich logging package for CLI logs, progress bars, etc. #1819

Open
opened 2026-03-01 17:53:56 +03:00 by kerem · 5 comments
Owner

Originally created by @MartinThoma on GitHub (Sep 6, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/468

I'm currently writing an article about logging in Python. As a part of that, I look at other projects and how logging is done there. I've seen `logging_util.py` and was wondering why this project does not use the built-in `logging` module.

Could somebody share some thoughts around logging? Why is it done like that in this project? Do you consider this a best practice?


@pirate commented on GitHub (Sep 7, 2020):

It's not a best practice, I just tend to write my own logging utils because I like very fine-grained control over how colored log output appears and I find the `logging` builtin library to be somewhat clunky.


@MartinThoma commented on GitHub (Sep 7, 2020):

Thank you for answering!

If it's not a best practice, what would you consider a best practice for logging?

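Have you given 3rd party logging libraries a try ([structlog](https://pypi.org/project/structlog/), [loguru](https://pypi.org/project/loguru/), [pysnooper](https://pypi.org/project/PySnooper/), or something else)? If so, what do you think of them?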


@MartinThoma commented on GitHub (Sep 7, 2020):

> I find the `logging` builtin library to be somewhat clunky

Could you explain that a bit more? What exactly is it that you don't like?

I think I have heard about loguru at [Python bytes](https://de.scribd.com/podcast/419006379/111-loguru-Python-logging-made-simple) and they mentioned that it deals pretty well with colored output (e.g. over various consoles / terminals / within PyCharm).


@pirate commented on GitHub (Sep 7, 2020):

Those libraries would all be great for a long-running background daemon that writes to a logfile, e.g. `archivebox server`, but the output is far too verbose/hard-to-tune for a one-shot CLI command like `archivebox add`, `archivebox version`, etc.

Because I sometimes want timestamps, sometimes color, and sometimes structured output sections, but not always, it's easier for me to just hand-code it than to use a logger that imposes those things all the time or never.

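Take for example this output from `archivebox add`:
![image](https://user-images.githubusercontent.com/511499/92414140-e6b39e00-f120-11ea-8ff6-d81ed1ba3c43.png)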

There are multiple tiers of indentation that are context-dependent, and multiple colors for different types of notifications that are tuned for UX, not for strict internal consistency with any particular "log level" e.g. DEBUG/INFO/etc. (which a library would enforce). I also have multiple colors on a single line to highlight important parts of the line, something none of the tools let you do easily.
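As a rough illustration of the hand-rolled style described above (context-dependent indentation plus several colors on one line), here is a minimal sketch. The `ANSI` table, the `line` helper, and the sample messages are all hypothetical and are not taken from ArchiveBox's `logging_util.py`.

```python
# Hypothetical color table; the names mirror the {placeholder} style used in this thread.
ANSI = {
    "reset": "\033[0m",
    "green": "\033[32m",
    "red": "\033[31m",
    "yellow": "\033[33m",
}
INDENT = "    "  # one tier of indentation


def line(text: str, tier: int = 0) -> str:
    """Indent `text` by `tier` levels and expand {color} placeholders."""
    return INDENT * tier + text.format(**ANSI)


# Each call picks its own tier and colors; nothing is tied to a log level.
print(line("{green}[+] Adding 2 links to index{reset}"))
print(line("{yellow}>{reset} https://example.com", tier=1))
print(line("{red}Failed:{reset} wget exited with status {yellow}8{reset}", tier=2))
```

One limitation of this approach: because the helper relies on `str.format`, any literal braces in a message would have to be escaped as `{{` and `}}`.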

I also dislike having to import/declare a logger at the top of every file, something that neither matches my mental model of logging, nor is enforced by any linter, so it's easy to forget and cause subtle logging errors in certain environments.

At the end of the day, why use a dependency for something that I can code by hand in half a day and that matches my needs exactly? In general in my Python projects, I try to only use dependencies for big internal features that cannot easily be written by hand. I dislike the NPM-style of dependency management that encourages hundreds of 10-line dependencies for small conveniences.

Not to mention, it's much more straightforward when debugging to read something like this, than to have to read the docs on a 3rd party dependency and understand how it works:

```python3
print('{green}# ArchiveBox Imports{reset}'.format(**ANSI))
print('{green}from archivebox.core.models import Snapshot, User{reset}'.format(**ANSI))
print('{green}from archivebox import *\n    {}{reset}'.format("\n    ".join(list_subcommands().keys()), **ANSI))
print()
print('[i] Welcome to the ArchiveBox Shell!')
print('    https://github.com/pirate/ArchiveBox/wiki/Usage#Shell-Usage')
print()
print('    {lightred}Hint:{reset} Example use:'.format(**ANSI))
print('        print(Snapshot.objects.filter(is_archived=True).count())')
print('        Snapshot.objects.get(url="https://example.com").as_json()')
print('        add("https://example.com/some/new/url")')
```
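The snippet above relies on an `ANSI` mapping that is not shown in the thread. A minimal sketch of how such a mapping could be built follows. The escape codes, the `build_ansi` helper, and the TTY/`NO_COLOR` check are assumptions for illustration, not ArchiveBox's actual configuration.

```python
import os
import sys

# Assumed color names and SGR escape codes; the real table may differ.
COLOR_CODES = {
    "reset": "\033[0m",
    "green": "\033[32m",
    "lightred": "\033[91m",
}


def build_ansi(use_color: bool) -> dict:
    """Return the color table, or blank strings so placeholders vanish."""
    if use_color:
        return dict(COLOR_CODES)
    return {name: "" for name in COLOR_CODES}


# Color only when writing to a terminal and the NO_COLOR variable is unset or empty.
use_color = sys.stdout.isatty() and not os.environ.get("NO_COLOR")
ANSI = build_ansi(use_color)

print('{green}# ArchiveBox Imports{reset}'.format(**ANSI))
print('    {lightred}Hint:{reset} Example use:'.format(**ANSI))
```

The design choice worth noting is that the call sites never branch on whether color is enabled: swapping in empty strings lets the same format strings produce plain text when output is piped to a file.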

In a new project I would consider using either normal `print` statements or, if absolutely necessary, the built-in Python `logging` library as a best practice. Unless you have a massive project or a project whose core UX centers around its logfiles, I would not add additional dependencies to manage logging. (ArchiveBox is <6k lines of Python.)


@pirate commented on GitHub (Feb 22, 2024):

I've decided to re-open this because A. ArchiveBox has gotten bigger and does a lot of logging now, and B. I found `rich`: https://github.com/Textualize/rich

It's an amazing package and it provides everything we need out-of-the-box and more, including:

- scrolling partial-height livepanes: https://rich.readthedocs.io/en/stable/live.html
- progress with known and unknown totals: https://rich.readthedocs.io/en/stable/progress.html
- pretty traceback printing with locals: https://rich.readthedocs.io/en/stable/traceback.html
- filesystem tree display: https://rich.readthedocs.io/en/stable/tree.html
- generic syntax highlighting: https://rich.readthedocs.io/en/stable/syntax.html
- markdown rendering to console: https://rich.readthedocs.io/en/stable/markdown.html
- full layouts with sub-panels: https://rich.readthedocs.io/en/stable/layout.html
- integration with python/django `logging` system: https://rich.readthedocs.io/en/stable/logging.html
- pretty printing for `archivebox shell`: https://rich.readthedocs.io/en/stable/pretty.html
- tabular data display for `archivebox list`: https://rich.readthedocs.io/en/stable/tables.html

![image](https://github.com/ArchiveBox/ArchiveBox/assets/511499/2219d861-8be9-4448-804c-a849de241d24)

---

```bash
rich README.md -o readme.html
rich data.csv -o data.html
rich index.json -o index.html
rich -x python errors.log -o errors.html
rich -x ini ArchiveBox.conf -o config.html

# also consider imgcat for showing images/thumbnails directly in cli output
imgcat --width 2 archive/*/favicon.ico
imgcat --width 100 archive/*/media/*.webp
imgcat -t text/python logs/errors.log
imgcat -t application/json archive/1698749080.583/index.json
```

- https://github.com/textualize/rich-cli
- https://github.com/eddieantonio/imgcat

---

Use textual-web to generate live web views of commands & output:

```bash
pip install textual-web
textual-web -t
```

- https://github.com/Textualize/textual-web

---

Autogenerate TUI from existing management commands

- https://github.com/anze3db/django-tui
- https://github.com/Textualize/trogon

---

Other TUI Browsers

- https://github.com/juftin/browsr (Filesystem explorer)
- https://github.com/tconbeer/harlequin (SQL DB explorer)
- https://github.com/romanin-rf/SeaPlayer (Terminal media player)
- https://github.com/Textualize/frogmouth (markdown browser)
- https://github.com/wustho/baca (ebook reader)
- https://github.com/mahrz24/netext (render graph nodes and edges in terminal)
- https://github.com/michelcrypt4d4mus/pdfalyzer (deep inspect PDF contents)
- https://github.com/royreznik/rexi (interactive regex tester)
- https://github.com/QbDesu/django-tui.editor (WYSIWYG web editor for markdown in django)
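To make the `logging` integration mentioned above a little more concrete, here is a minimal sketch of wiring `rich` into the standard `logging` module. It assumes `rich` may or may not be installed and falls back to a plain `StreamHandler` if the import fails. The logger name and message are hypothetical, and this is not how ArchiveBox itself is configured.

```python
import logging

try:
    # Third-party dependency: `pip install rich`.
    from rich.logging import RichHandler

    handler = RichHandler(rich_tracebacks=True, show_path=False)
    fmt = "%(message)s"  # RichHandler renders the time and level columns itself
except ImportError:
    # Plain stdlib fallback so the example still runs without rich.
    handler = logging.StreamHandler()
    fmt = "%(asctime)s %(levelname)-8s %(message)s"

handler.setFormatter(logging.Formatter(fmt))

log = logging.getLogger("archivebox.demo")  # hypothetical logger name
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # avoid duplicate lines if the root logger is also configured

log.info("Added %d new links to the index", 3)
```

Because the handler is attached through the regular `logging` API, existing `logger.info(...)` call sites would not need to change. Only the one-time setup differs.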