Show HN: Tach – Visualize and untangle your Python codebase

github.com

251 points by the1024 3 days ago

Hey everyone! We're Evan and Caelean, the authors of Tach.

Tach lets you visualize the architecture of your Python codebase, and gives you the tools to incrementally improve it. It uses module boundaries to give teams the benefits of microservices without the deployment complexity.

If your code has been getting tangled up as your team and codebase grow, Tach helps you move back in the right direction, incrementally and quickly. You can use Tach to adopt a "modular monolith" architecture [1] step by step, for better local reasoning and smoother feature development.

Since our last Show HN (https://news.ycombinator.com/item?id=41359181) we've shipped support for layers, third party dependencies, visualizations, and more.

Tach is:

* Open source (MIT)
* Completely free
* Fast (written in Rust)
* In use by teams at NVIDIA, PostHog, and more

One way Tach differs from existing systems that handle this problem (build systems, import linters, etc) is the ability to be incrementally adopted. Also, runtime speed.

If you struggle with dependencies, onboarding new engineers, or a massive codebase, Tach is for you! We built it with developers in mind - with clean integrations into Git, CI/CD, and IDEs, and the performance for it to be effective in any form factor.

[1] https://www.milanjovanovic.tech/blog/what-is-a-modular-monol...

hansonkd 2 days ago

Really excited to see this project gain traction.

> Note that this graph is generated remotely with the contents of your `tach.toml`

Isn't shipping off parts of your codebase to a 3rd party without warning in the CLI a security risk? In regulated environments you can be audited on whether your code was only stored on properly vetted services, which is why some sales cycles for AI coding assistant tools are so long. It would be kind of frustrating to have something like that happen and get set back on licensing, etc.

Just from the video, there doesn't seem to be any sort of warning that you are shipping config files to your servers, and the URL that you produced doesn't seem to have any authentication.

Maybe I am misunderstanding that functionality, but it gives me pause to use it.

  • 0x63_Problems 2 days ago

    Co-author here, fair question!

    In short, we want to make the visualization UX as smooth as possible, and this is best done with a web app. The URLs use UUIDs, and the contents being sent don't include literal source code, only module names and Tach configuration. We will also delete graphs by UUID on request, and have done so in the past.

    That said, we do try to be up-front about this, which is why that disclaimer exists, and when running this command on the CLI, you must supply an explicit `--web` argument to `tach show`. Otherwise, the default behavior is to generate a GraphViz DOT file locally.

    • ycombiredd a day ago

      If it outputs DOT, I can recommend you visualize your graphs with PHART ( https://github.com/scottvr/phart/ )

      I’m mostly kidding, but incidentally PHART was born in order to visualize Python dependency graphs in-line in 7-bit ASCII, because I wanted the functionality in my dependency-analyzing, code-summarizing concatenator tool I was using to aid in pair-programming with ChatGPT and Claude when codebases started outgrowing useful context lengths. That tool is here https://github.com/scottvr/chimeracat/ (it is nowhere near as slick-looking as OP’s app, but also that is by design.)

      The first time someone in public said they were curious to see the chimeracat output for his company’s codebase was also the first time I considered “wow.. how do I make sure people know they can trust chimeracat isn’t stealing their code?” and started thinking of ways to give people that surety and safety for any app. I realized that, though it was my first time thinking about how “code analysis” tools like this, or even linters, prettifiers, etc., are fertile ground for subterfuge and espionage, it was no doubt not the first time the thought had occurred elsewhere, likely to at least a handful of folks who would put (and no doubt are putting) such tools out there in the wild.

    • airstrike 2 days ago

      > we want to make the visualization UX as smooth as possible

      still doesn't explain why you need to ship the data to a third party

      > and this is best done with a web app

      debatable. you could always write a GUI app. it's not that hard for such a self-contained project

      there would be _a lot_ to gain from having this run totally locally without any network access and leaking source code to third parties.

      • tyre a day ago

        > you could always write a GUI app. it's not that hard for such a self-contained project

        beautiful HN comment. They might simply be familiar with web apps and want to focus on the part that provides the most value to users.

        The external network requests are optional. It can run fully locally.

        They’re a tiny startup that just launched, trying to ship something that helps people. Building a native app is not the most impactful thing they could spend their time on.

        • airstrike a day ago

          The part that provides most value to users is not shipping data to third parties needlessly. I can write the GUI for this app in a week.

          • Majestic121 8 hours ago

            Good thing that it is open source then, it means you can fix this issue in a week!

    • bmitc 2 days ago

      Why not just let users run the web app locally? There's no reason it needs to be remote.

      Also, the mere fact that it sends any data, no matter what you say it contains, is a non-starter at many places. And even module names can contain proprietary data.

      • 0x63_Problems 2 days ago

        I can understand the frustration, but I think there are legitimate reasons to run this remotely.

        Tach is an installable Python package; shipping a full web app would have to come in a separate form factor and has significant maintenance implications. Given that we are explicit about the remote app before anything is sent, require explicit opt-in, and provide usable alternatives locally, we prioritize shipping a useful graph experience that is immediately usable.

        If you are at an enterprise that cannot tolerate this, then you can use a local viewer with either GraphViz DOT format or Mermaid which is generated by using `tach show` or `tach show --mermaid` respectively.

        • byteknight 2 days ago

          I appreciate the attempt, but the reasoning of "it requires maintenance" is entirely moot. You have to do this regardless. It's just a question of whether or not you publish it open-source. You are still saying, internally, this is good enough for customers, when you push it out.

          This is a (very) thinly veiled attempt at a closed garden of sorts, IMHO. It's a "clean" excuse for not giving away the milk for free, but it falls short on actual reasoning.

          • jlg23 2 days ago

            Looking at the license (MIT) we already got much more than what we paid for and the authors don't "have to" do anything but accept thanks of those who chose to be grateful for software they got for free.

            • 9question1 2 days ago

              This. It's ridiculous how often people complain about the design of free software. If you don't like it, just don't use it! Use something else! Build your own! Or fork it to work in the way you described that you'd prefer - you can do that yourself if you really want since the source is available

              • Eisenstein 2 days ago

                It is totally valid to tell people not to criticize a project offered by someone who made it for themselves or wants to offer the value to the public but doesn't have the resources to do everything perfectly. But this is not that, and I don't see a non-profit org behind it, so it appears to be something that is being offered on a quid pro quo basis. Thus we need to figure out where the value is being extracted, and if the devs are cagey about it, that rings alarm bells.

                • tyre a day ago

                  Brother.

                  The default of the command is to generate locally. They don’t need to open source an entire web app. It’s easier to deploy themselves than deal with the burden of open sourcing and maintaining.

                  This isn’t some conspiracy. It’s a tiny startup trying to ship something useful.

                  • Eisenstein a day ago

                    I think you misunderstand my comment. I was addressing whether or not it can be appropriate for someone to question an aspect of an open source project, and not whether this project was part of a conspiracy.

              • bmitc a day ago

                It's not complaining to provide critique, especially when the tool is being marketed and part of a technique to sell services.

                The point of my post was to say why I'm not interested in using it.

              • LtWorf a day ago

                So one can no longer comment on anything?

            • bmitc a day ago

              This has nothing to do with being grateful or not.

          • cmcconomy 2 days ago

            I am having an allergic reaction too, I don't see any reason this should exfiltrate any information from my machine.

        • bmitc a day ago

          To be clear, I'm not frustrated. Just providing feedback.

        • bolognafairy a day ago

          Since you’re being somewhat brigaded by the “everything local!” mob, I just wanna say that this all sounds completely reasonable to me. Some people hate being told that their demographic just isn’t currently being catered to exactly in the way that they want. I’m sure that these people working on things so utterly Top Secret can wait a while for your new little tool to support them. They’re just mad they can’t use it at Meta or whatever.

        • globular-toast 2 days ago

          There are hundreds of "full web apps" on PyPI. What's special about yours?

godelski 2 days ago

I've been surprised that there hasn't been much progress in code tracing. It's incredibly hard to jump into a new code base. Cscope and ctags are still used but uncommon. It's not common to see people use debuggers. I suspect this is a major reason why Python and interpreted languages are so popular. But as code bases have exploded we still haven't gotten much better than where we were over a decade ago. Yeah, RAGs can help, but I'm not sure they are a huge improvement over tags. Maybe we have realistically regressed, relying more on print statement debugging. When do we improve from gdb, cscope, grep, awk, and find? (I'm aware of the improved versions of those, but listen carefully if that's what you're jumping to respond with)

So I'm really happy to see a project like this. Well done. Can't wait to see more

stavros 2 days ago

We replaced our microservices architecture with a modular monolith and got tons of benefits, something I've been meaning to write up. However, while discussing that here, I was pointed to Tach, which looks fantastic if you have (or want to create) a modular monolith.

drdrey 2 days ago

I would recommend installing it with

    uv tool install tach

rather than

    pip install tach

that way tach is installed system-wide but in its own isolated venv

  • globular-toast a day ago

    I keep meaning to write a blog post about this. "pip install x" basically should be read as "this project is packaged and available on PyPI" rather than a literal installation instruction.

    A seasoned Python developer will rarely, if ever, directly pip install something. Instead they would manually add it to pyproject.toml or, if they use something like poetry, use that to add it, or they'd use something like pipx to install it as a "global" tool (or just their system package manager). This has been true for years now. I think it's time projects stop writing "pip install x" and we come up with a standard way to say "the package name is x" and maybe a recommended installation method (like use uv/pipx to install as a system tool, or add to your project dependencies etc).

  • adammarples 21 hours ago

    Venv or no Venv, you can use pip. Asking people to have an additional requirement on uv is unreasonable, although I do prefer using it now myself.

Attummm 2 days ago

This sounds great.

Python is really great for quickly developing applications.

However, maintaining them is a real pain point—especially when it comes to packages and their dependencies.

Furthermore, because there isn’t a compile-time checker, function or method signatures can change unnoticed. Compilers are great for catching such issues at compile time rather than at runtime. Python does have mypy, which can play that role, but the package must support it. Currently, you are dependent on the package maintainer regarding their adherence to semver.

Maybe this project will be able to fill that hole.
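
To make the signature-drift concern concrete, here is a minimal sketch using only the standard library's `inspect` module. The functions `send_v1` and `send_v2` are hypothetical stand-ins for two releases of a dependency, and `call_is_compatible` is an illustrative helper; none of this is part of Tach or mypy.

```python
import inspect


# Hypothetical "old" and "new" releases of a dependency's function.
def send_v1(message, retries=3):
    return f"sent {message} with {retries} retries"


def send_v2(message, *, timeout, retries=3):
    return f"sent {message} (timeout={timeout}, retries={retries})"


def call_is_compatible(func, *args, **kwargs):
    """Return True if the given arguments would bind to func's signature."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


# A call site written against v1 only fails once it actually runs against v2,
# unless something checks the call ahead of time.
old_call_ok = call_is_compatible(send_v1, "hello", 5)
new_call_ok = call_is_compatible(send_v2, "hello", 5)
```

A static type checker aims to catch the same mismatch without running the code, provided the package ships usable type information.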

tracnar 2 days ago

When I tried it, it seemed like you really need to list all modules in `tach.toml`.

What I wanted was to work at a coarser package level. For example if you have the modules `foo.a`, `foo.b`, `bar.a`, and `bar.b`, I'd like a rule that `bar` can import from `foo` but not vice versa, without having to list or care about the submodules.

Is that something you'd want to support?
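
The rule described above can be sketched in a few lines with the standard library's `ast` module. This only illustrates the idea and is not how Tach is implemented; `ALLOWED`, `top_level_imports`, and `violations` are hypothetical names, and the sketch only looks at absolute imports.

```python
import ast

# Hypothetical rule table: which top-level packages each package may import.
ALLOWED = {
    "foo": set(),      # foo may not import bar
    "bar": {"foo"},    # bar may import foo
}


def top_level_imports(source):
    """Return the top-level package names imported by the given source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def violations(module_name, source):
    """List governed packages that module_name imports without permission."""
    owner = module_name.split(".")[0]
    imported = top_level_imports(source)
    governed = imported & set(ALLOWED)  # ignore stdlib and third-party imports
    return sorted(governed - ALLOWED[owner] - {owner})


bar_ok = violations("bar.a", "from foo.b import helper\nimport os\n")
foo_bad = violations("foo.a", "import bar.b\n")
```

Because the check keys on the first component of the dotted name, submodules such as `foo.a` and `foo.b` never need to be listed individually.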

  • 0x63_Problems 2 days ago

    This should definitely be supported out-of-the-box, we can take a closer look in Discord or through GH Issues! But generally I would expect:

    ```
    [[modules]]
    path = "foo"
    depends_on = []

    [[modules]]
    path = "bar"
    depends_on = ["foo"]
    ```

    would do the trick, assuming both are within a configured source root, and their children are _not_ also marked as modules. If the children are marked as modules, their dependency rules are treated separately and wouldn't automatically inherit from a parent.

    • tracnar 2 days ago

      Ok great, I indeed just tried, I might just have been confused last time! Maybe you could mention in the doc that a module can be a "package"? (Even though I suppose a package is also a module, I always find the Python terminology a bit confusing there.)

      • rswail a day ago

        I always used to think of it as a module is a file `example.py`, a package is a directory `example` with a module `__init__.py`.

yogurt-male a day ago

This is great! Really wish that I'd known about this tool a month ago. Would have saved me a lot of headache. Thanks for sharing.

KronisLV a day ago

SourceTrail did something a bit similar, though it didn't focus just on the modules, but also what methods are called where. I really liked the tool, but it didn't really work out for them and they discontinued it: https://en.wikipedia.org/wiki/Sourcetrail

adamc 2 days ago

Having the example be a video that changes was confusing at first, and if you are going to show me something that is changing, I would like to be able to rewind to the beginning. But really I just think it's a bad idea to show something like that without making it obvious what it is.

  • mathfailure 2 days ago

    What do you mean by "a video that changes"?

    I liked the video-example, it's way better than examples in many projects that use just text, when the tool does something quite complex that better be demonstrated in a video with a narrator explaining what's happening and why.

    • 0x63_Problems 2 days ago

      I actually made that change in response to their comment! It used to be a live GIF with no narration.

      • mathfailure 18 hours ago

        Thank you for the change then. I liked the video with narration.

benrutter 2 days ago

This looks nice! I vaguely know Grimp as a similar tool, any idea how they differ/compare?

  • the1024 2 days ago

    Thanks! You can think of Grimp as a lower-level tool for interacting with the import graph in Python, while Tach is a high-level tool responsible for 'modularity' as a whole (e.g. modules, interfaces, layers, deprecations etc.)

    Tach is also more opinionated - so it doesn't require you to write any custom code, and uses declarative config to enforce your desired architecture.

chairhairair a day ago

If this tool looks like it would improve your life I think you should consider using Bazel instead of whatever build system you are using. I don’t see much value add here for a project using Bazel.

jtwaleson 2 days ago

Cool! Do you have plans to launch a paid offering? The website makes me think it's a company but I didn't see any pricing / sales details.

  • 0x63_Problems 2 days ago

    Co-author here - We do provide a web platform (https://www.gauge.sh/platform) which we have been developing with design partners. The fundamental difference between using Tach alone vs. the platform is that the platform provides incremental enforcement at the pull request level.

    We're always happy to chat about adding more design partners! email: founders@gauge.sh

xtiansimon a day ago

Would this also apply to learning and exploring an unknown codebase?

efitz a day ago

What kind of data set are you trying to build using the dependency information from collected toml files, and what do you intend to do with it?

vednig 2 days ago

This is pretty cool, thanks for sharing

lijok 2 days ago

Tools like this rub me the wrong way.

We have well established conventions like prefixing private modules and symbols with an underscore, or declaring your public interfaces in the __init__.py file, but the Python developer decries it as "busywork", "weird" and "hard to read", so we instead use tools like this.

We can manage dependencies with protocols, a type checker and generally following SOLID principles, but the Python developer decries it as "too indirect and convoluted", so we instead use tools like this.
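
A minimal sketch of the protocol approach, assuming a hypothetical billing function that needs to send notifications. `Notifier`, `EmailNotifier`, and `charge` are illustrative names only.

```python
from typing import Protocol


class Notifier(Protocol):
    """The only thing the billing code needs from a notifier."""

    def notify(self, user_id: int, text: str) -> None: ...


class EmailNotifier:
    """Could live in another module; never imports or subclasses Notifier."""

    def __init__(self) -> None:
        self.sent = []

    def notify(self, user_id: int, text: str) -> None:
        self.sent.append((user_id, text))


def charge(user_id: int, amount: int, notifier: Notifier) -> None:
    # Depends on the Notifier shape only, not on any concrete class.
    notifier.notify(user_id, f"charged {amount}")


email = EmailNotifier()
charge(42, 10, email)
```

With this shape, the billing code has no import of the email code at all, and a type checker can verify that `EmailNotifier` satisfies `Notifier` structurally.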

This is more commentary on the Python developer than this tool. Tach looks great.

  • 0x63_Problems 2 days ago

    Co-author here, I can understand where you're coming from!

    Part of the philosophy here is that the tools and techniques you're describing can (and should) be used diligently to solve this problem, and Tach is often a complement to this approach.

    The benefit of centralizing the concern into a single tool, and often a single config file, is that teams get better documentation, earlier feedback (in-editor vs. code review), and more visibility when planning new development. Teams also get to choose _how_ they would like to satisfy Tach's config, and other teams can still rely on the same guarantees due to Tach's static checks.

  • echelon 2 days ago

    > We have well established conventions like prefixing private modules and symbols with an underscore, or declaring your public interfaces in the __init__.py file,

    The language doesn't enforce them, so they may as well not exist. See: python dependency management.

    > This is more commentary on the Python developer than this tool.

    100%. Python has become an unstructured Wild West, perhaps even worse than modern JavaScript. The "Zen of Python" is a bold faced lie.

    Python has incredible use cases. It blends together different disciplines effectively. But perhaps we should ask ourselves whether or not it's a language suitable for writing large monoliths in.

    • SalmoShalazar 2 days ago

      The conventions are widely used and Python is used successfully in numerous “large monoliths”. Saying that the conventions may as well not exist if they’re not enforced is demonstrably nonsense.

      • globular-toast 2 days ago

        It depends on your team. If the whole team "gets it" then things will be fine. But if you've got a team with juniors or people happy to do whatever crap they or ChatGPT can come up with to make things work then it doesn't.

        • goosejuice 2 days ago

          That says more about your process as a swe org than the technology you're using.

          • globular-toast a day ago

            That you'd use a tool or a language to help you?

rubenvanwyk 2 days ago

Does this work together with uv?

  • 0x63_Problems 2 days ago

    Yes!

    Here is an example project that is configured as if it were a uv workspace: https://github.com/gauge-sh/tach/tree/main/python/tests/exam...

    In that project, `tach check-external` would handle between-workspace dependencies, while the core `modules` and `interfaces` config would handle within-workspace dependencies.

    Soon these will be better unified, we kept the 1st-party/3rd-party distinction separate while we learned what the UX should be.

butterlettuce 2 days ago

Just wanted to say Caelean is such a cool name.

Is it “kay-leen” or “kay-lee-an”?

  • the1024 19 hours ago

    "kay-len" - confusing I know

willgax a day ago

Do projects like this exist for Java Spring Boot? It would be very cool if it could work on every codebase regardless of tech stack.

  • ameymh1571 a day ago

    Yeah, I would like to know that too, as it is needed to get a big picture of how everything works.