Transcript: Your Claude Code Agentic OS Sucks

Chase AI · 20m 04s · Added May 15, 12:40 am GMT+8

Source video ID: d86VCtQ_dN8

Transcript

0:00 — Your Claude Code agentic OS sucks, and it’s because you’re focused on the wrong things. You’re spending all your time on fancy dashboards and command centers like this one and this one instead of focusing on what actually drives value in a Claude Code agentic OS, and that’s this: a skill and automation backbone that actually drives everything. The problem is creating something like this at a high level takes time, isn’t flashy, and can be kind of boring, especially when we compare it to these
0:30 — wild-looking command centers that bring in a ton of views. But the truth is, to get any value out of a Claude Code agentic OS, especially when we’re talking about the observability piece, the dashboard piece, the command center thing, it’s only going to happen if this is locked in. And that’s because a strong agentic OS has three parts to it. The first is what you see right here. It’s the skill and automation backbone. It’s the idea that we are going to take Claude Code and turn it into a system that can give us reliable outputs. We are going to take
1:01 — your daily or your team’s or your clients’ workflows and tasks, turn those into skills, turn those skills into automations where it makes sense, and in the process build out a cohesive system like you see here. So we can do the same thing over and over again at a high level and get consistent outputs. The second part of an agentic OS is the memory layer. How do we handle the idea of context engineering? Well, there’s a number of ways we can do it. We can do something super fancy with full-blown
1:31 — knowledge graphs and do something like LightRAG, or we can keep it simple and just use something like Obsidian, which is an 80% solution that’s more than enough for the vast majority of people. And it’s only once we’ve locked all that in that any sort of dashboard or command center for an OS makes sense, because the value of a dashboard really comes in two parts. First is the observability side. That’s the idea that I can kind of cover some of the weaknesses of being in a terminal. Things like seeing my metrics for my social media channel, being able
2:01 — to quickly dive into different audience metrics, have all my research shown to me on one tab. The second half of that value comes from here: all these sort of buttons. And that’s the idea that if I want to bring the power of Claude Code to a team member or to a client who’s never going to jump into the terminal, I can instead build out that skill architecture for them, assign it to these buttons, and they can essentially just execute them on command by just clicking them. And so today, I’m going to show you how to properly set up this skill backbone. And then we’re going to
2:31 — talk about the dashboard side of it, because there is a lot you actually can do in this scenario. And there’s really two paths you can go down. Like you’ve been seeing, I’ve kind of been showing you two versions. There’s the one you see here, which is literally a part of Obsidian itself, which is pretty cool because we also get an integrated terminal. And there’s this web app version, which is really built for distribution if you’re someone who’s trying to bring in other team members or package this for clients. But before we jump into the nitty-gritty of how to do it, a quick word from today’s sponsor, me. So, as you know, inside of Chase AI Plus, I
3:02 — just released the Claude Code masterclass, which is the number one way to go from zero to AI dev, but I have also just added an Agentic OS masterclass inside as well. So, everything you see in today’s video, the prompts, the dashboards, the setups, all that can be found at a much deeper level inside of Chase AI Plus. There’s a link to that in the pinned comment. Also, today, I guess when this video comes out, I’ll be running a free webinar on how to set up an Agentic OS for yourself, going through all three layers. So, if you want to join, make
3:33 — sure to check out the pinned comment as well. I’ll have a link for both of those. So, if this is where all the value lies, how do we set this up and why is it set up like this? Why does it look like an org chart? Well, the whole org chart setup like you see here, where we have stuff broken out into different sections like productivity and research and content, this is just to help you visualize something that is ultimately invisible. This is just for your mental model, and it’s the idea that you do a bunch of different things across a bunch of different domains in your day-to-day, week-to-week flows, whether it’s in your business or just in your personal
4:03 — life. For me, that is split up amongst things like my productivity (so things like Google), research, content, my community, my agency, my sales, on and on and on. And what we need to do for you is we need to take the giant morass of things you do in a day-to-day, right? All these different tasks, and we need to break them out and we need to turn them into skills. Why do we need to turn them into skills? Well, chances are the way you work right now with Claude Code, when you need it to do something, you just
4:34 — spin up Claude Code in the terminal and you kind of tell it what to do. You’re pretty much just using it as a slightly better ChatGPT. And if you’re doing this all the time, why are we not codifying this into a skill? Because when we codify it into a skill, there’s a few things that gives us. One, it’s convenient. I’m taking that entire task and instead of talking about it over the course of a paragraph, I just tell it to do skill whatever, could be a single word, and it does it. So the convenience is one piece. The second piece is that
5:04 — since we have codified it, we can also test it using something like the skill creator skill. We are able to actually benchmark the skills we create. So we can see, A, does the skill even make sense, because it will A/B test using the skill against not having the skill at all. And over time, if this skill is good, we’re going to start getting more deterministic outputs from a system that is inherently non-deterministic. Like when we talk about LLMs, there’s a certain randomness
5:36 — to it, just inherent to how it works. Anytime we can make things less random, the better. And by codifying these things you do day-to-day and turning them into skills, that’s one giant step forward in doing so. And while that makes sense to a lot of people, if you were to ask them if they’d actually ever sat in front of their terminal, turned their mic on, opened up Claude, and said, “Hey, here’s my daily plan. Here’s what I do. Can you pull some skills out of that?” and then turned them into skills using the skill creator skill,
6:06 — you could probably count the percent on one hand, which is wild because this is one of the easiest yet most powerful upgrades to how you use Claude Code. And this visualization is kind of just there to help you think about it, because we do a bunch of different things in a bunch of different domains. And oftentimes we can even combine a lot of the tasks we do into, quote unquote, workflow skills or higher-order skills that have it do a bunch of different things at once. For example, I have a skill called the content cascade
6:37 — skill. This skill, for all intents and purposes, is a content repurposer. When I create a YouTube video and I call on the content cascade skill, it does a number of things for me. It downloads the transcript. It creates a blog post. It creates a LinkedIn post. It creates a Twitter post. It spins up Playwright. It then posts those things for me. That’s a bunch of different individual tasks all in one. But instead of breaking it out into nine different skills, well, now it’s just one skill. And that’s something that can be a huge, like, productivity
7:09 — boost. But have you done that with all the different things you do in your day-to-day? Probably not. And it’s this process of sort of walking through what you do step by step and codifying it. That’s the power of an Agentic OS. Everything we do outside of this, the memory layer, the dashboard, it’s kind of just a nice little bow around it. And if you’re someone who’s not trying to work with team members, someone who’s not trying to package these things and sell them, you could probably stop here and you’ve got, you know, the 80% solution and you’re way ahead of the pack. And so to actually execute this
7:40 — process is pretty simple at its core. You’re just going to do what I said, open the terminal, start a new session, and just start talking. And at the end, say, “Hey, can we turn this into any sort of skills?” Now, I have an entire prompt that breaks this down at a very detailed level of how to do the skill triage. But at its core, that’s all we’re doing. Here’s what I do. Turn it into skills. Sweet. Okay, let’s test the skills. Let’s move on to the next domain in my business, in my team. And the
8:10 — thing is, this is going to be extremely customized and specific to you. I think we get kind of lost in the morass of, like, the 10 billion skills that are floating around. We go to these mega repos like awesome-claude-skills and we look through 10 million different skills thinking this is what’s going to change, you know, my day-to-day outcomes with Claude Code, and it’s like you’re kind of looking for a diamond in the rough here, when instead, knowing that one of the most powerful parts of Claude Code is how easy it is to customize it for you, like
8:40 — why aren’t we leaning into that more in a systemized way? But outside of the custom stuff, I think there’s a few things that almost everyone can get some value out of. I think on the productivity side, a big one is if you’re in the Google ecosystem. I’ve kind of talked about it before, using things like the GWS CLI to basically allow you to do anything inside of the Google ecosystem and turning those into skills, whether that’s like email triage, Google Drive work, or stuff on the calendar. But the truth is, you can also just use the standard MCP connectors that come with
9:11 — Claude Code. And I’m just talking about the basic claude.ai Gmail, Google Calendar, and Drive connectors. The only thing you’re really losing there is you’re not going to be able to send emails, but you can still do drafts, which for a lot of people is good enough since they don’t want it to actually send them off. And that takes 30 seconds to do, and it’s such a productivity boost that, again, very few people actually do. Now, after you’ve gone through the skill creation process, next becomes the decision tree when it comes to automations. For each skill, does it need to be on demand or is it something
9:41 — we can turn into a routine inside of Claude Code? Now remember, when we talk about routines and automations with Claude Code, it’s broken down into two different parts. That’s going to be local automations versus automations running in the cloud. If you don’t know which is which, just stick with local. That basically means it’s going to run when your computer’s on and you have some version of Claude up. On the cloud, that means it’s going to be run on Anthropic’s servers, and you’re going to be limited in how many you can do because they’re basically paying for it. And if you’re on the cloud, hey, it doesn’t have
10:12 — access to your actual computer. It’s not running on your computer. It doesn’t have your CLIs, your skills, your files. So most of the time, it’s just going to be a local automation if you’re in doubt. And this is the process by which you create the backbone for a Claude Code agentic OS. And I keep saying Claude Code. The truth is Claude Code is just the engine. And we’ll talk about this a little bit more. You could replace this with Codex. You could replace this with really anything. You know, we’re building the chassis for this. We can swap out the engine at any time. So everything I say here also
10:43 — applies to something like Codex. Now let’s talk about Obsidian and memory very quickly before we dive into the command center observability dashboard piece, because I think a lot of people get confused about what Obsidian’s actually buying you and the point of it all. Remember, Obsidian is simply an organization layer. Obsidian isn’t doing anything special to all these markdown files. It’s simply giving us, the human beings, a way to kind of figure out what the heck is going on in our files, and gives us a simple way of sort of connecting them. It isn’t
11:13 — inherently changing the memory. This isn’t RAG. It’s not embedding anything. There’s no vector database. Despite, you know, these cool graphics like this, it isn’t a true knowledge graph in that sense. That being said, being organized, especially when we talk about being organized at scale with thousands and thousands of documents, is very important, and it’s not important just to you being able to figure out where stuff is. It eventually becomes important to Claude Code at a certain scale in terms of token efficiency for finding things.
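To make the "just markdown files with links" point concrete, here is a minimal Python sketch, not anything shown in the video. It assumes a vault is a plain folder of `.md` files connected by Obsidian-style `[[wikilinks]]`; the function name `link_graph` is made up for illustration. It rebuilds the link structure with nothing but a regex, with no embeddings and no vector database involved.

```python
import re
from pathlib import Path

# Matches the target of an Obsidian-style [[wikilink]], stopping at an
# alias separator (|), a heading anchor (#), or the closing bracket.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def link_graph(vault: Path) -> dict[str, list[str]]:
    """Map each note name to the sorted note names it links to.

    Hypothetical helper: it only reads plain text files, which is the
    point. The "graph" is just links between markdown files.
    """
    graph: dict[str, list[str]] = {}
    for note in sorted(vault.rglob("*.md")):
        text = note.read_text(encoding="utf-8")
        targets = sorted({match.strip() for match in WIKILINK.findall(text)})
        graph[note.stem] = targets
    return graph
```

Calling `link_graph(Path("path/to/vault"))` returns a plain dictionary, which is roughly the information the graph view draws.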
11:43 — That’s why everyone brings up this, right, the Karpathy RAG. To go through it very quickly: it’s just the idea that we have a vault, which is where Obsidian lives, and some series of subfolders. Karpathy says, “Hey, we have raw for, like, unstructured data. We have wikis, which kind of take the unstructured data and turn it into, like, reports and articles. And then we have outputs for, like, deliverables.” So, hey, I did some research on AI agents, which went to raw. That research got turned into an article about AI agents in my AI agent wiki. Hey,
12:14 — I turned that into a slide deck. That’s sort of the idea. The truth is you don’t have to do that at all. All you need to do is figure out something that makes sense to you, and it needs to be created in a way that you and Claude Code could snake your way through the folder system if there were 100,000 files in there. A baseline like this is a good start, especially because there are things called master index files and index files all over the place. These index files are essentially at every
12:44 — level of Obsidian. Remember, Obsidian is just a folder. So, we’re talking about every subfolder we go down, there’s some sort of file that’s acting like a table of contents. So, if I’m in the vault and I click on the wiki folder, inside the wiki folder is a table of contents called an index file, which tells me, oh, inside here we have agents, RAG systems, and content creation wikis. Cool. I know where to go. I go inside the AI agent folder. What’s inside there? There’s another index. There is another
13:15 — table of contents saying, “Hey, inside the AI agents folder, we have this document and this document.” That’s the biggest thing I would take out of Karpathy: the idea of indexes, and the idea that for every layer I go down in Obsidian in my file structure, there’s some sort of master document that points me in the right direction. If you don’t have that in the beginning, have fun figuring that out when you’re 5,000 documents deep. For me, in my scenario, I have several folders. I have an archive, content, notes, dashboard, inbox, ops, project,
13:45 — systems, wiki. Makes sense for me. I have an index. I understand what’s going on. You, like with all these things, need to customize it so it makes sense for you. And speaking of customizations, now let’s go into the dashboard piece, these command centers for these agentic operating systems. We talked a little bit already about the value play there, right? It’s the idea that there’s visibility and I can actually see things that I couldn’t see in the terminal. And we have sort of like these skill panels that anyone could use. The next question becomes, why the heck are there two of them? Why do you have this one inside of
14:16 — Obsidian itself? ’Cause I’m inside of Obsidian here. And why do you have this one as a Streamlit app on localhost that’s essentially a web app? What’s the difference between these two? Which makes sense for what? Well, I think the value play for the Streamlit application, or really any sort of web app that’s your dashboard layer for an Agentic OS, is distribution. If I want to bring this to a team, or really if I want to package this for a client, having it set up like this is super easy. I can have the template inside of a GitHub repo and I can very
14:46 — quickly distribute that to anyone anywhere. Setting this up takes literally seconds. And if this is meant for a non-technical team member or a non-technical client, keeping it as simple as possible like this and just having clear buttons that are mapped to skills and execute them, that’s great. That’s all they want. The Obsidian-forward dashboard is a little bit different. You’re trading distribution for really ergonomics at this point, and I would argue a little bit more power, because it’s super easy, as you can see here, to also have an integrated terminal inside of your
15:17 — Obsidian command center, which basically means I now have the best of both worlds. Not to mention, because it’s inside of Obsidian, all my stuff is right here for me to play around with. And Obsidian is infinitely customizable. Like over here, right? You know, I have my full calendar, but this isn’t like a calendar plugin. This is literally me just having the Google Calendar web page open and put here on the right-hand side. On the overview, I have a very clear idea of what’s going on that day, what my tasks are, what’s going on with the activity feed, and like where I’m at across
15:48 — different communities. If I want to dive deeper into audience stuff, I have a tab for that. If I want to dive deeper into research, I have a tab for that that shows like trending GitHub repos, stuff going on on Hacker News, as well as some of my briefs, which are also tied to skills, things like headlines, things going on on X and YouTube, and like content opportunities. Again, having this if I’m in a pure terminal setup is just a little bit clunky. It’s a little more difficult. The problem though with the Obsidian setup, and I kind of alluded to it, is the idea of distribution. How
16:18 — could I distribute something like this to a team or to a client? You can kind of do it, because this whole dashboard command center is essentially just a custom plugin that Claude Code created. But it’s a little more, again, clunky and awkward to set this up for somebody else. It’s not just like, “Oh, clone it. You’re good to go.” It’s like, “Okay, clone it. Now go into Obsidian. Now enable these plugins. Now move this here. Move this there. Do all this stuff.” So there’s a certain awkwardness to it. So if you’re someone who’s like a
16:50 — solo operator and you’re just like, “Hey, I want an agentic OS with Claude Code. I want all these cool customizable buttons, whatever they may be, and I also want the terminal clearly available, all on the same pane,” the Obsidian-forward route is perfect. If, on the other hand, you’re someone who’s like, “I’m just trying to package this for teams and clients and turn this into an actual product,” the web app is the way to go. But understand these systems are only as powerful as the skill architecture they’re built upon. It’s just a nice layer on top of Claude Code, because if you don’t have that, this is
17:22 — just some fancy nonsense. That’s all it is, right? You need some actual meat to this. So don’t forget where you make your money. So I’m going to wrap it up there. I hope I was able to make it a little bit clearer as to where I think the value in these Agentic OS systems is. I see a certain contingent of people who really rail on these and say they’re worthless. I don’t think that’s a fair assessment at all. When they do, it’s usually kind of purely targeted at the dashboard side of it, which makes sense if you’re arguing against the dashboard or the command center in a
17:53 — vacuum, but that’s not where the power really is, right? The dashboard and all this is somewhat of a facade. Like, what’s going on is behind it. And that’s where sort of the focus, I think, should be. And if we focus on that and the idea of skills and everything, it’s like, are we then arguing that you shouldn’t have a system of skills that are codified, that are based around what you do in your day-to-day life? I think you’d have a hard time arguing against that. Oh, one last thing, something other people brought up: the idea of costs, which is an important one, especially if you’ve been paying attention lately to the idea
18:23 — that the -p command, doing headless Claude Code runs, is something that apparently Anthropic doesn’t like anymore. And by doesn’t like, I mean they’re throwing you $200 to use exclusively on that, but it’s at API cost. Is there an issue with that in this whole setup? Because, as you can imagine, all this is running headless Claude Code under the hood. Yes and no. For 200 bucks a month, you would have to be kind of like spamming these to get to that point. And so, I think in reality, it’s
18:55 — probably not going to be an issue. If it was an issue and you felt like you were hitting usage issues or clients were hitting usage issues, I think the simple solution is you just move this all over to something like Codex CLI, because Codex is great and they don’t have these issues as well, and you get more bang for your buck. And switching everything under the hood here for Codex is very simple. I mean, you could use Claude Code to do it. You would just point it at the code and just be like, “All right, well, switch it.” So now it calls the Codex CLI instead of Claude.
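The "swap the engine" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code behind the dashboards in the video: it assumes headless runs are started as `claude -p "<prompt>"` and `codex exec "<prompt>"`, which you should verify against the CLI versions you have installed, and the names `ENGINES`, `build_command`, and `run_skill` are hypothetical.

```python
import subprocess

# Assumed command prefixes for headless runs. Check them against your
# installed CLIs before relying on this.
ENGINES = {
    "claude": ["claude", "-p"],   # Claude Code print (headless) mode
    "codex": ["codex", "exec"],   # Codex CLI non-interactive mode
}


def build_command(engine: str, prompt: str) -> list[str]:
    """Return the argv list for running one prompt on the chosen engine."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    return [*ENGINES[engine], prompt]


def run_skill(engine: str, prompt: str, timeout: int = 600) -> str:
    """Run a prompt headlessly and return whatever the CLI prints to stdout."""
    result = subprocess.run(
        build_command(engine, prompt),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if the CLI exits with a non-zero status
    )
    return result.stdout
```

With the engine kept in one lookup table, a dashboard button only needs to pass a different `engine` string, which is the kind of switch the speaker describes.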
19:26 — This is something you could essentially refactor in a matter of minutes, and you can even put like a button on the dashboard, which I might do. It’s like, “All right, let’s go to the Codex version.” So just something to be aware of. In reality, I think for 99.99% of people it has no effect. So that’s where I’m going to leave you. Again, everything you saw here, if you want my exact setup for this Obsidian command center and everything else, you can find that inside of Chase AI Plus. And make sure to check out the webinar
19:56 — that’s going on, you know, in, I don’t know, like 20 hours from this video being posted.
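Going back to the index files from the memory section: the "table of contents at every level" idea can be sketched as a small Python script. This is an assumption-laden illustration rather than the speaker's setup. The file name `index.md`, the listing format, and the helper names `build_index` and `write_indexes` are all hypothetical, and hidden folders such as `.obsidian` are skipped.

```python
from pathlib import Path

INDEX_NAME = "index.md"  # hypothetical name for the per-folder index file


def build_index(folder: Path) -> str:
    """Build a plain markdown table of contents for one folder."""
    lines = [f"# Index: {folder.name}", ""]
    for entry in sorted(folder.iterdir(), key=lambda p: p.name.lower()):
        if entry.name == INDEX_NAME or entry.name.startswith("."):
            continue  # skip the index itself and hidden files or folders
        if entry.is_dir():
            lines.append(f"- {entry.name}/ (see {entry.name}/{INDEX_NAME})")
        elif entry.suffix == ".md":
            lines.append(f"- {entry.name}")
    return "\n".join(lines) + "\n"


def write_indexes(vault: Path) -> int:
    """Write an index file into the vault and every visible subfolder.

    Returns the number of index files written.
    """
    subfolders = [
        p for p in vault.rglob("*")
        if p.is_dir()
        and not any(part.startswith(".") for part in p.relative_to(vault).parts)
    ]
    folders = [vault, *subfolders]  # list is built before any file is written
    for folder in folders:
        (folder / INDEX_NAME).write_text(build_index(folder), encoding="utf-8")
    return len(folders)
```

Rerunning `write_indexes` after adding notes keeps every level's table of contents current, so a person or an agent can walk down from the vault root one index at a time.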