The Ruby AI Podcast

Real-World Ruby AI: Practical Systems That Work

Valentino Stoll, Joe Leo Season 1 Episode 8

In this episode of the Ruby AI Podcast, co-hosts Joe Leo and Valentino Stoll, alongside guest Amanda Bizzinotto from Ombu Labs, delve into the ongoing controversy within the Ruby community involving Ruby Central, Shopify, and Bundler/RubyGems. While both Valentino and Amanda share their perspectives on the situation, the conversation swiftly transitions into Amanda's journey and current work in AI and machine learning at Ombu Labs. The episode highlights various AI initiatives, including the creation of an AI bot to streamline internal processes, automated Rails upgrade roadmaps, and multi-agent architectures aimed at enhancing efficiency in Rails projects. Amanda also discusses the challenges of integrating AI into consultancy services and shares some insights on the tools and strategies used at Ombu Labs. The podcast concludes with exciting updates about Amanda's recent work, Joe's announcements on upcoming projects including Phoenix's public release, and Valentino's discovery of a new user interface for Claude Swarm.


00:00 Introduction and Welcome

00:26 Ruby Community Controversy

04:37 Amanda's AI Journey

08:45 AI in Business and Consultancy

16:24 AI-Powered Tools and Applications

23:09 Managing Knowledge Base Updates

24:42 Prompting Strategies and Agentic Workflows

26:02 Understanding Workflows vs. Agents

28:37 Observability in AI Systems

29:06 Advanced Prompting Techniques

31:08 Multi-Agent Architectures

34:32 Ruby AI Gems and Libraries

37:09 Exciting Announcements and Future Plans

41:44 Conclusion and Final Thoughts

Mentioned In The Show:

Joe Leo  00:00
Hey, everybody. Welcome to the Ruby AI podcast. I am your co-host, Joe Leo, and I'm here with my co-host, Valentino Stoll. Say hi, Valentino. Hey, now. And we're joined here by Amanda, who's going to tell us about some of the AI initiatives happening at Ombu Labs. Say hi, Amanda. Hello. It's great to have you here today. And I thought, well, I didn't get anybody who said no to this, so I thought we could jump right in and talk about the controversy that is roiling the Ruby community today. And maybe we're a couple of days late, but that's okay. We're keeping a schedule here. We are plunged into this controversy of Ruby Central and Shopify and DHH and Bundler and RubyGems. Valentino, who do you got in this struggle?

Valentino Stoll  00:54
To be honest, I haven't been following it too closely.

Valentino Stoll  00:57
I like to stay away from controversy personally, but it seems like there was some mishandling of commit access to the GitHub repositories and the organization. I don't really know who's in the wrong where, but if I had been a longtime committer to a project and just suddenly lost access to something I had committed to for a long time and had made good-faith efforts and contributions, I'd be a little upset too.

Joe Leo  01:27
I think the only thing I know about André Arko is that he has maintained Bundler for 15 years. That is his thing. We had him at NYC.rb and he gave an impassioned talk, I mean, this was years ago, about Bundler, about what he had done, about what the future holds. It was very compelling. And when these kinds of people in our community, who work on these projects as a labor of love, have them taken away, or have the reins pulled without any real reason, without, you know, any accusation of malfeasance or some kind of justification, it really does burn me up. I don't think that's right. And when you dig a little deeper, you find, with RubyGems and Bundler...

Joe Leo  02:20
RubyGems, there's this conflation of the concept of RubyGems, the open source library, and rubygems.org, the website. And I've found that this conflation genuinely confuses a few people, but then is used to mask explanations to a whole bunch of other people. I don't like the way that that is being sent out into the community as justification: well, we need rubygems.org to be this site where you can get all of these gems, and that's great, but that's not the open source library that everybody is maintaining that we're talking about.

Joe Leo  02:55
I find it interesting. There seems to have been a huge push from Shopify. And I understand the stakes of Bundler not being available or RubyGems not being available; those are pretty high. But I don't know of many companies that are better equipped to stem the risks of those kinds of things, of their own accord, than Shopify, right? They have all the people to do it. They had people on standby ready to take over support of Bundler and RubyGems once they kicked the maintainers off temporarily. I think that Ruby Central lost some trust, maybe just some amount of trust, bringing DHH in to speak at RailsConf. And I think the way they did it was not great.

Joe Leo  03:43
But for me, DHH, there's a lot not to like about the guy, but he did invent Rails. And so I'm like, okay, so he's at RailsConf. I don't love it, but I kind of expected it. This I do think is different. This erodes a lot more trust, because it's digging into these people who have selflessly given themselves to the Ruby community for thousands of hours over a decade and a half or more.

Joe Leo  04:08
So let me step off my soapbox for a second. Amanda, what do you think?

Amanda Bizzinotto  04:11
Yeah, I'm with Valentino here. I haven't been following too closely, but it could have been handled better, especially from a communication standpoint. And it's definitely generated some ruckus within the community, that's for sure.

Joe Leo  04:23
All right. I can tell that we're not in comfortable territory for my co -host and my guest. So that's all I'll say at the moment. But trust me, I've got lots more to say. So, you know, find me after the show.

Joe Leo  04:37
Let's get into you, the person. What is the thing that is drawing you into artificial intelligence today? What's exciting you about it? What's your journey like to get there?

Amanda Bizzinotto  04:47
Well, I've got a background in data science and machine learning. I worked a little bit with that before I joined, and it's always been an area of interest. Even though I shifted a little to more operational work, data has always been part of the role for me. And so with the boom that we saw when ChatGPT came out and all the GPT models, it became more and more a day-to-day topic, and it just felt like a natural thing to check out, the natural evolution of that, coming from a machine learning background and working with machine learning models. I had actually worked with GPT-2 before GPT was all the rage. So it's been really interesting and really fun following all this evolution and how things are shifting, and also how it's shaping how we work and what we can realistically do. And honestly, it's making a lot of this technology more accessible than previously, when you had to deal with a whole lot of historical data and train models and host them. There's still obviously a place and a time for traditional machine learning, but it does make some things a lot more accessible.

Joe Leo  06:00
You said something interesting about coming to this from an ML background. And AI, as we know it today, stands on the shoulders of the ML that has been researched and developed and built for years and years.

Joe Leo  06:15
Most people, including I think most developers, don't really experience it that way. They experience it as: we're building an API integration so that we can send our queries to an LLM and parse the results and do something with them. So do you find that disconnect when you come to a sort of traditional boutique consultancy like Ombu Labs?

Amanda Bizzinotto  06:39
Yes, yes. There's definitely a bit of a disconnect, and our team is primarily Rails software engineers. And so for the things that we're building, a lot of the time it is approached in that sense: we need to build an API integration, talk to an LLM, and we'll be able to get what we want out of it. And for the vast majority of what we do with AI and cloud providers, that works. But there are definitely some situations, especially when it comes to optimizing how you interact with an LLM, or really understanding how these models work and what you can realistically do with them, and also when you get to very specific use cases where you might need to fine-tune a model or in some way really customize a large language model, where it really makes a difference to understand what's underneath and the architecture that powers these GPT models, which are all built on top of the transformer. The transformer really hasn't changed a lot since the 2017 paper that introduced it, but there have been some evolutions. And it can really help you get the most out of these integrations in the most efficient way possible.

Joe Leo  07:43
This is probably Valentino territory, but I'm curious to know how the transformer has changed in the years since.

Amanda Bizzinotto  07:50
As I say, it hasn't changed a lot. If you look at the 2017 paper that introduced it, it was conceived as an architecture to power machine translation, right? And so for the GPT models that we have today, it's really just powering generation; we're not necessarily doing machine translation. So you don't need the dual encoder-decoder, right? Mostly you have one and that's it. We have seen some papers recently, some from Meta, for example, with the Llama series, that modified it a little bit to include adapters or routers within the transformer that can then lead to smaller specialized models. So that has been one of the main evolutions. But in terms of the overall characteristics of how it works, how the attention mechanism works, which is what really makes these models so powerful, the backbone is still there.

Joe Leo  08:41
So you're part of a consultancy. I've run a consultancy for 11 years, and so the kinds of things that you and your team experience are probably familiar to me.

Joe Leo  08:53
I'm curious how things have changed over the last two to three years, let's say, since the advent of ChatGPT and since everybody in the world has said, well, we need AI with this, when they may or may not really understand what they mean by AI, and may or may not understand what it means to integrate it into their project. So how do you navigate the varying levels of AI competence and demand among the customers of Ombu Labs?

Amanda Bizzinotto  09:24
To a certain extent, everyone is curious about AI. It's all the rage, everyone's doing it. So we basically have, I would say, three different types of situations. You have people who are curious about AI; they believe they could do something with AI, but they're not sure what. And so there is all that process of finding the right fit: what can AI actually do for you, how can it help you, what's the most efficient way to integrate it? Then we have people who have a very clear idea of what they want to do, they just don't know how, because there's a lot of stuff out there. There are a lot of libraries, there are a lot of new concepts. And so it's like, okay, I think I can solve this problem with AI, I have a clear idea of what I want out of this, but how do I do it? And the third, which I think is probably the most interesting, is people who are very convinced that their problem needs AI, and you need to convince them that it actually doesn't. Because there are some situations where you are actually better off training a classifier, for example, than trying to do whatever you're trying to do with a language model. There are also some situations where a rule-based system is going to give you 95% of the result at 10% of the cost. That doesn't make a lot of sense to rely on an LLM for.
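
As a rough illustration of Amanda's point about rule-based systems, here is a minimal Ruby sketch of keyword-based ticket routing. The categories and patterns are hypothetical, but for many classification problems something this simple covers most cases before an LLM is worth the cost or latency.

```ruby
# A minimal rule-based classifier: route support tickets by keyword.
# Categories and patterns are made up for illustration.
RULES = {
  upgrade: /deprecat|upgrad|rails \d/i,
  billing: /invoice|refund|charge|payment/i,
  bug:     /error|exception|crash|stack trace/i
}.freeze

def classify(text)
  RULES.each { |label, pattern| return label if text.match?(pattern) }
  :other # fall back to a human queue (or an LLM) for the long tail
end

puts classify("Getting a deprecation warning after the Rails 7.1 upgrade") # => upgrade
puts classify("I was charged twice this month")                            # => billing
```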

Valentino Stoll  10:38
That's really funny. Classification is usually the first thing people reach for an LLM for. They're like, oh, it can bucket data for you. And it's like, it's not really great at it.

Valentino Stoll  10:50
But when you get a lot of throughput through it, right? At first it might seem like it's doing a good job, like if you're in ChatGPT or something. But if you've ever used traditional ML classification, even a bad one is good in comparison, yeah.

Amanda Bizzinotto  11:08
Exactly. And also, LLMs aren't great with tabular data, and so if you have a whole lot of historical tabular data, you're probably better off training a traditional machine learning model.

Valentino Stoll  11:19
Interesting. Yeah, so I'm curious: as you start to adopt or onboard a lot of businesses with varying use cases, are there specific use cases that you've seen surface as, oh, this is an obvious choice and an easy recommendation for businesses? Where are you starting to see the most useful applications? I guess it will depend on the kind of business, but.

Amanda Bizzinotto  11:45
It depends a lot on the kind of business, but honestly, one question we get a lot is how we're using AI to improve or make the Rails upgrade process more efficient. At the end of the day, Ombu Labs are the makers of FastRuby, and FastRuby is really focused on Rails maintenance and Rails upgrades. So that is the main use case we get asked about: how can AI help with my Rails upgrades? And so we've been spending a significant amount of time and effort on that, because we truly believe that with the information that we have and with the power of language models, it is very possible to make that process faster and more efficient, even if it is with a hybrid human-agent team that is working on a Rails upgrade. Other than that, there are always the smaller use cases. I say small because they are typically a little bit simpler than this very specialized coding agent, but they can help you save time, especially on very, very manual processes. So things like automating part of your routing process, part of your sales process, part of your marketing process. And we do that for ourselves as well. We've recently built a small agent that basically helps our marketing team with our newsletter. We have a biweekly newsletter, our team suggests links, and as you can imagine, they're fairly technical links, and our marketing team doesn't really follow what's happening in those articles. So now we have a little dashboard that the marketing team can use to organize those links, and an agent in the background that's interpreting each article and generating the snippets for them, saying, okay, here's what you should say, or the main topics. And so it helps us automate that process. It's helped us save a lot of the time that goes into creating a newsletter by reducing the back-and-forth between the marketing team and our engineering team, where we're like, okay, what is this about? Is this summary good? Does this work? It's really helped with that.

Joe Leo  13:37
Yeah, that's cool. I want to go back just a minute to what you said about FastRuby and upgrading Rails. That is where Phoenix started. We started with the idea that, hey, we can have an LLM kind of go at it, maybe not with just one LLM, but we could build some agents that could go end to end and upgrade a Rails project.

Joe Leo  13:56
There are some really gnarly Rails projects out there, and upgrading is a really multifaceted task when you get into all of the different downstream gem versions, all the gem dependencies, all of the tests that need to be upgraded or changed to support the changes that are happening in each dependency, to say nothing of just Rails itself, right? Or Ruby itself and the changes that come with those upgrades.

Joe Leo  14:23
You know, long story short, we're like, okay, well, that's a really tough project. That's a really tough problem to solve with SaaS software, which is what we're going for. So we changed direction, we pivoted a little bit. So I'm curious to know what you've learned when you've tried to shorten the timeline or replace some of the manual work with an LLM or an agent.

Amanda Bizzinotto  14:45
Indeed, it's a pretty complex process. And the way we've been tackling it is we have a very established process that we use to upgrade Rails. And so we've started tackling it step by step. So the first tool that we built is actually the automated roadmap to upgrade Rails. So the way every Rails upgrade starts for us is with a roadmap, which is basically an analysis where two engineers go in, analyze your code base, create that action plan. So the question for us was, can we automate this or at least part of this?

Amanda Bizzinotto  15:17
As it turns out, we can. It's not 100% like a human roadmap, but it gets you 85, 90% there, depending on the project. Now, it's all static analysis, of course, so any warnings that you would only get at runtime, it can't handle. But it is pretty good at that repetitive task that our engineers used to do manually, where you have to take a deprecation warning, look at it, grep the code base, see if whatever is being deprecated is being used in that code base, and then put it in the action plan. Our automated roadmap does that for you. And so it's basically an agent that just does that. It's highly specialized in taking the data that's already processed, so taking the information about a deprecation warning and then using tools that mimic what a human would do to search the code base and try to find usages of it. The way that helps us: we can generate a roadmap for you that includes every single deprecation warning that's raised between two different versions of Rails, but it's a lot. We have some versions where you have 90, 100 warnings. And so the agent can pare it down to the 10 or 12 that are actually relevant for you.
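
A minimal sketch of the kind of tool Amanda describes: given a list of deprecations, search the code base for usages the way an engineer would with grep. The deprecation list and paths here are hypothetical examples, and the real roadmap tool is far more involved; an agent would wrap a tool like this and decide which hits actually matter for the project.

```ruby
# Scan a Rails code base for usages of deprecated APIs, mimicking the
# manual "grep and check" step of building an upgrade roadmap.
# The deprecations listed here are illustrative examples only.
DEPRECATIONS = {
  "update_attributes" => "Removed in Rails 6.1; use update instead",
  "before_filter"     => "Removed in Rails 5.1; use before_action instead"
}.freeze

def find_usages(root)
  hits = Hash.new { |h, k| h[k] = [] }
  Dir.glob(File.join(root, "**", "*.rb")).each do |file|
    File.foreach(file).with_index(1) do |line, number|
      DEPRECATIONS.each_key do |method|
        hits[method] << "#{file}:#{number}" if line.include?(method)
      end
    end
  end
  hits
end

find_usages("app").each do |method, locations|
  puts "#{method} (#{DEPRECATIONS[method]})"
  locations.each { |loc| puts "  #{loc}" }
end
```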

Unknown  16:22
Okay.

Joe Leo  16:23
Now, the way you and I met, actually, was the night at Artificial Ruby when you gave a talk on the AI-powered Slack bot. Initially, as you were just introducing the concept, I thought, all right, this sounds a little vanilla, you know, pour some words into an LLM and see what you get back. But it's not. It actually goes a little deeper than that, and I was impressed by its functionality. So do you want to tell us a little bit about that, and then, since that was months ago, has it evolved since then?

Amanda Bizzinotto  16:56
Yeah, that one was a pretty interesting case. We had a big problem that started years ago, really, with our team members having a really hard time finding information. We have information distributed in a lot of places. So it really started as, okay, can we concentrate this information all in one place and then build a chatbot on top of it, because that's what chatbots are really good at. And so we have primarily three sources of information, right? We have Slack, we have learning channels and all that. We have our knowledge base, where all of our internal information is. We have a blog. And those are the three primary sources of textual information. Then we also started having two additional sources of information last year. One was office hours that are recorded, and then we also have talks that team members give once every month that are also recorded. And finding information in a video recording is painful. You don't know if that's the right recording, you spend 30 or 40 minutes of your life watching a video, and then you get to the end and it's the wrong one. Nobody likes that. And so with the evolution of all these transcription models as well, it just seemed like a pretty good use case for it. So it was a pretty interesting application. We basically got all of this data into one centralized place, so transcribed videos, textual information, Slack information, formatted it somewhat similarly, and then we could do RAG on it and put it in a Slack bot that the team can just access and search. The way it's been evolving: we had to shift priorities, so we really didn't do a lot of follow-up on it for a while, but yeah, we're doing that now. The way it's evolving is to add some additional capabilities to it. Right now, we really just focus on our engineering knowledge base and the technical knowledge. So we're trying to see, okay, can we add other things to this? Can we make it smart enough to separate: okay, this is a technical question, I'm going to go here; this is a support or company policy question, I'm going to go there, but this person is a contractor, so there's some information that they don't have access to; or this person is a team member, but not in an HR capacity, so there's information that they don't have access to. And so it creates this interesting permission problem. And we've also been investing in adding some guardrails, really to help prevent hallucinations. It still hallucinates a little bit, and so we've been adding some guardrails there to help prevent that in situations where it doesn't have data, or where information is too similar for it to identify that it's unrelated. We have some situations where someone's asking a question specifically about Rails 4.2, but then it's pulling information from an article about Rails 6 because it's similar enough.
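
To make the "do RAG on it" step concrete, here is a hedged Ruby sketch of the retrieve-then-answer flow. The `nearest_chunks` helper stands in for whatever vector search you use, the prompt wording and model name are illustrative, and the HTTP call goes through Faraday to the OpenAI chat completions API; none of this is Ombu Labs' actual implementation.

```ruby
require "faraday"
require "json"

# Hypothetical retrieval step: return the text of the stored chunks whose
# embeddings are closest to the question (pgvector, Qdrant, etc.).
def nearest_chunks(question, limit: 5)
  [] # stub; real implementation does a vector similarity search
end

# Answer a question using only the retrieved context, refusing otherwise.
def answer(question)
  context = nearest_chunks(question).join("\n---\n")
  response = Faraday.post("https://api.openai.com/v1/chat/completions") do |req|
    req.headers["Authorization"] = "Bearer #{ENV.fetch('OPENAI_API_KEY')}"
    req.headers["Content-Type"]  = "application/json"
    req.body = JSON.generate(
      model: "gpt-4o-mini", # illustrative model name
      messages: [
        { role: "system",
          content: "Answer only from the provided context. " \
                   "If the context does not contain the answer, say you don't know." },
        { role: "user", content: "Context:\n#{context}\n\nQuestion: #{question}" }
      ]
    )
  end
  JSON.parse(response.body).dig("choices", 0, "message", "content")
end

puts answer("Where are the monthly team talks recorded?")
```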

Joe Leo  19:36
I see. Yeah.

Valentino Stoll  19:38
I really loved this talk. I went back and watched this event. I think it puts into perspective probably the most common use case of trying to extend the knowledge of these LLMs, and you really broke down what all the problems are in a great way, and I really liked how you solved them all. Thank you. So I'm curious, part of the problem is hallucinations, like you mentioned. So I'm curious what you use for those kinds of guardrails, and maybe what your approaches are for grounding information. How do you make sure that what it generates references that knowledge base? What is your strategy for managing all that kind of complexity?

Amanda Bizzinotto  20:25
We've been leveraging a couple of libraries out there; they're primarily Python libraries that we just connect to. There's one called Guardrails AI that offers a really robust system of guardrails that helps you prevent a whole bunch of problems, really, that can happen with LLMs. And one of the things that it helps prevent is hallucinations. It does that by basically allowing you to hook into a natural language inference model, which is one of those models for, you know, those logic problems, right? Given A and B, is C true? It's basically a model trained on information like that. And what it does is it takes in your source data and the generated response, and it evaluates, sentence by sentence, given the information I have, can this sentence be true? And that's how it helps you prevent hallucinations and rethink the response if it cannot be grounded in your source information.
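
Guardrails AI is a Python library, so here is only a rough Ruby sketch of the underlying idea: check each sentence of a generated answer against the retrieved sources and reject anything that cannot be supported. A real implementation would use a trained natural language inference model; the `entailed?` helper below is a naive stand-in so the sketch runs on its own.

```ruby
# Sketch of sentence-level grounding: every sentence in the answer must
# be supported by the source text, or the answer is rejected.

# Hypothetical judge: in a real system this would be an NLI model or an
# LLM-as-judge call; stubbed with a substring check so the sketch stands alone.
def entailed?(sources, claim)
  sources.downcase.include?(claim.downcase.chomp("."))
end

def grounded?(answer, sources)
  sentences = answer.split(/(?<=[.!?])\s+/)
  unsupported = sentences.reject { |sentence| entailed?(sources, sentence) }
  return true if unsupported.empty?

  warn "Unsupported claims: #{unsupported.join(' | ')}"
  false # caller can regenerate the answer or fall back to "I don't know"
end

sources = "Office hours are recorded every Friday and stored in the knowledge base."
puts grounded?("Office hours are recorded every Friday.", sources) # => true
puts grounded?("Office hours happen every day.", sources)          # => false
```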

Valentino Stoll  21:18
Just real quick, what are your go-to models for that grounding fact-checking?

Amanda Bizzinotto  21:23
We've just been using the... Guardrails AI one. So they have this hub of guardrails that are available, and you can just use them directly from their hub. So we've just been using the out-of-the-box one. Another interesting one is, sometimes we want to make the information available to everyone using the bot, but not everyone has access to the same information, right? Because we have multiple teams working on different client projects, and we don't always want to share client information. And so you can use models like Microsoft Presidio to remove any kind of sensitive information or PII from the response, from the data even. So it anonymizes it before even sending it to the language model, and so you don't leak any sensitive information, either to OpenAI or to someone who's benefiting from the technical information there but really shouldn't have access to any private information.
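
Microsoft Presidio is likewise a Python service, so as a stand-in, here is a hedged Ruby sketch of the general idea: scrub obvious PII from text before it ever reaches the language model or the vector store. Real anonymizers use trained recognizers; these regexes are deliberately simplistic.

```ruby
# Redact obvious PII before text is sent to an LLM or embedded.
# Patterns are illustrative; a real system would use a proper anonymizer
# (for example a Presidio service) with trained recognizers.
PII_PATTERNS = {
  "<EMAIL>" => /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  "<PHONE>" => /\+?\d[\d\s().-]{7,}\d/,
  "<SSN>"   => /\b\d{3}-\d{2}-\d{4}\b/
}.freeze

def redact(text)
  PII_PATTERNS.reduce(text) { |scrubbed, (label, pattern)| scrubbed.gsub(pattern, label) }
end

note = "Ping jane.doe@client.example or +1 (555) 010-2030 about the invoice."
puts redact(note)
# => "Ping <EMAIL> or <PHONE> about the invoice."
```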

Joe Leo  22:11
So the PII is stored in this data store. What's the data store?

Amanda Bizzinotto  22:16
Not in the data store. Sometimes we have information in the source, like our knowledge base, for example. And then separate from the source is the vector database that we use, which is Qdrant. And so we don't have PII in Qdrant; we have PII in the original source. And so sometimes when you fetch information from there, there might be a snippet with a comment that has something in there. The knowledge base is a lot more gated, of course, because we have notes, and as projects evolve, they get deleted at the end of the project. But as the project is happening, teams might add information there. And because it's a closed space just for the people who already have access to that information, you might end up having some sensitive information in there.

Joe Leo  22:56
Yeah, of course. That's interesting. And then, you know, it is filtered out before a response is sent through to Slack. Yeah.

Valentino Stoll  23:05
Yeah, so I'm curious what your strategies are around knowledge base updating processes. How do you manage that pipeline reliably?

Valentino Stoll  23:14
You know, you start to update an article somewhere, and then there's some kind of lagging ingestion process, but your clients still need to talk to it the same way. So how do you handle that transition of, like, oh, I'm going to use a different chunking strategy or something? Do you ever change that?

Valentino Stoll  23:35
Yeah, I'm curious.

Amanda Bizzinotto  23:37
So far, we haven't changed it, in the sense that we still use a sliding-window chunking strategy for all of that content. What we do is pre-process the text before it goes into the chunking pipeline. So we have a prompt that basically tells the AI workflow, okay, take this document, analyze it, break it down this way, put it in this format, rewrite it. And then that can go into chunking, because we needed to standardize information that's really coming in a lot of different formats. The way you write a knowledge base article, versus the way you just write information for yourself to remember specific characteristics of something you're currently working on, versus a talk: those are very different. And so we needed to find a common ground there, a way to format this information in such a way that we could have some consistency in the chunking and the vector retrieval, so as not to confuse the retrieval process.
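
A minimal sketch of the sliding-window chunking Amanda mentions, in plain Ruby: fixed-size windows of words with some overlap, so that sentences cut at a boundary still appear intact in a neighboring chunk. The window and overlap sizes here are arbitrary; real pipelines usually size windows by tokens and tune them per embedding model.

```ruby
# Split pre-processed text into overlapping, fixed-size chunks.
# Sizes are in words for simplicity; production pipelines typically
# count tokens instead.
def sliding_window_chunks(text, window: 200, overlap: 40)
  words = text.split
  step  = window - overlap
  chunks = []
  index = 0
  while index < words.length
    chunks << words[index, window].join(" ")
    index += step
  end
  chunks
end

document = "word " * 500 # stand-in for a pre-processed knowledge base article
sliding_window_chunks(document, window: 50, overlap: 10).each_with_index do |chunk, i|
  puts "chunk #{i}: #{chunk.split.length} words"
end
```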

Valentino Stoll  24:36
Yeah, that makes a lot of sense. It aligns with a lot of my experience, too.

Valentino Stoll  24:41
So this is kind of a good transition into prompting strategies and agentic workflows. It seems like you guys have a preference for creating specific agents that perform tasks for you in specific ways. Is that right?

Amanda Bizzinotto  24:57
Yes. For a lot of the things that we do, agents are a good tool, mostly because they can make autonomous decisions. Especially for internal tools, or things that are really just there to save us time and automate processes for us, it's relatively low risk, in the sense that there's nothing in there that needs a high level of security or auditing, for example, that a workflow would give you. And if the agent makes a mistake or hallucinates a little, it's internal; our team is going to catch it, and that's it. So they are a pretty good tool. But we do have some systems as well that are just a one-LLM integration, just a call to a language model directly, because you really don't need an agent. It's just, okay, this is the kind of task an LLM is good for, so we're just going to send it there and get it back. Summarization, for example, is a good one. And we also have some that require a higher level of observability and auditing, where we use workflows instead. So it's not an agent in the sense that it's not autonomous to make decisions; it just follows the steps that we outlined, but it can still use tools and all that. You just know exactly what's going to happen next.

Valentino Stoll  26:01
So can you enlighten our listeners on what specifically you mean by workflows? Because I feel like the industry is trying to normalize some terminology here, and maybe some people are thinking of workflows in a different way.

Amanda Bizzinotto  26:15
Fair enough. I think there's a lot of difference in terms of how these terms are defined. So what I mean by workflow, and the way I've been trying to differentiate workflows from agents with our team, is that a workflow is basically a set of steps that you can execute, right? And you execute them in order. An agent is a little bit less predictable, in the sense that your agent can make decisions as to which tools to use when, for example. So going back to the automated roadmap, for example, that is an agent, because it has different tools available to it, and based on the query it decides, I'm going to use this tool; okay, now I got this result, I'm going to use this other tool next. We have one tool internally that helps us automate reports that we send to our maintenance clients, just reports on the state of the project and all of that. It really is just a tool for our team; it helps them automate part of that process and makes the review easier. For that, we need a higher level of observability, and so what we have there is a set of steps: take this information from here, do this; now go there, take this information, do this; now go there, take this information, do this. So it's not an agent, in the sense that the process has absolutely no decision-making power. It just executes what we tell it, but there are LLMs in each step.
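
A hedged sketch of the distinction in Ruby: the workflow below is just an ordered list of steps, each of which may call a model, so you always know what runs next and can log around every step; an agent would instead pick its own next tool. The step names and the `call_llm` helper are hypothetical.

```ruby
# A workflow: a fixed, ordered list of steps. Each step may use an LLM,
# but the sequence never changes, which makes logging and auditing easy.
require "logger"

LOGGER = Logger.new($stdout)

# Hypothetical helper that sends a prompt to whatever model you use.
def call_llm(prompt)
  "LLM output for: #{prompt[0, 40]}..."
end

STEPS = [
  { name: "fetch_commits",  run: ->(ctx) { ctx[:commits] = ["abc123 Fix flaky spec"] } },
  { name: "summarize_work", run: ->(ctx) { ctx[:summary] = call_llm("Summarize: #{ctx[:commits].join(', ')}") } },
  { name: "draft_report",   run: ->(ctx) { ctx[:report]  = call_llm("Write a client report from: #{ctx[:summary]}") } }
].freeze

def run_workflow(context = {})
  STEPS.each do |step|
    LOGGER.info("starting #{step[:name]}")
    step[:run].call(context)
    LOGGER.info("finished #{step[:name]}")
  end
  context
end

result = run_workflow
puts result[:report]
```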

Joe Leo  27:32
And is there an opportunity with a workflow versus an agent? Maybe it doesn't matter which one, but is there an opportunity for observability and monitoring in a workflow that may not be available in an agent?

Amanda Bizzinotto  27:45
Observability tools have gotten significantly more robust, so I wouldn't say there's necessarily a huge opportunity that comes with workflows that you don't have with agents. But if you take a library like LangChain, for example, or LlamaIndex, and you take the agents that are available out of the box, there are some things happening underneath that you might not get as good a look at. You can still see the process, you can still see the tools that were called or the order of execution, but you're losing predictability. And with a workflow, where you know exactly which steps are going to be executed and when, you have a lot more predictability, and you can then add a whole lot of logging and tracing around each one of those activities, because there's nothing underneath you that you don't control. You control the prompts, you control the outside system calls, you control everything.

Amanda Bizzinotto  28:36
What do you use for observability?

Amanda Bizzinotto  28:39
Right now, we're using Langfuse. It's integrated well with our tools. But our recommendation for anyone already using tools like Datadog, for example, or for really large production systems, is to use those integrated observability tools. So Datadog has an LLM observability tool that's pretty good, and I think New Relic added one as well. It really helps. But we've been using Langfuse. Nice.

Valentino Stoll  29:03
Yeah, I like Langfuse. You have some great articles, by the way, on prompt engineering techniques, and I love your ReAct pattern implementation article as well. If you're interested in learning, these are great examples, because they give the literature on why different prompting techniques are effective and what they're good at. So I recommend people go take a look at those. But I'm curious, when it starts getting more complicated and the reasoning aspects start coming into play, is ReAct something that you would pursue? Do you see other strategies kind of working their way in here? How do you start to think about the systems as you start to introduce reasoning into these workflows or agents? Do you have some thinking mechanisms or a thought process you go through?

Amanda Bizzinotto  29:58
Yeah, absolutely. So ReAct is a really robust choice that we always consider, because it's pretty powerful in the sense that it can integrate tool calling with a chain-of-thought prompt, which is also a pretty robust prompting strategy. And so we basically create an agent that has its own scratchpad, so to speak, that's saying, okay, I need to do this, calls the tool, gets the result; all right, I did this, I got this, what do I do next? And it automates that loop for you. So it's a really powerful one, but you obviously have some other strategies as well that you can use that are pretty good. There's a planner-executor strategy that's really nice, especially when you have very complex workflows. You basically use a planner first that takes a task and says, okay, this is the plan, this is how we're going to execute this task, and then it starts calling sub-agents that are each specialized in one thing and are going to execute that. And the planner can modify the plan as it goes and as it gets more information. So planning is a really, really robust tool as well, especially if you have complex workflows. One thing that we've been looking at more and more recently is multi-agent systems, because as we tackle more and more complex use cases, it gets to the point where, if your agent is trying to do too many things, it's going to start failing more often. And so multi-agent systems can be very effective at orchestrating multiple agents that work collaboratively on complex tasks. And then you can also get creative with what you're doing. We've had situations where you can have, like, a blog writer and then a reviewer that reviews the writer's work, but then maybe you have three writers and each one of them gives you a different version, and now the reviewer is like, okay, which version is best, or how do I compose this? You can have agents that call other agents as tools, so one of the tools available to your agent is a different agent that specializes in a different thing. So these things can be very, very flexible. But yeah, at the end of the day, the ones we've been using the most are either a planner-and-reviewer strategy, so inside the same agent you have a plan, execute, review flow, or the ReAct agent.
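
A minimal sketch of the ReAct loop Amanda describes, assuming a `think` helper that asks the model for either a tool call or a final answer. The tool set, the stubbed decisions, and the response format are all illustrative, not any particular library's API.

```ruby
# ReAct in miniature: the model alternates between reasoning about what to
# do next and acting by calling a tool, with each observation fed back into
# the scratchpad until it produces a final answer.

TOOLS = {
  "search_codebase" => ->(query) { "3 usages of #{query} found in app/models" },
  "read_changelog"  => ->(gem)   { "#{gem} deprecates update_attributes" }
}.freeze

# Hypothetical model call: returns either {action:, input:} or {answer:}.
# Stubbed here so the sketch runs without a model.
def think(question, scratchpad)
  if scratchpad.empty?
    { action: "search_codebase", input: "update_attributes" }
  else
    { answer: "update_attributes is used in app/models; replace it with update." }
  end
end

def react(question, max_steps: 5)
  scratchpad = []
  max_steps.times do
    decision = think(question, scratchpad)
    return decision[:answer] if decision[:answer]

    observation = TOOLS.fetch(decision[:action]).call(decision[:input])
    scratchpad << "Action: #{decision[:action]}(#{decision[:input]}) -> #{observation}"
  end
  "Gave up after #{max_steps} steps"
end

puts react("Is update_attributes still used anywhere?")
```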

Joe Leo  32:18
Yeah, I like that. We evolved Phoenix the same way, you know, starting with a single agent. And then I think planner was the first of the multi-agent chain. Reviewer is actually the most recent, because the reviewer needed its own set of guidelines to say, okay, well, what is a good Rails test, right? And how do we grade them? And so now we have a grader. I guess you could consider that at first the reviewer was something like, well, do these tests pass or fail, right? And let's get the failing ones out of there, because nobody wants that. And then it went into, actually, are these good tests? Let's grade them, right? And then it kind of evolved from there. So that's interesting. And I'm curious to know, we've talked a lot about the work that is going on

Joe Leo  33:01
internally at Ombu Labs. But of course, you also have services where you are building these kinds of solutions for customers. And so have you gotten into some of these multi-agent architectures on behalf of your customers as well?

Amanda Bizzinotto  33:18
Not so far. We're really hoping to get a use case where a multi-agent architecture would make sense, but so far we really haven't had one for a customer. So it's really been mostly just internal tools that we're developing. Indirectly, though, the one significant application that we have with multi-agents is for client-facing work, because one of the questions that we get asked the most is, how are you using AI to make Rails upgrades more efficient? And that's where one of the current efforts is going: into developing basically a multi-agent system that can be your upgrading pair. It's a tool initially to help our engineers perform Rails upgrades in a more efficient way, but eventually we want to grow it into something that we can also offer customers that need to upgrade their Rails applications, as, hey, here's an agent that can help you and guide you through the process. It won't be an agent that can do the upgrade for you, since upgrading is an entire process, but an agent that's capable of building on top of not just the data that's available out there, but also the data that we've accumulated over the years with the amount of upgrades that we've done, and helping you find the best solution to those problems.

Valentino Stoll  34:31
So I just noticed from your examples, Amanda, that you use almost no gems; it's all, like, Faraday. And so I'm curious, are there any Ruby AI gems out there that you do like the interfaces of, especially for building agents and things like this? Are there ones that you see as promising, at least? I'm curious what your thoughts are there.

Amanda Bizzinotto  34:54
So yes, we've started actually keeping track internally of Ruby projects that we find interesting in the space. I had a chance to talk to Justin Bowen about ActiveAgent. I've been looking into that a lot, and we've actually used it for an internal example, and we're hoping to do more with it. So that's one gem that I really like that can help with the agent process. We've used some features of the library formerly known as LangchainRB. Yeah, it's gotten really, really mature as well. And recently, I've also been looking into ways to better integrate DSPy.rb into our projects. So yeah, there's definitely a lot of interesting things out there. To be honest, I think the gems we've used the most have been pgvector, which just helps us do vector search with Postgres, and along the same lines, Neighbor, which integrates that with Rails. A lot of those examples were really built before the projects I mentioned were very mature, with the exception maybe of LangchainRB, which is why we're basically doing the integration, doing everything basically by hand, so to speak. It also helps in the sense that our team is very interested in this AI development, as I'm sure everyone is, and excited to get to work on some of these internal tools as well. And I think it's important to know what's underneath you. And so I've been focusing on that a little bit as well, so that we can shape that knowledge before we actually go and use a library. Right? Like, if you're using a library to build an agent, what actually is an agent? What's underneath this? What's happening? Because we like to give fancy names to things that are actually not that complicated.
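
For readers who haven't used them, here is a brief sketch of the pgvector-plus-Neighbor setup Amanda mentions in a Rails app. The table, column name, and embedding size are assumptions, and the embedding call itself is stubbed out; it only illustrates the general shape of storing and querying vectors with the neighbor gem.

```ruby
# Migration: enable pgvector and store an embedding per chunk of content.
class CreateChunks < ActiveRecord::Migration[7.1]
  def change
    enable_extension "vector"
    create_table :chunks do |t|
      t.text :content
      t.vector :embedding, limit: 1536 # dimension depends on your embedding model
      t.timestamps
    end
  end
end

# app/models/chunk.rb
class Chunk < ApplicationRecord
  has_neighbors :embedding
end

# Hypothetical helper that turns text into a vector with whatever
# embedding provider you use.
def embed(text)
  Array.new(1536) { rand } # stub; replace with a real embedding call
end

# Ingest a chunk, then retrieve the closest chunks for a question.
Chunk.create!(content: "How to rotate credentials", embedding: embed("How to rotate credentials"))
nearest = Chunk.nearest_neighbors(:embedding, embed("credential rotation"), distance: "cosine").first(5)
nearest.each { |chunk| puts chunk.content }
```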

Valentino Stoll  36:29
That's true, and I happen to agree. If you're just summarizing something, maybe you don't need to pull in a giant library like LangchainRB to run that single API call. Well, thanks for sharing. Honestly, we should have you back on just to tear through all of these great libraries and how you're making use of different things, because I think your applications, from a Ruby perspective, are very well put together. So I'd be interested to find out more at another time, for sure. Thank you. Yeah, me too.

Amanda Bizzinotto  37:05
Yeah, absolutely. Happy to. Cool. That'd be great.

Joe Leo  37:08
At the end of each episode, we like to go through and just share one thing that has been really interesting to us, one thing that's either inspired us or just given us something interesting to think about. And so we'll do that this week, and I'll start. I have two. The first is that The Well-Grounded Rubyist, edition four, is going to be released in its early access program. So if you've ever seen this book, any of the first three versions, another one is on its way. And if you are interested in getting a discount on the early access Manning book of The Well-Grounded Rubyist, we will have codes for you in the show notes and in the Substack email. So take a look and pick up a copy. The other exciting thing is that Phoenix is making a public release this month. We are going to have a special Artificial Ruby event on October 15th, where our lead engineer, Steve Bruds, is going to come talk to us a little bit about Phoenix. And there on that day, we will have a private beta signup for anybody that's there and wants to join and check it out, followed by a larger release about a week or two after. But we're really excited about this. Until now, Phoenix has really just been sort of enterprise B2B. We've gotten some great customers and some great feedback. We're ready to open up the doors a little bit and let everybody in. So look for that coming a little bit later this month.

Joe Leo  38:39
All right. How about you, Amanda?

Amanda Bizzinotto  38:42
Cool. Yeah. Well, we just released our automated Rails roadmap, so the Rails upgrade roadmap, two weeks ago. So yeah, it's available, it's free to use, and you can find it on the FastRuby website. I think we've also linked it here in the chat, but yeah, you can use our little automated roadmap. So that's very exciting. And in terms of what's next for us, yeah, like you mentioned, we're working on agents that can potentially pair with one of our engineers and make Rails upgrades a little bit more effective. It's still early stages, but we're pretty excited about it, and we're hoping to be able to test it in a couple of months. So that's been really interesting.

Unknown  39:22
Very cool.

Joe Leo  39:23
And yeah, we'll have that in the notes. We're excited about that.

Valentino Stoll  39:27
Yeah, I'm gonna have to give that a spin and see what kind of plan it makes.

Valentino Stoll  39:31
Yeah.

Amanda Bizzinotto  39:32
Nice, I'm curious. And if you have any feedback, just let us know. Yeah, you got it. Always happy to hear how we can improve it.

Valentino Stoll  39:40
Upgrades are so painful, and creating a roadmap, I wouldn't even have thought of that. I would have just gone and done it and then left notes for somebody, you know, and then just had CI try and keep trying. I love the approach of creating a plan and having something more definitive. I think everybody needs that. So yeah, I appreciate everything that you guys are putting together for that.

Joe Leo  40:08
I appreciate your brute force method, Valentino. I'd like to see it at work. It doesn't really work. Forced upgrades by Valentino Stoll is coming. The forced upgrade gem.

Valentino Stoll  40:22
Brute force your way through GitHub, you know, AI action.

Valentino Stoll  40:29
Yeah, don't do that, please.

Amanda Bizzinotto  40:32
We have had some LLMs suggest deleting our tests to fix them. Oh, yeah, no tests, no failures.

Joe Leo  40:38
Yeah, I know plenty of people that have taken that advice.

Valentino Stoll  40:42
I've seen too many LLMs just be like, oh, okay, I guess I'll just, you know, remove this test because it's not related to the changes I'm making.

Valentino Stoll  40:52
Right.

Valentino Stoll  40:54
I have one to share. Actually, it came from the last Artificial Ruby meetup, and it's Swarm UI. So there's the Claude Swarm project from Shopify, and the same author of that, Arruda, has this Swarm UI, which is a Rails app user interface for Claude Swarm. So if you use the Claude Swarm project at all, which I recommend you also check out, there's now a nice Rails interface where you can create projects and spin up coding agents, run tasks, all in a nice user interface. And it's pretty fun. I haven't gone too deep into it, but my preliminary tests have been successful. So yeah, kudos to him.

Joe Leo  41:40
Very cool. Thank you. Well, I think we've reached the end of our show. Amanda, I want to thank you very much for being such a great guest and enlightening us. And we wish you all the best in the continuing work, especially all of these internal tools and the work that Ombu Labs is doing. You know, these kinds of things are important as we continue to mature as a community and as language and tool builders around AI.

Amanda Bizzinotto  42:10
Thank you. I appreciate it. Thanks for having me. This has been a lot of fun. Thank you.

Joe Leo  42:13
Sure thing. We'll see you again soon. All right, everybody. Thanks for listening. And we'll see you again soon. Bye -bye. Peace.
