Hi everyone, my name is McKay. I'm a software developer and I spend all my time working on various AI projects, and today I'm going to show you how to build my Paul Graham GPT web app here. So what is Paul Graham GPT? It's an OpenAI-powered search tool. Basically it's a Q&A bot that uses OpenAI's text embedding model to embed all of Paul Graham's essays. If you don't know who Paul Graham is, he's a famous tech entrepreneur and investor. He's the guy who started Y Combinator, which is the most successful and most famous startup accelerator, so he's pretty popular in the tech industry. He also happens to be a really great writer, and if I pull up his essays here, he has this whole list of about 200 essays.
He's a really good writer. He writes about all sorts of stuff, a lot of it about startups, so a lot of people find his writing really helpful. What I did was build a web scraper that goes through all of his essays, grabs the text, chunks it into sections, and embeds those chunks with OpenAI embeddings. What that means is we're taking this text and feeding it into a model that spits out a vector, which is effectively just a bunch of numbers that represents the semantic meaning of the text.
So let me just show you how this works really quick. "How do I start a startup?" Okay, so what this is going to do is take our search query and embed it, so we're turning the query itself into a vector of numbers. I'm going to go ahead and hit send. We've uploaded all these chunks of text from the essays into our database, and we run a similarity search function against them to find the passages closest to the query. Then it hits the OpenAI API, specifically the ChatGPT model, which is GPT-3.5, and we inject those passages into a prompt that gives the model the context it needs to answer our query. Let's try another one: "What is a hacker?" Okay, so we get different results here, obviously, because this is a different search. We're not going to build the full version of this, but I'm going to show you how to build all the tricky stuff underneath that makes it work. And if you're curious, I have a repo here. The entire code is open source.
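The search step described above comes down to comparing the query's vector with each stored chunk's vector and keeping the closest matches. Here is a minimal sketch of that idea in plain TypeScript. It assumes cosine similarity as the measure (the video doesn't say which measure the database function uses) and uses tiny made-up three-number vectors in place of real embeddings. The names StoredChunk, cosineSimilarity, and topMatches are hypothetical and not taken from the repo.

```typescript
// Hypothetical shape of a stored chunk: its text plus its embedding vector.
type StoredChunk = { content: string; embedding: number[] };

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query vector and keep the best `count`.
function topMatches(query: number[], chunks: StoredChunk[], count: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, count)
    .map((scored) => scored.chunk);
}

// Toy data: real embeddings have far more dimensions than three.
const storedChunks: StoredChunk[] = [
  { content: "Startups need a good idea and good people.", embedding: [0.9, 0.1, 0.0] },
  { content: "Hackers are makers, like painters.", embedding: [0.1, 0.9, 0.2] },
];
const queryVector = [0.8, 0.2, 0.1]; // stand-in for an embedded search query
const bestChunks = topMatches(queryVector, storedChunks, 1);
```

In the app, the text of the best-matching chunks is what gets pasted into the prompt as context for the model.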
If you want to go through it anytime, or you get lost, I'll have the link for this, so go check it out. About two weeks ago it got pretty popular on GitHub, and a lot of people have been asking me to do this, so I'm going to give it a shot. If you like this video, drop a like. It actually helps me know whether this is worth my time. I really try to go off people's feedback, so if you find it helpful, let me know, and if you have any feedback for me, I would love constructive feedback in the comments. This is my first video here, and hopefully we do a lot more, so this is kind of the trial run. Let's get to it. I'm going to open my terminal that I have right here, and maybe I'll move this off the screen. Hide this, okay.
So what we're going to do is create a Next.js app. What I need to run is npx create-next-app@latest, with TypeScript, since we're going to be using TypeScript. The tech stack here is Next.js with TypeScript, and we're going to use Tailwind CSS for the UI. That's the core of what we're working with. So I'm just going to run this in my terminal. It asks for a project name; we'll just call it embedding-pg, I guess. I'm going to use all the defaults here, which is how I like to roll. We'll sit here for a sec while this installs, and then we move into that directory, so I'm going to cd into embedding-pg. Now we're in that directory and we're going to open it in VS Code, so I'm just going to run code, space, dot, and this opens VS Code. I'm going to trust that real quick. Okay, let's go full screen. From here we'll work in VS Code's terminal and do a few setup things really quick.
First we need to install Tailwind. So we're going to run npm i, for install, and give it the dev dependency flag, which is just dash capital D. For Tailwind we need tailwindcss, we need postcss, and we need autoprefixer. So we need these three things, and then we need to initialize Tailwind. We're going to run npx tailwindcss init -p, and this is going to init Tailwind. You're going to see this tailwind.config.js file, and what we need to do in here is basically just tell it which files are going to have Tailwind in them.
I'm going to copy and paste this, since I have it right here; you just need to write that in this array. Okay, save that. One other thing we have to do really quick is go into the styles directory, into globals.css. We're just going to delete all of this, and then we need to add the Tailwind directives: @tailwind base, @tailwind components, and @tailwind utilities.
Okay, we're going to save that. It's complaining at me here: semicolons. Okay, so now that we have that, we're going to exit out of here and do a couple of quick file cleanup things. Let's get rid of this whole module; we're not going to need that. Now we're going to build the web scraper. I'm going to create a new folder here, and we're going to call it scripts.
It's called scripts. Then in scripts we're going to create a new file. I'll call it scraper.ts, okay, so this is where our scraper is going to live. Actually, I'm just going to call it scrape.ts; that's how I prefer to name it. Then we go to our package.json file and create a new script that we can run. We're going to call it scrape, and it needs to be tsx scripts/scrape.ts. This just lets us run our script like this: if we run npm run scrape, it's going to run that, and you're going to see it says tsx: command not found. We need to install that, so we run npm install tsx, as a dev dependency once again. You should see it pop up here, and then if we run that command again, it is actually working. It just completes right away because the file is empty. So let's make it not empty. We're going to create a self-invoking asynchronous function here using arrow functions. We also need to install the packages that we're going to use for our scraper.
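For reference, a self-invoking async arrow function looks roughly like the sketch below. The body is a placeholder, and assigning the result to a scrapeRun constant is my own addition so the returned promise can be inspected; in a script you would usually just write the parenthesized function followed by () with no assignment.

```typescript
// A self-invoking (immediately invoked) async arrow function.
// Wrapping the script body like this lets us use `await` inside it.
const scrapeRun: Promise<string> = (async () => {
  // Placeholder for the real work (fetching links, essays, and so on).
  const status = await Promise.resolve("scrape finished");
  return status;
})();
```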
We're going to need axios, we're going to use cheerio, and we're going to use a tool called gpt-3-encoder to handle some token stuff. So we'll go ahead and install those. We'll also throw the dev dependency flag on there, and if you look in your package.json file, you'll notice we have all of those installed. Now we're going to import them: axios from axios, then cheerio. Maybe I should really quick explain what all three of these packages are for. Axios just lets us make the HTTP requests we need in order to scrape Paul Graham's website.
Cheerio: the way you need to import it is import star as cheerio from cheerio, and then we'll import the encode function from gpt-3-encoder. We're going to use all three of these. Cheerio, as I was saying, lets us parse the HTML that we fetch; you'll see how this works in just a sec. The first thing we're going to do is create a new variable called BASE_URL. This is just the URL of Paul Graham's website: http://www.paulgraham.com, just like that, and that gives us the base URL of the website. Let me bring his website back in here, drag this down, and throw it over here, because we're going to be doing some stuff with it. I'm also going to open up the developer console and put it at the bottom to give you guys a little more viewing room.
Then we can just inspect the HTML. We're going to go to the essays page, because that's what we're working with. The objective with the scrape is this: we want to take all of the text from each one of these essays and chunk it, in other words split it up into little sections; then we want to embed it with OpenAI's embedding tool; and then we want to upload it to our database. So we're going to build a little scraping tool in this section. The first thing we're going to do is create a getLinks function. It's an async function that doesn't need to take any parameters. First we get the HTML so we can work with it and find a way to parse the links off of this web page. So we create a new variable, html, and this is going to be an axios GET request to Paul Graham's website. What we want here is BASE_URL plus /articles.html, which is the page all these links are on. That fetches the HTML. Then we load the HTML into cheerio, and the traditional syntax with cheerio is to save it in this dollar sign variable, so that's what we're going to use here. And let me actually just log the HTML here.
I just want to show you guys how this works, so let's run the getLinks function. The way we can run this now is npm run scrape, right, that's the script we created in the package.json file. Okay, it's complaining at me. Oh, I accidentally put .tsx in here instead of .ts. Let's change that and try running it again. As you can see, we got all of the HTML from this web page, so that's what we're going to work with. Web scraping is a little bit of a half art, half science kind of thing. I've done my fair share of it, so I can work pretty fast. But if you're new to getting text off of web pages, parsing it, and doing interesting things with it, let me walk you through an example of how I think through this. We need to look through the HTML and figure out a pattern for how the data is stored on this web page. If you go in and inspect it, we have all these different things. We've got this table, and everything seems to live in here. If you open the body, maybe click on one of these links to get a little deeper, you can see how everything is nested. I'm going to skip through this a little because I don't want to get too hung up on the particulars, but the pattern I noticed is that there are these tables. This is the first table, then there are these nested tables, and you can see right here that all of the links live in this table. So I want to get access to this table so I can get the links inside it. The way I'm going to do that is to create a tables variable and use the table selector.
So we have all the HTML stored here, and we pass in the table selector, which just selects the table elements. What's cool is we get this each function, where we can loop over each one of these tables. What I'm going to do is check whether it's the third table: first table here, second table here, third table here with the links. You'll notice this is a 2. That's because the index is zero-based, so the 2 here actually represents the third table. If that's true, we want to get all the links off of it. So we pass in this table here (sorry, these should be parentheses, excuse me) and we find all of the a tags, which are all the link elements.
Then we do a similar thing to what we just did with the tables: we loop over each one of these links and grab the URL off of it. The way you grab the URL is, again, similar syntax: pass in the link and get the href attribute off of it, which is what the link points to. Cool. We also want the title, which is just the name of the essay, and the way you get the text off of the link is to call this .text() method. So this gives us the URL and it gives us the title, and we're going to do a quick check here just to make sure those exist.
I'm going to add one thing here, to make sure the URL ends with .html. If I click on one of these essays, you'll notice every single one of his essays is an HTML file. So we're just going to make sure the link matches that pattern so we don't get any other stray links. If both of those checks pass, we create a new variable called linkObject, and on that object we store the URL and the title. To keep track of all of these, up here (let me clean this up a little bit) let's have a links array, and I'm going to be typing things here.
We're using TypeScript, so we're going to have url as a string and title as a string. Perfect. Then in here we push the linkObject, and once we've looped through all of those and pushed them into the array, we return the links array. So, in summary: for each one of these links, we get the URL and the essay title, save them to this link object, and push the object to the links array, and once that's all done, we return the links array. So if I log the links here and run npm run scrape, what you should see is a giant array of all the links with their titles. Boom. Okay, cool, so we got that. Now we need to go into an essay and get a few things: the title, the date, and the text.
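The filtering logic inside getLinks can be sketched on its own, without the network request or cheerio. In the sketch below, RawAnchor is a hypothetical stand-in for what you would read off each a tag (the href attribute and the link text), collectEssayLinks is a name I made up, and the sample anchors are invented data. The non-empty title check and the trim on the title are defensive choices of mine for this sketch.

```typescript
// Shape of one scraped link: url and title, both strings.
type EssayLink = { url: string; title: string };

// Hypothetical raw input: the href attribute and text of one <a> tag.
type RawAnchor = { href: string | undefined; text: string };

// Keep only anchors that point at an .html file and have a non-empty title.
function collectEssayLinks(anchors: RawAnchor[]): EssayLink[] {
  const links: EssayLink[] = [];
  for (const anchor of anchors) {
    const url = anchor.href;
    const title = anchor.text.trim();
    if (url && url.endsWith(".html") && title) {
      const linkObject: EssayLink = { url, title };
      links.push(linkObject);
    }
  }
  return links;
}

// Invented sample data covering the cases the checks are meant to catch.
const sampleAnchors: RawAnchor[] = [
  { href: "start.html", text: "How to Start a Startup" },
  { href: "index.html", text: "" }, // stray link with no title
  { href: "rss.xml", text: "RSS" }, // not an .html page
  { href: undefined, text: "No href" }, // anchor without an href
];
const essayLinks = collectEssayLinks(sampleAnchors);
// essayLinks -> [{ url: "start.html", title: "How to Start a Startup" }]
```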
So we're going to create a getEssay function, and we want to pass in the url as a string and the title that we got as a string. Again, it's going to be an asynchronous function, and we're going to repeat a few of the earlier steps. We create an axios GET request again. Maybe I shouldn't have named this url, because it's actually just the HTML file name.
We'll just keep that for now. What we need to do is use BASE_URL again (this is just the base URL of the website) and then go to that individual page, which we have stored here. Again, we load the HTML into cheerio so we can parse through things easily. And to start, we're actually going to pause here for a sec and create a types file. Again, we're using TypeScript, so we're going to be good about our types. We'll create a types folder and an index.ts file in it, and we're going to do two things here.
Actually, we're going to do three things here: we're going to create three types. We need to export them, so the syntax is export type. We're going to do PGEssay. Let's create a PGChunk, so this is going to be a chunked portion of the essay. And we're also going to be saving this to a JSON file, so let's just call that one PGJSON. Now let's add all the properties we need. For an essay, we need a bunch of stuff. We'll have the title as a string, the url as a string, we'll grab the date, and we'll save the content.
So this is just the text of the essay, as a string. Then here's some number stuff we're going to need: we want tokens, and that's going to be a number. We also need to save an array of chunks, and chunks is going to be an array of PGChunks. So this shapes the data of our essays. Now let's shape the data of our PGChunks. Each one of these chunks is what gets saved into our database. We're going to be a little repetitive here, because we want to keep this data in our database as well, since we're not passing the essays in. So we want the title again, and we're going to be a little verbose and prepend it with essay_, and the same for essay_url. Let's do the same thing for the date. We don't really need the essay tokens. Content here is just the chunked content, not the full essay content, so let's just call it content.
We're going to want content_tokens, which is a number, and then we want the embedding, which is going to be an array of numbers, so type it like so. Then for the JSON, let's store the total number of tokens (call it tokens) and then all the essays, which is just an array of PGEssays. So we've just defined the data we're going to be working with. Now we go back to our scrape.ts file and create a new essay. I'll use a let here, and we'll just call it essay. We type it as PGEssay; it's going to be an object, of course, and we put all of the properties in as empty values. We're going to run through this really quick: I need the content, I need the tokens, and we need the chunks, right?
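Put together, the three types and the empty essay object described above might look like the sketch below. The exact property spellings (essay_title, content_tokens, and so on) are my reading of the spoken names, so treat them as assumptions. I've left off the export keyword to keep the snippet in a single file; in the project these would live in types/index.ts and be exported.

```typescript
// One chunk of an essay: this is the shape that gets stored in the database.
type PGChunk = {
  essay_title: string;
  essay_url: string;
  essay_date: string;
  content: string; // the chunk's text, not the full essay
  content_tokens: number; // token count of this chunk
  embedding: number[]; // filled in later by the embedding step
};

// A full essay plus its chunks.
type PGEssay = {
  title: string;
  url: string;
  date: string;
  content: string;
  tokens: number;
  chunks: PGChunk[];
};

// Top-level shape of the JSON file.
type PGJSON = {
  tokens: number; // total token count across all essays
  essays: PGEssay[];
};

// Start from an "empty" essay and fill the fields in as they are scraped.
let essay: PGEssay = { title: "", url: "", date: "", content: "", tokens: 0, chunks: [] };

// Later, once the data is known, the fields get real values (sample value here).
essay = { ...essay, title: "How to Start a Startup" };

// The JSON wrapper simply collects the essays and a running token total.
const pgJson: PGJSON = { tokens: essay.tokens, essays: [essay] };
```

Leaving any of the six properties out of the object literal makes the compiler report the missing one, which is the safety net the types are there to provide.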
Just if you're not familiar with TypeScript, the reason we do this is that, for example, if I take out chunks here, we get these nice errors. It helps us avoid bugs in our code and it helps us understand the shape of the data we're working with. It says chunks is missing, so I know: oh, I forgot my chunks right here. That's why we're doing that. Now we want to do a couple of things. We're going to create the full link, so we're just constructing some of the data we'll be working with (let me fix that). Then, just for organization purposes, I'm going to move that like so, and we'll get rid of that. So we fetch the essay and load the HTML in here, and if we inspect the page again, you'll notice a similar pattern: it's the same table thing. Everything is in this table. If we go to another essay (I think this one is a little longer) you can see everything lives in this table, and this time there's table one and table two. So once again we select the tables and loop through them, and this time if it's table number two, so if i equals 1, we want to get the text off of it: parentheses, table, .text(). This gets all of the text in this table element, which includes the title, the date, and the body text. So we need to do a few things to clean up this data, and I'm going to speed through this really quickly.
It's mostly stuff I noticed when I initially scraped the site, mostly spacing issues that we're going to fix. If you want to slow down and think through what we're doing here, I do recommend that. But I'm going to speed up and get this part over with, because this is just cleaning, which is not the most fun thing. So we're going to run a replace on this text, and I actually have the regexes all ready to roll right here, so I'm just going to copy and paste them.
Oh, did I not grab the text? Let's remember to grab our text. Okay, now we can run the replace on this text, and it's going to clean it up a little. I'm actually going to chain one more replace on here, and this one is mostly for some weird double-space things that sometimes happen in the essays; we just want to get rid of those. Next, let's create a variable called split: we take the cleaned text and run a match on it.
Once again, I have another regex. I noticed a pattern for getting the date off of here, based on some newline characters, and that's all we're doing. If I lose you a little bit here, just roll with it; this is just cleaning the text. We want to set a dateStr and then a textWithoutDate string. split is actually an array, so we check whether it exists, and we want to get the first element off of that array and save it as the date. (We want to get that off of split, sorry.) So this is just that date right here, based on how the text comes in from the HTML. Then we want to set textWithoutDate to the cleaned text with a replace run on it.
All this is doing is taking the date off of the text. So we have one string, dateStr, with the date, and then we have textWithoutDate, which is the actual body text of the essay. We have those two things, and we're going to do one more little cleanup step. Oh, I see what I did here: move all of this into the tables loop. Sorry, I knew I had created that earlier. So, one more cleanup step: we just need to remove the newlines. We create a new variable called essayText, take textWithoutDate, and do one more replace with one more regex, which removes the newlines and replaces them with a space. Now we have that body of text. One more thing: we're going to trim it, and all the trim method does is remove the whitespace on either side of the string. So this should be all cleaned up now, and next we're going to reference the essay object from here.
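The regexes used in the video are pasted in and never read aloud, so the sketch below uses stand-ins of my own to show the same sequence of steps: clean the whitespace, match a date, strip the date out, then trim. It differs from the video in at least one way: it collapses newlines up front and finds the date with a "Month YYYY" pattern, rather than keying off newline characters. The function name splitDateAndBody is hypothetical.

```typescript
// Hypothetical helper: split raw table text into a date string and body text.
function splitDateAndBody(rawText: string): { dateStr: string; essayText: string } {
  // Stand-in cleanup: collapse runs of whitespace (including newlines) to one space.
  const cleanedText = rawText.replace(/\s+/g, " ");

  // Stand-in date pattern: a capitalized word followed by a 4-digit year,
  // for example "March 2005". This is an assumption, not the video's regex.
  const split = cleanedText.match(/[A-Z][a-z]+ [0-9]{4}/);

  let dateStr = "";
  let textWithoutDate = cleanedText;
  if (split) {
    dateStr = split[0] ?? "";
    // A string pattern removes only the first occurrence of the date.
    textWithoutDate = cleanedText.replace(dateStr, "");
  }

  // Final pass: tidy any doubled spaces left behind and trim both ends.
  const essayText = textWithoutDate.replace(/\s+/g, " ").trim();
  return { dateStr, essayText };
}

const cleaned = splitDateAndBody("\n\nMarch 2005\n\nYou need three things.\n  Good people.  \n");
// cleaned.dateStr   -> "March 2005"
// cleaned.essayText -> "You need three things. Good people."
```

A pattern this loose would also match something like "Since 1995" if it appeared before the real date, so on real pages it is worth checking the extracted dates by eye.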
We're going to set some of these properties. The title: we've got the title coming in from here. The url is going to be BASE_URL, slash, the file name. Again, I messed up a little calling this url, but just roll with it. We want to save the date as dateStr and the content as essayText. We also want to save tokens, and this is the first time we use the encode function from the gpt-3-encoder package. We want to run encode. Did we not import that? Let's import encode from gpt-3-encoder.
If you're not familiar with tokens and how they work, tokens are basically just chunks of text, and they're the way these large language models process text. So we're going to encode the essay text, and what this encode function does is tokenize it. Basically, the word "one" is one token, the word "kind" is one token, "of" is its own token, and then "opinion" is probably two tokens. If you've used the OpenAI APIs before, you're probably familiar with tokens. To get the token count once we've encoded the text, you just use .length. So we have the token count. We still need that chunks array, and we're going to handle that in the next step, so for now I'll keep it as an empty array. Then, once we get out of this table loop, we need to return the essay. Next we're going to do a for loop over all of these links. I'm going to use an index-based loop so I can show you this.
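The "encode, then take the length" idea described above can be shown with a toy encoder. To be clear about the assumptions: toyEncode below is NOT how gpt-3-encoder works (the real package uses byte pair encoding, so a word like "opinion" may split into more than one token). It just assigns one id per word or punctuation mark so that the shape of the call is visible. Encoder, toyEncode, and countTokens are names I made up.

```typescript
// An encoder turns text into an array of token ids; the token count is the array length.
type Encoder = (text: string) => number[];

// Toy stand-in encoder (not real GPT tokenization): one id per word or punctuation mark.
const vocabulary = new Map<string, number>();
const toyEncode: Encoder = (text) => {
  const pieces: string[] = text.match(/\w+|[^\s\w]/g) ?? [];
  return pieces.map((piece) => {
    const known = vocabulary.get(piece);
    if (known !== undefined) return known;
    const id = vocabulary.size; // next unused id
    vocabulary.set(piece, id);
    return id;
  });
};

// Same pattern as in the scraper: encode the text, then read .length.
function countTokens(text: string, encode: Encoder): number {
  return encode(text).length;
}

const sampleTokenCount = countTokens("One kind of opinion.", toyEncode);
// With this toy encoder: 4 words + 1 period = 5 tokens.
```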
I'm just going to show you the first example here: if i does not equal zero, break out of the loop, so I can show you this with one essay. Let's clean this up just a little: let's say essay, or actually link equals links[i]. And again, the getEssay function takes two parameters, so we pass in the link URL and the link title. Let me show you what this does. We scrape all the links, then we loop through each one, pass it into the getEssay function, and we should get back an object with all of the data we just set. Let me clear this and run npm run scrape. It looks like I made a mistake, so let's debug this really quick. We're not even getting the title. Let's make sure this link is right. We'll do a little live debugging: make sure we're getting the links by logging them. Sure enough, we're getting the links. Oh, you know what? In this conditional in our getLinks function: you can see there's one stray link in here, so we're going to add a little check to make sure the title exists.
As you can see, there's no title here, so this is going to get rid of that one. And now this is going to work. Debugging 101, all right. Okay, cool, so now we're getting that. This is the essay as an object with all of the data: we got the title, the date, the URL, all the text content, and the token count. Beautiful. Now we want to get rid of this check so that we get every one of these essays, and we want to create an array here (we'll call it essays), which is going to be an array of PGEssays. For each one of these, we push the essay into it. If we log our essays here (let's get rid of this log and clean this up a bit) and run this again, we're going to get all of the essays. I'm going to let this run for a sec to show you them all coming in. Actually, I'm going to cancel this, and instead we'll log each essay inside the loop, because the full run is going to take a little bit.
Okay, console.log essay, and now we can see these coming in live. As you can see, it's literally going through this entire list, grabbing the link, going into the web page, parsing through it, and cleaning up the data as we did in the getEssay function, and it pushes the result into this essays array. I'm going to take a break for just one sec, and then I'll be back to show you how to chunk each one of these essays. Real quick: I noticed an error I made, which explains why we got that weird bug with the essay with the empty title. If you look right here in getLinks, I'm passing in tables, and this is supposed to be table. So we want to get rid of the s, and coincidentally I made the same mistake down here in the getEssay function. Get rid of the s, save that, and now everything should work as intended. For the last part here, we're going to create a getChunks function. It's an async function that takes in a PGEssay as its only parameter, and in this function we're going to chunk the text. We're going to split each one of these essays into chunks of about 200 tokens, based on sentence endings.
Because what we don't want to do is this: say this right here (let me highlight) was 200 tokens. We don't want to split the text right here, in the middle. We want to split it at a nice even spot, like a sentence ending. So that's what we're going to do. We need a few values off of this essay, so we're going to destructure it: we take title, url, date, and content off of essay. We just want an easier way to access that data. Now for the chunking. The first thing we do is create an array called essayTextChunks, which is going to be an array of strings, and, as I said, we're going to split.
We're going to chunk this into chunks of about 200 tokens. We run a little if conditional here: we encode the content, and again, to get the token count of an encoded string, you do .length. We need to define our chunk size, so I'm going to make that a global constant up here called CHUNK_SIZE. We'll do 200; you can play around with different sizes, but I've found 200 to be pretty good for a lot of things, so we'll stick with that. We check whether the length is greater than the chunk size. If the token count of the essay is greater, we need to chunk it; otherwise we can just push the whole thing in. Let's handle the scenario where we need to chunk it. We're going to split the content by sentence. I'll make a constant and name it split. We run the split method, which is a string method that splits the string into an array of strings based on the separator we give it. Here that's a period followed by a space, which represents a sentence boundary. We also want to keep track of the chunk text, so we set chunkText to an empty string; we haven't done anything with it yet. Now we loop over this split array.
Then we want to get the sentence. By the way, I should mention that this gray text you see here is GitHub Copilot working. If I just hit tab, it completes that whole thing. It's not always 100% right, so you have to look it over and make sure the AI autocompletion is correct. A lot of this, spoiler alert, is correct, but I want to type through it because I think it helps you see how I do it. So we save the sentence, and then let's save the token length of the sentence; we'll call it sentenceTokenLength, and it's encode of the sentence, .length. We also need to get the chunk text token length.
Again, chunkText is an empty string right now, but over the course of this process that's going to change, so we need to check it. Then we check whether chunkTextTokenLength plus sentenceTokenLength is greater than (not less than) CHUNK_SIZE. If so, we push a new chunk and then reset the chunk text, and we're good there. Then we need one more if statement, where we check whether the last character of the sentence (so sentence.length minus one) matches this. I have another regex here that I'm just going to copy and paste (I know you're loving these regexes), and if it matches, we add the sentence plus a period and a space to the chunk text.
Otherwise we're just going to add a space. Once that's done, we can push that chunk into our text chunks, and we're going to trim these just in case there's some extra white space on either end. We'll do that for both of them. Trimming is just a good way to clean stuff up. So now we have chunked this into approximately 200 tokens. Sometimes it's going to go a little over. It's basically just saying: if adding the next sentence would push the chunk past 200 tokens, start a new chunk; otherwise, add the sentence to the current one. That's all that's doing.
What we need to do now is build the full chunks. If you remember our type, right now we only have the string, the chunk content, and we need to add the other properties to it. We're actually going to do that down here. We create a new array and call it essay chunks. Again, what we have so far is just the content strings, and these are going to be the full chunks, so those are PG chunks. We don't want to set it to an empty array. What we want to do is map over all those strings we got, which is why we called it text. You could call it something like essay string chunks if that makes more sense to you. We loop over that and create a new chunk, which is a PG chunk, and add all those properties. I'm just going to tab through, and it should handle probably everything correctly: content, content tokens, so it's doing the encoding correctly. And if you remember, we have one last property, which is the embedding, and we're going to handle that in the next step, so I'm not going to worry about that one right now. I'm going to save that.
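Here is a minimal sketch of the sentence-based chunking described above. It is not the exact code from the video: countTokens is a stand-in that counts words (the video uses a GPT tokenizer's encode(text).length), the regex is an assumed placeholder for the one that gets pasted in, and the guards against empty sentences and empty chunks are additions.

```typescript
// Stand-in token counter (assumption): counts whitespace-separated words.
// The video takes encode(text).length from a GPT tokenizer instead.
const countTokens = (text: string): number =>
  text.split(/\s+/).filter(Boolean).length;

const CHUNK_SIZE = 200;

function splitIntoChunks(content: string, chunkSize: number = CHUNK_SIZE): string[] {
  // Short essays fit in a single chunk.
  if (countTokens(content) <= chunkSize) return [content.trim()];

  const chunks: string[] = [];
  let chunkText = "";

  for (const sentence of content.split(". ")) {
    if (sentence.length === 0) continue; // guard added for this sketch

    // Start a new chunk when adding this sentence would pass the limit.
    if (
      chunkText.length > 0 &&
      countTokens(chunkText) + countTokens(sentence) > chunkSize
    ) {
      chunks.push(chunkText.trim());
      chunkText = "";
    }

    // split(". ") removed the period, so put it back unless the sentence
    // already ends in punctuation. The regex is an assumed placeholder.
    const lastChar = sentence[sentence.length - 1];
    chunkText += /[a-z0-9]/i.test(lastChar) ? sentence + ". " : sentence + " ";
  }

  if (chunkText.trim().length > 0) chunks.push(chunkText.trim());
  return chunks;
}

// Turn the plain strings into chunk objects (a trimmed-down PG chunk).
type TextChunk = { content: string; content_tokens: number };

const toChunkObjects = (texts: string[]): TextChunk[] =>
  texts.map((text) => {
    const content = text.trim();
    return { content, content_tokens: countTokens(content) };
  });
```

A single sentence longer than the chunk size still becomes its own oversized chunk, which is why some chunks go a little over the limit.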
I'll return that chunk. Okay, so now we have all those chunks saved to essay chunks here, and what we need to do is return the essay chunks. Looks like I have a little syntax error here that we've got to sort out. Okay, these brackets match up. Looks like it's a parenthesis here. Yep, there we go. And what we're actually going to do is one more check. One thing we don't want is a scenario where it chunks through the whole essay and then the last chunk is just this tiny little string.
So we're just going to add a little check here, because we don't want any chunks to have fewer than 100 tokens. We check that essayChunks.length is greater than one. If there's more than one chunk, we want to make sure none of them are under 100. We're going to need to loop through those: i equals zero, i is less than essayChunks.length. So we loop over that and get the chunk, and we also get the previous chunk. Then if the chunk's content tokens is less than 100 and the previous chunk exists, we want to add it onto the previous one. So we take the previous chunk's content, and we can use plus-equals to append to it: a space plus the chunk's content. Then we need to adjust the token count on the previous chunk, and we need to remove the small chunk from essay chunks, since we just combined it. That should be good. Now, we're actually not returning essay chunks; we're returning the essay itself.
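As a rough sketch of that merge step: the version below folds any chunk under the minimum into the chunk before it. It builds a new array rather than removing items from the array while looping over it, which is a swapped-in approach that avoids index bookkeeping; the Chunk type is a trimmed-down stand-in for the PG chunk.

```typescript
// Trimmed-down stand-in for the PG chunk type.
type Chunk = { content: string; content_tokens: number };

const MIN_TOKENS = 100;

function mergeSmallChunks(chunks: Chunk[], minTokens: number = MIN_TOKENS): Chunk[] {
  const merged: Chunk[] = [];

  for (const chunk of chunks) {
    const prev = merged[merged.length - 1];

    if (prev && chunk.content_tokens < minTokens) {
      // Too small to stand alone: append it to the previous chunk.
      prev.content += " " + chunk.content;
      // Summing the counts is an approximation of re-encoding the joined text.
      prev.content_tokens += chunk.content_tokens;
    } else {
      // Copy so the caller's objects are not mutated.
      merged.push({ ...chunk });
    }
  }

  return merged;
}
```

A small first chunk is left alone, since there is no previous chunk to merge it into.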
So we're going to create the chunked essay, which is a PG essay, and we're going to use the spread operator to spread in the essay that we passed in as a parameter right here. If you remember, we left chunks empty there. So now we fill that chunks array with all the chunks we just made, and then we return the chunked essay. Okay, so let me show you an example of a chunked essay. Let's just go with a random one here. So I say: if i does not equal, let's say, seven, break out of here. Then I run npm run scrape to show you what a chunked essay looks like. Once again, we're going to do a quick debug here. I obviously did something wrong. Oh, the reason this is happening, and this is actually funny, is that it's literally breaking out of the loop on the first iteration.
Let's just do: if i equals seven, log it. We'll do it the lazy way. Okay, and I'm actually not getting the chunks in here, which is because we're getting the regular essay. We need to use the chunked essay, so: chunked essay, await. We pass the essay we got here into the get chunks function, it returns the chunked essay, and now, instead of pushing the unchunked essay, we push the chunked essay. So this time it should work. Okay, good, I was getting a little worried there. So these are all the chunks, and you can see this essay has a lot of them. That's the total count for the essay up here. Then we go to chunks, and you can see each one of these is a nice manageable size: 178, 193, 181. They're all pretty close to that 200 limit we set.
Okay, and now we can get rid of that log. We're going to save this to a JSON file. So this is going to scrape through all of them: get the links, pass each link into the essay function, scrape all the data off of that essay, chunk the essay, and then push it to the final essays array. Then we create a JSON object, which is the PGJSON type, this one right here, and that's going to be an object. Let's see, what did we say? We were going to store the tokens. The way we get the tokens here is with a reduce, so we do essays.reduce.
It just goes through all the essays, grabs the tokens property off each essay, and adds it all up. So we're getting a final count of the total tokens across all the essays, and then we're saving the essays. Then we want to write this out, so we're going to use the file system. Let's go import that: import fs from fs. Now we have access to the file system here, and if I tab here we can see it's saving it as a JSON file. Where we want to save it is scripts slash, let's call it, pg.json. So I'm going to run my scrape here, npm run scrape, and this is going to give us all of the data we need and save it to a JSON file in scripts. I'm going to run that, and I'll be right back. This should work for you. I'll make sure it does. Let's see if we get our pg.json. Well, what do you know, we already got our JSON. Anyone know why that is? It's because we're only getting one essay. We need to get all of them.
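The reduce step from a moment ago can be sketched like this. The Essay and EssayJSON types are simplified stand-ins for the PG essay and PGJSON types, and buildJson is a hypothetical helper name used only for this example.

```typescript
// Simplified stand-ins for the PG essay and PGJSON types.
type Essay = { title: string; tokens: number };
type EssayJSON = { tokens: number; essays: Essay[] };

// Hypothetical helper: sums the per-essay token counts and bundles
// the total together with the essays.
function buildJson(essays: Essay[]): EssayJSON {
  const tokens = essays.reduce((total, essay) => total + essay.tokens, 0);
  return { tokens, essays };
}

// In the scrape script the object is then written to disk, roughly:
//   fs.writeFileSync("scripts/pg.json", JSON.stringify(json));
const json = buildJson([
  { title: "Essay A", tokens: 10 },
  { title: "Essay B", tokens: 32 },
]);
console.log(json.tokens);
```

Starting the reduce at 0 means an empty essay list produces a total of 0 instead of throwing.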
So get rid of that check. If you want to test it, make sure you're running the entire for loop. If I log the length of links, I think it's 215 essays. Yeah, 215. Okay, so we're going to get all 215 of those, and I'll be right back with the updated JSON. Here we go, we got all the data we wanted. Look at all of these tokens: 609,817 tokens worth of essays, and it's a very big file. So what we're going to do next is create our embedding script. We need to take all of this text, get the embedding for it, and upload it to our database. In scripts, we're going to create a new file, embed.ts. Then we go back to our package.json and create a new script. We'll call it embed, and this one's going to be tsx scripts/embed.ts. We're also going to create a .env.local file. These are the environment variables that you're going to need, and you need three of them. You need one for OpenAI, OPENAI_API_KEY, and two from Supabase. The first is NEXT_PUBLIC_SUPABASE_URL. The second is SUPABASE_SERVICE_ROLE_KEY, and do not put NEXT_PUBLIC on this one. NEXT_PUBLIC allows a variable to be exposed to our front end, and we do not want that here. And what you need to do, if you have not already done so, is fill these in.
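If you want the scripts to fail fast when one of those three variables is missing, a small check like the one below is one option. It is not part of the video; missingEnvVars is a hypothetical helper, and it takes the environment as an argument so the sketch stays self-contained.

```typescript
// The three variable names described above.
const REQUIRED_ENV = [
  "OPENAI_API_KEY",
  "NEXT_PUBLIC_SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
] as const;

// Hypothetical helper: returns the names that are unset or empty.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]);
}

// In a script you would pass process.env after the env file has been loaded:
//   const missing = missingEnvVars(process.env);
//   if (missing.length > 0) throw new Error("Missing: " + missing.join(", "));
```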
You need two things. First, you need to create an OpenAI account so you can get an API key. Once you create your account, you can go create your key and paste it in right here. I'm not going to show you that because I'm not going to expose my key, so go ahead and do that if you haven't. The other thing you need to do is go to supabase.com and create a Supabase account.
If you're familiar with Firebase, Supabase is very similar to that. If you're not, it's basically just a super easy way to get set up with a Postgres database. Of course, they handle some other things too, like authentication and whatnot, but essentially we want to use Supabase to host our database. They provide a very easy UI for their tools, it's very friendly for beginners, and it's powerful for advanced users as well, so it's just a great tool. They also have a really generous free tier, so you get like two projects with a ton of usage. You're not going to have to pay anything for Supabase here; it's going to be totally free. So go create an account and come back and I'll show you what to do. Okay, so if you've gone ahead and created your account, it's going to bring you to your project screen, and it may ask you to create an organization. Then what you need to do is hit new project.
I'm actually not going to create a new project, because I already have mine set up. This is just my embeddings playground here. So I'm going to open this up, and on the left you're going to see this little sidebar. You want to go into the SQL editor, and this is where we're going to set up our database. You can see I already have some snippets here from other open source projects I've been working on. So you're going to hit this new query button, then hit this dropdown and rename it.
I'm going to do all caps here. We'll just call it PAUL GRAHAM TEST. I just want an easy way to tell it apart from my production one. So we rename it, and now we're going to run all the SQL that we need to run. The first thing we need to do is enable a new extension in Supabase. They support it natively, which is really nice. All you have to do is type create extension vector, and this enables a Postgres extension called pgvector. This is what's going to allow us to store embeddings in our database.
Once you type this in, you can either hit run right here or press command enter. I'm going to do command enter, and you're going to see it fail for me, because I've already done this before, but it should work for you. Then we need to create our table. So we're going to write some SQL here. We want a create table, and we'll call this paul_graham_essays, maybe.
Actually, I'm just going to do paul_graham, which is a little shorter to work with. Okay, inside those parentheses, we need an id. It's going to be a bigserial, and it's going to be the primary key of this table. The rest is going to be very similar to, in fact identical to, the PG chunk, because again, that's what we're storing in our database. We've got essay_title. We've got essay_url, which is text. We have essay_date, that's text as well. content, that's text. content_tokens, that's text. Oh, I'm sorry, that's not text, it's going to be a bigint. And then we need our embedding, which is going to be type vector, followed by 1536 in parentheses. OpenAI's embedding model gives us a vector of this size, so you can think of it as an array with 1536 numbers in it. We're just explicitly telling the database that's what to expect. Okay, if we run this, we see success. If you go to the sidebar again and open your table editor, you're going to see your table. So now we have that paul_graham table. Let me make this bigger; we're going to go full screen for a sec. It says RLS is not enabled, and I recommend that you turn this on. It's going to bring you over here, and you just hit enable RLS. This stands for row level security, and it helps keep your database secure. Once that's done, I'm going to pop back into the SQL editor and go back to side view. Now we can get rid of that, since we created our table. What we're going to do next is create our embedding-based similarity search function.
So we're going to create a Postgres function: create or replace function. We're going to call this paul_graham_search, then parentheses. This function takes three parameters. First, our query embedding. This is the embedding that we pass in, which, if you remember the example I showed you at the very start of this video, is that search query vector. Next we need a similarity threshold as a parameter, and that's a float. And then we need a match count, which is going to be an int. What the similarity parameter does: when you take the embedding of your search query and compare it to the embedding of one of these chunks, you get a number. It's basically from zero to one, and it measures how similar your query is to the chunk. Closer to one means more similar. If it's 0.9, that's pretty similar, and if it's 0.1, it's very dissimilar. The threshold sets the minimum similarity a chunk needs in order to be returned. And the match count is just how many results we want to return.
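To make the threshold and match count concrete, here is a small in-memory illustration in TypeScript. It is not the Postgres function itself, just the same idea applied to a plain array: compute cosine similarity against the query, keep rows above the threshold, sort by similarity, and cap the number of results. The Row and Match types and the function names are made up for this sketch.

```typescript
type Row = { id: number; content: string; embedding: number[] };
type Match = { id: number; content: string; similarity: number };

// Cosine similarity: dot product divided by the product of the lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function searchRows(
  rows: Row[],
  queryEmbedding: number[],
  similarityThreshold: number,
  matchCount: number
): Match[] {
  return rows
    .map((row) => ({
      id: row.id,
      content: row.content,
      similarity: cosineSimilarity(row.embedding, queryEmbedding),
    }))
    .filter((match) => match.similarity > similarityThreshold) // threshold
    .sort((x, y) => y.similarity - x.similarity) // most similar first
    .slice(0, matchCount); // match count
}

const demoRows: Row[] = [
  { id: 1, content: "same direction", embedding: [1, 0] },
  { id: 2, content: "unrelated", embedding: [0, 1] },
  { id: 3, content: "partly related", embedding: [1, 1] },
];
const demoMatches = searchRows(demoRows, [1, 0], 0.5, 5);
console.log(demoMatches.map((match) => match.id));
```

The database does the same work inside Postgres, so only the top matches travel back to the app rather than every embedding.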
Okay, so this is going to return a table: returns table. This is just all the data we want to get back. Basically, we're going to repeat what we did in that last create table step, so I'm going to run through this real quick: essay_date text. Trying to think, what else do we have? content, and content_tokens, which is a bigint. And lastly, we're going to get the similarity. This is the similarity that we're calculating here, and that's going to be a float. Okay, then language plpgsql, as, dollar sign, dollar sign, begin. This is just some syntax we need in order to create the function. Then we need to select from the paul_graham table. We need to get the id, and we're just matching these so that they correspond to the right columns. Be really careful when you're doing this, because I see a lot of people run into errors and it turns out these didn't match up correctly, so just make sure you're matching them up correctly. So we need essay_title. We need paul_graham.essay_date. Oh see, I would have made a mistake here.
This one is the URL: paul_graham.essay_url. Now we'll get the essay date, then content. I know this is a lot of typing, but this is just the boring part we've got to grind through. Content tokens. Feel free to fast forward through this part. Don't fast forward this next bit, though, because this is how we get the similarity. We do one minus, in parentheses, paul_graham.embedding, which is the embedding of the text chunk, then the <=> operator, then the query embedding, as similarity. That operator is pgvector's cosine distance, so one minus the distance gives us the cosine similarity. That's from the paul_graham table, and now we need our where clause: where one minus paul_graham.embedding <=> query embedding is greater than the similarity threshold. Is that the syntax we used? Yeah, similarity_threshold. Now let's order this by that same distance to the query embedding. We're almost done here, I promise. Then we limit it by the match count: the match_count parameter we pass in is the limit on the number of results. Okay, and we need to tell this to end. Now, if we run this and save the changes here, it should work. Okay, I clearly made a syntax error. It's because I need a semicolon here, I think. Yep. Okay, so now we've created our paul_graham_search function. Again, all this stuff we just typed out is just calculating the similarity between the search query that's coming in and the embedding of each PG chunk. So let's get rid of that for now and hop into our embed file. The first thing we need to do is call loadEnvConfig, which loads the environment variables into the script so that we can use them. To use it, we need to install the package, so we run npm i @next/env -D.
It's a dev dependency. Boom. Okay, so let's head in here. If we save that, it should auto import for us. Next we create a function called generateEmbeddings. That's going to be an async function, and it takes in all of the essays, so PGEssay array as the type, then an arrow function. Close that. We're also going to create a self-invoking async function down here, like so.
The first thing we need to do is load that JSON file in here. So I'm going to write const json, typed as PGJSON, and we import the file system, fs from fs. It's going to be this right here: JSON.parse of fs.readFileSync on scripts/pg.json. That's where the file is, and utf8 is the encoding. Then we call await generateEmbeddings, grabbing the essays off of that JSON object. Very cool. We also need to install OpenAI in here, so we run npm i openai. Once again, it's a dev dependency. And then we need to configure OpenAI.
So it's const configuration equals new Configuration, and it takes in our API key: apiKey, which we get from process.env.OPENAI_API_KEY. Again, you should have put your API key in that environment variable. Save that. Actually, we need to import Configuration from openai. We're also going to need OpenAIApi, like so. Then we create a new variable called openai, which is a new OpenAIApi with our configuration passed in. We also need to install Supabase, so we run npm i, and the package name is @supabase/supabase-js. Then we create our Supabase client with createClient. You have to pass in two things here, and these are the two Supabase environment variables that we set up: NEXT_PUBLIC_SUPABASE_URL and, I'm trying to think what the other one was, SUPABASE_SERVICE_ROLE_KEY. And we're going to do a little TypeScript magic: we put an exclamation point after each of these, just to let it know that the value exists. Then we have to import createClient from Supabase. Perfect. All right, so we've got a lot of that setup done, and now we need to loop through the essays. I'm going to get the essay, and then we need to loop through the chunks of that essay, so essay.chunks.length. Make sure you use a different index variable here; don't reuse i.
Now we need to get the chunk. What we're going to do here is create an embedding request, so we call openai.createEmbedding, and this takes two parameters. First we give it a model. The embedding model you want to use is text-embedding-ada-002. Then we give it the input, which is our content, so chunk.content. So that's the embedding response, and the way you grab the embedding off of it is with this syntax here.
We need to destructure the embedding off of embeddingResponse.data.data. Then we can upload it to Supabase: await supabase, and I'm going to go to a new line here for formatting. It's from, where you pass in the table name, so paul_graham or whatever else you named it. We do an insert, give it an object, and now we give it all these properties: the essay title, the essay URL, the essay date, the content, the content tokens, and the embedding. That'll save our chunk, with the embedding added onto it, to our database. Then we want it to return the row, so I'm just going to make sure we get that data back in case we want it. This needs to be a string. Okay, very cool. Let's do one quick error handling thing.
If there's an error, we log the error. Otherwise we want to log the data. Actually, let's do it this way: let's log saved, i, j, so we can keep track of this as it's happening. Then, to avoid possible rate limits with the OpenAI API, we're going to await a promise like this. All this is doing is running this code and then waiting a second. This is in milliseconds, and I'm actually going to bring it down to 300 milliseconds. If you run into an error when you're embedding stuff, it might be the rate limit, so you might need to increase this, but I've never had an issue with it here. Okay, and this should do everything we need it to do. One thing I'm going to show you really quick: let's double back and make sure these environment variables are set up. Once again, you need your OpenAI API key, and I'm about to go get mine and paste it in here. For the Supabase ones, the place you access these is in your project. So let's go back over to Supabase on the right, go to project settings, and then go to the API tab, and you're going to find everything you need. The first one is your project URL. This is public, so don't worry, you don't have to keep it secret from anybody; it's often exposed to your client and whatnot. So copy that and paste it here. Then the service role key: this one you do want to keep secret. Treat it like a password. Don't commit it to git; keep it to yourself.
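Going back to the embedding loop, its overall shape looks roughly like the sketch below. To keep it self-contained, the OpenAI and Supabase calls are passed in as embed and save functions, which are hypothetical stand-ins; in the real script they would wrap the createEmbedding call and the Supabase insert. Errors are handled with try/catch here, which is a simplification of checking the error returned by the insert.

```typescript
type Chunk = { essay_title: string; content: string; content_tokens: number };
type Essay = { chunks: Chunk[] };
type Row = Chunk & { embedding: number[] };

// Resolves after ms milliseconds; used to stay under rate limits.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function generateEmbeddings(
  essays: Essay[],
  embed: (text: string) => Promise<number[]>, // stand-in for the OpenAI call
  save: (row: Row) => Promise<void>, // stand-in for the Supabase insert
  delayMs: number = 300
): Promise<number> {
  let saved = 0;

  for (let i = 0; i < essays.length; i++) {
    const essay = essays[i];

    // Use a different index (j) for the inner loop over chunks.
    for (let j = 0; j < essay.chunks.length; j++) {
      const chunk = essay.chunks[j];

      try {
        const embedding = await embed(chunk.content);
        await save({ ...chunk, embedding });
        saved++;
        console.log("saved", i, j);
      } catch (error) {
        console.log("error", error);
      }

      // Pause between requests; raise delayMs if you hit rate limits.
      await sleep(delayMs);
    }
  }

  return saved;
}
```

Because a failed chunk is logged and skipped, one bad request does not stop the rest of the run.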
So I'm about to go get my OpenAI key, then reveal this, copy it, and paste it here, and I'll be right back. Okay, I put the OpenAI key and my service role key into that env file and saved it, so now this should be working. In Supabase, we're going to navigate back over to the table editor and into that paul_graham table. Assuming there are no errors, and there might be one, in which case we'll have to debug, we should be able to run npm run embed and start seeing all of our chunks populating our database. So let's run that. Okay, it's saving.
So everything's working; we set everything up correctly. Let's refresh and go full screen here for a sec. As you can see, everything is coming in exactly how we wanted it. All that hard work we did, setting up this table, running those SQL scripts, scraping all of those essays, chunking them, and all that stuff, is now paying off. We're getting that data in here. You can see we're getting the embedding, which is an array of numbers. We're getting the content, which is the chunked text, and we're getting the content tokens. We're getting every single thing that we wanted in here.
If I refresh, we're going to see even more. Okay, so I'm going to go back to side view, and you can see this is just running through the whole thing. This is going to take a bit, since we're running through literally hundreds of essays. And just as a small note, these embeddings do cost OpenAI credits, so you will get billed, but embeddings are very cheap. It's the cheapest thing you can do with OpenAI's API. I want to say that for a dollar you can embed something like a thousand pages.
It might even be more. It's super cheap, so I think this whole thing is probably going to cost like 10 cents. So I'm going to let this run, and then I'll be back and we can finally get to building the actual UI for this thing. Congrats on hanging in there. If you encounter any errors, again, I have the GitHub repo link. Just look through those files and make sure all your syntax is correct. And if you run into any huge bugs, you can always reach out to me in the comments below, or you can find me on Twitter at McKay Wrigley and hit me up. I try to get back to everybody as fast as I can. So if you run into errors, let me know. Otherwise, let this embedding script run all the way through, and you can keep watching your chunks get uploaded into your database. I'll be back. Okay, your script should have run successfully, and if we go full screen here, you should have your table with all your embeddings. As you can see, we have a ton of records here to work with.
One thing we need to do really quick is one more step with SQL. We're going to go back into our SQL editor and create an index with pgvector on our table. So we write create index on our table name, which is paul_graham. I'm running into some autocomplete stuff here. Okay, new line: using ivfflat, then in parentheses the embedding column and vector_cosine_ops, and then you want to type with lists equals 100.
All this is doing is creating an index on our table, and it helps speed up the similarity search that we're going to be doing. So you want to run that. I'm running it right now. It just has to run through all those rows and create the index. It should be successful. If you get an error, maybe you forgot a semicolon, so just make sure the syntax is exactly like this. And now we should have everything we need to go create our front end. The first thing I'm going to do is install two packages that we're going to need. These are not dev dependencies, so don't put that -D flag here. You want eventsource-parser, then a space, and then a package called endent. eventsource-parser is going to help us handle streaming, and endent is just a little utility for template strings that cleans up some spacing, which we'll be using for our prompt. Once you have that, we're going to do a couple of things. First, we go into our pages directory, then into the api directory, and rename hello.ts to search.ts. This is where we're going to create the API endpoint that handles our similarity search. It's where we fetch the chunks that are most similar to the embedding of the search query we pass in.
So we're going to get rid of all this. The first thing we need to do here is configure our runtime. We create a config object and set runtime to edge. With Next.js and Vercel's hosting, you can take advantage of edge functions. They're super snappy and the latency is really low, so you end up with really fast API responses, which is great for streaming. So we're going to take advantage of that here. Next we create a handler. It's going to be async, it takes in a request parameter, which we'll type correctly, and it returns a promise with a response. Then you need to export that as default. There we go. You'll see you're getting a little error here. That's just because we haven't returned anything yet in our function. So we need to set up a try-catch block. Oops, I accidentally typed parentheses here instead of brackets. If our function fails up here, we just want to return an error.
Let's just have our message be Error, and that's going to be a 500 for the status. Then here we want to handle the search. The first thing we need to do is get the parameters that are coming in off of the body. In this case we're just taking one, which is the search query, and it comes in as a string. So let's grab that. It's going to be req.json, with query as a string. That's the only parameter we're passing in here. Then we need to create an OpenAI API response. If we open up our embed script, we're going to do basically the same thing here, just with slightly different syntax, because we're not going to be importing that package in here. If you want to use it, feel free, but you'll see that we don't use it for streaming, so I just don't want to confuse anybody. We're just going to do a basic fetch request, and the endpoint we want is v1/embeddings.
We need to pass it a couple of things. It's going to be a POST request. We set some headers, so the content type will be application/json, and then we pass in our API key, which, again, if you set it in your .env.local file, you should have accessible like so. Except there's no underscore there. We're just going to let TypeScript know it's there. All right, that looks good. Then we just need to pass it a body. If you remember, we pass it two parameters: the model, so let's grab that model, and the input, which is what we want to embed.
So we're going to stringify that body and pass the model in right there, the text-embedding-ada-002 model. As our input, the thing we want to embed is the query, so we pass that in and close that up. Then we need to get the JSON off of that response. Perfect. And then we need to get the embedding off of that JSON, which is going to be json.data, and it comes back as an array. So you want to get the first item in that array and then get the embedding off of that object. Okay, so now we have access to our embedding. The next thing we're going to do is access Supabase. What I'm actually going to do is go into the root of my project and create a new directory called utils, and then a new file called index.ts. In here I'm going to import createClient from Supabase.
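The embeddings request from the search route can be sketched as two small helpers: one that builds the fetch options and one that pulls the embedding out of the response JSON. The helper names and types are made up for this example, and the network call itself is left as a comment so the sketch runs without an API key.

```typescript
const EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings";

type EmbeddingRequestInit = {
  method: "POST";
  headers: Record<string, string>;
  body: string;
};

// Hypothetical helper: builds the options passed to fetch.
function buildEmbeddingRequest(query: string, apiKey: string): EmbeddingRequestInit {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    // The two parameters: the model and the input to embed.
    body: JSON.stringify({ model: "text-embedding-ada-002", input: query }),
  };
}

// Shape of the part of the response this sketch reads.
type EmbeddingResponse = { data: { embedding: number[] }[] };

// The embedding sits on the first item of the data array.
function extractEmbedding(json: EmbeddingResponse): number[] {
  return json.data[0].embedding;
}

// Inside the route handler this would be used roughly as:
//   const res = await fetch(EMBEDDINGS_URL, buildEmbeddingRequest(query, apiKey));
//   const embedding = extractEmbedding(await res.json());
```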
We're going to need to export this, because we want to use it in the API route. We're going to call it supabaseAdmin, and the reason I'm calling it that is just to remind you that you're using your administrative service role key in here, which you want to keep secret. It's just like what we did in the embed script, where we created the client. We're basically doing the same thing here.
I'm just naming it admin, which you could do in there as well, just as a reminder to keep this nice and secret. Don't expose that. Cool. So now we can import that here, and what we're going to do is grab the data off of this Supabase call. We'll refer to it as chunks, since chunks are what we're fetching, and we'll be consistent with how we name those. We want to grab the error off of there too, just to handle that. Then you want to do await supabaseAdmin, which is the supabaseAdmin we just created and are importing, and we're going to make an RPC call off of it. This takes two arguments. The first is the name of the SQL function that we created.
So if you remember back (not this step, but the one before it) where we created our database function: if you want to make sure that exists and that you actually created it, you can click over here on Database and then click into Functions to see the database functions you've made. You should have created this one, the Paul Graham search function, which had those three parameters. What we need to do is let it know we want to call this function, and we give it a second argument, which is an object with those three parameters. The query embedding is just going to be the embedding we just created from our query. Then there's the similarity threshold, which you can play around with. I'll just do 0.5 for now, but basically the higher this is, the stricter you're being about how similar a passage needs to be to the query, and vice versa. And then we have the match count parameter. We'll fetch five of those. That sounds good.
What we want to do now is make sure we check for an error. If there is an error, we'll just return that, and let's log it so we can debug it more easily. Otherwise, we want to stringify these chunks, because that's what we want to send as the response, and then we'll return a 200 status, which is a successful response. Okay, so that's our API handler for fetching the chunks we need: it performs a similarity search from our query and returns the five most similar passages.
Now we need to create one more file in our API folder, and we're going to call it answer.ts. This one is going to be set up similarly, right? We're going to need this edge function configuration, and then we want the same kind of handler syntax.
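Going back to the search handler for a moment, the database call and error handling described above can be sketched as follows. To keep the sketch free of dependencies, `RpcClient` is a small hypothetical stand-in for the one method used on the Supabase client; in the real route you would call `supabaseAdmin.rpc(...)` directly. The function name `"pg_search"` and the snake_case parameter names are assumptions, so substitute whatever you used when you created your database function.

```typescript
// Shape of what an RPC call resolves to in this sketch: data plus an optional error.
interface RpcResult {
  data: unknown;
  error: { message: string } | null;
}

// Hypothetical stand-in for the single client method this sketch needs.
interface RpcClient {
  rpc(fn: string, args: Record<string, unknown>): PromiseLike<RpcResult>;
}

async function searchChunks(
  client: RpcClient,
  embedding: number[]
): Promise<{ status: number; body: string }> {
  // First argument: the SQL function name (assumed). Second: its three parameters.
  const { data: chunks, error } = await client.rpc("pg_search", {
    query_embedding: embedding,
    similarity_threshold: 0.5, // higher means stricter matching
    match_count: 5, // fetch the five most similar passages
  });

  if (error) {
    console.error(error); // log it so it is easier to debug
    return { status: 500, body: "Error" };
  }

  // Send the chunks back as a JSON string with a 200 status.
  return { status: 200, body: JSON.stringify(chunks) };
}
```

Returning a plain `{ status, body }` object is just a way to keep the sketch testable; in the edge function you would wrap the same values in a `Response`.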
So I'm just going to go ahead and copy that, and of course don't forget to export default that. What this is going to do is handle the chat API request to OpenAI and stream the result back to the client. So we're going to do another try/catch block here, identical to the setup we just did, and in this one we need to get the prompt, because that's what we're going to pass in to this request. We're going to await req.json() and pull prompt off of it. Again, this is going to be a string: we're going to construct the prompt on the front end and then pass it in. What we need to do now is pause here and go back into the utility file that we created.
So, under this supabaseAdmin, what we need to do is create a little function that's going to handle the request to OpenAI and then handle parsing and streaming the tokens that we get back. So we're going to create a new function, and this is going to be OpenAIStream. It's going to be an async function that takes in one parameter, which is prompt, like so. We need to do really two things here. The first thing is to create a new API request to OpenAI, and in this case we want to hit the chat completions endpoint, so we're going to do v1/chat/completions, and this is going to be a POST request. We've got to set the headers again (JSON content type, pass in our API key), and then the body in this one needs a couple of things. We need the model, so we want to do gpt-3.5. This is the model that's used in ChatGPT and so on, so we're going to use that. It also takes a messages parameter. Messages is an array of objects, and each of these objects has two properties: a role, which can be system, user, or assistant, and a content, which is just the text of the message.
What you normally want to do is start it off with a system message, and this is effectively the system prompt, kind of like your base prompt. So what we're going to do here is come up with some instructions that tell the chat model what we want. Let's go with something like: "You are a helpful assistant that answers queries about Paul Graham's essays." Let's change this to backticks and make it a template string so we can handle that apostrophe. And let's say: "Respond in," I don't know, "3-5 sentences." This is just a really basic starter prompt. Feel free to make it more complex if you want, but we'll roll with that for now. Then we're going to pass it a second message, which this time has a role of user. So this is us, and in this case the content is going to be the prompt. That's what we're going to pass into it. Okay, and then we need just a couple more things here. We want to set a max tokens limit, and somewhere in the 100 to 200 range is good. I'm just going to do 150, because we don't want our responses to be too long or too short.
So that sounds good. We need to set a temperature, and I'm going to set it to zero. Temperature basically determines how deterministic your responses are. For example, with zero here, if I pass in the same prompt every time, it's pretty much going to give me the same response every time, whereas if my temperature was higher, say as high as 0.9, the responses I get would be different every time. So that's kind of how that works. Then we need to set one more parameter here: we need to set stream to true, because we want to stream this. Okay, let's check that we actually get a good response. If response.status does not equal 200, so if it's not a success, we'll throw an error. Otherwise we want to handle the stream. To do that, the first thing we're going to do is create an encoder and a decoder using these constructors. These are just going to help us handle the streaming of the tokens.
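The request body described above (model, messages, max tokens, temperature, stream) might be assembled like this. This is a sketch, not the exact code from the video: `buildChatBody` is a hypothetical helper name, and the system prompt wording is a cleaned-up version of the one dictated above.

```typescript
// Each message has a role and some content.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatBody(prompt: string): string {
  const messages: ChatMessage[] = [
    {
      // The system message acts as the base prompt.
      role: "system",
      content: `You are a helpful assistant that answers queries about Paul Graham's essays. Respond in 3-5 sentences.`,
    },
    // The user message carries the prompt built on the front end.
    { role: "user", content: prompt },
  ];

  return JSON.stringify({
    // Full model id; plain "gpt-3.5" is not a valid model name.
    model: "gpt-3.5-turbo",
    messages,
    max_tokens: 150, // keep answers from getting too long
    temperature: 0.0, // same prompt should give (nearly) the same answer
    stream: true, // ask for tokens as they are generated
  });
}
```

The resulting string goes in as the `body` of the POST request to the chat completions endpoint, alongside the same JSON and API key headers used for the embedding request.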
And then what we want to do is create a new stream. So we're going to create a variable called stream (sorry, not string stream), and this is going to be a new ReadableStream. What we pass into this is a start function, and start takes in this controller here. Inside it we're going to create an onParse function, which takes an event that is either a parsed event or a reconnect interval. In this onParse function we want to check whether the event type is an event, so we'll do if event.type equals "event". If it's an event, we want to get the data off of it, so we create a new variable called data and get the data off of that.
The first check we're going to do here is whether data equals [DONE], in brackets like this. The reason we're checking for this is that the way the OpenAI API works with streaming, once all the tokens are done streaming, it sends this, and then you know you can terminate your stream. So if the stream is finished, we want to close it and then return. Otherwise your stream would just never close. If that's not the case, then we want to try to see if there's a chunk on here. So we're going to parse the JSON off of that data, and we want to get the text off of that JSON. That's going to be in the format json.choices, which is an array, so you get the first item in there and then do delta.content. That gets the text. Then we need to create a new variable called queue, which is just how the stream handles sending each token. So we're going to encode that text, queue it up, and then tell the controller to enqueue it. Enqueue is a weird word to spell. There we go. And then we'll just catch the error there.
In the catch, we want to have the controller handle the error: controller.error, and pass in the error. (I'm having a tough time spelling here.) Okay, once we've done that, outside the onParse function you create a parser. So we're going to use this createParser, and we're going to pass our onParse function into it. createParser and all of these things are coming from the eventsource-parser package that we installed. That's what that package is doing: it's handling all this stuff for us, which is really nice. Okay, then we're going to loop over the chunks: for chunk of res.body. I'm going to cheat a little bit and just type this as any. Actually, I think I used response here, didn't I? Yeah, response.body. And now we want to feed this to the parser, so we're going to do parser.feed with decoder.decode of the chunk. You decode each chunk, and this handles our stream. So, in summary, at a basic level this handles each token as it comes in and creates a stream that we can consume on the client. Then we just want to return our stream. That's going to do all the magic we need to get the response from the API and handle the streaming, so that we can get some good UX going on.
So we're going to go back into our answer.ts file in the API, and now we can utilize the stream function that we just created. We're going to create a new variable here called stream (seems fitting), and that's going to await OpenAIStream, passing in the prompt as that parameter. If you remember, we were passing in the prompt as a string.
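To make the onParse logic above more concrete, here is a simplified, dependency-free sketch of what happens to the streamed text. It is a stand-in for the eventsource-parser setup, not a replacement for it: `extractTokens` is a hypothetical name, and it assumes it is handed complete `data:` lines. Real network chunks can split an event in the middle, which is one reason to let the parser package handle this.

```typescript
// Pull the streamed text pieces out of server-sent-event lines.
function extractTokens(sseText: string): { tokens: string[]; done: boolean } {
  const tokens: string[] = [];

  for (const line of sseText.split("\n")) {
    // Only "data:" lines carry a payload; skip blank lines and anything else.
    if (!line.startsWith("data:")) continue;
    const data = line.slice("data:".length).trim();
    if (data === "") continue;

    // The API sends [DONE] once all tokens have been streamed.
    if (data === "[DONE]") return { tokens, done: true };

    // Each payload is JSON; the text lives at choices[0].delta.content.
    const json = JSON.parse(data);
    const text: unknown = json.choices?.[0]?.delta?.content;
    if (typeof text === "string") tokens.push(text);
  }

  return { tokens, done: false };
}
```

In the real OpenAIStream function each piece of text is encoded and handed to `controller.enqueue`, and the `[DONE]` case calls `controller.close()`, rather than collecting tokens into an array.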
Okay, and now you want to return it so that our client can consume the stream. So we're going to return the stream. Awesome. Now that we've got that ready to go, we can actually go build the user interface for this. So let's go do that. We'll close all these files for now, I'm going to go over to a localhost:3000 tab, and we're going to run npm run dev to start up our development server. And you're going to see an error here. If you remember, earlier on in the video we deleted this Home.module.css file. So we need to go into our index.tsx file, and you'll see we're importing that and it doesn't exist. Let's get rid of that. We'll save, and now styles is not defined. So what we're going to do is delete everything in this main tag (I'm going to lower this a little bit).
So we'll just get rid of everything in there and, sweet, that's working. Let's give it a title really quick. I'm just going to call it Paul Graham GPT. Let's give it a description: "AI Q&A on PG's essays." Okay, so you can see our title's up there now. What we need to do now is get rid of this Inter font thing, and then I want to create the UI for this. We're going to need a couple of state variables. We need a way to keep track of the state of our query. We need to keep track of the answer, which is the response we're going to get back from the API. We need to keep track of the chunks, which are what we'll get back from our similarity search. And let's add a loading variable, just for a little bit of UX. useState should have been imported automatically.
Just as a small note, if you don't have auto imports set up, you can just import it like so. Okay, everything looks good here. We're actually going to type this: we're going to bring in our good friend PGChunk and let it know that these chunks are supposed to be an array of PGChunks. Then we're going to create a function that's going to handle our similarity search. So let's create a function, handleSearch, and this is going to be async because it's going to hit our API, and we need to do a couple of things.
So let's get the response off of this. You know what I'm actually going to do? I'm just going to call this handleAnswer. We're going to do everything in one function. I think that's going to be a little bit easier for you to follow. So the first variable is searchResponse. The first thing we're going to do here is fetch the passages most similar to our query, so we're going to create a fetch here: await fetch. The way you hit your API endpoints in a Next.js project like this is you do /api, since we're in this directory right here, and then slash the file name without the extension. So we're going to hit the search file. Then we need to pass in our method, which is POST, since it's a POST request.
I don't need to pass in anything else there besides the headers, but we do need to pass in our body, and our body in this case is just going to be that query. Just to confirm, the search route is expecting this to come in as query, as a string, which is exactly what we're sending here.
Now let's do a quick check to make sure that this response came in okay. So if searchResponse is not ok, let's throw an error about that. No, actually, we're going to do a return so that this function just stops. It's a little cleaner. Otherwise, we want to get the results, and the results are going to come in as an array of PGChunks. Again, if we go back into our search API file here, sure enough, we're sending those chunks on the response. Okay, so we're grabbing those results, which are an array of PGChunks, and what we want to do now is set our chunks state to the results. So now we have access to those chunks once they've loaded in, and the way we can confirm that this is working is to just log those results. First of all, though, we're going to need a way to call this function and a way to input our query. So let's go write some TSX code here. What we're going to do is just create an input. I know the actual site looks a little bit fancier than this. I'm going to keep it super bare bones, because the goal of this video is to show you how this works, not to make it look pretty.
If you are interested in some more advanced front-end stuff, let me know in the comments, because I'm more than happy to do a follow-up video where I teach how to do all of that too. If you want a more front-end-heavy video, I'm happy to do a part two. Just let me know. But for now I'm going to keep this bare bones. Okay, the input is going to be type text. The value is going to be our query (again, we set up the query state variable here), and then in the onChange we need to set the query to the value, and that should be good. It's not showing up here. It's probably a style thing. Yeah, see, we're getting some weird styles.
I'm just going to have Tailwind autocomplete come up with some styles. Maybe we'll just throw a border on that. Let me make sure really quick that this is all coming in correctly. Okay, everything's set up fine. We'll make our text black, just so it's a little bit easier to see when we type in here. Actually, what we really want is a black border. Gray works too. Okay, so we have an input here, and then we need a button.
I'll just use these Copilot-suggested classes here, and when this is clicked it's going to run our handleAnswer function. That's all it's going to do. We'll label it Submit, and then we'll close that one. Again, I know this doesn't look very pretty, but we're just trying to get this to work. What we're actually going to do is flex, flex-col, so this is a column. Okay, looks nice.
Maybe we even set a width. I don't know what 200 pixels looks like. Let's do 350. Perfect, so that's going to work. What we want to do now is go over to our developer console and check that we're getting the search results when this runs. So let's go ahead and type in: how do I start a startup? And before we run this, let me just explain what's happening. We've typed that into our input.
We're accessing that with query here, which is going to be passed in to our search request. handleAnswer gets called when we click the button. Then we hit the search endpoint, which performs a similarity search on our database over here. So what should happen is we get an embedding of this text, it takes that embedding and performs a similarity search against the embeddings of all of these passages, and then it returns the five most similar, assuming they're above this similarity threshold. Okay, so let's see if everything we've built is working. What do you know? We get the five most similar passages. As you can see, the similarity on this first one is 0.868, and again, that's on a scale from zero to one, so this is pretty similar. That's a good result, right? If we go to the next one, you're going to see it's a little bit less. And if we go to the last one, these are all above 0.85, which is pretty good. And we're getting all the data that we stored in our database, which we can now use to inject into the prompt.
Right, because that's kind of the whole point of this: we want to get these results because we want to inject them into the prompt so we can get a good answer. Okay, so now we're going to create the call to our answer endpoint. What we need to do here is, instead of searchResponse, we'll call it answerResponse, and it's the same syntax: /api/answer, since that's the file we want to hit, POST request, set the header, and then this body is not going to be query.
If you remember, we called this prompt. So we're going to pass in prompt, and you're going to notice (you've probably figured this out by now) that prompt doesn't exist, because we haven't created it. So what we're going to do between these two steps is create our prompt. If you remember, we installed that endent package. So we're going to prepend endent to the template string, and we'll import that after this. All endent does is let you use template strings with new lines like this, and it cleans up some of the white space on those lines. If you passed in this string without endent, say if I just did test, test on separate lines, then instead of receiving the text cleanly it would have all this leading white space from here. So it just cleans that up a little bit, which is a good thing to do when you're creating a prompt. So we're going to create our prompt, and we're going to say: "Use the following passages to answer the query:" and inject our query in here. Then what we want to do is take the results that we got here and map over them. So let's go results, and again, template strings, which are just these backticks, let you inject variables into a string using this dollar sign and curly brace syntax. That's what we're doing here. Hopefully you've caught on to that by now, since it's what I've been using throughout this whole video.
What we want to do is map over that, and for each one of these chunks we want to return the content, and then we want to join them with a new line. So for each one of the chunks we fetched, all five of them, this grabs the content, and we basically just create a list of them separated by new lines.
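The prompt construction described above can be sketched as a small function. This is a minimal version under a couple of assumptions: `buildPrompt` is a hypothetical name, the chunk type only includes the `content` field used here, and it uses a plain template string with explicit `\n` characters rather than the whitespace-trimming tag from the video, so that it runs without any packages.

```typescript
// Only the field of each chunk that the prompt needs.
interface ChunkLike {
  content: string;
}

function buildPrompt(query: string, chunks: ChunkLike[]): string {
  // Grab the content of each chunk and join them with a new line.
  const passages = chunks.map((chunk) => chunk.content).join("\n");

  // Instruction and query first, then the passages as context.
  return `Use the following passages to answer the query: ${query}\n\n${passages}`;
}
```

For example, calling `buildPrompt(query, results)` right after the search request gives the string that is then sent as `prompt` in the body of the answer request.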
So I will log that prompt to show you what happens here. I'm actually just going to comment that out so we don't make that call, clear the log, and if I run it again, you'll see this is our prompt. So: "Use the following passages to answer the query," there's our query, and then we added all of our passages in here. That's basically the context the model is going to use to produce its response. Okay, so now we're going to fetch the response here, and all we have to do after that is handle the streaming. Let's do a check, similar to what we did above, to make sure the answer is coming in, and then we have to do a couple of things to handle the stream. What we're going to do is create a reader, and it's going to be data.getReader(). Sorry, I forgot to access the data off of the response body. We've got to get answerResponse.body.
Okay, and this is complaining to us that data is possibly null. So what we can do is a quick check: if not data, return. That'll go away. Sweet. Then we need to create a decoder for this, which is the same thing we just did in our utility function for the OpenAI stream: just a new TextDecoder. And we're going to create a variable here called done, which is a Boolean that tracks whether the stream has finished.
If we go back into our utils, it's effectively checking for this [DONE] case. We're just going to see if the stream closed. So, while not done, meaning while we still have tokens coming in from the stream, we want to get the value and then we want to get done, which we'll refer to as doneReading. That's going to be await reader.read(), so we read from the stream here. We want to set our done variable to doneReading, and the reason we're renaming it to doneReading is that we've already declared a variable called done. Then we want to get the chunk value. This is the text of the chunk, and again, the chunk here is just referring to the streamed-in text. Then you want to set the answer, our state variable. What this syntax says is: take the previous value of our state and add the new value to it. That runs through our whole stream, and when the stream is complete and gets closed, instead of returning false here, the read is going to return true at the end. Then this variable becomes true and it kicks us out of the while loop, so we don't get an infinite loop, and that handles our answer.
What we can do down here to show you that is create a div. I'm going to give it a margin top of four just for a little bit of spacing. And we're going to check: if it's loading, we want to put a loading div in there just to show that it's loading. Otherwise, we want to put in our answer, and then we just close that div.
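The read loop described above can be sketched on its own. To keep it runnable outside the browser, this version is generic over the chunk type and takes a callback instead of calling React state setters directly; `ChunkReader` and `readStream` are hypothetical names. The `ChunkReader` shape mirrors the `read()` method of the reader you get from `data.getReader()`, so that reader should be usable here, though that has not been checked against every TypeScript configuration.

```typescript
// The one method of a stream reader that the loop relies on.
interface ChunkReader<T> {
  read(): Promise<{ done: boolean; value?: T }>;
}

async function readStream<T>(
  reader: ChunkReader<T>,
  onChunk: (value: T) => void
): Promise<void> {
  let done = false;

  // Keep reading while tokens are still coming in from the stream.
  while (!done) {
    // Rename done to doneReading because done is already declared above.
    const { value, done: doneReading } = await reader.read();
    done = doneReading;

    // The final read reports done with no value, so skip it.
    if (value !== undefined) onChunk(value);
  }
}
```

On the client, the callback would decode the bytes and append them to state, along the lines of `readStream(data.getReader(), (bytes) => setAnswer((prev) => prev + decoder.decode(bytes)))`.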
Okay, and then we also need to handle the loading state in here. We want to set loading to true at the start of this, and at the end we want to set loading to false. If we do this, we should now have a fully working little proof of concept here where, if I type: how do I start a startup?
It should fetch the most similar chunks from our database. It should inject the text content of those chunks into our prompt. It should feed that prompt into our answer request to get our answer, and then it should give that back to us as a stream that we decode here and display to the user as it comes in. So let's try a different one. Let's do: how do I raise money? Hit submit, it's going to be loading, and, as you can see, I'm getting an error. So we probably just did something wrong here. Let's go check and see what we did wrong. We're getting a 500 error. So let's do a couple of things. Let's make sure our prompt is coming in.
Okay, we'll do some debugging, and notice how this loading state didn't resolve. So what I'm also going to do is a little UX fix here: in each one of these early returns we want to set loading to false if there was an error. If I copy and paste this, refresh, and run it again, it's going to fail, and it'll get rid of that loading indicator. That's just a quick UX thing we'll do as we debug this. I accidentally exposed my API key while debugging, so I'm re-recording this last bit. I went through and tried to figure out what the error was, and I found it super quickly. It was a tiny little thing. What we need to do is go into our utility file, where we do the OpenAI stream, and look at where I have the model specified. This model is actually called gpt-3.5-turbo, so you have to add this -turbo on here, and now we have the model correct. Basically, what was happening is we were requesting a response from a model that doesn't exist.
So now this should work. If I do "what is a hacker," we should be getting a response here. It's loading. Cool, we got our response, and, sure enough, this is a pretty good result. You'll notice that this didn't visibly stream, though. So, as the final step, I'm going to show you how to build a nice little component in here that can handle streaming with a nice little fade-in animation. We're going to create a new folder called components, and in there we're going to create a new folder called Answer.
I'm going to create an Answer component. It's just going to have two files: Answer.tsx and answer.module.css. In our Answer component we're going to create an interface to type the props we'll be passing into this component. It's going to receive one prop, text, which is a string. We're going to export this component, Answer. This is going to be a functional component in React with props, like so, and we're going to destructure text off the props there. Then we're going to do a few things. The first is to keep track of the words coming in from this text in a state variable, and then we want to do a useEffect, like so, with text as a dependency.
So every time the text changes, we set words to text.split on spaces. We're just getting an array of all the words in here, because we're going to do an animation off of that. Then we're going to throw a div in here and map over these words. So we're mapping over this array, which holds each word of the text we passed into this component, and we're going to return a span with its key set to the index. We're going to pass in (sorry, let me put this on a new line) a className, which is going to equal styles.fadeIn, and we're going to create that in our module file. So here, let's create our fadeIn class. It has an animation of fadeIn over 0.5 seconds, ease-in-out, forwards. This is just a little animation we're creating. We want the opacity to start at zero, in other words we don't want it to show at first. And you have to add these little keyframes for the fade in, starting at opacity zero and going to opacity one. That's all you need there. What you do need to do, though, is import styles from that module (and that's a lowercase a, oops). styles.fadeIn, there we go. We're also going to do a little custom inline styling here, where we want to add an animation delay, and the delay is just going to be equal to the index times 0.1 seconds. You can play around with this number right here and this number right here to get different fade-in speeds. This is the combination I thought looked best, which is the only reason we're using it. And we want to put the word in here with a space after it, because we need to separate those words.
So now we need to go back into our index.tsx file, and instead of rendering just the raw answer here, we want to render the Answer component, and we have to pass in answer as the text prop right here. Save that, and now you can see we're getting this beautiful animation that's streaming this in.
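The per-word delay logic inside the Answer component can be sketched as a plain function, separate from React, so it is easy to see what each span receives. `toAnimatedWords` is a hypothetical helper name; the 0.1 second step matches the value chosen above, and in the component the `delay` string would go into the span's inline `animationDelay` style.

```typescript
interface AnimatedWord {
  word: string; // the word plus a trailing space, ready to render
  delay: string; // CSS time value for the animation delay
}

function toAnimatedWords(text: string, stepSeconds: number = 0.1): AnimatedWord[] {
  // Split the text into words, then give each word a delay based on its position.
  return text.split(" ").map((word, index) => ({
    // The trailing space keeps words separated when spans sit side by side.
    word: word + " ",
    // Later words start fading in later: index times the step.
    delay: `${index * stepSeconds}s`,
  }));
}
```

A smaller step makes the whole answer appear faster, and changing the 0.5 second animation duration in the CSS changes how long each individual word takes to fade in.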
So that's a brief summary. I say brief; this is probably like a 90 minute video, and I'm probably at like two hours at this point. But let me know what you think. You should now have all the tools you need to create this for all sorts of different data sets. You can scrape text from PDFs. You can scrape text from websites. Maybe you want to use Wikipedia or something. Anything you want.
Maybe you want to use podcast audio transcriptions or YouTube videos. The possibilities are endless, but this shows you how to do all of the steps, from coming up with the data set, to embedding your text, to doing the streaming, to building a basic UI. Again, as a challenge, you can make this a lot prettier. Let me know if you want me to follow up with a user interface video and do things like make the loaders prettier and make this prettier. I have the little settings screen and whatnot in the production version of this. If there's anything I didn't cover in here that you would like me to cover, just let me know, and if enough people want it, I will a thousand percent do it. So hopefully you found this informative. Again, this is the first video I've put out. My personal preference is the live coding style as opposed to a more edited version.
But I know these can get a little bit long-winded. So if you'd prefer that I really cut things down or skip over things and not live code it, let me know, and maybe I can do two versions every time I do one of these videos, because I do plan on trying to do at least two of these per week. So again, just give me some constructive feedback, and I'm happy to make you all happy. I hope you found this video helpful. If you appreciated this tutorial, please drop me a like. I would really appreciate that. Again, it's feedback for me. And if you have any questions, the best way to reach out to me is in the comments or on Twitter. My handle is @mckaywrigley. Happy building. I hope you all can use this to build some cool stuff. If you build anything cool with it, let me know. I'd love to see it. Take care, guys.