If you follow technology and programming news, then you're already aware of this new tool called GitHub Copilot. If you have no clue what GitHub Copilot is, don't worry. In a moment I'm going to go through exactly what it is and what it's supposed to do. Before we jump in, just to set the theme: this is absolutely not a tool that I'm ever going to use, and it's not a tool that I can in good faith recommend that anybody use. I have several reasons why that is, and I'll be going over all of them in this video. But first, let's talk about what GitHub Copilot is. The way GitHub Copilot is described on its home page is: it's "your AI pair programmer."
It says: "With GitHub Copilot, get suggestions for whole lines or entire functions right inside your editor." About a week ago, somebody posted this in my Discord server, and my first reaction when I saw GitHub Copilot was: that's kind of interesting. But this was, of course, before I had any idea how any of it worked. So they have a little GIF animation down here which shows how it works. You type some comments, you type the function signature, and then just magically a bunch of code appears. You can then go on to either accept the suggestion that it gives you or, I guess, toggle between some other suggestions. So you might be wondering: where does this code actually come from? Well, the answer is quite simple: it comes from GitHub. When they made this machine learning model, they trained it on millions of open source repositories from GitHub, and this, in my mind, is where the first problem exists. GitHub Copilot is most assuredly going to disrespect open source licenses, and when those licenses are violated, we don't know on whom that liability is going to fall. It potentially could be you, the user of GitHub Copilot. There are a couple of entries in the frequently asked questions that address this.
The first is: "Does GitHub Copilot recite code from the training set?" And in here they do say that about 0.1 percent of the time it will recite something directly from the training set. That's verbatim, I think. What's being ignored here, though, is the other 99.9 percent of the time. It's saying that it may take a bunch of copyrighted pieces of material, put them together, and give you that as a new thing. But the problem is that doesn't produce a new, uncopyrighted piece of material. That's just a derivative work that contains several pieces of copyrighted material. The other entry in the FAQ that addresses this is "Who owns the code GitHub Copilot helps me write?", and they say in here that you own it entirely. This again seems like an obvious problem. If GitHub Copilot is making derivative works from copyrighted material and giving them to you, then you do not own them. There have been a couple of people who have compared this to open source code laundering, and I thought that was an interesting analogy for it.
The second reason why you shouldn't use this is that it's going to be another crutch that you'll depend on, and you might actually have to pay for it.
This tool is not going to make people better programmers, although it doesn't actually claim to do that. This is really no different from copying and pasting from Stack Overflow. This is just the hyper-integrated version of that, and, to be sure, Stack Overflow is a crutch too. Anytime you can just search for a giant block of code and copy and paste it over like it's no big deal, that is a crutch that you're depending on. At least with Stack Overflow, though, you have to actually put the effort in to find the thing that you want, and the code you find on Stack Overflow is almost never going to be the exact thing you need, so it usually just kind of guides you toward the actual solution that you want. As for the cost of GitHub Copilot, I had assumed at first that it was free, but after reading the frequently asked questions I'm not so sure anymore. They address this in a couple of places in the frequently asked questions.
The first is where they say that the technical preview is free for some people, and the second is the question of whether there will be a paid version. It does say that their plan is to build a commercial version of Copilot in the future: "We want to use the preview to learn how people use Copilot and what it takes to operate it at scale." Of course you can form your own opinion here, but it's a hundred percent clear that they're going to have a paid version. It is not clear if they're going to have a free version. If I were making a prediction, I'd say that, best-case scenario, there's going to be a free version with limitations. And I've got to tell you, there's something a little unsettling in knowing that GitHub is trying to build an entirely new revenue stream around the hard work of tens of millions of developers who have contributed to open source. I don't see that as okay.
The third reason not to use GitHub Copilot, and this depends a lot on the nature of your project, is that your private code is sent to GitHub for analysis and storage.
I had assumed two things were true from the outside here. Number one was that your private code that is sent for analysis is not actually stored, and number two was that the private code that is sent is not actually used to generate any future suggestions for other people. However, after reading their frequently asked questions, I found that only one of those is true, and this is addressed in the question titled "Is the transmitted data secure?"
And what they say is: all data is transmitted and stored securely. So they're saying they are going to store it, but then they go on to say that when humans read it, it is specifically with the aim of improving the model or detecting abuse. So that means that your potentially private and sensitive code is stored and possibly read by another person. Again, depending on your project, this could range anywhere from a non-issue, like in the case where you're working on open source, to a deal breaker, when you work on something super sensitive. What if you're working on client code for which you've signed an NDA, and GitHub Copilot is just ingesting that into their system? The other question in here was: "Will my private code be shared with other users?" And they say no, and I did believe that that was the case.
The final reason, I think, not to use this is that it has the potential to, and probably will, produce bad and/or unsafe code, and this isn't speculation. They've addressed this exact thing in their frequently asked questions, in the question "Does GitHub Copilot write perfect code?"
The answer is no. GitHub Copilot tries to understand your intent and generate the best code it can, but the code it suggests may not always work or even make sense. It goes on to say that code suggested by GitHub Copilot should be carefully tested, reviewed, and vetted like any other code. So the idea that this tool is going to be able to just write the code you need is probably not going to happen. If it writes anything useful at all, you're still going to have to check it for problems. However, there are going to be people who use this tool and just take whatever GitHub Copilot outputs and use it without testing it. These are the same people, of course, who take code from Stack Overflow and just copy and paste it into their project without testing it. So I suppose, in the end, that might not be anything new, except GitHub Copilot says it might be producing things that don't even make sense. At least with the stuff you find on Stack Overflow, somebody had to compose it, and then it is kind of curated by way of other users upvoting it and doing things like that. There's no doubt in my mind that the code this is going to produce is going to be lower quality than what you'll find on Stack Overflow. So that's really all my thoughts on the matter.
I'm definitely not in love with this at all, and I'm not really excited for this tool in the least. What I am excited about, however, is what you all think about GitHub Copilot and the opinions that you have. So head down to the comment section and let me know what you think about all this. Thanks a lot for watching.