Description
Cody Ho is the winner of 2023’s DEFCON, the world's largest and most famous hacker conference. This year, Cody took home two of the three top spots in competitions revolving around manipulating popular AI models like ChatGPT and Google Bard. Cody is a current Stanford student, developer, and hacker.
Chapters
How did you get into hacking? (0:00)
White Hat vs Black Hat (1:40)
Walking out without knowing that you had won (5:33)
How to hack an LLM (6:59)
Adversarial attacks (10:55)
Guest
Cody Ho
Roles:
Hacking Champion
Organization:
DefCon 2023
Host
Maxwell Matson
Roles:
Head of Growth
Organization:
PlayerZero
Transcript
Cody Ho 0:00 Someone had a brilliant idea of giving computers control of the entire world and the entire infrastructure, and then doing the absolute bare minimum to secure any of them. So DEF CON is a cross between an academic conference, a meetup of hackers, and a really big party. There have been some systems that I've attacked where it really, really is just like you ask for it, you just ask nicely, and it gives you everything. The future is when these bots can connect to the internet and can use APIs. For example, ChatGPT recently launched ChatGPT plugins, which is a great feature.
Max Matson 0:42 Hey, everybody, welcome to a slightly different Future of Product. Today, my guest is Cody Ho, winner of two of the top three spots at this year's DEF CON, which had an unsurprisingly heavy focus on AI technology. Cody, thanks so much for joining me today.
Cody Ho Of course, great to be here.
Max Matson Awesome. So before we get into your win, can you take just a minute or so to tell me about what got you into hacking in the first place?
Cody Ho 1:06 Well, absolutely. So I actually wasn't big into hacking before I got into Stanford. But once I was here, I was aggressively recruited by our coach, Alex Keller. And he did, I feel like, a really good job of motivating me and really emphasizing the importance of security: that someone had a brilliant idea of giving computers control of the entire world and the entire world's infrastructure, and then doing the absolute bare minimum to secure any of them. And we're starting to learn that that's not that great of an idea, and that there's going to be a really big need in the future for people to protect us, and to ensure that we reap the benefits of technology and computing and don't have to worry about all that being destroyed overnight because we forgot there's some virus, some hole in our defenses.
So I'm just really passionate about trying to keep people safe, and ensure that, again, we can just use computers well for our own benefit.
Max Matson 2:07 Right. Awesome, man. So just to be clear for the audience, you would consider yourself a white hat as opposed to a black hat hacker.
Cody Ho 2:15 99% white hat versus black hat.
Max Matson 2:18 Gotcha. Gotcha. Gotta keep that fun 1%.
Cody Ho 2:22 Yeah, that's fun. But you know what, it's just the right way of doing things. Hacking for fun and profit is great. Hacking to be an asshole and to break things left and right is not. That's the motto.
Max Matson 2:36 Yeah, yeah. No, 100%. So, that being said, was this your first DEF CON, by the way?
Cody Ho Second.
Max Matson Second. Okay. Gotcha. Now, real quick, for those who aren't familiar, could you explain DEF CON, what it is and what it means to the culture?
Cody Ho 2:53 Oh, absolutely. So DEF CON is a cross between an academic conference, a meetup of hackers, and a really big party. It's the largest cybersecurity conference in the entire world. There were like 30,000 people, I think, this year, and something similar last year. And there are just tons of really awesome things to do, all of which involve hacking. There are talks, there's a whole bunch of parties, a bunch of meetups, there are competitions, which is what we're here to talk about today, and there are various general challenges. And overall, by far the best part of DEF CON is just meeting new people. You'll find the most interesting people in the entire world, hands down, at DEF CON. Really, everyone you talk to is going to be interesting. And getting to make connections with fellow hackers in that way is just by far the most valuable part of the experience to me.
Max Matson 3:44 I'm sure. That's awesome. I also would imagine that, walking around DEF CON, one must be pretty tight on their own personal security.
Cody Ho 3:57 Yes. Well, I have a funny story.
Well, I don't have a story about this, but Mike, yeah, he was at DEF CON, I think it was like 10 years ago, and connected to the Wi-Fi. And he finds a file on his desktop. When he double-clicked the file, it's a text file. It just says like, "Hey, you really should not connect to the Wi-Fi access on your computer right there." Now, I happen to use Qubes, which is a highly secure OS. It's based off the same technology that powers AWS. It's by far the most secure operating system in the world. So I don't worry too much about my own personal cybersecurity. In general, it's improved from past years. I mean, you're going to a hacker meetup, like, the hacker meetup, right? Yeah.
Max Matson 4:46 Yeah, exactly. It's kind of the tricks of the trade. That's what you're there to see, right?
Cody Ho 4:52 Exactly. That's exactly right.
Max Matson 4:55 Cool. Well, real quick, I'm going to read an excerpt from the New York Times article that talked about you.
Cody Ho Cool.
Max Matson So launching into that: seven judges graded the submissions this year. The top scorers were "cody3," "aray4," and "cody2." Two of those handles came from yourself, Cody Ho, a student at Stanford University studying computer science with a focus on AI. He entered the contest five times, during which he got the chatbot to tell him about a fake place named after a real historical figure and describe the online tax filing requirement codified in the 28th constitutional amendment, which doesn't exist. I also heard that you walked out without knowing that you had won. Is that true?
Cody Ho 5:37 That's true. The entire article is accurate, from my perspective, but that part especially is completely accurate. I found out at about noon on Sunday. And at that point, I was sitting in my room, I was supposed to take a nap, like 500 miles away. My room is in San Jose, 500 miles from DEF CON.
Max Matson Oh, wow.
Cody Ho They even said, "Hey, come back to AIV, you won." I'm like, well, that's gonna be a minor issue. I didn't know if that meant I won first, or just that I was top three or top 10 or what. But that's all true, though; that happened. I've been told that my GPU is coming in the mail. So far it has not arrived. I'll wait a few weeks before getting in there and asking about it. But I will make sure it arrives, though. I'm quite interested in ensuring that I have access to my A6000.
Max Matson 6:33 I would imagine. That is a, what, $4,000 graphics card?
Cody Ho 6:38 MSRP is 4,600. On eBay, it's like about 3,600. But point being, it's a large enough chunk of change that I definitely want to make sure it arrives. I depend on access to it.
Max Matson 6:50 Oh, of course, and well deserved. May I also point out that you won not just the top spot, but two out of the three top spots.
Cody Ho 6:59 That was actually the way my strategy worked: when you're in the competition, you're only allowed to interact with each LLM once. So what I would do is bring a piece of paper. You're not allowed to bring a laptop; you have to use their laptop. You can't type on your own laptop on the side. So I had a piece of paper next to me, where I wrote down a codename for all the challenges and wrote down a prompt, or a few words of my prompts, and then had basically a handwritten spreadsheet containing which prompt was successful on each model. So I did the competition, it was actually four times, not five. That's the only error I could find, and that's my mistake. I told her five, so that was on me. And essentially, yeah, I just did a whole bunch of trial runs, where I would go in, I would try my prompt, I would write down whether it worked or not, and then I'd move on. And it turns out that my trial run was enough to get third place. And even the final run I did, funny story about this, I didn't tell the Times this, but on that final run, I was actually quite mad.
Because I was planning one more run, like my fifth run, where I'd go all out and I wouldn't actually write anything down. But the way the timing worked out, I wasn't actually able to do a final run, because they closed before I was able to. So on the cody3 run that ended up winning first place, I actually didn't go all out. I had like a whole spreadsheet of prompts I didn't use, because I was just trying things out. So I can bitch and moan and complain. But also, at the same time, what could I have gotten if I had been able to do that?
Max Matson 8:42 So okay, you talked about some of the prompts. And this is a really interesting thing that I want to make sure I get your opinion on. So you are a hacker, you have worked in security, right? On a zero to 10 scale, how easy was it to hack an LLM relative to other systems that you've hacked?
Cody Ho 9:04 It's hard to answer that, because there have been some systems that I've attacked where it really, truly is just like you ask for it, you just ask nicely, and it gives you everything. Like, there's a whole bunch of apps at Stanford, startups, like legitimate apps with people's data, where they're just running in debug mode, and you just ask the database for anything and it just gives you anything back. It's really sad; frankly, it's inexcusable. So compared to that, it's not trivial. I'd say, I mean, just a three perhaps. But at the end of the day, these large language models are not hard to hack, though. And hacking large language models is accessible to the average person: it's natural language. To hack a database, even if there's no security, requires you to understand how a database works, how schemas work, how networking works, how web apps work, which I understand, obviously, but an average person would have to read about all this stuff. Even if it's trivial and it just gives you all the data, you have to figure out how you can make the request and do all this stuff.
Whereas with large language models, again, it's natural language. I'm sure everyone has, at some point, talked to someone with an agenda, trying to get something from them. Everyone knows these skills. So the barrier to entry is just significantly lower than in any other area.
Max Matson 10:39 I see. So that's really the concern: it's not so much that it's easier, it's just that it's accessible.
Cody Ho 10:46 Somewhat. There's a saying in computer science that with many eyes, all bugs are shallow. At the same time, though, the majority of users are not adversarial. And if the user is adversarial, the threat model changes. Like, if your goal, for example, is to get the model to say something racist, you're an adversarial user, you're doing a prompt injection, your goal is to get the model to say something racist, and you'll probably be successful. At the same time, though, you can argue that that doesn't really matter. Because if your goal is to see something racist, you can do that with 4chan. It's not very difficult; you don't need a GPT for that. Whereas if you're a regular user and you're not adversarial, you shouldn't see something racist unless you explicitly ask for it, because that's an obvious harm. So the threat model is a bit different in this case. I think one thing that the GRT, which is what the challenge was called, did a very good job of was that they separated the scenarios into prompt injection, adversarial attacks, and non-prompt injection. And the overwhelming majority of tasks were not prompt injections. So the majority were the regular case: the average user, confused, doesn't know how algorithms work, but an average user. Which I like, because that's what we should be most worried about.
Because while LLMs are just text, and they can't talk to the internet, and they can't call APIs, and they can't do this other stuff, the harm they can cause is limited to text. And as long as we can solve the average case of the confused user, I think we'll be in a pretty good place. Because, again, if your goal is to be adversarial and see something racist, just go on 4chan. Who cares if ChatGPT does the same thing?
Max Matson 12:46 Right, right. But when it comes to, you know, core security vulnerabilities, things like my name, my credit card number, or something like that, is that something that the average person necessarily needs to be concerned about?
Cody Ho 13:02 No. There's definitely some private information embedded in these models, but you have to remember: if this private information is embedded in these models, that means at some point it was public on the internet and was scraped, which means it's not that private. So I worry much more about whoever curated the dataset just opening it up, hitting Control-F for "credit card number," and finding that, than about someone querying a large language model. What is a very big worry, which is the future, is when these models can connect to the internet and can use APIs. For example, ChatGPT recently launched ChatGPT plugins, which is a great feature. It lets you connect it to all these different apps and do all this cool stuff, and it makes it more useful as an assistant. At the same time, if it can access your bank account as you, that's pretty scary, in which case the domain changes, and you have to be really damn sure this model isn't going to send $10,000 to [unclear] or something like that. So when it comes to LLMs using tools, that's what I call it, because that's my research, that's a much bigger worry than LLMs giving you advice and giving text, because then the adversarial user isn't just seeing something racist. They're stealing money from other people.
It's now causing legitimate harm upon the world.
Max Matson 14:32 I gotcha. Well, Cody, out of respect for your time, I'm gonna let you go. But I just want to say thank you so much for your perspectives. Congratulations again, and I am rooting for you to get that graphics card in the mail soon.
Cody Ho 14:44 Of course. Thank you so much, Max. I appreciate it. It's great being on here, man.
Max Matson 14:47 Oh, thank you. I appreciate your time.