1 00:00:00,420 --> 00:00:05,910 [Music] 2 00:00:09,920 --> 00:00:16,320 Hello everyone. Uh welcome back to talks 3 00:00:12,400 --> 00:00:18,960 for this afternoon. Uh I have the 4 00:00:16,320 --> 00:00:21,600 absolute pleasure of introducing Caleb 5 00:00:18,960 --> 00:00:24,240 Brown who will be talking to us all 6 00:00:21,600 --> 00:00:25,359 about how archives can be very 7 00:00:24,240 --> 00:00:26,880 vulnerable. 8 00:00:25,359 --> 00:00:29,880 Thanks. 9 00:00:26,880 --> 00:00:29,880 All right. 10 00:00:30,400 --> 00:00:35,600 Uh, I'm glad you could all join me after 11 00:00:32,559 --> 00:00:38,719 lunch. I hope you can not fall asleep. 12 00:00:35,600 --> 00:00:41,280 Um, so as you've already heard, I'm 13 00:00:38,719 --> 00:00:43,600 Caleb. Um, I work at Google as a 14 00:00:41,280 --> 00:00:45,920 software engineer. Um, and I work on 15 00:00:43,600 --> 00:00:48,160 open source security. Uh, I also do a 16 00:00:45,920 --> 00:00:49,520 bit of work on deps.dev. Uh, and some of 17 00:00:48,160 --> 00:00:52,239 my teammates will actually be talking 18 00:00:49,520 --> 00:00:54,000 next. Um, I'm also a maintainer on 19 00:00:52,239 --> 00:00:56,640 something called the OpenSSF malicious 20 00:00:54,000 --> 00:00:59,039 packages project. Um, which means I get 21 00:00:56,640 --> 00:01:02,559 to see lots of reports about malware all 22 00:00:59,039 --> 00:01:05,600 the time. And so my background in Python 23 00:01:02,559 --> 00:01:07,920 and my work in security got me wondering 24 00:01:05,600 --> 00:01:10,159 a while back. 25 00:01:07,920 --> 00:01:12,799 How secure are the technologies our 26 00:01:10,159 --> 00:01:15,040 ecosystems are built on? I'm talking 27 00:01:12,799 --> 00:01:18,640 about the fundamental technologies, the 28 00:01:15,040 --> 00:01:20,799 nuts and bolts. For me, my interest has 29 00:01:18,640 --> 00:01:23,759 been focused on packaging. So 30 00:01:20,799 --> 00:01:25,680 sdists, wheels, that sort of thing.
31 00:01:23,759 --> 00:01:28,000 So, having spent a fair amount of time 32 00:01:25,680 --> 00:01:30,080 poking around Python packaging, I was 33 00:01:28,000 --> 00:01:34,240 intrigued when I read a report earlier 34 00:01:30,080 --> 00:01:37,280 this year that used tar files and 35 00:01:34,240 --> 00:01:39,040 symlinks to compromise a system. After 36 00:01:37,280 --> 00:01:41,360 reading that report, I immediately 37 00:01:39,040 --> 00:01:43,520 wondered if Python's tarfile module was 38 00:01:41,360 --> 00:01:45,920 vulnerable to the same symlink-based 39 00:01:43,520 --> 00:01:49,119 attacks in tar files. And long story 40 00:01:45,920 --> 00:01:51,920 short, I found um three vulnerabilities 41 00:01:49,119 --> 00:01:54,479 um in the tarfile library in 42 00:01:51,920 --> 00:01:56,079 Python. Um, and these were fixed uh 43 00:01:54,479 --> 00:01:58,560 earlier this year in this version of 44 00:01:56,079 --> 00:02:01,759 Python along with two other tarfile 45 00:01:58,560 --> 00:02:04,880 CVEs as well. Um, but today I really 46 00:02:01,759 --> 00:02:07,119 want to focus on this one here. 47 00:02:04,880 --> 00:02:09,280 Um, particularly because this 48 00:02:07,119 --> 00:02:12,319 vulnerability is the most serious. This 49 00:02:09,280 --> 00:02:14,720 vulnerability allows a malicious tar 50 00:02:12,319 --> 00:02:16,080 file to write anywhere on the file 51 00:02:14,720 --> 00:02:18,800 system. Well, at least anywhere it has 52 00:02:16,080 --> 00:02:20,239 privileges to do that. And while I 53 00:02:18,800 --> 00:02:22,879 explain how this particular 54 00:02:20,239 --> 00:02:24,879 vulnerability works, I would like to 55 00:02:22,879 --> 00:02:26,640 discuss, I guess, some general security 56 00:02:24,879 --> 00:02:28,400 concepts and ideas I was reminded of 57 00:02:26,640 --> 00:02:31,200 while I was discovering this 58 00:02:28,400 --> 00:02:32,800 vulnerability and the others.
59 00:02:31,200 --> 00:02:34,959 So to get started, we need to go over 60 00:02:32,800 --> 00:02:36,640 some background just so we're all on the 61 00:02:34,959 --> 00:02:38,640 same page. 62 00:02:36,640 --> 00:02:39,920 Um, I'm sure most of you are familiar 63 00:02:38,640 --> 00:02:42,480 with tar files, but I'll cover it 64 00:02:39,920 --> 00:02:46,720 briefly. Um, tar stands for tape 65 00:02:42,480 --> 00:02:49,920 archive. Um, it's really old. Um, and 66 00:02:46,720 --> 00:02:52,319 its original intent was for backing up 67 00:02:49,920 --> 00:02:53,599 your system to a tape file. Um, and so 68 00:02:52,319 --> 00:02:56,400 it's kind of intended for this 69 00:02:53,599 --> 00:02:58,640 sequential operation. Uh, and it doesn't 70 00:02:56,400 --> 00:03:00,560 do compression by itself. It's just, uh, 71 00:02:58,640 --> 00:03:03,040 usually you add some compression on 72 00:03:00,560 --> 00:03:06,640 later. So you might see a tar file 73 00:03:03,040 --> 00:03:08,159 followed by a .gz or .bz2. Um, that's the 74 00:03:06,640 --> 00:03:10,239 compression added afterwards. And it can 75 00:03:08,159 --> 00:03:11,840 store stupid things as well as normal 76 00:03:10,239 --> 00:03:13,840 stuff like you expect like files and 77 00:03:11,840 --> 00:03:17,840 directories. But it also stores device 78 00:03:13,840 --> 00:03:19,599 nodes and FIFOs. What? Um, so that's tar 79 00:03:17,840 --> 00:03:21,680 files. Uh, the other thing we should all 80 00:03:19,599 --> 00:03:23,120 be on the same page about is symlinks. 81 00:03:21,680 --> 00:03:25,200 Um, if you're not from a Unix 82 00:03:23,120 --> 00:03:28,560 background, um, these are kind of like 83 00:03:25,200 --> 00:03:31,680 shortcuts. Um, so a symbolic link is 84 00:03:28,560 --> 00:03:33,360 essentially a file. Uh, it stores a path 85 00:03:31,680 --> 00:03:36,080 to some other place on the file system 86 00:03:33,360 --> 00:03:37,760 and we call this the target.
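As a minimal sketch of the symbolic link just described, assuming a POSIX system: the link is a small file that stores a path, and reading through it reaches the target. The names `target.txt` and `shortcut` are made up for illustration.

```python
import os
import tempfile


def symlink_demo():
    """Create a file and a symlink to it, then read through the link."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")  # hypothetical file name
        link = os.path.join(tmp, "shortcut")      # hypothetical link name
        with open(target, "w") as f:
            f.write("hello")
        # The link itself only stores the path "target.txt".
        os.symlink("target.txt", link)
        stored_path = os.readlink(link)
        # Opening the link transparently opens the target.
        with open(link) as f:
            content = f.read()
        return stored_path, content, os.path.islink(link)
```

`os.readlink` returns the stored path without following it, while `open` follows the link to the target.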
And for a 87 00:03:36,080 --> 00:03:39,440 lot of file system operations, it just 88 00:03:37,760 --> 00:03:41,680 behaves like the target. So if the 89 00:03:39,440 --> 00:03:43,280 target is a file, you can open it, you 90 00:03:41,680 --> 00:03:44,560 can read from it, you can write to it. If 91 00:03:43,280 --> 00:03:47,599 it's a directory, you can list the 92 00:03:44,560 --> 00:03:49,920 contents. So that's symlinks. And the 93 00:03:47,599 --> 00:03:52,799 last thing we need to introduce are path 94 00:03:49,920 --> 00:03:54,000 traversal vulnerabilities. Uh, so in a 95 00:03:52,799 --> 00:03:55,599 path traversal vulnerability, you've got 96 00:03:54,000 --> 00:03:57,519 a file system. There is stuff in your 97 00:03:55,599 --> 00:03:59,680 file system you really don't want people 98 00:03:57,519 --> 00:04:00,959 to touch. There's stuff there 99 00:03:59,680 --> 00:04:03,280 that's where you expect things to 100 00:04:00,959 --> 00:04:04,640 happen. That's here. Um so when you are 101 00:04:03,280 --> 00:04:06,640 doing operations, you kind of want 102 00:04:04,640 --> 00:04:08,879 everything to stay in that circle. Um, 103 00:04:06,640 --> 00:04:11,120 but a path traversal vulnerability is 104 00:04:08,879 --> 00:04:13,280 when an attacker can manipulate the 105 00:04:11,120 --> 00:04:15,360 input, usually a file name, in such a 106 00:04:13,280 --> 00:04:17,199 way that it can get out of that circle 107 00:04:15,360 --> 00:04:19,359 and access something else in the file 108 00:04:17,199 --> 00:04:20,880 system it wasn't meant to. Now, if 109 00:04:19,359 --> 00:04:24,000 you're from a web background, you might 110 00:04:20,880 --> 00:04:26,400 be familiar with this sort of um pattern 111 00:04:24,000 --> 00:04:28,320 where you have a query parameter and 112 00:04:26,400 --> 00:04:30,320 someone's just added a whole bunch of 113 00:04:28,320 --> 00:04:33,040 dots. This is like the simplest 114 00:04:30,320 --> 00:04:35,280 thing.
Um there are a lot more 115 00:04:33,040 --> 00:04:37,440 complex ways of injecting the path 116 00:04:35,280 --> 00:04:39,840 traversal. Um but then you can see that 117 00:04:37,440 --> 00:04:43,440 it just gets passed straight into open 118 00:04:39,840 --> 00:04:46,080 and if this Django snippet was run it 119 00:04:43,440 --> 00:04:49,280 would dump your entire password file 120 00:04:46,080 --> 00:04:51,280 into the web response assuming you had 121 00:04:49,280 --> 00:04:53,919 permission to read it. Uh archive 122 00:04:51,280 --> 00:04:56,000 vulnerabilities are kind of similar. Um 123 00:04:53,919 --> 00:04:59,040 there's just a file name and a bunch of 124 00:04:56,000 --> 00:05:01,360 content and it's user supplied. Um, and 125 00:04:59,040 --> 00:05:04,800 so if you had a tar file with a 126 00:05:01,360 --> 00:05:06,960 file name like that, it would extract 127 00:05:04,800 --> 00:05:09,199 and it would output some authorized keys 128 00:05:06,960 --> 00:05:11,280 into your SSH directory and give the 129 00:05:09,199 --> 00:05:13,280 attacker access to your host um 130 00:05:11,280 --> 00:05:15,520 potentially the root account. Now, 131 00:05:13,280 --> 00:05:18,160 archive vulnerabilities uh that are path 132 00:05:15,520 --> 00:05:20,320 traversal vulnerabilities um have a term 133 00:05:18,160 --> 00:05:24,240 that's commonly used called zip slip. Uh 134 00:05:20,320 --> 00:05:26,720 this was coined in 2018 by Snyk. Um, 135 00:05:24,240 --> 00:05:29,199 it's sometimes called tar slip 136 00:05:26,720 --> 00:05:32,880 for tar files, but they're basically the 137 00:05:29,199 --> 00:05:35,280 same type of vulnerability. Uh, but for 138 00:05:32,880 --> 00:05:36,400 Python and the tarfile library, uh, its 139 00:05:35,280 --> 00:05:38,720 history of path traversal 140 00:05:36,400 --> 00:05:42,800 vulnerabilities goes back a lot further.
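The archive variant of path traversal described above can be sketched as follows. This is an illustration only: the member name, the destination `/srv/app/uploads`, and the helper names are assumptions, and nothing is written to disk. It shows where a naive "join the destination with the member name" would end up.

```python
import io
import posixpath
import tarfile


def build_archive(member_name, payload=b"attacker data"):
    """Build a one-member tar archive in memory (illustrative helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def naive_targets(archive_bytes, dest):
    """Where each member would land if dest and name were simply joined."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes)) as tar:
        return [
            posixpath.normpath(posixpath.join(dest, member.name))
            for member in tar.getmembers()
        ]


# The member name is user supplied, so it can climb out of the destination.
archive = build_archive("../../home/user/.ssh/authorized_keys")
targets = naive_targets(archive, "/srv/app/uploads")
```

Here `targets` holds a path outside `/srv/app/uploads`, which is the whole problem: the archive, not the extractor, decided where the bytes go.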
141 00:05:38,720 --> 00:05:45,600 Uh, so in 2007, a CVE, a vulnerability 142 00:05:42,800 --> 00:05:47,520 report, um, was created about path 143 00:05:45,600 --> 00:05:50,320 traversal in the Python tarfile 144 00:05:47,520 --> 00:05:52,560 library. Um, 145 00:05:50,320 --> 00:05:57,199 so not much happened for a while. Um and 146 00:05:52,560 --> 00:05:59,280 then in 2023 uh PEP 706 147 00:05:57,199 --> 00:06:01,520 introduced the concept of filters to 148 00:05:59,280 --> 00:06:04,400 tarfile so that you could make the 149 00:06:01,520 --> 00:06:07,440 extraction uh safer. 150 00:06:04,400 --> 00:06:10,160 Uh and it provides two filters that 151 00:06:07,440 --> 00:06:13,120 we care about. Uh the first is the tar 152 00:06:10,160 --> 00:06:15,280 filter. Uh and its job is basically to 153 00:06:13,120 --> 00:06:17,520 behave like the command-line 154 00:06:15,280 --> 00:06:19,680 tar tool. Um and so the main thing 155 00:06:17,520 --> 00:06:24,319 it does is uh stop you using absolute 156 00:06:19,680 --> 00:06:26,639 paths and it tries to avoid uh path 157 00:06:24,319 --> 00:06:28,479 traversal for writing. So if you give it 158 00:06:26,639 --> 00:06:31,680 a destination all the writes should stay 159 00:06:28,479 --> 00:06:33,840 in that location. Uh the other filter is 160 00:06:31,680 --> 00:06:35,759 the data filter and it 161 00:06:33,840 --> 00:06:39,120 does everything the tar filter does but 162 00:06:35,759 --> 00:06:41,520 it's stricter. Um it also tries to avoid 163 00:06:39,120 --> 00:06:45,520 having links inside a destination that 164 00:06:41,520 --> 00:06:47,520 point somewhere else. Um and uh it does 165 00:06:45,520 --> 00:06:49,919 other things like only lets you extract 166 00:06:47,520 --> 00:06:52,400 like files, symlinks but not device 167 00:06:49,919 --> 00:06:54,319 nodes.
Uh and it does sensible things 168 00:06:52,400 --> 00:06:56,479 like setting permissions and stuff on 169 00:06:54,319 --> 00:06:58,800 directories. And this is going to be the 170 00:06:56,479 --> 00:07:01,680 default next month when um the next 171 00:06:58,800 --> 00:07:04,639 version of Python is released. Uh along 172 00:07:01,680 --> 00:07:06,319 with this PEP, they also added um some 173 00:07:04,639 --> 00:07:08,720 more changes to the API so you could use 174 00:07:06,319 --> 00:07:11,759 them. Um you can go into tar file and 175 00:07:08,720 --> 00:07:14,080 you can add um your filters. 176 00:07:11,759 --> 00:07:17,360 Um you can even add them if you're using 177 00:07:14,080 --> 00:07:20,160 it from the command line. Um so these 178 00:07:17,360 --> 00:07:21,759 changes got me thinking about um these 179 00:07:20,160 --> 00:07:24,160 filter changes got me thinking about 180 00:07:21,759 --> 00:07:26,960 security retrofits. 181 00:07:24,160 --> 00:07:30,479 Um and first big point here is that 182 00:07:26,960 --> 00:07:32,880 retrofitting security is hard. 183 00:07:30,479 --> 00:07:35,280 Um it's hard because usually you're 184 00:07:32,880 --> 00:07:36,960 doing it because you're forced to. It's 185 00:07:35,280 --> 00:07:39,520 in reaction to a vulnerability or a 186 00:07:36,960 --> 00:07:42,240 compromise. It's also hard because 187 00:07:39,520 --> 00:07:44,479 usually you're adding security to a 188 00:07:42,240 --> 00:07:45,840 place where there's existing code and 189 00:07:44,479 --> 00:07:48,479 there's existing technical debt and 190 00:07:45,840 --> 00:07:50,880 you've got to manage that. And after 191 00:07:48,479 --> 00:07:52,880 you've added your security fix or you've 192 00:07:50,880 --> 00:07:54,800 hardened the security, the complexity 193 00:07:52,880 --> 00:07:57,120 and the technical debt is usually a bit 194 00:07:54,800 --> 00:07:58,800 higher than it was before. 
195 00:07:57,120 --> 00:08:00,319 And then after all that, you've got to 196 00:07:58,800 --> 00:08:01,680 deal with like maybe you've had to 197 00:08:00,319 --> 00:08:04,639 change something in a way where you need 198 00:08:01,680 --> 00:08:06,879 to migrate systems or um get users to 199 00:08:04,639 --> 00:08:10,400 change to a new process. 200 00:08:06,879 --> 00:08:13,360 Retrofitting security is hard 201 00:08:10,400 --> 00:08:14,639 and when you do it, please avoid 202 00:08:13,360 --> 00:08:18,400 shortcuts because you are going to do 203 00:08:14,639 --> 00:08:20,319 it. Um, but avoid shortcuts. Fast fixes 204 00:08:18,400 --> 00:08:22,319 have their place. If you have a critical 205 00:08:20,319 --> 00:08:25,039 vulnerability, you're leaking private 206 00:08:22,319 --> 00:08:27,280 user data. Um, do what you need to do to 207 00:08:25,039 --> 00:08:29,520 get that closed. 208 00:08:27,280 --> 00:08:32,240 But otherwise, it's really important to 209 00:08:29,520 --> 00:08:34,959 approach security hardening carefully. 210 00:08:32,240 --> 00:08:36,719 What are the tradeoffs when you're 211 00:08:34,959 --> 00:08:39,599 what's with your solution? What are you 212 00:08:36,719 --> 00:08:41,360 gaining? What are you losing? Um, you 213 00:08:39,599 --> 00:08:43,120 it's good to avoid unintentional 214 00:08:41,360 --> 00:08:45,920 consequences. 215 00:08:43,120 --> 00:08:48,480 It's very easy for an intended security 216 00:08:45,920 --> 00:08:51,120 fix actually make or introduce new 217 00:08:48,480 --> 00:08:52,880 vulnerabilities as well. 218 00:08:51,120 --> 00:08:54,800 And you also want to take your time and 219 00:08:52,880 --> 00:08:57,920 be careful because you want to limit 220 00:08:54,800 --> 00:09:00,480 complexity. Um, complexity is the enemy 221 00:08:57,920 --> 00:09:02,640 of security. 
And in the long term, if 222 00:09:00,480 --> 00:09:04,160 you take shortcuts, it only makes the 223 00:09:02,640 --> 00:09:06,160 challenges of retrofitting security 224 00:09:04,160 --> 00:09:09,760 harder. 225 00:09:06,160 --> 00:09:11,760 But retrofitting security well is worth 226 00:09:09,760 --> 00:09:15,279 the effort. 227 00:09:11,760 --> 00:09:17,680 If we look at the tarfile filters, um 228 00:09:15,279 --> 00:09:20,080 the secure default is almost here. It 229 00:09:17,680 --> 00:09:22,640 lands next month. Uh and this secure 230 00:09:20,080 --> 00:09:25,200 default for anybody who's using Python 231 00:09:22,640 --> 00:09:29,120 3.14 is the default. It means that 232 00:09:25,200 --> 00:09:30,640 they're all protected um by this change. 233 00:09:29,120 --> 00:09:33,279 Also, the change is well considered. 234 00:09:30,640 --> 00:09:34,800 It's only a small API change. Um, and 235 00:09:33,279 --> 00:09:37,440 it's optional. If you're using it, 236 00:09:34,800 --> 00:09:39,440 you're using it to opt out. And the last 237 00:09:37,440 --> 00:09:41,839 thing, and probably most importantly, is 238 00:09:39,440 --> 00:09:44,399 that it's a centralized implementation 239 00:09:41,839 --> 00:09:45,760 that is strong and well tested. Um, if 240 00:09:44,399 --> 00:09:47,680 we didn't have this, you'd be forcing 241 00:09:45,760 --> 00:09:51,360 developers to all go and do it their own 242 00:09:47,680 --> 00:09:52,800 way. Uh, which risks them doing it wrong 243 00:09:51,360 --> 00:09:56,000 and everyone having different 244 00:09:52,800 --> 00:09:58,560 vulnerabilities all over the place. 245 00:09:56,000 --> 00:09:59,839 But I'm getting ahead of myself. Um, we 246 00:09:58,560 --> 00:10:01,120 still haven't got to our vulnerability 247 00:09:59,839 --> 00:10:03,040 yet. And I'm sure that's why you're 248 00:10:01,120 --> 00:10:06,000 here. So, let's go back to these 249 00:10:03,040 --> 00:10:08,959 filters.
Um, 250 00:10:06,000 --> 00:10:12,800 the vulnerability that I'm interested in 251 00:10:08,959 --> 00:10:14,399 is focused on breaking this. We want to 252 00:10:12,800 --> 00:10:17,200 be able to write files 253 00:10:14,399 --> 00:10:19,440 anywhere we want on the file system. 254 00:10:17,200 --> 00:10:22,000 So, the vulnerability I found does allow 255 00:10:19,440 --> 00:10:23,920 this. Let's jump into the tarfile 256 00:10:22,000 --> 00:10:25,200 implementation in Python. I promise you 257 00:10:23,920 --> 00:10:26,720 it's not going to be like super 258 00:10:25,200 --> 00:10:29,920 detailed. You'll be able to hopefully 259 00:10:26,720 --> 00:10:32,640 follow along. Um so first of all here is 260 00:10:29,920 --> 00:10:36,480 the extractall function in the tarfile 261 00:10:32,640 --> 00:10:37,760 module. Um and I've simplified 262 00:10:36,480 --> 00:10:40,160 it. I've got rid of code you don't need 263 00:10:37,760 --> 00:10:42,560 to worry about. Um the first 264 00:10:40,160 --> 00:10:44,720 thing it does is it ensures that the 265 00:10:42,560 --> 00:10:46,320 filter function is actually a function. 266 00:10:44,720 --> 00:10:48,320 So you can pass in a string or a 267 00:10:46,320 --> 00:10:50,480 callable or None and it will make 268 00:10:48,320 --> 00:10:53,040 sure that's something you can call and 269 00:10:50,480 --> 00:10:55,839 then it goes through each member in the 270 00:10:53,040 --> 00:10:58,720 tar file one by one in the order that 271 00:10:55,839 --> 00:11:01,279 they appear in the file 272 00:10:58,720 --> 00:11:03,519 and then for each of the members it will 273 00:11:01,279 --> 00:11:05,519 call this function _get_extract_tarinfo 274 00:11:03,519 --> 00:11:09,040 and that basically just applies the 275 00:11:05,519 --> 00:11:13,920 filter to each member and then after 276 00:11:09,040 --> 00:11:16,240 that it extracts the single member.
277 00:11:13,920 --> 00:11:19,519 So what are the filters doing when it um 278 00:11:16,240 --> 00:11:21,839 calls that? Uh this is a simplified 279 00:11:19,519 --> 00:11:23,519 version of the filter. Um and this is 280 00:11:21,839 --> 00:11:26,160 specifically like the top part of the 281 00:11:23,519 --> 00:11:29,519 function. And this bit of code is only 282 00:11:26,160 --> 00:11:31,600 about making sure that the file that 283 00:11:29,519 --> 00:11:36,720 we're writing or the directory we're 284 00:11:31,600 --> 00:11:38,800 creating is inside the destination path. 285 00:11:36,720 --> 00:11:41,120 And you see there it's got this realpath 286 00:11:38,800 --> 00:11:43,519 call which kind of gets rid of 287 00:11:41,120 --> 00:11:45,680 traversal and links and stuff. And 288 00:11:43,519 --> 00:11:47,600 there's another one. And there's this 289 00:11:45,680 --> 00:11:50,959 check to make sure that they both start 290 00:11:47,600 --> 00:11:52,399 with the destination prefix. Um, and you 291 00:11:50,959 --> 00:11:54,399 might be wondering, this code looks 292 00:11:52,399 --> 00:11:56,720 pretty normal. 293 00:11:54,399 --> 00:11:58,160 Can't see much wrong with it. And when 294 00:11:56,720 --> 00:12:00,160 I was looking through this code, I 295 00:11:58,160 --> 00:12:02,720 passed over this many times and I 296 00:12:00,160 --> 00:12:06,720 thought it was perfectly secure as well. 297 00:12:02,720 --> 00:12:08,320 But I started wondering, 298 00:12:06,720 --> 00:12:11,200 realpath is doing a lot of heavy 299 00:12:08,320 --> 00:12:13,120 lifting here. What does realpath 300 00:12:11,200 --> 00:12:16,320 actually do under the hood? 301 00:12:13,120 --> 00:12:19,279 Maybe we can find a bug and bypass 302 00:12:16,320 --> 00:12:21,279 these checks. So let's jump into 303 00:12:19,279 --> 00:12:24,399 realpath. We're going down the layers here 304 00:12:21,279 --> 00:12:26,880 um into more and more detail.
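The kind of containment check described here can be sketched as below. This is a simplified illustration, not the CPython source: `stays_inside` is a made-up name, and it deliberately leans on `os.path.realpath` with its default behaviour, which is exactly the dependency the talk goes on to question.

```python
import os
import tempfile


def stays_inside(dest, member_name):
    """Return True if dest/member_name resolves to a path under dest."""
    dest_real = os.path.realpath(dest)
    # realpath is trusted to remove ".." and expand symlinks.
    target_real = os.path.realpath(os.path.join(dest, member_name))
    # Both paths must share dest as their common prefix.
    return os.path.commonpath([target_real, dest_real]) == dest_real


with tempfile.TemporaryDirectory() as tmp:
    inside = stays_inside(tmp, "docs/readme.txt")
    outside = stays_inside(tmp, "../elsewhere.txt")
```

The check is only as trustworthy as the resolved paths it compares, so any case where `realpath` returns something other than the true location weakens it.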
Um so realpath 305 00:12:24,399 --> 00:12:29,360 has a basic process um where 306 00:12:26,880 --> 00:12:31,839 firstly if the path is 307 00:12:29,360 --> 00:12:33,279 relative it makes it absolute by 308 00:12:31,839 --> 00:12:36,399 sticking the current working directory 309 00:12:33,279 --> 00:12:38,160 on the front. Then it goes through each 310 00:12:36,399 --> 00:12:40,399 of the path elements. It splits on the 311 00:12:38,160 --> 00:12:42,399 slash and then slowly builds up from 312 00:12:40,399 --> 00:12:44,160 left to right the path checking each one 313 00:12:42,399 --> 00:12:47,360 as it goes or processing each one as it 314 00:12:44,160 --> 00:12:50,720 goes. So /a then /b then /c all 315 00:12:47,360 --> 00:12:54,240 the way. So as it does that it does this 316 00:12:50,720 --> 00:12:55,839 very basic uh set of things. Firstly if 317 00:12:54,240 --> 00:12:58,000 it's a dot or a dot dot it deals with 318 00:12:55,839 --> 00:12:59,839 that. If it's a dot it'll get rid of it. 319 00:12:58,000 --> 00:13:02,639 If it's a dot dot which is a parent 320 00:12:59,839 --> 00:13:05,440 traversal it gets rid of the dot dot and 321 00:13:02,639 --> 00:13:08,880 the one just to its left. Uh if it isn't 322 00:13:05,440 --> 00:13:11,040 one of those, it'll call lstat. Uh lstat 323 00:13:08,880 --> 00:13:15,200 is like a wrapper around a 324 00:13:11,040 --> 00:13:16,959 system call that returns data about the 325 00:13:15,200 --> 00:13:18,880 path. 326 00:13:16,959 --> 00:13:21,600 Um and it's specifically used to check 327 00:13:18,880 --> 00:13:23,519 if it's a symlink. And if it is a 328 00:13:21,600 --> 00:13:25,920 symlink, then it calls readlink and 329 00:13:23,519 --> 00:13:27,760 replaces the element in the path with 330 00:13:25,920 --> 00:13:29,040 the contents of that link. And it does 331 00:13:27,760 --> 00:13:32,639 that over and over again until it's 332 00:13:29,040 --> 00:13:34,399 done.
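The steps above can be sketched as a small POSIX-only function. This is a teaching sketch written for this transcript, not the standard library's implementation: `simple_realpath` and the hop limit of 40 are assumptions, and a failed `lstat` is treated as "not a link", mirroring the lenient default behaviour the talk examines next.

```python
import os
import stat


def simple_realpath(path, max_hops=40):
    """Resolve '.', '..' and symlinks left to right (POSIX paths only)."""
    if not os.path.isabs(path):
        path = os.path.join(os.getcwd(), path)
    # Unprocessed elements, kept as a stack so pop() yields the leftmost.
    rest = [part for part in path.split("/") if part][::-1]
    resolved = "/"
    hops = 0
    while rest:
        name = rest.pop()
        if name == ".":
            continue                      # "." is simply dropped
        if name == "..":
            resolved = os.path.dirname(resolved)  # drop the element to the left
            continue
        candidate = os.path.join(resolved, name)
        try:
            is_link = stat.S_ISLNK(os.lstat(candidate).st_mode)
        except OSError:
            is_link = False               # lenient: any error means "not a link"
        if not is_link:
            resolved = candidate
            continue
        hops += 1
        if hops > max_hops:
            raise OSError("too many levels of symbolic links")
        target = os.readlink(candidate)
        if os.path.isabs(target):
            resolved = "/"                # absolute target restarts from root
        # Replace the link element with the elements of its target.
        rest.extend([part for part in target.split("/") if part][::-1])
    return resolved
```

For paths that exist, this should agree with `os.path.realpath`; the interesting part is the `except OSError` branch, where every kind of failure is collapsed into the same answer.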
So, this all looks pretty normal, 333 00:13:32,639 --> 00:13:36,800 but then we start to see interesting 334 00:13:34,399 --> 00:13:39,360 things. This is not working. Oh, now you 335 00:13:36,800 --> 00:13:42,959 decide to um Sorry, it's giving me like 336 00:13:39,360 --> 00:13:44,800 USB warnings. Um, so we look at the 337 00:13:42,959 --> 00:13:46,720 error handling 338 00:13:44,800 --> 00:13:48,560 and the first thing I noticed was that 339 00:13:46,720 --> 00:13:51,279 the default for realpath is to be 340 00:13:48,560 --> 00:13:54,399 strict=False. So errors are 341 00:13:51,279 --> 00:13:56,160 ignored by default which means that if 342 00:13:54,399 --> 00:13:59,040 the lstat call that checks if it's a 343 00:13:56,160 --> 00:14:02,320 link fails um and it could fail for a 344 00:13:59,040 --> 00:14:03,760 bunch of reasons um then all of the 345 00:14:02,320 --> 00:14:05,600 remaining path elements that it hasn't 346 00:14:03,760 --> 00:14:08,800 processed they essentially get stuck 347 00:14:05,600 --> 00:14:10,720 onto the bits that it has processed. 348 00:14:08,800 --> 00:14:13,360 Let's exploit this. I think we can do 349 00:14:10,720 --> 00:14:14,800 this. Um all right let's introduce 350 00:14:13,360 --> 00:14:16,480 our favorite thing of the day which is 351 00:14:14,800 --> 00:14:19,279 PATH_MAX. We're now getting even lower 352 00:14:16,480 --> 00:14:21,920 into the operating system. PATH_MAX is a 353 00:14:19,279 --> 00:14:23,519 constant um and it's a constant that 354 00:14:21,920 --> 00:14:24,800 applies to operating system calls. So 355 00:14:23,519 --> 00:14:27,360 this is when you're calling into the 356 00:14:24,800 --> 00:14:29,839 kernel um there's this limit that says 357 00:14:27,360 --> 00:14:32,639 how many bytes of a path you're allowed 358 00:14:29,839 --> 00:14:34,399 to pass to it. This is important to 359 00:14:32,639 --> 00:14:36,000 remember. It's not a file system limit.
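A quick way to see the system call limit, assuming a Unix-like system: ask for PATH_MAX with `os.pathconf` and pass `lstat` a path longer than that. The helper name `lstat_errno` and the made-up path of repeated `a/` elements are illustrative only.

```python
import errno
import os


def lstat_errno(path):
    """Return the errno raised by lstat(path), or None if it succeeds."""
    try:
        os.lstat(path)
    except OSError as exc:
        return exc.errno
    return None


# PATH_MAX as reported for the root file system (commonly 4096 on Linux).
path_max = os.pathconf("/", "PC_PATH_MAX")

# A path string well beyond the limit; it does not need to exist.
too_long = "/" + "a/" * max(path_max, 4096)

result = lstat_errno(too_long)
```

The kernel rejects the over-long string with `ENAMETOOLONG` before it looks anything up, which is a different failure from the path not existing.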
360 00:14:34,399 --> 00:14:38,880 You can make as deep a directory 361 00:14:36,000 --> 00:14:40,399 structure as you want. Um I think um as 362 00:14:38,880 --> 00:14:43,600 long as you don't run out of inodes or 363 00:14:40,399 --> 00:14:46,560 disk space um and so its main job is to 364 00:14:43,600 --> 00:14:49,199 stop you breaking the operating system. 365 00:14:46,560 --> 00:14:51,120 And here are the amounts that each 366 00:14:49,199 --> 00:14:52,240 operating system has as its limits. 367 00:14:51,120 --> 00:14:53,680 We're not going to use that limit 368 00:14:52,240 --> 00:14:56,240 because that's kind of big to fit on a 369 00:14:53,680 --> 00:14:57,680 slide. Um, 370 00:14:56,240 --> 00:15:00,959 so we're going to just assume we have a 371 00:14:57,680 --> 00:15:02,639 PATH_MAX of about 30 cuz uh that's what 372 00:15:00,959 --> 00:15:05,440 my drawings kind of worked out to be. 373 00:15:02,639 --> 00:15:08,160 Um, so assume we've got a uh a buffer 374 00:15:05,440 --> 00:15:09,760 here where we're building our path. 375 00:15:08,160 --> 00:15:12,240 We'll just stick the destination at the 376 00:15:09,760 --> 00:15:13,920 front. That's going to be there. Uh, and 377 00:15:12,240 --> 00:15:15,839 then let's do something funny. We're 378 00:15:13,920 --> 00:15:18,480 going to add a bunch of directories one 379 00:15:15,839 --> 00:15:21,920 by one. Um, and this is important later 380 00:15:18,480 --> 00:15:24,959 when we try and uh traverse out. And 381 00:15:21,920 --> 00:15:28,560 then at the end, let's add a symlink 382 00:15:24,959 --> 00:15:30,320 that exceeds the length of PATH_MAX. Um, 383 00:15:28,560 --> 00:15:32,000 and this link is just something that 384 00:15:30,320 --> 00:15:36,160 takes us all the way back to the top 385 00:15:32,000 --> 00:15:38,240 where destination is. So if we pass the 386 00:15:36,160 --> 00:15:41,040 green part and the red part into lstat, 387 00:15:38,240 --> 00:15:42,880 this is what happens.
Um, the first one 388 00:15:41,040 --> 00:15:45,519 passes happily. It returns the information 389 00:15:42,880 --> 00:15:48,160 about that directory, but the second one 390 00:15:45,519 --> 00:15:53,199 doesn't. It throws an error and by 391 00:15:48,160 --> 00:15:55,040 default, realpath swallows that error. 392 00:15:53,199 --> 00:15:57,040 So, what happens? We're kind of stepping 393 00:15:55,040 --> 00:15:59,680 out now. We'll go back to have a look at 394 00:15:57,040 --> 00:16:01,519 realpath and have a look what it does. 395 00:15:59,680 --> 00:16:03,199 So, if you pass that long path with a 396 00:16:01,519 --> 00:16:07,759 symlink at the end that exceeds PATH_MAX 397 00:16:03,199 --> 00:16:10,880 into realpath, what it returns is 398 00:16:07,759 --> 00:16:13,040 the unexpanded symlink. 399 00:16:10,880 --> 00:16:16,240 But what it should have returned is 400 00:16:13,040 --> 00:16:21,519 either an error or /dest which is 401 00:16:16,240 --> 00:16:25,279 kind of a problem. So this is a problem. 402 00:16:21,519 --> 00:16:30,440 Any path element that is overlapping or 403 00:16:25,279 --> 00:16:30,440 after PATH_MAX never gets expanded. 404 00:16:31,040 --> 00:16:37,920 So this behavior of realpath 405 00:16:34,560 --> 00:16:41,839 has me thinking about abstractions. 406 00:16:37,920 --> 00:16:44,639 Um, abstractions are really good. Uh, 407 00:16:41,839 --> 00:16:48,079 in computing they're 408 00:16:44,639 --> 00:16:50,240 about providing access to something and 409 00:16:48,079 --> 00:16:52,000 about hiding details and 410 00:16:50,240 --> 00:16:54,959 complexity so you don't have to deal 411 00:16:52,000 --> 00:16:57,360 with all the minutiae. Um, and in the 412 00:16:54,959 --> 00:16:59,279 end, an abstraction allows you to think 413 00:16:57,360 --> 00:17:01,680 at a higher level about what you're 414 00:16:59,279 --> 00:17:04,319 doing. Abstraction is powerful.
415 00:17:01,680 --> 00:17:07,600 Computing is built on layers and layers 416 00:17:04,319 --> 00:17:09,839 of abstraction. It's awesome. 417 00:17:07,600 --> 00:17:12,319 Like who has written machine code or 418 00:17:09,839 --> 00:17:14,480 assembly recently? 419 00:17:12,319 --> 00:17:16,480 Is it two? What about anyone like 420 00:17:14,480 --> 00:17:19,120 writing a raw Ethernet frame to a 421 00:17:16,480 --> 00:17:20,959 network interface card? No. Does anyone 422 00:17:19,120 --> 00:17:23,280 know what those terms mean? Yeah, 423 00:17:20,959 --> 00:17:25,360 there's a few of you. Okay, good. Um, 424 00:17:23,280 --> 00:17:26,799 this is why abstractions are awesome. 425 00:17:25,360 --> 00:17:29,200 You don't have to think about all that 426 00:17:26,799 --> 00:17:31,200 stuff. Um, programming languages are an 427 00:17:29,200 --> 00:17:33,760 abstraction. Operating systems are an 428 00:17:31,200 --> 00:17:36,000 abstraction. Even the tar file library 429 00:17:33,760 --> 00:17:39,640 is an abstraction over a collection of 430 00:17:36,000 --> 00:17:39,640 bytes in a file. 431 00:17:39,760 --> 00:17:45,919 But abstractions do have a weakness. 432 00:17:42,960 --> 00:17:49,280 They can make security harder and they 433 00:17:45,919 --> 00:17:50,880 do this in a variety of ways. And these 434 00:17:49,280 --> 00:17:53,520 are the four that I came up with. They 435 00:17:50,880 --> 00:17:55,600 they make the problem simpler than it 436 00:17:53,520 --> 00:17:58,799 should be. Uh and this is often the case 437 00:17:55,600 --> 00:18:00,559 with IO. Um, they can bury critical 438 00:17:58,799 --> 00:18:02,880 security flaws that you should be made 439 00:18:00,559 --> 00:18:04,720 aware of but don't know. Uh, they can 440 00:18:02,880 --> 00:18:06,240 have default behaviors that aren't 441 00:18:04,720 --> 00:18:08,880 secure. 442 00:18:06,240 --> 00:18:11,440 And the last one is harder to explain. 443 00:18:08,880 --> 00:18:14,559 It's they are leaky. 
Uh, there are side 444 00:18:11,440 --> 00:18:16,799 channels. Um, Spectre and Meltdown or 445 00:18:14,559 --> 00:18:20,080 Rowhammer are examples of this type of 446 00:18:16,799 --> 00:18:23,039 abstraction problem. Um, so if we look 447 00:18:20,080 --> 00:18:24,880 at realpath, it has a couple of these. 448 00:18:23,039 --> 00:18:27,840 The first is obviously the insecure 449 00:18:24,880 --> 00:18:29,919 default. strict=False is an 450 00:18:27,840 --> 00:18:32,400 insecure default. 451 00:18:29,919 --> 00:18:34,720 It ignores any error that is thrown 452 00:18:32,400 --> 00:18:37,039 inside it. 453 00:18:34,720 --> 00:18:39,360 But we can't simply just change the 454 00:18:37,039 --> 00:18:42,160 default to True 455 00:18:39,360 --> 00:18:44,480 because of oversimplification. If we did 456 00:18:42,160 --> 00:18:46,400 that, it's all or nothing. You don't get 457 00:18:44,480 --> 00:18:48,480 the nuanced behavior. It forces that onto 458 00:18:46,400 --> 00:18:50,240 the user of that function which is not 459 00:18:48,480 --> 00:18:54,480 great. 460 00:18:50,240 --> 00:18:57,200 The name too long error is not the same 461 00:18:54,480 --> 00:18:59,520 as it doesn't exist. 462 00:18:57,200 --> 00:19:01,200 So different errors need different 463 00:18:59,520 --> 00:19:03,120 responses. 464 00:19:01,200 --> 00:19:06,720 And in fact, one of the main fixes to 465 00:19:03,120 --> 00:19:08,799 this vulnerability was introducing a new 466 00:19:06,720 --> 00:19:11,440 uh behavior for strict in realpath 467 00:19:08,799 --> 00:19:14,400 called ALLOW_MISSING. So all errors are 468 00:19:11,440 --> 00:19:17,120 thrown in this case except if the lstat 469 00:19:14,400 --> 00:19:20,000 call fails with a uh file not 470 00:19:17,120 --> 00:19:25,600 found. 471 00:19:20,000 --> 00:19:28,880 you're not seeing my RAM scripts.
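The idea that different errors need different responses can be sketched as follows. This is an illustration of the principle only, not the code of the actual fix: `is_symlink_tolerating_missing` is a hypothetical helper that treats "file not found" as an answer and lets every other failure propagate.

```python
import errno
import os
import stat


def is_symlink_tolerating_missing(path):
    """Return whether path is a symlink; only a missing path is tolerated."""
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        # A path that does not exist cannot be a link, so this is safe.
        return False
    # Any other OSError (for example ENAMETOOLONG) is not caught above,
    # so the caller finds out that the question could not be answered.
    return stat.S_ISLNK(st.st_mode)
```

A missing path yields `False`, while an over-long path raises instead of being silently reported as "not a link".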
Um so 472 00:19:25,600 --> 00:19:31,520 abstraction presents a challenge. On one 473 00:19:28,880 --> 00:19:34,320 hand, it hides a bunch of 474 00:19:31,520 --> 00:19:36,720 details, which um is both its strength 475 00:19:34,320 --> 00:19:39,919 and weakness. It allows higher level 476 00:19:36,720 --> 00:19:43,039 thinking. It lets us build bigger, 477 00:19:39,919 --> 00:19:44,960 more capable systems, but it also hides 478 00:19:43,039 --> 00:19:47,760 important security-related details and 479 00:19:44,960 --> 00:19:50,160 issues. So, when building secure 480 00:19:47,760 --> 00:19:51,679 systems, it's important to consider our 481 00:19:50,160 --> 00:19:53,039 abstractions. 482 00:19:51,679 --> 00:19:56,320 All right, let's Oh, man, I'm running 483 00:19:53,039 --> 00:19:58,000 out of time. I thought I had more. Um, 484 00:19:56,320 --> 00:20:00,000 so we're back here with our leaky 485 00:19:58,000 --> 00:20:03,039 abstraction, realpath. We already saw 486 00:20:00,000 --> 00:20:05,200 this. Let's turn it into an exploit. 487 00:20:03,039 --> 00:20:07,280 Here we go. Uh, this is our tar file. 488 00:20:05,200 --> 00:20:09,360 The and I'll, uh, the first thing we do 489 00:20:07,280 --> 00:20:11,280 is we create that directory. The next 490 00:20:09,360 --> 00:20:14,559 thing we do is we have to create a 491 00:20:11,280 --> 00:20:16,559 symlink to the bottom of that directory. 492 00:20:14,559 --> 00:20:18,160 And then we use that symlink we 493 00:20:16,559 --> 00:20:20,720 created to create the really long link. 494 00:20:18,160 --> 00:20:22,799 And we need this actually because you 495 00:20:20,720 --> 00:20:24,720 know how that path is too long. We can't 496 00:20:22,799 --> 00:20:26,720 just create the symlink or the creation 497 00:20:24,720 --> 00:20:28,799 of the symlink will return the same 498 00:20:26,720 --> 00:20:30,559 error.
So we use the symlink to kind of 499 00:20:28,799 --> 00:20:33,039 trick it into creating the thing that 500 00:20:30,559 --> 00:20:34,720 exceeds the length. Uh and then we have 501 00:20:33,039 --> 00:20:36,559 this escape symlink. And this is really 502 00:20:34,720 --> 00:20:38,880 just to make things nice. And then we 503 00:20:36,559 --> 00:20:40,320 have our whoop our payload at the end 504 00:20:38,880 --> 00:20:46,240 which is to overwrite 505 00:20:40,320 --> 00:20:48,559 authorized_keys. And what happens is the uh input. 506 00:20:46,240 --> 00:20:49,919 So this is the file at the bottom and 507 00:20:48,559 --> 00:20:53,039 this is kind of reversing those 508 00:20:49,919 --> 00:20:55,360 symlinks. Uh the escape is actually uh this 509 00:20:53,039 --> 00:20:56,960 is what it means. So we have this really 510 00:20:55,360 --> 00:20:58,559 long path and that really long link 511 00:20:56,960 --> 00:21:00,080 remember is this um this directory 512 00:20:58,559 --> 00:21:02,960 traversal which takes us all the way 513 00:21:00,080 --> 00:21:05,520 down to the root. But because of the bug 514 00:21:02,960 --> 00:21:08,720 uh because of this PATH_MAX bug what 515 00:21:05,520 --> 00:21:10,880 happens instead is that realpath 516 00:21:08,720 --> 00:21:14,240 returns this 517 00:21:10,880 --> 00:21:16,400 um but in reality it's this 518 00:21:14,240 --> 00:21:20,159 and because realpath returns this, 519 00:21:16,400 --> 00:21:22,559 target_path in our filter is set to the 520 00:21:20,159 --> 00:21:24,320 output of realpath and this check with 521 00:21:22,559 --> 00:21:27,840 commonpath at the end of the 522 00:21:24,320 --> 00:21:30,880 filter passes and we now can point 523 00:21:27,840 --> 00:21:32,640 target_path anywhere we want. We can 524 00:21:30,880 --> 00:21:34,559 write anywhere in the file system we 525 00:21:32,640 --> 00:21:37,840 have permissions to. Our arbitrary write 526 00:21:34,559 --> 00:21:40,240 is achieved.
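The containment check that the exploit slips past can be sketched roughly as below. This is a simplified illustration of a realpath-then-commonpath test, not the actual tarfile source; the helper name is_inside and the member names are hypothetical. It shows why the check is only as trustworthy as the path realpath hands back: if realpath silently returns something other than the true final location, the comparison passes for the wrong path.

```python
import os
import tempfile


def is_inside(dest, member_name):
    """Return True if member_name appears to resolve to a path under dest."""
    dest_path = os.path.realpath(dest)
    # Everything below trusts that realpath reports the real final location.
    target_path = os.path.realpath(os.path.join(dest_path, member_name))
    return os.path.commonpath([target_path, dest_path]) == dest_path


with tempfile.TemporaryDirectory() as tmp:
    # A member that stays inside the destination directory.
    inside_ok = is_inside(tmp, os.path.join("sub", "file.txt"))
    # A member that climbs out with a plain ".." traversal.
    inside_escape = is_inside(tmp, os.path.join("..", "outside.txt"))

print(inside_ok, inside_escape)
```

A plain ".." traversal like this is caught. The attack in the talk works one level down, by making the resolution step itself give up partway.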
Um 527 00:21:37,840 --> 00:21:44,679 so uh you might be thinking like why do 528 00:21:40,240 --> 00:21:44,679 we even have symlinks in tar files? 529 00:21:44,960 --> 00:21:50,720 Well this got me also thinking about 530 00:21:47,200 --> 00:21:52,559 symlinks and incentives. Um so I'm 531 00:21:50,720 --> 00:21:55,039 going to simplify the world down to just 532 00:21:52,559 --> 00:21:57,600 a library user and a library maintainer. 533 00:21:55,039 --> 00:22:01,600 And statistically the maintainer is just 534 00:21:57,600 --> 00:22:03,440 one person uh in their spare time. Um 535 00:22:01,600 --> 00:22:04,960 and the user is kind of interested in 536 00:22:03,440 --> 00:22:06,960 these things. They they want to ship 537 00:22:04,960 --> 00:22:08,320 features. They don't want to break their 538 00:22:06,960 --> 00:22:10,240 stuff. They want to keep it working. 539 00:22:08,320 --> 00:22:12,320 They want their boss to be happy. But a 540 00:22:10,240 --> 00:22:14,480 maintainer, our our statistical 541 00:22:12,320 --> 00:22:16,000 maintainer is more interested in 542 00:22:14,480 --> 00:22:18,000 these sorts of things. The 543 00:22:16,000 --> 00:22:19,600 maintainer has a day job. They have a 544 00:22:18,000 --> 00:22:21,200 personal life. They might have a family. 545 00:22:19,600 --> 00:22:23,120 They might have health problems. They 546 00:22:21,200 --> 00:22:25,679 might be caring for elderly people. Who 547 00:22:23,120 --> 00:22:27,679 knows? They just want to resolve their 548 00:22:25,679 --> 00:22:29,679 issues quickly and keep their users 549 00:22:27,679 --> 00:22:31,840 happy. 550 00:22:29,679 --> 00:22:34,400 And at the end of the day, the user only 551 00:22:31,840 --> 00:22:37,120 cares about a limited feature set, but 552 00:22:34,400 --> 00:22:39,200 the maintainer in wanting to keep their 553 00:22:37,120 --> 00:22:41,440 users happy is interested in building a 554 00:22:39,200 --> 00:22:43,440 general-purpose library.
So symlinks 555 00:22:41,440 --> 00:22:46,000 are in tar because it's a 556 00:22:43,440 --> 00:22:48,320 general-purpose library. 557 00:22:46,000 --> 00:22:51,039 So what can we take away from 558 00:22:48,320 --> 00:22:54,720 this? Uh you need to be wary of 559 00:22:51,039 --> 00:22:57,760 incentives. Um, for a user, be wary of 560 00:22:54,720 --> 00:23:01,280 large feature sets. 561 00:22:57,760 --> 00:23:02,880 Disable things you don't need. 562 00:23:01,280 --> 00:23:04,640 Um, be careful of things you're opted 563 00:23:02,880 --> 00:23:07,679 into without being able to control. For 564 00:23:04,640 --> 00:23:10,000 a maintainer, think about building 565 00:23:07,679 --> 00:23:13,280 secure by default and making stuff that 566 00:23:10,000 --> 00:23:15,919 isn't core optional. Uh, and and make it 567 00:23:13,280 --> 00:23:17,520 optional, but also make it opt-in. And I 568 00:23:15,919 --> 00:23:18,880 really like Django's approach to this in 569 00:23:17,520 --> 00:23:21,120 some ways where you can pick and choose 570 00:23:18,880 --> 00:23:23,520 the bits that you you want. So, beware 571 00:23:21,120 --> 00:23:24,960 of different incentives. 572 00:23:23,520 --> 00:23:27,120 All right, I've got a minute or two 573 00:23:24,960 --> 00:23:29,520 left. I think we've got time for a demo. 574 00:23:27,120 --> 00:23:31,280 Uh, hopefully this all works. I only I 575 00:23:29,520 --> 00:23:34,880 had multiple demos, but I'll do this one 576 00:23:31,280 --> 00:23:36,480 first. So, this is my We've got a file 577 00:23:34,880 --> 00:23:37,760 system here. We have two directories. 578 00:23:36,480 --> 00:23:40,000 This is where we're going to extract the 579 00:23:37,760 --> 00:23:41,120 file into, "destination". "Do not touch" is 580 00:23:40,000 --> 00:23:42,480 the thing we don't want to touch. We 581 00:23:41,120 --> 00:23:45,679 have our very important document in 582 00:23:42,480 --> 00:23:48,240 here.
Uh, if I open it, it says on my 583 00:23:45,679 --> 00:23:51,200 screen, not yours. Uh, let me drag this 584 00:23:48,240 --> 00:23:53,600 over. Says hello world. Okay, this is 585 00:23:51,200 --> 00:23:56,720 great. Um, but in our demo directory 586 00:23:53,600 --> 00:23:58,640 here, uh, we have this tar file, 587 00:23:56,720 --> 00:24:01,200 pop.tar. I'm going to I'm going to list 588 00:23:58,640 --> 00:24:03,919 the contents of this tar file. 589 00:24:01,200 --> 00:24:06,159 Uh, it's very long. Um, and because this 590 00:24:03,919 --> 00:24:08,000 is like a Mac-based operating system, 591 00:24:06,159 --> 00:24:09,600 we need really long paths. So, there's a 592 00:24:08,000 --> 00:24:11,520 whole bunch of these directories and 593 00:24:09,600 --> 00:24:13,360 symlinks and stuff going on. But, we can 594 00:24:11,520 --> 00:24:15,840 see at the bottom some of 595 00:24:13,360 --> 00:24:17,760 the stuff we're going to do. Um, those 596 00:24:15,840 --> 00:24:19,360 flag links, one of them creates a 597 00:24:17,760 --> 00:24:21,679 hard link, which I won't get into, and 598 00:24:19,360 --> 00:24:24,159 another changes its contents. Um, so 599 00:24:21,679 --> 00:24:26,640 what I'm going to do is I will try and 600 00:24:24,159 --> 00:24:28,400 find my copy and paste command on the 601 00:24:26,640 --> 00:24:30,000 other screen 602 00:24:28,400 --> 00:24:31,360 because I don't trust myself. Oh, just 603 00:24:30,000 --> 00:24:34,360 to prove that I'm using a vulnerable 604 00:24:31,360 --> 00:24:34,360 version. 605 00:24:35,120 --> 00:24:42,240 We got this. I had to use uv because 606 00:24:37,279 --> 00:24:43,919 like um you going to 607 00:24:42,240 --> 00:24:45,279 All right. This is a vulnerable version, 608 00:24:43,919 --> 00:24:46,880 so it's not one of the ones that's 609 00:24:45,279 --> 00:24:48,720 fixed. I'm using uv because it lets me 610 00:24:46,880 --> 00:24:50,240 pick the Python I want to use.
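Listing an archive before extracting it, as in the demo, can be sketched with the tarfile module as follows. The archive here is a small stand-in built in memory, not the demo file, and the helper name describe_members and the member names are hypothetical. The point is that directories, symlinks, and hard links can all be inspected, along with their link targets, without writing anything to disk.

```python
import io
import tarfile


def describe_members(tar):
    """Return (kind, name, link target) for every member, without extracting."""
    rows = []
    for member in tar.getmembers():
        if member.issym():
            kind = "symlink"
        elif member.islnk():
            kind = "hardlink"
        elif member.isdir():
            kind = "dir"
        else:
            kind = "file"
        rows.append((kind, member.name, member.linkname))
    return rows


# Stand-in archive: one directory and one symlink pointing back up a level.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    folder = tarfile.TarInfo("a")
    folder.type = tarfile.DIRTYPE
    tar.addfile(folder)

    link = tarfile.TarInfo("a/up")
    link.type = tarfile.SYMTYPE
    link.linkname = ".."
    tar.addfile(link)

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    rows = describe_members(tar)

for row in rows:
    print(row)
```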
Um, all 611 00:24:48,720 --> 00:24:51,760 right. Now, we're going to Oh, wait. I'm 612 00:24:50,240 --> 00:24:54,080 not in the destination. We'll go into 613 00:24:51,760 --> 00:24:58,080 the destination. 614 00:24:54,080 --> 00:24:59,440 And now I run my PoC. And the first 615 00:24:58,080 --> 00:25:01,440 thing you'll notice is not a lot 616 00:24:59,440 --> 00:25:03,279 happened there. But if you look in our 617 00:25:01,440 --> 00:25:05,520 do not touch directory, a couple of 618 00:25:03,279 --> 00:25:08,320 things happened. Uh, there's a new file 619 00:25:05,520 --> 00:25:10,240 there. Uh, and the important file has 620 00:25:08,320 --> 00:25:12,559 uh, uh, let's have a look at it. It's 621 00:25:10,240 --> 00:25:15,840 opening up my screen. Uh let's drag it 622 00:25:12,559 --> 00:25:18,000 across. Oh, this one now says that. Um 623 00:25:15,840 --> 00:25:21,360 and this one uh created a new file as 624 00:25:18,000 --> 00:25:24,880 well. So there we go. Uh it works. Uh 625 00:25:21,360 --> 00:25:26,880 which is good. Um 626 00:25:24,880 --> 00:25:29,679 you can't see the cooler demo where I I 627 00:25:26,880 --> 00:25:33,120 created a quick website using uh Gemini 628 00:25:29,679 --> 00:25:34,640 and uh you can upload tars to it. It's 629 00:25:33,120 --> 00:25:36,640 fun. Uh all right, let's just jump back 630 00:25:34,640 --> 00:25:38,080 into the presentation. So wrapping up um 631 00:25:36,640 --> 00:25:40,799 quickly while I've still got a minute. 632 00:25:38,080 --> 00:25:42,720 Um, I hope you've enjoyed me talking 633 00:25:40,799 --> 00:25:44,640 about breaking things. Uh, finding 634 00:25:42,720 --> 00:25:47,760 vulnerabilities is a roller coaster ride 635 00:25:44,640 --> 00:25:50,640 and um, but I also hope you take away 636 00:25:47,760 --> 00:25:55,440 some new ideas about security and 637 00:25:50,640 --> 00:25:58,000 retrofits, abstraction and incentives.
638 00:25:55,440 --> 00:26:00,159 Uh, and uh, like this is important 639 00:25:58,000 --> 00:26:01,440 because attacks are growing particularly 640 00:26:00,159 --> 00:26:04,240 in open source. I don't know if you've 641 00:26:01,440 --> 00:26:06,400 heard about phishing and npm. Um and 642 00:26:04,240 --> 00:26:09,279 regulations are increasing as well and 643 00:26:06,400 --> 00:26:11,039 new laws like the Australian Cyber 644 00:26:09,279 --> 00:26:13,919 Security Act. These are mouthfuls and 645 00:26:11,039 --> 00:26:16,799 the EU Cyber Resilience Act. Um these 646 00:26:13,919 --> 00:26:19,039 are acts that change the landscape for 647 00:26:16,799 --> 00:26:21,279 software engineers in terms of what we 648 00:26:19,039 --> 00:26:23,120 are required to be doing. So tackling 649 00:26:21,279 --> 00:26:26,080 cyber security is more important now 650 00:26:23,120 --> 00:26:29,200 than ever. Uh and finally upgrade your 651 00:26:26,080 --> 00:26:31,520 Python. Uh you can check uh download 652 00:26:29,200 --> 00:26:33,200 stats and like 42% of people 653 00:26:31,520 --> 00:26:35,200 downloading no not people clients 654 00:26:33,200 --> 00:26:36,960 downloading from PyPI are like from 655 00:26:35,200 --> 00:26:38,559 vulnerable versions. Um and if you're 656 00:26:36,960 --> 00:26:40,559 interested in metrics like this stay for 657 00:26:38,559 --> 00:26:44,090 the next talk cuz um they'll talk more 658 00:26:40,559 --> 00:26:52,570 about that. Um yeah that's me. 659 00:26:44,090 --> 00:26:52,570 [Applause] 660 00:26:53,440 --> 00:26:59,279 Thank you, Caleb, for the wonderful 661 00:26:55,600 --> 00:26:59,840 talk. Uh, I do have a speaker gift 662 00:26:59,279 --> 00:27:01,360 for you. 663 00:26:59,840 --> 00:27:03,840 Thank you very much. 664 00:27:01,360 --> 00:27:06,400 A year 2022 month. Uh, we might have 665 00:27:03,840 --> 00:27:06,720 time for one question if anyone has a 666 00:27:06,400 --> 00:27:08,400 question.
667 00:27:06,720 --> 00:27:10,240 If if you want if there's a question, 668 00:27:08,400 --> 00:27:12,000 I'll happily answer it, but I also have 669 00:27:10,240 --> 00:27:14,880 like more demo. So, if you want to see 670 00:27:12,000 --> 00:27:15,600 more demo, 671 00:27:14,880 --> 00:27:17,120 for another, 672 00:27:15,600 --> 00:27:21,400 but anyway, uh, questions probably 673 00:27:17,120 --> 00:27:21,400 better. Um, 674 00:27:34,400 --> 00:27:39,679 sorry. Um, have there been 675 00:27:36,480 --> 00:27:41,760 vulnerabilities found in Python's own 676 00:27:39,679 --> 00:27:45,360 um, you know, legacy type file formats 677 00:27:41,760 --> 00:27:48,240 like uh, uh, pickle or shelve? And uh now 678 00:27:45,360 --> 00:27:50,880 what would you suggest in in terms of 679 00:27:48,240 --> 00:27:53,039 going about opening a pickle or a shelve 680 00:27:50,880 --> 00:27:54,880 from 10 years ago or 20 years ago? 681 00:27:53,039 --> 00:27:56,159 What sort of file format? Sorry. 682 00:27:54,880 --> 00:27:58,880 Pickle or shelve. 683 00:27:56,159 --> 00:28:00,960 Pickle or shelve? I'm not familiar with 684 00:27:58,880 --> 00:28:04,080 shelve. Uh pickle does have its 685 00:28:00,960 --> 00:28:06,799 well-known problems particularly in um 686 00:28:04,080 --> 00:28:09,279 the AI space and sharing models and that 687 00:28:06,799 --> 00:28:11,039 sort of thing. Um so pickles are a well 688 00:28:09,279 --> 00:28:12,480 understood problem and I think there are 689 00:28:11,039 --> 00:28:15,520 mitigations to deal with that. I'm 690 00:28:12,480 --> 00:28:16,960 unfamiliar with the uh shelve example. Um 691 00:28:15,520 --> 00:28:18,159 yeah, but I'm sure there are 692 00:28:16,960 --> 00:28:20,640 vulnerabilities if they're old and 693 00:28:18,159 --> 00:28:22,240 unused. 694 00:28:20,640 --> 00:28:25,600 Uh that's all the time we have for 695 00:28:22,240 --> 00:28:30,100 questions. Uh please join me again in 696 00:28:25,600 --> 00:28:33,350 thanking Caleb for his talk.
Thank 697 00:28:30,100 --> 00:28:33,350 [Applause]