[Music]

Hello everyone, welcome back, and welcome to our first talk of today's Kernel Miniconf. André Almeida is a Linux kernel hacker at Collabora, where he's investigating how to make games run faster and better in a free and open stack. Today he'll be talking to us about the futex2 syscall and the different approaches that he has been taking so far to overcome the limitations of the current futex syscall. Please welcome André.

Thank you very much. Good morning, Australia. So yeah, I'm André Almeida, I'm a kernel developer at Collabora and I'm based in Brazil, a place very far away from Australia, but it's a pleasure to be back here. Today I want to talk about this project that I have been working on for quite some time now, which is futex2. But first, of course, I will explain what futex is, why we need a new futex, what I have done so far, and what is
left.

So, futex is a syscall on Linux that is used for creating sync primitives like mutexes. For instance, the pthread mutex uses futex behind the scenes in its implementation. It's something that we expose to user space so you can create these fast primitives. Given that it is used by pthreads, you can imagine that it is something that is called very often. If you are using any multithreaded application right now, for sure you are running a thousand futexes. You can strace your browser to see the magic happening.

The kernel interface is very, very simple: it is basically a way to sleep and wake threads, and all the logic needs to be done in user space.

Here is a kind of visual explanation of how the syscall works. Imagine that this thread over here got the lock, it got the mutex lock, so it's here in user space doing a lot of work. Here we have two threads, and
this is how time flows, in that direction here. This thread is happy in user space with the lock taken, doing a lot of work. In the meanwhile, this thread here tries to get the lock, but the lock is taken, so it needs to sleep.

So what pthread does here is issue a futex wait call. You wait on a user space address, which is kind of the ID of the mutex. You just wait on this address until something wakes you up. Then you go to the kernel, you do some initial work, and then the kernel calls the scheduler, so the processor can do something useful while you are waiting for this lock to be freed.

In the meanwhile, let's say that this user space thread finished its work, and after finishing the work you want to give the lock to someone else. Then user space does a futex call using the wake operation, so we get
back to the kernel. We look up whether there is anyone really waiting at this address, and if we find someone, we issue a wake-up call. Then this thread gets back to user space and can finally do its work. So this is a very basic overview of what futex is. I said a lot about mutexes, but this can be used for barriers, semaphores, for a lot of sync primitives.

Okay, but why do we need a new futex API? Over the years, developers have seen some limitations of the futex syscall. I will go over all those limitations in a few minutes, but basically the problem that we have been facing is that we wanted to add new features to the futex syscall, but the futex syscall is kind of a multiplexed and strange interface. I mean, look at this: we have all those arguments, and it needs to be very generic, so the arguments have strange names. This is just val, this is a timeout
but sometimes it can be val2, and we have val3 here. This was made to serve a lot of different operations. You can check the man page for all the crazy things that can be done there. Every time people wanted to add new operations, the maintainers didn't really want to get them into the kernel, because it's very tricky to maintain that code.

Then I wanted to add a new operation to the syscall: the ability to wait on multiple futexes at the same time, because with the old interface we could only wait on a single one. They told me, "Hey, if you want to do that, you need to develop a new interface for that." So here we are, and now we have this project that we call futex2, which tries to discover what this new interface should look like and to get all those new features that we want into this new interface.
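The wait and wake operations described in the talk can be modelled in a few lines. The following Python sketch is a toy user-space model, not the kernel interface: the class name `ToyFutexTable`, the dict standing in for memory, and the method names are all assumptions made for illustration. It tries to show the two properties the talk relies on: a waiter only goes to sleep if the word still holds the expected value, and a wake call wakes at most the requested number of waiters queued on that address.

```python
import threading


class ToyFutexTable:
    """Toy model of the kernel side of futex: waiters are queued per address."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiters = {}  # address -> list of threading.Event, one per sleeper

    def wait(self, memory, addr, expected, timeout=None):
        """Model of the wait operation.

        Sleeps only if memory[addr] still equals `expected`. Returns True if
        woken by wake(), False on a value mismatch or a timeout.
        """
        with self._lock:
            if memory[addr] != expected:
                return False  # the word changed, so do not sleep
            event = threading.Event()
            self._waiters.setdefault(addr, []).append(event)
        if event.wait(timeout):
            return True
        with self._lock:
            queue = self._waiters.get(addr, [])
            if event in queue:
                queue.remove(event)  # timed out while still queued
                return False
        return True  # a wake() dequeued us just as the timeout expired

    def wake(self, addr, count=1):
        """Model of the wake operation: wake up to `count` waiters on addr."""
        with self._lock:
            queue = self._waiters.get(addr, [])
            woken, queue[:] = queue[:count], queue[count:]
        for event in woken:
            event.set()
        return len(woken)
```

A mutex built on top of this would first try to take the lock in user space and only call `wait()` when the lock is already held, which is why all the locking logic lives in user space and the kernel side stays this small.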
So, first of all, wait on multiple futexes, as I said before. You basically wait on multiple futexes, and if some futex triggers a wake, you just wake up. This operation can be found on other operating systems like Windows; it is very similar to WaitForMultipleObjects from the Windows API. All of this work has been done to be used in Steam Proton. Steam Proton is a compatibility layer that allows you to run Windows games on Linux, and Linux didn't have a proper way to wait for multiple things at the same time. I mean, we have eventfd, which can be used for that, but with a very large number of waiters it doesn't scale very well, and we found out that futex suits that better. So in Proton we call this syscall to wait on multiple futexes, to map to this Windows function. What we saw is that you can't always
increase the frame rate for a game, but almost always you can decrease the CPU utilization, or the kernel load that you need to wait for those objects. Thus, usually, we can decrease the power usage, and if it's a portable device we can probably decrease the battery usage and make better use of your machine's resources. This function was merged in the kernel that was released this weekend, 5.16.
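As a rough illustration of what the merged interface looks like from user space, the sketch below mirrors the argument layout of `futex_waitv` with Python's `ctypes`. The struct fields follow the kernel's `struct futex_waitv`; the Python names `FutexWaitv`, `build_waiters` and the `futex_waitv` wrapper are hypothetical helpers, and the syscall number is an assumption that should be checked against the syscall table for your architecture. It is a sketch under those assumptions, not a reference binding.

```python
import ctypes

FUTEX_32 = 2           # size flag from linux/futex.h; each entry must set it
FUTEX_WAITV_MAX = 128  # kernel limit on the number of entries per call
SYS_FUTEX_WAITV = 449  # assumed syscall number; check your architecture's table


class FutexWaitv(ctypes.Structure):
    """Mirror of the kernel's struct futex_waitv (24 bytes)."""

    _fields_ = [
        ("val", ctypes.c_uint64),       # expected value of the futex word
        ("uaddr", ctypes.c_uint64),     # user space address of the 32-bit word
        ("flags", ctypes.c_uint32),     # FUTEX_32, optionally a private flag
        ("reserved", ctypes.c_uint32),  # must be zero
    ]


def build_waiters(words, expected):
    """Build the array passed as the first argument of futex_waitv.

    `words` are ctypes.c_uint32 objects that must outlive the call.
    """
    if not 0 < len(words) <= FUTEX_WAITV_MAX:
        raise ValueError("between 1 and FUTEX_WAITV_MAX futexes are required")
    waiters = (FutexWaitv * len(words))()
    for entry, word, value in zip(waiters, words, expected):
        entry.val = value
        entry.uaddr = ctypes.addressof(word)
        entry.flags = FUTEX_32
    return waiters


def futex_waitv(waiters):
    """Hypothetical wrapper: returns the index of a woken futex, or -errno.

    Blocks until one of the words is woken if every word still holds its
    expected value, so only call it when another thread will issue a wake.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    # Arguments: waiters, nr_futexes, flags (must be 0), timeout, clockid.
    ret = libc.syscall(SYS_FUTEX_WAITV, waiters, len(waiters), 0, None, 0)
    return ret if ret >= 0 else -ctypes.get_errno()
```

The return value being an index is what lets a compatibility layer map this call onto an API that reports which object was signalled.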
We basically merged a new syscall for that, so all those new operations will come via new syscalls, and futex_waitv was just the first of them. So you can happily use it right now.

Another interesting use case that people have been wanting on futex is variable sizes, because the current interface can only use 32-bit values, so you need to use a pointer to a 32-bit integer. But there are some very interesting use cases that want to use variable-size futexes. People from Boost, the C++ library, wanted support for variable-size futexes so they can create user space atomic operations on top of that. Also, if you have 64-bit futexes, you can wait on a pointer value. These will come with the new syscalls, and you'll be able to operate on those four different sizes.
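To see why a 64-bit futex matters for waiting on a pointer value, it helps to look at what a 32-bit comparison can observe. The sketch below is only an arithmetic illustration with made-up addresses and a hypothetical helper name (`low32`); it does not call any kernel interface.

```python
def low32(value):
    """Return the part of a 64-bit value that a 32-bit futex word could hold."""
    return value & 0xFFFFFFFF


# Two made-up 64-bit pointer values that differ only in their upper half.
old_pointer = 0x00007F0012345678
new_pointer = 0x00007F1112345678

# A 32-bit comparison cannot tell them apart, so a waiter keyed on the low
# half would go to sleep even though the pointer has changed.
changed_32 = low32(old_pointer) != low32(new_pointer)

# A comparison on the full 64 bits notices the change.
changed_64 = old_pointer != new_pointer
```

With only 32-bit words available, user space has to keep a separate 32-bit counter or flag next to the pointer and wait on that instead.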
Also, a big problem of futex right now for really big cloud and enterprise machines is the lack of NUMA awareness. With futex, when you want to wait on some address, you need to go into a global hash table, and you'll be assigned to a bucket. For instance, here we have a bunch of threads waiting on different addresses, and you can have threads in different buckets. If you have a non-NUMA machine, you have no problem: you just have your hash table there and everything's fine. But if you have a NUMA machine, this global hash table will be assigned to a single node, and basically that means that every other node will pay a penalty to access this hash table. So every futex operation that happens outside of that node will take a little longer. And yeah, this can
have a very strong performance impact on some really big machines. So this is a big pain point of futex, and it is something that we would like to solve in this new interface.

Basically, what is left now is to figure out the NUMA syntax. I think this is the only piece remaining before the new syscalls are sent to the mailing list: to figure out what this NUMA syntax should be. What we have right now is that you pass to the operation a flag called the futex NUMA flag, and then, instead of using just a plain user space address, you point to a struct where you have, in the first place, the integer, the futex word, and then a hint that can be used to specify which node you want to operate on. The value is just the usual value of the futex, and the hint can be from zero to the maximum number of nodes, so you can specify which
NUMA node you want to operate on, or you can just use -1 to say that you want to operate on the current node. But we're not sure yet whether this makes sense for the NUMA applications, the NUMA vendors. So this is a research step that I'm doing right now: talking to people who need this use case to figure out if this is really a match for what they are looking for.

But yeah, I think that's it. This is the state of futex2, so now I can take questions.

Please put your questions, if you have any, in the Venueless questions tab.

Okay, so we have our first question: "One of the cool things that you can do with Windows WaitForMultipleObjects is wait for a mutex at the same time as an I/O result, for example. Is there any plan to get futex notification through epoll?" Nope, there's no such plan right now, because futex and file descriptors didn't work well together in the past. Well, it's kind of tricky to explain, but maybe I can revisit this topic. I'm not sure: would you be able to wait on the futex using the file descriptor, or would you be waiting for the end of the I/O operation? I'm not sure what the syntax for this would be.

"Will futex2 support real-time priority awareness?" This is something that we had sketches of as well. Basically, the biggest pain point of futex in real time is that, in that hash table, you can have a lot of conflicts, and you can't be sure from the user space side how many conflicts you will have, and each conflict increases the time to complete the operation. A few things have been proposed
in the past, like each process having its own hash table, so that you can reduce a lot of your hash collisions. But in my Plumbers session this wasn't an issue, or I mean, this hasn't been discussed again. But yeah, it's a very nice question about how we can get this real-time awareness. We don't have any plans right now, but of course this should be discussed again.

Another question: "Other than Proton, do you have other expected use cases for wait on multiple futexes?" Well, I know that some game engines want to use this as well, and the Boost library people also told me that they have some kind of mass locking API that could benefit from this too. But basically this is a very common load pattern for games: do a bunch of work and wait for everything to complete to
go on to the next frame, stuff like that. So basically it's around games.

"Are there any performance issues with the new syscall?" No, not really. The new syscall is basically kind of a front end for the current futex code, so the performance of this operation is very similar to that of the other futex operations.

"Can we avoid spurious wake-ups with futex2?" Well, I don't think so either, because to avoid that you would need to have the kernel involved even in the uncontended case. You know, if the lock is free, we would need to talk to the kernel anyway. So we can't avoid that with futex2; it's not something that we solved by design. And, importantly, you need to wake up the thread, and the thread needs to fight for the mutex, and it can lose, it can
get back to the kernel. This is really tied to the design of how futexes work, so yeah, we can't avoid that. But as far as I know, spurious wake-ups are not a very big concern on the performance side.

[inaudible]

[Music]

Alison is asking if futex2 will support PI, priority inheritance, using the recently added kernel flags. So far, no. So far no one has proposed a use case that would mix this NUMA awareness and these variable-size futexes with PI, so I don't think so.

Well, it seems that people have a lot of good ideas about what we can do with futex2. Basically, this talk is of course a status update on how this project is going, but it's also really for listening to those ideas that people
want to have in the API. So yeah, thank you very much. I will take all of these ideas into account. Let's see, maybe next year I will come back with all those cool operations and flags.

André, I don't think we have more questions. I don't see any more questions, so I think that might be it.

Thank you very much, André. Thank you for a great talk.

Thank you.

We will now be going to a break. The next talk will be by Paul McKenney, about torturing RCU, and that will be at 11:30, which is in 17 minutes. See you then.