1 00:00:06,320 --> 00:00:11,499 [Music] 2 00:00:19,520 --> 00:00:25,039 welcome back to lca day two 3 00:00:22,400 --> 00:00:26,960 um we have angus with us 4 00:00:25,039 --> 00:00:29,439 angus is a computer science student at 5 00:00:26,960 --> 00:00:31,199 the australian national university 6 00:00:29,439 --> 00:00:33,280 and today he's going to be talking to 7 00:00:31,199 --> 00:00:35,600 you about the memory management toolkit 8 00:00:33,280 --> 00:00:37,760 the mmtk 9 00:00:35,600 --> 00:00:39,520 it's an open source project 10 00:00:37,760 --> 00:00:41,360 and a runtime agnostic garbage 11 00:00:39,520 --> 00:00:43,200 collection framework developed by 12 00:00:41,360 --> 00:00:45,680 researchers at anu 13 00:00:43,200 --> 00:00:48,800 this is angus's first time attending and 14 00:00:45,680 --> 00:00:51,199 talking at lca so go easy on him 15 00:00:48,800 --> 00:00:53,039 no my angus 16 00:00:51,199 --> 00:00:55,360 thank you very much 17 00:00:53,039 --> 00:00:58,879 so good day everyone lovely to be at lca 18 00:00:55,360 --> 00:01:00,879 this year and i hope you all enjoy uh 19 00:00:58,879 --> 00:01:03,280 before we start i would like to 20 00:01:00,879 --> 00:01:05,040 acknowledge the monaro people who are 21 00:01:03,280 --> 00:01:08,080 the traditional custodians of the land 22 00:01:05,040 --> 00:01:09,439 from which i'm speaking um i'd like to 23 00:01:08,080 --> 00:01:11,760 acknowledge and pay my respects to the 24 00:01:09,439 --> 00:01:13,520 elders past present and future and i'd 25 00:01:11,760 --> 00:01:15,200 like to extend those respects to any 26 00:01:13,520 --> 00:01:18,159 aboriginal or torres strait islander 27 00:01:15,200 --> 00:01:21,600 people present with us today 28 00:01:18,159 --> 00:01:24,720 so the story behind this talk begins at 29 00:01:21,600 --> 00:01:26,560 anu where i'm a student 30 00:01:24,720 --> 00:01:28,000 this talk was about a research project 31 00:01:26,560 --> 00:01:30,479 that i did with professor steve 32 
00:01:28,000 --> 00:01:32,799 blackburn over the summer of 2020 and 33 00:01:30,479 --> 00:01:35,680 2021 34 00:01:32,799 --> 00:01:38,079 the project was about integrating 35 00:01:35,680 --> 00:01:40,320 the ruby programming language with mmtk 36 00:01:38,079 --> 00:01:42,799 which is this garbage collection 37 00:01:40,320 --> 00:01:45,680 framework that's been developed by 38 00:01:42,799 --> 00:01:47,680 researchers and students at anu it's 39 00:01:45,680 --> 00:01:48,720 been around for about 15 years or so at 40 00:01:47,680 --> 00:01:49,600 this point 41 00:01:48,720 --> 00:01:51,439 and 42 00:01:49,600 --> 00:01:53,040 uh the reason why i'm giving this talk 43 00:01:51,439 --> 00:01:56,799 is because i learned a lot along the way 44 00:01:53,040 --> 00:01:58,320 about garbage collection how ruby works 45 00:01:56,799 --> 00:01:59,759 mmtk all these things and i thought it'd 46 00:01:58,320 --> 00:02:00,960 be really interesting to share with you 47 00:01:59,759 --> 00:02:03,119 some of the things i learned along the 48 00:02:00,960 --> 00:02:03,119 way 49 00:02:03,200 --> 00:02:06,479 so the plan is we're going to start with 50 00:02:05,360 --> 00:02:08,800 a look at 51 00:02:06,479 --> 00:02:10,560 ruby's existing garbage collector 52 00:02:08,800 --> 00:02:12,400 and how that works we're going to look 53 00:02:10,560 --> 00:02:13,920 at a bit of the memory model some of the 54 00:02:12,400 --> 00:02:16,160 problems that exist with the garbage 55 00:02:13,920 --> 00:02:17,760 collector at the moment and what the 56 00:02:16,160 --> 00:02:19,280 ruby team has done over time to try and 57 00:02:17,760 --> 00:02:21,840 fix those things 58 00:02:19,280 --> 00:02:23,599 we're then going to have a look at mmtk 59 00:02:21,840 --> 00:02:25,200 and see how it can try and solve some of 60 00:02:23,599 --> 00:02:26,879 the ongoing issues that have been 61 00:02:25,200 --> 00:02:29,520 happening and we're going to have a look 62 00:02:26,879 --> 00:02:31,599 at how 
mmtk works and how you can 63 00:02:29,520 --> 00:02:34,560 connect it together to other languages 64 00:02:31,599 --> 00:02:37,760 and we're going to continue on to how 65 00:02:34,560 --> 00:02:40,720 i went about integrating mmtk and ruby in 66 00:02:37,760 --> 00:02:40,720 my summer project 67 00:02:40,959 --> 00:02:44,800 so 68 00:02:41,920 --> 00:02:48,239 let's start with ruby's memory model so 69 00:02:44,800 --> 00:02:50,239 ruby uses fixed sized objects of 40 70 00:02:48,239 --> 00:02:52,640 bytes each so as you can see on the 71 00:02:50,239 --> 00:02:55,680 bottom left hand side of the slide there 72 00:02:52,640 --> 00:02:58,080 we have a single object 40 bytes has a 73 00:02:55,680 --> 00:03:00,720 16 byte header that has some flags that 74 00:02:58,080 --> 00:03:02,560 are used for keeping track of various 75 00:03:00,720 --> 00:03:04,400 type and various state about the object 76 00:03:02,560 --> 00:03:06,319 and the class pointer to whatever the 77 00:03:04,400 --> 00:03:09,120 parent class of that object is 78 00:03:06,319 --> 00:03:11,920 and the sorry the class of that object 79 00:03:09,120 --> 00:03:14,239 is and a body which can store up to 24 80 00:03:11,920 --> 00:03:15,760 bytes of information 81 00:03:14,239 --> 00:03:17,519 now 82 00:03:15,760 --> 00:03:20,800 obviously not all objects are going to 83 00:03:17,519 --> 00:03:22,879 fit in that 24 bytes of space and we can 84 00:03:20,800 --> 00:03:25,040 see in this case our string has managed 85 00:03:22,879 --> 00:03:27,519 to fit but on the right hand side here 86 00:03:25,040 --> 00:03:30,159 we have an array that's too big to fit 87 00:03:27,519 --> 00:03:32,959 in that slot so what ruby does in this 88 00:03:30,159 --> 00:03:34,000 case is it allocates extra space using 89 00:03:32,959 --> 00:03:36,959 malloc 90 00:03:34,000 --> 00:03:38,879 off to the side to store that data 91 00:03:36,959 --> 00:03:42,080 the main objects themselves which ruby 92 00:03:38,879 --> 00:03:44,720 
calls slots are stored in heap pages 93 00:03:42,080 --> 00:03:46,560 which contain 408 slots 94 00:03:44,720 --> 00:03:48,480 now note that this definition of 95 00:03:46,560 --> 00:03:50,000 page is different to 96 00:03:48,480 --> 00:03:52,319 the use of the word page in operating 97 00:03:50,000 --> 00:03:55,840 systems and that sort of thing so i just 98 00:03:52,319 --> 00:03:57,680 want to alleviate any confusion there 99 00:03:55,840 --> 00:04:00,080 so now that we understand how ruby 100 00:03:57,680 --> 00:04:00,959 stores its objects how does ruby 101 00:04:00,080 --> 00:04:02,720 actually 102 00:04:00,959 --> 00:04:04,080 do garbage collection and automatic 103 00:04:02,720 --> 00:04:06,159 memory management 104 00:04:04,080 --> 00:04:08,720 ruby uses what's known as a mark sweep 105 00:04:06,159 --> 00:04:11,280 garbage collection which has two phases 106 00:04:08,720 --> 00:04:13,519 a mark phase and a sweep phase 107 00:04:11,280 --> 00:04:15,840 in the first half the mark phase we 108 00:04:13,519 --> 00:04:18,079 determine what objects are reachable 109 00:04:15,840 --> 00:04:20,959 from the roots of the program where the 110 00:04:18,079 --> 00:04:22,560 roots are like objects on the stack 111 00:04:20,959 --> 00:04:24,479 local variables global variables 112 00:04:22,560 --> 00:04:28,720 anything that the program can access 113 00:04:24,479 --> 00:04:31,360 at a particular point in time 114 00:04:28,720 --> 00:04:33,600 from these roots we trace down into the 115 00:04:31,360 --> 00:04:35,759 heap identifying what objects are 116 00:04:33,600 --> 00:04:38,000 reachable from those roots 117 00:04:35,759 --> 00:04:39,759 we mark those objects using a mark bit 118 00:04:38,000 --> 00:04:42,240 as we can see down the bottom here to 119 00:04:39,759 --> 00:04:44,560 identify that they are reachable we then 120 00:04:42,240 --> 00:04:46,639 recursively search those objects and 121 00:04:44,560 --> 00:04:49,840 find objects they point to and 
mark 122 00:04:46,639 --> 00:04:52,240 those and so on until we've searched the 123 00:04:49,840 --> 00:04:55,360 entire heap 124 00:04:52,240 --> 00:04:57,280 this process is called marking 125 00:04:55,360 --> 00:04:58,880 now this process has identified all 126 00:04:57,280 --> 00:05:01,120 objects that are reachable from the 127 00:04:58,880 --> 00:05:03,120 roots which means anything the program 128 00:05:01,120 --> 00:05:05,440 still could be using so we know these 129 00:05:03,120 --> 00:05:07,680 are not safe to garbage collect 130 00:05:05,440 --> 00:05:10,000 but any objects that aren't in use like 131 00:05:07,680 --> 00:05:11,759 those two grey ones that aren't marked 132 00:05:10,000 --> 00:05:13,680 it is safe to collect them because 133 00:05:11,759 --> 00:05:16,000 they're not reachable from the program 134 00:05:13,680 --> 00:05:19,600 therefore the program cannot access them 135 00:05:16,000 --> 00:05:21,600 in any way so it's safe to reuse them 136 00:05:19,600 --> 00:05:23,199 so after the mark phase is done we go 137 00:05:21,600 --> 00:05:25,440 through the sweep phase of actually 138 00:05:23,199 --> 00:05:27,120 collecting the objects 139 00:05:25,440 --> 00:05:29,199 we start by looking at the mark bit of 140 00:05:27,120 --> 00:05:31,199 each object and identify 141 00:05:29,199 --> 00:05:32,720 um whether or not it should be collected 142 00:05:31,199 --> 00:05:34,000 here we have an object that's marked so 143 00:05:32,720 --> 00:05:36,720 we should keep it 144 00:05:34,000 --> 00:05:39,280 here we have a free slot so we obviously 145 00:05:36,720 --> 00:05:40,800 don't need to do anything 146 00:05:39,280 --> 00:05:42,400 we've got two more objects here which 147 00:05:40,800 --> 00:05:44,639 are marked with the one in the mark bit 148 00:05:42,400 --> 00:05:46,560 so we want to keep those 149 00:05:44,639 --> 00:05:48,720 and we just unset the mark bit as we go 150 00:05:46,560 --> 00:05:50,720 along so that when we do another 
151 00:05:48,720 --> 00:05:52,639 garbage collection cycle later they're 152 00:05:50,720 --> 00:05:54,639 all ready to go again 153 00:05:52,639 --> 00:05:56,720 now as we can see here we have an object 154 00:05:54,639 --> 00:05:59,440 that has 155 00:05:56,720 --> 00:06:01,199 no mark bit set but it does exist which 156 00:05:59,440 --> 00:06:04,880 means it's not reachable therefore it's 157 00:06:01,199 --> 00:06:06,479 garbage so it's safe to collect it 158 00:06:04,880 --> 00:06:08,400 finally we can continue through the heap 159 00:06:06,479 --> 00:06:10,560 doing a similar strategy 160 00:06:08,400 --> 00:06:12,560 keeping any objects that are reachable 161 00:06:10,560 --> 00:06:16,080 and unsetting the mark bit and freeing 162 00:06:12,560 --> 00:06:16,080 any objects that aren't reachable 163 00:06:16,240 --> 00:06:22,319 so this mark sweep garbage collector has 164 00:06:19,039 --> 00:06:22,319 a couple of main problems 165 00:06:22,479 --> 00:06:27,120 the first one is that it's what's known 166 00:06:24,720 --> 00:06:28,560 as a stop the world garbage collector 167 00:06:27,120 --> 00:06:30,240 this means that 168 00:06:28,560 --> 00:06:32,400 you can't have the gc running at the 169 00:06:30,240 --> 00:06:34,560 same time as the ruby code when your 170 00:06:32,400 --> 00:06:37,199 ruby code runs out of memory it needs to 171 00:06:34,560 --> 00:06:39,759 stop the gc will then run in its 172 00:06:37,199 --> 00:06:41,600 entirety and then the ruby code will 173 00:06:39,759 --> 00:06:43,680 continue 174 00:06:41,600 --> 00:06:45,680 now this caused uh ruby to be 175 00:06:43,680 --> 00:06:49,000 known for one particular thing earlier 176 00:06:45,680 --> 00:06:49,000 in its lifetime 177 00:06:52,080 --> 00:06:57,759 and that was long pause times 178 00:06:55,759 --> 00:07:00,479 particularly these would have a huge 179 00:06:57,759 --> 00:07:03,039 impact on latency of the program 180 00:07:00,479 --> 00:07:04,800 in fact this 
issue is so bad that github 181 00:07:03,039 --> 00:07:07,520 which is one of the major ruby users 182 00:07:04,800 --> 00:07:09,599 today and historically uh they would 183 00:07:07,520 --> 00:07:10,960 disable ruby's garbage collector 184 00:07:09,599 --> 00:07:12,880 whenever they were processing a web 185 00:07:10,960 --> 00:07:14,960 request and then they'd re-enable it 186 00:07:12,880 --> 00:07:16,479 later once it was done to avoid a gc 187 00:07:14,960 --> 00:07:17,759 cycle from happening in the middle of a 188 00:07:16,479 --> 00:07:18,639 request 189 00:07:17,759 --> 00:07:21,120 because they were finding it was 190 00:07:18,639 --> 00:07:23,199 impacting their 99th and 95th percentile 191 00:07:21,120 --> 00:07:25,280 latency too much 192 00:07:23,199 --> 00:07:26,880 they did remove this in 2018 though due 193 00:07:25,280 --> 00:07:28,319 to some improvements that have been made 194 00:07:26,880 --> 00:07:29,680 over the time to ruby's garbage 195 00:07:28,319 --> 00:07:32,000 collector 196 00:07:29,680 --> 00:07:34,240 the first of which was the ruby team 197 00:07:32,000 --> 00:07:36,560 implemented lazy sweeping 198 00:07:34,240 --> 00:07:38,080 so remember how we had two phases mark 199 00:07:36,560 --> 00:07:40,160 and sweep 200 00:07:38,080 --> 00:07:42,639 the sweep phase we don't have to do it 201 00:07:40,160 --> 00:07:45,120 all in one go we only have to go as far 202 00:07:42,639 --> 00:07:46,560 as to reclaim enough memory to actually 203 00:07:45,120 --> 00:07:47,520 continue with the execution of the 204 00:07:46,560 --> 00:07:50,080 program 205 00:07:47,520 --> 00:07:51,840 so as we can see here you execute do the 206 00:07:50,080 --> 00:07:54,080 full mark phase and then only sweep a 207 00:07:51,840 --> 00:07:55,840 little bit we can continue executing and 208 00:07:54,080 --> 00:07:56,639 only sweep little bits whenever we need 209 00:07:55,840 --> 00:07:58,400 to 210 00:07:56,639 --> 00:08:00,080 this doesn't actually reduce the amount 
211 00:07:58,400 --> 00:08:02,319 of time taken by the garbage collector 212 00:08:00,080 --> 00:08:04,080 in total but it does amortize it over 213 00:08:02,319 --> 00:08:07,759 the lifetime of the program which 214 00:08:04,080 --> 00:08:09,360 reduces the average pause time 215 00:08:07,759 --> 00:08:11,680 the second thing that was introduced was 216 00:08:09,360 --> 00:08:13,520 generational garbage collection now 217 00:08:11,680 --> 00:08:16,160 generational garbage collection is 218 00:08:13,520 --> 00:08:17,280 possibly it relies on one of the most 219 00:08:16,160 --> 00:08:19,280 important 220 00:08:17,280 --> 00:08:21,360 uh observations in the garbage 221 00:08:19,280 --> 00:08:23,280 collection literature which is 222 00:08:21,360 --> 00:08:25,680 that when you allocate a whole bunch of 223 00:08:23,280 --> 00:08:26,639 objects the majority of them die quite 224 00:08:25,680 --> 00:08:29,759 young 225 00:08:26,639 --> 00:08:31,360 they don't survive many gc cycles 226 00:08:29,759 --> 00:08:33,120 so what ruby does 227 00:08:31,360 --> 00:08:35,200 is 228 00:08:33,120 --> 00:08:37,440 sorry and the reason why this is very 229 00:08:35,200 --> 00:08:39,360 useful is because 230 00:08:37,440 --> 00:08:42,240 these small amount of objects that 231 00:08:39,360 --> 00:08:44,800 survive past an initial cycle they have 232 00:08:42,240 --> 00:08:46,399 to be retraced every single time you go 233 00:08:44,800 --> 00:08:47,920 through the mark phase which is a bit of 234 00:08:46,399 --> 00:08:49,440 a waste of time 235 00:08:47,920 --> 00:08:50,959 so what you can do 236 00:08:49,440 --> 00:08:52,480 is separate the objects into two 237 00:08:50,959 --> 00:08:54,080 generations 238 00:08:52,480 --> 00:08:55,360 an old generation which consists of 239 00:08:54,080 --> 00:08:56,959 objects that have survived some 240 00:08:55,360 --> 00:08:59,279 collection cycles 241 00:08:56,959 --> 00:09:01,279 in ruby's case that's three cycles the 242 00:08:59,279 --> 
00:09:02,880 three cycles that are needed and you 243 00:09:01,279 --> 00:09:05,600 have a young generation which contains 244 00:09:02,880 --> 00:09:07,279 newly allocated objects 245 00:09:05,600 --> 00:09:08,880 and then during a minor garbage 246 00:09:07,279 --> 00:09:10,720 collection cycle you only check the 247 00:09:08,880 --> 00:09:12,800 newly allocated objects because most of 248 00:09:10,720 --> 00:09:14,399 those will die pretty quickly 249 00:09:12,800 --> 00:09:16,000 and then only every now and then when 250 00:09:14,399 --> 00:09:18,240 you're totally out of space you do a 251 00:09:16,000 --> 00:09:20,480 major garbage collection cycle and do 252 00:09:18,240 --> 00:09:22,320 the old things as well 253 00:09:20,480 --> 00:09:25,360 this both improves the throughput and 254 00:09:22,320 --> 00:09:27,440 reduces the pause times of the gc 255 00:09:25,360 --> 00:09:30,399 and the third improvement that's been 256 00:09:27,440 --> 00:09:32,080 made over the years is incremental marking 257 00:09:30,399 --> 00:09:34,800 so you know how with the lazy sweeping 258 00:09:32,080 --> 00:09:37,360 we're able to break up the gc pause time 259 00:09:34,800 --> 00:09:39,360 into multiple smaller steps 260 00:09:37,360 --> 00:09:42,320 by implementing a tri-color marking 261 00:09:39,360 --> 00:09:44,160 algorithm ruby was able to break up 262 00:09:42,320 --> 00:09:46,320 their garbage collection 263 00:09:44,160 --> 00:09:48,560 time into smaller sorry the mark phase 264 00:09:46,320 --> 00:09:50,320 into smaller parts as well and this 265 00:09:48,560 --> 00:09:52,240 overall had another great effect on 266 00:09:50,320 --> 00:09:54,560 reducing the average pause times 267 00:09:52,240 --> 00:09:56,880 although it doesn't actually improve the 268 00:09:54,560 --> 00:09:59,760 throughput 269 00:09:56,880 --> 00:10:01,920 the other major problem that uh this 270 00:09:59,760 --> 00:10:04,320 collector has is that it's a non-moving 271 00:10:01,920 --> 
00:10:06,959 collector when you allocate an object in 272 00:10:04,320 --> 00:10:09,279 a particular place it's never going to 273 00:10:06,959 --> 00:10:10,800 move again it'll just stay there until 274 00:10:09,279 --> 00:10:13,600 it's eventually freed by the garbage 275 00:10:10,800 --> 00:10:15,839 collector when it becomes unreachable 276 00:10:13,600 --> 00:10:17,760 now this causes fragmentation because 277 00:10:15,839 --> 00:10:19,600 once we collect all the stuff in between 278 00:10:17,760 --> 00:10:22,320 we have all these random old objects 279 00:10:19,600 --> 00:10:24,000 dotted around that aren't actually 280 00:10:22,320 --> 00:10:25,200 using as much space as they have 281 00:10:24,000 --> 00:10:26,959 allocated 282 00:10:25,200 --> 00:10:28,640 so this overall inflates the memory 283 00:10:26,959 --> 00:10:30,800 usage of the program 284 00:10:28,640 --> 00:10:32,480 now aaron patterson you can see on the 285 00:10:30,800 --> 00:10:34,800 bottom right hand corner there he's been 286 00:10:32,480 --> 00:10:37,200 doing some really great work on ruby's 287 00:10:34,800 --> 00:10:39,440 gc and implementing a mark compact 288 00:10:37,200 --> 00:10:41,120 algorithm which essentially compacts 289 00:10:39,440 --> 00:10:42,079 down all the objects into fewer heap 290 00:10:41,120 --> 00:10:43,760 pages 291 00:10:42,079 --> 00:10:47,040 as you can see on his slides there from 292 00:10:43,760 --> 00:10:49,839 his excellent rubyconf 2020 talk 293 00:10:47,040 --> 00:10:52,800 there is a substantial saving in memory 294 00:10:49,839 --> 00:10:55,279 usage thanks to this algorithm however 295 00:10:52,800 --> 00:10:57,040 work on this is still ongoing although 296 00:10:55,279 --> 00:10:58,640 it does support automatic collection 297 00:10:57,040 --> 00:11:01,839 it's turned off by default due to 298 00:10:58,640 --> 00:11:01,839 performance concerns 299 00:11:02,160 --> 00:11:06,000 now 300 00:11:04,320 --> 00:11:08,560 obviously this has been going on 
for a 301 00:11:06,000 --> 00:11:10,079 while but what if you heard from a gc 302 00:11:08,560 --> 00:11:12,000 expert that these techniques for 303 00:11:10,079 --> 00:11:14,079 instance generational garbage collection 304 00:11:12,000 --> 00:11:17,120 which was first proposed in this paper 305 00:11:14,079 --> 00:11:18,200 by lieberman and hewitt was first proposed 306 00:11:17,120 --> 00:11:20,000 back in 307 00:11:18,200 --> 00:11:21,839 1983. 308 00:11:20,000 --> 00:11:24,720 the mark compact algorithm dates back 309 00:11:21,839 --> 00:11:26,480 even further in 1974. 310 00:11:24,720 --> 00:11:27,360 you're probably wondering why has it 311 00:11:26,480 --> 00:11:28,720 taken 312 00:11:27,360 --> 00:11:30,240 ruby so long to implement these 313 00:11:28,720 --> 00:11:32,000 techniques that have been known for a 314 00:11:30,240 --> 00:11:34,959 long time 315 00:11:32,000 --> 00:11:36,800 the answer to this is in many ways 316 00:11:34,959 --> 00:11:38,399 simple and obvious 317 00:11:36,800 --> 00:11:41,200 garbage collection 318 00:11:38,399 --> 00:11:42,560 is hard building gcs is hard 319 00:11:41,200 --> 00:11:44,399 you've got to have specific domain 320 00:11:42,560 --> 00:11:46,399 knowledge about how garbage 321 00:11:44,399 --> 00:11:48,480 collectors work it's easy to learn the 322 00:11:46,399 --> 00:11:50,240 basics about basic algorithms like mark 323 00:11:48,480 --> 00:11:52,320 sweep but for more complex collectors 324 00:11:50,240 --> 00:11:54,639 that can require quite a lot more 325 00:11:52,320 --> 00:11:56,320 reading and understanding 326 00:11:54,639 --> 00:11:58,959 you've also got to make sure you're 327 00:11:56,320 --> 00:12:01,120 focusing on correctness if a gc has a 328 00:11:58,959 --> 00:12:02,240 bug and collects an object that's still 329 00:12:01,120 --> 00:12:04,240 reachable 330 00:12:02,240 --> 00:12:06,959 that can have huge ramifications on the 331 00:12:04,240 --> 00:12:08,480 correctness of the program later 
on some 332 00:12:06,959 --> 00:12:11,120 programmer could try and access that 333 00:12:08,480 --> 00:12:12,800 object and suddenly everything segfaults 334 00:12:11,120 --> 00:12:14,639 and grinds to a halt because the garbage 335 00:12:12,800 --> 00:12:16,639 collector has a bug in it this would be 336 00:12:14,639 --> 00:12:19,200 a huge problem as you can imagine 337 00:12:16,639 --> 00:12:21,360 and finally performance requires 338 00:12:19,200 --> 00:12:23,600 specific domain knowledge to improve 339 00:12:21,360 --> 00:12:25,120 with gcs 340 00:12:23,600 --> 00:12:26,959 but what working on this project has 341 00:12:25,120 --> 00:12:28,880 really taught me uh 342 00:12:26,959 --> 00:12:30,880 over the summer was that 343 00:12:28,880 --> 00:12:33,040 building it's not just that building gcs 344 00:12:30,880 --> 00:12:35,440 is hard it's the fact that building 345 00:12:33,040 --> 00:12:39,519 gcs into production grade language 346 00:12:35,440 --> 00:12:41,600 runtimes is just so much harder again 347 00:12:39,519 --> 00:12:44,240 let's take a look for instance at some 348 00:12:41,600 --> 00:12:48,000 of the major vms and see how big their 349 00:12:44,240 --> 00:12:51,760 gc runtimes are openjdk 100 000 lines 350 00:12:48,000 --> 00:12:53,440 dot net 63 000 v8 53 000 ruby is a 351 00:12:51,760 --> 00:12:55,920 little bit of a baby in comparison only 352 00:12:53,440 --> 00:12:57,440 about 11 000 lines but there's still a 353 00:12:55,920 --> 00:12:58,480 huge amount of complexity in these 354 00:12:57,440 --> 00:13:00,079 systems 355 00:12:58,480 --> 00:13:03,200 and this isn't representative of the 356 00:13:00,079 --> 00:13:05,279 total uh amount of code involved because 357 00:13:03,200 --> 00:13:07,680 gc often seeps into all the other 358 00:13:05,279 --> 00:13:09,120 different parts of these systems 359 00:13:07,680 --> 00:13:10,160 there's so many things you need to worry 360 00:13:09,120 --> 00:13:12,320 about 361 00:13:10,160 --> 
00:13:14,000 from implementation concerns like 362 00:13:12,320 --> 00:13:15,519 concurrency you can imagine if you had a 363 00:13:14,000 --> 00:13:17,600 garbage collector that moved around 364 00:13:15,519 --> 00:13:19,440 objects over time if you're doing that 365 00:13:17,600 --> 00:13:20,959 while a programmer is using the object 366 00:13:19,440 --> 00:13:22,720 that could cause big issues if that 367 00:13:20,959 --> 00:13:23,839 object moved while it was still being 368 00:13:22,720 --> 00:13:25,600 used 369 00:13:23,839 --> 00:13:28,320 you've also got to worry about 370 00:13:25,600 --> 00:13:30,079 integrating the gc into the just in time 371 00:13:28,320 --> 00:13:31,920 compiler if the 372 00:13:30,079 --> 00:13:33,360 language runtime has one of those you've 373 00:13:31,920 --> 00:13:35,760 got to modify the bytecode that's 374 00:13:33,360 --> 00:13:38,480 generated by that compiler to hook into 375 00:13:35,760 --> 00:13:40,560 the garbage collector 376 00:13:38,480 --> 00:13:42,240 and over time as language runtimes 377 00:13:40,560 --> 00:13:44,399 become older 378 00:13:42,240 --> 00:13:46,480 software engineering issues start to be 379 00:13:44,399 --> 00:13:47,839 a problem as well you get abstraction 380 00:13:46,480 --> 00:13:52,320 leakage 381 00:13:47,839 --> 00:13:54,160 where parts of the runtime assume that 382 00:13:52,320 --> 00:13:55,920 there's a fixed memory model using a 383 00:13:54,160 --> 00:13:57,360 fixed garbage collector 384 00:13:55,920 --> 00:13:59,680 they assume how it will work for 385 00:13:57,360 --> 00:14:02,639 instance in ruby it's assumed that the 386 00:13:59,680 --> 00:14:04,560 garbage collector works on 40 byte slots 387 00:14:02,639 --> 00:14:06,639 nowadays they've discovered that 40 byte 388 00:14:04,560 --> 00:14:08,800 slots are bad for locality 389 00:14:06,639 --> 00:14:10,800 cpu cache locality because occasionally 390 00:14:08,800 --> 00:14:13,199 you'll have an array 391 00:14:10,800 --> 
00:14:16,000 allocated totally separately and this 392 00:14:13,199 --> 00:14:18,800 pointer update often causes a cache miss 393 00:14:16,000 --> 00:14:20,720 so they want to improve locality by 394 00:14:18,800 --> 00:14:22,320 allocating them all in one giant object 395 00:14:20,720 --> 00:14:24,480 but due to all the assumptions in the 396 00:14:22,320 --> 00:14:26,560 runtime about how the garbage collector 397 00:14:24,480 --> 00:14:28,839 works this is really difficult because 398 00:14:26,560 --> 00:14:31,199 so many things have undocumented 399 00:14:28,839 --> 00:14:33,279 assumptions you also have 400 00:14:31,199 --> 00:14:36,000 programmers using the gc for things that 401 00:14:33,279 --> 00:14:38,800 aren't necessarily gc related tasks 402 00:14:36,000 --> 00:14:40,800 for instance the gc is often used to 403 00:14:38,800 --> 00:14:44,000 close and flush files after they're no 404 00:14:40,800 --> 00:14:45,360 longer accessible and that sort of thing 405 00:14:44,000 --> 00:14:46,880 and finally 406 00:14:45,360 --> 00:14:49,120 one of the problems is backwards 407 00:14:46,880 --> 00:14:51,600 compatibility in ruby's case c 408 00:14:49,120 --> 00:14:53,839 extensions are a really large concern 409 00:14:51,600 --> 00:14:56,639 so ruby gems which are like libraries for 410 00:14:53,839 --> 00:14:58,959 ruby they can implement their own c code 411 00:14:56,639 --> 00:15:01,040 which hooks into ruby's api to perform 412 00:14:58,959 --> 00:15:04,399 tasks faster 413 00:15:01,040 --> 00:15:06,720 however this c api is fixed it was built 414 00:15:04,399 --> 00:15:07,760 a long time ago in ruby's history 415 00:15:06,720 --> 00:15:11,440 and it 416 00:15:07,760 --> 00:15:13,040 programmers all assume how the gc works 417 00:15:11,440 --> 00:15:14,480 so 418 00:15:13,040 --> 00:15:15,680 when we want to build new garbage 419 00:15:14,480 --> 00:15:17,279 collectors it's really difficult 420 00:15:15,680 --> 00:15:18,480 maintaining compatibility with these 
421 00:15:17,279 --> 00:15:20,160 gems and we can't just drop 422 00:15:18,480 --> 00:15:22,399 compatibility or expect them all to be 423 00:15:20,160 --> 00:15:24,800 updated because then so many libraries 424 00:15:22,399 --> 00:15:24,800 would break 425 00:15:24,880 --> 00:15:28,639 and all of these software engineering 426 00:15:26,560 --> 00:15:29,920 problems cause a huge number of social 427 00:15:28,639 --> 00:15:31,040 factors that are really difficult to 428 00:15:29,920 --> 00:15:33,360 deal with 429 00:15:31,040 --> 00:15:36,079 first of all it's scary making such huge 430 00:15:33,360 --> 00:15:37,519 changes to the language runtime if you 431 00:15:36,079 --> 00:15:38,959 saw a patch being made to your 432 00:15:37,519 --> 00:15:41,600 programming language of a hundred 433 00:15:38,959 --> 00:15:42,959 thousand lines 434 00:15:41,600 --> 00:15:44,959 you would be a little bit hesitant to 435 00:15:42,959 --> 00:15:47,360 merge that without lots and lots of 436 00:15:44,959 --> 00:15:49,759 work being spent to review that and not 437 00:15:47,360 --> 00:15:51,920 everyone has the time and effort to uh 438 00:15:49,759 --> 00:15:53,279 review those changes and because of this 439 00:15:51,920 --> 00:15:55,680 there's a huge amount of inertia behind 440 00:15:53,279 --> 00:15:59,040 these language runtimes because 441 00:15:55,680 --> 00:15:59,040 it's very difficult to change them 442 00:15:59,440 --> 00:16:03,519 so 443 00:16:01,120 --> 00:16:05,360 we find that complexity is 444 00:16:03,519 --> 00:16:08,240 impeding innovation on garbage 445 00:16:05,360 --> 00:16:10,160 collectors the literature has lots of 446 00:16:08,240 --> 00:16:12,399 new innovations and techniques that are 447 00:16:10,160 --> 00:16:13,920 out there yet not many of them are 448 00:16:12,399 --> 00:16:16,320 actually implemented into the production 449 00:16:13,920 --> 00:16:18,800 grade systems that we use every day 450 00:16:16,320 --> 00:16:20,560 so what can 
we do to solve this problem 451 00:16:18,800 --> 00:16:21,839 well this is the aim of the memory 452 00:16:20,560 --> 00:16:24,560 management toolkit which is being 453 00:16:21,839 --> 00:16:26,160 developed by researchers at anu 454 00:16:24,560 --> 00:16:28,000 it's an open source 455 00:16:26,160 --> 00:16:30,560 runtime agnostic garbage collection 456 00:16:28,000 --> 00:16:32,720 framework it provides developers with a 457 00:16:30,560 --> 00:16:34,880 large library of high performance gc 458 00:16:32,720 --> 00:16:36,959 algorithms ranging from the good old 459 00:16:34,880 --> 00:16:39,759 tried and tested ones like you know semi 460 00:16:36,959 --> 00:16:41,839 space mark sweep mark compact all the way 461 00:16:39,759 --> 00:16:44,880 to higher end cutting edge high 462 00:16:41,839 --> 00:16:46,720 performance collectors like immix 463 00:16:44,880 --> 00:16:49,759 it was originally developed in the early 464 00:16:46,720 --> 00:16:51,120 2000s for jikes rvm which was ibm's 465 00:16:49,759 --> 00:16:53,680 research 466 00:16:51,120 --> 00:16:55,120 virtual machine for java 467 00:16:53,680 --> 00:16:57,519 by professor steve blackburn my 468 00:16:55,120 --> 00:16:59,199 supervisor along with perry cheng and 469 00:16:57,519 --> 00:17:00,959 kathryn mckinley 470 00:16:59,199 --> 00:17:03,440 in recent years it has been rewritten in 471 00:17:00,959 --> 00:17:05,919 rust with the idea that we can enable 472 00:17:03,440 --> 00:17:08,079 interoperability with not just jikes rvm 473 00:17:05,919 --> 00:17:11,520 but also other language runtimes and 474 00:17:08,079 --> 00:17:15,360 that's the idea of my project today 475 00:17:11,520 --> 00:17:18,559 these gc algorithms are exposed behind a 476 00:17:15,360 --> 00:17:20,480 single bi-directional api 477 00:17:18,559 --> 00:17:22,079 now the advantage of this for language 478 00:17:20,480 --> 00:17:24,480 implementers is you only have to 479 00:17:22,079 --> 00:17:27,039 implement this api once 
and then 480 00:17:24,480 --> 00:17:28,640 you can benefit in future from 481 00:17:27,039 --> 00:17:31,679 any and all improvements that occur to 482 00:17:28,640 --> 00:17:33,280 these garbage collection algorithms 483 00:17:31,679 --> 00:17:35,200 so all these software engineering 484 00:17:33,280 --> 00:17:36,799 challenges that we have that i talked 485 00:17:35,200 --> 00:17:38,960 about beforehand 486 00:17:36,799 --> 00:17:40,720 yes we can't make them disappear but if 487 00:17:38,960 --> 00:17:42,000 we go through the hard work of 488 00:17:40,720 --> 00:17:44,080 implementing 489 00:17:42,000 --> 00:17:45,760 uh an interface for a garbage collector 490 00:17:44,080 --> 00:17:48,960 once we won't have to deal with these 491 00:17:45,760 --> 00:17:48,960 problems again in the future 492 00:17:49,200 --> 00:17:54,480 so this is really powerful and enables 493 00:17:51,520 --> 00:17:56,799 long-term innovation on gcs 494 00:17:54,480 --> 00:17:58,559 mmtk is also really helpful for garbage 495 00:17:56,799 --> 00:18:00,640 collection researchers and just anyone 496 00:17:58,559 --> 00:18:02,640 in general who wants to build a gc it 497 00:18:00,640 --> 00:18:04,880 provides a composable framework for 498 00:18:02,640 --> 00:18:06,320 building gcs so components of the 499 00:18:04,880 --> 00:18:09,520 software that are reused really 500 00:18:06,320 --> 00:18:12,000 frequently like allocation or 501 00:18:09,520 --> 00:18:13,280 bump allocation or copy spaces or 502 00:18:12,000 --> 00:18:14,960 these sorts of things that are common to 503 00:18:13,280 --> 00:18:16,160 many garbage collection algorithms 504 00:18:14,960 --> 00:18:17,760 they're already there and you can just 505 00:18:16,160 --> 00:18:19,760 compose them together to create brand 506 00:18:17,760 --> 00:18:21,600 new collectors 507 00:18:19,760 --> 00:18:24,000 it abstracts away the complexity because 508 00:18:21,600 --> 00:18:26,080 it's behind this api which means that gc 509 
00:18:24,000 --> 00:18:28,160 researchers can focus on the thing they 510 00:18:26,080 --> 00:18:29,760 know best gcs and they don't have to 511 00:18:28,160 --> 00:18:31,200 worry about all those software 512 00:18:29,760 --> 00:18:32,880 engineering details that i talked about 513 00:18:31,200 --> 00:18:35,360 beforehand 514 00:18:32,880 --> 00:18:37,280 and possibly most importantly it enables 515 00:18:35,360 --> 00:18:39,200 apples to apples comparisons between 516 00:18:37,280 --> 00:18:40,880 different garbage collection algorithms 517 00:18:39,200 --> 00:18:43,840 running on the same or even different 518 00:18:40,880 --> 00:18:43,840 language runtimes 519 00:18:44,160 --> 00:18:47,360 so 520 00:18:45,520 --> 00:18:48,559 how do we actually go about integrating 521 00:18:47,360 --> 00:18:50,799 mmtk 522 00:18:48,559 --> 00:18:53,120 into a language 523 00:18:50,799 --> 00:18:55,840 well we create what's known as a binding 524 00:18:53,120 --> 00:18:57,840 which consists of two different parts 525 00:18:55,840 --> 00:18:59,919 the first half of the binding 526 00:18:57,840 --> 00:19:01,919 is conceptually closer to 527 00:18:59,919 --> 00:19:04,559 the language runtime in this case it's 528 00:19:01,919 --> 00:19:06,320 matz's ruby interpreter mri which is the 529 00:19:04,559 --> 00:19:08,559 most commonly used interpreter for the 530 00:19:06,320 --> 00:19:11,440 ruby programming language 531 00:19:08,559 --> 00:19:13,600 and on the right we have uh mri specific 532 00:19:11,440 --> 00:19:16,400 mmtk code which is conceptually closer to 533 00:19:13,600 --> 00:19:18,160 but separate from the mmtk code base 534 00:19:16,400 --> 00:19:20,240 the idea is that the language runtime and 535 00:19:18,160 --> 00:19:22,720 mmtk should remain 536 00:19:20,240 --> 00:19:24,799 uh completely agnostic of the garbage 537 00:19:22,720 --> 00:19:26,400 collection algorithm being used and the 538 00:19:24,799 --> 00:19:28,080 language that those garbage collection 539 
00:19:26,400 --> 00:19:29,679 algorithms are running on 540 00:19:28,080 --> 00:19:33,039 and then the binding contains all the 541 00:19:29,679 --> 00:19:35,280 details that hook these things together 542 00:19:33,039 --> 00:19:37,840 so whilst it may be possible to perform 543 00:19:35,280 --> 00:19:40,880 a generic allocation call 544 00:19:37,840 --> 00:19:43,120 from mri straight through to mmtk these 545 00:19:40,880 --> 00:19:46,000 bindings allow us to do smarter things 546 00:19:43,120 --> 00:19:48,320 like we can optimize the fast paths by 547 00:19:46,000 --> 00:19:50,400 only 548 00:19:48,320 --> 00:19:51,679 calling into the binding itself and then 549 00:19:50,400 --> 00:19:54,000 running code that's specifically 550 00:19:51,679 --> 00:19:55,200 tailored for the combination of ruby and 551 00:19:54,000 --> 00:19:57,600 mmtk 552 00:19:55,200 --> 00:19:59,360 and then in exceptional cases where it's 553 00:19:57,600 --> 00:20:01,440 difficult to optimize the code we can 554 00:19:59,360 --> 00:20:03,039 call out to mmtk proper which has all of 555 00:20:01,440 --> 00:20:04,320 the relevant logic 556 00:20:03,039 --> 00:20:06,400 and certainly this works back in the 557 00:20:04,320 --> 00:20:08,799 other direction we may scan an object 558 00:20:06,400 --> 00:20:10,799 and the binding can then know how to 559 00:20:08,799 --> 00:20:12,880 read all of ruby's objects and identify 560 00:20:10,799 --> 00:20:15,840 what pointers they point to or what 561 00:20:12,880 --> 00:20:15,840 objects they point to 562 00:20:16,080 --> 00:20:19,840 and then over time the idea is as we 563 00:20:18,640 --> 00:20:21,440 build up bindings for all these 564 00:20:19,840 --> 00:20:23,039 programming languages all the 565 00:20:21,440 --> 00:20:24,400 programming languages will be able to 566 00:20:23,039 --> 00:20:25,679 benefit from 567 00:20:24,400 --> 00:20:28,159 mmtk 568 00:20:25,679 --> 00:20:30,400 and mmtk and gc researchers will be able 569 
00:20:28,159 --> 00:20:31,760 to benefit from having production grade 570 00:20:30,400 --> 00:20:34,640 integrations 571 00:20:31,760 --> 00:20:36,480 with language runtimes 572 00:20:34,640 --> 00:20:40,240 we currently have officially supported 573 00:20:36,480 --> 00:20:43,679 runtimes for openjdk uh 574 00:20:40,240 --> 00:20:46,640 jikes rvm and v8 and there's currently 575 00:20:43,679 --> 00:20:49,520 runtime integrations in progress for 576 00:20:46,640 --> 00:20:53,679 ruby julia ghc which is the glasgow 577 00:20:49,520 --> 00:20:55,280 haskell compiler and the .net framework 578 00:20:53,679 --> 00:20:58,480 so how do we actually go about 579 00:20:55,280 --> 00:21:00,799 integrating mmtk into ruby specifically 580 00:20:58,480 --> 00:21:02,640 well this was my project that i was 581 00:21:00,799 --> 00:21:04,640 working on over the summer of 2020 and 582 00:21:02,640 --> 00:21:06,960 2021 583 00:21:04,640 --> 00:21:07,919 now it may seem a little bit confusing 584 00:21:06,960 --> 00:21:09,600 but 585 00:21:07,919 --> 00:21:11,360 the first thing we do 586 00:21:09,600 --> 00:21:12,640 is we get rid of the garbage collection 587 00:21:11,360 --> 00:21:14,640 altogether 588 00:21:12,640 --> 00:21:17,120 we build what's known as 589 00:21:14,640 --> 00:21:19,440 no gc a garbage collector that 590 00:21:17,120 --> 00:21:21,360 doesn't collect any garbage instead we 591 00:21:19,440 --> 00:21:23,919 just make all these allocation calls and 592 00:21:21,360 --> 00:21:26,240 over time they build up and the memory 593 00:21:23,919 --> 00:21:27,200 is never freed 594 00:21:26,240 --> 00:21:28,880 now 595 00:21:27,200 --> 00:21:30,400 that may seem a little bit confusing why 596 00:21:28,880 --> 00:21:32,159 are we just chucking out all the things 597 00:21:30,400 --> 00:21:33,600 that exist and 598 00:21:32,159 --> 00:21:34,880 making something that's objectively 599 00:21:33,600 --> 00:21:36,799 worse 600 00:21:34,880 --> 00:21:39,200 well as i 
said earlier there's a huge 601 00:21:36,799 --> 00:21:40,720 amount of complexity in integrating with 602 00:21:39,200 --> 00:21:42,840 language runtimes all those software 603 00:21:40,720 --> 00:21:45,840 engineering issues that i talked about 604 00:21:42,840 --> 00:21:47,679 earlier and as you can see mmtk has to 605 00:21:45,840 --> 00:21:50,080 have a large api in order to work with a 606 00:21:47,679 --> 00:21:53,039 variety of different garbage collectors 607 00:21:50,080 --> 00:21:55,039 but by starting simple and starting with 608 00:21:53,039 --> 00:21:57,440 no gc which only needs to allocate 609 00:21:55,039 --> 00:21:59,520 things and initialize 610 00:21:57,440 --> 00:22:00,880 mmtk's information 611 00:21:59,520 --> 00:22:02,480 it makes the task a lot simpler to 612 00:22:00,880 --> 00:22:05,520 start off with and we can slowly build 613 00:22:02,480 --> 00:22:07,200 confidence in working with ruby's code 614 00:22:05,520 --> 00:22:09,520 over time we can implement larger 615 00:22:07,200 --> 00:22:13,840 subsets of the 616 00:22:09,520 --> 00:22:16,400 mmtk api and as larger subsets 617 00:22:13,840 --> 00:22:18,000 of the api are implemented more and more 618 00:22:16,400 --> 00:22:18,880 garbage collection algorithms can be 619 00:22:18,000 --> 00:22:21,440 used 620 00:22:18,880 --> 00:22:24,240 and eventually once all of the 621 00:22:21,440 --> 00:22:25,679 api is implemented all of the gcs within 622 00:22:24,240 --> 00:22:29,280 mmtk 623 00:22:25,679 --> 00:22:29,280 should be accessible to the language 624 00:22:30,799 --> 00:22:34,320 so what are the steps involved with no 625 00:22:33,039 --> 00:22:36,080 gc 626 00:22:34,320 --> 00:22:38,480 well the first step is we need to 627 00:22:36,080 --> 00:22:40,960 disable the existing garbage collector 628 00:22:38,480 --> 00:22:44,159 in ruby thankfully this is relatively 629 00:22:40,960 --> 00:22:46,000 easy because ruby provides a method 630 00:22:44,159 --> 00:22:48,480 
directly to programmers even so that they 631 00:22:46,000 --> 00:22:50,559 can disable and enable the gc so we just 632 00:22:48,480 --> 00:22:52,880 call this and then stop the gc from 633 00:22:50,559 --> 00:22:54,400 being enabled ever again 634 00:22:52,880 --> 00:22:56,159 obviously if we had 635 00:22:54,400 --> 00:22:59,120 ruby's gc running at the same time as 636 00:22:56,159 --> 00:23:01,039 mmtk's gc that would cause a whole lot of 637 00:22:59,120 --> 00:23:02,880 correctness issues 638 00:23:01,039 --> 00:23:05,440 the second step is we need to initialize 639 00:23:02,880 --> 00:23:07,200 some mmtk metadata once again relatively 640 00:23:05,440 --> 00:23:10,240 easy we need to figure out 641 00:23:07,200 --> 00:23:12,559 early on in the vm startup where we can 642 00:23:10,240 --> 00:23:14,640 hook in and form this connection between 643 00:23:12,559 --> 00:23:16,720 ruby and mmtk 644 00:23:14,640 --> 00:23:18,640 and the third step which is the most 645 00:23:16,720 --> 00:23:21,760 time consuming of all of this is finding 646 00:23:18,640 --> 00:23:24,320 every single allocation call within the 647 00:23:21,760 --> 00:23:26,400 ruby runtime and 648 00:23:24,320 --> 00:23:28,640 hooking those out and replacing them 649 00:23:26,400 --> 00:23:30,640 with mmtk calls 650 00:23:28,640 --> 00:23:32,320 this constituted several weeks worth of 651 00:23:30,640 --> 00:23:33,840 work alone 652 00:23:32,320 --> 00:23:36,159 i needed to build an understanding of 653 00:23:33,840 --> 00:23:38,320 ruby's vm which is difficult when you're 654 00:23:36,159 --> 00:23:40,000 coming in as a newcomer and in general 655 00:23:38,320 --> 00:23:41,840 it's difficult because there's often 656 00:23:40,000 --> 00:23:42,799 like some abstraction breakages like for 657 00:23:41,840 --> 00:23:45,360 instance 658 00:23:42,799 --> 00:23:47,440 uh because ruby uses 659 00:23:45,360 --> 00:23:48,960 separate areas to allocate the objects 660 00:23:47,440 --> 00:23:50,640 themselves 
which are allocated in the 661 00:23:48,960 --> 00:23:51,840 heap and 662 00:23:50,640 --> 00:23:53,919 the 663 00:23:51,840 --> 00:23:56,320 larger sections of memory that are used 664 00:23:53,919 --> 00:23:58,320 to store the overflow space 665 00:23:56,320 --> 00:24:00,880 there were some issues here while the 666 00:23:58,320 --> 00:24:03,039 ruby heap was really easy to fix up 667 00:24:00,880 --> 00:24:05,200 because all those objects are allocated 668 00:24:03,039 --> 00:24:07,520 through something known as uh through a 669 00:24:05,200 --> 00:24:09,840 function known as newobj_of uh 670 00:24:07,520 --> 00:24:11,520 replacing the elements that are malloced 671 00:24:09,840 --> 00:24:13,600 separately was a little bit harder 672 00:24:11,520 --> 00:24:16,080 because ruby uses malloc throughout the 673 00:24:13,600 --> 00:24:19,520 codebase not just for allocating this 674 00:24:16,080 --> 00:24:22,640 side object data but also uh 675 00:24:19,520 --> 00:24:25,919 vm internal information like thread data 676 00:24:22,640 --> 00:24:26,960 the stack and that sort of thing 677 00:24:25,919 --> 00:24:28,720 so 678 00:24:26,960 --> 00:24:30,720 now that we've done that 679 00:24:28,720 --> 00:24:32,480 we've got it working right well no of 680 00:24:30,720 --> 00:24:33,760 course not we we're programmers we know 681 00:24:32,480 --> 00:24:35,679 when you compile something for the first 682 00:24:33,760 --> 00:24:37,520 time it's never going to work 683 00:24:35,679 --> 00:24:39,919 uh but this is a really interesting 684 00:24:37,520 --> 00:24:41,919 failure that i wanted to talk about 685 00:24:39,919 --> 00:24:43,760 so what i found was that it was this 686 00:24:41,919 --> 00:24:45,679 relatively simple test case it was doing 687 00:24:43,760 --> 00:24:46,640 some stuff printing out some data so 688 00:24:45,679 --> 00:24:49,120 then 689 00:24:46,640 --> 00:24:50,960 uh that output data can be piped to a file 690 00:24:49,120 --> 00:24:53,120 and it could be compared 
against a 691 00:24:50,960 --> 00:24:56,480 master version of that file to check that 692 00:24:53,120 --> 00:24:56,480 ruby is operating correctly 693 00:24:56,640 --> 00:24:59,679 and this test was failing because no 694 00:24:58,559 --> 00:25:01,840 data was coming out and i was very 695 00:24:59,679 --> 00:25:04,159 confused about this and so i wrote a 696 00:25:01,840 --> 00:25:06,159 simplified version of the test to see 697 00:25:04,159 --> 00:25:08,480 uh what was the issue so just create you 698 00:25:06,159 --> 00:25:09,279 know a hello world program 699 00:25:08,480 --> 00:25:11,440 uh 700 00:25:09,279 --> 00:25:14,000 then run that program cool seems to be 701 00:25:11,440 --> 00:25:16,559 outputting stuff just fine 702 00:25:14,000 --> 00:25:19,200 but then when we pipe the output to a 703 00:25:16,559 --> 00:25:20,960 file redirect it to a file 704 00:25:19,200 --> 00:25:23,600 it doesn't work there's no data sent to 705 00:25:20,960 --> 00:25:24,799 that file it's empty so what's happening 706 00:25:23,600 --> 00:25:27,200 here 707 00:25:24,799 --> 00:25:29,679 well after spending quite a bit of time 708 00:25:27,200 --> 00:25:31,760 uh looking into this i discovered it was 709 00:25:29,679 --> 00:25:34,320 something to do with the way that 710 00:25:31,760 --> 00:25:36,640 ruby flushes uh 711 00:25:34,320 --> 00:25:38,880 file objects so ruby detects when you're 712 00:25:36,640 --> 00:25:40,400 redirecting to a file and behaves 713 00:25:38,880 --> 00:25:42,559 differently to when you're going 714 00:25:40,400 --> 00:25:44,799 directly to the terminal 715 00:25:42,559 --> 00:25:46,799 it more aggressively flushes the output 716 00:25:44,799 --> 00:25:47,600 so that you see it faster whereas when 717 00:25:46,799 --> 00:25:49,360 it's 718 00:25:47,600 --> 00:25:52,000 flushing to a file it's less aggressive 719 00:25:49,360 --> 00:25:53,840 and only does it in chunks 720 00:25:52,000 --> 00:25:55,440 now this in and of itself i didn't think 721 
00:25:53,840 --> 00:25:56,720 would be an issue because that has 722 00:25:55,440 --> 00:25:58,559 nothing to do with the garbage 723 00:25:56,720 --> 00:26:00,480 collection algorithm it should have 724 00:25:58,559 --> 00:26:02,240 happened anyway and in larger tests i 725 00:26:00,480 --> 00:26:05,120 found that 726 00:26:02,240 --> 00:26:06,880 it did sometimes work and sometimes not 727 00:26:05,120 --> 00:26:08,840 and it wasn't until i eventually tracked 728 00:26:06,880 --> 00:26:12,000 down this code that i figured out the 729 00:26:08,840 --> 00:26:14,559 answer so this is part of the vm 730 00:26:12,000 --> 00:26:16,480 shutdown code so when the ruby program 731 00:26:14,559 --> 00:26:18,799 is finished running and it's about to 732 00:26:16,480 --> 00:26:21,520 close and exit the program 733 00:26:18,799 --> 00:26:23,360 as you can see here it forms a list of 734 00:26:21,520 --> 00:26:24,640 these things called a finalized 735 00:26:23,360 --> 00:26:26,960 list 736 00:26:24,640 --> 00:26:29,200 now this has to do with finalizers which 737 00:26:26,960 --> 00:26:31,279 aren't actually a gc topic 738 00:26:29,200 --> 00:26:32,559 but are often grouped together in the gc 739 00:26:31,279 --> 00:26:35,120 implementation 740 00:26:32,559 --> 00:26:38,080 finalizers are 741 00:26:35,120 --> 00:26:40,240 something that the programmer 742 00:26:38,080 --> 00:26:42,480 can register to run a piece of code when 743 00:26:40,240 --> 00:26:44,159 the object is no longer reachable in 744 00:26:42,480 --> 00:26:45,600 other words it's eligible for garbage 745 00:26:44,159 --> 00:26:47,840 collection but before it's actually 746 00:26:45,600 --> 00:26:50,320 collected this is commonly used for this 747 00:26:47,840 --> 00:26:52,799 specific case of flushing files closing 748 00:26:50,320 --> 00:26:55,600 file descriptors that sort of thing 749 00:26:52,799 --> 00:26:58,240 um and as we can see at vm shutdown it 750 
00:26:55,600 --> 00:27:01,039 loops through all of these file objects 751 00:26:58,240 --> 00:27:02,640 and runs that finalizer now 752 00:27:01,039 --> 00:27:04,559 this wasn't something we were expecting 753 00:27:02,640 --> 00:27:06,000 to see because this isn't necessarily 754 00:27:04,559 --> 00:27:07,360 something that's usually grouped into 755 00:27:06,000 --> 00:27:09,440 gcs 756 00:27:07,360 --> 00:27:11,679 and 757 00:27:09,440 --> 00:27:14,080 uh 758 00:27:11,679 --> 00:27:15,360 oh yeah 759 00:27:14,080 --> 00:27:16,240 sorry i've just forgotten what i was 760 00:27:15,360 --> 00:27:17,200 saying 761 00:27:16,240 --> 00:27:19,200 uh 762 00:27:17,200 --> 00:27:21,360 and so these finalizers weren't being 763 00:27:19,200 --> 00:27:23,039 run because they relied on the gc and i 764 00:27:21,360 --> 00:27:24,640 just replaced the gc 765 00:27:23,039 --> 00:27:26,240 so this is 766 00:27:24,640 --> 00:27:28,480 relatively simple to fix but an 767 00:27:26,240 --> 00:27:30,399 illustrative example of some things that 768 00:27:28,480 --> 00:27:34,000 can happen just by removing the gc that 769 00:27:30,399 --> 00:27:35,840 you don't really expect to occur 770 00:27:34,000 --> 00:27:37,520 now another thing we considered was a 771 00:27:35,840 --> 00:27:39,679 potential optimization that is thread 772 00:27:37,520 --> 00:27:42,720 local allocation buffers so you can 773 00:27:39,679 --> 00:27:45,039 imagine in a vm that supports uh 774 00:27:42,720 --> 00:27:46,640 multiple threads like v8 you may have 775 00:27:45,039 --> 00:27:48,480 allocation calls happening at once 776 00:27:46,640 --> 00:27:50,720 because these often require locks to be 777 00:27:48,480 --> 00:27:53,840 acquired and that sort of thing 778 00:27:50,720 --> 00:27:56,080 there can be some contention that occurs 779 00:27:53,840 --> 00:27:58,960 and this can slow down the program 780 00:27:56,080 --> 00:28:00,799 so a common practice uh when integrating 781 00:27:58,960 --> 00:28:03,039 
with mmtk is to use what's known as 782 00:28:00,799 --> 00:28:03,840 thread local allocation buffers 783 00:28:03,039 --> 00:28:06,399 where 784 00:28:03,840 --> 00:28:08,799 instead of always calling out to mmtk we 785 00:28:06,399 --> 00:28:11,039 acquire just a buffer 786 00:28:08,799 --> 00:28:13,039 for that thread specifically and we 787 00:28:11,039 --> 00:28:15,520 allocate into this buffer whenever we 788 00:28:13,039 --> 00:28:17,440 need more space and then when the buffer 789 00:28:15,520 --> 00:28:19,520 is completely full we can call out to 790 00:28:17,440 --> 00:28:20,640 mmtk via the slow path 791 00:28:19,520 --> 00:28:22,240 uh this 792 00:28:20,640 --> 00:28:23,840 dramatically reduces contention and 793 00:28:22,240 --> 00:28:25,039 improves performance in multithreaded 794 00:28:23,840 --> 00:28:27,360 vms 795 00:28:25,039 --> 00:28:29,679 however we decided this wasn't a concern 796 00:28:27,360 --> 00:28:31,520 in ruby because ruby has what's known as 797 00:28:29,679 --> 00:28:33,279 a global vm lock 798 00:28:31,520 --> 00:28:35,360 if you've used python you've probably 799 00:28:33,279 --> 00:28:36,159 heard of the gil the global interpreter 800 00:28:35,360 --> 00:28:38,960 lock 801 00:28:36,159 --> 00:28:41,120 and ruby's gvl is similar to this it 802 00:28:38,960 --> 00:28:42,720 means that only one thread can execute 803 00:28:41,120 --> 00:28:45,120 at any given time 804 00:28:42,720 --> 00:28:46,960 and because of this the allocation calls 805 00:28:45,120 --> 00:28:49,279 should never collide and hence there'll 806 00:28:46,960 --> 00:28:51,440 be no contention 807 00:28:49,279 --> 00:28:55,120 so we chose not to implement this we may 808 00:28:51,440 --> 00:28:57,279 investigate it further down the line 809 00:28:55,120 --> 00:28:59,440 so that's all that well that's finished 810 00:28:57,279 --> 00:29:00,960 let's get a little bit of a demo 811 00:28:59,440 --> 00:29:02,720 uh 812 00:29:00,960 --> 00:29:04,480 right here i have 
a terminal on a web 813 00:29:02,720 --> 00:29:06,960 browser what we're going to do for the 814 00:29:04,480 --> 00:29:08,880 demo is we're going to create and run a 815 00:29:06,960 --> 00:29:10,159 ruby on rails application if you've 816 00:29:08,880 --> 00:29:11,679 heard of ruby it's probably because 817 00:29:10,159 --> 00:29:13,360 you've heard of rails and want to use 818 00:29:11,679 --> 00:29:16,960 ruby on rails 819 00:29:13,360 --> 00:29:16,960 so as we can see 820 00:29:20,880 --> 00:29:23,520 we have 821 00:29:21,919 --> 00:29:25,840 rails installed which in and of itself 822 00:29:23,520 --> 00:29:29,279 is a challenge because rails relies on 823 00:29:25,840 --> 00:29:30,320 very many gems that use c extensions so 824 00:29:29,279 --> 00:29:32,799 it's great that we've managed to 825 00:29:30,320 --> 00:29:34,240 maintain compatibility with c extensions 826 00:29:32,799 --> 00:29:36,480 which is really important for adoption 827 00:29:34,240 --> 00:29:38,559 of this project in the future 828 00:29:36,480 --> 00:29:42,799 so now we can run 829 00:29:38,559 --> 00:29:44,960 rails new and create a basic application 830 00:29:42,799 --> 00:29:46,480 it's doing its thing checking all the 831 00:29:44,960 --> 00:29:48,000 ruby gems are present which are like 832 00:29:46,480 --> 00:29:50,240 their libraries 833 00:29:48,000 --> 00:29:54,279 it's generating all the pages and basic 834 00:29:50,240 --> 00:29:54,279 framework of a rails application 835 00:29:54,480 --> 00:29:59,600 and we cd into the demo directory 836 00:29:57,840 --> 00:30:01,760 we can see we have our basic 837 00:29:59,600 --> 00:30:02,960 rails thing as we expect 838 00:30:01,760 --> 00:30:04,960 and if we 839 00:30:02,960 --> 00:30:06,960 run rails server 840 00:30:04,960 --> 00:30:08,880 we can start up a fully functional web 841 00:30:06,960 --> 00:30:11,600 server running on 842 00:30:08,880 --> 00:30:11,600 mmtk 843 00:30:13,360 --> 00:30:17,679 whoops my apologies 844 00:30:15,520 --> 
00:30:19,200 and ta-da we have rails running 845 00:30:17,679 --> 00:30:22,960 and as we can see down here in the 846 00:30:19,200 --> 00:30:25,440 terminal this is running on mmtk ruby 847 00:30:22,960 --> 00:30:27,039 now there is a downside to this 848 00:30:25,440 --> 00:30:29,120 implementation which is inherent to the 849 00:30:27,039 --> 00:30:31,279 fact it's a no gc 850 00:30:29,120 --> 00:30:33,120 if we have a look at htop 851 00:30:31,279 --> 00:30:36,000 and run a script in the background to 852 00:30:33,120 --> 00:30:37,919 spam our server with lots of queries we 853 00:30:36,000 --> 00:30:39,840 can see that the memory usage of this 854 00:30:37,919 --> 00:30:41,360 server is monotonically increasing over 855 00:30:39,840 --> 00:30:44,000 time 856 00:30:41,360 --> 00:30:45,679 and that's obviously not good 857 00:30:44,000 --> 00:30:47,840 so what can we do 858 00:30:45,679 --> 00:30:49,600 about this well we need to implement new 859 00:30:47,840 --> 00:30:51,840 allocators which is a story for another 860 00:30:49,600 --> 00:30:51,840 day 861 00:30:51,919 --> 00:30:54,960 so 862 00:30:53,279 --> 00:30:56,799 my apologies for the audio issues there 863 00:30:54,960 --> 00:30:57,840 by the way 864 00:30:56,799 --> 00:30:59,519 you're probably wondering do you have 865 00:30:57,840 --> 00:31:01,919 any performance results 866 00:30:59,519 --> 00:31:04,799 uh no sorry 867 00:31:01,919 --> 00:31:07,600 because the mmtk implementation so far is 868 00:31:04,799 --> 00:31:09,760 just using no gc a garbage collector that 869 00:31:07,600 --> 00:31:12,159 doesn't collect any garbage uh this 870 00:31:09,760 --> 00:31:14,960 would be a totally unfair comparison so 871 00:31:12,159 --> 00:31:16,000 no point showing any performance results 872 00:31:14,960 --> 00:31:18,320 the second thing you're probably 873 00:31:16,000 --> 00:31:20,159 wondering is what did we actually 874 00:31:18,320 --> 00:31:22,320 achieve by doing this 875 00:31:20,159 --> 00:31:24,559 at 
the end of the day my project 876 00:31:22,320 --> 00:31:26,080 gutted ruby's gc replaced it with 877 00:31:24,559 --> 00:31:27,600 something that's 878 00:31:26,080 --> 00:31:29,519 fairly functional 879 00:31:27,600 --> 00:31:31,120 well functional but 880 00:31:29,519 --> 00:31:33,039 doesn't collect any memory so it will 881 00:31:31,120 --> 00:31:34,320 crash on any system running an actual 882 00:31:33,039 --> 00:31:36,559 program 883 00:31:34,320 --> 00:31:36,559 so 884 00:31:36,640 --> 00:31:40,159 this may seem like a waste of time but 885 00:31:38,480 --> 00:31:41,279 we have achieved quite a lot in this 886 00:31:40,159 --> 00:31:43,760 project 887 00:31:41,279 --> 00:31:46,399 the first thing is that we've begun to 888 00:31:43,760 --> 00:31:48,799 build a gc interface for ruby 889 00:31:46,399 --> 00:31:50,480 because ruby's gc is so tightly 890 00:31:48,799 --> 00:31:52,480 integrated at the moment the code 891 00:31:50,480 --> 00:31:55,120 changes that are made to ruby's source 892 00:31:52,480 --> 00:31:56,640 code will enable not just mmtk to make 893 00:31:55,120 --> 00:31:59,600 use of that but also anyone else who 894 00:31:56,640 --> 00:32:01,360 wants to try and build a gc for ruby 895 00:31:59,600 --> 00:32:02,799 it also lays the groundwork for higher 896 00:32:01,360 --> 00:32:04,399 performance collectors which i'll be 897 00:32:02,799 --> 00:32:06,720 discussing in a moment 898 00:32:04,399 --> 00:32:08,320 and finally for the mmtk project 899 00:32:06,720 --> 00:32:10,799 we built a porting guide so if anyone 900 00:32:08,320 --> 00:32:13,120 else wants to build their own 901 00:32:10,799 --> 00:32:16,320 integration with mmtk they have this 902 00:32:13,120 --> 00:32:18,799 handy guide that they can follow 903 00:32:16,320 --> 00:32:21,039 so what's next uh one of the team 904 00:32:18,799 --> 00:32:23,120 members kunshan wang incredibly capable 905 00:32:21,039 --> 00:32:23,919 will be following on with this project 906 00:32:23,120 --> 
00:32:27,200 and 907 00:32:23,919 --> 00:32:30,240 continuing his plan is to implement 908 00:32:27,200 --> 00:32:32,000 the api necessary for mmtk's mark sweep 909 00:32:30,240 --> 00:32:33,840 collector to run the reason why we've 910 00:32:32,000 --> 00:32:36,399 chosen to target this collector is 911 00:32:33,840 --> 00:32:37,519 because well it's similar to what ruby 912 00:32:36,399 --> 00:32:39,360 already has 913 00:32:37,519 --> 00:32:40,480 so there should be fewer compatibility 914 00:32:39,360 --> 00:32:42,960 concerns 915 00:32:40,480 --> 00:32:44,559 and also because it doesn't move objects 916 00:32:42,960 --> 00:32:45,679 which ruby doesn't really do much at the 917 00:32:44,559 --> 00:32:48,159 moment so 918 00:32:45,679 --> 00:32:49,519 to improve the ease of implementation 919 00:32:48,159 --> 00:32:51,600 and then after that he's going to move 920 00:32:49,519 --> 00:32:53,200 on to immix which is a high performance 921 00:32:51,600 --> 00:32:54,880 garbage collector that was actually 922 00:32:53,200 --> 00:32:57,600 originally developed in mmtk and 923 00:32:54,880 --> 00:32:59,279 discovered uh and 924 00:32:57,600 --> 00:33:00,799 kunshan is going to be continuing on 925 00:32:59,279 --> 00:33:03,360 with this project which is really 926 00:33:00,799 --> 00:33:05,519 fantastic 927 00:33:03,360 --> 00:33:08,159 so before i finish up i would like to 928 00:33:05,519 --> 00:33:09,039 say a huge thank you to the entire mmtk 929 00:33:08,159 --> 00:33:11,279 team 930 00:33:09,039 --> 00:33:13,360 steve for his guidance along with the 931 00:33:11,279 --> 00:33:15,120 rest of the team for helping me when i 932 00:33:13,360 --> 00:33:16,880 was having technical issues and teaching 933 00:33:15,120 --> 00:33:18,559 me a lot about gcs and that sort of 934 00:33:16,880 --> 00:33:20,159 thing i'd also like to thank chris 935 00:33:18,559 --> 00:33:22,240 seaton from shopify who was really 936 00:33:20,159 --> 00:33:23,679 helpful in guiding me through the 
ruby 937 00:33:22,240 --> 00:33:26,159 code base and 938 00:33:23,679 --> 00:33:27,760 a lot of tiny issues that 939 00:33:26,159 --> 00:33:28,960 can be really difficult to figure out if 940 00:33:27,760 --> 00:33:31,600 you're not already familiar with the 941 00:33:28,960 --> 00:33:33,120 code base and also i'd like to thank uh 942 00:33:31,600 --> 00:33:34,799 ruby's main gc developers for their 943 00:33:33,120 --> 00:33:37,360 incredibly helpful technical writing 944 00:33:34,799 --> 00:33:39,760 which was very helpful for understanding 945 00:33:37,360 --> 00:33:42,240 ruby's gc 946 00:33:39,760 --> 00:33:44,880 so that's all i have for today 947 00:33:42,240 --> 00:33:47,039 i have put up this link which has a copy 948 00:33:44,880 --> 00:33:48,399 of the slides and some links 949 00:33:47,039 --> 00:33:49,840 unfortunately i had some technical 950 00:33:48,399 --> 00:33:51,760 difficulties uploading it just before 951 00:33:49,840 --> 00:33:53,840 the presentation so there's a minimal 952 00:33:51,760 --> 00:33:55,200 subset there and i'll be uploading more 953 00:33:53,840 --> 00:33:57,360 over the course of the day but if you're 954 00:33:55,200 --> 00:33:59,679 watching the recording it should be fine 955 00:33:57,360 --> 00:34:01,840 if you're interested in learning more 956 00:33:59,679 --> 00:34:03,919 about mmtk i've put links to the website 957 00:34:01,840 --> 00:34:05,600 and our github page there there's also 958 00:34:03,919 --> 00:34:06,799 our zulip channel so if you want to come 959 00:34:05,600 --> 00:34:08,879 and have a chat 960 00:34:06,799 --> 00:34:10,320 we'd be happy to help out 961 00:34:08,879 --> 00:34:12,399 please do reach out if you're interested 962 00:34:10,320 --> 00:34:13,919 in learning or contributing 963 00:34:12,399 --> 00:34:16,399 and finally if you want to get in touch 964 00:34:13,919 --> 00:34:17,679 with me my info is on the left 965 00:34:16,399 --> 00:34:20,560 thank you very much everyone and i hope 966 
00:34:17,679 --> 00:34:22,720 you've learned a lot about ruby and 967 00:34:20,560 --> 00:34:25,200 garbage collectors oh wow 10 minutes 968 00:34:22,720 --> 00:34:27,679 under as well 969 00:34:25,200 --> 00:34:29,599 so good we have a couple of questions so 970 00:34:27,679 --> 00:34:31,919 we can use that 10 minutes up to 971 00:34:29,599 --> 00:34:35,200 get some more quality information out of 972 00:34:31,919 --> 00:34:38,000 you angus yeah sure 973 00:34:35,200 --> 00:34:40,720 so first question uh is there any 974 00:34:38,000 --> 00:34:43,520 critical incompatibility with using 975 00:34:40,720 --> 00:34:45,760 something like mmtk with ractors which 976 00:34:43,520 --> 00:34:48,720 make ruby sort of actually 977 00:34:45,760 --> 00:34:51,440 multi-threaded 978 00:34:48,720 --> 00:34:54,320 i'm not particularly sure on this one so 979 00:34:51,440 --> 00:34:56,960 my code is based on ruby 2.7 which i 980 00:34:54,320 --> 00:34:58,560 believe was before ractors were 981 00:34:56,960 --> 00:35:00,240 introduced i think 982 00:34:58,560 --> 00:35:02,560 so i didn't really have to deal with 983 00:35:00,240 --> 00:35:04,320 those too much but we'll see 984 00:35:02,560 --> 00:35:06,079 as the project progresses whether or not 985 00:35:04,320 --> 00:35:08,640 that'll have any major incompatibilities 986 00:35:06,079 --> 00:35:10,960 but mmtk is designed to support 987 00:35:08,640 --> 00:35:12,320 multi-threaded vms 988 00:35:10,960 --> 00:35:14,240 so 989 00:35:12,320 --> 00:35:15,920 mmtk shouldn't have any compatibility 990 00:35:14,240 --> 00:35:18,240 issues it's just a matter of whether 991 00:35:15,920 --> 00:35:21,040 ruby expects gc to work correctly in 992 00:35:18,240 --> 00:35:23,200 this case that we'll need to worry about 993 00:35:21,040 --> 00:35:25,760 yep someone just commented yes they were 994 00:35:23,200 --> 00:35:27,839 added in 3.0 so yeah cool that makes 995 00:35:25,760 --> 00:35:31,200 sense yeah i started this project about 
996 00:35:27,839 --> 00:35:32,720 a week before 3.0 released so uh 997 00:35:31,200 --> 00:35:34,560 unfortunately that's not included in the 998 00:35:32,720 --> 00:35:37,119 current code but it has been rebased to 999 00:35:34,560 --> 00:35:39,920 3.1 now i believe 1000 00:35:37,119 --> 00:35:41,920 okay next question um what was the 1001 00:35:39,920 --> 00:35:45,920 biggest gotcha you discovered while 1002 00:35:41,920 --> 00:35:48,240 learning how ruby's gc works 1003 00:35:45,920 --> 00:35:48,240 oh 1004 00:35:49,040 --> 00:35:51,920 the 1005 00:35:50,480 --> 00:35:53,760 all of the innovations that happened 1006 00:35:51,920 --> 00:35:55,280 over the day uh were a little bit 1007 00:35:53,760 --> 00:35:57,359 difficult to discover because they 1008 00:35:55,280 --> 00:35:59,520 aren't documented that well but 1009 00:35:57,359 --> 00:36:01,440 once i found the original documentation 1010 00:35:59,520 --> 00:36:03,280 for like the generational collector that 1011 00:36:01,440 --> 00:36:05,200 they've implemented that was really 1012 00:36:03,280 --> 00:36:07,359 helpful because it turns out some of 1013 00:36:05,200 --> 00:36:10,800 those features that they implemented for 1014 00:36:07,359 --> 00:36:12,640 these previous uh improvements 1015 00:36:10,800 --> 00:36:13,359 will actually be useful in the future 1016 00:36:12,640 --> 00:36:15,200 for 1017 00:36:13,359 --> 00:36:16,640 integrating mmtk stuff like write 1018 00:36:15,200 --> 00:36:18,320 barriers and that sort of thing that are 1019 00:36:16,640 --> 00:36:19,760 commonly used in 1020 00:36:18,320 --> 00:36:21,359 garbage collectors 1021 00:36:19,760 --> 00:36:23,119 uh but probably the biggest problem i 1022 00:36:21,359 --> 00:36:25,040 had along the way was that finalization 1023 00:36:23,119 --> 00:36:26,560 issue that i was talking about 1024 00:36:25,040 --> 00:36:28,320 whenever you're 1025 00:36:26,560 --> 00:36:29,839 developing something and you can't stick 1026 00:36:28,320 --> 
00:36:31,119 it through dev tools because that 1027 00:36:29,839 --> 00:36:32,480 constitutes 1028 00:36:31,119 --> 00:36:34,320 piping it to somewhere else and hence 1029 00:36:32,480 --> 00:36:36,400 the vm behaves totally differently to 1030 00:36:34,320 --> 00:36:38,560 the case that you're trying to debug 1031 00:36:36,400 --> 00:36:41,800 that understandably makes debugging 1032 00:36:38,560 --> 00:36:41,800 pretty difficult 1033 00:36:42,560 --> 00:36:46,160 cool and 1034 00:36:44,480 --> 00:36:50,000 one more 1035 00:36:46,160 --> 00:36:52,160 this will kind of rush in at the end 1036 00:36:50,000 --> 00:36:54,480 is there much interest in taking the gc 1037 00:36:52,160 --> 00:36:55,920 abstractions you've done to the upstream 1038 00:36:54,480 --> 00:36:58,000 project 1039 00:36:55,920 --> 00:37:00,880 uh yes that is something that we have 1040 00:36:58,000 --> 00:37:02,720 considered uh this project we did do a 1041 00:37:00,880 --> 00:37:05,359 little bit of collaboration with shopify 1042 00:37:02,720 --> 00:37:06,640 and uh once we've got a bit more stable 1043 00:37:05,359 --> 00:37:09,200 the idea is hopefully we'll be able to 1044 00:37:06,640 --> 00:37:10,640 put it in production at shopify to see 1045 00:37:09,200 --> 00:37:12,079 how well it runs and then once we've got 1046 00:37:10,640 --> 00:37:14,480 a bit of proof of seeing how well it 1047 00:37:12,079 --> 00:37:16,720 works then we've got an actual garbage 1048 00:37:14,480 --> 00:37:18,640 collector implemented like mark sweep or 1049 00:37:16,720 --> 00:37:21,200 immix because currently it doesn't actually 1050 00:37:18,640 --> 00:37:23,440 collect any garbage uh once we've got 1051 00:37:21,200 --> 00:37:25,440 that we might submit a patch upstream 1052 00:37:23,440 --> 00:37:28,320 into ruby and then other people can 1053 00:37:25,440 --> 00:37:31,119 benefit from our changes 1054 00:37:28,320 --> 00:37:32,800 cool um i i also have a question of my 1055 00:37:31,119 --> 00:37:34,480 own
just because yeah sure we've still 1056 00:37:32,800 --> 00:37:38,640 got a little bit of time 1057 00:37:34,480 --> 00:37:40,320 um so i'm i'm not i'm not ruby i mean 1058 00:37:38,640 --> 00:37:43,200 last time i used ruby was rails and it 1059 00:37:40,320 --> 00:37:45,040 was pre version 2 at uni years ago 1060 00:37:43,200 --> 00:37:47,040 um 1061 00:37:45,040 --> 00:37:48,800 i do work a lot with openshift 1062 00:37:47,040 --> 00:37:51,599 kubernetes so 1063 00:37:48,800 --> 00:37:53,760 ruby is not particularly common in 1064 00:37:51,599 --> 00:37:56,800 kubernetes in terms of workload 1065 00:37:53,760 --> 00:37:59,680 environments how do you see 1066 00:37:56,800 --> 00:38:01,760 kind of the the containerized workload 1067 00:37:59,680 --> 00:38:05,200 interacting with some of the the garbage 1068 00:38:01,760 --> 00:38:06,000 collection stuff in mmtk 1069 00:38:05,200 --> 00:38:08,480 um 1070 00:38:06,000 --> 00:38:09,359 hopefully that's an abstraction because 1071 00:38:08,480 --> 00:38:10,800 these are different levels of 1072 00:38:09,359 --> 00:38:12,400 abstraction we probably won't have to 1073 00:38:10,800 --> 00:38:14,000 worry too much 1074 00:38:12,400 --> 00:38:16,640 because this is all within 1075 00:38:14,000 --> 00:38:18,640 the language runtime itself 1076 00:38:16,640 --> 00:38:22,079 any container that can run 1077 00:38:18,640 --> 00:38:23,280 uh ruby should be able to run mmtk 1078 00:38:22,079 --> 00:38:24,560 um 1079 00:38:23,280 --> 00:38:27,200 but it would be interesting to 1080 00:38:24,560 --> 00:38:29,119 investigate if because mmtk 1081 00:38:27,200 --> 00:38:31,119 is common code if there's any way that 1082 00:38:29,119 --> 00:38:32,560 mmtk could be built to have multiple 1083 00:38:31,119 --> 00:38:34,480 instances that 1084 00:38:32,560 --> 00:38:35,920 use the same threads to collect code 1085 00:38:34,480 --> 00:38:36,800 across multiple different languages i 1086 00:38:35,920 --> 00:38:38,480 don't know if that's
something that 1087 00:38:36,800 --> 00:38:40,079 could be explored or not 1088 00:38:38,480 --> 00:38:42,480 yeah i mean i just when you mentioned 1089 00:38:40,079 --> 00:38:44,640 openjdk i know there's been a massive 1090 00:38:42,480 --> 00:38:46,880 exodus from from openjdk to using 1091 00:38:44,640 --> 00:38:49,680 quarkus 1092 00:38:46,880 --> 00:38:51,200 for for containerization workloads it 1093 00:38:49,680 --> 00:38:53,680 works much better and i just wondered if 1094 00:38:51,200 --> 00:38:55,200 there was a parallel but yeah maybe it's 1095 00:38:53,680 --> 00:38:57,200 an interesting 1096 00:38:55,200 --> 00:38:59,839 interesting project well thank you very 1097 00:38:57,200 --> 00:39:01,920 much and i'm glad that your first lca 1098 00:38:59,839 --> 00:39:04,560 experience has been mostly smooth 1099 00:39:01,920 --> 00:39:07,119 running which is brilliant 1100 00:39:04,560 --> 00:39:09,200 we do have lunch now and so we're taking 1101 00:39:07,119 --> 00:39:11,359 a bit of an extended break please go and 1102 00:39:09,200 --> 00:39:13,359 hydrate yourselves and get some kai 1103 00:39:11,359 --> 00:39:16,160 that's food 1104 00:39:13,359 --> 00:39:20,079 and we will see you back and 1105 00:39:16,160 --> 00:39:22,079 here at i've got to find my schedule at 1106 00:39:20,079 --> 00:39:24,000 1 30 uh 1107 00:39:22,079 --> 00:39:27,599 australian eastern standard i think i 1108 00:39:24,000 --> 00:39:29,760 got that right um and we'll be talking 1109 00:39:27,599 --> 00:39:31,680 with fraser tweedale about 1110 00:39:29,760 --> 00:39:33,359 change owns and systemd containers and 1111 00:39:31,680 --> 00:39:35,119 user namespaces 1112 00:39:33,359 --> 00:39:38,839 thank you very much 1113 00:39:35,119 --> 00:39:38,839 thanks for coming along everyone