1 00:00:06,320 --> 00:00:11,499 [Music] 2 00:00:17,359 --> 00:00:21,600 good afternoon australians time from 3 00:00:19,439 --> 00:00:23,199 sydney welcome back from lunch i hope 4 00:00:21,600 --> 00:00:24,880 everybody's had a good lunch a good 5 00:00:23,199 --> 00:00:26,720 break a bit of a walk 6 00:00:24,880 --> 00:00:27,840 chase the dog whatever you needed to do 7 00:00:26,720 --> 00:00:29,679 i had cheese and crackers for the 8 00:00:27,840 --> 00:00:31,679 wallace and grommet friends out there 9 00:00:29,679 --> 00:00:34,079 and we welcome back this afternoon for 10 00:00:31,679 --> 00:00:36,000 our first session after lunch uh with 11 00:00:34,079 --> 00:00:38,399 david and he's joining us all the way 12 00:00:36,000 --> 00:00:41,200 from the sunny u.s in the center 13 00:00:38,399 --> 00:00:43,200 san francisco bay area 14 00:00:41,200 --> 00:00:44,480 so i'll introduce david now before i 15 00:00:43,200 --> 00:00:46,640 hand him over 16 00:00:44,480 --> 00:00:48,480 so ordinarily we think about things like 17 00:00:46,640 --> 00:00:51,199 converting numbers between the way we 18 00:00:48,480 --> 00:00:53,199 see them and the way computers use them 19 00:00:51,199 --> 00:00:55,760 as they solve a problem 20 00:00:53,199 --> 00:00:58,079 and for many purposes they are 21 00:00:55,760 --> 00:00:59,760 as data volumes grow and people's 22 00:00:58,079 --> 00:01:02,000 patience doesn't 23 00:00:59,760 --> 00:01:04,239 we see sometimes find that there is 24 00:01:02,000 --> 00:01:06,640 still room for improvement 25 00:01:04,239 --> 00:01:09,360 and proving them is fairly 26 00:01:06,640 --> 00:01:11,600 novice low-level coders can 27 00:01:09,360 --> 00:01:13,600 when our time together is done 28 00:01:11,600 --> 00:01:15,759 david will have taught us how that we 29 00:01:13,600 --> 00:01:17,040 can use some of the recent advantages in 30 00:01:15,759 --> 00:01:18,560 a few areas 31 00:01:17,040 --> 00:01:21,200 a few of these things that he's brought 32 00:01:18,560 --> 00:01:22,560 together a postgres sql 33 00:01:21,200 --> 00:01:24,799 some of the things to consider when 34 00:01:22,560 --> 00:01:27,200 attempting such improvements and looking 35 00:01:24,799 --> 00:01:29,119 at the world from a new perspective 36 00:01:27,200 --> 00:01:31,680 thanks david and please do ask questions 37 00:01:29,119 --> 00:01:33,600 in the chat during the session and david 38 00:01:31,680 --> 00:01:34,799 will be hanging around in the chat room 39 00:01:33,600 --> 00:01:36,159 afterwards if you've got another 40 00:01:34,799 --> 00:01:38,320 question you didn't quite get finished 41 00:01:36,159 --> 00:01:41,280 before our timer runs out so over to you 42 00:01:38,320 --> 00:01:43,759 david thank you thanks so much 43 00:01:41,280 --> 00:01:46,320 um greetings from yesterday and from the 44 00:01:43,759 --> 00:01:49,040 unseated territory of the olani people 45 00:01:46,320 --> 00:01:50,320 or castro valley as i usually refer to 46 00:01:49,040 --> 00:01:52,320 it 47 00:01:50,320 --> 00:01:54,399 thanks to all of you for coming i 48 00:01:52,320 --> 00:01:56,640 hope to earn your attention and your 49 00:01:54,399 --> 00:01:57,920 wakefulness in the coveted post lunch 50 00:01:56,640 --> 00:02:01,119 spot 51 00:01:57,920 --> 00:02:04,000 i'd like to shout out ivan uh my 52 00:02:01,119 --> 00:02:06,560 employer who pays me to work on open 53 00:02:04,000 --> 00:02:09,360 source software and perhaps yours if if 54 00:02:06,560 --> 00:02:11,520 you're interested we can talk later 55 00:02:09,360 --> 00:02:13,360 about that if you are 56 00:02:11,520 --> 00:02:16,400 you can leave your impressions now or 57 00:02:13,360 --> 00:02:20,239 later in the venulis channel called uh 58 00:02:16,400 --> 00:02:23,599 post talk chat womanjeka theater hope i 59 00:02:20,239 --> 00:02:24,640 got that at least not too mangled 60 00:02:23,599 --> 00:02:28,000 um 61 00:02:24,640 --> 00:02:30,319 so let's get started 62 00:02:28,000 --> 00:02:32,239 apparently i'm across from my retro 63 00:02:30,319 --> 00:02:33,680 computing talk 64 00:02:32,239 --> 00:02:37,040 i hope 65 00:02:33,680 --> 00:02:39,840 somebody recognizes this this this is 66 00:02:37,040 --> 00:02:41,840 now what we call retro computing 67 00:02:39,840 --> 00:02:44,720 or as my 68 00:02:41,840 --> 00:02:47,280 13 year old self used to call it 69 00:02:44,720 --> 00:02:47,280 computing 70 00:02:47,680 --> 00:02:54,800 we had resources available 71 00:02:50,720 --> 00:02:57,680 they included a concise instruction set 72 00:02:54,800 --> 00:02:58,879 we'll see what i mean by concise 73 00:02:57,680 --> 00:03:01,760 presently 74 00:02:58,879 --> 00:03:04,480 there were some registers i don't recall 75 00:03:01,760 --> 00:03:06,560 exactly how many offhand 76 00:03:04,480 --> 00:03:09,120 it's been a while 77 00:03:06,560 --> 00:03:14,239 but not terribly many 78 00:03:09,120 --> 00:03:17,840 there was a whole 64 kb of memory 79 00:03:14,239 --> 00:03:19,360 and if everything went badly wrong there 80 00:03:17,840 --> 00:03:22,640 was 81 00:03:19,360 --> 00:03:22,640 the floppy disk 82 00:03:23,040 --> 00:03:28,879 so concise 83 00:03:25,200 --> 00:03:30,959 well the 6502 was originally built to be 84 00:03:28,879 --> 00:03:33,760 a 85 00:03:30,959 --> 00:03:36,720 machine controller i i believe 86 00:03:33,760 --> 00:03:38,239 and it turns out that concise did not 87 00:03:36,720 --> 00:03:41,680 include 88 00:03:38,239 --> 00:03:44,159 this instruction 89 00:03:41,680 --> 00:03:46,720 i think it's called mull on modern 90 00:03:44,159 --> 00:03:49,440 machines 91 00:03:46,720 --> 00:03:51,920 uh well 92 00:03:49,440 --> 00:03:55,040 in order to simulate this instruction 93 00:03:51,920 --> 00:03:58,159 people tried the usual approaches first 94 00:03:55,040 --> 00:04:02,159 iterating and adding 95 00:03:58,159 --> 00:04:04,879 but this turned out to be far too slow 96 00:04:02,159 --> 00:04:08,159 so eventually they went with 97 00:04:04,879 --> 00:04:10,159 pre-computing or currying if you want to 98 00:04:08,159 --> 00:04:13,439 use some 99 00:04:10,159 --> 00:04:15,519 i guess slightly gatekeeping terminology 100 00:04:13,439 --> 00:04:16,320 but i guess that's what people do so 101 00:04:15,519 --> 00:04:19,040 they 102 00:04:16,320 --> 00:04:21,120 used an algebraic identity 103 00:04:19,040 --> 00:04:23,600 this is not a class in algebra please 104 00:04:21,120 --> 00:04:25,840 don't stare at this slide and try to try 105 00:04:23,600 --> 00:04:29,199 to read it right now i'll put it up 106 00:04:25,840 --> 00:04:32,240 later anyhow the upshot is that we can 107 00:04:29,199 --> 00:04:35,199 pre-compute some lookup tables about two 108 00:04:32,240 --> 00:04:37,199 kilobytes worth in order to do those 109 00:04:35,199 --> 00:04:38,720 uh multiplies 110 00:04:37,199 --> 00:04:42,080 um 111 00:04:38,720 --> 00:04:44,400 we are using these to avoid 112 00:04:42,080 --> 00:04:46,320 executing some instructions because the 113 00:04:44,400 --> 00:04:48,400 most efficient instruction is the one 114 00:04:46,320 --> 00:04:50,639 you don't execute 115 00:04:48,400 --> 00:04:52,479 and in this case we're doing some 116 00:04:50,639 --> 00:04:54,479 instructions that aren't actually 117 00:04:52,479 --> 00:04:57,199 contained in the 118 00:04:54,479 --> 00:05:00,240 isa 119 00:04:57,199 --> 00:05:04,720 okay so this was kind of a background 120 00:05:00,240 --> 00:05:06,720 thing that i i dimly remembered 121 00:05:04,720 --> 00:05:07,680 years past 122 00:05:06,720 --> 00:05:10,000 um 123 00:05:07,680 --> 00:05:13,520 right about this time there was this 124 00:05:10,000 --> 00:05:16,080 band called akadaka with a new singer 125 00:05:13,520 --> 00:05:18,160 and it becomes famous in the northern 126 00:05:16,080 --> 00:05:20,080 hemisphere 127 00:05:18,160 --> 00:05:22,560 um 128 00:05:20,080 --> 00:05:22,560 also 129 00:05:23,039 --> 00:05:26,720 a 130 00:05:25,039 --> 00:05:28,880 database system 131 00:05:26,720 --> 00:05:31,199 done by some of the people 132 00:05:28,880 --> 00:05:33,039 who had done ingress 133 00:05:31,199 --> 00:05:35,039 was released 134 00:05:33,039 --> 00:05:36,000 it's called postgres 135 00:05:35,039 --> 00:05:39,919 then 136 00:05:36,000 --> 00:05:44,240 a decade later it gets its sql layer and 137 00:05:39,919 --> 00:05:44,240 with it the worst name in the business 138 00:05:46,960 --> 00:05:50,560 data volumes 139 00:05:48,960 --> 00:05:53,039 keep growing 140 00:05:50,560 --> 00:05:55,360 over this time 141 00:05:53,039 --> 00:05:55,360 and 142 00:05:55,440 --> 00:05:59,120 people 143 00:05:56,560 --> 00:06:01,120 get impatient or at any rate start to 144 00:05:59,120 --> 00:06:02,720 express their impatience a little bit 145 00:06:01,120 --> 00:06:06,639 more 146 00:06:02,720 --> 00:06:09,680 and they say things like 147 00:06:06,639 --> 00:06:12,160 beta export is 148 00:06:09,680 --> 00:06:12,160 slow 149 00:06:12,800 --> 00:06:18,000 well it's slow is one of those very 150 00:06:14,560 --> 00:06:21,280 interesting uh challenges in computing 151 00:06:18,000 --> 00:06:23,360 you just never know where to lead 152 00:06:21,280 --> 00:06:26,160 it turned out that 153 00:06:23,360 --> 00:06:27,840 sending integers to standard out 154 00:06:26,160 --> 00:06:31,680 was taking 155 00:06:27,840 --> 00:06:35,120 significant time in these exports 156 00:06:31,680 --> 00:06:36,080 the code looked kind of familiar 157 00:06:35,120 --> 00:06:38,080 um 158 00:06:36,080 --> 00:06:39,280 it had a loop 159 00:06:38,080 --> 00:06:42,240 and 160 00:06:39,280 --> 00:06:45,440 what turns out to be 161 00:06:42,240 --> 00:06:47,199 kind of an expensive operation 162 00:06:45,440 --> 00:06:50,960 at this level 163 00:06:47,199 --> 00:06:52,240 um which was uh which was that little 164 00:06:50,960 --> 00:06:55,199 divide 165 00:06:52,240 --> 00:06:56,400 that you see right after the word value 166 00:06:55,199 --> 00:06:58,240 there 167 00:06:56,400 --> 00:06:59,599 um 168 00:06:58,240 --> 00:07:00,479 and so 169 00:06:59,599 --> 00:07:02,639 i 170 00:07:00,479 --> 00:07:05,120 i got to work 171 00:07:02,639 --> 00:07:07,599 um this kind of looked like something 172 00:07:05,120 --> 00:07:09,599 that i'd seen before and i'd hoped that 173 00:07:07,599 --> 00:07:12,479 i could sort of use 174 00:07:09,599 --> 00:07:14,560 sort of mixture as before which is a 175 00:07:12,479 --> 00:07:17,280 lookup table 176 00:07:14,560 --> 00:07:21,280 um 177 00:07:17,280 --> 00:07:24,080 i applied this to figuring out um how 178 00:07:21,280 --> 00:07:25,759 many decimal digits were in 179 00:07:24,080 --> 00:07:27,840 the number 180 00:07:25,759 --> 00:07:30,400 so that we could avoid some of those 181 00:07:27,840 --> 00:07:33,360 expensive operations have some digit 182 00:07:30,400 --> 00:07:33,360 pairs to hand 183 00:07:34,400 --> 00:07:39,680 the situation 184 00:07:35,919 --> 00:07:39,680 is a little bit different to be fair 185 00:07:41,840 --> 00:07:46,560 the apple 2e had just one core that we 186 00:07:44,879 --> 00:07:48,720 really cared about 187 00:07:46,560 --> 00:07:52,160 i mean there were other processors off 188 00:07:48,720 --> 00:07:53,840 to the side that you could use and and 189 00:07:52,160 --> 00:07:55,360 people did 190 00:07:53,840 --> 00:07:57,199 i just uh 191 00:07:55,360 --> 00:08:00,800 i just wasn't quite that 192 00:07:57,199 --> 00:08:00,800 that advance at that time 193 00:08:01,599 --> 00:08:05,840 and 194 00:08:03,280 --> 00:08:05,840 given the 195 00:08:05,919 --> 00:08:10,960 dearth of cores 196 00:08:08,560 --> 00:08:14,879 it didn't have a branch predictor that 197 00:08:10,960 --> 00:08:14,879 we needed to worry too much about 198 00:08:16,160 --> 00:08:20,960 and so we didn't have to 199 00:08:18,080 --> 00:08:22,639 we didn't have to tiptoe around it 200 00:08:20,960 --> 00:08:25,680 but 201 00:08:22,639 --> 00:08:27,360 we do now and this is one of the things 202 00:08:25,680 --> 00:08:30,639 that 203 00:08:27,360 --> 00:08:30,639 we think about doing 204 00:08:31,440 --> 00:08:34,800 is avoiding that branch predictor or at 205 00:08:33,680 --> 00:08:36,159 any rate 206 00:08:34,800 --> 00:08:37,440 assuaging it 207 00:08:36,159 --> 00:08:40,800 so 208 00:08:37,440 --> 00:08:41,919 with those lookup tables we can 209 00:08:40,800 --> 00:08:45,039 do 210 00:08:41,919 --> 00:08:45,039 branchless computing 211 00:08:45,600 --> 00:08:49,519 or at any rate 212 00:08:47,040 --> 00:08:51,519 limit the kind and limit the no amount 213 00:08:49,519 --> 00:08:53,360 of branching 214 00:08:51,519 --> 00:08:55,760 that we do 215 00:08:53,360 --> 00:08:58,000 so 216 00:08:55,760 --> 00:08:58,000 um 217 00:08:58,800 --> 00:09:03,040 i think there's more that we can do 218 00:09:03,200 --> 00:09:09,040 we should be able to um 219 00:09:06,080 --> 00:09:10,640 to make it faster and for this i would 220 00:09:09,040 --> 00:09:12,160 like to thank 221 00:09:10,640 --> 00:09:15,040 professor 222 00:09:12,160 --> 00:09:18,080 daniel lumiere who's out of 223 00:09:15,040 --> 00:09:18,080 eastern canada 224 00:09:18,320 --> 00:09:22,399 his 225 00:09:19,760 --> 00:09:25,120 his blog is always fascinating on this 226 00:09:22,399 --> 00:09:25,120 kind of score 227 00:09:26,399 --> 00:09:31,279 and 228 00:09:27,600 --> 00:09:31,279 it's uh it it's 229 00:09:32,640 --> 00:09:36,800 some of the future work that i'm her 230 00:09:34,880 --> 00:09:39,519 work that i hope to get 231 00:09:36,800 --> 00:09:41,600 into the postgresql code 232 00:09:39,519 --> 00:09:44,160 is based on things that he's 233 00:09:41,600 --> 00:09:45,519 collaborated with other people 234 00:09:44,160 --> 00:09:46,720 to write 235 00:09:45,519 --> 00:09:49,360 so 236 00:09:46,720 --> 00:09:52,080 we're going to 237 00:09:49,360 --> 00:09:54,160 do more caching or currying or whatever 238 00:09:52,080 --> 00:09:55,760 you want to call the the 239 00:09:54,160 --> 00:09:58,880 process of 240 00:09:55,760 --> 00:10:00,640 computing first holding on to some of 241 00:09:58,880 --> 00:10:02,720 that computation 242 00:10:00,640 --> 00:10:05,839 and then using it later 243 00:10:02,720 --> 00:10:05,839 many many times 244 00:10:06,240 --> 00:10:11,360 so 245 00:10:08,240 --> 00:10:11,360 one of the things that 246 00:10:13,200 --> 00:10:17,360 we can make faster 247 00:10:15,279 --> 00:10:18,640 is 248 00:10:17,360 --> 00:10:21,440 when we have 249 00:10:18,640 --> 00:10:22,880 a binary 250 00:10:21,440 --> 00:10:25,839 integer let's 251 00:10:22,880 --> 00:10:29,680 for simplicity let's say it's 252 00:10:25,839 --> 00:10:29,680 it's an integer and it's unsigned 253 00:10:30,560 --> 00:10:36,000 so 254 00:10:33,200 --> 00:10:38,240 for a 32-bit integer 255 00:10:36,000 --> 00:10:38,240 um 256 00:10:40,160 --> 00:10:42,399 the 257 00:10:43,360 --> 00:10:49,440 binary representation will have its 258 00:10:47,440 --> 00:10:50,480 most significant bit 259 00:10:49,440 --> 00:10:53,279 in 260 00:10:50,480 --> 00:10:54,959 one of 32 places 261 00:10:53,279 --> 00:10:57,519 or one of 262 00:10:54,959 --> 00:11:00,000 64 places depending on the size of the 263 00:10:57,519 --> 00:11:00,000 integer 264 00:11:00,720 --> 00:11:07,839 each of those places 265 00:11:03,600 --> 00:11:09,519 represents a range of numbers that has 266 00:11:07,839 --> 00:11:13,200 at most 267 00:11:09,519 --> 00:11:16,399 two different numbers of decimal digits 268 00:11:13,200 --> 00:11:18,480 in its representation 269 00:11:16,399 --> 00:11:22,079 so for example 270 00:11:18,480 --> 00:11:23,600 a number with its left most one position 271 00:11:22,079 --> 00:11:26,800 four 272 00:11:23,600 --> 00:11:29,120 represents digits from eight which is 273 00:11:26,800 --> 00:11:29,920 one decimal digit 274 00:11:29,120 --> 00:11:33,519 to 275 00:11:29,920 --> 00:11:33,519 15 which is two 276 00:11:38,240 --> 00:11:40,880 so 277 00:11:39,200 --> 00:11:44,079 we can use 278 00:11:40,880 --> 00:11:47,200 a count leading zeros instruction 279 00:11:44,079 --> 00:11:49,279 to find out where that significant one 280 00:11:47,200 --> 00:11:50,800 is 281 00:11:49,279 --> 00:11:51,680 then we use 282 00:11:50,800 --> 00:11:54,240 uh 283 00:11:51,680 --> 00:11:56,959 some cleverness which we need to 284 00:11:54,240 --> 00:12:00,560 document carefully in the code so people 285 00:11:56,959 --> 00:12:00,560 understand what we're doing 286 00:12:01,200 --> 00:12:05,839 we add a magic number from this 287 00:12:04,079 --> 00:12:10,079 short 288 00:12:05,839 --> 00:12:15,279 table that we've we've constructed 289 00:12:10,079 --> 00:12:16,399 that magic number is guaranteed to be 290 00:12:15,279 --> 00:12:18,880 um 291 00:12:16,399 --> 00:12:21,040 larger when added to the number we're 292 00:12:18,880 --> 00:12:22,079 looking at than the original integer 293 00:12:21,040 --> 00:12:26,240 size 294 00:12:22,079 --> 00:12:29,040 so we have this say a 32-bit integer 295 00:12:26,240 --> 00:12:30,079 and then we add this magic number to it 296 00:12:29,040 --> 00:12:33,200 and 297 00:12:30,079 --> 00:12:37,279 what hap what happens is that 298 00:12:33,200 --> 00:12:38,399 um it would it overflows into the next 299 00:12:37,279 --> 00:12:41,760 larger size 300 00:12:38,399 --> 00:12:45,120 so it's no longer inside the lower 301 00:12:41,760 --> 00:12:47,279 32 bits let's say 302 00:12:45,120 --> 00:12:50,079 and when we look at the ones beyond the 303 00:12:47,279 --> 00:12:52,720 width of the original number we get 304 00:12:50,079 --> 00:12:53,680 the number of decimal digits 305 00:12:52,720 --> 00:12:56,000 in the 306 00:12:53,680 --> 00:12:58,880 original number 307 00:12:56,000 --> 00:13:01,440 um so we can do this in 308 00:12:58,880 --> 00:13:02,639 a lot fewer instructions 309 00:13:01,440 --> 00:13:06,320 i think 310 00:13:02,639 --> 00:13:08,560 we can get it down to well i know for 32 311 00:13:06,320 --> 00:13:10,000 bits we can get it down to 312 00:13:08,560 --> 00:13:12,240 one 313 00:13:10,000 --> 00:13:14,560 clz 314 00:13:12,240 --> 00:13:16,880 one table lookup 315 00:13:14,560 --> 00:13:18,839 and i hope to keep that 316 00:13:16,880 --> 00:13:21,120 from 317 00:13:18,839 --> 00:13:23,839 overrunning um 318 00:13:21,120 --> 00:13:26,240 our cache lines or or at any rate from 319 00:13:23,839 --> 00:13:27,680 overrunning them too much 320 00:13:26,240 --> 00:13:28,800 and one add 321 00:13:27,680 --> 00:13:31,200 and one 322 00:13:28,800 --> 00:13:31,200 shift 323 00:13:31,279 --> 00:13:35,440 um 324 00:13:32,320 --> 00:13:36,399 and what we're left with 325 00:13:35,440 --> 00:13:37,360 is 326 00:13:36,399 --> 00:13:39,360 this 327 00:13:37,360 --> 00:13:40,560 little guy 328 00:13:39,360 --> 00:13:42,720 which 329 00:13:40,560 --> 00:13:45,199 um that's the 330 00:13:42,720 --> 00:13:48,160 that's what the new code looks like and 331 00:13:45,199 --> 00:13:49,519 uh yes it's a little bit confusing and 332 00:13:48,160 --> 00:13:50,880 this is where 333 00:13:49,519 --> 00:13:53,360 uh 334 00:13:50,880 --> 00:13:55,600 we're a facility and write writing 335 00:13:53,360 --> 00:13:58,560 english and then getting good feedback 336 00:13:55,600 --> 00:14:01,600 on how that's coming across 337 00:13:58,560 --> 00:14:02,399 will really help to to do this this is 338 00:14:01,600 --> 00:14:04,399 uh 339 00:14:02,399 --> 00:14:06,720 this is not the kind of thing that you 340 00:14:04,399 --> 00:14:10,079 want to have sitting there on its own 341 00:14:06,720 --> 00:14:12,399 the way i have it here on the screen 342 00:14:10,079 --> 00:14:12,399 um 343 00:14:12,639 --> 00:14:17,199 so 344 00:14:14,079 --> 00:14:19,199 can we get it faster 345 00:14:17,199 --> 00:14:22,320 i don't know for sure 346 00:14:19,199 --> 00:14:25,040 but i suspect that it's possible 347 00:14:22,320 --> 00:14:27,360 and then there's other things that are 348 00:14:25,040 --> 00:14:30,480 uh a part of 349 00:14:27,360 --> 00:14:30,480 printing decimal 350 00:14:30,880 --> 00:14:35,360 uh printing decimal representations to 351 00:14:33,440 --> 00:14:37,199 standard output that i haven't covered 352 00:14:35,360 --> 00:14:38,560 here 353 00:14:37,199 --> 00:14:39,839 and i would 354 00:14:38,560 --> 00:14:41,760 love to 355 00:14:39,839 --> 00:14:44,800 work with some other people to figure 356 00:14:41,760 --> 00:14:47,519 out what uh where where best to to shave 357 00:14:44,800 --> 00:14:51,519 that back and to make the 358 00:14:47,519 --> 00:14:54,800 the user experience um 359 00:14:51,519 --> 00:14:56,000 helpful for for uh for all our users out 360 00:14:54,800 --> 00:14:58,720 in the world 361 00:14:56,000 --> 00:14:59,839 i would love your input 362 00:14:58,720 --> 00:15:02,560 uh 363 00:14:59,839 --> 00:15:05,760 with um 364 00:15:02,560 --> 00:15:08,480 as i was doing this i was thinking that 365 00:15:05,760 --> 00:15:10,560 you know postgres isn't all that special 366 00:15:08,480 --> 00:15:12,639 i mean i've been working on it for some 367 00:15:10,560 --> 00:15:14,959 time but 368 00:15:12,639 --> 00:15:17,360 maybe this shouldn't be bespoke code for 369 00:15:14,959 --> 00:15:20,000 that and i would love to figure out ways 370 00:15:17,360 --> 00:15:23,199 to get this into 371 00:15:20,000 --> 00:15:24,639 glib c or other 372 00:15:23,199 --> 00:15:26,880 lib cs 373 00:15:24,639 --> 00:15:28,720 that are upstream 374 00:15:26,880 --> 00:15:31,120 of where i am 375 00:15:28,720 --> 00:15:35,120 and 376 00:15:31,120 --> 00:15:37,440 oh my goodness i had time this for uh 377 00:15:35,120 --> 00:15:40,079 more interaction and questions and stuff 378 00:15:37,440 --> 00:15:44,800 so i guess i'll 379 00:15:40,079 --> 00:15:44,800 i guess i'll i guess we'll do those now 380 00:15:50,480 --> 00:15:54,079 hey thanks for the great presentation i 381 00:15:52,480 --> 00:15:56,240 haven't had any questions come through 382 00:15:54,079 --> 00:15:57,759 in the chat line so if you are thinking 383 00:15:56,240 --> 00:16:00,399 of a question please bring them through 384 00:15:57,759 --> 00:16:01,839 um i was reminiscing the apple 2e at the 385 00:16:00,399 --> 00:16:03,920 front and 386 00:16:01,839 --> 00:16:06,720 i will confess that i actually covered 387 00:16:03,920 --> 00:16:08,240 it in apple ii gs when they first came 388 00:16:06,720 --> 00:16:09,360 out but we couldn't afford one of those 389 00:16:08,240 --> 00:16:12,240 when i was in 390 00:16:09,360 --> 00:16:15,440 senior school so i ended up on a x86 391 00:16:12,240 --> 00:16:17,440 secondhand xx86 compatible hyundai 392 00:16:15,440 --> 00:16:19,360 with a lovely orange screen and then we 393 00:16:17,440 --> 00:16:22,230 got the paper white but yeah 394 00:16:19,360 --> 00:16:23,680 some good nostalgia for me and 395 00:16:22,230 --> 00:16:25,680 [Music] 396 00:16:23,680 --> 00:16:27,199 i lost my date to introduction to 397 00:16:25,680 --> 00:16:28,639 database books at some time i long 398 00:16:27,199 --> 00:16:31,680 across the line but yeah a little bit of 399 00:16:28,639 --> 00:16:34,160 sql still in my hair there somewhere 400 00:16:31,680 --> 00:16:36,880 yeah well it turns out that 401 00:16:34,160 --> 00:16:38,720 all these sql things are i mean they 402 00:16:36,880 --> 00:16:39,519 they kind of hide the 403 00:16:38,720 --> 00:16:42,000 the 404 00:16:39,519 --> 00:16:43,920 underlying sort of 405 00:16:42,000 --> 00:16:46,880 regular computer science that you think 406 00:16:43,920 --> 00:16:47,759 of the data structures the algorithms 407 00:16:46,880 --> 00:16:50,800 the 408 00:16:47,759 --> 00:16:50,800 you know parsing 409 00:16:51,120 --> 00:16:56,160 execution engines compilers 410 00:16:54,480 --> 00:16:58,160 they they kind of 411 00:16:56,160 --> 00:16:59,680 they kind of 412 00:16:58,160 --> 00:17:02,240 help you pretend that those things 413 00:16:59,680 --> 00:17:04,079 aren't still there 414 00:17:02,240 --> 00:17:07,120 but they're there and and if you want to 415 00:17:04,079 --> 00:17:08,319 dig there they're pretty easy to find 416 00:17:07,120 --> 00:17:10,000 wonderful we've actually had two 417 00:17:08,319 --> 00:17:12,720 questions come through now so the first 418 00:17:10,000 --> 00:17:15,520 one is maybe the problem with 419 00:17:12,720 --> 00:17:17,679 sd out or standard out would be 420 00:17:15,520 --> 00:17:19,600 maybe there's a problem with std out 421 00:17:17,679 --> 00:17:20,559 would a socket be better 422 00:17:19,600 --> 00:17:22,319 uh 423 00:17:20,559 --> 00:17:24,400 so 424 00:17:22,319 --> 00:17:25,520 probably 425 00:17:24,400 --> 00:17:28,400 um 426 00:17:25,520 --> 00:17:31,280 that that but um 427 00:17:28,400 --> 00:17:35,679 since i found an improvement in 428 00:17:31,280 --> 00:17:35,679 actually changing the code that that 429 00:17:38,080 --> 00:17:42,320 that does the computation 430 00:17:40,880 --> 00:17:44,880 that that actually 431 00:17:42,320 --> 00:17:48,799 does the instructions i suspect that 432 00:17:44,880 --> 00:17:51,120 it's not all standard out and and so 433 00:17:48,799 --> 00:17:53,679 um sockets might not 434 00:17:51,120 --> 00:17:54,880 uh fix this 435 00:17:53,679 --> 00:17:58,160 because we're 436 00:17:54,880 --> 00:18:00,400 we're generally doing an awful lot of 437 00:17:58,160 --> 00:18:04,720 these things they're they're they tend 438 00:18:00,400 --> 00:18:04,720 to be sort of a dump of a database or 439 00:18:05,840 --> 00:18:08,000 the 440 00:18:09,120 --> 00:18:12,320 setting up a new replica somewhere and 441 00:18:11,520 --> 00:18:15,440 so 442 00:18:12,320 --> 00:18:16,240 huge volumes of data are going through 443 00:18:15,440 --> 00:18:18,240 um 444 00:18:16,240 --> 00:18:20,720 so i don't know that it would 445 00:18:18,240 --> 00:18:23,919 i i thought that sockets would help you 446 00:18:20,720 --> 00:18:26,559 more with a sort of latency issue 447 00:18:23,919 --> 00:18:29,120 um maybe i'm maybe i'm mistaken about 448 00:18:26,559 --> 00:18:30,640 that this is more of a throughput issue 449 00:18:29,120 --> 00:18:33,280 um 450 00:18:30,640 --> 00:18:34,000 let's see benchmarks um i 451 00:18:33,280 --> 00:18:36,240 do 452 00:18:34,000 --> 00:18:38,000 i vaguely recall i got about five 453 00:18:36,240 --> 00:18:39,280 percent out of this which was kind of 454 00:18:38,000 --> 00:18:40,240 amazing 455 00:18:39,280 --> 00:18:43,120 um 456 00:18:40,240 --> 00:18:44,880 the let's see there 457 00:18:43,120 --> 00:18:48,240 might have trouble getting it upstreamed 458 00:18:44,880 --> 00:18:51,440 in g lipsy well uh i have not tried it 459 00:18:48,240 --> 00:18:51,440 yet so i don't know 460 00:18:52,480 --> 00:18:57,919 so i'll i guess i'll find out um but i 461 00:18:55,919 --> 00:19:00,799 i'd uh i'd like to get something that's 462 00:18:57,919 --> 00:19:02,799 a little bit um 463 00:19:00,799 --> 00:19:06,240 more sort of fully 464 00:19:02,799 --> 00:19:06,240 fully baked before i do that 465 00:19:06,640 --> 00:19:12,400 because i yeah i don't know that this is 466 00:19:09,520 --> 00:19:15,120 specialty code it's just more efficient 467 00:19:12,400 --> 00:19:18,400 and i also want to make sure that it is 468 00:19:15,120 --> 00:19:21,120 more efficient on things that are not 469 00:19:18,400 --> 00:19:22,320 intel 64-bit 470 00:19:21,120 --> 00:19:24,400 because of course that would be 471 00:19:22,320 --> 00:19:26,799 important in in the upstream 472 00:19:24,400 --> 00:19:30,480 of c's 473 00:19:26,799 --> 00:19:31,760 a table of magic numbers and more about 474 00:19:30,480 --> 00:19:36,160 where their 475 00:19:31,760 --> 00:19:36,160 where their values derive um 476 00:19:36,240 --> 00:19:39,840 let me see if i can 477 00:19:40,799 --> 00:19:46,799 hold that up here they're basically the 478 00:19:43,760 --> 00:19:50,080 differences between 2 to the 479 00:19:46,799 --> 00:19:51,440 n and 10 to the m 480 00:19:50,080 --> 00:19:54,640 um 481 00:19:51,440 --> 00:19:57,440 and they they're the the property is 482 00:19:54,640 --> 00:19:58,799 that for the largest 483 00:19:57,440 --> 00:20:00,880 um 484 00:19:58,799 --> 00:20:02,080 uh power of 10 485 00:20:00,880 --> 00:20:03,039 they will 486 00:20:02,080 --> 00:20:05,200 um 487 00:20:03,039 --> 00:20:06,640 they will add up to 488 00:20:05,200 --> 00:20:09,760 um 489 00:20:06,640 --> 00:20:12,640 one number and for uh the nine the all 490 00:20:09,760 --> 00:20:16,480 nines number below it they will add up 491 00:20:12,640 --> 00:20:18,720 to they they will they will make the the 492 00:20:16,480 --> 00:20:21,360 higher 493 00:20:18,720 --> 00:20:22,559 part of the the sum 494 00:20:21,360 --> 00:20:24,640 um 495 00:20:22,559 --> 00:20:25,760 a different number so that's the that's 496 00:20:24,640 --> 00:20:28,080 where those 497 00:20:25,760 --> 00:20:29,280 that that's where the magic is 498 00:20:28,080 --> 00:20:32,320 um 499 00:20:29,280 --> 00:20:34,400 i don't i'm not pulling it up right here 500 00:20:32,320 --> 00:20:35,919 but i can send it 501 00:20:34,400 --> 00:20:37,840 in 502 00:20:35,919 --> 00:20:39,679 the the 503 00:20:37,840 --> 00:20:42,559 pr that i had 504 00:20:39,679 --> 00:20:44,880 um originally sent for postgres 14 which 505 00:20:42,559 --> 00:20:47,200 didn't make it because i still needed to 506 00:20:44,880 --> 00:20:50,400 do some work 507 00:20:47,200 --> 00:20:53,760 uh was there something in my current 508 00:20:50,400 --> 00:20:58,240 domain that prompted this research 509 00:20:53,760 --> 00:21:02,400 um well i i wanted to to see uh to to 510 00:20:58,240 --> 00:21:05,200 improve my uh my c skills and i from the 511 00:21:02,400 --> 00:21:05,200 bottom would help 512 00:21:06,159 --> 00:21:09,679 i i realized that 513 00:21:11,919 --> 00:21:16,880 that 514 00:21:13,039 --> 00:21:19,360 making exports and maybe someday imports 515 00:21:16,880 --> 00:21:21,440 a little faster would have the 516 00:21:19,360 --> 00:21:22,799 largest impact on the largest number of 517 00:21:21,440 --> 00:21:24,080 people 518 00:21:22,799 --> 00:21:25,919 the uh 519 00:21:24,080 --> 00:21:29,039 size of the array this is one of those 520 00:21:25,919 --> 00:21:33,280 things that i think may be improvable 521 00:21:29,039 --> 00:21:36,559 um because right now i have uh 32 522 00:21:33,280 --> 00:21:38,400 entries in there for 32-bit 523 00:21:36,559 --> 00:21:41,039 um 524 00:21:38,400 --> 00:21:43,440 numbers for calculating those those uh 525 00:21:41,039 --> 00:21:46,080 those law of integer log tens i guess 526 00:21:43,440 --> 00:21:46,080 you could call them 527 00:21:46,799 --> 00:21:51,760 and so i think it i should be able to 528 00:21:50,480 --> 00:21:55,120 get 529 00:21:51,760 --> 00:21:55,120 whatever it is uh 530 00:21:55,919 --> 00:22:01,360 whatever the size is in bits divided by 531 00:21:59,200 --> 00:22:04,400 four i think because that's 532 00:22:01,360 --> 00:22:05,840 that's the that's the largest number of 533 00:22:04,400 --> 00:22:09,120 uh 534 00:22:05,840 --> 00:22:10,400 base 10 digits that are sorry each 535 00:22:09,120 --> 00:22:12,559 um 536 00:22:10,400 --> 00:22:13,520 each set of four bits can have at most 537 00:22:12,559 --> 00:22:16,159 one 538 00:22:13,520 --> 00:22:18,240 transition or what well is guaranteed to 539 00:22:16,159 --> 00:22:20,480 have one transition between one number 540 00:22:18,240 --> 00:22:21,440 of decimal digits and another 541 00:22:20,480 --> 00:22:24,840 so 542 00:22:21,440 --> 00:22:27,840 um so i think it should be possible to 543 00:22:24,840 --> 00:22:28,960 get something that uses 544 00:22:27,840 --> 00:22:31,200 um 545 00:22:28,960 --> 00:22:34,640 n by four 546 00:22:31,200 --> 00:22:37,039 uh bits or somewhere pretty close to it 547 00:22:34,640 --> 00:22:39,840 sorry n by four entries or something 548 00:22:37,039 --> 00:22:42,240 pretty close to it for uh an end it 549 00:22:39,840 --> 00:22:43,440 integer 550 00:22:42,240 --> 00:22:45,440 um 551 00:22:43,440 --> 00:22:46,720 thanks david i'm happy to read the 552 00:22:45,440 --> 00:22:48,240 questions out if you like i'll give you 553 00:22:46,720 --> 00:22:50,000 a few more seconds to think about them 554 00:22:48,240 --> 00:22:51,760 you will use the politicians trick read 555 00:22:50,000 --> 00:22:54,080 the question and then repeat it back see 556 00:22:51,760 --> 00:22:56,559 more thinking time 557 00:22:54,080 --> 00:22:59,440 apologies if i pronounce anything wrong 558 00:22:56,559 --> 00:23:02,000 how big is the array instructions can 559 00:22:59,440 --> 00:23:04,640 provide div and mod in one 560 00:23:02,000 --> 00:23:06,000 saving multiply by 10 561 00:23:04,640 --> 00:23:09,360 uh 562 00:23:06,000 --> 00:23:10,880 yeah i sorry i thought i was um i 563 00:23:09,360 --> 00:23:11,679 thought i was 564 00:23:10,880 --> 00:23:14,880 you know 565 00:23:11,679 --> 00:23:15,840 speaking to that so basically the 566 00:23:14,880 --> 00:23:18,080 the 567 00:23:15,840 --> 00:23:19,679 the array 568 00:23:18,080 --> 00:23:20,720 is right now 569 00:23:19,679 --> 00:23:23,440 the 570 00:23:20,720 --> 00:23:25,840 uh the um 571 00:23:23,440 --> 00:23:27,919 a number of elements which is the number 572 00:23:25,840 --> 00:23:32,480 of bits in the 573 00:23:27,919 --> 00:23:36,159 uh number so for 32-bit numbers it has 574 00:23:32,480 --> 00:23:38,480 32 entries each of which is an n64 575 00:23:36,159 --> 00:23:41,600 and this is all this is all done with 576 00:23:38,480 --> 00:23:44,480 basically plus count leading zeros and 577 00:23:41,600 --> 00:23:45,840 uh shift 578 00:23:44,480 --> 00:23:47,840 so 579 00:23:45,840 --> 00:23:50,880 maybe there's something more efficient 580 00:23:47,840 --> 00:23:52,559 that takes fewer clocks that's certainly 581 00:23:50,880 --> 00:23:54,559 possible 582 00:23:52,559 --> 00:23:56,640 excellent thank you the next the next 583 00:23:54,559 --> 00:23:58,480 question we've got is there a link to 584 00:23:56,640 --> 00:24:00,559 the limia post 585 00:23:58,480 --> 00:24:01,440 that motivated the improvement 586 00:24:00,559 --> 00:24:04,030 uh 587 00:24:01,440 --> 00:24:05,760 let me see if i can 588 00:24:04,030 --> 00:24:09,799 [Music] 589 00:24:05,760 --> 00:24:09,799 quickly find that 590 00:24:22,080 --> 00:24:26,000 here we go 591 00:24:23,200 --> 00:24:28,240 [Music] 592 00:24:26,000 --> 00:24:28,240 um 593 00:24:31,440 --> 00:24:34,960 so that was 594 00:24:38,400 --> 00:24:44,720 sorry um 595 00:24:40,480 --> 00:24:44,720 let's see if i dump this in 596 00:24:44,840 --> 00:24:50,960 the chat here with that help 597 00:24:49,279 --> 00:24:53,200 i think there may be 598 00:24:50,960 --> 00:24:55,520 there may be a follow yeah there there 599 00:24:53,200 --> 00:24:56,960 may be a follow-up one 600 00:24:55,520 --> 00:24:57,760 um 601 00:24:56,960 --> 00:25:00,320 that 602 00:24:57,760 --> 00:25:02,480 is more recent no i think this is yeah i 603 00:25:00,320 --> 00:25:06,480 think this is the most recent one 604 00:25:02,480 --> 00:25:07,679 um so yeah so so this uh 605 00:25:06,480 --> 00:25:08,720 so uh 606 00:25:07,679 --> 00:25:10,960 um 607 00:25:08,720 --> 00:25:13,440 rare in my experience of academics 608 00:25:10,960 --> 00:25:16,000 anyway uh 609 00:25:13,440 --> 00:25:19,120 professor lemieux 610 00:25:16,000 --> 00:25:19,120 tends to get 611 00:25:20,320 --> 00:25:26,559 really into things as they actually 612 00:25:22,960 --> 00:25:26,559 execute on actual machines 613 00:25:26,960 --> 00:25:30,159 and then test them and then you know 614 00:25:29,039 --> 00:25:31,919 publish 615 00:25:30,159 --> 00:25:33,360 things that even somebody like me can 616 00:25:31,919 --> 00:25:35,120 follow 617 00:25:33,360 --> 00:25:38,320 good 618 00:25:35,120 --> 00:25:38,320 there's hope for me yet then 619 00:25:38,640 --> 00:25:41,279 excellent thank you that's all the 620 00:25:40,080 --> 00:25:43,679 questions we've got that has come 621 00:25:41,279 --> 00:25:45,760 through so far so are you um happy to 622 00:25:43,679 --> 00:25:47,760 jump across to the chat channel post the 623 00:25:45,760 --> 00:25:49,440 the meeting room of course 624 00:25:47,760 --> 00:25:51,919 excellent so if people can 625 00:25:49,440 --> 00:25:53,279 jump over to the chatting channel post 626 00:25:51,919 --> 00:25:54,799 this conference 627 00:25:53,279 --> 00:25:56,880 we're scheduled to finish in about 628 00:25:54,799 --> 00:25:58,880 another 20 minutes maybe people can have 629 00:25:56,880 --> 00:26:00,799 a long afternoon tea or maybe there's 630 00:25:58,880 --> 00:26:02,559 that tricky question you can't quite 631 00:26:00,799 --> 00:26:04,720 word at the moment and you need a few 632 00:26:02,559 --> 00:26:06,559 more to think about it so thank you so 633 00:26:04,720 --> 00:26:08,400 much for joining us from the other side 634 00:26:06,559 --> 00:26:09,520 of the planet um your evening our 635 00:26:08,400 --> 00:26:10,559 afternoon 636 00:26:09,520 --> 00:26:12,240 and 637 00:26:10,559 --> 00:26:14,000 for those of you who are watching the 638 00:26:12,240 --> 00:26:15,520 recording we didn't have any surprise 639 00:26:14,000 --> 00:26:17,520 visitors from david family that were 640 00:26:15,520 --> 00:26:19,360 threatening to run through and surprise 641 00:26:17,520 --> 00:26:21,200 him in the middle of the broadcast so 642 00:26:19,360 --> 00:26:22,640 please do thank your family for taking 643 00:26:21,200 --> 00:26:24,320 time out from your family time this 644 00:26:22,640 --> 00:26:25,919 evening 645 00:26:24,320 --> 00:26:30,360 all right have a good evening 646 00:26:25,919 --> 00:26:30,360 see everybody back after afternoon team