1 00:00:04,960 --> 00:00:19,999 [Music] 2 00:00:20,600 --> 00:00:26,039 after a brief hiccup we are back with 3 00:00:22,800 --> 00:00:28,679 Tom who is talking about why UUIDs are 4 00:00:26,039 --> 00:00:31,960 secretly incredibly fascinating please 5 00:00:28,679 --> 00:00:31,960 make Tom welcome 6 00:00:36,120 --> 00:00:41,160 hello I'm weirdly nervous this morning I 7 00:00:38,680 --> 00:00:44,120 um most of the talks that I give have a 8 00:00:41,160 --> 00:00:46,239 kind of point this one was purely sort 9 00:00:44,120 --> 00:00:48,520 of special interest information dumping 10 00:00:46,239 --> 00:00:50,480 so um I ended up just talking about this 11 00:00:48,520 --> 00:00:52,120 randomly to random people forever and so 12 00:00:50,480 --> 00:00:55,399 now you guys are forced to sit here and 13 00:00:52,120 --> 00:00:55,399 listen to it 14 00:00:55,760 --> 00:01:01,680 um in terms of real world information if 15 00:00:58,559 --> 00:01:02,879 you've already read RFC 9562 then you 16 00:01:01,680 --> 00:01:04,320 probably already know everything that 17 00:01:02,879 --> 00:01:06,200 I'm going to say in this talk and you're 18 00:01:04,320 --> 00:01:09,000 just kind of here for opinions and 19 00:01:06,200 --> 00:01:11,000 vibes I guess 20 00:01:09,000 --> 00:01:14,600 um 21 00:01:11,000 --> 00:01:16,640 I'm oh yeah because I yammer on about 22 00:01:14,600 --> 00:01:18,240 this way too much I've got the last 23 00:01:16,640 --> 00:01:19,720 slide of my talk first just in case I go 24 00:01:18,240 --> 00:01:22,720 over 25 00:01:19,720 --> 00:01:22,720 time 26 00:01:23,320 --> 00:01:27,600 um 27 00:01:25,079 --> 00:01:29,479 so most of this is just weird interest 28 00:01:27,600 --> 00:01:31,119 stuff but like if there's useful things 29 00:01:29,479 --> 00:01:32,280 that I want you to take out of it if 30 00:01:31,119 --> 00:01:34,560 there's useful things that I want you to 31 00:01:32,280 --> 00:01:36,640 take from this talk they are that it's 32 
00:01:34,560 --> 00:01:38,600 really worth knowing what the 33 00:01:36,640 --> 00:01:39,720 differences between these UUIDs are 34 00:01:38,600 --> 00:01:45,320 because they can actually be quite 35 00:01:39,720 --> 00:01:48,040 useful for certain use cases um and UUID 36 00:01:45,320 --> 00:01:50,119 v7 which is a new standard that was 37 00:01:48,040 --> 00:01:51,560 finalized in April of this year but has 38 00:01:50,119 --> 00:01:53,880 been being worked on for the last couple 39 00:01:51,560 --> 00:01:55,159 years might be really useful for certain 40 00:01:53,880 --> 00:01:58,759 use cases of yours and we'll go into 41 00:01:55,159 --> 00:02:00,600 more details on that later um if you 42 00:01:58,759 --> 00:02:03,759 need to invent your own kind of 43 00:02:00,600 --> 00:02:05,960 universal ID system there's now a spec 44 00:02:03,759 --> 00:02:07,560 for you to conform to while you do that 45 00:02:05,960 --> 00:02:10,680 which could be very useful for you so 46 00:02:07,560 --> 00:02:13,040 make sure you know about that 47 00:02:10,680 --> 00:02:15,000 but I'm talking about a thing that may 48 00:02:13,040 --> 00:02:17,120 or may not be useful to you depending on 49 00:02:15,000 --> 00:02:18,680 your use case so probably the most 50 00:02:17,120 --> 00:02:20,040 important lesson that I realized is 51 00:02:18,680 --> 00:02:22,720 probably applicable to almost everyone 52 00:02:20,040 --> 00:02:27,519 in this room if you work 53 00:02:22,720 --> 00:02:29,720 in what we do uh whether or not 54 00:02:27,519 --> 00:02:31,360 UUIDs are a good primary key for you or 55 00:02:29,720 --> 00:02:33,360 not or whether or not they're useful for 56 00:02:31,360 --> 00:02:35,800 your use case I've now worked in a 57 00:02:33,360 --> 00:02:38,519 couple places where we've had real 58 00:02:35,800 --> 00:02:40,519 problems with the default primary 59 00:02:38,519 --> 00:02:43,120 key value of databases particularly 60 00:02:40,519 --> 00:02:45,400 Postgres so 
maybe you don't the 61 00:02:43,120 --> 00:02:48,120 one lesson that I think is probably 62 00:02:45,400 --> 00:02:51,360 applicable to everyone is maybe you 63 00:02:48,120 --> 00:02:55,040 don't need 128-bit universally unique 64 00:02:51,360 --> 00:02:57,159 identifiers uh but I think that nowadays 65 00:02:55,040 --> 00:02:59,319 you will regret using a 32-bit 66 00:02:57,159 --> 00:03:02,840 identifier at some point and by the time 67 00:02:59,319 --> 00:03:08,319 you regret it it's a really painful job 68 00:03:02,840 --> 00:03:08,319 to fix so I cool you can all go 69 00:03:09,959 --> 00:03:14,720 now okay I don't know why I wasn't 70 00:03:13,760 --> 00:03:17,640 able to squeeze this all into a 71 00:03:14,720 --> 00:03:19,159 lightning talk um so why am I talking 72 00:03:17,640 --> 00:03:21,280 about this because I thought this was 73 00:03:19,159 --> 00:03:22,840 really interesting in Django land first 74 00:03:21,280 --> 00:03:25,159 of all if you how many people here work 75 00:03:22,840 --> 00:03:27,400 in Django do web development do database 76 00:03:25,159 --> 00:03:29,040 design have to choose your primary key 77 00:03:27,400 --> 00:03:31,760 yeah okay so you guys are vaguely in the 78 00:03:29,040 --> 00:03:33,760 right place right um you get really used 79 00:03:31,760 --> 00:03:35,920 to Django's defaults of integer primary 80 00:03:33,760 --> 00:03:37,799 keys if you don't decide on one it'll 81 00:03:35,920 --> 00:03:40,159 make one for you and by and large it 82 00:03:37,799 --> 00:03:42,920 works perfectly fine forever until your 83 00:03:40,159 --> 00:03:44,760 table gets too big I've now worked at 84 00:03:42,920 --> 00:03:47,879 three different companies where each one 85 00:03:44,760 --> 00:03:50,760 has had huge projects that have each 86 00:03:47,879 --> 00:03:55,120 taken like over a year of developer time 87 00:03:50,760 --> 00:03:58,480 because of something going fun with 88 00:03:55,120 --> 00:04:00,879 your 
primary key choice um I worked for 89 00:03:58,480 --> 00:04:02,920 a company several years ago who were 90 00:04:00,879 --> 00:04:05,280 merging multiple production environments 91 00:04:02,920 --> 00:04:07,959 into one production environment which 92 00:04:05,280 --> 00:04:09,879 was multi-tenanted and naturally they 93 00:04:07,959 --> 00:04:12,360 had to remap a whole bunch of primary 94 00:04:09,879 --> 00:04:14,519 keys through a huge amount of data sets 95 00:04:12,360 --> 00:04:17,440 and it was gigantic and talking about 96 00:04:14,519 --> 00:04:19,079 that would be a whole talk on its own um 97 00:04:17,440 --> 00:04:21,720 I've worked for a company that chose to 98 00:04:19,079 --> 00:04:23,040 use UUID v4s in text columns for every 99 00:04:21,720 --> 00:04:25,479 primary key in every table of their 100 00:04:23,040 --> 00:04:27,080 database which had some real I see some 101 00:04:25,479 --> 00:04:29,960 of you shaking your heads and you're 102 00:04:27,080 --> 00:04:31,960 right but also it had some real real 103 00:04:29,960 --> 00:04:34,639 advantages that I will also go 104 00:04:31,960 --> 00:04:36,880 into and then most recently uh at Kraken 105 00:04:34,639 --> 00:04:39,600 Technologies we've had an amazing 106 00:04:36,880 --> 00:04:42,639 project recently where we've had to with 107 00:04:39,600 --> 00:04:44,320 some urgency quite quickly change a 108 00:04:42,639 --> 00:04:46,160 bunch of primary keys from 32-bit 109 00:04:44,320 --> 00:04:47,919 integers to 64-bit integers because we 110 00:04:46,160 --> 00:04:50,080 were running out of space and by the 111 00:04:47,919 --> 00:04:51,960 time you're running out of space your 112 00:04:50,080 --> 00:04:53,160 tables are already really big and if 113 00:04:51,960 --> 00:04:53,960 your tables are really big it's probably 114 00:04:53,160 --> 00:04:55,639 because you've been reasonably 115 00:04:53,960 --> 00:04:56,800 successful so you don't want downtime 116 00:04:55,639 --> 00:04:58,800 
and so trying to do this sort of thing 117 00:04:56,800 --> 00:05:00,240 without downtime would also be a whole 118 00:04:58,800 --> 00:05:01,680 big talk of its own 119 00:05:00,240 --> 00:05:03,440 uh Tim Bell will be here later in the 120 00:05:01,680 --> 00:05:06,280 conference and I'm sure he'll love to 121 00:05:03,440 --> 00:05:09,440 revisit many months of his life so be 122 00:05:06,280 --> 00:05:09,440 sure to ask him about 123 00:05:10,759 --> 00:05:14,560 it for a long time all I really knew 124 00:05:13,120 --> 00:05:15,720 about UUIDs was that they looked like 125 00:05:14,560 --> 00:05:17,919 this you know you've got 126 00:05:15,720 --> 00:05:20,160 those hyphens in slightly different 127 00:05:17,919 --> 00:05:21,759 places you've got a whole string of 128 00:05:20,160 --> 00:05:23,800 hexadecimal numbers and there's a bunch 129 00:05:21,759 --> 00:05:26,160 of different versions of them and in 130 00:05:23,800 --> 00:05:27,600 practice you only ever really see UUID 131 00:05:26,160 --> 00:05:30,680 v4 which is the version which is 132 00:05:27,600 --> 00:05:33,199 completely random who here like I 133 00:05:30,680 --> 00:05:34,520 don't go into much about where they came 134 00:05:33,199 --> 00:05:36,280 from I just go into sort of the weird 135 00:05:34,520 --> 00:05:39,960 stuff in this talk but all of you have 136 00:05:36,280 --> 00:05:41,639 seen this sort of thing before yeah um 137 00:05:39,960 --> 00:05:42,960 I knew about UUID v1s I knew that 138 00:05:41,639 --> 00:05:44,880 they were constructed from several 139 00:05:42,960 --> 00:05:46,720 ingredients and I knew that people like 140 00:05:44,880 --> 00:05:48,280 just don't use them for anything and 141 00:05:46,720 --> 00:05:50,400 then if you go into the Python standard 142 00:05:48,280 --> 00:05:53,520 library you'll see UUID versions two 143 00:05:50,400 --> 00:05:56,080 three and five didn't know what they 144 00:05:53,520 --> 00:05:57,919 were and nobody uses 
them and so that 145 00:05:56,080 --> 00:06:00,880 was what really got me 146 00:05:57,919 --> 00:06:02,720 interested we've had this standard uh 147 00:06:00,880 --> 00:06:05,039 UUIDs have been around but not as a standard 148 00:06:02,720 --> 00:06:08,520 since like 1995 and they've been a 149 00:06:05,039 --> 00:06:10,120 standard since 2002 or 2004 and nobody 150 00:06:08,520 --> 00:06:12,440 uses them and everybody invents their 151 00:06:10,120 --> 00:06:16,680 own Twitter invented LexicalUUID and 152 00:06:12,440 --> 00:06:21,080 Snowflake and Google invented push no uh 153 00:06:16,680 --> 00:06:22,199 yeah push ID for Firebase and uh doc ID 154 00:06:21,080 --> 00:06:24,199 which is something they use in Google 155 00:06:22,199 --> 00:06:27,039 Drive and MongoDB invented their own 156 00:06:24,199 --> 00:06:28,960 ones and Instagram uh which is a big 157 00:06:27,039 --> 00:06:32,759 Python Django project invented their own 158 00:06:28,960 --> 00:06:34,240 one called ShardingID um everyone has to 159 00:06:32,759 --> 00:06:35,800 invent their own because for some reason 160 00:06:34,240 --> 00:06:37,080 the standard that we had was never fit 161 00:06:35,800 --> 00:06:38,919 for 162 00:06:37,080 --> 00:06:42,720 purpose 163 00:06:38,919 --> 00:06:45,160 so I am going to give you a crash course 164 00:06:42,720 --> 00:06:46,919 in what all these things were why 165 00:06:45,160 --> 00:06:49,080 they're interesting why they weren't fit 166 00:06:46,919 --> 00:06:51,240 for purpose and how that's being fixed 167 00:06:49,080 --> 00:06:53,720 nowadays 168 00:06:51,240 --> 00:06:54,800 so this is the executive summary of all 169 00:06:53,720 --> 00:06:58,120 the versions that are currently in 170 00:06:54,800 --> 00:06:59,599 Python the v1 was built out of the 171 00:06:58,120 --> 00:07:01,639 timestamp the MAC address and the 172 00:06:59,599 --> 00:07:05,800 sequence and so each of those chunks of 173 00:07:01,639 --> 00:07:07,160 the UUID become those things um the 
v4 is 174 00:07:05,800 --> 00:07:09,639 the completely random one that doesn't 175 00:07:07,160 --> 00:07:11,919 use any other kind of logic it's just 176 00:07:09,639 --> 00:07:13,680 122 bits of completely random 177 00:07:11,919 --> 00:07:15,400 information and then the little version 178 00:07:13,680 --> 00:07:18,360 specifier which is like whatever the 179 00:07:15,400 --> 00:07:20,919 remaining bits were six bits right um 180 00:07:18,360 --> 00:07:22,520 v3 and v5 are the ones that you probably 181 00:07:20,919 --> 00:07:24,479 heard the least about and they have a 182 00:07:22,520 --> 00:07:26,960 very interesting niche use case because 183 00:07:24,479 --> 00:07:29,240 you can turn any other identifier you've 184 00:07:26,960 --> 00:07:31,360 got into a UUID I'll show you an 185 00:07:29,240 --> 00:07:33,599 example of that and v2 I'm not going to 186 00:07:31,360 --> 00:07:36,879 talk about because it's almost the same 187 00:07:33,599 --> 00:07:38,800 as v1 but it was specific for a certain 188 00:07:36,879 --> 00:07:40,800 application back in the 90s for a 189 00:07:38,800 --> 00:07:42,160 certain operating system so none of it 190 00:07:40,800 --> 00:07:44,840 really exists anymore so it's something 191 00:07:42,160 --> 00:07:47,520 that you can actually just kind of 192 00:07:44,840 --> 00:07:50,960 ignore so we'll talk about v1 they 193 00:07:47,520 --> 00:07:53,960 decided that a universal ID something 194 00:07:50,960 --> 00:07:57,120 that well hang on the point of a 195 00:07:53,960 --> 00:07:58,960 universal ID is that you can have a lot 196 00:07:57,120 --> 00:08:01,520 of things generating these IDs and 197 00:07:58,960 --> 00:08:03,919 they'll never collide 198 00:08:01,520 --> 00:08:06,199 okay one easy way of doing that is 199 00:08:03,919 --> 00:08:07,639 making sure that you incorporate a few 200 00:08:06,199 --> 00:08:11,000 ingredients that the other machines 201 00:08:07,639 --> 00:08:13,639 generating UUIDs won't have so the time 
202 00:08:11,000 --> 00:08:16,400 that it was generated a node ID like a 203 00:08:13,639 --> 00:08:19,280 machine ID and so each machine 204 00:08:16,400 --> 00:08:20,759 generating UUIDs has its own node ID and 205 00:08:19,280 --> 00:08:22,039 then a few other bits and pieces like 206 00:08:20,759 --> 00:08:24,360 sometimes a bit of random data or a 207 00:08:22,039 --> 00:08:26,840 clock 208 00:08:24,360 --> 00:08:28,840 sequence when they were designing UUID 209 00:08:26,840 --> 00:08:30,400 v1s they made a couple interesting 210 00:08:28,840 --> 00:08:33,320 decisions that kind of came back to haunt 211 00:08:30,400 --> 00:08:34,399 them uh the timestamp is this readable 212 00:08:33,320 --> 00:08:37,800 for 213 00:08:34,399 --> 00:08:39,800 people cool okay um the timestamp is a 214 00:08:37,800 --> 00:08:41,360 60-bit value for UUID v1 this is 215 00:08:39,800 --> 00:08:42,880 represented by the Coordinated Universal Time 216 00:08:41,360 --> 00:08:46,519 blah blah blah blah blah blah blah 217 00:08:42,880 --> 00:08:48,480 blah 100-nanosecond intervals since the 218 00:08:46,519 --> 00:08:51,320 Gregorian reform to the Christian 219 00:08:48,480 --> 00:08:56,399 calendar in 220 00:08:51,320 --> 00:08:59,160 1582 okay sure why not um and then the 221 00:08:56,399 --> 00:09:00,959 node address was the MAC address of your 222 00:08:59,160 --> 00:09:02,399 computer so just find any 223 00:09:00,959 --> 00:09:06,760 network card on your computer grab its 224 00:09:02,399 --> 00:09:09,839 MAC address and throw it into the ID 225 00:09:06,760 --> 00:09:11,880 so this example this UUID right here is 226 00:09:09,839 --> 00:09:13,360 the one that's like in the spec and 227 00:09:11,880 --> 00:09:15,839 you can decode it and get that 228 00:09:13,360 --> 00:09:16,920 information back out the spec suggests 229 00:09:15,839 --> 00:09:19,760 that you don't do this that you don't 230 00:09:16,920 --> 00:09:21,760 rely on any information in the UUID but 231 00:09:19,760 
--> 00:09:23,079 it's interesting that you can decode it 232 00:09:21,760 --> 00:09:25,600 and so you can find out that the time 233 00:09:23,079 --> 00:09:27,440 that this UUID was generated was the 3rd 234 00:09:25,600 --> 00:09:29,600 of February 235 00:09:27,440 --> 00:09:32,120 1997 um the clock sequence which is 236 00:09:29,600 --> 00:09:34,760 actually just kind of a random number is 237 00:09:32,120 --> 00:09:36,440 1085 and the node ID is that network 238 00:09:34,760 --> 00:09:37,839 address and of course because it's a 239 00:09:36,440 --> 00:09:40,000 network address you can also tell that 240 00:09:37,839 --> 00:09:43,680 that's from Intel it's from an Intel 241 00:09:40,000 --> 00:09:46,600 Corporation uh machine and that block of 242 00:09:43,680 --> 00:09:48,880 MAC addresses was allocated on the 1st of 243 00:09:46,600 --> 00:09:52,200 January 1980 like around the first 244 00:09:48,880 --> 00:09:54,320 allocations of Ethernet addresses um the 245 00:09:52,200 --> 00:09:56,560 reason you can know that by the way is 246 00:09:54,320 --> 00:09:58,160 because maybe up until recently or maybe 247 00:09:56,560 --> 00:10:00,200 it is still the case that all MAC 248 00:09:58,160 --> 00:10:04,200 addresses the first 249 00:10:00,200 --> 00:10:06,839 half of it are allocated to a network 250 00:10:04,200 --> 00:10:08,959 card manufacturer and they get to decide 251 00:10:06,839 --> 00:10:10,600 on the sequence number from 252 00:10:08,959 --> 00:10:12,360 there 253 00:10:10,600 --> 00:10:16,200 um 254 00:10:12,360 --> 00:10:19,800 so the other important thing to note is 255 00:10:16,200 --> 00:10:22,120 the time in UUID v1s is backwards so it's 256 00:10:19,800 --> 00:10:23,920 not like year first then month then day 257 00:10:22,120 --> 00:10:26,040 or it's not most significant 258 00:10:23,920 --> 00:10:30,320 bit first it's least significant bit 259 00:10:26,040 --> 00:10:33,360 first so um they don't 260 00:10:30,320 --> 00:10:34,800 enter in 
order so UUID v1s could 261 00:10:33,360 --> 00:10:37,760 their sort of overall numerical value 262 00:10:34,800 --> 00:10:39,519 could be all over the show as the least 263 00:10:37,760 --> 00:10:41,880 significant digits 264 00:10:39,519 --> 00:10:44,920 increment that will be 265 00:10:41,880 --> 00:10:46,959 important um so that's UUID v1 I've skipped 266 00:10:44,920 --> 00:10:49,600 over UUID v4 because it's just pure random 267 00:10:46,959 --> 00:10:51,959 data I told you that UUID v5 is quite an 268 00:10:49,600 --> 00:10:53,399 interesting one sometimes you have a use 269 00:10:51,959 --> 00:10:57,000 case for this and I did have a use case 270 00:10:53,399 --> 00:10:59,920 for this and it was great um you take 271 00:10:57,000 --> 00:11:04,360 any other identifier you've got and you 272 00:10:59,920 --> 00:11:06,720 turn it into a UUID and so you do that 273 00:11:04,360 --> 00:11:08,880 by the way the algorithm works is it 274 00:11:06,720 --> 00:11:12,160 takes a namespace which is just another 275 00:11:08,880 --> 00:11:15,680 UUID and it takes the canonical 276 00:11:12,160 --> 00:11:18,000 representation of your value of your 277 00:11:15,680 --> 00:11:19,639 name and it does something like an HMAC 278 00:11:18,000 --> 00:11:22,720 you know it does like it concatenates 279 00:11:19,639 --> 00:11:25,880 the two and then it does a SHA-1 hash 280 00:11:22,720 --> 00:11:28,399 or an MD5 hash and it takes those bits 281 00:11:25,880 --> 00:11:30,519 and turns that into a UUID uh the 282 00:11:28,399 --> 00:11:36,240 difference between v3 and 283 00:11:30,519 --> 00:11:38,399 v5 is that v3 used MD5 and v5 used SHA-1 284 00:11:36,240 --> 00:11:40,880 and so they don't recommend you use 285 00:11:38,399 --> 00:11:43,959 uh v3 anymore at all but it's still 286 00:11:40,880 --> 00:11:46,880 around in the standard and 287 00:11:43,959 --> 00:11:49,399 so and in the standard there's a couple 288 00:11:46,880 --> 00:11:51,120 predefined namespaces so if you 
for 289 00:11:49,399 --> 00:11:53,519 some reason wanted to 290 00:11:51,120 --> 00:11:56,639 turn a DNS 291 00:11:53,519 --> 00:12:00,560 name uh into 292 00:11:56,639 --> 00:12:03,800 a UUID there's a DNS namespace and 293 00:12:00,560 --> 00:12:08,680 there's a URL namespace and there's an 294 00:12:03,800 --> 00:12:11,279 X.500 namespace and a few other things um 295 00:12:08,680 --> 00:12:12,639 and so this can actually be genuinely 296 00:12:11,279 --> 00:12:15,000 useful and I did this trick a couple 297 00:12:12,639 --> 00:12:16,959 years ago because I had a primary 298 00:12:15,000 --> 00:12:21,360 key that I needed to turn into a UUID 299 00:12:16,959 --> 00:12:24,360 for a certain table and the 300 00:12:21,360 --> 00:12:27,040 integer ID corresponded to a sequence of 301 00:12:24,360 --> 00:12:30,720 financial transactions 302 00:12:27,040 --> 00:12:33,240 and you invent any UUID and it 303 00:12:30,720 --> 00:12:35,680 becomes your namespace for whatever the 304 00:12:33,240 --> 00:12:39,199 identifier is that you have and from 305 00:12:35,680 --> 00:12:43,360 then on you can basically just map 306 00:12:39,199 --> 00:12:46,880 any name in there onto a 307 00:12:43,360 --> 00:12:49,120 UUID without it colliding so they're 308 00:12:46,880 --> 00:12:50,399 deterministic this way so unlike the 309 00:12:49,120 --> 00:12:52,040 other ones where you expect to get a 310 00:12:50,399 --> 00:12:55,040 different unique UUID every time these 311 00:12:52,040 --> 00:12:57,199 are designed to get you back to the same 312 00:12:55,040 --> 00:12:58,360 one every time so for transaction one as 313 00:12:57,199 --> 00:13:00,279 you can see I've got the same value you 314 00:12:58,360 --> 00:13:01,760 can see my mouse yeah okay transaction 315 00:13:00,279 --> 00:13:03,160 one you can see the same value came out 316 00:13:01,760 --> 00:13:05,079 when you feed it the same input 317 00:13:03,160 --> 00:13:08,839 transaction two is a 
completely 318 00:13:05,079 --> 00:13:10,240 different ID so you can't so basically a 319 00:13:08,839 --> 00:13:11,199 one-way hash right basically like a 320 00:13:10,240 --> 00:13:14,880 password 321 00:13:11,199 --> 00:13:14,880 hash hang on let me 322 00:13:17,040 --> 00:13:21,760 just and UUID v4 pure random 323 00:13:24,839 --> 00:13:30,240 data and so nobody uses these except UUID v4 324 00:13:28,399 --> 00:13:33,279 is still fairly commonly used when you 325 00:13:30,240 --> 00:13:33,279 just want purely random 326 00:13:35,480 --> 00:13:41,360 stuff a lot of the questions a lot of 327 00:13:38,240 --> 00:13:43,639 the time people wonder if uh collisions 328 00:13:41,360 --> 00:13:45,760 were why you wouldn't want to use these 329 00:13:43,639 --> 00:13:48,440 if they were too likely to collide if 330 00:13:45,760 --> 00:13:52,360 you go to the Wikipedia page you'll see 331 00:13:48,440 --> 00:13:53,959 various absurd examples of um what is it 332 00:13:52,360 --> 00:13:56,440 like with UUID v4 which is completely 333 00:13:53,959 --> 00:13:59,160 random if you generated a billion of 334 00:13:56,440 --> 00:14:02,759 them a minute or no a billion a second 335 00:13:59,160 --> 00:14:05,079 for 90 years you'd have a coin flip 336 00:14:02,759 --> 00:14:08,279 chance of maybe having generated two of 337 00:14:05,079 --> 00:14:10,880 the same one um so the numbers that 338 00:14:08,279 --> 00:14:12,360 we're talking about for random 339 00:14:10,880 --> 00:14:14,600 collisions in a way that would affect 340 00:14:12,360 --> 00:14:16,880 your app are negligible like humans 341 00:14:14,600 --> 00:14:18,199 really don't have a great way of 342 00:14:16,880 --> 00:14:24,160 understanding in their head just how 343 00:14:18,199 --> 00:14:26,240 gigantic a 128-bit number is um but in 344 00:14:24,160 --> 00:14:27,759 the history of people using UUIDs there 345 00:14:26,240 --> 00:14:32,120 have been collisions but they've always 346 00:14:27,759 --> 00:14:35,279 been sort of 
um implementation bugs so a 347 00:14:32,120 --> 00:14:37,600 reasonably prominent example was I 348 00:14:35,279 --> 00:14:40,880 think it might not have been VMware but 349 00:14:37,600 --> 00:14:45,120 it was like a it was a VM company 350 00:14:40,880 --> 00:14:48,480 product that was incorrectly seeding the 351 00:14:45,120 --> 00:14:50,480 random number entropy pool on booting 352 00:14:48,480 --> 00:14:53,279 VMs and so the VMs had a higher chance 353 00:14:50,480 --> 00:14:55,120 of generating the same UUIDs uh so 354 00:14:53,279 --> 00:14:56,720 there's been that sort of 355 00:14:55,120 --> 00:15:01,040 thing in the past but that's not why 356 00:14:56,720 --> 00:15:01,040 these became unused 357 00:15:01,800 --> 00:15:05,320 it was because by and 358 00:15:07,320 --> 00:15:14,560 large everybody needed one feature of 359 00:15:11,360 --> 00:15:17,160 primary of identifiers that UUIDs didn't 360 00:15:14,560 --> 00:15:18,519 provide which was they needed numbers to 361 00:15:17,160 --> 00:15:22,759 go 362 00:15:18,519 --> 00:15:25,959 up um especially so a perfect example of 363 00:15:22,759 --> 00:15:27,600 this is a relational database like 364 00:15:25,959 --> 00:15:28,880 Postgres I use Postgres for my 365 00:15:27,600 --> 00:15:31,759 examples because I just use Postgres 366 00:15:28,880 --> 00:15:31,759 for everything 367 00:15:31,839 --> 00:15:37,759 um most relational databases use uh B-trees 368 00:15:36,040 --> 00:15:40,959 for their indexes and B-trees have 369 00:15:37,759 --> 00:15:42,800 been heavily optimized for insertions 370 00:15:40,959 --> 00:15:47,480 kind of happening at the end of the 371 00:15:42,800 --> 00:15:50,240 stack and so in real world terms you 372 00:15:47,480 --> 00:15:52,600 want locality of data that was sort of 373 00:15:50,240 --> 00:15:57,120 generated around the same time and so 374 00:15:52,600 --> 00:15:59,720 you would like your IDs for similar data 375 00:15:57,120 --> 00:16:03,839 to be 
close to each other on disk or 376 00:15:59,720 --> 00:16:05,240 logically uh the UUID standard all 377 00:16:03,839 --> 00:16:08,000 of those numbers come out pretty much 378 00:16:05,240 --> 00:16:09,319 random including the UUID v1 which had 379 00:16:08,000 --> 00:16:11,360 time but like I told you the time was 380 00:16:09,319 --> 00:16:14,360 backwards so the least significant bits 381 00:16:11,360 --> 00:16:15,920 are wrapping around all the time inserts 382 00:16:14,360 --> 00:16:16,959 will happen all through the table 383 00:16:15,920 --> 00:16:18,600 inserts will happen all through the 384 00:16:16,959 --> 00:16:21,199 index and that's fairly 385 00:16:18,600 --> 00:16:26,000 inefficient um some other complaints 386 00:16:21,199 --> 00:16:29,720 about UUID v1 um the Gregorian epoch thing 387 00:16:26,000 --> 00:16:31,680 just was annoying to implement uh 388 00:16:29,720 --> 00:16:33,440 if you did want to sort them by order 389 00:16:31,680 --> 00:16:35,800 you kind of had to parse the UUID which 390 00:16:33,440 --> 00:16:37,360 is not a great idea and using the MAC 391 00:16:35,800 --> 00:16:39,160 address was considered a bad idea people 392 00:16:37,360 --> 00:16:41,839 either wanted kind of arbitrary node IDs 393 00:16:39,160 --> 00:16:45,199 that they assigned or they really didn't 394 00:16:41,839 --> 00:16:48,519 want the potential security or identity 395 00:16:45,199 --> 00:16:51,759 problems of using um your literal 396 00:16:48,519 --> 00:16:51,759 network address in your primary 397 00:16:53,959 --> 00:16:59,759 keys and so yeah if we went back to that 398 00:16:57,199 --> 00:17:01,319 list 399 00:16:59,759 --> 00:17:03,839 uh all those Twitter ones and I think 400 00:17:01,319 --> 00:17:06,679 the MongoDB ones all of them are like 401 00:17:03,839 --> 00:17:08,720 64-bit integers that basically have some 402 00:17:06,679 --> 00:17:10,559 variation of like the time and then 403 00:17:08,720 --> 00:17:14,000 random data but the time is in 
correct 404 00:17:10,559 --> 00:17:15,799 order so that basically IDs will go up 405 00:17:14,000 --> 00:17:18,079 over time 406 00:17:15,799 --> 00:17:19,839 um Snowflake has a little bit of extra 407 00:17:18,079 --> 00:17:21,720 data in the middle it's used by Mastodon 408 00:17:19,839 --> 00:17:22,919 as well for those who use that uh it's 409 00:17:21,720 --> 00:17:26,720 like the 410 00:17:22,919 --> 00:17:28,919 time a type identifier and then random 411 00:17:26,720 --> 00:17:31,799 data um but the thing that ended up 412 00:17:28,919 --> 00:17:33,559 being really popular was time so 413 00:17:31,799 --> 00:17:36,640 that number basically goes up as you 414 00:17:33,559 --> 00:17:38,559 generate them and random data so that 415 00:17:36,640 --> 00:17:41,840 brings us finally 416 00:17:38,559 --> 00:17:45,120 to the new versions the new versions 417 00:17:41,840 --> 00:17:48,160 will probably be in Python by 418 00:17:45,120 --> 00:17:50,440 3.14 um there's pull requests for them 419 00:17:48,160 --> 00:17:52,400 and I see discussions on like the Python 420 00:17:50,440 --> 00:17:54,880 discussion lists um but there's plenty 421 00:17:52,400 --> 00:17:55,919 of libraries that use them anyway um oh 422 00:17:54,880 --> 00:17:57,120 yeah I was going to demonstrate some of 423 00:17:55,919 --> 00:18:00,840 these I even have like a terminal right 424 00:17:57,120 --> 00:18:03,000 here look there you go UUID v1s um 425 00:18:00,840 --> 00:18:04,799 actually yeah let's do this now so I 426 00:18:03,000 --> 00:18:07,600 just generated 20 UUID 427 00:18:04,799 --> 00:18:10,280 v1s and what you can see is that 428 00:18:07,600 --> 00:18:12,080 they're actually not that unique right 429 00:18:10,280 --> 00:18:14,640 like you can see that the time sequence 430 00:18:12,080 --> 00:18:16,520 is going up on the left and as soon as 431 00:18:14,640 --> 00:18:18,919 that actually yeah perfect timing let's 432 00:18:16,520 --> 00:18:21,280 just hang on I always go 
over time and 433 00:18:18,919 --> 00:18:22,480 now I'm suggesting we wait I always go 434 00:18:21,280 --> 00:18:24,000 over time when I'm talking about this 435 00:18:22,480 --> 00:18:25,640 and now if I just sit here for a little 436 00:18:24,000 --> 00:18:27,720 bit 437 00:18:25,640 --> 00:18:28,840 longer oh it's going to take too long 438 00:18:27,720 --> 00:18:31,120 but you can see what's going to happen 439 00:18:28,840 --> 00:18:32,360 the D is going to wrap around to E 440 00:18:31,120 --> 00:18:33,640 and then it's going to wrap around to F 441 00:18:32,360 --> 00:18:36,000 and then it's going to wrap back around 442 00:18:33,640 --> 00:18:37,360 to zero and so that means that the 443 00:18:36,000 --> 00:18:38,640 insertions of these IDs as they were 444 00:18:37,360 --> 00:18:41,520 being generated are all 445 00:18:38,640 --> 00:18:44,799 over the place um they're not going to 446 00:18:41,520 --> 00:18:47,280 be at the end of the index now if we use 447 00:18:44,799 --> 00:18:49,559 UUID v4 they're completely 448 00:18:47,280 --> 00:18:51,280 random right so they're also always 449 00:18:49,559 --> 00:18:52,000 going to be inserted in random places in 450 00:18:51,280 --> 00:18:56,000 your 451 00:18:52,000 --> 00:18:57,400 index um except for as you can see the 452 00:18:56,000 --> 00:18:59,240 little four down here which is the 453 00:18:57,400 --> 00:19:01,720 version specifier 454 00:18:59,240 --> 00:19:04,200 and the other non-random bit is this 455 00:19:01,720 --> 00:19:06,000 column here you'll notice is never lower 456 00:19:04,200 --> 00:19:09,840 than 457 00:19:06,000 --> 00:19:12,240 eight in hexadecimal value because the first 458 00:19:09,840 --> 00:19:15,799 two bits of 459 00:19:12,240 --> 00:19:16,760 that number are also um part of the 460 00:19:15,799 --> 00:19:18,760 version 461 00:19:16,760 --> 00:19:22,159 specifier so they're never quite as 462 00:19:18,760 --> 00:19:26,640 random as you totally thought anyway 463 
00:19:22,159 --> 00:19:29,720 let's go on to the new ones UUID v6 is 464 00:19:26,640 --> 00:19:32,640 exactly the same as UUID v1 465 00:19:29,720 --> 00:19:34,720 except put the time in the right order 466 00:19:32,640 --> 00:19:37,640 so it's still the Gregorian 100-nanosecond 467 00:19:34,720 --> 00:19:41,760 thingamabob um but you put the time in 468 00:19:37,640 --> 00:19:44,280 the correct order and the 469 00:19:41,760 --> 00:19:46,080 uh spec says don't use the network ID 470 00:19:44,280 --> 00:19:48,640 anymore just pick a random number and 471 00:19:46,080 --> 00:19:51,080 stick to it so can I generate some of 472 00:19:48,640 --> 00:19:54,600 these for you yeah if we 473 00:19:51,080 --> 00:19:54,600 go what do they look 474 00:19:55,240 --> 00:20:00,440 like they are going up over time oh and 475 00:19:59,080 --> 00:20:02,280 this implementation as you can see it's 476 00:20:00,440 --> 00:20:04,320 not sticking to a static node ID this 477 00:20:02,280 --> 00:20:05,760 last chunk here is the node ID uh and 478 00:20:04,320 --> 00:20:09,880 so they're just generating 479 00:20:05,760 --> 00:20:12,000 a random one each time um but the point 480 00:20:09,880 --> 00:20:14,840 of the UUID v6 they don't recommend you 481 00:20:12,000 --> 00:20:17,400 use it they just invented it because 482 00:20:14,840 --> 00:20:19,320 it's literally just UUID v1 with the 483 00:20:17,400 --> 00:20:20,919 bits rearranged so that you can 484 00:20:19,320 --> 00:20:23,240 implement it really easily if all you've 485 00:20:20,919 --> 00:20:24,440 got is UUID v1s you can just take that 486 00:20:23,240 --> 00:20:27,440 move this bit to here and then 487 00:20:24,440 --> 00:20:30,559 you're done UUID v7 is the one that they 488 00:20:27,440 --> 00:20:33,559 recommend you use for everything nowadays um 489 00:20:30,559 --> 00:20:36,720 and it's really dumb simple it is just 490 00:20:33,559 --> 00:20:38,440 the Unix timestamp you know the number 491 
00:20:36,720 --> 00:20:40,320 of milliseconds since January 1st 492 00:20:38,440 --> 00:20:42,039 1970 which pretty much every modern 493 00:20:40,320 --> 00:20:44,960 computer just uses you all are familiar 494 00:20:42,039 --> 00:20:49,039 with it Unix timestamp in milliseconds 495 00:20:44,960 --> 00:20:50,799 um for the first 48 bits and then a big 496 00:20:49,039 --> 00:20:52,159 chunk of random data like it's still a 497 00:20:50,799 --> 00:20:55,480 big chunk because you got 48 bits then 498 00:20:52,159 --> 00:20:57,200 you got like what 74 bits of random data 499 00:20:55,480 --> 00:21:00,480 so that's enough to make 500 00:20:57,200 --> 00:21:01,840 it really unguessable um and then there's a 501 00:21:00,480 --> 00:21:03,640 couple optional rules in there for 502 00:21:01,840 --> 00:21:05,520 sub-millisecond precision so like this 503 00:21:03,640 --> 00:21:06,640 little this little chunk right here 504 00:21:05,520 --> 00:21:09,520 instead of just being a part of the 505 00:21:06,640 --> 00:21:11,240 random data can be if your system clock 506 00:21:09,520 --> 00:21:15,120 goes down to nanosecond precision you can kind 507 00:21:11,240 --> 00:21:16,840 of use that for some more time precision 508 00:21:15,120 --> 00:21:18,279 um I'll generate a couple of those and 509 00:21:16,840 --> 00:21:21,919 you'll see more numbers on screen won't 510 00:21:18,279 --> 00:21:21,919 that be exciting 511 00:21:23,480 --> 00:21:28,760 um and they are very random except for 512 00:21:26,559 --> 00:21:30,080 that front part and they go up over time 513 00:21:28,760 --> 00:21:32,679 and they're guaranteed to go up over 514 00:21:30,080 --> 00:21:34,520 time that's the other rule of UUIDv7 is if 515 00:21:32,679 --> 00:21:36,039 there's it's written into the rules that 516 00:21:34,520 --> 00:21:38,039 implementations need to account for like 517 00:21:36,039 --> 00:21:42,000 clock jitter and stuff and need to 518 00:21:38,039 --> 00:21:43,960 either increment a counter but you can
519 00:21:42,000 --> 00:21:45,880 you can place design guarantees that it 520 00:21:43,960 --> 00:21:49,640 will go up over 521 00:21:45,880 --> 00:21:54,200 time and finally they created a UUIDv8 522 00:21:49,640 --> 00:21:55,960 and UUIDv8 is screw it build your own um 523 00:21:54,200 --> 00:21:57,559 you can decide what any of these bits 524 00:21:55,960 --> 00:21:58,880 are you can decide what chunks they are 525 00:21:57,559 --> 00:22:00,360 they just have to have the version bit 526 00:21:58,880 --> 00:22:03,799 in the right spot and the variant bit in 527 00:22:00,360 --> 00:22:05,240 the right spot um and this sounds like 528 00:22:03,799 --> 00:22:08,720 it might be a bit of a cop out but it's 529 00:22:05,240 --> 00:22:11,200 actually incredibly useful 530 00:22:08,720 --> 00:22:12,640 because 128 bits is probably you can 531 00:22:11,200 --> 00:22:14,760 construct a really useful identifier out 532 00:22:12,640 --> 00:22:16,840 of that you might need to put some kind 533 00:22:14,760 --> 00:22:21,440 of implementation specific information 534 00:22:16,840 --> 00:22:22,919 in it um but having a UUID spec for it 535 00:22:21,440 --> 00:22:25,919 means that you can use things like 536 00:22:22,919 --> 00:22:29,799 Postgres's native binary UUID column 537 00:22:25,919 --> 00:22:31,880 and indexes for it and so this is 538 00:22:29,799 --> 00:22:34,000 actually this is this is might actually 539 00:22:31,880 --> 00:22:35,679 be a really powerful useful new tool for 540 00:22:34,000 --> 00:22:37,760 people who do have to invent their own 541 00:22:35,679 --> 00:22:40,440 identifiers for something because you 542 00:22:37,760 --> 00:22:42,480 don't have to stash them in big strings 543 00:22:40,440 --> 00:22:45,000 in your Postgres 544 00:22:42,480 --> 00:22:48,320 database so it's kind of the general 545 00:22:45,000 --> 00:22:50,080 purpose uh do whatever you want and they 546 00:22:48,320 --> 00:22:51,640 still recommend that you don't depend on 547
00:22:50,080 --> 00:22:53,440 what's what information is in there like 548 00:22:51,640 --> 00:22:57,240 you shouldn't really be decoding these 549 00:22:53,440 --> 00:23:00,600 things anyway as as a best practice but 550 00:22:57,240 --> 00:23:03,320 by having that you use Python's UUID type 551 00:23:00,600 --> 00:23:05,840 you can um yeah database native formats 552 00:23:03,320 --> 00:23:09,240 that support it 553 00:23:05,840 --> 00:23:10,400 uh yeah those are the new versions so 554 00:23:09,240 --> 00:23:12,159 for the last bit of my talk I'm going to 555 00:23:10,400 --> 00:23:13,760 talk about like why you may or may not 556 00:23:12,159 --> 00:23:17,279 want to use these in your database 557 00:23:13,760 --> 00:23:20,559 primary key you don't you don't need 558 00:23:17,279 --> 00:23:23,000 UUIDs until your project or your data 559 00:23:20,559 --> 00:23:25,760 set or your system is so big or 560 00:23:23,000 --> 00:23:27,480 distributed that you can't afford the 561 00:23:25,760 --> 00:23:29,240 time or network cost of asking 562 00:23:27,480 --> 00:23:32,880 permission to use a an identifier as a 563 00:23:29,240 --> 00:23:37,080 primary key right um Postgres by design 564 00:23:32,880 --> 00:23:38,880 your primary key is always going into an 565 00:23:37,080 --> 00:23:41,480 index with with like a uniqueness 566 00:23:38,880 --> 00:23:43,159 constraint on it right so no matter what 567 00:23:41,480 --> 00:23:45,480 even if you're using UUIDs in your 568 00:23:43,159 --> 00:23:47,000 Postgres database you're still kind of 569 00:23:45,480 --> 00:23:48,559 asking permission if you can use that 570 00:23:47,000 --> 00:23:51,880 identifier because you're checking to 571 00:23:48,559 --> 00:23:53,559 see if it can go into your your your 572 00:23:51,880 --> 00:23:56,440 index and if it can't then there's a 573 00:23:53,559 --> 00:23:58,240 collision so UUIDs aren't designed for 574 00:23:56,440 --> 00:24:01,760 this use case but they're good for it 575
00:23:58,240 --> 00:24:03,279 any anyway um there's some real 576 00:24:01,760 --> 00:24:05,520 advantages 577 00:24:03,279 --> 00:24:07,039 uh migrating your primary keys when you 578 00:24:05,520 --> 00:24:08,720 realize they're not fit for purpose 579 00:24:07,039 --> 00:24:10,600 is a really expensive project the thing 580 00:24:08,720 --> 00:24:14,559 that I was just telling you before about 581 00:24:10,600 --> 00:24:15,799 moving from 32-bit to 64-bit integers 582 00:24:14,559 --> 00:24:16,960 by the time you've got billions of rows 583 00:24:15,799 --> 00:24:19,240 in your database and you don't want your 584 00:24:16,960 --> 00:24:22,400 database to have downtime you're in real 585 00:24:19,240 --> 00:24:23,640 trouble there um yeah by the time you 586 00:24:22,400 --> 00:24:27,399 realize you have to do it your database 587 00:24:23,640 --> 00:24:29,520 is always really really big um this one 588 00:24:27,399 --> 00:24:31,279 fascinated me this was the big lesson 589 00:24:29,520 --> 00:24:33,080 that I got for working for a company 590 00:24:31,279 --> 00:24:34,720 that had just off the bat because it 591 00:24:33,080 --> 00:24:36,080 wasn't a Django project probably they just 592 00:24:34,720 --> 00:24:37,679 were like no we're going to use UUIDs as 593 00:24:36,080 --> 00:24:40,120 our primary keys for every 594 00:24:37,679 --> 00:24:42,640 table it was a company that also dropped 595 00:24:40,120 --> 00:24:44,240 to raw SQL all the time um quite 596 00:24:42,640 --> 00:24:45,720 usefully like it was an SQLAlchemy 597 00:24:44,240 --> 00:24:48,480 project and so that meant that you're 598 00:24:45,720 --> 00:24:50,320 it's quite easy for you to go use the ORM 599 00:24:48,480 --> 00:24:51,919 when it's convenient but also when you 600 00:24:50,320 --> 00:24:54,360 have a query that of some complexity 601 00:24:51,919 --> 00:24:55,640 that you want to construct you feel the 602 00:24:54,360 --> 00:24:57,880 convention was that you feel quite 603
00:24:55,640 --> 00:25:00,159 casual about dropping down to Raw SQL to 604 00:24:57,880 --> 00:25:02,279 write it but when you're writing raw SQL 605 00:25:00,159 --> 00:25:04,000 with a modern database schema there 606 00:25:02,279 --> 00:25:05,760 there's a lot of risk of like doing the 607 00:25:04,000 --> 00:25:08,120 wrong join on the wrong identifier you 608 00:25:05,760 --> 00:25:09,880 might have several models that are like 609 00:25:08,120 --> 00:25:11,200 closely related closely stacked and you 610 00:25:09,880 --> 00:25:14,640 might join the wrong one on the wrong 611 00:25:11,200 --> 00:25:16,799 one and you know if every table has a 612 00:25:14,640 --> 00:25:18,360 row in it of ID 100 you won't notice 613 00:25:16,799 --> 00:25:22,640 that you've made this 614 00:25:18,360 --> 00:25:25,039 mistake a real advantage of your primary 615 00:25:22,640 --> 00:25:26,679 Keys being globally unique at least in 616 00:25:25,039 --> 00:25:29,080 terms of your entire application and not 617 00:25:26,679 --> 00:25:31,240 just your individual tables 618 00:25:29,080 --> 00:25:34,320 is that entire class of bugs is just 619 00:25:31,240 --> 00:25:37,600 eliminated because a developer will 620 00:25:34,320 --> 00:25:39,000 write a join and it won't get any rows 621 00:25:37,600 --> 00:25:40,200 and they'll go oh because I'm joining on 622 00:25:39,000 --> 00:25:42,799 the wrong foreign 623 00:25:40,200 --> 00:25:44,799 key and so I've got no idea how much 624 00:25:42,799 --> 00:25:47,520 time that saved us because it's one of 625 00:25:44,799 --> 00:25:49,240 those things like the bug can't happen 626 00:25:47,520 --> 00:25:51,919 so it didn't happen so I don't know how 627 00:25:49,240 --> 00:25:54,120 much time we saved but I've had that bug 628 00:25:51,919 --> 00:25:55,159 in the past and so you know working for 629 00:25:54,120 --> 00:25:57,440 a company for a couple years where that 630 00:25:55,159 --> 00:25:59,039 sort of thing just didn't happen um you 631 
00:25:57,440 --> 00:26:00,960 don't really know how much it bought us 632 00:25:59,039 --> 00:26:04,480 but it was not 633 00:26:00,960 --> 00:26:07,440 nothing um if you are using UUIDs in your 634 00:26:04,480 --> 00:26:09,480 database don't store them as strings uh 635 00:26:07,440 --> 00:26:12,240 if you store them as strings instead of 636 00:26:09,480 --> 00:26:15,039 being like what is it eight no 16 bytes 637 00:26:12,240 --> 00:26:17,559 of binary data it's about 36 bytes 638 00:26:15,039 --> 00:26:19,240 including the hyphens um a UUID with all 639 00:26:17,559 --> 00:26:20,880 capital letters is a different UUID than 640 00:26:19,240 --> 00:26:24,159 a UUID with all lowercase letters and 641 00:26:20,880 --> 00:26:26,399 that's not great um and you're looking 642 00:26:24,159 --> 00:26:28,679 up collation rules every time you do any 643 00:26:26,399 --> 00:26:31,200 kind of comparison or database index or 644 00:26:28,679 --> 00:26:33,039 B-tree insertion or anything which means 645 00:26:31,200 --> 00:26:35,360 effectively that when you're inserting a 646 00:26:33,039 --> 00:26:37,720 UUID into your table it has to check if 647 00:26:35,360 --> 00:26:41,799 it's like a Mexican UUID or a Polish UUID 648 00:26:37,720 --> 00:26:43,520 or a UTF-8 UUID because you order the 649 00:26:41,799 --> 00:26:44,480 letters in a different way when actually 650 00:26:43,520 --> 00:26:47,240 you just want to be treating them as 651 00:26:44,480 --> 00:26:48,960 binary data so other piece of solid 652 00:26:47,240 --> 00:26:53,440 advice if you're using Postgres and you 653 00:26:48,960 --> 00:26:56,080 want to use UUIDs use the UUID column way 654 00:26:53,440 --> 00:26:59,720 faster way more 655 00:26:56,080 --> 00:27:01,159 efficient um 656 00:26:59,720 --> 00:27:03,200 and yeah I've now built a couple 657 00:27:01,159 --> 00:27:06,760 databases just for side projects with 658 00:27:03,200 --> 00:27:07,760 using UUIDv7 uh as a primary key yeah it 659 00:27:06,760 -->
00:27:09,399 works and you don't have to think about 660 00:27:07,760 --> 00:27:10,399 it it's just like sure okay but of 661 00:27:09,399 --> 00:27:12,080 course this is something with like a 662 00:27:10,399 --> 00:27:13,480 couple million rows so it's not like you 663 00:27:12,080 --> 00:27:16,240 know my little side project that's 664 00:27:13,480 --> 00:27:19,320 collecting the temperature sensor in in 665 00:27:16,240 --> 00:27:21,039 my in my bathroom is um going to hit two 666 00:27:19,320 --> 00:27:23,880 billion rows anytime 667 00:27:21,039 --> 00:27:28,600 soon and then V4 is still useful for 668 00:27:23,880 --> 00:27:30,399 completely unguessable stuff V5 669 00:27:28,600 --> 00:27:33,159 keep an eye on V5 sometimes you might go 670 00:27:30,399 --> 00:27:35,919 oh actually like I have a use for 671 00:27:33,159 --> 00:27:38,320 this um where are 672 00:27:35,919 --> 00:27:40,720 we yeah and if you need to invent your 673 00:27:38,320 --> 00:27:43,159 own seriously consider using UUIDv8 674 00:27:40,720 --> 00:27:44,760 because it lets you use Postgres's 675 00:27:43,159 --> 00:27:47,279 column it lets you use all these this 676 00:27:44,760 --> 00:27:49,000 tooling that you've already got and then 677 00:27:47,279 --> 00:27:50,440 finally yeah like the the only bit of 678 00:27:49,000 --> 00:27:52,200 absolute solid advice that I think I 679 00:27:50,440 --> 00:27:54,559 have for you 680 00:27:52,200 --> 00:27:56,399 is if you work for a company you might 681 00:27:54,559 --> 00:27:58,960 end up with a couple billion rows sooner 682 00:27:56,399 --> 00:28:00,960 than you think and so UUIDs might not 683 00:27:58,960 --> 00:28:02,880 be the right answer but 32-bit integers 684 00:28:00,960 --> 00:28:05,279 like you're not you're saving yourself 685 00:28:02,880 --> 00:28:07,279 four bytes a row right or eight bytes a 686 00:28:05,279 --> 00:28:10,799 row if it's indexed as 687 00:28:07,279 --> 00:28:12,559 well for small tables that's not very 688
00:28:10,799 --> 00:28:14,200 much and for big tables You'll wish you 689 00:28:12,559 --> 00:28:16,279 had changed it so the one piece of 690 00:28:14,200 --> 00:28:17,559 advice I have for you is move away from 691 00:28:16,279 --> 00:28:19,640 move away from the small numbers for 692 00:28:17,559 --> 00:28:23,720 your primary keys and that's my entire 693 00:28:19,640 --> 00:28:26,920 slide deck I am done um we have like a 694 00:28:23,720 --> 00:28:35,709 minute for questions 695 00:28:26,920 --> 00:28:35,709 [Applause] 696 00:28:37,000 --> 00:28:41,960 thank you Tom we are unfortunately out 697 00:28:39,679 --> 00:28:44,000 of time for questions cool we have a 698 00:28:41,960 --> 00:28:47,440 gift for you as token of 699 00:28:44,000 --> 00:28:49,880 appreciation mug that is 700 00:28:47,440 --> 00:28:51,080 gorgeous thank you very much that is the 701 00:28:49,880 --> 00:28:53,960 only mugging 702 00:28:51,080 --> 00:28:55,480 allowed I'll be I'll be hanging out at 703 00:28:53,960 --> 00:28:56,679 the Kraken desk some of the time this 704 00:28:55,480 --> 00:28:59,080 weekend and otherwise just come and grab 705 00:28:56,679 --> 00:29:00,279 me at any moment if you want to chat 706 00:28:59,080 --> 00:29:03,279 thank you so much for your time thank 707 00:29:00,279 --> 00:29:03,279 you
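A minimal Python sketch of the UUIDv7 layout described in the talk: a 48-bit Unix timestamp in milliseconds, then the version and variant fields, with the remaining 74 bits random. The function name `uuid7_sketch` and its `unix_ms` parameter are made up for illustration and are not part of any library. The sketch also leaves out the counter that the talk mentions for keeping IDs increasing within the same millisecond, so only IDs from different milliseconds are guaranteed to sort in time order here.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-shaped value (illustrative helper, not a library API)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000  # Unix time in milliseconds

    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 used
    rand_a = (rand >> 62) & 0xFFF                 # 12 random bits
    rand_b = rand & ((1 << 62) - 1)               # 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: random
    value |= 0b10 << 62                           # bits 63..62: variant
    value |= rand_b                               # bits 61..0: random
    return uuid.UUID(int=value)


if __name__ == "__main__":
    u = uuid7_sketch()
    print(u, "version", u.version)
    # Binary form versus text form, as discussed for database storage.
    print(len(u.bytes), "bytes as binary,", len(str(u)), "characters as text")
```

Because the timestamp occupies the most significant bits, comparing two such values as integers (or as 16-byte binary strings) orders them by creation millisecond, which is the property that keeps B-tree inserts near the end of the index.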