1 00:00:00,480 --> 00:00:03,480 foreign 2 00:00:08,420 --> 00:00:13,080 have a wonderful outside and a sunshine 3 00:00:11,280 --> 00:00:16,320 and a lunch 4 00:00:13,080 --> 00:00:18,480 okay well you're back here to see 5 00:00:16,320 --> 00:00:20,420 another wonderful talk we are talking 6 00:00:18,480 --> 00:00:23,400 about what are we talking about 7 00:00:20,420 --> 00:00:25,980 multi-strategies for multi-tenancy 8 00:00:23,400 --> 00:00:26,880 strategies with Django and postgres you 9 00:00:25,980 --> 00:00:29,779 seem to know what you're talking about 10 00:00:26,880 --> 00:00:29,779 I'll let you take it away 11 00:00:32,279 --> 00:00:35,960 good afternoon Jenga nuts 12 00:00:36,360 --> 00:00:40,920 so the focus of my talk today is going 13 00:00:39,719 --> 00:00:42,300 to be 14 00:00:40,920 --> 00:00:44,820 we have decided to use Django and 15 00:00:42,300 --> 00:00:47,879 postgres on our website 16 00:00:44,820 --> 00:00:50,460 what can we do to install to ensure that 17 00:00:47,879 --> 00:00:52,620 data doesn't leak from one user to 18 00:00:50,460 --> 00:00:55,680 another user and in particular what can 19 00:00:52,620 --> 00:00:57,660 we do about developer mistakes so that 20 00:00:55,680 --> 00:01:00,000 developers don't make it possible for 21 00:00:57,660 --> 00:01:03,000 users to leak data between the between 22 00:01:00,000 --> 00:01:05,760 systems now I'm going to be talking a 23 00:01:03,000 --> 00:01:08,580 lot about tenants and tenancies in this 24 00:01:05,760 --> 00:01:11,040 in this talk if you're not aware of a 25 00:01:08,580 --> 00:01:13,619 tenancy or a tenant the idea is it's an 26 00:01:11,040 --> 00:01:15,600 isolated set of data that is using 27 00:01:13,619 --> 00:01:17,220 shared infrastructure so a really good 28 00:01:15,600 --> 00:01:18,479 example of that would be Gmail where you 29 00:01:17,220 --> 00:01:19,860 might have millions of users that are 30 00:01:18,479 --> 00:01:21,960 all using Google's computers they're all 31 00:01:19,860 --> 00:01:24,780 stored in the same servers 32 00:01:21,960 --> 00:01:26,759 but I as a Gmail user should not be able 33 00:01:24,780 --> 00:01:28,500 to see the data from another Gmail user 34 00:01:26,759 --> 00:01:30,240 in the system I have my own mailbox and 35 00:01:28,500 --> 00:01:31,680 those things should be kept completely 36 00:01:30,240 --> 00:01:34,020 separate 37 00:01:31,680 --> 00:01:35,400 in most situations a tenant will 38 00:01:34,020 --> 00:01:36,420 correspond directly with the user but 39 00:01:35,400 --> 00:01:37,500 that's not always going to be the case 40 00:01:36,420 --> 00:01:39,900 if you think about something like 41 00:01:37,500 --> 00:01:41,340 Salesforce a company will have an 42 00:01:39,900 --> 00:01:42,960 account on Salesforce and they might 43 00:01:41,340 --> 00:01:44,939 have a hundred users that are attached 44 00:01:42,960 --> 00:01:46,979 to that company within that company 45 00:01:44,939 --> 00:01:48,780 those hundred users can all see data 46 00:01:46,979 --> 00:01:50,700 relating to those customers but they 47 00:01:48,780 --> 00:01:52,200 shouldn't be able to see data from a 48 00:01:50,700 --> 00:01:55,100 different company that is also using the 49 00:01:52,200 --> 00:01:55,100 Salesforce platform 50 00:01:56,399 --> 00:02:00,659 so I'm going to talk about three major 51 00:01:58,740 --> 00:02:02,579 strategies today 52 00:02:00,659 --> 00:02:03,780 I'm going to spend most of the time on 53 00:02:02,579 --> 00:02:05,759 the last one because I think that's 54 00:02:03,780 --> 00:02:07,920 probably the most interesting 55 00:02:05,759 --> 00:02:10,259 and my assumptions here are that you are 56 00:02:07,920 --> 00:02:11,879 python developers who've used Django and 57 00:02:10,259 --> 00:02:14,040 in particular use the Django orm before 58 00:02:11,879 --> 00:02:17,459 if you've never used the Django orm but 59 00:02:14,040 --> 00:02:19,319 you have used postgres or SQL databases 60 00:02:17,459 --> 00:02:22,080 of some sort you'll probably get about 61 00:02:19,319 --> 00:02:23,940 70 but there will be some that may go 62 00:02:22,080 --> 00:02:26,340 over your head 63 00:02:23,940 --> 00:02:27,180 there's also many setup variations on 64 00:02:26,340 --> 00:02:29,940 these 65 00:02:27,180 --> 00:02:31,800 um like little tweaks you can make them 66 00:02:29,940 --> 00:02:33,480 just could be covering the broad the 67 00:02:31,800 --> 00:02:35,099 broad outlines here 68 00:02:33,480 --> 00:02:38,160 hopefully by the end you'll have picked 69 00:02:35,099 --> 00:02:40,080 up one or two new strategies 70 00:02:38,160 --> 00:02:41,879 so the first major strategy is that 71 00:02:40,080 --> 00:02:43,560 every tenant in your system gets their 72 00:02:41,879 --> 00:02:45,599 own app server and gets their own 73 00:02:43,560 --> 00:02:48,420 database on completely separate domains 74 00:02:45,599 --> 00:02:51,239 in a sense every single tenant has a 75 00:02:48,420 --> 00:02:54,060 complete new copy of your system 76 00:02:51,239 --> 00:02:56,459 this is great for code Simplicity it's 77 00:02:54,060 --> 00:02:58,560 very difficult for devs to accidentally 78 00:02:56,459 --> 00:03:00,599 make a mistake and expose data from a 79 00:02:58,560 --> 00:03:02,819 different tenant and in fact it's quite 80 00:03:00,599 --> 00:03:05,580 difficult for them to do that 81 00:03:02,819 --> 00:03:07,260 it's great for data isolation and not 82 00:03:05,580 --> 00:03:08,760 just data isolation but also performance 83 00:03:07,260 --> 00:03:10,860 isolation if you have one particular 84 00:03:08,760 --> 00:03:12,239 tenant who is hammering the database a 85 00:03:10,860 --> 00:03:13,260 lot that's not going to affect any of 86 00:03:12,239 --> 00:03:14,700 the other tenants because they're all 87 00:03:13,260 --> 00:03:16,200 using completely separate database 88 00:03:14,700 --> 00:03:17,400 service 89 00:03:16,200 --> 00:03:19,319 and 90 00:03:17,400 --> 00:03:21,300 depending on your cost model it also 91 00:03:19,319 --> 00:03:23,580 makes it possible to pass through the 92 00:03:21,300 --> 00:03:25,560 costs of Hosting directly to a tenant if 93 00:03:23,580 --> 00:03:27,300 you have some big corporates and you say 94 00:03:25,560 --> 00:03:28,620 well we're going to give this system to 95 00:03:27,300 --> 00:03:30,000 you but you're also going to pay for the 96 00:03:28,620 --> 00:03:33,239 hosting costs associated with your 97 00:03:30,000 --> 00:03:35,099 tenancy then it's possible to do that 98 00:03:33,239 --> 00:03:36,420 where it becomes painful is when you 99 00:03:35,099 --> 00:03:38,220 have lots of different tenants in the 100 00:03:36,420 --> 00:03:39,599 system every time you bring a new tenant 101 00:03:38,220 --> 00:03:40,620 on means you have to stand up new app 102 00:03:39,599 --> 00:03:43,200 servers you have to stand up new 103 00:03:40,620 --> 00:03:44,819 databases you can automate that process 104 00:03:43,200 --> 00:03:46,080 but every time you add moving parts to a 105 00:03:44,819 --> 00:03:47,940 system there's something that's going to 106 00:03:46,080 --> 00:03:50,040 go wrong and let's say for example you 107 00:03:47,940 --> 00:03:51,599 create a new AWS server well what 108 00:03:50,040 --> 00:03:53,760 happens if something goes wrong in that 109 00:03:51,599 --> 00:03:55,319 and it actually takes 40 minutes rather 110 00:03:53,760 --> 00:03:56,819 than the expected three minutes then you 111 00:03:55,319 --> 00:03:59,159 have a support call there's lots of 112 00:03:56,819 --> 00:04:01,200 things that can go wrong 113 00:03:59,159 --> 00:04:03,720 it's also not possible to easily run 114 00:04:01,200 --> 00:04:06,000 cross-tenancy queries if as a software 115 00:04:03,720 --> 00:04:08,099 developer I want to know how many users 116 00:04:06,000 --> 00:04:09,959 are across all of the tenants in the 117 00:04:08,099 --> 00:04:11,340 system I have to query every single one 118 00:04:09,959 --> 00:04:13,019 of those databases one at a time bring 119 00:04:11,340 --> 00:04:15,120 it back to Python and then aggregate the 120 00:04:13,019 --> 00:04:16,979 data together 121 00:04:15,120 --> 00:04:19,500 from an Administration perspective it's 122 00:04:16,979 --> 00:04:21,660 also a bit of a pain when I want to take 123 00:04:19,500 --> 00:04:24,020 backups I have to back up every single 124 00:04:21,660 --> 00:04:26,699 database one at a time 125 00:04:24,020 --> 00:04:28,500 and when I want to run when I want to 126 00:04:26,699 --> 00:04:30,479 push out new migrations I can't just 127 00:04:28,500 --> 00:04:32,759 push it out to one server I have to push 128 00:04:30,479 --> 00:04:33,660 out my new Django migrations to each 129 00:04:32,759 --> 00:04:35,820 server 130 00:04:33,660 --> 00:04:37,979 one at a time or in parallel but you 131 00:04:35,820 --> 00:04:40,860 still have to do it many times 132 00:04:37,979 --> 00:04:42,180 so usually this is a bad idea why do I 133 00:04:40,860 --> 00:04:44,100 mention it all 134 00:04:42,180 --> 00:04:46,320 in some circumstances it can actually be 135 00:04:44,100 --> 00:04:48,000 a good idea for example if you have a 136 00:04:46,320 --> 00:04:50,100 very small number of tenants I had a 137 00:04:48,000 --> 00:04:52,620 client once who had a custom written CRM 138 00:04:50,100 --> 00:04:54,320 that was written just for them there was 139 00:04:52,620 --> 00:04:57,120 no Assumption of any tenancy information 140 00:04:54,320 --> 00:04:59,100 whatsoever it was it was one company 141 00:04:57,120 --> 00:05:00,360 using that one system they want they 142 00:04:59,100 --> 00:05:02,160 decided they wanted to license it to 143 00:05:00,360 --> 00:05:03,960 another company well we could have spent 144 00:05:02,160 --> 00:05:05,580 months going back through the system and 145 00:05:03,960 --> 00:05:07,740 saying okay every single thing now needs 146 00:05:05,580 --> 00:05:09,180 to have an account ID attached to it or 147 00:05:07,740 --> 00:05:10,320 what we end up doing we spend a couple 148 00:05:09,180 --> 00:05:12,120 of days 149 00:05:10,320 --> 00:05:13,500 standing up a new database server a new 150 00:05:12,120 --> 00:05:14,699 application server and we were done with 151 00:05:13,500 --> 00:05:16,680 it 152 00:05:14,699 --> 00:05:18,180 it's also sometimes the only option if 153 00:05:16,680 --> 00:05:19,440 you have corporates with an I.T 154 00:05:18,180 --> 00:05:22,400 Department that has particularly strict 155 00:05:19,440 --> 00:05:22,400 security policies 156 00:05:23,280 --> 00:05:28,020 the next option is where we have a 157 00:05:25,919 --> 00:05:30,660 single Django app server backed by 158 00:05:28,020 --> 00:05:33,240 multiple databases the idea here is that 159 00:05:30,660 --> 00:05:35,699 we have custom middleware that looks at 160 00:05:33,240 --> 00:05:37,740 the incoming requests decides which 161 00:05:35,699 --> 00:05:40,380 tendency is this particular request 162 00:05:37,740 --> 00:05:43,199 relating to and then in your custom 163 00:05:40,380 --> 00:05:44,400 Django database router you remember you 164 00:05:43,199 --> 00:05:46,440 go back and you look at the middleware 165 00:05:44,400 --> 00:05:47,880 and say well what tenancy are we 166 00:05:46,440 --> 00:05:50,220 currently dealing with okay we'll route 167 00:05:47,880 --> 00:05:52,080 that now through to the correct database 168 00:05:50,220 --> 00:05:53,820 the nice thing about this is that hooks 169 00:05:52,080 --> 00:05:54,840 to do it are already there in Django you 170 00:05:53,820 --> 00:05:56,160 have to write a bit of code but you're 171 00:05:54,840 --> 00:05:58,440 not messing around with any deep 172 00:05:56,160 --> 00:06:00,479 internals or anything 173 00:05:58,440 --> 00:06:02,400 this inherits almost all of the 174 00:06:00,479 --> 00:06:04,139 advantages and the disadvantages of 175 00:06:02,400 --> 00:06:07,500 having multiple app servers but you do 176 00:06:04,139 --> 00:06:09,840 save on app server hosting costs 177 00:06:07,500 --> 00:06:12,320 and while the code is generally simple 178 00:06:09,840 --> 00:06:14,699 there are a few gotchas so for example 179 00:06:12,320 --> 00:06:16,620 Django transactions by default only 180 00:06:14,699 --> 00:06:17,759 apply to the default database which may 181 00:06:16,620 --> 00:06:20,220 not be the database they're actually 182 00:06:17,759 --> 00:06:21,840 working with and you think everything's 183 00:06:20,220 --> 00:06:23,520 working fine until you discover that 184 00:06:21,840 --> 00:06:24,660 hang on the data is inconsistent here 185 00:06:23,520 --> 00:06:26,220 because transactions weren't being 186 00:06:24,660 --> 00:06:27,660 applied to the correct database at the 187 00:06:26,220 --> 00:06:29,940 time 188 00:06:27,660 --> 00:06:31,440 the biggest problem in it it has though 189 00:06:29,940 --> 00:06:33,180 is what are you going to do with the 190 00:06:31,440 --> 00:06:35,580 situation where you push out a new 191 00:06:33,180 --> 00:06:37,259 update which includes migrations 192 00:06:35,580 --> 00:06:39,120 and you start migrating each of those 193 00:06:37,259 --> 00:06:40,860 databases one at a time and you've got 194 00:06:39,120 --> 00:06:43,199 three thousand databases to do and on 195 00:06:40,860 --> 00:06:44,220 database 1 500 something fails and you 196 00:06:43,199 --> 00:06:46,919 realize there's a problem with the 197 00:06:44,220 --> 00:06:48,840 migrations there's no rollback here now 198 00:06:46,919 --> 00:06:51,180 you have a single set of app server code 199 00:06:48,840 --> 00:06:53,520 that is somehow having to deal with 1500 200 00:06:51,180 --> 00:06:56,340 databases on the old migration schema 201 00:06:53,520 --> 00:06:57,840 and 1500 databases in the new schema 202 00:06:56,340 --> 00:07:00,080 that's going to be a bit of a pain to 203 00:06:57,840 --> 00:07:00,080 deal with 204 00:07:00,780 --> 00:07:05,400 a nicer variation of that is to use a 205 00:07:03,360 --> 00:07:07,740 schema per tenant now if you're not 206 00:07:05,400 --> 00:07:09,720 familiar with postgres schemas think of 207 00:07:07,740 --> 00:07:12,539 them as postgres equivalent of 208 00:07:09,720 --> 00:07:14,520 namespaces within a postgres database 209 00:07:12,539 --> 00:07:16,919 you have many little smaller databases 210 00:07:14,520 --> 00:07:18,539 that we call schemas instead those 211 00:07:16,919 --> 00:07:20,639 schemas are all completely independent 212 00:07:18,539 --> 00:07:22,500 of each other and in fact the table 213 00:07:20,639 --> 00:07:24,300 definitions inside each schema could be 214 00:07:22,500 --> 00:07:26,099 different now we probably don't want to 215 00:07:24,300 --> 00:07:27,539 do that it's theoretically possible 216 00:07:26,099 --> 00:07:29,340 instead we want to have the same 217 00:07:27,539 --> 00:07:32,400 database definitions across all of the 218 00:07:29,340 --> 00:07:34,199 schemas but instead inside each schema 219 00:07:32,400 --> 00:07:36,300 is just the data for one particular 220 00:07:34,199 --> 00:07:37,860 tenant 221 00:07:36,300 --> 00:07:39,419 and again the idea here is that 222 00:07:37,860 --> 00:07:41,759 middleware will look at the incoming 223 00:07:39,419 --> 00:07:43,380 request and on the current Django 224 00:07:41,759 --> 00:07:46,340 database connection it will set the 225 00:07:43,380 --> 00:07:46,340 current schema to use 226 00:07:47,280 --> 00:07:51,300 again the majority of your code can 227 00:07:49,440 --> 00:07:52,860 simply pretend that tenancies do not 228 00:07:51,300 --> 00:07:55,580 exist the middleware has already taken 229 00:07:52,860 --> 00:07:55,580 care of it for you 230 00:07:55,620 --> 00:08:01,919 um and unlike the last setup because all 231 00:07:59,039 --> 00:08:03,180 of your databases sorry because all of 232 00:08:01,919 --> 00:08:05,160 your schemers sit within the one 233 00:08:03,180 --> 00:08:09,680 database transactions work as you would 234 00:08:05,160 --> 00:08:09,680 expect transactions cover across schemas 235 00:08:10,380 --> 00:08:14,400 it's 236 00:08:11,880 --> 00:08:16,919 a bit better for very large numbers of 237 00:08:14,400 --> 00:08:17,639 schemas you can dump the database as a 238 00:08:16,919 --> 00:08:19,979 whole 239 00:08:17,639 --> 00:08:22,080 sort of the big problem here is that 240 00:08:19,979 --> 00:08:23,699 when you use some standard postgres 241 00:08:22,080 --> 00:08:25,560 tools PG dump you'll quickly discover 242 00:08:23,699 --> 00:08:27,120 that postgres has some internal limits 243 00:08:25,560 --> 00:08:29,520 on the number of table locks that can be 244 00:08:27,120 --> 00:08:31,680 taken at once now that can be increased 245 00:08:29,520 --> 00:08:32,940 by changing the postgres settings but 246 00:08:31,680 --> 00:08:34,680 you're forever going to be chasing your 247 00:08:32,940 --> 00:08:36,360 tail if you have to increase this and 248 00:08:34,680 --> 00:08:38,039 increase this and increase this every 249 00:08:36,360 --> 00:08:39,659 time you do that you're creating fixed 250 00:08:38,039 --> 00:08:42,800 you're setting aside fixed memory 251 00:08:39,659 --> 00:08:42,800 buffers to deal with that 252 00:08:43,440 --> 00:08:46,500 and cross-tenancy queries are now 253 00:08:45,240 --> 00:08:48,540 possible 254 00:08:46,500 --> 00:08:49,680 because this because your data all sits 255 00:08:48,540 --> 00:08:52,140 in the one database you can actually 256 00:08:49,680 --> 00:08:53,700 query multiple schemas at once 257 00:08:52,140 --> 00:08:55,380 it's a bit ugly because what you 258 00:08:53,700 --> 00:08:57,360 effectively do is Select from this 259 00:08:55,380 --> 00:08:58,800 tenancy Union this tenancy Union this 260 00:08:57,360 --> 00:09:01,080 one Union this one Union this one you 261 00:08:58,800 --> 00:09:02,580 end if you've got 10 000 tenants that's 262 00:09:01,080 --> 00:09:04,260 a really really big query you're going 263 00:09:02,580 --> 00:09:06,600 to be running 264 00:09:04,260 --> 00:09:09,420 and the administration deployment 265 00:09:06,600 --> 00:09:10,860 situation is a little better however 266 00:09:09,420 --> 00:09:12,899 again as I mentioned there are 267 00:09:10,860 --> 00:09:15,000 transaction limits when you try and run 268 00:09:12,899 --> 00:09:16,740 your migrations you think okay I'll put 269 00:09:15,000 --> 00:09:18,060 everything in one big transaction and 270 00:09:16,740 --> 00:09:20,220 I'll run the migrations across all my 271 00:09:18,060 --> 00:09:21,720 schemas until you get halfway through 272 00:09:20,220 --> 00:09:23,820 and then postgres says I've run out of 273 00:09:21,720 --> 00:09:27,019 transaction locks so I've run out of 274 00:09:23,820 --> 00:09:27,019 table locks for your transaction 275 00:09:27,120 --> 00:09:30,120 if you try and do this without 276 00:09:28,260 --> 00:09:31,440 transaction at all then we're back to 277 00:09:30,120 --> 00:09:33,060 the previous problem where you could get 278 00:09:31,440 --> 00:09:34,560 halfway through half your schemes have 279 00:09:33,060 --> 00:09:37,160 been updated half your schemers haven't 280 00:09:34,560 --> 00:09:37,160 been updated 281 00:09:38,459 --> 00:09:43,019 if you've not used it before connection 282 00:09:40,200 --> 00:09:47,160 pooling is a way of sharing database 283 00:09:43,019 --> 00:09:49,440 connections between multiple requests so 284 00:09:47,160 --> 00:09:51,540 normally in the simplest configuration 285 00:09:49,440 --> 00:09:52,680 you have a request come in Django 286 00:09:51,540 --> 00:09:54,360 creates a connection to the database 287 00:09:52,680 --> 00:09:56,100 when the request is finished processing 288 00:09:54,360 --> 00:09:57,420 it tears down that connection that 289 00:09:56,100 --> 00:09:59,580 creating connections and tearing down 290 00:09:57,420 --> 00:10:01,740 connections that has overhead that can 291 00:09:59,580 --> 00:10:02,880 add up over times so the idea behind 292 00:10:01,740 --> 00:10:04,860 connection pulling is you generally have 293 00:10:02,880 --> 00:10:07,680 a proxy sitting between your application 294 00:10:04,860 --> 00:10:09,180 and the actual postgres database and it 295 00:10:07,680 --> 00:10:11,820 will hold those connections open forever 296 00:10:09,180 --> 00:10:13,200 so that the postgres overhead of 297 00:10:11,820 --> 00:10:14,760 starting and stopping connections 298 00:10:13,200 --> 00:10:16,680 doesn't exist 299 00:10:14,760 --> 00:10:18,240 depending on your connection pooling 300 00:10:16,680 --> 00:10:19,800 setup it's also possible to share 301 00:10:18,240 --> 00:10:21,180 connections between requests 302 00:10:19,800 --> 00:10:23,100 concurrently 303 00:10:21,180 --> 00:10:25,820 it's not a common setup but it is a 304 00:10:23,100 --> 00:10:25,820 possible setup 305 00:10:26,399 --> 00:10:30,360 because we are setting the current 306 00:10:28,500 --> 00:10:32,160 schema on the connection our connections 307 00:10:30,360 --> 00:10:34,019 are now stateful if you try an 308 00:10:32,160 --> 00:10:36,000 interspersed connection interspersed 309 00:10:34,019 --> 00:10:37,680 requests to the database from multiple 310 00:10:36,000 --> 00:10:38,940 requests simultaneously they're going to 311 00:10:37,680 --> 00:10:41,959 clobber each other and you'll end up 312 00:10:38,940 --> 00:10:41,959 using the wrong schema 313 00:10:42,660 --> 00:10:46,860 there is a third-party package called 314 00:10:44,279 --> 00:10:48,779 Django tenants which basically does all 315 00:10:46,860 --> 00:10:50,880 of this for you all of the middleware 316 00:10:48,779 --> 00:10:52,560 for routing towards schemas it looks at 317 00:10:50,880 --> 00:10:54,180 the subdomain and says well based on the 318 00:10:52,560 --> 00:10:55,740 subdomain this is the schema you want to 319 00:10:54,180 --> 00:10:57,720 use the fact that there's third-party 320 00:10:55,740 --> 00:11:00,000 code to do this is quite nice it also 321 00:10:57,720 --> 00:11:01,500 has the ability to run your migrations 322 00:11:00,000 --> 00:11:03,540 in parallel so rather than each 323 00:11:01,500 --> 00:11:07,880 upgrading each schema one at a time 324 00:11:03,540 --> 00:11:07,880 it'll paralyze them and do 30 at a time 325 00:11:08,100 --> 00:11:12,200 probably the most interesting setup 326 00:11:10,320 --> 00:11:14,820 though and the one that you've probably 327 00:11:12,200 --> 00:11:17,100 used before and most developers would 328 00:11:14,820 --> 00:11:18,420 use intuitively is where you have a 329 00:11:17,100 --> 00:11:20,100 single app server you have a single 330 00:11:18,420 --> 00:11:22,440 database you have a single set of tables 331 00:11:20,100 --> 00:11:24,720 and you share those tables between the 332 00:11:22,440 --> 00:11:26,579 different tenants within each table you 333 00:11:24,720 --> 00:11:28,800 record who does this particular piece of 334 00:11:26,579 --> 00:11:30,839 data belong to 335 00:11:28,800 --> 00:11:32,160 now in order to keep the data separate 336 00:11:30,839 --> 00:11:34,620 between the tenants there's a bunch of 337 00:11:32,160 --> 00:11:35,700 strategies that we can follow so these 338 00:11:34,620 --> 00:11:37,200 are the three things that we're going to 339 00:11:35,700 --> 00:11:38,820 be looking at today and the nice thing 340 00:11:37,200 --> 00:11:40,079 about these three is they're somewhat 341 00:11:38,820 --> 00:11:42,300 independent of each other and you can 342 00:11:40,079 --> 00:11:44,279 adopt them incrementally so if you had 343 00:11:42,300 --> 00:11:46,019 an existing system you can actually add 344 00:11:44,279 --> 00:11:48,500 these onto an existing system and get a 345 00:11:46,019 --> 00:11:48,500 bit more security 346 00:11:48,720 --> 00:11:52,320 there are some common characteristics 347 00:11:49,980 --> 00:11:54,060 that they all have firstly the code is 348 00:11:52,320 --> 00:11:55,740 not as simple as the previous methods 349 00:11:54,060 --> 00:11:56,760 you always have to bear in the back of 350 00:11:55,740 --> 00:11:58,800 your mind that there are multiple 351 00:11:56,760 --> 00:12:00,779 tenants in the database and all these 352 00:11:58,800 --> 00:12:02,459 things can help avoid make mistakes it's 353 00:12:00,779 --> 00:12:04,200 still possible to make mistakes and 354 00:12:02,459 --> 00:12:07,200 accidentally expose data 355 00:12:04,200 --> 00:12:09,560 the isolation is okay but it's not great 356 00:12:07,200 --> 00:12:12,839 again coding mistakes can have a problem 357 00:12:09,560 --> 00:12:14,519 however if you have thousands of tenants 358 00:12:12,839 --> 00:12:16,980 then it's basically the only option 359 00:12:14,519 --> 00:12:19,440 you're going to have 360 00:12:16,980 --> 00:12:20,940 cross-tenancy queries are possible in 361 00:12:19,440 --> 00:12:23,279 fact the problem is accidentally doing 362 00:12:20,940 --> 00:12:25,019 cross-tenancy queries 363 00:12:23,279 --> 00:12:27,839 and the administration and deployment 364 00:12:25,019 --> 00:12:29,339 situation is fine everything sits within 365 00:12:27,839 --> 00:12:32,339 the one database your the standard 366 00:12:29,339 --> 00:12:35,060 Django workflows that you would follow 367 00:12:32,339 --> 00:12:35,060 um apply here 368 00:12:35,100 --> 00:12:40,560 so here's an example database diagram 369 00:12:37,680 --> 00:12:42,540 for a task tracking system 370 00:12:40,560 --> 00:12:44,459 and down the bottom is some Associated 371 00:12:42,540 --> 00:12:46,260 model code the idea here is that we're 372 00:12:44,459 --> 00:12:48,540 going to have custom model managers and 373 00:12:46,260 --> 00:12:50,339 query sets which have a standardized 374 00:12:48,540 --> 00:12:52,560 interface on every single model in the 375 00:12:50,339 --> 00:12:54,660 system that says limit the data that I'm 376 00:12:52,560 --> 00:12:56,040 selecting out of this table to just one 377 00:12:54,660 --> 00:12:58,079 particular tenant 378 00:12:56,040 --> 00:12:59,940 now one of the nice consequences of 379 00:12:58,079 --> 00:13:01,320 standardizing on one thing across all of 380 00:12:59,940 --> 00:13:03,300 the tables in your system is that if 381 00:13:01,320 --> 00:13:04,560 you're using class-based views The 382 00:13:03,300 --> 00:13:06,240 class-based View doesn't need to know 383 00:13:04,560 --> 00:13:08,459 what model it's dealing with it can 384 00:13:06,240 --> 00:13:10,560 always call filter tenant and the code 385 00:13:08,459 --> 00:13:12,779 will just work so you can have your base 386 00:13:10,560 --> 00:13:15,120 classes which apply this tenancy 387 00:13:12,779 --> 00:13:18,860 filtering automatically and it for most 388 00:13:15,120 --> 00:13:18,860 of the system it just comes out for free 389 00:13:20,220 --> 00:13:25,019 if a table doesn't join directly to your 390 00:13:22,500 --> 00:13:26,820 table to to the account table which is 391 00:13:25,019 --> 00:13:28,079 the tenant table in this particular 392 00:13:26,820 --> 00:13:30,779 example 393 00:13:28,079 --> 00:13:32,519 then we just use standard orm joins in 394 00:13:30,779 --> 00:13:34,920 order to get back to that table 395 00:13:32,519 --> 00:13:37,860 now as a practical matter 396 00:13:34,920 --> 00:13:39,420 the middleware has a copy of the request 397 00:13:37,860 --> 00:13:40,680 and from that request it's able to 398 00:13:39,420 --> 00:13:42,480 figure out which tenant we're supposed 399 00:13:40,680 --> 00:13:43,920 to be dealing with or maybe it looks at 400 00:13:42,480 --> 00:13:45,180 the user in the request and from that 401 00:13:43,920 --> 00:13:47,040 figures out which tenant we're supposed 402 00:13:45,180 --> 00:13:48,959 to be dealing with and since the view 403 00:13:47,040 --> 00:13:51,000 has access to the request it now has 404 00:13:48,959 --> 00:13:52,920 access to the tenant ID 405 00:13:51,000 --> 00:13:54,300 but not everything else in the system is 406 00:13:52,920 --> 00:13:56,579 going to have the tenant ID and where 407 00:13:54,300 --> 00:13:58,620 now we could pass that tenant ID down 408 00:13:56,579 --> 00:14:00,000 the system down the call chains that's 409 00:13:58,620 --> 00:14:01,500 going to get really painful really 410 00:14:00,000 --> 00:14:03,360 quickly 411 00:14:01,500 --> 00:14:04,920 um just as a pragmatic thing you 412 00:14:03,360 --> 00:14:06,839 probably want to store the current 413 00:14:04,920 --> 00:14:08,639 tenant ID in the middleware so that you 414 00:14:06,839 --> 00:14:10,019 don't have to pass it everywhere through 415 00:14:08,639 --> 00:14:11,940 the system 416 00:14:10,019 --> 00:14:14,339 and keep track of that in thread local 417 00:14:11,940 --> 00:14:16,980 storage now if you've not used it before 418 00:14:14,339 --> 00:14:19,200 thread local storage is 419 00:14:16,980 --> 00:14:20,940 essentially global data that only 420 00:14:19,200 --> 00:14:22,079 applies to the current thread so if 421 00:14:20,940 --> 00:14:24,540 you're running on a multi-threaded 422 00:14:22,079 --> 00:14:27,240 server you won't clobber data that 423 00:14:24,540 --> 00:14:29,459 relates to it to the wrong thread the 424 00:14:27,240 --> 00:14:31,800 reason we want to do this is that if 425 00:14:29,459 --> 00:14:33,839 you're not aware Django only creates one 426 00:14:31,800 --> 00:14:35,880 copy of each middleware object on 427 00:14:33,839 --> 00:14:38,700 Startup so you can't store the data in 428 00:14:35,880 --> 00:14:40,980 the middleware object itself or it will 429 00:14:38,700 --> 00:14:42,660 get clobbered by whatever the whatever 430 00:14:40,980 --> 00:14:45,980 the next request that happens to be 431 00:14:42,660 --> 00:14:45,980 simultaneously processed is 432 00:14:47,100 --> 00:14:51,660 so there are some caveats about this 433 00:14:49,740 --> 00:14:54,180 whoops wrong direction 434 00:14:51,660 --> 00:14:56,399 number one if you write custom SQL 435 00:14:54,180 --> 00:14:57,660 you're on your own there's no help you 436 00:14:56,399 --> 00:14:59,699 need to make sure that you don't make 437 00:14:57,660 --> 00:15:02,880 mistakes with any custom or irm code 438 00:14:59,699 --> 00:15:05,459 sorry any custom SQL code 439 00:15:02,880 --> 00:15:06,959 having the joins automatically applied 440 00:15:05,459 --> 00:15:08,760 to every single table can actually 441 00:15:06,959 --> 00:15:10,320 change the meaning of some of your 442 00:15:08,760 --> 00:15:12,360 aggregate queries 443 00:15:10,320 --> 00:15:14,399 it's not a common circumstance but you 444 00:15:12,360 --> 00:15:15,480 will run into it when you when you when 445 00:15:14,399 --> 00:15:17,279 you write your queries and say where's 446 00:15:15,480 --> 00:15:19,380 the data not coming back how I expected 447 00:15:17,279 --> 00:15:21,779 it to which means that sometimes you 448 00:15:19,380 --> 00:15:23,399 will need to bypass and not call 449 00:15:21,779 --> 00:15:25,139 filtered tenant and in fact apply the 450 00:15:23,399 --> 00:15:28,199 tenant filtering manually even though 451 00:15:25,139 --> 00:15:31,139 you are using the orm 452 00:15:28,199 --> 00:15:33,420 a more subtle problem is that by default 453 00:15:31,139 --> 00:15:35,100 if I have a model with a foreign key 454 00:15:33,420 --> 00:15:37,860 that refers to another model 455 00:15:35,100 --> 00:15:39,660 that second model also creates a reverse 456 00:15:37,860 --> 00:15:40,860 foreign key in Django back to the 457 00:15:39,660 --> 00:15:42,720 original model 458 00:15:40,860 --> 00:15:44,279 now what this means is that if I have a 459 00:15:42,720 --> 00:15:45,779 shared table so for example I have a 460 00:15:44,279 --> 00:15:47,279 list of countries I don't need to 461 00:15:45,779 --> 00:15:48,660 duplicate that between the all the 462 00:15:47,279 --> 00:15:49,920 tenants because the list of countries is 463 00:15:48,660 --> 00:15:51,839 fixed 464 00:15:49,920 --> 00:15:54,120 and I have users who all have a country 465 00:15:51,839 --> 00:15:55,920 then by following user.country.user 466 00:15:54,120 --> 00:15:57,420 underscore set that user underscore set 467 00:15:55,920 --> 00:15:59,459 is the reverse foreign key that jenko 468 00:15:57,420 --> 00:16:01,860 creates automatically I'm not just 469 00:15:59,459 --> 00:16:03,300 accessing users that relate to that 470 00:16:01,860 --> 00:16:05,639 country for the tenant that I started 471 00:16:03,300 --> 00:16:07,800 with I'm accessing users from the entire 472 00:16:05,639 --> 00:16:09,959 system that relate to that country so 473 00:16:07,800 --> 00:16:12,000 you need to be careful about that 474 00:16:09,959 --> 00:16:14,040 if you want to be strict then what you 475 00:16:12,000 --> 00:16:15,779 can do is turn off the reverse current 476 00:16:14,040 --> 00:16:17,940 foreign Keys you need to add on a 477 00:16:15,779 --> 00:16:20,300 foreign Key by Foreign key basis or user 478 00:16:17,940 --> 00:16:22,320 base class that does that by default 479 00:16:20,300 --> 00:16:25,260 and again if you want to be strict 480 00:16:22,320 --> 00:16:26,820 there's a system of checks called Django 481 00:16:25,260 --> 00:16:28,139 system checks if you've not used those 482 00:16:26,820 --> 00:16:30,180 before go look up the manual they're 483 00:16:28,139 --> 00:16:33,000 fantastic the code that will run on 484 00:16:30,180 --> 00:16:35,220 every Django server startup what you can 485 00:16:33,000 --> 00:16:36,959 do is introspect all of the apps in the 486 00:16:35,220 --> 00:16:38,220 system get the models for those apps 487 00:16:36,959 --> 00:16:39,540 look at the fields in each of those 488 00:16:38,220 --> 00:16:41,040 models and make sure that every single 489 00:16:39,540 --> 00:16:42,480 foreign key field has had the reverse 490 00:16:41,040 --> 00:16:45,920 foreign key disabled that way you can't 491 00:16:42,480 --> 00:16:45,920 accidentally Escape out of it 492 00:16:46,139 --> 00:16:50,639 but the biggest problem of all is that 493 00:16:48,300 --> 00:16:52,259 the develop that is that the developer 494 00:16:50,639 --> 00:16:54,600 could simply forget to call filter 495 00:16:52,259 --> 00:16:57,199 tenant and nothing's going to stop them 496 00:16:54,600 --> 00:16:57,199 from doing that 497 00:16:57,420 --> 00:17:01,519 so oops what can we do about that 498 00:17:02,220 --> 00:17:05,880 what we can do is we can remember in 499 00:17:04,319 --> 00:17:07,439 each query set where the filter tenant 500 00:17:05,880 --> 00:17:10,380 was actually called and if it wasn't 501 00:17:07,439 --> 00:17:12,540 called then we will log a warning or 502 00:17:10,380 --> 00:17:13,919 raise an exception when the query set 503 00:17:12,540 --> 00:17:16,020 actually goes to fetch data back from 504 00:17:13,919 --> 00:17:17,819 the database 505 00:17:16,020 --> 00:17:19,679 um or you do something else like in 506 00:17:17,819 --> 00:17:21,299 debug you log a warning but in 507 00:17:19,679 --> 00:17:22,799 production and on CR you actually throw 508 00:17:21,299 --> 00:17:24,120 a hard error so that it can't escape 509 00:17:22,799 --> 00:17:26,100 it's a bit nicer if you're doing 510 00:17:24,120 --> 00:17:27,720 refactoring because it can be a real 511 00:17:26,100 --> 00:17:30,540 pain sometimes when you have to 512 00:17:27,720 --> 00:17:32,460 constantly stop and start 513 00:17:30,540 --> 00:17:34,200 if you do this though you'll quickly run 514 00:17:32,460 --> 00:17:36,600 into the difference between Django's 515 00:17:34,200 --> 00:17:38,520 base manager and its default manager if 516 00:17:36,600 --> 00:17:40,020 you've ever used soft deletes in Django 517 00:17:38,520 --> 00:17:40,980 you'll have run into the same situation 518 00:17:40,020 --> 00:17:42,840 before 519 00:17:40,980 --> 00:17:45,059 the default manager in Django is the 520 00:17:42,840 --> 00:17:46,740 manager that you normally use 521 00:17:45,059 --> 00:17:48,780 Django also has the notion of a base 522 00:17:46,740 --> 00:17:51,240 manager which is you can think of that 523 00:17:48,780 --> 00:17:52,799 as the master manager that has access to 524 00:17:51,240 --> 00:17:55,020 everything in the system 525 00:17:52,799 --> 00:17:57,480 Django uses this for things like refresh 526 00:17:55,020 --> 00:17:59,700 from DB if you want to get the data back 527 00:17:57,480 --> 00:18:01,860 from a record it uses it for migrations 528 00:17:59,700 --> 00:18:05,940 it uses for fetching foreign key 529 00:18:01,860 --> 00:18:08,160 relationships and you need to provide a 530 00:18:05,940 --> 00:18:09,840 way for Django to 531 00:18:08,160 --> 00:18:11,640 fetch data from the database without 532 00:18:09,840 --> 00:18:13,140 having called this because Django is not 533 00:18:11,640 --> 00:18:16,820 going to call this 534 00:18:13,140 --> 00:18:16,820 call filter tenant I mean 535 00:18:19,080 --> 00:18:23,160 now normally when we design a database 536 00:18:20,760 --> 00:18:25,020 we like to normalize everything 537 00:18:23,160 --> 00:18:27,900 so if we have a look at our task 538 00:18:25,020 --> 00:18:29,760 management task management system again 539 00:18:27,900 --> 00:18:32,640 each of those tables has a reference to 540 00:18:29,760 --> 00:18:34,620 another table but only project and user 541 00:18:32,640 --> 00:18:36,539 have a reference back to the account 542 00:18:34,620 --> 00:18:38,780 that is the tenant table 543 00:18:36,539 --> 00:18:41,220 if we needed to filter a subtask 544 00:18:38,780 --> 00:18:43,200 according to the current account we join 545 00:18:41,220 --> 00:18:45,419 the subtask to task join task to project 546 00:18:43,200 --> 00:18:46,919 join project to account and 547 00:18:45,419 --> 00:18:49,340 and then check that we're using the 548 00:18:46,919 --> 00:18:49,340 correct account 549 00:18:50,160 --> 00:18:54,480 what we can also do is denormalize that 550 00:18:52,020 --> 00:18:57,120 we can copy the tenant ID into every 551 00:18:54,480 --> 00:18:59,039 single table in the system 552 00:18:57,120 --> 00:19:01,260 why would we want to do that well 553 00:18:59,039 --> 00:19:02,700 firstly it simplifies our code there's 554 00:19:01,260 --> 00:19:04,380 no joins what you have to do anymore in 555 00:19:02,700 --> 00:19:06,360 order to filter by tenant which means 556 00:19:04,380 --> 00:19:09,120 that's aggregate query problems that I 557 00:19:06,360 --> 00:19:11,100 was talking about before no longer reply 558 00:19:09,120 --> 00:19:13,380 and it's also desirable for row level 559 00:19:11,100 --> 00:19:16,080 security which we'll talk about next 560 00:19:13,380 --> 00:19:18,260 it doesn't meaningfully 561 00:19:16,080 --> 00:19:20,760 affect anything else 562 00:19:18,260 --> 00:19:21,840 there may be a performance impact but 563 00:19:20,760 --> 00:19:23,460 that performance impact could be 564 00:19:21,840 --> 00:19:25,080 positive or could be negative depending 565 00:19:23,460 --> 00:19:26,780 on the ratio of rights to reads in the 566 00:19:25,080 --> 00:19:30,480 system 567 00:19:26,780 --> 00:19:33,200 it's harder to move data between tenants 568 00:19:30,480 --> 00:19:35,280 no longer are we just updating so let's 569 00:19:33,200 --> 00:19:37,740 assume for example we wanted to merge 570 00:19:35,280 --> 00:19:39,419 two accounts no longer are we taking all 571 00:19:37,740 --> 00:19:41,880 the projects for one account and 572 00:19:39,419 --> 00:19:43,380 pointing those to the to the new merged 573 00:19:41,880 --> 00:19:45,120 account instead we have to do the 574 00:19:43,380 --> 00:19:47,100 projects and the sub projects and the 575 00:19:45,120 --> 00:19:48,660 tasks sorry the projects and the tasks 576 00:19:47,100 --> 00:19:49,559 and the subtasks there's a lot more 577 00:19:48,660 --> 00:19:51,600 records that we're going to have to 578 00:19:49,559 --> 00:19:52,860 update there's a potential to introduce 579 00:19:51,600 --> 00:19:54,539 errors here 580 00:19:52,860 --> 00:19:56,100 let's say we forgot to do a transaction 581 00:19:54,539 --> 00:19:58,380 and we're doing that merge and we got 582 00:19:56,100 --> 00:19:59,520 halfway through then now we have data 583 00:19:58,380 --> 00:20:01,919 that's half 584 00:19:59,520 --> 00:20:03,360 changed and data that's half not changed 585 00:20:01,919 --> 00:20:05,280 and some of those foreign keys are not 586 00:20:03,360 --> 00:20:08,299 going to be consistent with the other 587 00:20:05,280 --> 00:20:08,299 data that's in the system 588 00:20:08,640 --> 00:20:13,559 so it's possible for coding mistakes to 589 00:20:11,539 --> 00:20:16,140 introduce problems and you have your 590 00:20:13,559 --> 00:20:17,760 standard denormalization caveats um 591 00:20:16,140 --> 00:20:20,160 your database will be bigger because 592 00:20:17,760 --> 00:20:21,900 we've duplicated data in particular your 593 00:20:20,160 --> 00:20:22,940 indexes you can have a lot more indexes 594 00:20:21,900 --> 00:20:25,320 in the system 595 00:20:22,940 --> 00:20:27,780 that means that every time that you 596 00:20:25,320 --> 00:20:29,880 update an account record 597 00:20:27,780 --> 00:20:32,280 every single record in the system that 598 00:20:29,880 --> 00:20:34,080 has an index associated with it that 599 00:20:32,280 --> 00:20:35,580 points to the account is potentially 600 00:20:34,080 --> 00:20:37,500 also going to have to be Rewritten by 601 00:20:35,580 --> 00:20:39,059 postgres which means no longer is 602 00:20:37,500 --> 00:20:41,160 postgres just going to have to update 603 00:20:39,059 --> 00:20:43,500 the projects that relate to an account 604 00:20:41,160 --> 00:20:45,240 for their indexes it's also going to 605 00:20:43,500 --> 00:20:47,840 have to update the tasks and the 606 00:20:45,240 --> 00:20:47,840 subtasks 607 00:20:50,460 --> 00:20:54,660 there's a third-party package called 608 00:20:52,320 --> 00:20:56,840 Django multi-tenant which assumes that 609 00:20:54,660 --> 00:20:59,820 you have already denormalized your data 610 00:20:56,840 --> 00:21:02,400 and it automatically adds where Clauses 611 00:20:59,820 --> 00:21:04,320 onto every irm query that you do so that 612 00:21:02,400 --> 00:21:05,880 it assumes well the tenant ID is already 613 00:21:04,320 --> 00:21:07,799 in here rather than you having to 614 00:21:05,880 --> 00:21:09,360 manually add the where Clauses I'll just 615 00:21:07,799 --> 00:21:11,580 do it for you 616 00:21:09,360 --> 00:21:14,160 it also provides a replacement foreign 617 00:21:11,580 --> 00:21:16,080 key class that avoids the problem I 618 00:21:14,160 --> 00:21:17,820 spoke about before about escaping from 619 00:21:16,080 --> 00:21:19,520 your current tenancy via the foreign 620 00:21:17,820 --> 00:21:22,260 keys 621 00:21:19,520 --> 00:21:23,760 and also worth mentioning the reason for 622 00:21:22,260 --> 00:21:27,000 the existence of that project is that 623 00:21:23,760 --> 00:21:28,740 the company behind it cytus they make a 624 00:21:27,000 --> 00:21:30,840 postgres extension which will 625 00:21:28,740 --> 00:21:32,280 automatically look at the tenant ID and 626 00:21:30,840 --> 00:21:33,960 use that to Shard your data between 627 00:21:32,280 --> 00:21:35,340 multiple servers so from your 628 00:21:33,960 --> 00:21:37,620 perspective it looks like a single 629 00:21:35,340 --> 00:21:40,320 postgres server but in fact it could 630 00:21:37,620 --> 00:21:41,460 have 20 or 30 servers behind it and your 631 00:21:40,320 --> 00:21:43,020 software doesn't have to change at all 632 00:21:41,460 --> 00:21:44,460 even better Microsoft bought that 633 00:21:43,020 --> 00:21:46,380 company and now it's like a couple of 634 00:21:44,460 --> 00:21:48,720 Clicks in Azure when you set up a new 635 00:21:46,380 --> 00:21:49,919 database to get that you also pay a lot 636 00:21:48,720 --> 00:21:51,720 more because you're paying for many more 637 00:21:49,919 --> 00:21:54,080 servers which is why Microsoft want you 638 00:21:51,720 --> 00:21:54,080 to do that 639 00:21:55,500 --> 00:21:59,760 and finally posters row level security 640 00:21:57,179 --> 00:22:02,100 so postgres has the ability to add 641 00:21:59,760 --> 00:22:04,740 Security checks on a row by row basis 642 00:22:02,100 --> 00:22:07,380 now that's not just select that's also 643 00:22:04,740 --> 00:22:09,559 updates and inserts and deletes and it 644 00:22:07,380 --> 00:22:13,679 will validate that every time you run a 645 00:22:09,559 --> 00:22:15,539 query against a table that some check 646 00:22:13,679 --> 00:22:18,480 the check being the SQL code that you 647 00:22:15,539 --> 00:22:21,120 define that condition is met 648 00:22:18,480 --> 00:22:23,039 now that check itself could do selects 649 00:22:21,120 --> 00:22:24,600 you probably don't want to do that 650 00:22:23,039 --> 00:22:25,740 because it's really easy to blow up the 651 00:22:24,600 --> 00:22:27,720 number of queries that you're going to 652 00:22:25,740 --> 00:22:29,460 do in your um 653 00:22:27,720 --> 00:22:31,380 in your system and have Major 654 00:22:29,460 --> 00:22:32,760 Performance issues so you probably want 655 00:22:31,380 --> 00:22:34,799 to keep it to a relatively simple 656 00:22:32,760 --> 00:22:36,659 expression 657 00:22:34,799 --> 00:22:38,340 if you look at row level security 658 00:22:36,659 --> 00:22:40,620 tutorials on the internet you'll see 659 00:22:38,340 --> 00:22:41,700 that almost all of them use postgres 660 00:22:40,620 --> 00:22:44,220 users 661 00:22:41,700 --> 00:22:46,080 that's probably something that I think 662 00:22:44,220 --> 00:22:47,760 is a bad idea the problem with postgres 663 00:22:46,080 --> 00:22:49,440 users is that they're not specific to a 664 00:22:47,760 --> 00:22:51,480 database they're Global across all 665 00:22:49,440 --> 00:22:52,860 databases on a postgres server which 666 00:22:51,480 --> 00:22:54,480 means if you have your database and 667 00:22:52,860 --> 00:22:56,100 someone else has another database that's 668 00:22:54,480 --> 00:22:57,720 using the same server and you're both 669 00:22:56,100 --> 00:22:58,500 creating and deleting users there's a 670 00:22:57,720 --> 00:22:59,760 really good chance you're going to be 671 00:22:58,500 --> 00:23:02,159 clobbering each other and you're going 672 00:22:59,760 --> 00:23:03,539 to end up having a bad time 673 00:23:02,159 --> 00:23:05,340 and that assumes you even have 674 00:23:03,539 --> 00:23:06,720 permission to create users if you have 675 00:23:05,340 --> 00:23:07,860 CIS admins that are managing your 676 00:23:06,720 --> 00:23:08,880 databases they're probably not going to 677 00:23:07,860 --> 00:23:10,559 give you those permissions they're going 678 00:23:08,880 --> 00:23:12,679 to say no we gave you a user use that 679 00:23:10,559 --> 00:23:12,679 one 680 00:23:13,860 --> 00:23:18,179 so what we can do instead is that again 681 00:23:16,380 --> 00:23:20,039 middleware is a common theme running 682 00:23:18,179 --> 00:23:22,200 through this here we can have some 683 00:23:20,039 --> 00:23:25,080 middleware set the current tenant ID on 684 00:23:22,200 --> 00:23:26,940 the connection and then in the row level 685 00:23:25,080 --> 00:23:29,159 security checks we can look at what the 686 00:23:26,940 --> 00:23:31,380 current tenant ID is and make sure that 687 00:23:29,159 --> 00:23:33,480 that matches the data that's we're 688 00:23:31,380 --> 00:23:35,280 operating against now in the example 689 00:23:33,480 --> 00:23:37,260 here we've set this variable called 690 00:23:35,280 --> 00:23:39,299 django.accountid 691 00:23:37,260 --> 00:23:41,340 and note that the dot in that name is 692 00:23:39,299 --> 00:23:43,320 very important what we're actually doing 693 00:23:41,340 --> 00:23:45,840 here is we're abusing postgres settings 694 00:23:43,320 --> 00:23:49,679 postgres thinks that we're trying to set 695 00:23:45,840 --> 00:23:52,140 a configuration item for a postgres 696 00:23:49,679 --> 00:23:54,059 extension called Django that we just 697 00:23:52,140 --> 00:23:55,860 happen not to have loaded yet 698 00:23:54,059 --> 00:23:57,659 and it says well that's fine you can 699 00:23:55,860 --> 00:23:58,740 temporarily override that configuration 700 00:23:57,659 --> 00:24:00,780 setting 701 00:23:58,740 --> 00:24:02,580 and later on we then pull that back out 702 00:24:00,780 --> 00:24:03,539 again and use that for our row level 703 00:24:02,580 --> 00:24:06,179 check 704 00:24:03,539 --> 00:24:07,559 because it's a configuration option it's 705 00:24:06,179 --> 00:24:10,860 always going to be a string which is why 706 00:24:07,559 --> 00:24:12,659 we have that cast from int there 707 00:24:10,860 --> 00:24:14,580 now there are some caveats for postgres 708 00:24:12,659 --> 00:24:17,400 row level security 709 00:24:14,580 --> 00:24:19,320 again because we're setting some State 710 00:24:17,400 --> 00:24:21,380 on the connection you need to make sure 711 00:24:19,320 --> 00:24:23,580 that you're not interleaving requests 712 00:24:21,380 --> 00:24:25,380 simultaneously across the same 713 00:24:23,580 --> 00:24:28,940 connection 714 00:24:25,380 --> 00:24:31,500 there are potential performance impacts 715 00:24:28,940 --> 00:24:33,360 if you run an explain analyze on your 716 00:24:31,500 --> 00:24:34,980 query after activating row level 717 00:24:33,360 --> 00:24:36,720 security you'll see all of these extra 718 00:24:34,980 --> 00:24:38,340 row level security checks that have 719 00:24:36,720 --> 00:24:40,980 basically been tacked onto your queries 720 00:24:38,340 --> 00:24:42,600 by postgres most of the time that should 721 00:24:40,980 --> 00:24:44,940 be okay as long as you were already 722 00:24:42,600 --> 00:24:48,000 filtering by the tenant ID in your 723 00:24:44,940 --> 00:24:49,260 Django code but if you weren't then the 724 00:24:48,000 --> 00:24:50,700 row level checks are now going to add 725 00:24:49,260 --> 00:24:52,500 these extra filters which may not have 726 00:24:50,700 --> 00:24:54,240 indexes associated with them and now you 727 00:24:52,500 --> 00:24:55,260 may have performance problems so you'll 728 00:24:54,240 --> 00:24:57,000 need to 729 00:24:55,260 --> 00:24:58,380 just keep an eye on things once you've 730 00:24:57,000 --> 00:25:00,179 activated to make sure that there's no 731 00:24:58,380 --> 00:25:02,340 unexpected effects and if there are you 732 00:25:00,179 --> 00:25:05,159 may need to look at the indexes that you 733 00:25:02,340 --> 00:25:06,960 that you need to apply and again the 734 00:25:05,159 --> 00:25:08,940 postgres query Optimizer is pretty good 735 00:25:06,960 --> 00:25:09,960 but the more complicated the query is 736 00:25:08,940 --> 00:25:11,400 the more chance that it's going to 737 00:25:09,960 --> 00:25:13,080 choose a query plan that's not the one 738 00:25:11,400 --> 00:25:15,360 that you expected it to use and you may 739 00:25:13,080 --> 00:25:20,280 have to jig around your queries a bit in 740 00:25:15,360 --> 00:25:20,280 order to get it to operate performantly 741 00:25:20,880 --> 00:25:24,779 if you're doing your local development 742 00:25:22,559 --> 00:25:27,120 postgres you probably just have one one 743 00:25:24,779 --> 00:25:28,919 user that user is actually a super user 744 00:25:27,120 --> 00:25:31,080 that user will completely bypass any row 745 00:25:28,919 --> 00:25:33,179 level checks so your row level checks 746 00:25:31,080 --> 00:25:34,919 will appear not to work at all you need 747 00:25:33,179 --> 00:25:35,940 to make sure that when you are doing 748 00:25:34,919 --> 00:25:37,320 your development and when you get to 749 00:25:35,940 --> 00:25:39,000 production that you didn't accidentally 750 00:25:37,320 --> 00:25:40,320 connect to the database with the super 751 00:25:39,000 --> 00:25:42,779 user 752 00:25:40,320 --> 00:25:44,400 and conversely when you're taking your 753 00:25:42,779 --> 00:25:45,659 backups from the database you need to 754 00:25:44,400 --> 00:25:48,240 make sure that you are using a super 755 00:25:45,659 --> 00:25:49,980 user if you were just using PG dump and 756 00:25:48,240 --> 00:25:51,960 that was not a super user and that user 757 00:25:49,980 --> 00:25:54,480 had row level security checks applied 758 00:25:51,960 --> 00:25:55,980 and was not seeing all of the data you 759 00:25:54,480 --> 00:25:57,480 will not get all of your data in your 760 00:25:55,980 --> 00:26:01,640 backup and you'll have a very bad day 761 00:25:57,480 --> 00:26:01,640 when you go to restore from backups 762 00:26:04,140 --> 00:26:08,880 so in summary 763 00:26:06,600 --> 00:26:10,440 multiple app servers and databases very 764 00:26:08,880 --> 00:26:12,299 simple to code but can be costly and 765 00:26:10,440 --> 00:26:14,100 painful with a large number of tenants 766 00:26:12,299 --> 00:26:16,020 using a single lap server with multiple 767 00:26:14,100 --> 00:26:17,220 databases and multiple schemas the big 768 00:26:16,020 --> 00:26:20,159 problem you're going to have here is 769 00:26:17,220 --> 00:26:23,100 partially applied migrations 770 00:26:20,159 --> 00:26:26,100 uh Django managers 771 00:26:23,100 --> 00:26:27,659 they're very nice at detecting missing 772 00:26:26,100 --> 00:26:29,100 filters and they can provide consistency 773 00:26:27,659 --> 00:26:31,620 to your code 774 00:26:29,100 --> 00:26:33,299 denormalizing tenancy data can simplify 775 00:26:31,620 --> 00:26:36,240 your code at the expense of database 776 00:26:33,299 --> 00:26:38,700 size and postgres row levels Security 777 00:26:36,240 --> 00:26:40,500 checks are a really nice fail safe in 778 00:26:38,700 --> 00:26:42,600 case you forgot something in Django the 779 00:26:40,500 --> 00:26:43,679 database is there got you back just to 780 00:26:42,600 --> 00:26:46,400 make sure that you don't accidentally 781 00:26:43,679 --> 00:26:46,400 mess up something 782 00:26:49,620 --> 00:26:52,919 and I have code samples for most of 783 00:26:51,720 --> 00:26:55,380 these 784 00:26:52,919 --> 00:26:56,700 pinned in my repository so if you'd like 785 00:26:55,380 --> 00:27:01,700 to have a look at some of these things 786 00:26:56,700 --> 00:27:01,700 with a proof of concept then feel free 787 00:27:02,760 --> 00:27:11,520 questions or comments 788 00:27:05,240 --> 00:27:14,460 [Applause] 789 00:27:11,520 --> 00:27:17,419 who has questions 790 00:27:14,460 --> 00:27:17,419 or comments 791 00:27:19,320 --> 00:27:22,380 thanks for for running through all those 792 00:27:20,880 --> 00:27:24,360 there's sort of there's a lot of things 793 00:27:22,380 --> 00:27:25,559 to keep in mind and a lot of places you 794 00:27:24,360 --> 00:27:27,659 could accidentally trip over your own 795 00:27:25,559 --> 00:27:29,820 feet and accidentally show the world 796 00:27:27,659 --> 00:27:32,100 things that aren't going to be seen 797 00:27:29,820 --> 00:27:35,100 uh can you are there any ways that you 798 00:27:32,100 --> 00:27:36,779 can systematically avoid those problems 799 00:27:35,100 --> 00:27:38,700 or give any confidence that you haven't 800 00:27:36,779 --> 00:27:40,380 hit those problems or is this just a 801 00:27:38,700 --> 00:27:42,179 matter of you need to make sure you've 802 00:27:40,380 --> 00:27:44,580 got test coverage and things yeah test 803 00:27:42,179 --> 00:27:46,140 coverage the thing the thing that I 804 00:27:44,580 --> 00:27:49,140 found we actually used mandatory 805 00:27:46,140 --> 00:27:51,000 filtering on a project once and I had so 806 00:27:49,140 --> 00:27:52,320 much pushback from developers who say oh 807 00:27:51,000 --> 00:27:54,240 it's constantly complaining that I 808 00:27:52,320 --> 00:27:56,640 forgot to filter and I'd ask them or 809 00:27:54,240 --> 00:27:57,299 should you be filtering yes 810 00:27:56,640 --> 00:28:00,360 um 811 00:27:57,299 --> 00:28:02,760 it is it is like the Nazi Checker of 812 00:28:00,360 --> 00:28:05,220 making sure that you've put filtering in 813 00:28:02,760 --> 00:28:07,740 but I've actually personally I like it 814 00:28:05,220 --> 00:28:09,360 the most as something that forces you to 815 00:28:07,740 --> 00:28:11,400 consider should I be filtering this or 816 00:28:09,360 --> 00:28:14,039 not and you always have the escape hatch 817 00:28:11,400 --> 00:28:15,360 you can use all objects but it also 818 00:28:14,039 --> 00:28:17,220 means it's really easy to audit your 819 00:28:15,360 --> 00:28:18,779 code and say find me all of the 820 00:28:17,220 --> 00:28:19,740 instances of all objects because I know 821 00:28:18,779 --> 00:28:21,000 that those ones aren't going to have 822 00:28:19,740 --> 00:28:23,539 filters applied and I can go and double 823 00:28:21,000 --> 00:28:23,539 check those 824 00:28:24,059 --> 00:28:30,440 oh we have more questions 825 00:28:26,640 --> 00:28:30,440 I'm getting my steps in 826 00:28:30,480 --> 00:28:32,960 sorry 827 00:28:33,779 --> 00:28:39,779 um watching this talk uh I I saw myself 828 00:28:37,260 --> 00:28:41,640 I went through this process at work a 829 00:28:39,779 --> 00:28:42,779 couple years ago and all the things you 830 00:28:41,640 --> 00:28:44,100 talked about were all the things that we 831 00:28:42,779 --> 00:28:46,380 dealt with 832 00:28:44,100 --> 00:28:49,200 um so I'll ask a high specific question 833 00:28:46,380 --> 00:28:51,480 uh using thread local storage to store 834 00:28:49,200 --> 00:28:53,940 the context the to the tenant context 835 00:28:51,480 --> 00:28:56,279 I've always felt kind of uneasy about 836 00:28:53,940 --> 00:28:58,700 that partially because of globals but 837 00:28:56,279 --> 00:29:01,380 partially because I don't 838 00:28:58,700 --> 00:29:03,419 I don't trust my knowledge of threads 839 00:29:01,380 --> 00:29:06,900 and whatever else 840 00:29:03,419 --> 00:29:08,700 are they common pitfalls that someone 841 00:29:06,900 --> 00:29:10,020 like me should be looking out for or 842 00:29:08,700 --> 00:29:12,000 edge cases that I might not have 843 00:29:10,020 --> 00:29:14,820 considered 844 00:29:12,000 --> 00:29:17,460 um I'm not sure of any common pitfalls 845 00:29:14,820 --> 00:29:19,740 specifically I would just say test and 846 00:29:17,460 --> 00:29:21,600 make sure you test multiple requests 847 00:29:19,740 --> 00:29:22,380 simultaneously 848 00:29:21,600 --> 00:29:24,360 um 849 00:29:22,380 --> 00:29:25,320 I actually had some test cases that 850 00:29:24,360 --> 00:29:27,419 would 851 00:29:25,320 --> 00:29:29,399 just do tests over and over and over and 852 00:29:27,419 --> 00:29:30,360 over again in two different threads it's 853 00:29:29,399 --> 00:29:33,240 kind of a pain and it's 854 00:29:30,360 --> 00:29:35,279 non-deterministic unless you have pauses 855 00:29:33,240 --> 00:29:37,140 and sleeps that you insert just to make 856 00:29:35,279 --> 00:29:38,640 sure that but but even that assumes that 857 00:29:37,140 --> 00:29:41,580 you put the paws in the sleep in the 858 00:29:38,640 --> 00:29:43,380 correct in the correct location 859 00:29:41,580 --> 00:29:45,539 on thread local storage though I can't 860 00:29:43,380 --> 00:29:46,919 say Django uses it internally I was a 861 00:29:45,539 --> 00:29:49,799 bit wary of it first and was like well 862 00:29:46,919 --> 00:29:51,360 Django's doing it so if I'd know Django 863 00:29:49,799 --> 00:29:52,980 async doesn't work with it we'll then 864 00:29:51,360 --> 00:29:55,500 say there's a bunch of other core Django 865 00:29:52,980 --> 00:29:56,580 code that's going to break so that I 866 00:29:55,500 --> 00:29:59,360 don't know if that makes you feel better 867 00:29:56,580 --> 00:29:59,360 about using it 868 00:29:59,760 --> 00:30:03,840 the recording that was a little bit 869 00:30:01,919 --> 00:30:05,340 um sadly we are out of time so let's 870 00:30:03,840 --> 00:30:06,370 thank our speaker again for the great 871 00:30:05,340 --> 00:30:12,770 talk 872 00:30:06,370 --> 00:30:12,770 [Applause]