2 00:00:08,460 --> 00:00:12,719 Welcome back, everybody. Wonderful to see 3 00:00:10,800 --> 00:00:14,820 you all here for the last session of our 4 00:00:12,719 --> 00:00:17,580 Connected Universe. I'm really excited to 5 00:00:14,820 --> 00:00:19,560 hear in this session from Ram, an 6 00:00:17,580 --> 00:00:22,500 engineer at Spiral Data, all about 7 00:00:19,560 --> 00:00:25,100 improving water networks using AI/ML. Take 8 00:00:22,500 --> 00:00:25,100 it away, Ram. 9 00:00:27,680 --> 00:00:33,000 Sorry, I haven't been presenting for a 10 00:00:30,180 --> 00:00:34,380 while, so give me a second. Apologies for 11 00:00:33,000 --> 00:00:36,920 the 12 00:00:34,380 --> 00:00:36,920 okay 13 00:00:39,899 --> 00:00:43,040 just, sorry 14 00:00:45,480 --> 00:00:50,039 much better. 15 00:00:46,860 --> 00:00:52,559 Hi everyone, thank you very much for 16 00:00:50,039 --> 00:00:55,860 staying on until the last session. I will 17 00:00:52,559 --> 00:00:58,559 try to keep it short, but I'll try to 18 00:00:55,860 --> 00:01:01,140 make it worthwhile. There were amazing 19 00:00:58,559 --> 00:01:03,359 talks all day, so it's a little bit 20 00:01:01,140 --> 00:01:05,280 intimidating, honestly, to be the last 21 00:01:03,359 --> 00:01:08,280 person to talk, but 22 00:01:05,280 --> 00:01:10,380 this just has a different perspective. This 23 00:01:08,280 --> 00:01:12,600 talk will not have a lot of code 24 00:01:10,380 --> 00:01:16,020 snippets; it's more of a zoomed-out 25 00:01:12,600 --> 00:01:18,119 talk about a problem I 26 00:01:16,020 --> 00:01:20,640 personally care a lot about, a 27 00:01:18,119 --> 00:01:24,420 problem that my organization, Spiral Data, 28 00:01:20,640 --> 00:01:26,580 cares a lot about, all the aspects of 29 00:01:24,420 --> 00:01:28,979 the problem, and how Python and its 30 00:01:26,580 --> 00:01:31,080 ecosystem are helping us try to solve 31 00:01:28,979 --> 00:01:34,740 that problem. And I'm happy to 
take 32 00:01:31,080 --> 00:01:37,799 questions at the end, or connect. 33 00:01:34,740 --> 00:01:39,479 So, just a bit of an outline: we are an 34 00:01:37,799 --> 00:01:42,659 Adelaide company based out of the 35 00:01:39,479 --> 00:01:46,320 Tonsley precinct. The main thing that we 36 00:01:42,659 --> 00:01:49,439 do is big data and machine learning, 37 00:01:46,320 --> 00:01:51,180 mainly for water, and we go all the way 38 00:01:49,439 --> 00:01:54,600 from consulting, trying to understand the 39 00:01:51,180 --> 00:01:56,700 problem, to build, to support. And we are 40 00:01:54,600 --> 00:01:58,979 also, I mean, we are also proud of the AWS 41 00:01:56,700 --> 00:02:03,079 award that we got recently, 42 00:01:58,979 --> 00:02:03,079 and these are some of our customers. 43 00:02:04,560 --> 00:02:11,160 So why do we do what we do, and why do I 44 00:02:08,459 --> 00:02:13,560 personally connect with the problem? For 45 00:02:11,160 --> 00:02:15,900 those of you who don't know, the UN 46 00:02:13,560 --> 00:02:19,080 sets Sustainable Development Goals 47 00:02:15,900 --> 00:02:22,680 for the whole world, all the countries, as a 48 00:02:19,080 --> 00:02:25,440 target, and Goal 6 talks about ensuring 49 00:02:22,680 --> 00:02:26,220 availability of water and sanitation for 50 00:02:25,440 --> 00:02:30,120 all. 51 00:02:26,220 --> 00:02:32,700 And this is a mission that we, and I 52 00:02:30,120 --> 00:02:36,060 personally, care deeply about, because I come 53 00:02:32,700 --> 00:02:38,520 from a place where I had to struggle for 54 00:02:36,060 --> 00:02:41,099 clean water in my childhood, so I 55 00:02:38,520 --> 00:02:43,560 have personal experience. But this is not 56 00:02:41,099 --> 00:02:45,840 just a developing country's problem. This 57 00:02:43,560 --> 00:02:47,879 is a problem that currently exists even 58 00:02:45,840 --> 00:02:49,920 in Australia: more than half a million 59 00:02:47,879 --> 00:02:51,959 Australians, especially in remote 60 00:02:49,920 --> 00:02:54,000 
Aboriginal communities, do not have 61 00:02:51,959 --> 00:02:55,200 access to clean drinking water and 62 00:02:54,000 --> 00:02:57,840 sanitation. 63 00:02:55,200 --> 00:02:59,640 And just, 64 00:02:57,840 --> 00:03:02,459 why is that? 65 00:02:59,640 --> 00:03:04,800 The problem is extremely complex. Water 66 00:03:02,459 --> 00:03:06,720 networks are complex; everything around 67 00:03:04,800 --> 00:03:09,420 water is complex. Just to give you some 68 00:03:06,720 --> 00:03:12,120 perspective: in Adelaide, South Australia, 69 00:03:09,420 --> 00:03:14,459 most of the people live within the 70 00:03:12,120 --> 00:03:16,260 Adelaide metro area, but the state is 71 00:03:14,459 --> 00:03:18,659 extremely large. 72 00:03:16,260 --> 00:03:21,120 The density is around one to two people 73 00:03:18,659 --> 00:03:23,519 per square kilometer, and so what happens is you 74 00:03:21,120 --> 00:03:25,980 have mains that run for tens of 75 00:03:23,519 --> 00:03:28,080 thousands of kilometers, and the same goes for 76 00:03:25,980 --> 00:03:30,300 the sewer mains. This is data just from 77 00:03:28,080 --> 00:03:32,940 SA Water, the biggest utility here, 78 00:03:30,300 --> 00:03:35,040 but there are also other utilities that 79 00:03:32,940 --> 00:03:37,260 manage especially the sewer mains. And 80 00:03:35,040 --> 00:03:38,519 SA Water supplies more than 220 billion 81 00:03:37,260 --> 00:03:41,519 liters of water. 82 00:03:38,519 --> 00:03:45,379 So the sheer magnitude is complex, and 83 00:03:41,519 --> 00:03:45,379 so are the problems around it. 84 00:03:45,480 --> 00:03:50,220 What are the problems? There are a lot of 85 00:03:48,000 --> 00:03:53,220 them; I just had to pick a few to keep 86 00:03:50,220 --> 00:03:55,860 it short. First, aging infrastructure: 87 00:03:53,220 --> 00:03:57,780 most of the pipes across 88 00:03:55,860 --> 00:04:00,420 Adelaide and most other cities are over 89 00:03:57,780 --> 00:04:02,760 100 years old, constantly needing 90 00:04:00,420 
--> 00:04:05,280 monitoring and upgrading. And also the 91 00:04:02,760 --> 00:04:07,140 issue of climate change, because 92 00:04:05,280 --> 00:04:08,640 there is the encroachment that is coming 93 00:04:07,140 --> 00:04:11,159 in, that is affecting all the water 94 00:04:08,640 --> 00:04:13,680 networks, hurricanes, and also the 95 00:04:11,159 --> 00:04:16,560 environmental impact: any 96 00:04:13,680 --> 00:04:18,959 leakage from the sewer systems can have 97 00:04:16,560 --> 00:04:21,060 catastrophic effects, especially in 98 00:04:18,959 --> 00:04:25,020 very sensitive areas. 99 00:04:21,060 --> 00:04:27,419 So we strongly believe most of these 100 00:04:25,020 --> 00:04:30,120 problems can be addressed, or at least 101 00:04:27,419 --> 00:04:32,880 mitigated, technologically. But then even 102 00:04:30,120 --> 00:04:35,699 technological challenges exist in water 103 00:04:32,880 --> 00:04:38,300 networks, the main one being technological 104 00:04:35,699 --> 00:04:41,400 maturity. Water being an old 105 00:04:38,300 --> 00:04:43,199 industry, they're also very resistant to 106 00:04:41,400 --> 00:04:45,060 change, and for good reason, because they 107 00:04:43,199 --> 00:04:48,419 are transporting 108 00:04:45,060 --> 00:04:51,660 water which is being ingested by us, so 109 00:04:48,419 --> 00:04:53,759 any mistake can be catastrophic. 110 00:04:51,660 --> 00:04:55,500 The other problem is siloed solutions: 111 00:04:53,759 --> 00:04:58,380 most of the solutions that they 112 00:04:55,500 --> 00:05:00,540 build are specific to particular use cases 113 00:04:58,380 --> 00:05:03,360 and cannot be leveraged across multiple 114 00:05:00,540 --> 00:05:05,340 problems. And with that 115 00:05:03,360 --> 00:05:07,500 comes the issue with data, because every 116 00:05:05,340 --> 00:05:09,360 vendor has their own data standards, data 117 00:05:07,500 --> 00:05:12,540 provenance, data management. 118 00:05:09,360 --> 00:05:14,580 But then it's not 
all doom and gloom. 119 00:05:12,540 --> 00:05:18,120 There are also a lot of opportunities. 120 00:05:14,580 --> 00:05:20,100 The biggest, in my opinion, is 121 00:05:18,120 --> 00:05:22,020 collaboration. The water industry is 122 00:05:20,100 --> 00:05:23,940 highly collaborative; utilities talk to 123 00:05:22,020 --> 00:05:26,340 each other constantly, sharing 124 00:05:23,940 --> 00:05:29,220 technological advances. And there is 125 00:05:26,340 --> 00:05:31,500 also a growing sensor fleet of IoT 126 00:05:29,220 --> 00:05:35,160 devices that is increasing exponentially. 127 00:05:31,500 --> 00:05:37,320 But then the last part of the puzzle is: 128 00:05:35,160 --> 00:05:39,360 how do you use this data? How do you use 129 00:05:37,320 --> 00:05:42,240 this data to solve problems that are 130 00:05:39,360 --> 00:05:45,060 important for them, to make it a lot 131 00:05:42,240 --> 00:05:48,240 easier for the water utilities to 132 00:05:45,060 --> 00:05:51,740 maintain the infrastructure and also to 133 00:05:48,240 --> 00:05:51,740 keep up with existing demand? 134 00:05:52,979 --> 00:05:58,139 So the question is, do we have a silver 135 00:05:55,740 --> 00:06:01,560 bullet? No, we don't. The idea is, can 136 00:05:58,139 --> 00:06:03,539 we at least design a framework? But the 137 00:06:01,560 --> 00:06:06,000 framework is not just technological. The 138 00:06:03,539 --> 00:06:07,620 framework includes the people who are going 139 00:06:06,000 --> 00:06:09,840 to build this, the people who are going to 140 00:06:07,620 --> 00:06:11,400 use this, the data that needs to be part 141 00:06:09,840 --> 00:06:14,460 of this, the tools that are going to be 142 00:06:11,400 --> 00:06:16,800 part of it. And, I mean, we are 143 00:06:14,460 --> 00:06:19,740 just going to talk about the 144 00:06:16,800 --> 00:06:22,560 framework that we came up with, mainly 145 00:06:19,740 --> 00:06:25,979 on how we leverage all these pillars 146 00:06:22,560 --> 00:06:29,160 
of data, people, platform, and process, 147 00:06:25,979 --> 00:06:30,539 especially using AI and machine learning, 148 00:06:29,160 --> 00:06:32,639 to create what we call decision 149 00:06:30,539 --> 00:06:34,880 advantages: decisions around operational 150 00:06:32,639 --> 00:06:37,440 improvements, decisions around 151 00:06:34,880 --> 00:06:39,960 strategic investments. For example, 152 00:06:37,440 --> 00:06:42,139 the talk this morning in Hall C talked 153 00:06:39,960 --> 00:06:45,060 about pipe monitoring using 154 00:06:42,139 --> 00:06:48,060 geospatial data, things like that. We 155 00:06:45,060 --> 00:06:50,400 don't use geospatial, but it is a big 156 00:06:48,060 --> 00:06:52,979 deal because it takes hundreds of 157 00:06:50,400 --> 00:06:54,120 millions of dollars to do any upgrade. 158 00:06:52,979 --> 00:06:55,860 And 159 00:06:54,120 --> 00:06:58,800 so 160 00:06:55,860 --> 00:07:01,740 you need to do that by 161 00:06:58,800 --> 00:07:06,720 maximizing value and reducing cost, effort, 162 00:07:01,740 --> 00:07:09,240 and time. And what we see is that Python 163 00:07:06,720 --> 00:07:12,780 has a role to play in every part of that. 164 00:07:09,240 --> 00:07:15,120 Python helps us handle data better and 165 00:07:12,780 --> 00:07:17,220 display data better; it helps us 166 00:07:15,120 --> 00:07:19,139 streamline the process of development; it 167 00:07:17,220 --> 00:07:20,940 helps us integrate platforms; and it 168 00:07:19,139 --> 00:07:23,460 also helps us communicate better with 169 00:07:20,940 --> 00:07:25,800 people using dashboards, UIs, everything. 170 00:07:23,460 --> 00:07:28,819 So I'll just give a brief outline of how 171 00:07:25,800 --> 00:07:28,819 we go through the process. 172 00:07:29,699 --> 00:07:36,180 So the main challenge here is, 173 00:07:33,240 --> 00:07:38,460 as I said, the technological maturity is 174 00:07:36,180 --> 00:07:40,440 very low in water utilities, and as a 175 00:07:38,460 --> 
00:07:43,500 result people don't know what they want 176 00:07:40,440 --> 00:07:46,500 until they see it. So the idea is, how do 177 00:07:43,500 --> 00:07:48,780 we rapidly build solutions that the 178 00:07:46,500 --> 00:07:51,599 customers can look at, get a feel for, and 179 00:07:48,780 --> 00:07:53,340 use before they say, yes, this is what 180 00:07:51,599 --> 00:07:56,160 we want, or this is something we don't 181 00:07:53,340 --> 00:07:57,840 want? And there are multiple 182 00:07:56,160 --> 00:07:59,220 stakeholders here: the customers who are 183 00:07:57,840 --> 00:08:01,020 going to use it, the people who are going to 184 00:07:59,220 --> 00:08:04,319 pay for it, the people who are building it 185 00:08:01,020 --> 00:08:07,560 and testing it. And what we 186 00:08:04,319 --> 00:08:09,720 see is that Python provides a framework, 187 00:08:07,560 --> 00:08:12,240 especially the REPL framework, that 188 00:08:09,720 --> 00:08:14,520 helps us prototype rapidly and share 189 00:08:12,240 --> 00:08:16,979 with the customers using Streamlit. And 190 00:08:14,520 --> 00:08:19,620 once it is iterated, we 191 00:08:16,979 --> 00:08:23,039 also build the UI/UX using Bokeh, 192 00:08:19,620 --> 00:08:25,259 colorways, and also APIs using FastAPI. 193 00:08:23,039 --> 00:08:27,060 A lot of these are 194 00:08:25,259 --> 00:08:28,500 big ones, but I especially wanted to call 195 00:08:27,060 --> 00:08:30,419 out the small 196 00:08:28,500 --> 00:08:32,219 people, the packages that are being 197 00:08:30,419 --> 00:08:34,919 maintained by one person or two people, 198 00:08:32,219 --> 00:08:37,200 especially Jupytext. Jupytext is 199 00:08:34,919 --> 00:08:39,120 a godsend, because data scientists love 200 00:08:37,200 --> 00:08:40,740 the REPL framework, they want the 201 00:08:39,120 --> 00:08:42,060 notebooks, 202 00:08:40,740 --> 00:08:44,399 but software engineers don't want to 203 00:08:42,060 --> 00:08:46,980 
touch them with a ten-foot pole. 204 00:08:44,399 --> 00:08:49,980 So the idea is, Jupytext helps us use the 205 00:08:46,980 --> 00:08:52,800 light and percent formats, where 206 00:08:49,980 --> 00:08:54,660 the data scientists are happy and go on 207 00:08:52,800 --> 00:08:56,580 to do the job, but the software engineers 208 00:08:54,660 --> 00:08:58,920 are also happy because now it's plain 209 00:08:56,580 --> 00:09:02,040 text files: you can lint them, you can test 210 00:08:58,920 --> 00:09:04,380 them, you can do code reviews on them, and you 211 00:09:02,040 --> 00:09:08,580 can put them under source control. So yeah, I 212 00:09:04,380 --> 00:09:12,320 mean, the environment is so rich that it 213 00:09:08,580 --> 00:09:12,320 helps solve a lot of these problems. 214 00:09:12,420 --> 00:09:16,560 So this is the platform that we have, I 215 00:09:14,940 --> 00:09:18,600 mean, at a very schematic level. Again, 216 00:09:16,560 --> 00:09:19,980 there is data, mainly the sensor data 217 00:09:18,600 --> 00:09:21,839 that comes in, and then there is a whole 218 00:09:19,980 --> 00:09:24,180 heap of data engineering that happens, 219 00:09:21,839 --> 00:09:25,920 and then there is the platform itself, 220 00:09:24,180 --> 00:09:28,440 with the core being building machine 221 00:09:25,920 --> 00:09:30,180 learning models, rapid DevOps, and 222 00:09:28,440 --> 00:09:32,640 security. And then the output goes out 223 00:09:30,180 --> 00:09:35,420 through the alerts, analytics, and API 224 00:09:32,640 --> 00:09:35,420 gateways. 225 00:09:36,720 --> 00:09:40,980 So, just to give some context regarding the 226 00:09:39,060 --> 00:09:44,220 challenges in data engineering: data can 227 00:09:40,980 --> 00:09:46,740 come in any form. Some utilities are very 228 00:09:44,220 --> 00:09:48,660 advanced and have extremely 229 00:09:46,740 --> 00:09:51,060 good data provenance and data management; 230 00:09:48,660 --> 00:09:53,100 some utilities just have flat files 231 00:09:51,060 --> 00:09:55,080 that they copy 
onto a hard drive and 232 00:09:53,100 --> 00:09:57,600 ship to you. 233 00:09:55,080 --> 00:09:59,700 Similarly, the data velocity is also 234 00:09:57,600 --> 00:10:02,220 quite varied: there are sensors that 235 00:09:59,700 --> 00:10:04,860 can stream at up to hundreds of hertz, and 236 00:10:02,220 --> 00:10:06,540 that needs to be captured and maintained. 237 00:10:04,860 --> 00:10:08,519 And then the extraction patterns are 238 00:10:06,540 --> 00:10:11,220 also complex, sometimes real-time, sometimes 239 00:10:08,519 --> 00:10:13,260 batched, and the volume is also sometimes 240 00:10:11,220 --> 00:10:15,420 massive. Again, there are a lot of 241 00:10:13,260 --> 00:10:17,760 tools that we have to use in order to 242 00:10:15,420 --> 00:10:20,399 integrate them across different use 243 00:10:17,760 --> 00:10:22,920 cases, different 244 00:10:20,399 --> 00:10:26,160 customers. 245 00:10:22,920 --> 00:10:28,140 Again, you see the richness of the 246 00:10:26,160 --> 00:10:30,180 Python ecosystem that is 247 00:10:28,140 --> 00:10:32,640 helping us do this. We use 248 00:10:30,180 --> 00:10:35,720 SQLAlchemy, which is a godsend because we 249 00:10:32,640 --> 00:10:39,839 can use it across all kinds of 250 00:10:35,720 --> 00:10:43,080 databases, and similarly we 251 00:10:39,839 --> 00:10:45,779 use FastAPI. And I especially wanted to 252 00:10:43,080 --> 00:10:47,700 call out the Python wrapper that is 253 00:10:45,779 --> 00:10:50,700 being maintained by Sergey, for Zstandard. 254 00:10:47,700 --> 00:10:52,560 The core implementation is by Meta, formerly 255 00:10:50,700 --> 00:10:53,940 Facebook, but then there are Python 256 00:10:52,560 --> 00:10:56,339 wrappers that are being maintained by 257 00:10:53,940 --> 00:10:58,320 single developers who are just doing it 258 00:10:56,339 --> 00:11:00,240 out of passion, and that helps us 259 00:10:58,320 --> 00:11:03,860 compress data at a massive scale and 260 00:11:00,240 --> 
00:11:03,860 be able to solve these problems. 261 00:11:04,700 --> 00:11:12,720 So the core of our work is the AI/ML workbench. 262 00:11:09,380 --> 00:11:16,200 The plenary talk talked a lot about AI 263 00:11:12,720 --> 00:11:17,700 bias. The other aspect of AI is 264 00:11:16,200 --> 00:11:19,920 accuracy and reliability, 265 00:11:17,700 --> 00:11:22,920 especially in critical infrastructure 266 00:11:19,920 --> 00:11:25,260 like water, because if you build an AI 267 00:11:22,920 --> 00:11:27,060 model that gives unphysical results, like 268 00:11:25,260 --> 00:11:29,640 negative pressure, the customer is never 269 00:11:27,060 --> 00:11:32,160 going to talk to us again. So the idea is, 270 00:11:29,640 --> 00:11:35,040 how do you build reliable machine 271 00:11:32,160 --> 00:11:37,380 learning models while the speed to 272 00:11:35,040 --> 00:11:39,240 value still needs to be high? The 273 00:11:37,380 --> 00:11:44,040 other important challenge is that 274 00:11:39,240 --> 00:11:45,839 most of the utilities are extremely busy. 275 00:11:44,040 --> 00:11:48,120 They have very little time, so 276 00:11:45,839 --> 00:11:50,519 whatever little time 277 00:11:48,120 --> 00:11:52,980 you get, you need to make good use of it. Again, 278 00:11:50,519 --> 00:11:55,399 the way we see it, a lot of 279 00:11:52,980 --> 00:11:58,380 our work is built on anomaly detection, 280 00:11:55,399 --> 00:12:00,720 and what we try to do is use 281 00:11:58,380 --> 00:12:03,360 unsupervised models, again statsmodels 282 00:12:00,720 --> 00:12:05,279 and scikit-learn, to build a labeling 283 00:12:03,360 --> 00:12:08,640 pipeline, again based on Python, with 284 00:12:05,279 --> 00:12:11,399 Label Studio, and SMEs give us feedback 285 00:12:08,640 --> 00:12:14,100 on, I mean, what 286 00:12:11,399 --> 00:12:16,019 kind of data is an actual anomaly and what 287 00:12:14,100 --> 00:12:19,260 is not an anomaly. And then we use that 288 00:12:16,019 --> 
00:12:21,000 to build supervised models, and 289 00:12:19,260 --> 00:12:23,940 then again we have model registries 290 00:12:21,000 --> 00:12:25,620 using MLflow. And 291 00:12:23,940 --> 00:12:28,079 this gets iterated; it's not a single-pass 292 00:12:25,620 --> 00:12:31,260 process, right? We iterate on this 293 00:12:28,079 --> 00:12:34,260 many times until the customers are 294 00:12:31,260 --> 00:12:36,720 satisfied with the reliability, before it 295 00:12:34,260 --> 00:12:39,120 can get deployed. Again, one of these 296 00:12:36,720 --> 00:12:40,800 amazing tools that is maintained by just 297 00:12:39,120 --> 00:12:43,320 a single person is PyOD, which is an 298 00:12:40,800 --> 00:12:45,240 anomaly detection package, and anybody 299 00:12:43,320 --> 00:12:46,920 who is working on anomaly detection, I 300 00:12:45,240 --> 00:12:49,040 strongly encourage you to take a look at 301 00:12:46,920 --> 00:12:49,040 that. 302 00:12:50,160 --> 00:12:55,519 So I talked a lot about the problems, and I 303 00:12:52,860 --> 00:12:58,860 talked a lot in abstract terms 304 00:12:55,519 --> 00:13:01,380 about how Python is used. We just wanted to 305 00:12:58,860 --> 00:13:03,959 give a couple of real-world examples. So 306 00:13:01,380 --> 00:13:06,240 this is a real-world example: this was a 307 00:13:03,959 --> 00:13:08,220 massive pipe break a few years back in 308 00:13:06,240 --> 00:13:09,540 King William Street, just down across the 309 00:13:08,220 --> 00:13:13,440 road. 310 00:13:09,540 --> 00:13:16,139 This is caused by what the engineers 311 00:13:13,440 --> 00:13:18,480 call pressure transients. There is a 312 00:13:16,139 --> 00:13:20,700 constant flow of water through 313 00:13:18,480 --> 00:13:23,579 the network at a certain 314 00:13:20,700 --> 00:13:26,220 pressure, but then if there is a sudden 315 00:13:23,579 --> 00:13:28,019 increase in demand, like, say, during half 316 00:13:26,220 --> 00:13:29,700 time of 
the Matildas game, if everybody 317 00:13:28,019 --> 00:13:32,279 went to the restroom, 318 00:13:29,700 --> 00:13:34,740 that is going to be a sudden spike in 319 00:13:32,279 --> 00:13:36,720 demand. What happens is it 320 00:13:34,740 --> 00:13:39,600 creates a hammer effect: you are 321 00:13:36,720 --> 00:13:42,480 hitting the network with a hammer at 322 00:13:39,600 --> 00:13:44,880 very high velocity and very high force. 323 00:13:42,480 --> 00:13:48,420 The technical term is pressure 324 00:13:44,880 --> 00:13:51,079 transients, and these, I mean, 325 00:13:48,420 --> 00:13:53,220 these happen across the network, but the 326 00:13:51,079 --> 00:13:55,260 understanding is that there is very little 327 00:13:53,220 --> 00:13:58,019 analytics around this: where does it 328 00:13:55,260 --> 00:14:00,060 happen, when does it happen, how strong is 329 00:13:58,019 --> 00:14:02,820 it, and what is the likely impact? 330 00:14:00,060 --> 00:14:05,040 So, just to give a schematic of 331 00:14:02,820 --> 00:14:07,200 how this works: the transients are 332 00:14:05,040 --> 00:14:09,720 being generated, and this hammer effect 333 00:14:07,200 --> 00:14:12,660 keeps going. It's not a one-time thing; it 334 00:14:09,720 --> 00:14:14,820 keeps happening continuously, and that 335 00:14:12,660 --> 00:14:17,720 creates what in mechanical terms 336 00:14:14,820 --> 00:14:20,160 is called fatigue. Fatigue is 337 00:14:17,720 --> 00:14:22,620 similar to human fatigue, which leads to 338 00:14:20,160 --> 00:14:26,220 burnout, but here it leads to breaks and 339 00:14:22,620 --> 00:14:28,139 cracks. If it cracks, it's not just a 340 00:14:26,220 --> 00:14:31,019 challenge of fixing it. There is not just 341 00:14:28,139 --> 00:14:33,120 an operational cost; there is also a huge 342 00:14:31,019 --> 00:14:35,519 capital cost in replacing it. And 343 00:14:33,120 --> 00:14:37,800 more importantly, there is a large 344 00:14:35,519 --> 
00:14:41,160 downstream effect: this affects water 345 00:14:37,800 --> 00:14:45,000 quality for a long time, a few days, because 346 00:14:41,160 --> 00:14:46,980 there are, I mean, I would say, 347 00:14:45,000 --> 00:14:48,779 unwanted sediments coming 348 00:14:46,980 --> 00:14:51,360 into the network that cannot be cleaned out 349 00:14:48,779 --> 00:14:53,339 easily, and that affects a lot of things. 350 00:14:51,360 --> 00:14:55,860 Especially for vulnerable populations, if they 351 00:14:53,339 --> 00:14:57,180 need access to clean water, this is a 352 00:14:55,860 --> 00:15:01,019 major issue. 353 00:14:57,180 --> 00:15:04,079 And so the first thing is that 354 00:15:01,019 --> 00:15:06,899 detecting it requires sensors at really 355 00:15:04,079 --> 00:15:09,300 high frequency. The sensors that 356 00:15:06,899 --> 00:15:11,399 we worked on this with, the ones 357 00:15:09,300 --> 00:15:14,339 that SA Water deployed, sample at over 100 358 00:15:11,399 --> 00:15:17,279 hertz, and that roughly translates to a 359 00:15:14,339 --> 00:15:20,100 few billion records per sensor per year. 360 00:15:17,279 --> 00:15:22,440 It's not possible to analyze 361 00:15:20,100 --> 00:15:24,959 them manually. So the first kind of model 362 00:15:22,440 --> 00:15:27,180 that we built tries to extract 363 00:15:24,959 --> 00:15:28,680 patterns from this data that 364 00:15:27,180 --> 00:15:31,339 correspond to pressure transients. 365 00:15:28,680 --> 00:15:34,560 There is also a lot of noise in this, 366 00:15:31,339 --> 00:15:36,660 but what they want is a 367 00:15:34,560 --> 00:15:38,699 specific pattern where there is a sudden 368 00:15:36,660 --> 00:15:41,339 increase or decrease and then there is 369 00:15:38,699 --> 00:15:44,339 a flattening out. We were able to design 370 00:15:41,339 --> 00:15:46,139 these algorithms. The main criterion for 371 00:15:44,339 --> 00:15:48,420 them was reliability, because without 372 00:15:46,139 
--> 00:15:51,240 that nothing moves forward. But then the 373 00:15:48,420 --> 00:15:52,500 other thing is also that 374 00:15:51,240 --> 00:15:54,240 the solution needs to be fast, because 375 00:15:52,500 --> 00:15:56,399 they need real-time monitoring: 376 00:15:54,240 --> 00:15:58,560 if there is an impending break, they need 377 00:15:56,399 --> 00:16:01,380 to be proactive rather than 378 00:15:58,560 --> 00:16:04,740 reactive. And it also has to be 379 00:16:01,380 --> 00:16:06,300 scalable. You can do it for one sensor, 380 00:16:04,740 --> 00:16:08,100 but then the idea is, can you do it for 381 00:16:06,300 --> 00:16:10,680 100 sensors? 382 00:16:08,100 --> 00:16:12,600 What happened is we were able to extract 383 00:16:10,680 --> 00:16:14,639 them, but it's still more than 384 00:16:12,600 --> 00:16:17,519 100 events per week per sensor, and 385 00:16:14,639 --> 00:16:20,220 that's still not possible for any human 386 00:16:17,519 --> 00:16:22,440 to go through, just because they 387 00:16:20,220 --> 00:16:24,560 are maxed out. They have 388 00:16:22,440 --> 00:16:27,899 minimal time, so they still can't 389 00:16:24,560 --> 00:16:30,060 analyze each and every event 390 00:16:27,899 --> 00:16:32,459 independently. 391 00:16:30,060 --> 00:16:34,860 And so the next step was, we wanted to 392 00:16:32,459 --> 00:16:36,600 group them automatically. Again, we used a 393 00:16:34,860 --> 00:16:39,120 lot of the information within the 394 00:16:36,600 --> 00:16:41,519 characteristics of the time series, using 395 00:16:39,120 --> 00:16:44,100 various metrics, dynamic time warping and 396 00:16:41,519 --> 00:16:47,100 all the other things, but we also worked 397 00:16:44,100 --> 00:16:48,540 with SMEs to understand the 398 00:16:47,100 --> 00:16:50,759 types of transients. For example, 399 00:16:48,540 --> 00:16:54,540 what's the transient that is going to be 400 00:16:50,759 --> 00:16:57,959 created by a 
fire testing versus a large 401 00:16:54,540 --> 00:16:59,759 customer, like a large plant, for example, or 402 00:16:57,959 --> 00:17:01,440 a large high-rise apartment block? 403 00:16:59,759 --> 00:17:04,020 That was important for them, to 404 00:17:01,440 --> 00:17:06,299 figure out what the source is. And then 405 00:17:04,020 --> 00:17:08,280 again, it needs to be reliable. Yes, it 406 00:17:06,299 --> 00:17:11,160 significantly improved things: now they only had 407 00:17:08,280 --> 00:17:13,620 to analyze a few tens of groups. 408 00:17:11,160 --> 00:17:15,059 But the challenge is, this is one sensor. 409 00:17:13,620 --> 00:17:18,660 What happens if there are a hundred 410 00:17:15,059 --> 00:17:21,120 sensors, or a thousand sensors? 411 00:17:18,660 --> 00:17:23,520 What we did was, we started to build 412 00:17:21,120 --> 00:17:25,140 analytics around it, and then across 413 00:17:23,520 --> 00:17:27,600 all the groups we tried to look for 414 00:17:25,140 --> 00:17:29,940 patterns by day and time, and we 415 00:17:27,600 --> 00:17:31,919 were able to surface those anomalies. 416 00:17:29,940 --> 00:17:34,140 One of them is shown here, which is that 417 00:17:31,919 --> 00:17:36,960 the same kind of pattern happens every 418 00:17:34,140 --> 00:17:38,460 Friday in the morning. And this is 419 00:17:36,960 --> 00:17:39,780 the information that is useful for them, 420 00:17:38,460 --> 00:17:41,640 because now they can go and do field 421 00:17:39,780 --> 00:17:43,140 testing. And they came back and said, 422 00:17:41,640 --> 00:17:44,760 yes, this was fire testing that was 423 00:17:43,140 --> 00:17:48,240 happening, especially at peak demand 424 00:17:44,760 --> 00:17:49,799 times. And I think they are 425 00:17:48,240 --> 00:17:51,419 working with them to try to move the 426 00:17:49,799 --> 00:17:53,340 testing time so that it doesn't affect 427 00:17:51,419 --> 00:17:55,980 the infrastructure. 428 00:17:53,340 --> 00:17:57,660 And 
similarly, there was other work that 429 00:17:55,980 --> 00:17:59,760 we did in trying to understand: do 430 00:17:57,660 --> 00:18:01,320 the characteristics change over time? Is 431 00:17:59,760 --> 00:18:03,960 there new transient behavior that is 432 00:18:01,320 --> 00:18:06,120 coming in, or that is stopping? And so 433 00:18:03,960 --> 00:18:08,280 this is what we call 434 00:18:06,120 --> 00:18:11,340 decision advantage: how do you take the 435 00:18:08,280 --> 00:18:14,100 raw data and then try to create as many 436 00:18:11,340 --> 00:18:17,340 insights as possible, automatically, for 437 00:18:14,100 --> 00:18:20,160 them, that are reliable and accurate, that 438 00:18:17,340 --> 00:18:22,380 they can act upon to create some kind 439 00:18:20,160 --> 00:18:25,320 of value? 440 00:18:22,380 --> 00:18:29,160 And so the second one I wanted to talk 441 00:18:25,320 --> 00:18:31,860 about is Alexandrina Council. Alexandrina 442 00:18:29,160 --> 00:18:33,720 is just 50 kilometers down south. For 443 00:18:31,860 --> 00:18:35,580 those of you who are new to Adelaide, 444 00:18:33,720 --> 00:18:37,500 I strongly recommend going there, 445 00:18:35,580 --> 00:18:40,160 because it's where the River Murray 446 00:18:37,500 --> 00:18:43,320 meets the sea, and it's a beautiful place. 447 00:18:40,160 --> 00:18:44,820 The challenge is that it's a large geography: 448 00:18:43,320 --> 00:18:46,559 the council covers over a thousand square 449 00:18:44,820 --> 00:18:49,200 kilometers. It's very ecologically 450 00:18:46,559 --> 00:18:51,440 sensitive. There are few people, and 451 00:18:49,200 --> 00:18:54,480 there are even fewer employees, so 452 00:18:51,440 --> 00:18:56,640 maintaining the wastewater network is 453 00:18:54,480 --> 00:19:01,100 really hard for such a large geography 454 00:18:56,640 --> 00:19:01,100 with so few people. Excuse me. 455 00:19:01,380 --> 00:19:06,299 And the challenge is, the drains can 456 00:19:04,740 --> 00:19:08,220 get 
blocked. There are a lot of reasons 457 00:19:06,299 --> 00:19:11,340 that they can get blocked, the predominant 458 00:19:08,220 --> 00:19:13,860 one being tree roots trying to 459 00:19:11,340 --> 00:19:15,900 force their way into the sewer network. 460 00:19:13,860 --> 00:19:17,820 And what happens is, if there is a 461 00:19:15,900 --> 00:19:19,799 blockage, there is back pressure, 462 00:19:17,820 --> 00:19:23,000 and this can cause spills. And 463 00:19:19,799 --> 00:19:26,280 a spill, especially around the 464 00:19:23,000 --> 00:19:28,620 delta of the Murray River, 465 00:19:26,280 --> 00:19:31,740 has a strong environmental impact. 466 00:19:28,620 --> 00:19:33,780 The challenge is that there are no alerts for this 467 00:19:31,740 --> 00:19:36,419 at present, because most of 468 00:19:33,780 --> 00:19:38,640 the alerts in the Modbus system that 469 00:19:36,419 --> 00:19:40,799 controls these pumps are based on just 470 00:19:38,640 --> 00:19:43,380 thresholds: has it reached the start limit, 471 00:19:40,799 --> 00:19:45,780 or has it reached the stop limit? And so 472 00:19:43,380 --> 00:19:48,059 there are no alerts, and they have 473 00:19:45,780 --> 00:19:49,860 these issues happening from time to time. 474 00:19:48,059 --> 00:19:52,500 So we are working with them to 475 00:19:49,860 --> 00:19:54,000 create a solution that is of operational 476 00:19:52,500 --> 00:19:56,700 value to them. 477 00:19:54,000 --> 00:19:59,880 The data that we worked with is 478 00:19:56,700 --> 00:20:02,700 for over five areas, and they had 54 479 00:19:59,880 --> 00:20:05,179 pump stations and over 800 million 480 00:20:02,700 --> 00:20:05,179 samples. 481 00:20:05,280 --> 00:20:10,799 So the idea here is this: 482 00:20:08,460 --> 00:20:12,360 there are multiple pump stations, 483 00:20:10,799 --> 00:20:14,640 and then there are sumps that are 484 00:20:12,360 --> 00:20:16,860 associated with them. 
so there is sewer 485 00:20:14,640 --> 00:20:19,140 water coming in from every residential 486 00:20:16,860 --> 00:20:21,179 or other commercial buildings and then 487 00:20:19,140 --> 00:20:23,100 it gets into the main network which then 488 00:20:21,179 --> 00:20:24,600 goes to the pump station it gets pumped 489 00:20:23,100 --> 00:20:28,020 all the way to the wastewater treatment 490 00:20:24,600 --> 00:20:31,740 plant and so what happens is as you can 491 00:20:28,020 --> 00:20:34,860 see the red there it's slowly rising but 492 00:20:31,740 --> 00:20:36,720 then it's not below the lower limit to 493 00:20:34,860 --> 00:20:39,120 raise an alarm and it is not above the 494 00:20:36,720 --> 00:20:41,220 higher limit to raise an alarm so unless 495 00:20:39,120 --> 00:20:43,380 somebody sees this data manually 496 00:20:41,220 --> 00:20:45,600 constantly inspecting the dashboards 497 00:20:43,380 --> 00:20:48,179 there's no way for them to know 498 00:20:45,600 --> 00:20:51,360 and likewise in this specific scenario 499 00:20:48,179 --> 00:20:53,820 they had no idea about this and it 500 00:20:51,360 --> 00:20:55,140 caused a spill where state 501 00:20:53,820 --> 00:20:56,880 government had to get involved and 502 00:20:55,140 --> 00:20:59,520 federal government had to get involved 503 00:20:56,880 --> 00:21:01,740 and there are a lot of other cascading 504 00:20:59,520 --> 00:21:03,559 issues as well it affects tourism the 505 00:21:01,740 --> 00:21:06,660 main source of 506 00:21:03,559 --> 00:21:08,280 economy and a lot of things so what we 507 00:21:06,660 --> 00:21:10,860 were trying to do is can we find 508 00:21:08,280 --> 00:21:13,919 patterns that can prevent these large 509 00:21:10,860 --> 00:21:16,140 blockages from causing spills 510 00:21:13,919 --> 00:21:18,660 and as you can see from the data 511 00:21:16,140 --> 00:21:21,120 the red circle shows there was a small 512 00:21:18,660 --> 00:21:24,299 block that happened that got released
513 00:21:21,120 --> 00:21:26,340 for a short period of time and then the 514 00:21:24,299 --> 00:21:29,100 larger block ensued so we were able to 515 00:21:26,340 --> 00:21:33,480 pick up this from the historical study 516 00:21:29,100 --> 00:21:35,159 so ideally if a solution was in place an 517 00:21:33,480 --> 00:21:37,679 alert would have gone out when this 518 00:21:35,159 --> 00:21:40,380 small block happened and released and 519 00:21:37,679 --> 00:21:42,600 then the field investigators could have 520 00:21:40,380 --> 00:21:45,360 taken a look solved the block before the 521 00:21:42,600 --> 00:21:47,100 environmental impact happened 522 00:21:45,360 --> 00:21:49,500 so 523 00:21:47,100 --> 00:21:51,720 there are a lot of problems in water 524 00:21:49,500 --> 00:21:54,720 networks and what I'm trying to do is 525 00:21:51,720 --> 00:21:57,419 just give you a flavor of 526 00:21:54,720 --> 00:22:00,059 a very very small tiny piece of the 527 00:21:57,419 --> 00:22:03,059 problems that we are trying to solve and 528 00:22:00,059 --> 00:22:04,799 Python is helping us to solve from all 529 00:22:03,059 --> 00:22:06,840 sides like it is helping us to better 530 00:22:04,799 --> 00:22:09,659 communicate with the customers build 531 00:22:06,840 --> 00:22:11,760 these solutions faster deploy 532 00:22:09,659 --> 00:22:14,360 them better and also maintain them 533 00:22:11,760 --> 00:22:14,360 better so 534 00:22:14,700 --> 00:22:19,440 summary if there are just three things 535 00:22:17,700 --> 00:22:21,059 that I would like people to remember 536 00:22:19,440 --> 00:22:23,940 it's just that 537 00:22:21,059 --> 00:22:25,559 machine learning has huge potential to 538 00:22:23,940 --> 00:22:28,580 improve water network utilities 539 00:22:25,559 --> 00:22:31,620 especially automatically analyzing data 540 00:22:28,580 --> 00:22:34,200 identifying patterns and creating alerts 541 00:22:31,620 --> 00:22:37,440 and it helps them to make
better decisions 542 00:22:34,200 --> 00:22:40,380 and Python is the base for all of the 543 00:22:37,440 --> 00:22:43,260 technology to be made possible but more 544 00:22:40,380 --> 00:22:45,059 importantly that a lot more people with 545 00:22:43,260 --> 00:22:47,700 the right skill sets need to be 546 00:22:45,059 --> 00:22:49,260 involved in solving these problems a lot 547 00:22:47,700 --> 00:22:52,320 of people are involved in energy 548 00:22:49,260 --> 00:22:53,820 transition towards net zero but then we 549 00:22:52,320 --> 00:22:56,820 also need a lot more people to be 550 00:22:53,820 --> 00:22:59,940 involved in solving water problems 551 00:22:56,820 --> 00:23:01,799 as Gibson said the future is here it's 552 00:22:59,940 --> 00:23:04,380 just not evenly distributed especially 553 00:23:01,799 --> 00:23:05,720 not for the water networks 554 00:23:04,380 --> 00:23:08,400 so 555 00:23:05,720 --> 00:23:11,280 yeah if you want to learn more about us 556 00:23:08,400 --> 00:23:13,140 I'm here at PyCon till Sunday or 557 00:23:11,280 --> 00:23:15,539 you can go online we also have a 558 00:23:13,140 --> 00:23:17,340 newsletter where we don't just talk 559 00:23:15,539 --> 00:23:19,380 about our work but we talk about the 560 00:23:17,340 --> 00:23:22,620 tech in general across the water 561 00:23:19,380 --> 00:23:24,900 networks and yeah I mean feel free to 562 00:23:22,620 --> 00:23:27,000 email us and we are looking for good 563 00:23:24,900 --> 00:23:30,360 people people who are motivated to make 564 00:23:27,000 --> 00:23:33,539 a difference in solving socially 565 00:23:30,360 --> 00:23:35,640 important problems people who are happy 566 00:23:33,539 --> 00:23:37,020 to work with ambiguity but even in 567 00:23:35,640 --> 00:23:38,640 general if you want to know more about 568 00:23:37,020 --> 00:23:39,800 the area so 569 00:23:38,640 --> 00:23:45,079 thank you 570 00:23:39,800 --> 00:23:45,079 [Applause] 571 00:23:45,360 --> 00:23:48,659 thank
you Ram that was a fascinating 572 00:23:46,980 --> 00:23:49,919 session we may have time for one or two 573 00:23:48,659 --> 00:23:53,640 questions if anyone in the audience 574 00:23:49,919 --> 00:23:56,360 would like to ask a question I think I 575 00:23:53,640 --> 00:23:56,360 can see a hand 576 00:23:57,120 --> 00:24:01,039 yes excellent we have a question 577 00:24:01,200 --> 00:24:05,700 uh that was a really interesting talk 578 00:24:03,419 --> 00:24:08,460 um personally I work in a similar space 579 00:24:05,700 --> 00:24:12,179 where I'm delivering AI and ML based 580 00:24:08,460 --> 00:24:14,400 solutions to sort of legacy or very 581 00:24:12,179 --> 00:24:16,320 traditional industries and have a lot of 582 00:24:14,400 --> 00:24:18,659 trouble building confidence with 583 00:24:16,320 --> 00:24:20,340 those customers especially uh when 584 00:24:18,659 --> 00:24:21,840 you're sort of the first pass at 585 00:24:20,340 --> 00:24:23,880 bringing information systems in and 586 00:24:21,840 --> 00:24:26,760 you're bringing in AI stuff I'd be 587 00:24:23,880 --> 00:24:28,679 interested to know how your experience 588 00:24:26,760 --> 00:24:30,419 has been building confidence in 589 00:24:28,679 --> 00:24:32,100 systems like this especially hands-off 590 00:24:30,419 --> 00:24:34,980 automated ones 591 00:24:32,100 --> 00:24:38,400 yes it is a challenge it is an ongoing 592 00:24:34,980 --> 00:24:41,340 challenge but what we realized is 593 00:24:38,400 --> 00:24:43,740 they are a lot more 594 00:24:41,340 --> 00:24:48,120 they are I mean they are open to hearing 595 00:24:43,740 --> 00:24:50,280 ideas that are not operational so if you 596 00:24:48,120 --> 00:24:52,559 go and tell them that we can help you to 597 00:24:50,280 --> 00:24:56,039 make strategic decisions by analyzing 598 00:24:52,559 --> 00:24:58,559 historical data to say like this uh 599 00:24:56,039 --> 00:25:01,260 part of your asset is bad this needs 600 00:24:58,559 -->
00:25:04,200 more maintenance they're more uh they're 601 00:25:01,260 --> 00:25:06,000 all ears but if you want to do any 602 00:25:04,200 --> 00:25:07,740 modifications to the operational 603 00:25:06,000 --> 00:25:09,840 day-to-day operations in terms of 604 00:25:07,740 --> 00:25:12,419 maintaining then they are a lot more 605 00:25:09,840 --> 00:25:15,419 resistant so the idea is you try to get 606 00:25:12,419 --> 00:25:17,520 in and start communicating with them at 607 00:25:15,419 --> 00:25:19,320 a more strategic level and trying to 608 00:25:17,520 --> 00:25:21,299 solve those problems first and then 609 00:25:19,320 --> 00:25:25,039 slowly get into the operational I mean 610 00:25:21,299 --> 00:25:25,039 that's been what our experience has been 611 00:25:27,000 --> 00:25:30,779 I can't see all your wonderful faces 612 00:25:29,400 --> 00:25:33,779 through the lighting so is there a 613 00:25:30,779 --> 00:25:36,179 second question in the audience 614 00:25:33,779 --> 00:25:37,980 I feel like an auctioneer going twice 615 00:25:36,179 --> 00:25:39,720 no there isn't Ram thank you very much 616 00:25:37,980 --> 00:25:40,980 for a fantastic session here is a small 617 00:25:39,720 --> 00:25:42,230 gift to say thank you for your talk 618 00:25:40,980 --> 00:25:47,820 thank you very much 619 00:25:42,230 --> 00:25:48,960 [Applause] 620 00:25:47,820 --> 00:25:51,840 Chris would you like to come down and 621 00:25:48,960 --> 00:25:54,840 join me uh we won't do an extended uh 622 00:25:51,840 --> 00:25:57,120 track close but uh from myself and from 623 00:25:54,840 --> 00:25:59,279 Chris uh we wanted to say thank you very 624 00:25:57,120 --> 00:26:01,380 much for coming and joining us at the 625 00:25:59,279 --> 00:26:03,360 our connected Universe track it's the 626 00:26:01,380 --> 00:26:05,279 first time we've run this track together 627 00:26:03,360 --> 00:26:07,460 uh and we're delighted that you could 628 00:26:05,279 --> 00:26:10,240 all be here thank
you 629 00:26:07,460 --> 00:26:10,890 next time 630 00:26:10,240 --> 00:26:13,239 [Music] 631 00:26:10,890 --> 00:26:13,239 [Applause]