1 00:00:00,000 --> 00:00:05,910 We'd 2 00:00:02,830 --> 00:00:05,910 [Music] 3 00:00:10,160 --> 00:00:14,960 like to welcome to the stage at this 4 00:00:11,440 --> 00:00:17,520 time Andressa Delo Cabistani uh who will 5 00:00:14,960 --> 00:00:20,480 speak with us on self-healing system for 6 00:00:17,520 --> 00:00:23,720 UI tests using ML. Thank you. 7 00:00:20,480 --> 00:00:23,720 Thank you. 8 00:00:28,000 --> 00:00:34,559 So hi. Oh my god, you are a lot of 9 00:00:31,119 --> 00:00:36,160 people. So, hi, my name is Andresa 10 00:00:34,559 --> 00:00:39,360 Cabeni. 11 00:00:36,160 --> 00:00:42,719 Uh, and I am not a machine learning 12 00:00:39,360 --> 00:00:45,520 engineer nor a data scientist. I work as 13 00:00:42,719 --> 00:00:48,559 a software quality engineer. And I want 14 00:00:45,520 --> 00:00:51,600 to show today that there are things we 15 00:00:48,559 --> 00:00:53,120 can do using machine learning in my area 16 00:00:51,600 --> 00:00:54,719 too. 17 00:00:53,120 --> 00:00:57,600 So today I'll talk about the 18 00:00:54,719 --> 00:01:00,640 self-healing system for UI tests that we 19 00:00:57,600 --> 00:01:03,440 built in my team this year. This system 20 00:01:00,640 --> 00:01:06,479 started as a project during an AI 21 00:01:03,440 --> 00:01:11,640 hackathon and ended up as a solution to 22 00:01:06,479 --> 00:01:11,640 make our UI test suites more reliable. 23 00:01:11,680 --> 00:01:17,600 But before I start, let me introduce 24 00:01:13,920 --> 00:01:19,759 myself a little bit. So as I said, my 25 00:01:17,600 --> 00:01:21,439 name is Andrea. 26 00:01:19,759 --> 00:01:24,720 Probably you already know that because 27 00:01:21,439 --> 00:01:27,600 of my accent, but I am Brazilian. 
28 00:01:24,720 --> 00:01:29,680 Brazil is doing an out-of-season carnival 29 00:01:27,600 --> 00:01:33,840 right now celebrating that the former 30 00:01:29,680 --> 00:01:36,960 president was sentenced to 27 years of 31 00:01:33,840 --> 00:01:40,880 prison for attempting a coup d'état. 32 00:01:36,960 --> 00:01:44,240 So, yay Brazil. 33 00:01:40,880 --> 00:01:46,720 Uh, I work at Red Hat and I love 34 00:01:44,240 --> 00:01:49,360 programming. What I like the most about 35 00:01:46,720 --> 00:01:52,000 programming uh is the fact that we are 36 00:01:49,360 --> 00:01:54,960 always learning. Uh when I'm not 37 00:01:52,000 --> 00:01:58,000 working, I am probably singing KPop 38 00:01:54,960 --> 00:02:00,719 Demon Hunters songs with my daughters. 39 00:01:58,000 --> 00:02:03,119 Those are they, by the way, I didn't just 40 00:02:00,719 --> 00:02:07,439 get images from random girls on the 41 00:02:03,119 --> 00:02:09,920 internet. Um and the last thing is that 42 00:02:07,439 --> 00:02:11,680 I love Star Wars, but who doesn't, 43 00:02:09,920 --> 00:02:15,520 right? 44 00:02:11,680 --> 00:02:20,080 So, enough about me. Let's do a 45 00:02:15,520 --> 00:02:22,560 quick overview of how a UI test works. 46 00:02:20,080 --> 00:02:25,840 Basically, we need to use an automation 47 00:02:22,560 --> 00:02:28,959 framework, for example, Selenium or the 48 00:02:25,840 --> 00:02:32,400 one I built this whole project around uh 49 00:02:28,959 --> 00:02:36,720 called Playwright and tell the automation 50 00:02:32,400 --> 00:02:40,080 framework to identify a specific element 51 00:02:36,720 --> 00:02:42,000 in the page and perform an action on 52 00:02:40,080 --> 00:02:45,840 that element. 53 00:02:42,000 --> 00:02:48,879 An example to understand that is if I 54 00:02:45,840 --> 00:02:51,599 have a test to validate that the 55 00:02:48,879 --> 00:02:54,560 login functionality is working in a 56 00:02:51,599 --> 00:02:58,400 page.
I tell the automation framework to 57 00:02:54,560 --> 00:03:01,360 navigate to the page under test. Select 58 00:02:58,400 --> 00:03:03,840 the username input box and fill that 59 00:03:01,360 --> 00:03:06,720 with the correct username. 60 00:03:03,840 --> 00:03:09,440 Select the password input box and fill 61 00:03:06,720 --> 00:03:12,480 that with the correct password. Click 62 00:03:09,440 --> 00:03:16,319 the login button and then assert that 63 00:03:12,480 --> 00:03:19,840 the login functionality was successful. 64 00:03:16,319 --> 00:03:22,480 Or like the example in the slide shows, 65 00:03:19,840 --> 00:03:26,480 go to the PyCon Australia 66 00:03:22,480 --> 00:03:30,879 2025 website and click the link 67 00:03:26,480 --> 00:03:37,080 identified by that huge CSS selector 68 00:03:30,879 --> 00:03:37,080 which is the about link in that page. 69 00:03:39,440 --> 00:03:45,440 To add some context about Python 70 00:03:42,640 --> 00:03:48,959 Playwright, the first thing to understand 71 00:03:45,440 --> 00:03:52,239 is that Python Playwright has a Page 72 00:03:48,959 --> 00:03:55,280 class with methods that are used to 73 00:03:52,239 --> 00:03:58,480 locate elements in the page. For 74 00:03:55,280 --> 00:04:01,920 example, get_by_role 75 00:03:58,480 --> 00:04:04,720 that receives the type of role like 76 00:04:01,920 --> 00:04:07,680 button and the name of that button, for 77 00:04:04,720 --> 00:04:10,879 example, submit. 78 00:04:07,680 --> 00:04:14,640 These methods return a Playwright 79 00:04:10,879 --> 00:04:19,280 Locator instance. The locator instances 80 00:04:14,640 --> 00:04:22,400 make it possible to click, fill, check, 81 00:04:19,280 --> 00:04:24,960 and perform other Playwright actions.
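A minimal sketch of the login test described above, written against Playwright's synchronous Python API (`goto`, `get_by_label`, `get_by_role`, `fill`, `click`). The `/login` path, the field labels, the button name, and the "Welcome" heading are assumptions made for illustration; `page` stands for the object that the pytest-playwright `page` fixture would provide.

```python
def login(page, base_url, username, password):
    """Drive a login form through label- and role-based locators."""
    # Navigate to the page under test (the /login path is hypothetical).
    page.goto(f"{base_url}/login")
    # Each get_by_* call returns a Locator; fill() and click() act on it.
    page.get_by_label("Username").fill(username)
    page.get_by_label("Password").fill(password)
    page.get_by_role("button", name="Log in").click()
    # Return a locator the test can assert on to confirm the login worked.
    return page.get_by_role("heading", name="Welcome")
```

In a real test the returned locator would typically be checked with Playwright's `expect(...).to_be_visible()`.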
In 82 00:04:22,400 --> 00:04:27,919 this slide, we can see a video I 83 00:04:24,960 --> 00:04:31,360 recorded of myself pointing the 84 00:04:27,919 --> 00:04:35,120 mouse cursor uh to different elements 85 00:04:31,360 --> 00:04:37,520 using Playwright code generator 86 00:04:35,120 --> 00:04:42,400 just to show some of the ways Playwright 87 00:04:37,520 --> 00:04:46,320 uses to identify elements in a page. For 88 00:04:42,400 --> 00:04:50,000 example, get_by_role passing link 89 00:04:46,320 --> 00:04:53,759 as the role and setting name about as an 90 00:04:50,000 --> 00:04:57,600 alternative to calling the locator method 91 00:04:53,759 --> 00:05:00,320 and using that long awful CSS selector 92 00:04:57,600 --> 00:05:03,280 from the other slides. 93 00:05:00,320 --> 00:05:05,680 And please take a good look at that 94 00:05:03,280 --> 00:05:09,560 about link in the page because I'll talk 95 00:05:05,680 --> 00:05:09,560 about it a lot today. 96 00:05:11,120 --> 00:05:16,720 So when we are testing something like 97 00:05:13,759 --> 00:05:20,160 the login functionality, 98 00:05:16,720 --> 00:05:22,479 we call it an end-to-end test. 99 00:05:20,160 --> 00:05:25,440 And if I have a test suite full of 100 00:05:22,479 --> 00:05:28,880 end-to-end tests that are triggered in a 101 00:05:25,440 --> 00:05:32,560 CI/CD pipeline to run every time new 102 00:05:28,880 --> 00:05:35,600 pieces of code are merged, we call them 103 00:05:32,560 --> 00:05:38,160 regression tests. 104 00:05:35,600 --> 00:05:41,840 These regression tests run so we can 105 00:05:38,160 --> 00:05:45,039 lower the risk that the changes affect 106 00:05:41,840 --> 00:05:48,160 other functionalities before they continue 107 00:05:45,039 --> 00:05:52,000 their way to prod.
And when a failure 108 00:05:48,160 --> 00:05:54,320 occurs in this regression test, we stop 109 00:05:52,000 --> 00:05:59,479 the release of the functionality and 110 00:05:54,320 --> 00:05:59,479 start to investigate what happened. 111 00:06:04,479 --> 00:06:09,520 And here is where the UI test problem 112 00:06:07,919 --> 00:06:13,440 begins. 113 00:06:09,520 --> 00:06:17,919 Sometimes even a small front-end change 114 00:06:13,440 --> 00:06:20,240 to an attribute can break a selector 115 00:06:17,919 --> 00:06:23,440 making it impossible for an automation 116 00:06:20,240 --> 00:06:27,039 framework to identify the element in the 117 00:06:23,440 --> 00:06:30,400 page to perform the expected action 118 00:06:27,039 --> 00:06:33,680 until a timeout error occurs. 119 00:06:30,400 --> 00:06:36,560 So we stop the whole pipeline, start 120 00:06:33,680 --> 00:06:39,120 to investigate what is wrong just to 121 00:06:36,560 --> 00:06:43,280 find out that the ID changed from 122 00:06:39,120 --> 00:06:46,319 #username to #user and 123 00:06:43,280 --> 00:06:47,840 this didn't affect the login 124 00:06:46,319 --> 00:06:51,280 functionality. 125 00:06:47,840 --> 00:06:56,120 The login is still working, it's just a 126 00:06:51,280 --> 00:06:56,120 UI test framework limitation. 127 00:06:56,160 --> 00:07:03,360 So as good engineers we wanted to create 128 00:06:59,840 --> 00:07:06,720 a solution for this problem. A system 129 00:07:03,360 --> 00:07:10,080 that when a timeout error occurs during 130 00:07:06,720 --> 00:07:12,240 an action like click, fill, check, etc. 131 00:07:10,080 --> 00:07:15,599 because Playwright couldn't find the 132 00:07:12,240 --> 00:07:18,479 element, the test would heal itself. 133 00:07:15,599 --> 00:07:21,599 Sounds a lot like a Jedi mind trick, 134 00:07:18,479 --> 00:07:25,319 right? It heals itself and makes the 135 00:07:21,599 --> 00:07:25,319 pipeline succeed.
136 00:07:25,840 --> 00:07:33,680 It is not as cool as a Jedi mind trick but 137 00:07:30,560 --> 00:07:35,520 we can do cool flowcharts with 138 00:07:33,680 --> 00:07:41,039 it. 139 00:07:35,520 --> 00:07:44,800 So let's start with the successful case. 140 00:07:41,039 --> 00:07:48,960 The test starts and tells Playwright to 141 00:07:44,800 --> 00:07:52,639 perform the action click on an element. 142 00:07:48,960 --> 00:07:54,879 If Playwright succeeds in clicking 143 00:07:52,639 --> 00:07:57,520 the element, 144 00:07:54,879 --> 00:08:00,160 fingerprints for the element are saved 145 00:07:57,520 --> 00:08:02,000 in a database and the test will 146 00:08:00,160 --> 00:08:07,440 continue. 147 00:08:02,000 --> 00:08:10,080 Now the case that interests us more, 148 00:08:07,440 --> 00:08:13,759 the one where the action wasn't 149 00:08:10,080 --> 00:08:17,120 successful. In this case, if Playwright 150 00:08:13,759 --> 00:08:19,120 can't find the element, the healing is 151 00:08:17,120 --> 00:08:22,000 triggered. 152 00:08:19,120 --> 00:08:24,800 When the healing is triggered, there are 153 00:08:22,000 --> 00:08:28,160 a bunch of things that need to happen 154 00:08:24,800 --> 00:08:31,039 before attempting to heal. The first 155 00:08:28,160 --> 00:08:33,440 thing in the healing process is to 156 00:08:31,039 --> 00:08:36,399 extract information 157 00:08:33,440 --> 00:08:39,919 for the failed element, so the system 158 00:08:36,399 --> 00:08:42,719 can search in the database for similar 159 00:08:39,919 --> 00:08:46,320 fingerprints.
160 00:08:42,719 --> 00:08:49,680 After finding similar fingerprints, 161 00:08:46,320 --> 00:08:52,959 multiple alternatives are generated and 162 00:08:49,680 --> 00:08:54,640 sent to an endpoint where they will be 163 00:08:52,959 --> 00:08:58,399 classified 164 00:08:54,640 --> 00:09:02,000 and ranked by confidence score, a number 165 00:08:58,399 --> 00:09:05,440 that expresses how likely they are to be a 166 00:09:02,000 --> 00:09:08,480 successful substitute. 167 00:09:05,440 --> 00:09:11,760 The healing system will iterate over the 168 00:09:08,480 --> 00:09:14,720 ranked alternatives and start to test them 169 00:09:11,760 --> 00:09:18,000 one by one. 170 00:09:14,720 --> 00:09:21,519 Both successful and unsuccessful healing 171 00:09:18,000 --> 00:09:24,720 events will be stored in the database 172 00:09:21,519 --> 00:09:27,839 for future learning. The healing system 173 00:09:24,720 --> 00:09:30,800 will try until an alternative is 174 00:09:27,839 --> 00:09:33,839 successful or until it exhausts all 175 00:09:30,800 --> 00:09:36,560 alternatives with confidence score equal 176 00:09:33,839 --> 00:09:40,600 or greater than the threshold which 177 00:09:36,560 --> 00:09:40,600 defaults to 0.5. 178 00:09:42,880 --> 00:09:48,040 Water. Okay.
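The retry loop from the flowchart can be sketched as follows. This is only an illustration under stated assumptions: the function name, the `(selector, confidence)` pair format, and the shape of the logged events are hypothetical; the 0.5 default threshold and the "stop at the first success, log every attempt" behaviour come from the talk.

```python
def try_ranked_alternatives(ranked, attempt, threshold=0.5):
    """Try alternatives in ranked order and stop at the first success.

    ranked    -- list of (selector, confidence) pairs, highest confidence first
    attempt   -- callable returning True when the selector works on the page
    threshold -- alternatives scoring below this value are never tried
    Returns (healed_selector_or_None, events); events records every attempt
    so that successes and failures can both be stored for later training.
    """
    events = []
    for selector, confidence in ranked:
        if confidence < threshold:
            break  # the list is sorted, so the rest is below the threshold too
        succeeded = attempt(selector)
        events.append(
            {"selector": selector, "confidence": confidence, "success": succeeded}
        )
        if succeeded:
            return selector, events
    return None, events
```

Returning the events alongside the result keeps the loop free of database code; the caller decides where the healing history is stored.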
So 179 00:09:48,320 --> 00:09:55,360 since we wanted that our existing test 180 00:09:52,080 --> 00:09:58,800 suites using pytest-playwright 181 00:09:55,360 --> 00:10:02,080 didn't need many changes when starting 182 00:09:58,800 --> 00:10:05,120 to use the self-healing system we 183 00:10:02,080 --> 00:10:09,279 implemented a wrapper around Playwright's 184 00:10:05,120 --> 00:10:11,120 page called self-healing page which is 185 00:10:09,279 --> 00:10:14,800 very creative 186 00:10:11,120 --> 00:10:18,880 and if in the normal Python Playwright we 187 00:10:14,800 --> 00:10:22,000 have a page object and all methods used 188 00:10:18,880 --> 00:10:25,839 to locate an element return a locator 189 00:10:22,000 --> 00:10:28,560 instance, self-healing page has all the 190 00:10:25,839 --> 00:10:31,600 same methods but returning a 191 00:10:28,560 --> 00:10:34,560 self-healing locator instance and 192 00:10:31,600 --> 00:10:37,839 self-healing locator implements the same 193 00:10:34,560 --> 00:10:40,240 interface as locator but with the 194 00:10:37,839 --> 00:10:42,720 healing functionality inside all 195 00:10:40,240 --> 00:10:47,240 methods. 196 00:10:42,720 --> 00:10:47,240 The action methods actually. 197 00:10:48,160 --> 00:10:54,560 So in this slide we can see a 198 00:10:51,680 --> 00:10:58,079 comparison. We can see the changes 199 00:10:54,560 --> 00:11:01,839 needed to initialize a PyCon AU page 200 00:10:58,079 --> 00:11:04,079 instance for a system using only 201 00:11:01,839 --> 00:11:07,920 pytest-playwright without the healing 202 00:11:04,079 --> 00:11:12,000 functionality, and how to initialize 203 00:11:07,920 --> 00:11:14,720 using the healing system. Without any 204 00:11:12,000 --> 00:11:18,000 healing we just need to pass the page 205 00:11:14,720 --> 00:11:20,399 fixture to PyCon AU page and it's ready 206 00:11:18,000 --> 00:11:23,680 to be used.
207 00:11:20,399 --> 00:11:26,959 In the second way we need to create a page 208 00:11:23,680 --> 00:11:29,600 object with healing capabilities 209 00:11:26,959 --> 00:11:31,760 which I called self_healing_page in 210 00:11:29,600 --> 00:11:35,279 snake case. 211 00:11:31,760 --> 00:11:39,600 Since we can't pass the fixture page 212 00:11:35,279 --> 00:11:43,279 from pytest-playwright directly to PyCon AU 213 00:11:39,600 --> 00:11:46,240 page we pass it to the class 214 00:11:43,279 --> 00:11:48,560 self-healing 215 00:11:46,240 --> 00:11:52,079 page to build a page with healing 216 00:11:48,560 --> 00:11:55,279 capabilities that we want to pass to 217 00:11:52,079 --> 00:11:58,720 PyCon AU page. 218 00:11:55,279 --> 00:12:02,560 Self-healing page will accept both page 219 00:11:58,720 --> 00:12:05,440 and the healing DB as arguments. 220 00:12:02,560 --> 00:12:08,639 After that, we just pass the self-healing 221 00:12:05,440 --> 00:12:12,079 page object to PyCon AU page and the 222 00:12:08,639 --> 00:12:16,800 fixture is ready to be used. 223 00:12:12,079 --> 00:12:21,040 So all the tests using the fixture PyCon 224 00:12:16,800 --> 00:12:24,160 AU page won't need to change. The only 225 00:12:21,040 --> 00:12:27,360 changes were in how to set up the 226 00:12:24,160 --> 00:12:32,040 fixture and all methods used are 227 00:12:27,360 --> 00:12:32,040 compatible with both pages. 228 00:12:34,959 --> 00:12:41,680 So we already know from the flowchart 229 00:12:38,000 --> 00:12:44,240 what is the flow of the system. 230 00:12:41,680 --> 00:12:47,839 But what happens in the code when I use 231 00:12:44,240 --> 00:12:51,519 the fixture PyCon AU page with healing 232 00:12:47,839 --> 00:12:57,160 capabilities inside my test and tell 233 00:12:51,519 --> 00:12:57,160 Playwright to click on that about link.
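A minimal sketch of the wrapper idea, assuming class names `SelfHealingPage` and `SelfHealingLocator` taken from how the talk pronounces them. Everything else is a simplification for illustration: the talk's class takes the page and a healing DB, while this sketch takes a hypothetical `heal_factory` callable instead, wraps only `get_by_role` and `click`, and lets the caller choose which exception counts as a timeout (with real Playwright that would be `playwright.sync_api.TimeoutError`).

```python
class SelfHealingLocator:
    """Wraps a locator and retries the action with an alternative on timeout."""

    def __init__(self, locator, heal, timeout_error=TimeoutError):
        self._locator = locator
        self._heal = heal  # callable returning a replacement locator or None
        self._timeout_error = timeout_error

    def click(self, **kwargs):
        try:
            return self._locator.click(**kwargs)
        except self._timeout_error:
            replacement = self._heal()
            if replacement is None:
                raise  # nothing to heal with: surface the original timeout
            return replacement.click(**kwargs)


class SelfHealingPage:
    """Exposes the page's interface but returns self-healing locators."""

    def __init__(self, page, heal_factory, timeout_error=TimeoutError):
        self._page = page
        self._heal_factory = heal_factory
        self._timeout_error = timeout_error

    def get_by_role(self, role, **kwargs):
        locator = self._page.get_by_role(role, **kwargs)
        heal = self._heal_factory(self._page, "get_by_role", role, kwargs)
        return SelfHealingLocator(locator, heal, self._timeout_error)

    def __getattr__(self, name):
        # Anything not wrapped explicitly (goto, url, ...) is delegated unchanged.
        return getattr(self._page, name)
```

Because the wrapper keeps the same method names, a page object that only calls `get_by_role(...).click()` works with either a plain page or the wrapped one, which is why the tests themselves do not need to change.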
234 00:12:57,279 --> 00:13:03,680 First the get_by_role method inside 235 00:13:00,720 --> 00:13:06,240 self-healing page will return a 236 00:13:03,680 --> 00:13:09,600 self-healing locator. 237 00:13:06,240 --> 00:13:12,560 If you paid attention to the about link 238 00:13:09,600 --> 00:13:15,600 like I told you to do, you already know 239 00:13:12,560 --> 00:13:19,120 that what we are 240 00:13:15,600 --> 00:13:21,839 doing to locate the about link in 241 00:13:19,120 --> 00:13:25,200 the page is a good way. 242 00:13:21,839 --> 00:13:27,200 So the click action is successful 243 00:13:25,200 --> 00:13:29,440 without any trigger to the healing 244 00:13:27,200 --> 00:13:34,079 system. And when an action is 245 00:13:29,440 --> 00:13:38,800 successful, the action method will call 246 00:13:34,079 --> 00:13:44,279 the method to insert the fingerprint and 247 00:13:38,800 --> 00:13:44,279 save the successful interaction 248 00:13:44,560 --> 00:13:49,600 in a database 249 00:13:47,120 --> 00:13:52,560 and store all the 250 00:13:49,600 --> 00:13:56,680 fingerprints in the database. But what 251 00:13:52,560 --> 00:13:56,680 are those fingerprints? 252 00:13:59,360 --> 00:14:02,360 Oh, 253 00:14:02,480 --> 00:14:05,959 just a second. 254 00:14:06,880 --> 00:14:12,720 I am running. 255 00:14:10,639 --> 00:14:14,560 Okay. 256 00:14:12,720 --> 00:14:18,000 Okay. 257 00:14:14,560 --> 00:14:21,360 Uh so what are those 258 00:14:18,000 --> 00:14:25,440 fingerprints? So these are information 259 00:14:21,360 --> 00:14:28,240 about the elements we just clicked.
260 00:14:25,440 --> 00:14:31,519 This information is the tag name which is 261 00:14:28,240 --> 00:14:35,040 an a here for a link, or what's the inner 262 00:14:31,519 --> 00:14:37,519 text which we know it's about, or if 263 00:14:35,040 --> 00:14:40,560 there are accessibility attributes, or 264 00:14:37,519 --> 00:14:43,920 the page this element is located on, and the 265 00:14:40,560 --> 00:14:46,160 type of locator method that is used to get 266 00:14:43,920 --> 00:14:49,440 that element. 267 00:14:46,160 --> 00:14:52,959 And these are all important information 268 00:14:49,440 --> 00:14:56,079 that we will need as soon as the locator 269 00:14:52,959 --> 00:14:58,800 is not valid anymore 270 00:14:56,079 --> 00:15:04,000 and needs to heal. 271 00:14:58,800 --> 00:15:05,760 But now let's simulate a failure. 272 00:15:04,000 --> 00:15:08,480 So 273 00:15:05,760 --> 00:15:11,199 inside our test, we are trying to locate 274 00:15:08,480 --> 00:15:13,199 an incorrect name 275 00:15:11,199 --> 00:15:15,360 abt. 276 00:15:13,199 --> 00:15:19,199 Well, we know this is supposed to 277 00:15:15,360 --> 00:15:22,240 trigger a healing, but in code details, 278 00:15:19,199 --> 00:15:25,680 what happens? 279 00:15:22,240 --> 00:15:29,680 The timeout error is handled inside each 280 00:15:25,680 --> 00:15:32,720 action method in self-healing locator. 281 00:15:29,680 --> 00:15:35,920 So when the timeout error happens during 282 00:15:32,720 --> 00:15:38,880 the click, it will first check if the 283 00:15:35,920 --> 00:15:43,040 healing service is available and if it 284 00:15:38,880 --> 00:15:44,959 is, it will attempt to heal. 285 00:15:43,040 --> 00:15:47,440 The first step when the system is 286 00:15:44,959 --> 00:15:51,360 attempting to heal 287 00:15:47,440 --> 00:15:54,880 is to get more information 288 00:15:51,360 --> 00:15:58,560 about the failing element.
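One way the fingerprints listed earlier (tag name, inner text, accessibility attributes, page, locator method) could be stored and looked up is sketched below with the standard library's `sqlite3`. The table layout, the field names, and the "same page and same tag" notion of similarity are assumptions for illustration; the talk does not show the actual schema or query.

```python
import sqlite3
from dataclasses import astuple, dataclass


@dataclass
class Fingerprint:
    page_path: str        # page the element is located on
    locator_method: str   # e.g. "get_by_role"
    tag_name: str         # e.g. "a" for a link
    inner_text: str       # e.g. "About"
    aria_label: str = ""  # one example of an accessibility attribute


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS fingerprints ("
        "page_path TEXT, locator_method TEXT, tag_name TEXT, "
        "inner_text TEXT, aria_label TEXT)"
    )
    return db


def save_fingerprint(db, fingerprint):
    """Called after a successful action to remember what the element looked like."""
    db.execute("INSERT INTO fingerprints VALUES (?, ?, ?, ?, ?)", astuple(fingerprint))
    db.commit()


def similar_fingerprints(db, page_path, tag_name):
    """Naive similarity: fingerprints from the same page with the same tag."""
    rows = db.execute(
        "SELECT page_path, locator_method, tag_name, inner_text, aria_label "
        "FROM fingerprints WHERE page_path = ? AND tag_name = ?",
        (page_path, tag_name),
    )
    return [Fingerprint(*row) for row in rows]
```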
289 00:15:54,880 --> 00:16:01,040 The extract element context method is called and 290 00:15:58,560 --> 00:16:04,240 the first thing it does is to start a 291 00:16:01,040 --> 00:16:07,680 dictionary called context with 292 00:16:04,240 --> 00:16:11,360 default values for each fingerprint. 293 00:16:07,680 --> 00:16:15,199 After that it extracts the exact page 294 00:16:11,360 --> 00:16:18,480 the element is supposed to be on and 295 00:16:15,199 --> 00:16:20,639 adds it to the context, and it's possible to 296 00:16:18,480 --> 00:16:24,399 try and get more elements from the body 297 00:16:20,639 --> 00:16:27,360 in that page. It's possible to infer 298 00:16:24,399 --> 00:16:30,800 some data about that element too from 299 00:16:27,360 --> 00:16:35,360 the role link. For example, it's possible 300 00:16:30,800 --> 00:16:37,920 to infer that the tag name is probably a. 301 00:16:35,360 --> 00:16:44,199 After adding all this information to the 302 00:16:37,920 --> 00:16:44,199 context dict, it returns the context. 303 00:16:49,120 --> 00:16:54,959 Now it's time for the healing system to 304 00:16:51,600 --> 00:16:58,160 generate all possible alternatives for 305 00:16:54,959 --> 00:17:01,040 the failing locator. 306 00:16:58,160 --> 00:17:03,839 The context dict is passed as an argument 307 00:17:01,040 --> 00:17:08,079 and is used as the base to query the 308 00:17:03,839 --> 00:17:10,480 database to get similar fingerprints. 309 00:17:08,079 --> 00:17:13,919 The similar fingerprints are passed to 310 00:17:10,480 --> 00:17:16,559 strategies to generate an exhaustive 311 00:17:13,919 --> 00:17:19,280 number of possibilities 312 00:17:16,559 --> 00:17:22,079 and it can be a really big number of 313 00:17:19,280 --> 00:17:26,319 possibilities. For the example we are 314 00:17:22,079 --> 00:17:28,480 working on here, it suggests 133 315 00:17:26,319 --> 00:17:31,440 alternatives.
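The context-extraction step could look roughly like this. It is a sketch under assumptions: the function signature, the dictionary keys, and the role-to-tag table are hypothetical, and the real method presumably also inspects the page body, which is left out here.

```python
from urllib.parse import urlparse

# Hypothetical mapping used to guess a tag name from an ARIA role.
ROLE_TO_TAG = {"link": "a", "button": "button", "textbox": "input"}


def extract_element_context(page_url, locator_method, role=None, name=None):
    """Collect what is known about an element that could not be found."""
    # Start from default values for each fingerprint field.
    context = {
        "page_path": "/",
        "locator_method": locator_method,
        "role": "",
        "tag_name": "",
        "expected_text": "",
    }
    # The page the element is supposed to be on.
    context["page_path"] = urlparse(page_url).path or "/"
    # Infer the likely tag from the role, e.g. role "link" is probably an <a>.
    if role:
        context["role"] = role
        context["tag_name"] = ROLE_TO_TAG.get(role, "")
    # The name the test asked for, even though it no longer matches.
    if name:
        context["expected_text"] = name
    return context
```

For the failing call in the talk, `extract_element_context(url, "get_by_role", role="link", name="abt")` would yield a context whose `tag_name` is `"a"`.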
316 00:17:28,480 --> 00:17:34,320 A lot of them are rubbish of course, but 317 00:17:31,440 --> 00:17:37,360 that's not for me to classify them as 318 00:17:34,320 --> 00:17:41,000 rubbish. That's the job for the random 319 00:17:37,360 --> 00:17:41,000 forest classifier. 320 00:17:41,039 --> 00:17:48,000 Sending 133 alternatives wouldn't be 321 00:17:44,799 --> 00:17:50,400 very easy to add in a slide. So I just 322 00:17:48,000 --> 00:17:53,679 added some good alternatives and some 323 00:17:50,400 --> 00:17:56,320 not so good alternatives and did 324 00:17:53,679 --> 00:17:58,640 a call to the endpoint where the model 325 00:17:56,320 --> 00:18:02,160 is being served. 326 00:17:58,640 --> 00:18:04,640 The model then classified all of them. 327 00:18:02,160 --> 00:18:06,880 Let's start analyzing the first one. 328 00:18:04,640 --> 00:18:10,799 It's 0.92. 329 00:18:06,880 --> 00:18:13,520 It is a really high confidence score. 330 00:18:10,799 --> 00:18:17,919 The one I used before as an 331 00:18:13,520 --> 00:18:21,600 example of successful is the third one 332 00:18:17,919 --> 00:18:25,280 and is classified as 0.91. 333 00:18:21,600 --> 00:18:27,760 Again, very high confidence score. 334 00:18:25,280 --> 00:18:31,840 And we can see that the weirder it gets, 335 00:18:27,760 --> 00:18:34,240 the less likely to succeed it is. The 336 00:18:31,840 --> 00:18:37,679 endpoint not only returns all 337 00:18:34,240 --> 00:18:40,720 alternatives classified, but also returns 338 00:18:37,679 --> 00:18:44,240 a ranked alternatives list organized 339 00:18:40,720 --> 00:18:46,720 from the most likely to succeed to the least 340 00:18:44,240 --> 00:18:49,360 likely. And this is good because as soon 341 00:18:46,720 --> 00:18:53,360 as the healing system finds a good 342 00:18:49,360 --> 00:18:56,360 locator, it just stops. No need to try 343 00:18:53,360 --> 00:18:56,360 others.
344 00:18:59,120 --> 00:19:04,960 Scikit-learn's random forest classifier 345 00:19:02,880 --> 00:19:07,760 uh is a supervised machine learning 346 00:19:04,960 --> 00:19:10,880 algorithm which means it learns from 347 00:19:07,760 --> 00:19:13,679 historical examples where we already 348 00:19:10,880 --> 00:19:16,720 know the correct outcomes. 349 00:19:13,679 --> 00:19:20,080 In the self-healing system case, this 350 00:19:16,720 --> 00:19:23,039 means training on thousands of past UI 351 00:19:20,080 --> 00:19:26,799 healing events that are labeled as 352 00:19:23,039 --> 00:19:30,080 either success or failure. 353 00:19:26,799 --> 00:19:33,919 The algorithm builds hundreds of 354 00:19:30,080 --> 00:19:36,080 decision trees, which are the forest, 355 00:19:33,919 --> 00:19:38,400 where each tree is trained on a 356 00:19:36,080 --> 00:19:40,960 different random subset of this 357 00:19:38,400 --> 00:19:43,679 historical data and uses a random 358 00:19:40,960 --> 00:19:47,200 selection of features at each decision 359 00:19:43,679 --> 00:19:49,440 point. When making predictions, 360 00:19:47,200 --> 00:19:52,799 all trees vote on the most likely 361 00:19:49,440 --> 00:19:56,080 outcome and the majority decision 362 00:19:52,799 --> 00:20:00,000 becomes the final prediction. 363 00:19:56,080 --> 00:20:03,840 It basically uses the wisdom of crowds 364 00:20:00,000 --> 00:20:07,440 principle that an idea uh of an 365 00:20:03,840 --> 00:20:09,039 individual can inherently be 366 00:20:07,440 --> 00:20:11,440 biased 367 00:20:09,039 --> 00:20:14,640 368 00:20:11,440 --> 00:20:17,200 but taking the average knowledge of a 369 00:20:14,640 --> 00:20:20,240 crowd can result in eliminating the bias 370 00:20:17,200 --> 00:20:24,480 or noise to produce a clearer and more 371 00:20:20,240 --> 00:20:26,960 coherent result. So English, it is my 372 00:20:24,480 --> 00:20:29,960 second language.
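The voting idea can be illustrated with a toy stand-in. This is not scikit-learn and not the model from the talk: the three "trees" below are hand-written rules rather than trained decision trees, and the feature names and cut-off values are invented for the example. A real random forest learns its trees from labeled healing events, and scikit-learn derives its probabilities by averaging the per-tree probabilities rather than counting hard votes.

```python
def forest_confidence(trees, features):
    """Fraction of trees that vote 'success' (1) for a candidate selector."""
    votes = [tree(features) for tree in trees]
    return sum(votes) / len(votes)


# Hand-written stand-ins for trained decision trees (hypothetical rules).
TOY_TREES = [
    lambda f: 1 if f["jaccard"] >= 0.5 else 0,
    lambda f: 1 if f["edit_distance"] <= 2 else 0,
    lambda f: 1 if f["uses_role"] and f["jaccard"] >= 0.7 else 0,
]

good_candidate = {"jaccard": 0.6, "edit_distance": 2, "uses_role": True}
bad_candidate = {"jaccard": 0.1, "edit_distance": 6, "uses_role": False}

good_score = forest_confidence(TOY_TREES, good_candidate)  # two of three trees agree
bad_score = forest_confidence(TOY_TREES, bad_candidate)    # no tree agrees
```

Even when one rule disagrees, the combined score still favours the good candidate, which is the "wisdom of crowds" effect in miniature.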
373 00:20:26,960 --> 00:20:29,960 So 374 00:20:30,559 --> 00:20:36,640 since machine learning, oh 375 00:20:33,840 --> 00:20:39,360 my god it's getting hard. 376 00:20:36,640 --> 00:20:42,159 So since machine learning algorithms can 377 00:20:39,360 --> 00:20:45,440 only understand numbers we need to 378 00:20:42,159 --> 00:20:48,559 convert everything about a UI element 379 00:20:45,440 --> 00:20:52,000 and its alternatives into numerical 380 00:20:48,559 --> 00:20:55,360 representations. For example, when a 381 00:20:52,000 --> 00:20:59,039 test fails on a selector like get_by_role 382 00:20:55,360 --> 00:21:03,679 link name abt, the healing system 383 00:20:59,039 --> 00:21:06,400 doesn't just see text. It sees measurable 384 00:21:03,679 --> 00:21:08,640 characteristics. 385 00:21:06,400 --> 00:21:11,360 And the transformation from human 386 00:21:08,640 --> 00:21:15,120 readable information to numerical data 387 00:21:11,360 --> 00:21:17,120 is what allows the AI to make uh 388 00:21:15,120 --> 00:21:20,120 mathematical comparisons and 389 00:21:17,120 --> 00:21:20,120 predictions. 390 00:21:20,559 --> 00:21:27,280 The self-healing system extracts 85 391 00:21:24,159 --> 00:21:30,480 distinct uh numerical features 392 00:21:27,280 --> 00:21:32,880 across six categories, each designed to 393 00:21:30,480 --> 00:21:36,000 capture different aspects of what makes 394 00:21:32,880 --> 00:21:39,520 a good healing choice. 395 00:21:36,000 --> 00:21:42,080 The selector features convert basic 396 00:21:39,520 --> 00:21:46,159 characteristics of the selectors into 397 00:21:42,080 --> 00:21:50,320 numbers. For example, a string length 398 00:21:46,159 --> 00:21:54,640 becomes a character count. Complexity 399 00:21:50,320 --> 00:21:57,600 becomes a numerical score based on how 400 00:21:54,640 --> 00:22:02,400 many conditions and nested structures 401 00:21:57,600 --> 00:22:04,320 exist.
For example, a simple ID 402 00:22:02,400 --> 00:22:08,640 #submit-button 403 00:22:04,320 --> 00:22:12,559 gets a complexity score of one. 404 00:22:08,640 --> 00:22:16,000 But that nested selector in this slide 405 00:22:12,559 --> 00:22:18,159 gets a complexity score way higher 406 00:22:16,000 --> 00:22:21,159 because of multiple conditions and 407 00:22:18,159 --> 00:22:21,159 nesting. 408 00:22:22,960 --> 00:22:30,080 The similarity features transform 409 00:22:25,440 --> 00:22:33,360 the concept of how similar are those two 410 00:22:30,080 --> 00:22:37,440 selectors into precise numerical 411 00:22:33,360 --> 00:22:40,159 measurements using Jaccard similarity. 412 00:22:37,440 --> 00:22:43,679 Jaccard similarity compares two sets of 413 00:22:40,159 --> 00:22:47,760 data and returns a number from zero 414 00:22:43,679 --> 00:22:50,640 to one that expresses how similar they are. 415 00:22:47,760 --> 00:22:53,679 Basically it is the ratio between the 416 00:22:50,640 --> 00:22:57,440 number of observations in both sets and 417 00:22:53,679 --> 00:23:00,640 the number in either set. So we can see 418 00:22:57,440 --> 00:23:04,400 that after excluding repeated 419 00:23:00,640 --> 00:23:08,640 characters from both sets the Jaccard 420 00:23:04,400 --> 00:23:11,919 similarity is very high because only the 421 00:23:08,640 --> 00:23:17,200 o and u in about 422 00:23:11,919 --> 00:23:20,320 are different. Also since from abt to 423 00:23:17,200 --> 00:23:24,000 about there are only two characters 424 00:23:20,320 --> 00:23:27,200 added, the edit distance between the 425 00:23:24,000 --> 00:23:30,480 two words is two. And there are also 426 00:23:27,200 --> 00:23:33,520 semantic similarity features that check 427 00:23:30,480 --> 00:23:37,679 abt and about 428 00:23:33,520 --> 00:23:41,039 and abt and sponsor and understand that 429 00:23:37,679 --> 00:23:42,960 abt and sponsor are too unlikely to be the 430 00:23:41,039 --> 00:23:46,400 same element.
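The two string measures mentioned here can be computed in a few lines. This sketch assumes the comparison is made on sets of characters (as the "excluding repeated characters" remark suggests) and uses the standard Levenshtein definition of edit distance; the function names are illustrative.

```python
def jaccard(a, b):
    """Size of the intersection of the character sets divided by their union."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty strings are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


print(jaccard("abt", "about"))        # 0.6 (shared: a, b, t; union adds o, u)
print(edit_distance("abt", "about"))  # 2   (insert o and u)
print(jaccard("abt", "sponsor"))      # 0.0 (no characters in common)
```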
431 00:23:42,960 --> 00:23:50,919 But about is very likely to be the link 432 00:23:46,400 --> 00:23:50,919 name Playwright is looking for. 433 00:23:52,320 --> 00:23:57,840 The context features convert 434 00:23:54,799 --> 00:24:01,360 environmental information about the web 435 00:23:57,840 --> 00:24:05,679 page and HTML elements into numerical 436 00:24:01,360 --> 00:24:09,600 form. Look at all the extracted information 437 00:24:05,679 --> 00:24:12,240 for the PyCon AU web page in the home path 438 00:24:09,600 --> 00:24:14,559 and about path. 439 00:24:12,240 --> 00:24:17,760 Just from all this information, the 440 00:24:14,559 --> 00:24:21,320 system already knows these are two 441 00:24:17,760 --> 00:24:21,320 different paths. 442 00:24:21,360 --> 00:24:28,480 In these context features the HTML 443 00:24:24,400 --> 00:24:31,679 element type becomes categorical 444 00:24:28,480 --> 00:24:36,240 numbers where button equals 1, input 445 00:24:31,679 --> 00:24:39,600 equals 2, link equals 3, 446 00:24:36,240 --> 00:24:42,240 etc. So when healing fails on get_by_role 447 00:24:39,600 --> 00:24:45,919 link name abt 448 00:24:42,240 --> 00:24:48,240 the system knows it's targeting a link 449 00:24:45,919 --> 00:24:51,720 element and can factor that into its 450 00:24:48,240 --> 00:24:51,720 healing predictions. 451 00:24:52,000 --> 00:24:55,679 Now what I believe to be the most 452 00:24:53,760 --> 00:24:58,559 sophisticated 453 00:24:55,679 --> 00:25:02,320 uh transformation, the reliability 454 00:24:58,559 --> 00:25:07,240 features. They convert best practices for 455 00:25:02,320 --> 00:25:07,240 selectors into numerical scores. 456 00:25:07,279 --> 00:25:14,000 The presence of an ID selector 457 00:25:10,880 --> 00:25:16,240 becomes uses ID selector equals one.
458 00:25:14,000 --> 00:25:21,760 Good for reliability, 459 00:25:16,240 --> 00:25:25,279 and fragile 460 00:25:21,760 --> 00:25:31,279 patterns like nth-child receive a 461 00:25:25,279 --> 00:25:34,000 penalty for uh bad reliability. 462 00:25:31,279 --> 00:25:37,600 Role-based selectors like get_by_role 463 00:25:34,000 --> 00:25:41,919 link name about get positive reliability 464 00:25:37,600 --> 00:25:46,320 scores because they use semantic HTML 465 00:25:41,919 --> 00:25:49,600 uh roles rather than brittle CSS paths. 466 00:25:46,320 --> 00:25:52,799 The system literally counts things like 467 00:25:49,600 --> 00:25:54,799 nesting depth, selector nesting depth 468 00:25:52,799 --> 00:25:59,120 equals three, 469 00:25:54,799 --> 00:26:01,440 attribute conditions, and CSS uh 470 00:25:59,120 --> 00:26:04,240 combinators to create numerical 471 00:26:01,440 --> 00:26:08,200 reliability indicators that predict 472 00:26:04,240 --> 00:26:08,200 selector stability. 473 00:26:11,679 --> 00:26:18,400 The DOM features analyze where an 474 00:26:14,080 --> 00:26:21,279 element is within the page's uh hierarchical 475 00:26:18,400 --> 00:26:25,600 structure, like the address of an 476 00:26:21,279 --> 00:26:28,480 element in the HTML document. 477 00:26:25,600 --> 00:26:31,200 When a UI test fails, knowing the 478 00:26:28,480 --> 00:26:33,840 structural context helps predict which 479 00:26:31,200 --> 00:26:37,039 alternative will work in the same 480 00:26:33,840 --> 00:26:41,039 location. The features transform this 481 00:26:37,039 --> 00:26:44,159 address into numerical data and also create 482 00:26:41,039 --> 00:26:47,120 binary flags from parents information to 483 00:26:44,159 --> 00:26:50,240 know if it is inside a form, a nav or a 484 00:26:47,120 --> 00:26:53,200 container.
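A rough sketch of how such reliability indicators might be counted for a CSS selector string. The feature names echo the ones spoken in the talk where possible; the regular expressions are deliberate simplifications and an assumption of this example (for instance, they would miscount a `+` inside `:nth-child(2n+1)` or a `#` inside an attribute value).

```python
import re


def reliability_features(selector):
    """Turn a CSS selector string into simple numeric reliability indicators."""
    # Split on descendant (whitespace), child (>), and sibling (+, ~) combinators.
    parts = [p for p in re.split(r"\s*[>+~]\s*|\s+", selector.strip()) if p]
    return {
        "uses_id_selector": 1 if re.search(r"#[\w-]+", selector) else 0,
        "uses_nth_child": 1 if ":nth-child" in selector else 0,
        "attribute_conditions": selector.count("["),
        "selector_nesting_depth": len(parts),
    }


simple = reliability_features("#submit-button")
fragile = reliability_features("nav > ul li:nth-child(3) a")
```

Here `simple` has an ID flag of 1 and a nesting depth of 1, while `fragile` has the `nth-child` flag set and a nesting depth of 4.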
This is important for 485 00:26:50,240 --> 00:26:57,039 healing because elements within forms 486 00:26:53,200 --> 00:27:00,080 typically use semantic attributes like 487 00:26:57,039 --> 00:27:01,600 name or ID. So if the element has a form 488 00:27:00,080 --> 00:27:05,200 parent, 489 00:27:01,600 --> 00:27:10,039 uh the model knows that it should 490 00:27:05,200 --> 00:27:10,039 prioritize form-specific selectors. 491 00:27:11,360 --> 00:27:17,600 The last category is the text features. 492 00:27:14,799 --> 00:27:20,400 The text features uh analyze the 493 00:27:17,600 --> 00:27:24,400 actual readable content associated with 494 00:27:20,400 --> 00:27:28,559 UI elements. For example, the text about 495 00:27:24,400 --> 00:27:32,320 for that link. Text content is often uh 496 00:27:28,559 --> 00:27:36,720 the most stable aspect of web interfaces. 497 00:27:32,320 --> 00:27:39,840 While CSS classes and DOM structure 498 00:27:36,720 --> 00:27:42,640 uh frequently change, user-visible 499 00:27:39,840 --> 00:27:44,320 labels like about tend to remain 500 00:27:42,640 --> 00:27:47,279 consistent. 501 00:27:44,320 --> 00:27:49,200 The system examines the inner text and 502 00:27:47,279 --> 00:27:51,600 text content to create multiple 503 00:27:49,200 --> 00:27:54,480 numerical features, like whether it has an 504 00:27:51,600 --> 00:27:58,240 inner text, the length of the inner 505 00:27:54,480 --> 00:28:01,279 text, the length of the text. 506 00:27:58,240 --> 00:28:04,240 Um, it also checks if the text is numerical 507 00:28:01,279 --> 00:28:06,240 or if it contains spaces. And this is 508 00:28:04,240 --> 00:28:09,039 important for healing because knowing 509 00:28:06,240 --> 00:28:12,399 that it has an inner text like about 510 00:28:09,039 --> 00:28:15,039 makes get_by_role link name about a 511 00:28:12,399 --> 00:28:19,760 preferred alternative when healing the 512 00:28:15,039 --> 00:28:24,480 failing element. 
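The four feature groups described above (context, reliability, DOM, and text) can be sketched as one small extraction function. This is only an illustration under stated assumptions, not the project's actual code: `ElementInfo`, `extract_features`, and most feature names are hypothetical, the `role=...` selector string is just a convenient stand-in for a role-based locator, and the code 0 for unknown element types is made up. Only the button/input/link codes and the general kinds of counts come from the talk.

```python
import re
from dataclasses import dataclass, field

# Categorical codes mentioned in the talk; 0 for "anything else" is an assumption.
ELEMENT_TYPE_CODES = {"button": 1, "input": 2, "link": 3}


@dataclass
class ElementInfo:
    """Hypothetical snapshot of the element a failing locator was targeting."""
    selector: str
    element_type: str
    inner_text: str = ""
    parent_tags: list = field(default_factory=list)  # e.g. ["html", "body", "nav"]


def extract_features(el: ElementInfo) -> dict:
    """Turn selector, DOM, and text information into plain integers."""
    sel = el.selector.strip()
    # Drop [attr=...] and (...) contents so their spaces and symbols
    # are not mistaken for CSS combinators.
    bare = re.sub(r"\[[^\]]*\]|\([^)]*\)", "", sel)
    bare = re.sub(r"\s*([>+~])\s*", r"\1", bare).strip()
    combinators = len(re.findall(r"[>+~]|\s+", bare))
    text = el.inner_text.strip()
    return {
        # Context: element type as a categorical number.
        "element_type": ELEMENT_TYPE_CODES.get(el.element_type, 0),
        # Reliability: flags and counts derived from the selector itself.
        "uses_id_selector": int(bool(re.search(r"#[\w-]+", sel))),
        "uses_role_selector": int(sel.startswith("role=")),
        "uses_nth_child": int("nth-child" in sel),
        "attribute_conditions": sel.count("["),
        "css_combinators": combinators,
        "selector_nesting_depth": combinators + 1,
        # DOM: where the element sits in the page.
        "dom_depth": len(el.parent_tags),
        "has_form_parent": int("form" in el.parent_tags),
        "has_nav_parent": int("nav" in el.parent_tags),
        # Text: user-visible content.
        "has_inner_text": int(bool(text)),
        "inner_text_length": len(text),
        "text_is_numeric": int(text.isdigit()),
        "text_has_spaces": int(" " in text),
    }


about_link = ElementInfo(
    selector="role=link[name='About']",
    element_type="link",
    inner_text="About",
    parent_tags=["html", "body", "nav"],
)
print(extract_features(about_link))
```

A model would then receive these integers as one row of input. A fragile selector such as `#main > ul li:nth-child(3) a` gives a deeper nesting count and sets the nth-child flag, which is the kind of signal the reliability features are meant to capture.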
Finally, let's see in 513 00:28:19,760 --> 00:28:26,640 practice how this works. I have uh a 514 00:28:24,480 --> 00:28:31,799 video here 515 00:28:26,640 --> 00:28:31,799 for a test without the healing. 516 00:28:31,919 --> 00:28:39,919 So, uh this will take a while. Spoiler 517 00:28:35,919 --> 00:28:43,480 alert: it will fail. It will trigger a 518 00:28:39,919 --> 00:28:43,480 timeout error. 519 00:28:51,120 --> 00:28:57,520 I am running out of time. So, I will 520 00:28:53,520 --> 00:29:04,159 just move it forward. Okay. So, a 521 00:28:57,520 --> 00:29:07,039 timeout error occurred. Um yeah, so 522 00:29:04,159 --> 00:29:10,200 now we are going to see how it works in 523 00:29:07,039 --> 00:29:10,200 a test 524 00:29:13,760 --> 00:29:20,320 with the healing, and I will move forward 525 00:29:17,279 --> 00:29:23,679 because it takes like around 30 seconds 526 00:29:20,320 --> 00:29:25,520 until the timeout 527 00:29:23,679 --> 00:29:28,240 starts 528 00:29:25,520 --> 00:29:32,159 the healing, and then it will just 529 00:29:28,240 --> 00:29:37,880 like click around 530 00:29:32,159 --> 00:29:37,880 the page and do other things, and then 531 00:29:40,880 --> 00:29:45,399 boom, successful. So, 532 00:29:45,840 --> 00:29:50,559 final considerations, really quick. To 533 00:29:48,640 --> 00:29:52,960 finish this talk, there are a couple of 534 00:29:50,559 --> 00:29:55,760 things I want to mention about what is 535 00:29:52,960 --> 00:30:01,200 the future 536 00:29:55,760 --> 00:30:08,000 for uh this project. So 537 00:30:01,200 --> 00:30:12,080 this was trained with synthetic data. So 538 00:30:08,000 --> 00:30:14,960 now, moving forward, as more healing 539 00:30:12,080 --> 00:30:20,000 events um 540 00:30:14,960 --> 00:30:23,600 start to happen in our tests, the 541 00:30:20,000 --> 00:30:27,120 more data we will have to retrain the 542 00:30:23,600 --> 00:30:30,399 model and then refine this model later. 
543 00:30:27,120 --> 00:30:34,000 Uh, and this is important because uh 544 00:30:30,399 --> 00:30:37,679 it's a powerful thing 545 00:30:34,000 --> 00:30:41,039 to have synthetic data generated, but 546 00:30:37,679 --> 00:30:44,320 it is not uh very good for real 547 00:30:41,039 --> 00:30:49,039 problems. So as soon as we have more real 548 00:30:44,320 --> 00:30:51,120 data, the model will be refined. And also, 549 00:30:49,039 --> 00:30:55,279 uh the other consideration is a 550 00:30:51,120 --> 00:30:57,520 notification system, because uh we want 551 00:30:55,279 --> 00:31:00,720 that for a pipeline. We want the 552 00:30:57,520 --> 00:31:03,360 pipeline to succeed, but we don't want to 553 00:31:00,720 --> 00:31:07,360 keep healing all the time. So we need to 554 00:31:03,360 --> 00:31:10,799 be notified when a healing has 555 00:31:07,360 --> 00:31:13,440 happened, uh so we can change the code, 556 00:31:10,799 --> 00:31:17,360 right? We don't want uh to keep using 557 00:31:13,440 --> 00:31:22,960 these healings all the time. So 558 00:31:17,360 --> 00:31:25,600 okay, so thank you, that was my talk, and 559 00:31:22,960 --> 00:31:28,880 I probably don't have more time now for 560 00:31:25,600 --> 00:31:32,580 questions, but I'll be around if needed 561 00:31:28,880 --> 00:31:36,160 to talk about it. Thank you. 562 00:31:32,580 --> 00:31:36,160 [Applause]
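The notification idea from the final considerations could look something like the sketch below: the pipeline still passes, but every healing is recorded so someone can go and fix the locator in the test code. The talk does not describe an implementation, so everything here is an assumption, including the names `HealingEvent` and `HealingReport`, the example test name, and the choice of a plain text summary.

```python
from dataclasses import dataclass


@dataclass
class HealingEvent:
    """Hypothetical record of one healed locator."""
    test_name: str
    failed_selector: str
    healed_selector: str


class HealingReport:
    """Collects healing events during a test run (illustrative only)."""

    def __init__(self) -> None:
        self.events: list = []

    def record(self, event: HealingEvent) -> None:
        self.events.append(event)

    def needs_attention(self) -> bool:
        # Any healing means a locator in the test code should be updated.
        return bool(self.events)

    def summary(self) -> str:
        if not self.events:
            return "No locators were healed in this run."
        lines = [f"{len(self.events)} locator(s) were healed; please update the tests:"]
        for e in self.events:
            lines.append(f"- {e.test_name}: {e.failed_selector} -> {e.healed_selector}")
        return "\n".join(lines)


report = HealingReport()
report.record(HealingEvent(
    test_name="test_about_page",  # made-up test name
    failed_selector="role=link[name='abt']",
    healed_selector="role=link[name='About']",
))
if report.needs_attention():
    print(report.summary())  # in a real pipeline this might be posted to chat or email
```

Printing is only a placeholder. The point is that a healed run leaves a visible trace, and these same records could later serve as the real healing data for retraining the model.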