1 00:00:00,000 --> 00:00:08,469 foreign 2 00:00:00,500 --> 00:00:08,469 [Music] 3 00:00:12,920 --> 00:00:16,340 have a good lunch 4 00:00:16,500 --> 00:00:22,320 good so let's start the talk so the next 5 00:00:20,340 --> 00:00:23,160 speaker will be 6 00:00:22,320 --> 00:00:26,460 um 7 00:00:23,160 --> 00:00:28,800 Jeremy Kurt thank you yep 8 00:00:26,460 --> 00:00:33,719 um we'll be talking he'll be talking a 9 00:00:28,800 --> 00:00:36,300 Libra BMC literally everything is open 10 00:00:33,719 --> 00:00:38,219 so Jeremy Kurt is the Linux and open 11 00:00:36,300 --> 00:00:41,940 source system developer 12 00:00:38,219 --> 00:00:46,440 working on the Kernel drivers firmware 13 00:00:41,940 --> 00:00:49,680 and relate related Plumbing today Jeremy 14 00:00:46,440 --> 00:00:53,059 will be describing the completely open 15 00:00:49,680 --> 00:00:56,640 baseball management control system Libra 16 00:00:53,059 --> 00:00:57,060 BMC so welcome Jeremy thank you very 17 00:00:56,640 --> 00:00:59,820 much 18 00:00:57,060 --> 00:01:01,379 [Applause] 19 00:00:59,820 --> 00:01:03,539 thanks everyone welcome 20 00:01:01,379 --> 00:01:05,519 um firstly I would like to acknowledge 21 00:01:03,539 --> 00:01:07,500 the traditional owners of now the Warren 22 00:01:05,519 --> 00:01:09,299 jury we were on people of the cooler 23 00:01:07,500 --> 00:01:10,979 Nation 24 00:01:09,299 --> 00:01:12,659 um as maybe indicated my pronunciation I 25 00:01:10,979 --> 00:01:16,439 am a visitor on this country 26 00:01:12,659 --> 00:01:18,659 um I'm from the uh the country over in 27 00:01:16,439 --> 00:01:20,939 Perth and a lot of this work was done on 28 00:01:18,659 --> 00:01:22,500 that country there I would like to uh to 29 00:01:20,939 --> 00:01:23,400 share my respect for elders past and 30 00:01:22,500 --> 00:01:26,040 presence 31 00:01:23,400 --> 00:01:29,700 uh by way of my intro uh my name is 32 00:01:26,040 --> 00:01:32,700 Jeremy I um I run a little uh 33 00:01:29,700 --> 
00:01:34,320 open source embedded uh systems 34 00:01:32,700 --> 00:01:35,939 programming consultancy called code 35 00:01:34,320 --> 00:01:38,520 construct I'm very proud to sponsor 36 00:01:35,939 --> 00:01:39,240 everything open this year 37 00:01:38,520 --> 00:01:40,560 um 38 00:01:39,240 --> 00:01:41,759 and before we get into the kind of 39 00:01:40,560 --> 00:01:43,619 technical bits a couple of couple of 40 00:01:41,759 --> 00:01:46,979 credits and or disclaimers 41 00:01:43,619 --> 00:01:48,540 um firstly this what I'm presenting 42 00:01:46,979 --> 00:01:50,700 today is built on a lot of a lot of 43 00:01:48,540 --> 00:01:53,820 existing code that's it's written by 44 00:01:50,700 --> 00:01:56,280 people that are much smarter than I am 45 00:01:53,820 --> 00:01:57,479 um some of those folks are in the room 46 00:01:56,280 --> 00:01:59,280 today so I'm going to warn you there's 47 00:01:57,479 --> 00:02:01,100 going to be some oversimplifications as 48 00:01:59,280 --> 00:02:03,479 well in in the talk 49 00:02:01,100 --> 00:02:05,939 what we've done essentially is kind of 50 00:02:03,479 --> 00:02:07,439 taken a lot of that that that work from 51 00:02:05,939 --> 00:02:09,179 these clever people and kind of glued it 52 00:02:07,439 --> 00:02:11,400 together done a bit of bring up and done 53 00:02:09,179 --> 00:02:13,560 a few fixes and build something that we 54 00:02:11,400 --> 00:02:15,599 can actually talk about at a conference 55 00:02:13,560 --> 00:02:17,819 this work is definitely a big 56 00:02:15,599 --> 00:02:20,220 collaboration between code constructs 57 00:02:17,819 --> 00:02:22,140 the open power Foundation which Hugh 58 00:02:20,220 --> 00:02:23,640 mentioned earlier in skinneret and uh 59 00:02:22,140 --> 00:02:24,780 and IBM as well so thanks for that 60 00:02:23,640 --> 00:02:27,480 thanks for those folks for for 61 00:02:24,780 --> 00:02:30,120 participating 62 00:02:27,480 --> 00:02:32,040 so what are we doing here uh our general 63 
00:02:30,120 --> 00:02:34,080 plan is is a two-step thing we're going 64 00:02:32,040 --> 00:02:36,300 to get a server we're going to take out 65 00:02:34,080 --> 00:02:39,720 its BMC and we're going to put a more 66 00:02:36,300 --> 00:02:41,400 open sourcing BMC into it uh that's my 67 00:02:39,720 --> 00:02:45,900 talk thanks all 68 00:02:41,400 --> 00:02:48,060 um no what's a BMC uh so BMC is a a 69 00:02:45,900 --> 00:02:50,879 acronym for a baseboard uh management 70 00:02:48,060 --> 00:02:52,560 controller and essentially the 71 00:02:50,879 --> 00:02:55,800 management controller that's responsible 72 00:02:52,560 --> 00:02:59,160 for managing and booting and to those 73 00:02:55,800 --> 00:03:01,800 extent sort of controlling uh a server 74 00:02:59,160 --> 00:03:03,780 and a data center so uh things like 75 00:03:01,800 --> 00:03:05,940 being able to remotely turn on and off a 76 00:03:03,780 --> 00:03:08,220 computer being able to inspect its 77 00:03:05,940 --> 00:03:10,260 inventory while the the main part of the 78 00:03:08,220 --> 00:03:12,239 computer itself is off being able to get 79 00:03:10,260 --> 00:03:14,819 a console on it all the sort of the 80 00:03:12,239 --> 00:03:17,099 remote things that are sort of side side 81 00:03:14,819 --> 00:03:21,300 bands to the the main computer itself 82 00:03:17,099 --> 00:03:23,340 the The BMC is responsible for 83 00:03:21,300 --> 00:03:24,900 while we're looking at open source BMC 84 00:03:23,340 --> 00:03:27,659 uh essentially 85 00:03:24,900 --> 00:03:30,599 there's a it's a lot of um 86 00:03:27,659 --> 00:03:33,599 a lot of side channels involved here uh 87 00:03:30,599 --> 00:03:36,420 and having a a some mysterious computer 88 00:03:33,599 --> 00:03:40,080 that's talking to your actual computer 89 00:03:36,420 --> 00:03:40,980 um can be a bit of a risk so back at 90 00:03:40,080 --> 00:03:45,120 um 91 00:03:40,980 --> 00:03:46,860 LCA in 2021 I presented a bit about kind 92 00:03:45,120 --> 
00:03:49,440 of what we're doing in The BMC space and 93 00:03:46,860 --> 00:03:51,180 it included these two slides um 94 00:03:49,440 --> 00:03:54,000 on the left a bit of a system diagram 95 00:03:51,180 --> 00:03:55,459 about what uh what the host might be 96 00:03:54,000 --> 00:03:58,739 able to do 97 00:03:55,459 --> 00:04:00,599 accessing sort of parts of the BMC over 98 00:03:58,739 --> 00:04:02,900 over one of the buses between those two 99 00:04:00,599 --> 00:04:02,900 devices 100 00:04:03,840 --> 00:04:06,840 um 101 00:04:05,040 --> 00:04:10,379 there was a cve 102 00:04:06,840 --> 00:04:12,959 um recorded uh 6260. 103 00:04:10,379 --> 00:04:13,980 um it's not not a huge issue and for 104 00:04:12,959 --> 00:04:15,840 most 105 00:04:13,980 --> 00:04:17,519 applications where we're using a server 106 00:04:15,840 --> 00:04:19,919 with a BMC it's it's kind of not a 107 00:04:17,519 --> 00:04:21,660 problem but it it can be depending on 108 00:04:19,919 --> 00:04:24,180 your trust model of how you're using a 109 00:04:21,660 --> 00:04:25,500 server who has access to the BMC who has 110 00:04:24,180 --> 00:04:27,000 access to the host and if there's need 111 00:04:25,500 --> 00:04:28,620 to be separate 112 00:04:27,000 --> 00:04:30,240 so I while it's not a serious problem 113 00:04:28,620 --> 00:04:32,580 it's a kind of demonstration of some 114 00:04:30,240 --> 00:04:35,340 some potential issues in that 115 00:04:32,580 --> 00:04:36,900 pretty much every every server kind of 116 00:04:35,340 --> 00:04:40,560 2019 117 00:04:36,900 --> 00:04:42,300 and earlier had had this potential issue 118 00:04:40,560 --> 00:04:43,979 um and it's exposed by essentially the 119 00:04:42,300 --> 00:04:46,320 the fixed piece of silicon that is The 120 00:04:43,979 --> 00:04:49,020 BMC and how it operates how it interacts 121 00:04:46,320 --> 00:04:50,520 with everything else in the system 122 00:04:49,020 --> 00:04:52,680 um but instead what we're going to do 123 
00:04:50,520 --> 00:04:54,300 here is is apply kind of the concept of 124 00:04:52,680 --> 00:04:57,900 an open ecosystem and open sourcing 125 00:04:54,300 --> 00:05:01,139 everything into making that BMC part 126 00:04:57,900 --> 00:05:03,199 um a bit more uh verifiable a bit more 127 00:05:01,139 --> 00:05:05,280 trustable a bit more modifiable because 128 00:05:03,199 --> 00:05:07,080 we're either writing the source 129 00:05:05,280 --> 00:05:10,860 ourselves or we're using bits that we 130 00:05:07,080 --> 00:05:12,780 can inspect and trust and modify 131 00:05:10,860 --> 00:05:14,280 so in doing that we're going to start 132 00:05:12,780 --> 00:05:15,660 kind of going to go up the stack 133 00:05:14,280 --> 00:05:17,280 starting from Hardware 134 00:05:15,660 --> 00:05:19,860 and of course we're going to start with 135 00:05:17,280 --> 00:05:22,199 open Hardware we're going for an open 136 00:05:19,860 --> 00:05:23,000 approach to everything here starting the 137 00:05:22,199 --> 00:05:25,500 hardware 138 00:05:23,000 --> 00:05:28,020 and I guess for extra credit here we're 139 00:05:25,500 --> 00:05:31,860 going for not just the BMC but also some 140 00:05:28,020 --> 00:05:32,699 elements of the the host server as well 141 00:05:31,860 --> 00:05:34,800 um 142 00:05:32,699 --> 00:05:36,419 what we're going to doing is we're not 143 00:05:34,800 --> 00:05:39,000 getting room from the entire thing we're 144 00:05:36,419 --> 00:05:41,759 doing a BMC bring up against existing 145 00:05:39,000 --> 00:05:44,340 host server sorry host server Hardware 146 00:05:41,759 --> 00:05:46,199 and we want to choose a platform that 147 00:05:44,340 --> 00:05:47,820 that we can actually inspect a bit and 148 00:05:46,199 --> 00:05:49,860 do our bring up without having to do too 149 00:05:47,820 --> 00:05:53,039 much reverse engineering so we've got to 150 00:05:49,860 --> 00:05:55,139 choose a platform as a Target and uh as 151 00:05:53,039 --> 00:05:57,300 was mentioned in in 
Hughes keynote this 152 00:05:55,139 --> 00:05:59,580 morning open power platforms are a 153 00:05:57,300 --> 00:06:01,740 pretty good choice for that 154 00:05:59,580 --> 00:06:03,300 um the the repo here contains 155 00:06:01,740 --> 00:06:04,919 essentially all of the firmware that's 156 00:06:03,300 --> 00:06:07,259 running on an open Tower server so we 157 00:06:04,919 --> 00:06:09,840 can look at it we can inspect it we can 158 00:06:07,259 --> 00:06:11,520 write the the bits to talk to it without 159 00:06:09,840 --> 00:06:14,340 having to sort of make too many 160 00:06:11,520 --> 00:06:15,419 assumptions about things 161 00:06:14,340 --> 00:06:16,880 and that that's going to be 162 00:06:15,419 --> 00:06:19,139 exceptionally helpful for our bring out 163 00:06:16,880 --> 00:06:21,000 for sort of reasons I'm going to talk 164 00:06:19,139 --> 00:06:22,259 about later being open source we're not 165 00:06:21,000 --> 00:06:23,940 going to modify it we're still going to 166 00:06:22,259 --> 00:06:25,380 use it as a reference for our for our 167 00:06:23,940 --> 00:06:27,600 BMC work here 168 00:06:25,380 --> 00:06:28,800 so we've kind of chosen open power as a 169 00:06:27,600 --> 00:06:30,600 base for this 170 00:06:28,800 --> 00:06:33,660 for that that's more of an ecosystem 171 00:06:30,600 --> 00:06:38,039 what we need is a specific platform 172 00:06:33,660 --> 00:06:40,740 back in uh June 2018 uh this system here 173 00:06:38,039 --> 00:06:43,740 the uh the IBM 174 00:06:40,740 --> 00:06:46,020 Oak Ridge Summit supercomputer was 175 00:06:43,740 --> 00:06:47,460 announced as the the highest performing 176 00:06:46,020 --> 00:06:49,319 supercomputer 177 00:06:47,460 --> 00:06:51,180 um of the time according to the the top 178 00:06:49,319 --> 00:06:55,680 500 list 179 00:06:51,180 --> 00:06:57,840 and it's composed of about 4 600 of 180 00:06:55,680 --> 00:06:59,039 these these individual nodes all 181 00:06:57,840 --> 00:07:00,960 networked together 182 
00:06:59,039 --> 00:07:04,039 and those individual compute compute 183 00:07:00,960 --> 00:07:05,940 nodes have this catchy name the ac922 184 00:07:04,039 --> 00:07:08,940 and that's what we're going to choose 185 00:07:05,940 --> 00:07:10,440 for our our bring up project here so 186 00:07:08,940 --> 00:07:13,080 we're going to get one of these servers 187 00:07:10,440 --> 00:07:15,240 take out the BMC and put our new open 188 00:07:13,080 --> 00:07:17,400 source cbmc unit 189 00:07:15,240 --> 00:07:19,620 this is a 190 00:07:17,400 --> 00:07:23,819 the insides of one of those machines uh 191 00:07:19,620 --> 00:07:26,400 it's a two socket 128 threads uh two 192 00:07:23,819 --> 00:07:28,740 rack unit device uh here we've got a lot 193 00:07:26,400 --> 00:07:31,199 of empty space uh there's there's 194 00:07:28,740 --> 00:07:33,360 carriers for a lot of honking gpus in 195 00:07:31,199 --> 00:07:35,340 there it's definitely a production 196 00:07:33,360 --> 00:07:36,599 server it was was part of the fastest 197 00:07:35,340 --> 00:07:37,979 computer in the world in the world for a 198 00:07:36,599 --> 00:07:39,060 while definitely a high performance 199 00:07:37,979 --> 00:07:41,220 service so we're not kind of just doing 200 00:07:39,060 --> 00:07:42,599 a toy bring up project this is this is a 201 00:07:41,220 --> 00:07:44,880 series better kit 202 00:07:42,599 --> 00:07:49,500 full one out of production effectively 203 00:07:44,880 --> 00:07:51,960 I I borrowed it it's uh yeah it we will 204 00:07:49,500 --> 00:07:53,280 send it back at some point yeah the Oak 205 00:07:51,960 --> 00:07:55,740 Ridge is not missing one of the nodes 206 00:07:53,280 --> 00:07:56,880 it's all right 207 00:07:55,740 --> 00:07:59,280 um 208 00:07:56,880 --> 00:08:00,840 as well as well as having I guess an 209 00:07:59,280 --> 00:08:02,880 important design point about these 210 00:08:00,840 --> 00:08:05,639 platform is as well as having a fairly 211 00:08:02,880 --> 00:08:06,599 
open architecture and open design 212 00:08:05,639 --> 00:08:09,840 um 213 00:08:06,599 --> 00:08:12,060 the The BMC itself is socketed so that's 214 00:08:09,840 --> 00:08:15,720 going to be pretty handy for us if we 215 00:08:12,060 --> 00:08:18,300 look at a kind of a ga x86 motherboard 216 00:08:15,720 --> 00:08:21,180 this is a so this is a server board here 217 00:08:18,300 --> 00:08:23,039 kind of similar uh two sockets items for 218 00:08:21,180 --> 00:08:26,340 a socket it's about the same sort of 219 00:08:23,039 --> 00:08:28,139 generation as our ac922 but it doesn't 220 00:08:26,340 --> 00:08:30,180 have those space for the six gpus 221 00:08:28,139 --> 00:08:33,360 doesn't really matter for this 222 00:08:30,180 --> 00:08:35,520 um but if we look at the bottom uh 223 00:08:33,360 --> 00:08:36,959 bottom you're left of that we've got 224 00:08:35,520 --> 00:08:40,080 this little space here and that that 225 00:08:36,959 --> 00:08:42,539 circle thing is The BMC now I'm pretty 226 00:08:40,080 --> 00:08:43,860 handy with a chisel but I don't think I 227 00:08:42,539 --> 00:08:45,420 could get that off without ruining the 228 00:08:43,860 --> 00:08:47,940 entire thing 229 00:08:45,420 --> 00:08:49,980 um so these hardware folks are going to 230 00:08:47,940 --> 00:08:52,019 see that this is not something we can 231 00:08:49,980 --> 00:08:56,279 just take it off and and replace it with 232 00:08:52,019 --> 00:08:58,440 a new one whereas on our at our 922 this 233 00:08:56,279 --> 00:09:00,720 is The BMC that that's shipped with the 234 00:08:58,440 --> 00:09:02,760 machine so as you can see here it's on a 235 00:09:00,720 --> 00:09:05,220 separate board we can we can unplug that 236 00:09:02,760 --> 00:09:07,500 and put our new thing on it so that's 237 00:09:05,220 --> 00:09:10,200 got a much better chance of success 238 00:09:07,500 --> 00:09:12,620 modifying this platform than than an 239 00:09:10,200 --> 00:09:12,620 early one 240 00:09:13,560 --> 
00:09:18,060 uh just going so this he said existing 241 00:09:16,260 --> 00:09:19,200 BMC that's that's what we're trying to 242 00:09:18,060 --> 00:09:21,600 replace here 243 00:09:19,200 --> 00:09:23,760 so reviewing our progress so far we have 244 00:09:21,600 --> 00:09:25,260 a server uh I'm going to kind of build 245 00:09:23,760 --> 00:09:26,220 on this the project managers in the 246 00:09:25,260 --> 00:09:28,560 audience are going to love these slides 247 00:09:26,220 --> 00:09:30,000 we kind of build up as we go 248 00:09:28,560 --> 00:09:32,519 um so we have a server we have a server 249 00:09:30,000 --> 00:09:34,320 we can do a bit of experimentation on um 250 00:09:32,519 --> 00:09:35,880 due to the open nature of the stack and 251 00:09:34,320 --> 00:09:39,000 the platform and the the implementation 252 00:09:35,880 --> 00:09:40,680 uh we can remove the BMC but of course 253 00:09:39,000 --> 00:09:41,880 we're going to want to add a new one to 254 00:09:40,680 --> 00:09:43,680 that we can't just operate this thing 255 00:09:41,880 --> 00:09:46,080 without a BMC 256 00:09:43,680 --> 00:09:47,880 so before we kind of go into that a bit 257 00:09:46,080 --> 00:09:49,140 of a digression into a bit of a 258 00:09:47,880 --> 00:09:53,160 technology that's happening at the 259 00:09:49,140 --> 00:09:55,380 moment called dcscm or dcsci a couple of 260 00:09:53,160 --> 00:09:57,720 acronyms they're going to help us with 261 00:09:55,380 --> 00:09:58,440 our with our project here 262 00:09:57,720 --> 00:10:00,899 um 263 00:09:58,440 --> 00:10:03,600 the clever Folks at the open compute 264 00:10:00,899 --> 00:10:05,820 project have published the standards 265 00:10:03,600 --> 00:10:08,339 um kind of heavily that the authors here 266 00:10:05,820 --> 00:10:11,940 are essentially Microsoft and Google 267 00:10:08,339 --> 00:10:14,339 um under a I guess an effort to make 268 00:10:11,940 --> 00:10:16,800 their own systems production more 269 00:10:14,339 --> 
00:10:19,019 modular so by producing a spec about 270 00:10:16,800 --> 00:10:21,300 what a BMC should look like and how it 271 00:10:19,019 --> 00:10:25,560 connects into the system that's right as 272 00:10:21,300 --> 00:10:28,200 an effort to possibly use new parts make 273 00:10:25,560 --> 00:10:30,860 make a a system out of more modular 274 00:10:28,200 --> 00:10:30,860 components 275 00:10:30,899 --> 00:10:34,560 and and 276 00:10:32,519 --> 00:10:35,940 kind of having a standard rather than 277 00:10:34,560 --> 00:10:38,339 having to do everything from scratch for 278 00:10:35,940 --> 00:10:40,019 every platform so that that gives us a 279 00:10:38,339 --> 00:10:42,240 bit of a standard form factor for a BMC 280 00:10:40,019 --> 00:10:45,120 and a standard connector 281 00:10:42,240 --> 00:10:47,120 um the architecture of of that um that 282 00:10:45,120 --> 00:10:49,800 standard so there's two parts there the 283 00:10:47,120 --> 00:10:51,420 dcscm is the module itself so The BMC 284 00:10:49,800 --> 00:10:53,820 module which has a size and a connector 285 00:10:51,420 --> 00:10:55,620 and filling the dcsci is the connector 286 00:10:53,820 --> 00:10:57,959 definition 287 00:10:55,620 --> 00:11:00,240 um so the intention here is they have 288 00:10:57,959 --> 00:11:02,880 your BMC you have all of these signaling 289 00:11:00,240 --> 00:11:04,860 go through this this DC SCI connector 290 00:11:02,880 --> 00:11:06,600 going to all of the bits that it's 291 00:11:04,860 --> 00:11:08,760 managing so your peripherals your board 292 00:11:06,600 --> 00:11:10,860 your high CPU and it's doing all of its 293 00:11:08,760 --> 00:11:13,800 temperature management CPU boot all that 294 00:11:10,860 --> 00:11:16,740 kind of thing over this one connector 295 00:11:13,800 --> 00:11:18,899 um this is what the connector looks like 296 00:11:16,740 --> 00:11:19,920 um again if you're sort of following the 297 00:11:18,899 --> 00:11:22,800 hardware side of things you might 298 
00:11:19,920 --> 00:11:25,800 recognize this from connectors such as 299 00:11:22,800 --> 00:11:25,800 the 300 00:11:26,760 --> 00:11:30,860 [Music] 301 00:11:28,920 --> 00:11:34,320 the SSS 302 00:11:30,860 --> 00:11:36,440 ffta1002 which is a oh my buttons came 303 00:11:34,320 --> 00:11:36,440 back 304 00:11:38,760 --> 00:11:42,720 or a little happening at once 305 00:11:40,620 --> 00:11:45,240 anyway 306 00:11:42,720 --> 00:11:47,040 it is the protocol agnostic multi-lane 307 00:11:45,240 --> 00:11:49,140 high-speed connector 308 00:11:47,040 --> 00:11:50,839 and when they say multi-lane they are 309 00:11:49,140 --> 00:11:53,100 not messing around 310 00:11:50,839 --> 00:11:54,480 so this is what's carried over that 311 00:11:53,100 --> 00:11:57,300 connector and this is essentially the 312 00:11:54,480 --> 00:11:59,940 stuff you need to to implement a BMC so 313 00:11:57,300 --> 00:12:02,220 all sorts of different signaling some of 314 00:11:59,940 --> 00:12:04,500 which is used on some platforms some 315 00:12:02,220 --> 00:12:06,180 won't be using some platforms 316 00:12:04,500 --> 00:12:08,579 um and that 317 00:12:06,180 --> 00:12:10,320 basically gives us a standard form 318 00:12:08,579 --> 00:12:13,200 factor to connect these kind of things 319 00:12:10,320 --> 00:12:14,940 into into the base platform and do all 320 00:12:13,200 --> 00:12:18,440 our servery bits 321 00:12:14,940 --> 00:12:21,779 so specs are all great this is DCM dcsci 322 00:12:18,440 --> 00:12:24,240 but we need an actual implementation and 323 00:12:21,779 --> 00:12:25,800 on our open source tilt here we want an 324 00:12:24,240 --> 00:12:28,320 open implementation 325 00:12:25,800 --> 00:12:29,579 and this is one of them so this is a 326 00:12:28,320 --> 00:12:31,980 board 327 00:12:29,579 --> 00:12:34,920 produced by a manufacturer called Ant 328 00:12:31,980 --> 00:12:35,760 micro they've designed and produced this 329 00:12:34,920 --> 00:12:38,279 board 330 00:12:35,760 --> 
00:12:42,060 called a dcsem board 331 00:12:38,279 --> 00:12:43,019 it is this dcsci connector down the 332 00:12:42,060 --> 00:12:43,680 bottom there 333 00:12:43,019 --> 00:12:45,720 um 334 00:12:43,680 --> 00:12:47,700 basically with all those lines going 335 00:12:45,720 --> 00:12:50,940 into an fpga 336 00:12:47,700 --> 00:12:54,360 um it has some on-board Ram so we've got 337 00:12:50,940 --> 00:12:56,459 500 512 megabytes of ram we have a 338 00:12:54,360 --> 00:12:59,040 network connector at the top we have a 339 00:12:56,459 --> 00:13:01,320 little bit of storage and some emmc 340 00:12:59,040 --> 00:13:04,380 device there and we have all the little 341 00:13:01,320 --> 00:13:05,760 bits required to to manage an fpga so we 342 00:13:04,380 --> 00:13:08,220 have a programming interface we have 343 00:13:05,760 --> 00:13:09,839 some USB in which we can interact with 344 00:13:08,220 --> 00:13:13,920 that fpga 345 00:13:09,839 --> 00:13:16,800 and of course everything's open here so 346 00:13:13,920 --> 00:13:18,959 these are the schematics of that board 347 00:13:16,800 --> 00:13:20,579 if you can find the bug there I'm very 348 00:13:18,959 --> 00:13:21,779 impressed and where were you five months 349 00:13:20,579 --> 00:13:22,620 ago 350 00:13:21,779 --> 00:13:24,959 um 351 00:13:22,620 --> 00:13:26,639 so again big thing is the the hardware 352 00:13:24,959 --> 00:13:29,279 is open the Hardware's out there we can 353 00:13:26,639 --> 00:13:30,660 see it we can investigate it we can 354 00:13:29,279 --> 00:13:32,339 potentially modify it and do our own 355 00:13:30,660 --> 00:13:34,200 thing if we want we're sort of not quite 356 00:13:32,339 --> 00:13:36,720 in that that business at the moment but 357 00:13:34,200 --> 00:13:38,459 whether that's again that the advantages 358 00:13:36,720 --> 00:13:41,240 of having things open for us are pretty 359 00:13:38,459 --> 00:13:41,240 huge in this case 360 00:13:41,820 --> 00:13:48,120 uh the the source as I said 
is online 361 00:13:43,800 --> 00:13:49,980 this is the repo for that device you can 362 00:13:48,120 --> 00:13:52,260 open pull requests you can do all sorts 363 00:13:49,980 --> 00:13:53,880 of things uh just I guess like coming 364 00:13:52,260 --> 00:13:55,380 from a software world it's kind of cool 365 00:13:53,880 --> 00:13:57,360 to see that we can we can have those 366 00:13:55,380 --> 00:14:00,360 kind of designs out there and usable for 367 00:13:57,360 --> 00:14:03,120 for this so back to our list we have a 368 00:14:00,360 --> 00:14:04,620 server and we have potentially a BMC 369 00:14:03,120 --> 00:14:06,839 Hardware I say potentially we'll get 370 00:14:04,620 --> 00:14:07,620 back to that in a bit 371 00:14:06,839 --> 00:14:10,820 um 372 00:14:07,620 --> 00:14:12,899 another another aspect I guess about our 373 00:14:10,820 --> 00:14:15,480 ac9227 going back to that original photo 374 00:14:12,899 --> 00:14:18,060 the top right here is where that BMC 375 00:14:15,480 --> 00:14:20,820 usually goes and we can zoom into that 376 00:14:18,060 --> 00:14:22,320 area and we find one of these friendly 377 00:14:20,820 --> 00:14:24,660 connectors 378 00:14:22,320 --> 00:14:27,060 um basically a massive pin grid with all 379 00:14:24,660 --> 00:14:28,380 the BMC signaling to that and again the 380 00:14:27,060 --> 00:14:33,740 astute hardware folks in the audience 381 00:14:28,380 --> 00:14:33,740 might have noticed that that is not that 382 00:14:34,320 --> 00:14:38,279 um so we have a bit of a problem here 383 00:14:35,399 --> 00:14:39,180 fortunately someone has already solved 384 00:14:38,279 --> 00:14:40,500 it for us 385 00:14:39,180 --> 00:14:43,519 um there is one more piece of Hardware 386 00:14:40,500 --> 00:14:46,320 again an open piece of Hardware which is 387 00:14:43,519 --> 00:14:49,199 this little guy so 388 00:14:46,320 --> 00:14:52,079 Big Board uh it's not that big kind of 389 00:14:49,199 --> 00:14:54,839 this week uh with one of those 
dcsci 390 00:14:52,079 --> 00:14:56,880 connectors and that massive pin grid on 391 00:14:54,839 --> 00:15:00,360 the bottom and that routes kind of 392 00:14:56,880 --> 00:15:02,639 between what the the ac922 server 393 00:15:00,360 --> 00:15:04,620 expects in terms of signaling and the DC 394 00:15:02,639 --> 00:15:06,959 SCI on the other side now it's not 395 00:15:04,620 --> 00:15:08,339 entirely one to one there's some signals 396 00:15:06,959 --> 00:15:10,680 that exist on one connector that don't 397 00:15:08,339 --> 00:15:12,240 exist in the other but we can kind of we 398 00:15:10,680 --> 00:15:15,839 can get around that 399 00:15:12,240 --> 00:15:19,860 um and this allows us to 400 00:15:15,839 --> 00:15:21,779 do this so this is our same ac922 again 401 00:15:19,860 --> 00:15:25,800 into poser and green at the bottom and 402 00:15:21,779 --> 00:15:29,459 that that dcscm board on top 403 00:15:25,800 --> 00:15:31,079 so now we have a server actually 404 00:15:29,459 --> 00:15:34,019 connected to 405 00:15:31,079 --> 00:15:35,160 our potentially a BMC Hardware now I did 406 00:15:34,019 --> 00:15:36,660 say I would go back to that potentially 407 00:15:35,160 --> 00:15:37,860 aspect 408 00:15:36,660 --> 00:15:39,480 um 409 00:15:37,860 --> 00:15:41,459 in 410 00:15:39,480 --> 00:15:44,399 if you're producing A system that uses a 411 00:15:41,459 --> 00:15:45,779 dcscm and or I guess any any BMC based 412 00:15:44,399 --> 00:15:48,839 systems you're going to have 413 00:15:45,779 --> 00:15:50,760 some specialized Hardware here so this 414 00:15:48,839 --> 00:15:54,480 is essentially what we would need to 415 00:15:50,760 --> 00:15:56,820 implement a dcscm module we've got our 416 00:15:54,480 --> 00:16:00,360 BMC with all of these kind of different 417 00:15:56,820 --> 00:16:02,519 signaling lines going out to our dcsci 418 00:16:00,360 --> 00:16:04,380 connector and some some on-board 419 00:16:02,519 --> 00:16:06,839 Hardware that's relevant to the BMC 
420 00:16:04,380 --> 00:16:07,980 itself The BMC needs to talk over the 421 00:16:06,839 --> 00:16:10,139 network for example it needs some 422 00:16:07,980 --> 00:16:11,100 storage so there's some 423 00:16:10,139 --> 00:16:14,699 um 424 00:16:11,100 --> 00:16:16,500 some specialized Hardware on the device 425 00:16:14,699 --> 00:16:18,000 there's a couple of vendors that 426 00:16:16,500 --> 00:16:19,380 actually make one of these system on 427 00:16:18,000 --> 00:16:20,060 chip designs 428 00:16:19,380 --> 00:16:22,920 um 429 00:16:20,060 --> 00:16:24,959 there is I guess two industry leaders 430 00:16:22,920 --> 00:16:28,079 that basically own the entire Market of 431 00:16:24,959 --> 00:16:30,779 what's on a BMC in any server that you 432 00:16:28,079 --> 00:16:33,600 buy now and they they produce 433 00:16:30,779 --> 00:16:35,940 essentially a little device has a little 434 00:16:33,600 --> 00:16:37,740 arm core on it and and all of the kind 435 00:16:35,940 --> 00:16:41,600 of the bits of Hardware that you need 436 00:16:37,740 --> 00:16:44,639 out of a BMC to talk things like i2c FSI 437 00:16:41,600 --> 00:16:46,259 LPC all the the interfaces to the rest 438 00:16:44,639 --> 00:16:47,820 of the system so those already exist in 439 00:16:46,259 --> 00:16:49,920 Silicon what we're doing is replacing it 440 00:16:47,820 --> 00:16:51,540 so we kind of need to re-implement not 441 00:16:49,920 --> 00:16:53,759 reinpoint we need to design something 442 00:16:51,540 --> 00:16:55,680 that works like this however what we 443 00:16:53,759 --> 00:16:59,459 have at the moment with that dcscm board 444 00:16:55,680 --> 00:17:03,720 is just an fpga right and all it has is 445 00:16:59,459 --> 00:17:07,679 IOS there's just lines going from the 446 00:17:03,720 --> 00:17:10,740 dcsui connector into unassigned iOS on 447 00:17:07,679 --> 00:17:13,740 the fpga there's nothing about this that 448 00:17:10,740 --> 00:17:14,760 makes it a BMC at this point 449 00:17:13,740 --> 
00:17:18,059 um 450 00:17:14,760 --> 00:17:20,160 and I guess more obviously we on a BMC 451 00:17:18,059 --> 00:17:22,140 we have an actual CPU core on it right 452 00:17:20,160 --> 00:17:25,500 we have an arm core on it we don't have 453 00:17:22,140 --> 00:17:29,160 that on rfpga we just got a a whole 454 00:17:25,500 --> 00:17:31,260 bunch of um like 75 000 uh logic units 455 00:17:29,160 --> 00:17:33,000 that we can turn into something we 456 00:17:31,260 --> 00:17:34,440 haven't turned anything in it turned it 457 00:17:33,000 --> 00:17:35,700 into anything yet 458 00:17:34,440 --> 00:17:38,039 so 459 00:17:35,700 --> 00:17:39,840 um existing about an arm core at the 460 00:17:38,039 --> 00:17:43,679 moment this is our Blank Slate to mess 461 00:17:39,840 --> 00:17:45,900 with we've got iOS no core nothing 462 00:17:43,679 --> 00:17:47,700 um so it has turned me the bits that 463 00:17:45,900 --> 00:17:50,820 turn that blank slate into some kind of 464 00:17:47,700 --> 00:17:53,160 BMC and for those kind of not familiar 465 00:17:50,820 --> 00:17:55,760 with the the fpga world we'll refer to 466 00:17:53,160 --> 00:17:55,760 that as Gateway 467 00:17:56,220 --> 00:17:59,940 or it's not quite software not quite 468 00:17:57,660 --> 00:18:03,120 Hardware but we we can write some some 469 00:17:59,940 --> 00:18:04,860 code that will return into the fpga bit 470 00:18:03,120 --> 00:18:06,840 stream that turns this thing into a 471 00:18:04,860 --> 00:18:08,880 usable device 472 00:18:06,840 --> 00:18:10,919 um so that's the Gateway itself is the 473 00:18:08,880 --> 00:18:13,320 very log of the vhdl whatever we compile 474 00:18:10,919 --> 00:18:16,320 into the the bits room defines the 475 00:18:13,320 --> 00:18:17,340 behavior of that device to do BMC like 476 00:18:16,320 --> 00:18:19,500 things 477 00:18:17,340 --> 00:18:21,900 now of course this is an open source 478 00:18:19,500 --> 00:18:24,240 conference so I want some open gateway 479 00:18:21,900 --> 
00:18:26,880 to do that 480 00:18:24,240 --> 00:18:28,799 um it's actually a a massively 481 00:18:26,880 --> 00:18:31,080 increasing amount of open Gateway around 482 00:18:28,799 --> 00:18:35,039 there's lots of resources about bits 483 00:18:31,080 --> 00:18:36,480 that we can reuse reapply into into our 484 00:18:35,039 --> 00:18:38,880 Blank Slate 485 00:18:36,480 --> 00:18:40,559 um and using a whole bunch of different 486 00:18:38,880 --> 00:18:43,140 development development models to do 487 00:18:40,559 --> 00:18:46,080 that some folks have written raw vhdl 488 00:18:43,140 --> 00:18:49,919 raw verilog some have written gen like 489 00:18:46,080 --> 00:18:51,600 schemes to generate that verilog out of 490 00:18:49,919 --> 00:18:54,240 high-level languages 491 00:18:51,600 --> 00:18:56,880 I'm going to go into that a bit uh maybe 492 00:18:54,240 --> 00:18:59,580 offline but there's certainly ways we 493 00:18:56,880 --> 00:19:01,559 can we take open source bits of code to 494 00:18:59,580 --> 00:19:03,240 generate behaviors for our our Blank 495 00:19:01,559 --> 00:19:05,520 Slate of an fpga 496 00:19:03,240 --> 00:19:07,980 so going through that first up we do 497 00:19:05,520 --> 00:19:09,840 need a CPU and um 498 00:19:07,980 --> 00:19:12,840 I can highly recommend this one so 499 00:19:09,840 --> 00:19:15,440 there's a basically a 500 00:19:12,840 --> 00:19:18,299 an open source CPU core called micro 501 00:19:15,440 --> 00:19:20,340 this really deserves its own talk as a 502 00:19:18,299 --> 00:19:23,340 massive amount of cool stuff in there 503 00:19:20,340 --> 00:19:25,580 it's a fully open open power instruction 504 00:19:23,340 --> 00:19:27,660 set architecture core 505 00:19:25,580 --> 00:19:30,299 it's just missing a couple of things and 506 00:19:27,660 --> 00:19:31,919 another like the vector like that 128 507 00:19:30,299 --> 00:19:33,900 bit wide registers and rigid 508 00:19:31,919 --> 00:19:35,700 instructions we don't need those at all 509 
00:19:33,900 --> 00:19:37,799 for our BMC we're not trying to produce 510 00:19:35,700 --> 00:19:39,120 a high performance computer on our BMC 511 00:19:37,799 --> 00:19:40,620 this is kind of 512 00:19:39,120 --> 00:19:42,360 the server itself is the high 513 00:19:40,620 --> 00:19:44,460 performance computer here 514 00:19:42,360 --> 00:19:47,640 um this is the Microwatt can be 515 00:19:44,460 --> 00:19:49,860 synthesized for Lattice and Xilinx FPGAs 516 00:19:47,640 --> 00:19:52,799 you can also run in sim but what it 517 00:19:49,860 --> 00:19:56,100 gives us essentially is a way to to drop 518 00:19:52,799 --> 00:19:58,080 a very usable core onto our FPGA device 519 00:19:56,100 --> 00:19:59,400 and have some sort of compute capacity 520 00:19:58,080 --> 00:20:01,740 behind that 521 00:19:59,400 --> 00:20:04,020 and super importantly the Linux port for 522 00:20:01,740 --> 00:20:06,120 this has already been done for us so we could 523 00:20:04,020 --> 00:20:08,580 drop a Microwatt core there and have 524 00:20:06,120 --> 00:20:11,160 Linux boot on that core with with fairly 525 00:20:08,580 --> 00:20:13,200 minimal effort so that's that's going to 526 00:20:11,160 --> 00:20:15,480 be our CPU core for this project 527 00:20:13,200 --> 00:20:18,780 we also need we've got a CPU we need 528 00:20:15,480 --> 00:20:20,760 some RAM so we need a DRAM controller 529 00:20:18,780 --> 00:20:24,539 um for that we're borrowing some code 530 00:20:20,760 --> 00:20:26,220 from the LiteX project so LiteX uh 531 00:20:24,539 --> 00:20:28,799 library of of 532 00:20:26,220 --> 00:20:30,720 um of FPGA components 533 00:20:28,799 --> 00:20:32,100 um one of which is the DRAM controller 534 00:20:30,720 --> 00:20:33,600 we can kind of that's already plugged 535 00:20:32,100 --> 00:20:35,220 into the Microwatt core for us so that's 536 00:20:33,600 --> 00:20:37,860 that's super 537 00:20:35,220 --> 00:20:41,700 uh we also have an Ethernet device on 538 00:20:37,860 --> 00:20:44,340 our FPGA we'll
also grab some some uh the 539 00:20:41,700 --> 00:20:46,980 the LiteEth module that can also be 540 00:20:44,340 --> 00:20:49,980 plugged into Microwatt uh this is a 541 00:20:46,980 --> 00:20:51,539 quite a simple Ethernet device uh which 542 00:20:49,980 --> 00:20:53,160 which already has this Linux support 543 00:20:51,539 --> 00:20:54,600 we're not again we're not going for 544 00:20:53,160 --> 00:20:56,640 super high performance networking here 545 00:20:54,600 --> 00:21:00,140 but it's enough to to communicate with 546 00:20:56,640 --> 00:21:00,140 our Microwatt CPU 547 00:21:02,400 --> 00:21:07,559 uh we need some storage you know to load 548 00:21:05,520 --> 00:21:08,580 our kernel and root file system like 549 00:21:07,559 --> 00:21:11,340 it's from 550 00:21:08,580 --> 00:21:14,100 all the kind of BMC software comes from 551 00:21:11,340 --> 00:21:16,320 our eMMC device 552 00:21:14,100 --> 00:21:18,059 and in order to talk to an eMMC device we 553 00:21:16,320 --> 00:21:20,100 need an eMMC controller and for that 554 00:21:18,059 --> 00:21:21,840 we're using more LiteX code the uh the 555 00:21:20,100 --> 00:21:22,919 LiteSD project 556 00:21:21,840 --> 00:21:26,700 um 557 00:21:22,919 --> 00:21:28,380 it has some Linux support uh 558 00:21:26,700 --> 00:21:30,600 what we can see here we've got an SD 559 00:21:28,380 --> 00:21:33,059 card type controller we're talking to an 560 00:21:30,600 --> 00:21:35,700 eMMC device so we needed to do a bit of 561 00:21:33,059 --> 00:21:38,159 um uh work unless that was Matt in the 562 00:21:35,700 --> 00:21:41,100 audience here uh making that compatible 563 00:21:38,159 --> 00:21:43,500 with an eMMC device and the Linux drivers 564 00:21:41,100 --> 00:21:45,539 that talk to it to actually support eMMC 565 00:21:43,500 --> 00:21:47,820 but once we kind of sorted out we just 566 00:21:45,539 --> 00:21:50,100 have the the Linux eMMC storage layer 567 00:21:47,820 --> 00:21:55,039 talking to this
controller talking to 568 00:21:50,100 --> 00:21:55,039 the the eMMC hardware on the board itself 569 00:21:56,159 --> 00:22:00,179 cool 570 00:21:58,260 --> 00:22:01,919 and of course one of my favorite things 571 00:22:00,179 --> 00:22:04,559 we have a UART 572 00:22:01,919 --> 00:22:06,480 um it's from the OpenCores project it 573 00:22:04,559 --> 00:22:07,140 just comes as a VHDL 574 00:22:06,480 --> 00:22:10,740 um 575 00:22:07,140 --> 00:22:12,840 it's a standard 16550 UART so of course 576 00:22:10,740 --> 00:22:15,179 it has Linux support uh one of the most 577 00:22:12,840 --> 00:22:17,580 common bits of hardware so that at this 578 00:22:15,179 --> 00:22:20,520 point that gives us our CPU core we've 579 00:22:17,580 --> 00:22:21,120 got RAM we've got network we've got 580 00:22:20,520 --> 00:22:23,280 um 581 00:22:21,120 --> 00:22:26,100 a uh 582 00:22:23,280 --> 00:22:27,299 uh storage got a serial console all that 583 00:22:26,100 --> 00:22:29,580 kind of stuff so it's at this point 584 00:22:27,299 --> 00:22:33,780 we're kind of we've got our server we've 585 00:22:29,580 --> 00:22:35,100 connected to uh our basic SoC so but 586 00:22:33,780 --> 00:22:38,159 it's not quite hardware we're kind of 587 00:22:35,100 --> 00:22:40,320 gateware and and and hardware distinction 588 00:22:38,159 --> 00:22:42,419 is a bit blurry at this point but we do 589 00:22:40,320 --> 00:22:45,380 have our our little chip we can run 590 00:22:42,419 --> 00:22:45,380 code on at that point 591 00:22:45,840 --> 00:22:51,299 now back to our diagram earlier 592 00:22:49,140 --> 00:22:53,940 um this is what we're trying to make 593 00:22:51,299 --> 00:22:55,740 um at one point we've kind of covered 594 00:22:53,940 --> 00:22:57,120 the left we still don't have any of 595 00:22:55,740 --> 00:22:58,679 these bits on the right here that would 596 00:22:57,120 --> 00:23:00,120 actually make it a BMC and they're kind 597 00:22:58,679 --> 00:23:02,340 of the interesting things for for what 598
00:23:00,120 --> 00:23:03,960 we're doing here is producing a BMC that 599 00:23:02,340 --> 00:23:05,880 can control the server 600 00:23:03,960 --> 00:23:07,980 so we're still missing the dedicated 601 00:23:05,880 --> 00:23:09,659 hardware for like the LPC and the FSI 602 00:23:07,980 --> 00:23:10,799 and the I2C 603 00:23:09,659 --> 00:23:12,780 um 604 00:23:10,799 --> 00:23:14,940 so we want to turn our SoC into 605 00:23:12,780 --> 00:23:17,520 something useful 606 00:23:14,940 --> 00:23:19,620 um so I grabbed a few other bits uh one 607 00:23:17,520 --> 00:23:22,260 of them being produced by 608 00:23:19,620 --> 00:23:25,440 um some folks at IBM under the OpenPOWER 609 00:23:22,260 --> 00:23:27,360 Foundation project which is a 610 00:23:25,440 --> 00:23:28,860 essentially an LPC controller which 611 00:23:27,360 --> 00:23:30,539 which is our main 612 00:23:28,860 --> 00:23:32,960 pretty important bus to be talking 613 00:23:30,539 --> 00:23:34,799 between the BMC and the the server 614 00:23:32,960 --> 00:23:36,360 mainly because it's used for the 615 00:23:34,799 --> 00:23:38,820 firmware transfer so when the when the 616 00:23:36,360 --> 00:23:41,460 server boots it needs to request its 617 00:23:38,820 --> 00:23:42,600 BIOS its low level firmware over LPC 618 00:23:41,460 --> 00:23:44,580 from 619 00:23:42,600 --> 00:23:46,140 from the BMC so it's pretty critical we 620 00:23:44,580 --> 00:23:47,159 have this working 621 00:23:46,140 --> 00:23:49,919 um 622 00:23:47,159 --> 00:23:53,520 the in our in our architecture here the 623 00:23:49,919 --> 00:23:56,100 BMC is the LPC device and the host CPU is 624 00:23:53,520 --> 00:23:58,020 the LPC controller so trying to do 625 00:23:56,100 --> 00:24:00,240 this in software is a bit tricky hence 626 00:23:58,020 --> 00:24:02,880 having actual dedicated bit of hardware 627 00:24:00,240 --> 00:24:04,620 on our FPGA to do that is pretty essential 628 00:24:02,880 --> 00:24:07,200 for that so it gives us our
firmware 629 00:24:04,620 --> 00:24:09,600 load it gives us a bit of a channel of 630 00:24:07,200 --> 00:24:12,720 communication between the the BMC and 631 00:24:09,600 --> 00:24:15,360 the the host in things like IPMI and and 632 00:24:12,720 --> 00:24:18,240 console host console as well so we're 633 00:24:15,360 --> 00:24:20,580 using the LPC peripheral for that 634 00:24:18,240 --> 00:24:22,559 we also have in the Microwatt 635 00:24:20,580 --> 00:24:24,240 definition itself there is a GPIO 636 00:24:22,559 --> 00:24:27,600 controller so that allows us to do all 637 00:24:24,240 --> 00:24:28,799 of the the kind of off to the side bits 638 00:24:27,600 --> 00:24:30,659 of implementing a server things like the 639 00:24:28,799 --> 00:24:32,460 power button that's basically just a 640 00:24:30,659 --> 00:24:34,860 GPIO that goes from from the button 641 00:24:32,460 --> 00:24:37,440 itself to the BMC that tells it to start 642 00:24:34,860 --> 00:24:39,240 booting the server so GPIOs for that 643 00:24:37,440 --> 00:24:41,580 the Microwatt code already had those 644 00:24:39,240 --> 00:24:43,200 present we just needed to add some some 645 00:24:41,580 --> 00:24:46,260 interrupt support for that it's pretty 646 00:24:43,200 --> 00:24:47,640 critical that we we have GPIO state 647 00:24:46,260 --> 00:24:50,039 changes actually interrupting the CPU 648 00:24:47,640 --> 00:24:53,780 rather than polling for that so the IRQ 649 00:24:50,039 --> 00:24:53,780 support that was all Uma you did the 650 00:24:54,500 --> 00:25:00,600 we now have the the GPIOs that support 651 00:24:58,260 --> 00:25:02,460 um things like our LEDs our fan presence 652 00:25:00,600 --> 00:25:06,679 all those bits there 653 00:25:02,460 --> 00:25:06,679 so going back to what we have so far 654 00:25:06,720 --> 00:25:11,340 this is our the system we've built out 655 00:25:09,240 --> 00:25:13,559 of those those bits 656 00:25:11,340 --> 00:25:16,080 um we're starting now to look a bit more 657
00:25:13,559 --> 00:25:19,200 like a BMC you know we have we have our 658 00:25:16,080 --> 00:25:20,820 core we have our RAM we have our bits 659 00:25:19,200 --> 00:25:23,820 um don't laugh at my 100 megahertz 660 00:25:20,820 --> 00:25:25,140 computer it's uh it's pretty good we'll 661 00:25:23,820 --> 00:25:25,860 see how we go 662 00:25:25,140 --> 00:25:27,360 um 663 00:25:25,860 --> 00:25:30,120 but we are missing a couple of devices 664 00:25:27,360 --> 00:25:33,779 that I I did mention earlier on uh and 665 00:25:30,120 --> 00:25:36,539 they're the I2C and the FSI buses 666 00:25:33,779 --> 00:25:41,340 gonna get to that 667 00:25:36,539 --> 00:25:43,260 um so I2C uh is used heavily on on BMC 668 00:25:41,340 --> 00:25:45,960 systems a lot of like the temperature 669 00:25:43,260 --> 00:25:49,500 sensors a lot of fans all those kind of 670 00:25:45,960 --> 00:25:53,400 bits uh even GPIO expanders so like more 671 00:25:49,500 --> 00:25:56,700 GPIOs behind an I2C bus implemented using 672 00:25:53,400 --> 00:25:59,700 an I2C connection between the BMC 673 00:25:56,700 --> 00:26:01,919 and we also have this FSI bus which good 674 00:25:59,700 --> 00:26:03,360 question uh is 675 00:26:01,919 --> 00:26:05,340 what does it stand for this week the 676 00:26:03,360 --> 00:26:07,200 FRU support interface or field service 677 00:26:05,340 --> 00:26:09,299 anyway whatever it's called this week 678 00:26:07,200 --> 00:26:12,720 it's an IBM design 679 00:26:09,299 --> 00:26:14,100 um it's uh it's required it's a bus 680 00:26:12,720 --> 00:26:16,440 between 681 00:26:14,100 --> 00:26:18,419 the BMC and the processors that 682 00:26:16,440 --> 00:26:21,360 implements kind of the really like the 683 00:26:18,419 --> 00:26:24,360 early firmware load early or low level 684 00:26:21,360 --> 00:26:26,520 control of of the the units inside a CPU 685 00:26:24,360 --> 00:26:28,620 that's a 686 00:26:26,520 --> 00:26:30,480 gross oversimplification but it's but 687
00:26:32,100 it's our way our magic way of talking to 688 00:26:30,480 --> 00:26:34,440 the CPU essentially and doing some 689 00:26:32,100 --> 00:26:37,440 really early early brute processes on 690 00:26:34,440 --> 00:26:39,360 the on the CPU there's also a few 691 00:26:37,440 --> 00:26:43,500 monitoring applications that happen over 692 00:26:39,360 --> 00:26:45,000 FSI uh I think we have access to the uh 693 00:26:43,500 --> 00:26:47,820 the on-chip control sort of thermal 694 00:26:45,000 --> 00:26:48,900 management over SSI in some cases 695 00:26:47,820 --> 00:26:50,400 um 696 00:26:48,900 --> 00:26:52,500 so we haven't implemented these in 697 00:26:50,400 --> 00:26:54,480 Gateway but fortunately 698 00:26:52,500 --> 00:26:57,299 um in Linux we have a couple of drivers 699 00:26:54,480 --> 00:27:01,799 that are handy for us here we can take a 700 00:26:57,299 --> 00:27:04,559 couple of gpios and use them for i2c 701 00:27:01,799 --> 00:27:07,260 just by manually toggling or reading 702 00:27:04,559 --> 00:27:08,940 those those gpos for FSI we have a 703 00:27:07,260 --> 00:27:10,740 back-end driver for the FSI bus that 704 00:27:08,940 --> 00:27:12,900 does the same thing we can we can just 705 00:27:10,740 --> 00:27:15,059 twiddle the gpos and get proper FSI 706 00:27:12,900 --> 00:27:17,520 transactions uh 707 00:27:15,059 --> 00:27:18,600 working over essentially a raw GPA 708 00:27:17,520 --> 00:27:20,580 device 709 00:27:18,600 --> 00:27:23,159 um 710 00:27:20,580 --> 00:27:24,059 yeah yeah good yeah exactly so these are 711 00:27:23,159 --> 00:27:26,100 used in production we're not just 712 00:27:24,059 --> 00:27:28,740 hacking things around here we are 713 00:27:26,100 --> 00:27:31,799 introducing a bit of technical debt and 714 00:27:28,740 --> 00:27:34,260 um unlike all technical debt 715 00:27:31,799 --> 00:27:35,580 it's true like all technical debt we're 716 00:27:34,260 --> 00:27:37,980 going to kick that can down the road for 717 00:27:35,580 
--> 00:27:40,200 a few slides time 718 00:27:37,980 --> 00:27:42,179 so what do we have we've got 719 00:27:40,200 --> 00:27:43,500 our FPGA this is looking like our 720 00:27:42,179 --> 00:27:47,159 earlier diagram we've got all these 721 00:27:43,500 --> 00:27:49,860 signaling happening in actual bits of um 722 00:27:47,159 --> 00:27:51,779 hardware uh that allows us to implement 723 00:27:49,860 --> 00:27:55,500 things like a BMC and we have our our 724 00:27:51,779 --> 00:27:57,299 sort of peripherals like IO MMC DRAM all 725 00:27:55,500 --> 00:28:00,539 those kind of bits are now implemented 726 00:27:57,299 --> 00:28:02,700 on our on our little device here 727 00:28:00,539 --> 00:28:06,480 so back to our slide we have a server 728 00:28:02,700 --> 00:28:07,799 connected to an actual BMC 729 00:28:06,480 --> 00:28:09,480 good stuff 730 00:28:07,799 --> 00:28:11,100 so we have hardware right we have our 731 00:28:09,480 --> 00:28:13,080 hardware sorted out it's not doing 732 00:28:11,100 --> 00:28:14,700 anything yet it's not I mean we've got 733 00:28:13,080 --> 00:28:16,980 the hardware plus the BMC we don't have 734 00:28:14,700 --> 00:28:20,520 a being a BMC 735 00:28:16,980 --> 00:28:21,960 so of course we need some open software 736 00:28:20,520 --> 00:28:24,080 um 737 00:28:21,960 --> 00:28:27,179 I guess going back a few levels 738 00:28:24,080 --> 00:28:28,799 Linux already exists on Microwatt that's 739 00:28:27,179 --> 00:28:31,620 not something we've done 740 00:28:28,799 --> 00:28:34,320 um Linux already exists on as a BMC in 741 00:28:31,620 --> 00:28:36,539 the OpenBMC space so our bring-up focus 742 00:28:34,320 --> 00:28:39,179 here is 743 00:28:36,539 --> 00:28:41,100 bringing up this DC-SCM board with the 744 00:28:39,179 --> 00:28:44,700 hardware that we've just produced in 745 00:28:41,100 --> 00:28:46,740 gateware and for doing that we have all 746 00:28:44,700 --> 00:28:47,760 manner of open source software which I'm 747 00:28:46,740 -->
00:28:48,600 sure you're sort of already familiar 748 00:28:47,760 --> 00:28:50,520 with 749 00:28:48,600 --> 00:28:52,799 and here's kind of I guess our story of 750 00:28:50,520 --> 00:28:54,299 how we're bringing that up 751 00:28:52,799 --> 00:28:57,299 so of course we need a bootloader for 752 00:28:54,299 --> 00:29:00,360 that we used industry standard U-Boot 753 00:28:57,299 --> 00:29:03,240 um the uh 754 00:29:00,360 --> 00:29:05,039 uh the the port I think it's upstream 755 00:29:03,240 --> 00:29:07,740 of U-Boot already supports the Microwatt 756 00:29:05,039 --> 00:29:09,960 core we just needed to add a few 757 00:29:07,740 --> 00:29:12,779 like a platform definition 758 00:29:09,960 --> 00:29:14,640 and some bits for reading you know boot 759 00:29:12,779 --> 00:29:16,679 payloads off eMMC and all that sort of 760 00:29:14,640 --> 00:29:17,940 thing so the Microwatt port's already there 761 00:29:16,679 --> 00:29:19,140 already has the Ethernet controller 762 00:29:17,940 --> 00:29:21,179 already has 763 00:29:19,140 --> 00:29:24,419 um you know all out a lot of bits there 764 00:29:21,179 --> 00:29:27,059 we just need to define the DC-SCM 765 00:29:24,419 --> 00:29:28,860 platform for U-Boot so we know which 766 00:29:27,059 --> 00:29:30,840 hardware is out there what can what can 767 00:29:28,860 --> 00:29:32,760 what it can interact with to actually 768 00:29:30,840 --> 00:29:34,020 start our bootloader on our and our new 769 00:29:32,760 --> 00:29:37,080 bit of hardware 770 00:29:34,020 --> 00:29:38,940 that work is was about 32 patches 771 00:29:37,080 --> 00:29:40,500 patches worth on the sort of the 772 00:29:38,940 --> 00:29:42,179 mainline U-Boot tree so not a whole 773 00:29:40,500 --> 00:29:44,820 lot but just our our platform 774 00:29:42,179 --> 00:29:46,380 definitions um getting that going 775 00:29:44,820 --> 00:29:47,539 once we have a bootloader we want to 776 00:29:46,380 --> 00:29:52,520 actually boot something 777 00:29:47,539 -->
00:29:52,520 and in our case that is 778 00:29:52,980 --> 00:29:57,659 suspense 779 00:29:55,740 --> 00:30:00,480 Linux 780 00:29:57,659 --> 00:30:02,279 um again Linux already boots on 781 00:30:00,480 --> 00:30:04,740 Microwatt we already have the core covered 782 00:30:02,279 --> 00:30:07,500 we just needed to add some of the the 783 00:30:04,740 --> 00:30:08,460 drivers for our BMC peripherals the 784 00:30:07,500 --> 00:30:10,919 actual 785 00:30:08,460 --> 00:30:12,659 um the platform definition again it's 786 00:30:10,919 --> 00:30:15,240 kind of similar to the efforts we did in 787 00:30:12,659 --> 00:30:17,520 U-Boot kind of replace with production 788 00:30:15,240 --> 00:30:20,340 Linux drivers for this 789 00:30:17,520 --> 00:30:24,720 um so we have about 37 patches in total 790 00:30:20,340 --> 00:30:26,460 for the Linux port of our DC-SCM board 791 00:30:24,720 --> 00:30:28,620 um and and nothing too controversial 792 00:30:26,460 --> 00:30:30,539 there just again some device trees some 793 00:30:28,620 --> 00:30:32,159 some bits of driver plumbing and that 794 00:30:30,539 --> 00:30:34,679 sort of thing 795 00:30:32,159 --> 00:30:35,520 so we've got our OS layer 796 00:30:34,679 --> 00:30:37,380 um 797 00:30:35,520 --> 00:30:39,120 we have our server connected to our 798 00:30:37,380 --> 00:30:40,380 actual BMC and now we're booting Linux 799 00:30:39,120 --> 00:30:41,820 and of course 800 00:30:40,380 --> 00:30:43,260 if you boot Linux on something we'll 801 00:30:41,820 --> 00:30:47,340 benchmark it right 802 00:30:43,260 --> 00:30:49,380 so back to our 100 megahertz computer 803 00:30:47,340 --> 00:30:51,299 um on our 804 00:30:49,380 --> 00:30:54,480 the numbers in the kind of the the 805 00:30:51,299 --> 00:30:57,539 fourth row are the benchmarks from the 806 00:30:54,480 --> 00:31:00,240 existing BMC on that server which is an 807 00:30:57,539 --> 00:31:02,100 ARM core running at 800 megahertz the 808 00:31:00,240 --> 00:31:04,679 arrows
indicate whether higher numbers 809 00:31:02,100 --> 00:31:07,799 are better or worse and our relative 810 00:31:04,679 --> 00:31:10,200 column on the the right gives us where 811 00:31:07,799 --> 00:31:11,880 one is equivalent numbers higher 812 00:31:10,200 --> 00:31:13,740 than one are where the Microwatt core 813 00:31:11,880 --> 00:31:15,299 is doing better numbers below one is 814 00:31:13,740 --> 00:31:17,159 where we're not quite there so kind of a 815 00:31:15,299 --> 00:31:18,600 percentage of of what's in the current 816 00:31:17,159 --> 00:31:20,700 production machine 817 00:31:18,600 --> 00:31:22,440 uh and this is pretty cool 818 00:31:20,700 --> 00:31:24,059 um we're we're running an FPGA core 819 00:31:22,440 --> 00:31:27,419 running 100 megahertz 820 00:31:24,059 --> 00:31:29,940 um we're doing better in some benchmarks 821 00:31:27,419 --> 00:31:32,399 um which is kind of cool uh we're doing 822 00:31:29,940 --> 00:31:33,600 absolutely awfully on our null syscall 823 00:31:32,399 --> 00:31:34,679 benchmark 824 00:31:33,600 --> 00:31:36,720 um 825 00:31:34,679 --> 00:31:38,100 and my theory on this is that that we're 826 00:31:36,720 --> 00:31:40,620 missing out on some optimization 827 00:31:38,100 --> 00:31:42,899 required for the null syscall but we can 828 00:31:40,620 --> 00:31:45,360 we can ignore that for now turns out it 829 00:31:42,899 --> 00:31:47,640 wasn't wasn't super important but so we 830 00:31:45,360 --> 00:31:50,399 have decent 831 00:31:47,640 --> 00:31:52,440 um kind of IPC type performance OpenBMC 832 00:31:50,399 --> 00:31:55,740 which is our BMC payload we'll get into 833 00:31:52,440 --> 00:31:57,120 that later uh is very heavy on IPC so 834 00:31:55,740 --> 00:32:00,600 that's kind of the things we're focusing 835 00:31:57,120 --> 00:32:02,460 on so our message passing uh benchmarks 836 00:32:00,600 --> 00:32:04,559 pretty good 837 00:32:02,460 --> 00:32:07,559 um we have some excellent speed in in 838 00:32:04,559 -->
00:32:08,220 memory access which is also great 839 00:32:07,559 --> 00:32:12,480 um 840 00:32:08,220 --> 00:32:14,340 the network performance at about 62% is 841 00:32:12,480 --> 00:32:15,960 not fantastic but it's not super 842 00:32:14,340 --> 00:32:17,700 critical and we're not doing a huge amount 843 00:32:15,960 --> 00:32:18,779 of data transfer over the network to our 844 00:32:17,700 --> 00:32:20,100 BMC 845 00:32:18,779 --> 00:32:22,260 uh 846 00:32:20,100 --> 00:32:24,779 we have a few artificial benchmarks just 847 00:32:22,260 --> 00:32:27,360 the boot time of an OpenBMC system 848 00:32:24,779 --> 00:32:28,980 um which are about parity on 849 00:32:27,360 --> 00:32:30,360 um but we did have quite a cut down 850 00:32:28,980 --> 00:32:34,679 system at that point 851 00:32:30,360 --> 00:32:37,320 uh and OpenBMC being pretty heavy on 852 00:32:34,679 --> 00:32:38,580 the IPC uses D-Bus a lot for its 853 00:32:37,320 --> 00:32:40,679 internal communication 854 00:32:38,580 --> 00:32:43,080 and we're kind of a bit this is that 855 00:32:40,679 --> 00:32:45,360 this is my concerning line here our 37% 856 00:32:43,080 --> 00:32:46,140 performance on on D-Bus 857 00:32:45,360 --> 00:32:47,460 um 858 00:32:46,140 --> 00:32:48,659 everything else is kind of cool we've 859 00:32:47,460 --> 00:32:51,720 got we've got network we've got flash 860 00:32:48,659 --> 00:32:54,360 we've got system call behavior on our 861 00:32:51,720 --> 00:32:56,340 Microwatt core running at one eighth of the 862 00:32:54,360 --> 00:32:58,320 clock speed of the thing we're replacing 863 00:32:56,340 --> 00:33:02,580 so that was pretty cool 864 00:32:58,320 --> 00:33:05,100 um as part of the project we kind of um 865 00:33:02,580 --> 00:33:06,240 we did this about midway through to 866 00:33:05,100 --> 00:33:07,320 figure out whether we should keep going 867 00:33:06,240 --> 00:33:09,720 or not 868 00:33:07,320 --> 00:33:11,820 um and and this this was quite 869
encouraging I was I guess you know ready 870 00:33:11,820 --> 00:33:16,620 to uh we can't do it but you know this 871 00:33:14,519 --> 00:33:18,419 table was was kind of handy to provide a 872 00:33:16,620 --> 00:33:21,200 bit of encouragement 873 00:33:18,419 --> 00:33:24,380 so um using using a bit of this data we 874 00:33:21,200 --> 00:33:24,380 continued along 875 00:33:24,720 --> 00:33:31,019 uh for some work on OpenBMC so this is 876 00:33:30,120 --> 00:33:36,840 um 877 00:33:31,019 --> 00:33:40,260 uh a open implementation of a BMC but 878 00:33:36,840 --> 00:33:43,500 nothing new here the the existing AC922 879 00:33:40,260 --> 00:33:45,539 BMC already supports OpenBMC the code's 880 00:33:43,500 --> 00:33:47,640 all upstream we don't need to worry too 881 00:33:45,539 --> 00:33:49,620 much about that side of things 882 00:33:47,640 --> 00:33:52,140 what we do need to do is provide the 883 00:33:49,620 --> 00:33:55,679 platform definition that boots OpenBMC 884 00:33:52,140 --> 00:33:57,419 on our new bit of DC-SCM hardware 885 00:33:55,679 --> 00:34:00,419 um so 886 00:33:57,419 --> 00:34:02,640 I think every upstream OpenBMC platform 887 00:34:00,419 --> 00:34:05,000 is currently on the ARM architecture 888 00:34:02,640 --> 00:34:08,760 we had to have a um 889 00:34:05,000 --> 00:34:10,859 uh Power support for that now given 890 00:34:08,760 --> 00:34:12,899 our Microwatt core is a Power architecture 891 00:34:10,859 --> 00:34:14,159 it's not an ARM device so we had to do 892 00:34:12,899 --> 00:34:17,399 the um 893 00:34:14,159 --> 00:34:19,919 the the bits required to make a Power 894 00:34:17,399 --> 00:34:21,200 based OpenBMC system like I said 895 00:34:19,919 --> 00:34:24,240 platform definition 896 00:34:21,200 --> 00:34:26,099 a little a couple of patches to the 897 00:34:24,240 --> 00:34:28,260 toolchain to allow it to support the the 898 00:34:26,099 --> 00:34:29,700 Microwatt architecture because we're 899
missing those vector registers we have 900 00:34:29,700 --> 00:34:33,419 to uh patch the toolchain very slightly 901 00:34:32,220 --> 00:34:37,440 to do that 902 00:34:33,419 --> 00:34:39,060 but also trimmed down OpenBMC a little 903 00:34:37,440 --> 00:34:40,940 um removing a few of the services because 904 00:34:39,060 --> 00:34:43,139 we're at such a small CPU in this case 905 00:34:40,940 --> 00:34:44,940 removing some of the services that we're 906 00:34:43,139 --> 00:34:47,520 certainly not going to need like some of 907 00:34:44,940 --> 00:34:49,080 the debug support we can we can pull out 908 00:34:47,520 --> 00:34:50,940 so that we have something that's 909 00:34:49,080 --> 00:34:52,740 actually going to boot and you know 910 00:34:50,940 --> 00:34:54,540 provide some decent performance on 100 911 00:34:52,740 --> 00:34:58,680 megahertz CPU 912 00:34:54,540 --> 00:35:01,560 that that base support of getting 913 00:34:58,680 --> 00:35:05,460 just kind of the the OpenBMC system 914 00:35:01,560 --> 00:35:07,619 itself booting was about 19 patches 915 00:35:05,460 --> 00:35:09,060 um and that's that's kind of getting the 916 00:35:07,619 --> 00:35:10,920 thing booting to shell not talking to 917 00:35:09,060 --> 00:35:13,320 the host yet but getting an OpenBMC 918 00:35:10,920 --> 00:35:16,020 user space all all booted at that stage 919 00:35:13,320 --> 00:35:18,540 so we're reviewing our progress here we 920 00:35:16,020 --> 00:35:20,520 have a server connected to an actual BMC 921 00:35:18,540 --> 00:35:23,040 it's booting Linux and it's running 922 00:35:20,520 --> 00:35:24,780 actual BMC software at this point which 923 00:35:23,040 --> 00:35:26,520 is kind of cool 924 00:35:24,780 --> 00:35:29,220 um we're at the point where we can 925 00:35:26,520 --> 00:35:32,880 actually access the web page exported by 926 00:35:29,220 --> 00:35:35,099 our FPGA BMC this is the standard OpenBMC 927 00:35:32,880 --> 00:35:37,260 UI it already existed we're not
928 00:35:35,099 --> 00:35:39,780 doing any of this so this is just 929 00:35:37,260 --> 00:35:43,260 um the the standard OpenBMC software 930 00:35:39,780 --> 00:35:45,000 running on the port of our DC-SCM FPGA 931 00:35:43,260 --> 00:35:46,980 hardware 932 00:35:45,000 --> 00:35:48,540 um so this is the the default page you 933 00:35:46,980 --> 00:35:50,400 get when you visit it 934 00:35:48,540 --> 00:35:51,720 um the 100 megahertziness is is 935 00:35:50,400 --> 00:35:54,060 definitely showing at this point the web 936 00:35:51,720 --> 00:35:56,280 page takes is not quite as snappy as you 937 00:35:54,060 --> 00:35:58,380 expect for for one of these things but 938 00:35:56,280 --> 00:36:00,060 it's it's usable tolerable we're still 939 00:35:58,380 --> 00:36:01,800 missing a lot of things we don't have 940 00:36:00,060 --> 00:36:04,800 any inventory data populated on the 941 00:36:01,800 --> 00:36:08,460 system we're not pulling much out from 942 00:36:04,800 --> 00:36:10,200 the uh from the core but at least we 943 00:36:08,460 --> 00:36:12,359 have our web page here we can do some 944 00:36:10,200 --> 00:36:14,160 basic things on that including 945 00:36:12,359 --> 00:36:16,800 uh checking out some of the sensors so 946 00:36:14,160 --> 00:36:19,020 this was without doing much on the host 947 00:36:16,800 --> 00:36:21,839 bring up side we already have our BMC 948 00:36:19,020 --> 00:36:23,579 definition for the AC922 we just ported 949 00:36:21,839 --> 00:36:25,859 it and we kind of have all our our 950 00:36:23,579 --> 00:36:28,520 sensors listed ready to go 951 00:36:25,859 --> 00:36:31,079 so we've got essentially our BMC running 952 00:36:28,520 --> 00:36:33,020 but not talking to much at the moment so 953 00:36:31,079 --> 00:36:35,820 the next phase of bring up 954 00:36:33,020 --> 00:36:38,220 is getting it communicating with our 955 00:36:35,820 --> 00:36:40,220 open power so our open power BMC 956 00:36:38,220 --> 00:36:42,780 communicating with our open
power 957 00:36:40,220 --> 00:36:45,480 server itself 958 00:36:42,780 --> 00:36:47,400 um again important aspect of that is the 959 00:36:45,480 --> 00:36:50,040 firmware for an OpenPOWER machine is 960 00:36:47,400 --> 00:36:51,720 all open we can refer to it we can refer 961 00:36:50,040 --> 00:36:53,460 to the documentation we can see the bits 962 00:36:51,720 --> 00:36:58,079 that are out there when we're doing our 963 00:36:53,460 --> 00:36:59,700 our BMC bring up now like I said the the 964 00:36:58,079 --> 00:37:02,700 platform support is already there we're 965 00:36:59,700 --> 00:37:05,099 just doing the the DC-SCM parts of that 966 00:37:02,700 --> 00:37:06,480 and that meant that we can you know we 967 00:37:05,099 --> 00:37:07,980 have access to the code we can modify it 968 00:37:06,480 --> 00:37:11,640 we can change the platform definitions 969 00:37:07,980 --> 00:37:12,960 to suit our new BMC device 970 00:37:11,640 --> 00:37:15,839 um so what we're doing here is taking 971 00:37:12,960 --> 00:37:18,359 our OpenPOWER port uh sorry OpenPOWER 972 00:37:15,839 --> 00:37:21,000 board applying that to our new OpenBMC 973 00:37:18,359 --> 00:37:24,540 platform now it's already a thing so the 974 00:37:21,000 --> 00:37:27,780 the AC922 BMC like I said before is is 975 00:37:24,540 --> 00:37:30,240 is an upstream device what we're doing 976 00:37:27,780 --> 00:37:31,859 here is our all of our lots of platform 977 00:37:30,240 --> 00:37:34,380 definition bits that get it running on 978 00:37:31,859 --> 00:37:38,460 Microwatt plus the BMC hardware plus 979 00:37:34,380 --> 00:37:39,960 all that using the existing AC922 BMC 980 00:37:38,460 --> 00:37:42,839 definitions 981 00:37:39,960 --> 00:37:45,240 um unfortunately it's kind of not that 982 00:37:42,839 --> 00:37:46,560 modular on the hardware there's a lot of 983 00:37:45,240 --> 00:37:48,960 um 984 00:37:46,560 --> 00:37:50,880 coded I hard-coded things about paths 985 00:37:48,960 --> 00:37:52,140 in
sysfs for where you find your your 986 00:37:50,880 --> 00:37:54,119 temperature sensors and that sort of 987 00:37:52,140 --> 00:37:56,940 thing which was not not great there's 988 00:37:54,119 --> 00:37:59,760 some awful hacks in in our port 989 00:37:56,940 --> 00:38:01,619 um we had to fake up a few GPIOs where 990 00:37:59,760 --> 00:38:03,180 they didn't exist through that connector 991 00:38:01,619 --> 00:38:05,040 for example 992 00:38:03,180 --> 00:38:07,320 um we had to extend a few timeouts we're 993 00:38:05,040 --> 00:38:09,540 running you know pretty slow so we had 994 00:38:07,320 --> 00:38:10,560 to stretch a few limits there 995 00:38:09,540 --> 00:38:12,680 um 996 00:38:10,560 --> 00:38:15,540 the I guess our 997 00:38:12,680 --> 00:38:17,820 the talking between the two machines 998 00:38:15,540 --> 00:38:19,920 support was about nine patches to OpenBMC 999 00:38:17,820 --> 00:38:21,180 in total 1000 00:38:19,920 --> 00:38:23,520 um 1001 00:38:21,180 --> 00:38:25,440 it's not a lot I guess um and the major 1002 00:38:23,520 --> 00:38:28,320 factor in that is that we're starting 1003 00:38:25,440 --> 00:38:31,800 with an open base on the server side and 1004 00:38:28,320 --> 00:38:33,839 an open base on the uh the OpenBMC side 1005 00:38:31,800 --> 00:38:36,180 um one of the cool things is now we're 1006 00:38:33,839 --> 00:38:37,920 dealing with everything open 1007 00:38:36,180 --> 00:38:38,940 um we've got our open hardware we can 1008 00:38:37,920 --> 00:38:40,079 look at 1009 00:38:38,940 --> 00:38:42,540 um I guess it's like any sort of 1010 00:38:40,079 --> 00:38:44,040 standard bring up project where the key 1011 00:38:42,540 --> 00:38:45,660 difference is that if you find a bug in 1012 00:38:44,040 --> 00:38:48,119 the hardware the hardware folks don't 1013 00:38:45,660 --> 00:38:50,640 just tell you to get bent uh we can 1014 00:38:48,119 --> 00:38:53,160 actually work on fixing that and in one 1015 00:38:50,640 --> 00:38:54,540 case we
found an issue from our original 1016 00:38:53,160 --> 00:38:56,760 schematic 1017 00:38:54,540 --> 00:38:58,200 um where there was apparently hardware 1018 00:38:56,760 --> 00:39:00,599 folks don't do off-by-ones they have 1019 00:38:58,200 --> 00:39:04,020 like twos we had some of the signals 1020 00:39:00,599 --> 00:39:07,320 routed a little incorrectly we can file 1021 00:39:04,020 --> 00:39:09,420 a bug and about two weeks later the 1022 00:39:07,320 --> 00:39:12,060 Antmicro folks who own the repo had fixed it in 1023 00:39:09,420 --> 00:39:12,960 their designs now of course 1024 00:39:12,060 --> 00:39:14,579 um 1025 00:39:12,960 --> 00:39:16,440 fixing their designs is great but we 1026 00:39:14,579 --> 00:39:17,880 still have our physical piece of 1027 00:39:16,440 --> 00:39:20,220 hardware which is not fixed so we had to 1028 00:39:17,880 --> 00:39:21,780 do some some rewiring and that sort of 1029 00:39:20,220 --> 00:39:23,460 thing but again since we can see the 1030 00:39:21,780 --> 00:39:25,260 schematic we can see the board layouts 1031 00:39:23,460 --> 00:39:27,060 we can we can do those sorts of fixes 1032 00:39:25,260 --> 00:39:30,599 without having too much involvement with 1033 00:39:27,060 --> 00:39:33,180 with the the upstream folks if necessary 1034 00:39:30,599 --> 00:39:34,680 so once fixing all of these once kind of 1035 00:39:33,180 --> 00:39:37,800 doing some some little bits of gateware 1036 00:39:34,680 --> 00:39:41,220 rewrites uh we actually had 1037 00:39:37,800 --> 00:39:44,400 our BMC ready to go 1038 00:39:41,220 --> 00:39:46,680 we can go to our page on the BMC web 1039 00:39:44,400 --> 00:39:48,060 interface and click the the power-on 1040 00:39:46,680 --> 00:39:49,680 button 1041 00:39:48,060 --> 00:39:51,420 um now after an extraordinarily long 1042 00:39:49,680 --> 00:39:55,380 amount of time 1043 00:39:51,420 --> 00:39:58,859 we get this uh so this is on our on your 1044 00:39:55,380 --> 00:40:01,079 left-hand side we have the
console on 1045 00:39:58,859 --> 00:40:04,260 the server itself on our right-hand side 1046 00:40:01,079 --> 00:40:07,200 we have a console of the BMC just a 1047 00:40:04,260 --> 00:40:08,520 little comparison on this side we have 1048 00:40:07,200 --> 00:40:10,320 our 1049 00:40:08,520 --> 00:40:14,339 just just the top of the CPU information 1050 00:40:10,320 --> 00:40:16,800 file uh our POWER9 CPUs of which we have 1051 00:40:14,339 --> 00:40:19,079 128 1052 00:40:16,800 --> 00:40:21,660 on the right-hand side we have our 1053 00:40:19,079 --> 00:40:24,060 little that's the entire cpuinfo on our 1054 00:40:21,660 --> 00:40:25,440 BMC but the really cool thing is that 1055 00:40:24,060 --> 00:40:26,820 now we've got the same architecture 1056 00:40:25,440 --> 00:40:29,339 across both of these both OpenPOWER 1057 00:40:26,820 --> 00:40:30,240 machines one supercomputer one tiny 1058 00:40:29,339 --> 00:40:32,700 computer 1059 00:40:30,240 --> 00:40:35,400 uh 1060 00:40:32,700 --> 00:40:37,500 so kind of tracking our thing here we've 1061 00:40:35,400 --> 00:40:39,900 got our server it's connected through 1062 00:40:37,500 --> 00:40:42,000 that interposer to the actual BMC BMC is 1063 00:40:39,900 --> 00:40:44,400 booting Linux BMC is running BMC 1064 00:40:42,000 --> 00:40:46,320 software and 1065 00:40:44,400 --> 00:40:48,359 it can boot the server and even better 1066 00:40:46,320 --> 00:40:51,240 it can boot the server running an 1067 00:40:48,359 --> 00:40:53,640 entirely open source firmware OS there's 1068 00:40:51,240 --> 00:40:54,960 nothing in the the stack here that's 1069 00:40:53,640 --> 00:40:56,760 closed source 1070 00:40:54,960 --> 00:40:59,220 thank you 1071 00:40:56,760 --> 00:41:01,079 anytime you just yet I did say literally 1072 00:40:59,220 --> 00:41:03,300 everything is open there's one component 1073 00:41:01,079 --> 00:41:05,339 one one little thing which is the the 1074 00:41:03,300 --> 00:41:08,480 toolchain required to compile
the gateware 1075 00:41:05,339 --> 00:41:12,599 for that particular FPGA is not yet open 1076 00:41:08,480 --> 00:41:15,480 uh there are other FPGAs that have uh 1077 00:41:12,599 --> 00:41:17,820 support in the open source uh uh 1078 00:41:15,480 --> 00:41:20,520 toolchain and there's also some effort in in 1079 00:41:17,820 --> 00:41:23,460 adding the Xilinx uh 1080 00:41:20,520 --> 00:41:24,839 support into the existing uh place and 1081 00:41:23,460 --> 00:41:27,180 route 1082 00:41:24,839 --> 00:41:29,160 tools so that's kind of I guess a bit of 1083 00:41:27,180 --> 00:41:31,320 a bit of prospect of the future so yes 1084 00:41:29,160 --> 00:41:33,839 yeah 1085 00:41:31,320 --> 00:41:34,800 um a couple of a couple of not so great 1086 00:41:33,839 --> 00:41:35,880 things 1087 00:41:34,800 --> 00:41:38,280 um 1088 00:41:35,880 --> 00:41:41,460 it's a little slow we are running on a 100 1089 00:41:38,280 --> 00:41:44,220 megahertz computer here um 1090 00:41:41,460 --> 00:41:46,320 interestingly uh we did kick that 1091 00:41:44,220 --> 00:41:48,900 technical debt down to this slide in 1092 00:41:46,320 --> 00:41:51,780 which we have about 30 to 50 percent of 1093 00:41:48,900 --> 00:41:54,980 our CPU usage is in just doing the I2C 1094 00:41:51,780 --> 00:41:58,560 and the FSI so just twiddling those GPIOs 1095 00:41:54,980 --> 00:42:00,060 is costing us a lot of time which is 1096 00:41:58,560 --> 00:42:02,700 kind of good news in that it's it's a 1097 00:42:00,060 --> 00:42:04,859 fairly linear fix to get some gateware 1098 00:42:02,700 --> 00:42:09,060 now for the I2C bits and the FSI bits 1099 00:42:04,859 --> 00:42:11,940 as well to reduce our CPU load by 32 so 1100 00:42:09,060 --> 00:42:12,780 that's kind of cool so that's our that's 1101 00:42:11,940 --> 00:42:14,700 our 1102 00:42:12,780 --> 00:42:16,140 kind of thing there it was a pretty 1103 00:42:14,700 --> 00:42:17,579 quick bring-up project so we weren't 1104 00:42:16,140 --> 00:42:20,160 going
through and doing a lot of 1105 00:42:17,579 --> 00:42:21,000 polished work um at least I wasn't 1106 00:42:20,160 --> 00:42:23,160 um 1107 00:42:21,000 --> 00:42:25,440 so there's a bit of kind of edges that 1108 00:42:23,160 --> 00:42:27,240 we can clean up and and 1109 00:42:25,440 --> 00:42:28,680 sort of a bit of polish we can put on it 1110 00:42:27,240 --> 00:42:29,940 but it's certainly we've shown that it's 1111 00:42:28,680 --> 00:42:31,920 feasible 1112 00:42:29,940 --> 00:42:34,980 um we showed there's some some low-hanging 1113 00:42:31,920 --> 00:42:37,980 fruit that we can work on to fix uh and 1114 00:42:34,980 --> 00:42:39,300 we have our entirely open stack 1115 00:42:37,980 --> 00:42:42,420 um again just reviewing what we've got 1116 00:42:39,300 --> 00:42:45,619 here we've got our core OS everything 1117 00:42:42,420 --> 00:42:45,619 open source 1118 00:42:46,140 --> 00:42:51,320 cool 1119 00:42:48,359 --> 00:42:51,320 I just got it 1120 00:42:55,560 --> 00:43:01,619 yes yeah I'll take some questions uh 1121 00:42:59,220 --> 00:43:03,980 as time's running up maybe just one 1122 00:43:01,619 --> 00:43:03,980 question 1123 00:43:04,520 --> 00:43:10,800 sorry I've got some bad news for you uh 1124 00:43:08,280 --> 00:43:14,579 any plans for incorporating 1125 00:43:10,800 --> 00:43:16,619 um uh some kind of video input so you 1126 00:43:14,579 --> 00:43:18,619 can have that on the console or is that 1127 00:43:16,619 --> 00:43:22,200 not not enough pins on the FPGA 1128 00:43:18,619 --> 00:43:23,040 certainly there is enough pins 1129 00:43:22,200 --> 00:43:27,900 um 1130 00:43:23,040 --> 00:43:30,720 it's my concern with that is the amount 1131 00:43:27,900 --> 00:43:33,200 of gateware the amount of FPGA usage 1132 00:43:30,720 --> 00:43:35,220 required to re-encode a video signal 1133 00:43:33,200 --> 00:43:37,859 there's a few 1134 00:43:35,220 --> 00:43:41,040 the way that's done on on um 1135 00:43:37,859 --> 00:43:42,720 on the BMC that's
already there is that 1136 00:43:41,040 --> 00:43:46,260 there's a PCI device which is a video 1137 00:43:42,720 --> 00:43:47,940 card on the BMC itself which can either 1138 00:43:46,260 --> 00:43:49,500 output to a physical video connector or 1139 00:43:47,940 --> 00:43:50,640 it can re-encode that back into 1140 00:43:49,500 --> 00:43:51,839 something you can stream over the 1141 00:43:50,640 --> 00:43:54,119 network 1142 00:43:51,839 --> 00:43:57,000 um that's going to going to take a lot 1143 00:43:54,119 --> 00:43:59,400 of LUT usage to to get to that point 1144 00:43:57,000 --> 00:44:00,660 um and I guess I guess the folks on the 1145 00:43:59,400 --> 00:44:02,579 video encoding side would certainly have 1146 00:44:00,660 --> 00:44:05,359 a better idea of how much is used about 1147 00:44:02,579 --> 00:44:05,359 are we 1148 00:44:06,200 --> 00:44:11,700 uh yeah my my worst I love the UARTs not 1149 00:44:10,260 --> 00:44:14,040 so much the video output 1150 00:44:11,700 --> 00:44:16,859 um but uh yeah I think I think it's a 1151 00:44:14,040 --> 00:44:18,960 pretty commonly used sort of usage model 1152 00:44:16,859 --> 00:44:20,760 in getting the video shared over the 1153 00:44:18,960 --> 00:44:22,140 network from there but um yeah we 1154 00:44:20,760 --> 00:44:24,240 certainly don't have them implemented 1155 00:44:22,140 --> 00:44:28,619 but again open source there's certainly scope 1156 00:44:24,240 --> 00:44:30,720 to grab some some uh some VHDL that 1157 00:44:28,619 --> 00:44:32,280 implements a video device and kind of 1158 00:44:30,720 --> 00:44:34,440 wrangling that into the into the thing 1159 00:44:32,280 --> 00:44:36,900 we might be struggling with encoding 1160 00:44:34,440 --> 00:44:38,460 that over the network at 100 megahertz 1161 00:44:36,900 --> 00:44:40,380 um but you know if there's there's 1162 00:44:38,460 --> 00:44:42,660 clever hardware engineers around it 1163 00:44:40,380 --> 00:44:45,780 could kind of handle all of this in in 1164
00:44:42,660 --> 00:44:47,579 the gateware itself which could be cool 1165 00:44:45,780 --> 00:44:49,920 awesome yes 1166 00:44:47,579 --> 00:44:52,260 he's ready for the mic 1167 00:44:49,920 --> 00:44:54,599 okay we'll chat afterwards awesome I'll 1168 00:44:52,260 --> 00:44:57,000 be around uh for the entire conference 1169 00:44:54,599 --> 00:44:58,920 I'm happy to chat happy to geek out 1170 00:44:57,000 --> 00:44:59,579 about any of the bits here 1171 00:44:58,920 --> 00:45:03,780 um 1172 00:44:59,579 --> 00:45:05,520 but uh here's our resources uh a lot of 1173 00:45:03,780 --> 00:45:07,079 these are on so the main link here is 1174 00:45:05,520 --> 00:45:08,339 kind of a basically a text description 1175 00:45:07,079 --> 00:45:09,900 of this talk 1176 00:45:08,339 --> 00:45:11,760 um with a few extra side tales about 1177 00:45:09,900 --> 00:45:12,900 bring-up problems 1178 00:45:11,760 --> 00:45:14,640 um 1179 00:45:12,900 --> 00:45:16,740 and the rest of the resources are also 1180 00:45:14,640 --> 00:45:18,430 linked from that page 1181 00:45:16,740 --> 00:45:23,810 and thank you very much 1182 00:45:18,430 --> 00:45:23,810 [Applause]