1 00:00:00,000 --> 00:00:03,000 [Music] 2 00:00:04,960 --> 00:00:08,490 [Music] 3 00:00:11,540 --> 00:00:15,599 welcome back I hope you all had an 4 00:00:14,280 --> 00:00:17,580 excellent lunch 5 00:00:15,599 --> 00:00:19,800 um before we get underway there's a 6 00:00:17,580 --> 00:00:23,460 small piece of housekeeping a hotel key 7 00:00:19,800 --> 00:00:26,699 has been found and handed in at rego so 8 00:00:23,460 --> 00:00:29,000 if you are missing your hotel key uh reg 9 00:00:26,699 --> 00:00:29,000 desk 10 00:00:29,640 --> 00:00:34,860 um but we are now here to um to hear 11 00:00:32,579 --> 00:00:38,040 about Raising Heretics on a Diet of Open 12 00:00:34,860 --> 00:00:39,600 Data with Dr Linda McIver Linda is a 13 00:00:38,040 --> 00:00:42,059 passionate educator researcher and 14 00:00:39,600 --> 00:00:45,059 advocate for STEM equity and inclusion 15 00:00:42,059 --> 00:00:47,579 with a PhD in computer science education 16 00:00:45,059 --> 00:00:49,200 her mission is to ensure all Australian 17 00:00:47,579 --> 00:00:51,239 students have the opportunity to learn 18 00:00:49,200 --> 00:00:53,820 STEM and data science skills in the 19 00:00:51,239 --> 00:00:55,440 context of projects that empower them to 20 00:00:53,820 --> 00:00:57,059 solve problems and make a positive 21 00:00:55,440 --> 00:00:58,980 difference in the world 22 00:00:57,059 --> 00:01:00,360 today Linda will talk about the 23 00:00:58,980 --> 00:01:02,340 integration of data science into 24 00:01:00,360 --> 00:01:04,860 education and the potential social 25 00:01:02,340 --> 00:01:07,280 outcomes of using real world data when 26 00:01:04,860 --> 00:01:07,280 teaching 27 00:01:07,860 --> 00:01:12,420 [Applause] 28 00:01:10,799 --> 00:01:15,060 thanks Fiona 29 00:01:12,420 --> 00:01:16,860 um I would like to apologize for 30 00:01:15,060 --> 00:01:19,080 delivering the talk sitting 31 00:01:16,860 --> 00:01:20,580 down I'd like to be standing up 32 00:01:19,080 --> 00:01:22,020 um
unfortunately standing up for 45 33 00:01:20,580 --> 00:01:23,939 minutes is not a function currently 34 00:01:22,020 --> 00:01:25,860 offered by this API 35 00:01:23,939 --> 00:01:27,119 um similarly I'm not going to be at much 36 00:01:25,860 --> 00:01:29,340 of the conference 37 00:01:27,119 --> 00:01:32,520 um much as I'd like to catch up with you 38 00:01:29,340 --> 00:01:33,720 all and feel free to find me on any of 39 00:01:32,520 --> 00:01:36,600 the 40 00:01:33,720 --> 00:01:37,820 social things and and connect I really 41 00:01:36,600 --> 00:01:40,680 like that 42 00:01:37,820 --> 00:01:43,100 so I'd like to begin by acknowledging 43 00:01:40,680 --> 00:01:45,360 the first scientists the first 44 00:01:43,100 --> 00:01:47,280 environmentalists and the traditional 45 00:01:45,360 --> 00:01:49,020 owners of these unceded lands the 46 00:01:47,280 --> 00:01:52,439 Wurundjeri people 47 00:01:49,020 --> 00:01:54,119 um the Wurundjeri Woi Wurrung people I'm 48 00:01:52,439 --> 00:01:56,820 terrible at pronouncing that I apologize 49 00:01:54,119 --> 00:01:58,680 of the Kulin Nation we have much to 50 00:01:56,820 --> 00:02:00,000 learn from them 51 00:01:58,680 --> 00:02:03,600 so 52 00:02:00,000 --> 00:02:06,659 ChatGPT has the education world's collective 53 00:02:03,600 --> 00:02:08,640 underwear in a knot this is largely 54 00:02:06,659 --> 00:02:11,700 because it threatens the validity of our 55 00:02:08,640 --> 00:02:15,120 assessment it gives students the 56 00:02:11,700 --> 00:02:16,739 possibility of dodging their assignment 57 00:02:15,120 --> 00:02:19,620 work their 58 00:02:16,739 --> 00:02:22,440 um writing their research and sometimes 59 00:02:19,620 --> 00:02:24,720 even dodging sitting their own exams 60 00:02:22,440 --> 00:02:27,360 many state education departments in 61 00:02:24,720 --> 00:02:29,160 Australia have banned it from schools 62 00:02:27,360 --> 00:02:31,500 entirely and let's have a moment's 63 00:02:29,160 --> 00:02:33,300
collective silence for governments that 64 00:02:31,500 --> 00:02:35,959 are naive enough to believe they can ban 65 00:02:33,300 --> 00:02:38,280 something on the internet 66 00:02:35,959 --> 00:02:39,840 the truth is though that it only 67 00:02:38,280 --> 00:02:41,160 threatens the validity of our assessment 68 00:02:39,840 --> 00:02:44,459 because our assessments are already 69 00:02:41,160 --> 00:02:46,080 wildly invalid we know that the way we 70 00:02:44,459 --> 00:02:47,879 are assessing people doesn't work and 71 00:02:46,080 --> 00:02:51,599 doesn't actually assess the 72 00:02:47,879 --> 00:02:54,780 characteristics and skills and abilities 73 00:02:51,599 --> 00:02:56,819 that we say we care about 74 00:02:54,780 --> 00:02:58,220 our tasks for students are all too often 75 00:02:56,819 --> 00:03:01,200 chosen for 76 00:02:58,220 --> 00:03:04,500 ease of processing ease of marking 77 00:03:01,200 --> 00:03:06,060 ease of authentication it's very easy to 78 00:03:04,500 --> 00:03:08,280 know if a student is sitting in front of 79 00:03:06,060 --> 00:03:09,780 you writing on an exam paper that you 80 00:03:08,280 --> 00:03:11,580 can be fairly confident that they're 81 00:03:09,780 --> 00:03:13,019 writing that they're doing their own 82 00:03:11,580 --> 00:03:14,480 work although even that's getting a 83 00:03:13,019 --> 00:03:16,620 little dodgy these days 84 00:03:14,480 --> 00:03:19,379 but there's very little evidence that 85 00:03:16,620 --> 00:03:20,700 exams are actually a worthwhile form of 86 00:03:19,379 --> 00:03:22,379 assessment in fact we know that they're 87 00:03:20,700 --> 00:03:24,300 not they're not a good way of assessing 88 00:03:22,379 --> 00:03:26,340 student learning and let's face it when 89 00:03:24,300 --> 00:03:28,379 you choose a doctor you don't ask them 90 00:03:26,340 --> 00:03:30,120 how well they did on their exams you 91 00:03:28,379 --> 00:03:31,560 want to know how good they are at 92 00:03:30,120 --> 00:03:33,420
communicating you want to have someone 93 00:03:31,560 --> 00:03:34,819 who's an excellent diagnostician who's 94 00:03:33,420 --> 00:03:36,959 fabulous at 95 00:03:34,819 --> 00:03:39,480 collaborating with 96 00:03:36,959 --> 00:03:42,000 uh professionals and and various allied 97 00:03:39,480 --> 00:03:44,519 health professions and also other 98 00:03:42,000 --> 00:03:45,900 doctors you want to know that uh that 99 00:03:44,519 --> 00:03:47,340 your doctor is good at choosing a 100 00:03:45,900 --> 00:03:49,019 specialist who will actually treat you 101 00:03:47,340 --> 00:03:50,940 like a human being and not like a speck 102 00:03:49,019 --> 00:03:53,159 under a microscope 103 00:03:50,940 --> 00:03:56,159 um you don't care about the Intel 104 00:03:53,159 --> 00:03:58,739 but none of these things are effectively 105 00:03:56,159 --> 00:04:02,459 assessed by exams and they're also not 106 00:03:58,739 --> 00:04:04,980 replicable by things like ChatGPT 107 00:04:02,459 --> 00:04:07,319 if our assessment of students' learning 108 00:04:04,980 --> 00:04:10,019 and skills is based on assessments that 109 00:04:07,319 --> 00:04:12,299 ask them to regurgitate facts and known 110 00:04:10,019 --> 00:04:14,519 processes we're not actually assessing 111 00:04:12,299 --> 00:04:16,799 anything of value there's just no 112 00:04:14,519 --> 00:04:18,959 validity there at all so I'm hoping that 113 00:04:16,799 --> 00:04:22,199 ChatGPT will be the kick that we need 114 00:04:18,959 --> 00:04:25,560 to push us into more authentic forms of 115 00:04:22,199 --> 00:04:28,080 assessment into more engaging useful and 116 00:04:25,560 --> 00:04:30,479 valid ways of education 117 00:04:28,080 --> 00:04:32,580 it's past time to transition into a 118 00:04:30,479 --> 00:04:35,580 problem-based authentic form of 119 00:04:32,580 --> 00:04:36,900 education where students actually do 120 00:04:35,580 --> 00:04:39,720 things that build their critical 121 00:04:36,900 --> 00:04:42,360
thinking their problem solving and 122 00:04:39,720 --> 00:04:44,460 indeed their ethical decision making 123 00:04:42,360 --> 00:04:46,500 why am I telling you this in a 124 00:04:44,460 --> 00:04:48,960 conference on open data 125 00:04:46,500 --> 00:04:50,699 well when I started computer science 126 00:04:48,960 --> 00:04:52,919 when I started teaching computer science 127 00:04:50,699 --> 00:04:54,600 in a secondary school I arrived to find 128 00:04:52,919 --> 00:04:57,060 a course that was 129 00:04:54,600 --> 00:04:59,400 um very toy based we had year 10 130 00:04:57,060 --> 00:05:01,259 students we were teaching them to teach 131 00:04:59,400 --> 00:05:04,919 robots to push each other out of circles 132 00:05:01,259 --> 00:05:07,040 to follow lines to draw pretty pictures 133 00:05:04,919 --> 00:05:10,500 in Scratch-based interfaces 134 00:05:07,040 --> 00:05:12,180 and it failed it failed dismally the 135 00:05:10,500 --> 00:05:13,500 students hated it and this was in a 136 00:05:12,180 --> 00:05:15,180 science school where you would think that 137 00:05:13,500 --> 00:05:17,400 the kids were you know naturally 138 00:05:15,180 --> 00:05:19,500 predisposed to enjoy technology and to 139 00:05:17,400 --> 00:05:21,960 engage with it but they couldn't see the 140 00:05:19,500 --> 00:05:23,880 point they couldn't see the relevance to 141 00:05:21,960 --> 00:05:26,639 anything that they would want to do in 142 00:05:23,880 --> 00:05:28,560 their future lives they were bored they 143 00:05:26,639 --> 00:05:30,539 were disengaged and they were wholly 144 00:05:28,560 --> 00:05:32,460 unwilling 145 00:05:30,539 --> 00:05:35,639 now when we switched to using data science 146 00:05:32,460 --> 00:05:36,900 instead we were teaching the same coding 147 00:05:35,639 --> 00:05:39,000 skills they were learning about 148 00:05:36,900 --> 00:05:44,100 variables and functions and iteration 149 00:05:39,000 --> 00:05:45,900 and basic code but now we were doing it 150 00:05:44,100
--> 00:05:48,600 in the context of real data sets and 151 00:05:45,900 --> 00:05:51,060 that that the word that second last word 152 00:05:48,600 --> 00:05:53,900 there is key real data sets as soon as 153 00:05:51,060 --> 00:05:56,460 you use data that is real and meaningful 154 00:05:53,900 --> 00:05:57,740 it completely changes the nature of your 155 00:05:56,460 --> 00:06:00,660 educational 156 00:05:57,740 --> 00:06:02,880 so we were also teaching a large side of 157 00:06:00,660 --> 00:06:04,259 data literacy but for me you know I'm a 158 00:06:02,880 --> 00:06:06,300 computer scientist I want everyone to 159 00:06:04,259 --> 00:06:10,259 learn to code for me this started out as 160 00:06:06,300 --> 00:06:10,979 a quest to engage more kids with code 161 00:06:10,259 --> 00:06:13,320 um 162 00:06:10,979 --> 00:06:15,780 and sorry I was mildly distracted there 163 00:06:13,320 --> 00:06:19,520 by the sight of a helicopter going down 164 00:06:15,780 --> 00:06:19,520 below the floor level it's gone now 165 00:06:20,400 --> 00:06:26,039 um you don't see that very often 166 00:06:22,259 --> 00:06:28,199 um so uh yeah we were using those data 167 00:06:26,039 --> 00:06:29,580 sets to answer meaningful questions and 168 00:06:28,199 --> 00:06:31,620 the level of engagement completely 169 00:06:29,580 --> 00:06:33,600 changed suddenly instead of coming to me 170 00:06:31,620 --> 00:06:35,580 saying why are you making me do this I 171 00:06:33,600 --> 00:06:36,960 don't want to do this I hate this and 172 00:06:35,580 --> 00:06:38,819 you know half of them to be honest 173 00:06:36,960 --> 00:06:40,800 cheating to just get the assignment out 174 00:06:38,819 --> 00:06:42,479 of the way and move on with it now they 175 00:06:40,800 --> 00:06:44,520 were coming to me saying oh my God this 176 00:06:42,479 --> 00:06:46,440 is so useful I used it in my science 177 00:06:44,520 --> 00:06:48,000 project and to be honest those science 178 00:06:46,440 --> 00:06:49,919
projects are half the reason I started 179 00:06:48,000 --> 00:06:51,360 teaching data because every year the 180 00:06:49,919 --> 00:06:53,360 graphs in their science projects 181 00:06:51,360 --> 00:06:55,680 made me cry 182 00:06:53,360 --> 00:06:57,419 but now they're also coming to me saying 183 00:06:55,680 --> 00:06:59,160 oh I used it in my maths exam and oh my 184 00:06:57,419 --> 00:07:00,720 god there was a graph on the news last 185 00:06:59,160 --> 00:07:04,460 night and there was no zero on the scale 186 00:07:00,720 --> 00:07:04,460 and it was so outrageous it was social 187 00:07:05,419 --> 00:07:09,780 so suddenly just because they're working 188 00:07:08,400 --> 00:07:11,520 with real data and learning some data 189 00:07:09,780 --> 00:07:15,360 literacy they were becoming critical 190 00:07:11,520 --> 00:07:16,620 thinkers and that thread just grew and 191 00:07:15,360 --> 00:07:17,940 grew and grew to the point where I 192 00:07:16,620 --> 00:07:19,740 didn't care about the learning coding 193 00:07:17,940 --> 00:07:21,120 anymore the critical thinking and the 194 00:07:19,740 --> 00:07:23,759 problem solving was actually the 195 00:07:21,120 --> 00:07:26,099 significant part 196 00:07:23,759 --> 00:07:28,380 um so we talk a lot about boosting the 197 00:07:26,099 --> 00:07:31,860 pipeline when we talk about getting more 198 00:07:28,380 --> 00:07:33,419 women and non-binary folks into uh tech 199 00:07:31,860 --> 00:07:35,340 in general and data science in 200 00:07:33,419 --> 00:07:37,740 particular the problem is we tend to 201 00:07:35,340 --> 00:07:39,360 focus our efforts on 202 00:07:37,740 --> 00:07:42,300 um late high school when the kids are 203 00:07:39,360 --> 00:07:44,759 starting to as we normally call it 204 00:07:42,300 --> 00:07:45,900 choose their careers although I'm sure 205 00:07:44,759 --> 00:07:48,419 there are plenty of people in this room 206 00:07:45,900 --> 00:07:50,880 who would who provide evidence that you 207
00:07:48,419 --> 00:07:52,919 can change your career at any point 208 00:07:50,880 --> 00:07:56,639 um myself included 209 00:07:52,919 --> 00:07:59,340 um but we we focus on late high school 210 00:07:56,639 --> 00:08:01,620 or university and the problem with that 211 00:07:59,340 --> 00:08:05,900 is that we know that kids' attitudes to 212 00:08:01,620 --> 00:08:09,120 STEM and maths in particular solidify in 213 00:08:05,900 --> 00:08:11,460 mid to early to mid primary school we've 214 00:08:09,120 --> 00:08:13,259 lost them before they ever hit high 215 00:08:11,460 --> 00:08:16,139 school 216 00:08:13,259 --> 00:08:17,819 so kids' interest in STEM is really 217 00:08:16,139 --> 00:08:18,900 caught or lost in those early years but 218 00:08:17,819 --> 00:08:20,639 the trouble is they're often being 219 00:08:18,900 --> 00:08:23,039 taught by teachers who've never learned 220 00:08:20,639 --> 00:08:25,379 STEM themselves who've never been taught 221 00:08:23,039 --> 00:08:27,740 to teach STEM and who are quite frankly 222 00:08:25,379 --> 00:08:32,060 often terrified of it 223 00:08:27,740 --> 00:08:34,020 coincidentally perhaps kids' own 224 00:08:32,060 --> 00:08:35,760 self-efficacy or their perception of 225 00:08:34,020 --> 00:08:37,680 their self-efficacy 226 00:08:35,760 --> 00:08:39,000 um solidifies around that time too so 227 00:08:37,680 --> 00:08:41,399 you get kids coming out of primary 228 00:08:39,000 --> 00:08:42,899 school saying I am no good at maths I am 229 00:08:41,399 --> 00:08:44,540 no good at tech I couldn't do anything 230 00:08:42,899 --> 00:08:47,100 with those robots 231 00:08:44,540 --> 00:08:48,540 and they're being taught by teachers who 232 00:08:47,100 --> 00:08:50,279 are terrified of it there may be a 233 00:08:48,540 --> 00:08:51,180 connection 234 00:08:50,279 --> 00:08:54,660 um 235 00:08:51,180 --> 00:08:57,000 so we need to stop aiming our 236 00:08:54,660 --> 00:08:58,620 recruitment drives at high school when 237
00:08:57,000 --> 00:08:59,880 we've already lost them because yes okay 238 00:08:58,620 --> 00:09:01,200 it makes sense to aim at high school 239 00:08:59,880 --> 00:09:03,899 when they're choosing their subjects and 240 00:09:01,200 --> 00:09:05,279 things and they are choosing their path 241 00:09:03,899 --> 00:09:06,779 at that point but the truth is they've 242 00:09:05,279 --> 00:09:08,399 already pruned a lot of those paths 243 00:09:06,779 --> 00:09:11,339 they're already saying I'm not doing 244 00:09:08,399 --> 00:09:13,500 maths because I suck at maths and both 245 00:09:11,339 --> 00:09:15,420 my kids by the way don't tell them I 246 00:09:13,500 --> 00:09:18,959 told you this test as gifted in maths 247 00:09:15,420 --> 00:09:21,120 but believe they suck at it because they 248 00:09:18,959 --> 00:09:22,380 because they frequently misinterpret the 249 00:09:21,120 --> 00:09:24,060 questions 250 00:09:22,380 --> 00:09:26,760 um and they're badly worded questions 251 00:09:24,060 --> 00:09:28,680 but that's a rant for another time if 252 00:09:26,760 --> 00:09:30,180 you want to jump in at any point please 253 00:09:28,680 --> 00:09:31,440 feel free there'll be plenty of time for 254 00:09:30,180 --> 00:09:33,420 questions at the end but if you want to 255 00:09:31,440 --> 00:09:35,760 jump in at some point just make yourself 256 00:09:33,420 --> 00:09:37,380 known 257 00:09:35,760 --> 00:09:38,880 um 258 00:09:37,380 --> 00:09:41,100 yeah so 259 00:09:38,880 --> 00:09:44,339 we need to teach these kids much earlier 260 00:09:41,100 --> 00:09:47,040 that STEM skills are meaningful that 261 00:09:44,339 --> 00:09:48,240 they're useful and really importantly we 262 00:09:47,040 --> 00:09:49,500 need to teach them that they're 263 00:09:48,240 --> 00:09:52,500 accessible 264 00:09:49,500 --> 00:09:54,899 that they can all do them 265 00:09:52,500 --> 00:09:58,320 um this this kind of only a certain type 266 00:09:54,899 --> 00:10:02,399 of person can code thing
the the coding 267 00:09:58,320 --> 00:10:04,320 DNA kind of myth is really pervasive 268 00:10:02,399 --> 00:10:05,940 um not just in society but also in tech as 269 00:10:04,320 --> 00:10:07,800 I'm sure you know 270 00:10:05,940 --> 00:10:10,320 um but it's not anyone can learn to code 271 00:10:07,800 --> 00:10:12,060 anyone can learn data science anyone can 272 00:10:10,320 --> 00:10:13,019 learn STEM skills now I'm not saying 273 00:10:12,060 --> 00:10:16,440 that I'm going to turn all these kids 274 00:10:13,019 --> 00:10:18,839 into data scientists but they all need 275 00:10:16,440 --> 00:10:21,540 to have enough data literacy to engage 276 00:10:18,839 --> 00:10:23,640 with the meaningful conversations we all 277 00:10:21,540 --> 00:10:25,260 need to have about where we're going as 278 00:10:23,640 --> 00:10:28,019 a society because data science is 279 00:10:25,260 --> 00:10:30,360 increasingly driving that in some 280 00:10:28,019 --> 00:10:32,399 interesting directions 281 00:10:30,360 --> 00:10:34,860 so we know that one way to engage kids 282 00:10:32,399 --> 00:10:36,959 with STEM is to use data science solving 283 00:10:34,860 --> 00:10:38,700 real problems so clearly we need to 284 00:10:36,959 --> 00:10:40,920 bring data science into schools from the 285 00:10:38,700 --> 00:10:42,480 very beginning well I have good news and 286 00:10:40,920 --> 00:10:43,740 for you I have great news 287 00:10:42,480 --> 00:10:45,540 the good news is we're already building 288 00:10:43,740 --> 00:10:47,160 that data science into education it's 289 00:10:45,540 --> 00:10:49,500 why I started the Australian Data 290 00:10:47,160 --> 00:10:50,760 Science Education Institute and it is 291 00:10:49,500 --> 00:10:53,940 starting to spread we're getting now 292 00:10:50,760 --> 00:10:55,800 developing resources and sharing lesson 293 00:10:53,940 --> 00:10:57,899 plans and doing teacher training and 294 00:10:55,800 --> 00:10:58,680 getting out there into the world 295
00:10:57,899 --> 00:11:01,980 um 296 00:10:58,680 --> 00:11:05,519 but the great news is that open data 297 00:11:01,980 --> 00:11:09,240 gives us the power to use school data 298 00:11:05,519 --> 00:11:10,740 science projects to solve serious data 299 00:11:09,240 --> 00:11:13,260 problems 300 00:11:10,740 --> 00:11:15,120 we all know there's more data out there 301 00:11:13,260 --> 00:11:17,700 than any 302 00:11:15,120 --> 00:11:19,620 um than the field of data science can 303 00:11:17,700 --> 00:11:21,839 analyze even if we collectively forego 304 00:11:19,620 --> 00:11:23,279 sleep and food forever uh hands up if 305 00:11:21,839 --> 00:11:25,079 you've heard of the Japanese term 306 00:11:23,279 --> 00:11:26,880 tsundoku 307 00:11:25,079 --> 00:11:28,200 the stack of books that you have beside 308 00:11:26,880 --> 00:11:30,600 your bed that you haven't got around to 309 00:11:28,200 --> 00:11:32,540 reading yet now put your hands up if you 310 00:11:30,600 --> 00:11:35,820 have your own personal data tsundoku 311 00:11:32,540 --> 00:11:36,720 right we just can't get to it there's too 312 00:11:35,820 --> 00:11:38,100 much 313 00:11:36,720 --> 00:11:39,899 um so how about we throw kids some of 314 00:11:38,100 --> 00:11:41,399 that data how about we get them 315 00:11:39,899 --> 00:11:44,000 answering real questions and doing 316 00:11:41,399 --> 00:11:46,560 meaningful stuff it's more than possible 317 00:11:44,000 --> 00:11:50,040 I do it every day 318 00:11:46,560 --> 00:11:52,320 so the challenge is 319 00:11:50,040 --> 00:11:56,459 to do things like what if we taught 320 00:11:52,320 --> 00:11:59,100 probability using gender pay data sets 321 00:11:56,459 --> 00:12:00,800 instead of black and white and black and 322 00:11:59,100 --> 00:12:03,180 white balls in an urn 323 00:12:00,800 --> 00:12:04,440 Charlie is a non-binary software 324 00:12:03,180 --> 00:12:06,660 engineer given that they've been working 325 00:12:04,440 --> 00:12:07,920 in the field for
three years what is the 326 00:12:06,660 --> 00:12:09,899 probability that they are receiving the 327 00:12:07,920 --> 00:12:11,880 same pay as James a cis het white man 328 00:12:09,899 --> 00:12:14,940 doing the same job 329 00:12:11,880 --> 00:12:16,560 but of course we need open pay data in 330 00:12:14,940 --> 00:12:18,839 order to answer that question and the 331 00:12:16,560 --> 00:12:21,240 pay gap the gender pay gap data that has 332 00:12:18,839 --> 00:12:22,920 been released is not sufficient that's a 333 00:12:21,240 --> 00:12:24,839 conversation for another time 334 00:12:22,920 --> 00:12:26,339 the amazing thing about using real data 335 00:12:24,839 --> 00:12:27,959 sets for classroom projects is that the 336 00:12:26,339 --> 00:12:30,180 possibility exists for kids to find new 337 00:12:27,959 --> 00:12:31,500 things so they can be looking at a data 338 00:12:30,180 --> 00:12:33,180 set that no one's ever looked at or 339 00:12:31,500 --> 00:12:35,100 asking that data set a question that no 340 00:12:33,180 --> 00:12:37,620 one's ever asked that's not something they 341 00:12:35,100 --> 00:12:39,300 get to do at school they normally get to 342 00:12:37,620 --> 00:12:40,860 solve problems that have been solved 343 00:12:39,300 --> 00:12:42,720 before and where they take it up to the 344 00:12:40,860 --> 00:12:45,360 teacher and the teacher goes yes or no 345 00:12:42,720 --> 00:12:48,060 yes it's right no it's wrong do some 346 00:12:45,360 --> 00:12:50,220 more work that's the extent of it but 347 00:12:48,060 --> 00:12:51,959 now they're doing real science with data 348 00:12:50,220 --> 00:12:53,820 sets that haven't been fully analyzed so 349 00:12:51,959 --> 00:12:55,320 they have the chance to ask questions no 350 00:12:53,820 --> 00:12:57,420 one's asked before 351 00:12:55,320 --> 00:12:59,700 the first data science project I ran 352 00:12:57,420 --> 00:13:01,440 with my year 10s used a data set from 353 00:12:59,700 --> 00:13:03,480 the
Australian Electoral Commission and 354 00:13:01,440 --> 00:13:06,060 we downloaded a CSV file which was over 355 00:13:03,480 --> 00:13:09,240 three million lines every line in that 356 00:13:06,060 --> 00:13:12,560 CSV file was a vote for the Senate in 357 00:13:09,240 --> 00:13:15,420 Victoria for the federal election 358 00:13:12,560 --> 00:13:17,760 and three million lines won't even open 359 00:13:15,420 --> 00:13:20,519 in Excel so gosh darn it they had to 360 00:13:17,760 --> 00:13:23,459 learn to code so sorry 361 00:13:20,519 --> 00:13:25,560 um mind you the code they had to write 362 00:13:23,459 --> 00:13:27,860 might have been 10 or 20 lines in the 363 00:13:25,560 --> 00:13:30,420 end they were very simple 364 00:13:27,860 --> 00:13:32,820 bits of analysis that most of them did 365 00:13:30,420 --> 00:13:35,040 but being a real data set with endless 366 00:13:32,820 --> 00:13:37,100 scope those who were ready for it could 367 00:13:35,040 --> 00:13:40,519 do and did 368 00:13:37,100 --> 00:13:42,959 insanely complicated and fabulous things 369 00:13:40,519 --> 00:13:46,320 so they weren't learning a lot of code 370 00:13:42,959 --> 00:13:49,139 they were learning a manageable a 371 00:13:46,320 --> 00:13:50,399 manageable amount that enabled them to do 372 00:13:49,139 --> 00:13:52,860 something real 373 00:13:50,399 --> 00:13:55,260 imagine if everybody's first experience 374 00:13:52,860 --> 00:13:56,820 with code was a small bit a small 375 00:13:55,260 --> 00:13:58,800 program that did something meaningful 376 00:13:56,820 --> 00:14:00,180 and enabled them to find something out 377 00:13:58,800 --> 00:14:02,279 that they cared about 378 00:14:00,180 --> 00:14:04,500 that would be a bit different to 379 00:14:02,279 --> 00:14:05,459 Hello World 380 00:14:04,500 --> 00:14:07,500 um 381 00:14:05,459 --> 00:14:09,600 so the assignment was to find a question 382 00:14:07,500 --> 00:14:13,200 the data set could answer now there was 383
00:14:09,600 --> 00:14:14,459 just endless scope for that and 384 00:14:13,200 --> 00:14:17,279 visualize the results now the 385 00:14:14,459 --> 00:14:18,180 visualizations were mostly done by hand 386 00:14:17,279 --> 00:14:21,060 um 387 00:14:18,180 --> 00:14:22,700 partly because Python visualization 388 00:14:21,060 --> 00:14:25,920 libraries 389 00:14:22,700 --> 00:14:27,839 setting kids new to code free on Python 390 00:14:25,920 --> 00:14:29,660 visualization libraries ends in tears I mean 391 00:14:27,839 --> 00:14:32,639 those libraries make me cry 392 00:14:29,660 --> 00:14:34,680 but also because they could do much more 393 00:14:32,639 --> 00:14:36,720 creative and compelling things if they 394 00:14:34,680 --> 00:14:38,100 visualized it by hand 395 00:14:36,720 --> 00:14:40,019 um so then you have to think about 396 00:14:38,100 --> 00:14:41,579 whether the graph is still valid the 397 00:14:40,019 --> 00:14:44,339 visualization is valid you know making 398 00:14:41,579 --> 00:14:46,139 sure things are to scale and stuff but 399 00:14:44,339 --> 00:14:49,199 they learned a lot more making the 400 00:14:46,139 --> 00:14:52,440 visualizations more interesting 401 00:14:49,199 --> 00:14:54,180 um to to their general audience 402 00:14:52,440 --> 00:14:57,420 every student had to find a different 403 00:14:54,180 --> 00:15:00,240 question now that's a problem really if 404 00:14:57,420 --> 00:15:02,639 you think about it for for the way we 405 00:15:00,240 --> 00:15:03,899 normally do education that's 180 406 00:15:02,639 --> 00:15:05,339 different questions of this data set 407 00:15:03,899 --> 00:15:07,260 none of which the teacher already knows 408 00:15:05,339 --> 00:15:09,000 the answer to we'll talk about that in a 409 00:15:07,260 --> 00:15:11,160 minute but some of the questions they 410 00:15:09,000 --> 00:15:12,899 asked were how did the people at my 411 00:15:11,160 --> 00:15:14,579 local polling booth vote compared to the 412 00:15:12,899 -->
00:15:16,800 people in the electorate as a whole or 413 00:15:14,579 --> 00:15:18,600 in Victoria as a whole uh what was the 414 00:15:16,800 --> 00:15:19,800 proportion of women as candidates and 415 00:15:18,600 --> 00:15:21,600 how much of the vote did they get 416 00:15:19,800 --> 00:15:22,800 compared to the men and that was a 417 00:15:21,600 --> 00:15:24,300 particularly interesting one because we 418 00:15:22,800 --> 00:15:26,699 got to examine the thorny question of 419 00:15:24,300 --> 00:15:27,839 gendering people by their first name and 420 00:15:26,699 --> 00:15:29,339 whatever you could find out with Google 421 00:15:27,839 --> 00:15:31,680 and it you know was a fabulous 422 00:15:29,339 --> 00:15:35,399 introduction to the complexity and 423 00:15:31,680 --> 00:15:37,500 ugliness of real data the messiness 424 00:15:35,399 --> 00:15:39,240 um they asked which parties did people 425 00:15:37,500 --> 00:15:40,680 vote for who voted above the line or 426 00:15:39,240 --> 00:15:42,959 which parties did people mostly vote for 427 00:15:40,680 --> 00:15:45,060 who voted below the line which parties 428 00:15:42,959 --> 00:15:46,740 were more likely to receive the second 429 00:15:45,060 --> 00:15:48,959 preference of people who voted one 430 00:15:46,740 --> 00:15:50,760 for any given party 431 00:15:48,959 --> 00:15:53,880 um where are Pauline Hanson's One Nation 432 00:15:50,760 --> 00:15:55,560 voters metro urban or rural which 433 00:15:53,880 --> 00:15:58,579 party's voters were more likely to 434 00:15:55,560 --> 00:16:01,079 follow yeah that was not a big surprise 435 00:15:58,579 --> 00:16:03,000 which party's voters were more likely to 436 00:16:01,079 --> 00:16:04,620 follow their how-to-vote cards uh won't 437 00:16:03,000 --> 00:16:08,120 surprise you to know that Greens voters 438 00:16:04,620 --> 00:16:08,120 were the most rebellious in that respect 439 00:16:08,220 --> 00:16:12,899 um in short they got to take a large 440 00:16:10,380 -->
00:16:15,540 data set and ask the questions that um 441 00:16:12,899 --> 00:16:16,980 most interested them and that data set 442 00:16:15,540 --> 00:16:19,620 had only just been released it was just 443 00:16:16,980 --> 00:16:20,940 after the 2016 federal election so a lot 444 00:16:19,620 --> 00:16:22,680 of these questions hadn't been asked at 445 00:16:20,940 --> 00:16:23,880 all and and indeed would never be asked by 446 00:16:22,680 --> 00:16:25,920 anybody else because they were quite 447 00:16:23,880 --> 00:16:27,540 specific 448 00:16:25,920 --> 00:16:29,519 um and the focus was very much on the 449 00:16:27,540 --> 00:16:30,899 data literacy aspects what questions can 450 00:16:29,519 --> 00:16:31,980 this data set answer because of 451 00:16:30,899 --> 00:16:32,880 course the first thing they all want to 452 00:16:31,980 --> 00:16:35,699 know is 453 00:16:32,880 --> 00:16:36,839 um which is the best party and sadly I 454 00:16:35,699 --> 00:16:39,740 don't think that data set answered 455 00:16:36,839 --> 00:16:39,740 that question yes 456 00:16:43,139 --> 00:16:48,360 this is absolutely fascinating research 457 00:16:45,480 --> 00:16:51,019 in and of itself did you publish their 458 00:16:48,360 --> 00:16:51,019 work anywhere 459 00:16:51,180 --> 00:16:54,779 sadly there the question was did I 460 00:16:53,519 --> 00:16:56,459 publish the work anywhere because it's 461 00:16:54,779 --> 00:16:57,420 amazing research and it is amazing 462 00:16:56,459 --> 00:16:59,279 research 463 00:16:57,420 --> 00:17:01,740 um sadly there are real complexities 464 00:16:59,279 --> 00:17:04,919 around publishing things with real kids 465 00:17:01,740 --> 00:17:07,199 and blah blah blah no yes I'd love to 466 00:17:04,919 --> 00:17:08,819 the former academic in me really wants 467 00:17:07,199 --> 00:17:10,980 to research this but at the same time as 468 00:17:08,819 --> 00:17:14,459 teaching it and spreading it and yeah 469 00:17:10,980 --> 00:17:15,600 there's a
rather I ran out of means 470 00:17:14,459 --> 00:17:17,579 um 471 00:17:15,600 --> 00:17:18,839 yeah so the focus was on how do we 472 00:17:17,579 --> 00:17:21,419 communicate the answers to those 473 00:17:18,839 --> 00:17:23,160 questions accurately and compellingly 474 00:17:21,419 --> 00:17:24,900 like how do we get people to care about 475 00:17:23,160 --> 00:17:26,819 this stuff that you found out 476 00:17:24,900 --> 00:17:27,900 and I picked this data set because I had 477 00:17:26,819 --> 00:17:29,400 a student who was really interested in 478 00:17:27,900 --> 00:17:30,900 politics so I went and rummaged on the 479 00:17:29,400 --> 00:17:33,360 AEC website to see what I could find 480 00:17:30,900 --> 00:17:34,500 it's like this one's great it was 481 00:17:33,360 --> 00:17:36,179 particularly good because we had to 482 00:17:34,500 --> 00:17:37,559 explain the rules of Senate voting to 483 00:17:36,179 --> 00:17:39,120 kids you know like you can vote above 484 00:17:37,559 --> 00:17:40,320 the line or below the line but not both 485 00:17:39,120 --> 00:17:41,760 and 486 00:17:40,320 --> 00:17:43,679 um you know you can only use the number 487 00:17:41,760 --> 00:17:45,660 one once and things like that and so 488 00:17:43,679 --> 00:17:47,700 they go great so if I look at this this 489 00:17:45,660 --> 00:17:49,880 string which contains the votes there'll 490 00:17:47,700 --> 00:17:52,679 only be one number one I'm like 491 00:17:49,880 --> 00:17:54,240 well the thing you have to also deal 492 00:17:52,679 --> 00:17:56,220 with is that people don't necessarily 493 00:17:54,240 --> 00:17:57,900 follow the rules and boom their little 494 00:17:56,220 --> 00:17:59,460 heads explode it's a wonderful learning 495 00:17:57,900 --> 00:18:00,179 experience 496 00:17:59,460 --> 00:18:02,280 um 497 00:18:00,179 --> 00:18:03,960 but uh the thing is that even the kids 498 00:18:02,280 --> 00:18:05,280 who weren't interested in politics were 499 00:18:03,960 -->
00:18:07,320 super interested 500 00:18:05,280 --> 00:18:10,799 in the fact that this was real 501 00:18:07,320 --> 00:18:13,020 and they could easily see the connection 502 00:18:10,799 --> 00:18:14,400 to things that they did care about and 503 00:18:13,020 --> 00:18:16,020 how they could use their skills that 504 00:18:14,400 --> 00:18:17,880 they were learning for something that 505 00:18:16,020 --> 00:18:18,480 mattered to them more 506 00:18:17,880 --> 00:18:20,640 um 507 00:18:18,480 --> 00:18:22,320 this turns out to be central to 508 00:18:20,640 --> 00:18:24,360 motivation kids who can't see the point 509 00:18:22,320 --> 00:18:27,900 in learning something don't 510 00:18:24,360 --> 00:18:30,600 it's as simple as that and making it fun 511 00:18:27,900 --> 00:18:31,520 is very dependent on your definition of 512 00:18:30,600 --> 00:18:33,780 fun 513 00:18:31,520 --> 00:18:37,500 there is no single thing that every 514 00:18:33,780 --> 00:18:39,600 student will find fun but also robots 515 00:18:37,500 --> 00:18:40,980 um they are preaching to the converted 516 00:18:39,600 --> 00:18:43,260 so the kids who love robots will love 517 00:18:40,980 --> 00:18:46,140 robots the kids who don't won't 518 00:18:43,260 --> 00:18:47,460 it's pretty straightforward but if you're 519 00:18:46,140 --> 00:18:49,140 using real data they can see the 520 00:18:47,460 --> 00:18:51,840 relevance and they can apply it to other 521 00:18:49,140 --> 00:18:53,580 things even if that particular data set 522 00:18:51,840 --> 00:18:55,860 doesn't do it for them 523 00:18:53,580 --> 00:18:58,580 and the cool thing is when we use real 524 00:18:55,860 --> 00:19:01,559 data sets there are a lot of questions 525 00:18:58,580 --> 00:19:04,620 to ask before you even begin to analyze 526 00:19:01,559 --> 00:19:07,260 them questions like how was the data 527 00:19:04,620 --> 00:19:08,940 collected what are the problems or 528 00:19:07,260 --> 00:19:11,520 limitations of the way the data was
529 00:19:08,940 --> 00:19:14,340 collected what is the sample size what 530 00:19:11,520 --> 00:19:16,500 biases are embodied in this data what 531 00:19:14,340 --> 00:19:18,660 were the limitations of any sensors used 532 00:19:16,500 --> 00:19:20,940 has the data been processed at all and 533 00:19:18,660 --> 00:19:23,220 if so what information was lost in 534 00:19:20,940 --> 00:19:25,440 the processing what does each of the 535 00:19:23,220 --> 00:19:26,820 fields mean how do the fields relate to 536 00:19:25,440 --> 00:19:29,100 each other so you've got to understand 537 00:19:26,820 --> 00:19:30,539 the domain of the data which is not the 538 00:19:29,100 --> 00:19:32,520 way we typically teach maths we 539 00:19:30,539 --> 00:19:34,679 typically teach maths with here are some 540 00:19:32,520 --> 00:19:36,900 numbers here is a process apply the 541 00:19:34,679 --> 00:19:40,200 process to the numbers and tell us the 542 00:19:36,900 --> 00:19:42,360 result not well it depends on the type 543 00:19:40,200 --> 00:19:44,940 of seagrass that you're measuring that's 544 00:19:42,360 --> 00:19:47,580 not a part of math 545 00:19:44,940 --> 00:19:49,200 um and this is my favorite and this has 546 00:19:47,580 --> 00:19:50,580 particularly come out of one of my 547 00:19:49,200 --> 00:19:52,320 podcasts which I'll talk about a bit in 548 00:19:50,580 --> 00:19:55,500 a minute but um what definitions 549 00:19:52,320 --> 00:19:57,660 underpin this data because the devil is 550 00:19:55,500 --> 00:20:00,600 absolutely in the definitions um I 551 00:19:57,660 --> 00:20:02,700 mentioned the gender pay gap data before 552 00:20:00,600 --> 00:20:04,380 um one of the issues with that data and 553 00:20:02,700 --> 00:20:07,679 there are many I I love that it's 554 00:20:04,380 --> 00:20:11,360 released right but it's not the panacea 555 00:20:07,679 --> 00:20:13,440 that some people have said it is because 556 00:20:11,360 --> 00:20:14,600 part-timers are not
included in that 557 00:20:13,440 --> 00:20:18,900 data 558 00:20:14,600 --> 00:20:20,220 now who is typically part-time 559 00:20:18,900 --> 00:20:22,679 correct 560 00:20:20,220 --> 00:20:25,020 so you know the assumptions that have 561 00:20:22,679 --> 00:20:27,299 been made in the definitions and the you 562 00:20:25,020 --> 00:20:29,220 know who's in and who's out can be very 563 00:20:27,299 --> 00:20:30,660 significant in the results that you get 564 00:20:29,220 --> 00:20:32,580 and that is 565 00:20:30,660 --> 00:20:34,200 that's a bit mind-bending for people 566 00:20:32,580 --> 00:20:36,059 who've been taught that things are black 567 00:20:34,200 --> 00:20:37,740 and white and you know very 568 00:20:36,059 --> 00:20:39,840 straightforward and you get the same 569 00:20:37,740 --> 00:20:41,700 results every time 570 00:20:39,840 --> 00:20:43,559 so already you're starting a data 571 00:20:41,700 --> 00:20:44,880 literacy conversation that builds 572 00:20:43,559 --> 00:20:46,500 critical thinking and problem solving 573 00:20:44,880 --> 00:20:47,520 skills and we haven't even opened the 574 00:20:46,500 --> 00:20:49,679 file yet 575 00:20:47,520 --> 00:20:50,940 I love that you know that they 576 00:20:49,679 --> 00:20:54,660 really have to start thinking about 577 00:20:50,940 --> 00:20:57,179 things before they even get down to work 578 00:20:54,660 --> 00:20:58,799 we have a tendency as human beings to 579 00:20:57,179 --> 00:21:01,679 kind of bend at the knees when we see a 580 00:20:58,799 --> 00:21:04,620 graph or some statistics but teaching 581 00:21:01,679 --> 00:21:09,240 data science 582 00:21:04,620 --> 00:21:11,039 using real data sets builds a culture of 583 00:21:09,240 --> 00:21:13,919 rational skepticism that makes it normal 584 00:21:11,039 --> 00:21:15,840 to ask what was the sample size how did 585 00:21:13,919 --> 00:21:17,820 you collect that data where did it come 586 00:21:15,840 --> 00:21:20,100
from how reliable is it what biases 587 00:21:17,820 --> 00:21:22,260 might there be and we all need to be 588 00:21:20,100 --> 00:21:25,160 asking those questions by default and we 589 00:21:22,260 --> 00:21:28,080 tend not to 590 00:21:25,160 --> 00:21:29,760 my kids and my students curse me 591 00:21:28,080 --> 00:21:31,380 because they can't look at things like 592 00:21:29,760 --> 00:21:33,240 that without asking those 593 00:21:31,380 --> 00:21:35,159 questions anymore you know it's like 594 00:21:33,240 --> 00:21:37,740 once your eyes are opened you can't 595 00:21:35,159 --> 00:21:41,000 close them again which I love but it's 596 00:21:37,740 --> 00:21:41,000 also kind of distressing 597 00:21:41,400 --> 00:21:45,600 um of course yeah there are challenges 598 00:21:43,140 --> 00:21:49,080 if you have 180 kids asking different 599 00:21:45,600 --> 00:21:50,820 questions of a data set you have 180 600 00:21:49,080 --> 00:21:54,600 answers which aren't in the back of the 601 00:21:50,820 --> 00:21:57,480 textbook that to me is an upside not a 602 00:21:54,600 --> 00:21:59,159 downside because now instead of taking 603 00:21:57,480 --> 00:22:00,720 it to the teacher and asking if it's 604 00:21:59,159 --> 00:22:02,100 correct or looking it up in the back of 605 00:22:00,720 --> 00:22:03,240 the textbook or hitting the submit 606 00:22:02,100 --> 00:22:05,340 button and finding out whether you've 607 00:22:03,240 --> 00:22:06,240 got the right answer now what you have 608 00:22:05,340 --> 00:22:09,360 to do 609 00:22:06,240 --> 00:22:11,340 you have to check your own work 610 00:22:09,360 --> 00:22:13,559 you have to critically evaluate your own 611 00:22:11,340 --> 00:22:16,440 work you have to figure out why you 612 00:22:13,559 --> 00:22:19,080 think it's valid challenge it test it to 613 00:22:16,440 --> 00:22:20,880 see if it is valid figure out what other 614 00:22:19,080 --> 00:22:22,740 explanations there might be for the 615 00:22:20,880
--> 00:22:24,960 results that you got all of those 616 00:22:22,740 --> 00:22:26,880 wonderful things that we don't normally 617 00:22:24,960 --> 00:22:30,360 do in schools and in fact we don't 618 00:22:26,880 --> 00:22:32,220 normally do them in real life either 619 00:22:30,360 --> 00:22:34,140 um so not only are they learning to be 620 00:22:32,220 --> 00:22:35,400 rationally skeptical about data but 621 00:22:34,140 --> 00:22:37,980 they're also learning to be rationally 622 00:22:35,400 --> 00:22:40,620 skeptical about their own work and to 623 00:22:37,980 --> 00:22:42,539 assume that there will be flaws in their 624 00:22:40,620 --> 00:22:44,100 own work because when you're using real 625 00:22:42,539 --> 00:22:45,419 data sets and solving real problems 626 00:22:44,100 --> 00:22:46,500 there is no such thing as a perfect 627 00:22:45,419 --> 00:22:48,000 solution 628 00:22:46,500 --> 00:22:50,820 so now you have to evaluate Your 629 00:22:48,000 --> 00:22:52,380 solution and and I I think that's 630 00:22:50,820 --> 00:22:54,179 magnificent and it wasn't something I 631 00:22:52,380 --> 00:22:55,919 thought about when I started doing data 632 00:22:54,179 --> 00:22:57,480 like I said when I started teaching data 633 00:22:55,919 --> 00:22:59,580 science I just wanted them to make 634 00:22:57,480 --> 00:23:02,340 better graphs and learn to code 635 00:22:59,580 --> 00:23:03,840 that it turns out it's not the important 636 00:23:02,340 --> 00:23:05,820 stuff 637 00:23:03,840 --> 00:23:08,400 so when we give kids real things to do 638 00:23:05,820 --> 00:23:10,440 and the power to create change they see 639 00:23:08,400 --> 00:23:12,960 the purpose of tech and data science 640 00:23:10,440 --> 00:23:14,820 skills and they're eager to learn black 641 00:23:12,960 --> 00:23:16,380 and white balls in an urn or teaching 642 00:23:14,820 --> 00:23:17,940 robots to push each other out of circles 643 00:23:16,380 --> 00:23:19,260 just doesn't cut it doesn't have 
nearly 644 00:23:17,940 --> 00:23:21,299 the same impact 645 00:23:19,260 --> 00:23:22,919 and the more open data we have the more 646 00:23:21,299 --> 00:23:25,020 data sets that kids can play with 647 00:23:22,919 --> 00:23:26,580 themselves and figure stuff out with the 648 00:23:25,020 --> 00:23:29,159 greater the potential for projects that 649 00:23:26,580 --> 00:23:31,559 empower kids to make change in their own 650 00:23:29,159 --> 00:23:34,440 communities 651 00:23:31,559 --> 00:23:38,760 so imagine kids exploring pedestrian 652 00:23:34,440 --> 00:23:41,280 data in their local town center 653 00:23:38,760 --> 00:23:43,380 um or tracking COVID cases in their 654 00:23:41,280 --> 00:23:45,780 community if that data were available 655 00:23:43,380 --> 00:23:47,940 which it no longer is 656 00:23:45,780 --> 00:23:50,039 um imagine them evaluating the impact of 657 00:23:47,940 --> 00:23:51,900 nearby development on threatened 658 00:23:50,039 --> 00:23:53,880 species or looking at the impact of 659 00:23:51,900 --> 00:23:56,100 dredging in Port Phillip Bay on dolphin 660 00:23:53,880 --> 00:23:57,600 numbers and behavior imagine them 661 00:23:56,100 --> 00:23:59,480 analyzing traffic around their school 662 00:23:57,600 --> 00:24:02,039 and devising safer traffic management 663 00:23:59,480 --> 00:24:05,100 processes for drop off and pickup time 664 00:24:02,039 --> 00:24:07,640 or using Google Mobility data to analyze 665 00:24:05,100 --> 00:24:11,340 COVID lockdowns and figure out 666 00:24:07,640 --> 00:24:14,220 which country really had the longest 667 00:24:11,340 --> 00:24:17,220 and strictest lockdowns in the world 668 00:24:14,220 --> 00:24:19,260 or um or using public health and road 669 00:24:17,220 --> 00:24:21,059 accident data to try and figure out 670 00:24:19,260 --> 00:24:22,679 which is really the most dangerous to 671 00:24:21,059 --> 00:24:25,440 the population in the long-term cycling 672 00:24:22,679 --> 00:24:29,760 or inactivity
673 00:24:25,440 --> 00:24:31,500 imagine well imagine exploring real 674 00:24:29,760 --> 00:24:32,700 current data on anything the kids care 675 00:24:31,500 --> 00:24:34,860 about 676 00:24:32,700 --> 00:24:37,140 it's as simple as that 677 00:24:34,860 --> 00:24:39,780 and as someone who's currently mobility 678 00:24:37,140 --> 00:24:41,820 impaired I'm fascinated and enraged by 679 00:24:39,780 --> 00:24:44,159 how much further all the accessible 680 00:24:41,820 --> 00:24:46,440 stuff is you have to walk further for 681 00:24:44,159 --> 00:24:48,840 the lift you have to walk further to 682 00:24:46,440 --> 00:24:50,700 press the button to make the automatic 683 00:24:48,840 --> 00:24:54,600 door open 684 00:24:50,700 --> 00:24:57,059 um the ambulant toilets and and the 685 00:24:54,600 --> 00:25:00,299 loos are always way down the other end 686 00:24:57,059 --> 00:25:01,799 like it just every step some days is 687 00:25:00,299 --> 00:25:03,600 incredibly painful and I have to take 688 00:25:01,799 --> 00:25:06,000 way more of them to use the stuff that's 689 00:25:03,600 --> 00:25:07,799 going to make my life easier so imagine 690 00:25:06,000 --> 00:25:09,240 if we had kids doing a project just 691 00:25:07,799 --> 00:25:14,159 around their school or around their 692 00:25:09,240 --> 00:25:15,900 local shopping center or park to figure 693 00:25:14,159 --> 00:25:17,940 out how much further people in 694 00:25:15,900 --> 00:25:20,700 wheelchairs have to go to access the 695 00:25:17,940 --> 00:25:21,799 same things that the rest of us take for 696 00:25:20,700 --> 00:25:26,520 granted 697 00:25:21,799 --> 00:25:29,279 or um what about a project where kids 698 00:25:26,520 --> 00:25:31,799 measure track or try to fix anything 699 00:25:29,279 --> 00:25:33,860 that gets in the way of people who 700 00:25:31,799 --> 00:25:37,020 are vision impaired 701 00:25:33,860 --> 00:25:39,380 uneven footpaths low hanging signs which 702 00:25:37,020 --> 00:25:43,020
by the way are also a problem for the 703 00:25:39,380 --> 00:25:45,120 clumsy tall people among us hi 704 00:25:43,020 --> 00:25:46,860 um or measure and track accessibility on 705 00:25:45,120 --> 00:25:50,100 websites what about a project to track 706 00:25:46,860 --> 00:25:51,900 the use of alt tags on Mastodon and 707 00:25:50,100 --> 00:25:52,500 compare it with Twitter 708 00:25:51,900 --> 00:25:54,299 um 709 00:25:52,500 --> 00:25:56,039 really when you start thinking this way 710 00:25:54,299 --> 00:25:57,779 there's just no end to the kinds of 711 00:25:56,039 --> 00:26:00,000 projects that kids can do and the topics 712 00:25:57,779 --> 00:26:02,460 that they can take on and the cool thing 713 00:26:00,000 --> 00:26:04,500 is kids want to do something real they 714 00:26:02,460 --> 00:26:05,159 want to make a difference 715 00:26:04,500 --> 00:26:07,380 um 716 00:26:05,159 --> 00:26:09,000 when they ask why do I learn this they 717 00:26:07,380 --> 00:26:11,820 don't actually want the answer because 718 00:26:09,000 --> 00:26:14,340 it's on the exam they they want to know 719 00:26:11,820 --> 00:26:17,880 how it's meaningful unless it's 720 00:26:14,340 --> 00:26:19,860 fundamentally meaningful it's just built 721 00:26:17,880 --> 00:26:22,919 in 722 00:26:19,860 --> 00:26:24,419 um so every project when the kids come 723 00:26:22,919 --> 00:26:25,919 up with solutions because there's no 724 00:26:24,419 --> 00:26:26,760 textbook answer and they can't look it 725 00:26:25,919 --> 00:26:27,960 up at the back of the book or 726 00:26:26,760 --> 00:26:30,000 compare it with the teacher's answer 727 00:26:27,960 --> 00:26:32,159 sheet they have to evaluate their own 728 00:26:30,000 --> 00:26:33,240 solutions they have to ask how does that 729 00:26:32,159 --> 00:26:35,100 help 730 00:26:33,240 --> 00:26:37,440 how does it harm 731 00:26:35,100 --> 00:26:39,960 does it actually make things better how 732 00:26:37,440 --> 00:26:41,880 could we improve
it imagine if that was 733 00:26:39,960 --> 00:26:44,460 a certain approach to programs in 734 00:26:41,880 --> 00:26:47,880 government a bit radical 735 00:26:44,460 --> 00:26:49,559 none of these are easy questions none of 736 00:26:47,880 --> 00:26:52,080 them have easy answers but that's 737 00:26:49,559 --> 00:26:54,480 fantastic preparation for the real world 738 00:26:52,080 --> 00:26:56,700 where easy questions and obvious answers 739 00:26:54,480 --> 00:26:58,799 are conspicuous by their absence or 740 00:26:56,700 --> 00:27:00,299 deeply suspicious when they're presented 741 00:26:58,799 --> 00:27:01,679 to you 742 00:27:00,299 --> 00:27:03,240 um and the Australian Data Science 743 00:27:01,679 --> 00:27:05,880 Education Institute is building these 744 00:27:03,240 --> 00:27:07,440 kinds of projects and using open data or 745 00:27:05,880 --> 00:27:09,299 getting kids to collect their own data 746 00:27:07,440 --> 00:27:11,700 about problems in the local area for 747 00:27:09,299 --> 00:27:13,740 kids as young as five years old 748 00:27:11,700 --> 00:27:15,720 you can do stuff with kids they're 749 00:27:13,740 --> 00:27:17,520 graphing by stacking blocks it's it's 750 00:27:15,720 --> 00:27:18,600 great fun 751 00:27:17,520 --> 00:27:21,539 um we're building their critical 752 00:27:18,600 --> 00:27:22,919 thinking rational skepticism and STEM 753 00:27:21,539 --> 00:27:24,659 skills from the very start of their 754 00:27:22,919 --> 00:27:25,980 education and we're teaching them that 755 00:27:24,659 --> 00:27:28,020 they have the power to change the world 756 00:27:25,980 --> 00:27:31,279 and that STEM skills 757 00:27:28,020 --> 00:27:31,279 enhance that power 758 00:27:31,620 --> 00:27:35,940 a friend of mine recently hired a newly 759 00:27:33,900 --> 00:27:37,320 graduated data scientist and he came to 760 00:27:35,940 --> 00:27:38,940 her deeply distressed at one point 761 00:27:37,320 --> 00:27:40,679 saying 762 00:27:38,940 --> 00:27:42,480 my
data set like I don't understand 763 00:27:40,679 --> 00:27:44,279 what's going wrong the curve is really 764 00:27:42,480 --> 00:27:45,539 wonky and my friend looked at it 765 00:27:44,279 --> 00:27:47,460 fortunately she's an experienced 766 00:27:45,539 --> 00:27:50,400 statistician she was like yeah that's 767 00:27:47,460 --> 00:27:52,919 that's how real data comes out this guy 768 00:27:50,400 --> 00:27:56,039 had a masters in data science and had 769 00:27:52,919 --> 00:27:58,200 never seen a curve that didn't come out 770 00:27:56,039 --> 00:27:59,940 perfectly I mean that's an outrageous 771 00:27:58,200 --> 00:28:02,340 indictment on that master's program and 772 00:27:59,940 --> 00:28:04,980 I don't know which university it is but 773 00:28:02,340 --> 00:28:07,380 could probably be any of them 774 00:28:04,980 --> 00:28:08,880 um no wonder his poor brain exploded on 775 00:28:07,380 --> 00:28:11,400 contact with real data like that's just 776 00:28:08,880 --> 00:28:13,559 we should 777 00:28:11,400 --> 00:28:15,059 you should never be using textbook data 778 00:28:13,559 --> 00:28:18,360 sets because they don't exist in the 779 00:28:15,059 --> 00:28:19,679 real world it's it's just outrageous so 780 00:28:18,360 --> 00:28:21,179 you have to bear in mind of course that 781 00:28:19,679 --> 00:28:23,520 not every school project will produce 782 00:28:21,179 --> 00:28:25,860 tangible great outcomes or usable 783 00:28:23,520 --> 00:28:27,000 outcomes sometimes a school project you 784 00:28:25,860 --> 00:28:28,559 know disappears at the end of the 785 00:28:27,000 --> 00:28:29,880 semester and that's all there is to it 786 00:28:28,559 --> 00:28:31,679 which is fine 787 00:28:29,880 --> 00:28:32,700 the fascinating thing is that just 788 00:28:31,679 --> 00:28:34,320 knowing that they're working on 789 00:28:32,700 --> 00:28:36,779 something real 790 00:28:34,320 --> 00:28:38,820 um and meaningful gives kids a 791 00:28:36,779 -->
mind-blowing level of motivation and 792 00:28:38,820 --> 00:28:42,240 engagement and it makes it easier for 793 00:28:40,860 --> 00:28:45,480 them to imagine how the skills that 794 00:28:42,240 --> 00:28:46,860 they're learning could be used elsewhere 795 00:28:45,480 --> 00:28:48,480 even though they're learning the same 796 00:28:46,860 --> 00:28:51,120 essential skills 797 00:28:48,480 --> 00:28:52,919 at least where coding is concerned 798 00:28:51,120 --> 00:28:55,020 um they just 799 00:28:52,919 --> 00:28:57,360 they can transfer them much more easily 800 00:28:55,020 --> 00:29:00,120 and also you know they're using Python 801 00:28:57,360 --> 00:29:01,140 not Mindstorms so the the transference is 802 00:29:00,120 --> 00:29:02,880 easier because there's a lot of stuff 803 00:29:01,140 --> 00:29:05,520 happening in Python 804 00:29:02,880 --> 00:29:07,140 open data combined with this kind of 805 00:29:05,520 --> 00:29:08,520 data science education gives students 806 00:29:07,140 --> 00:29:10,020 the power to change the world and ask 807 00:29:08,520 --> 00:29:11,880 all those really difficult questions 808 00:29:10,020 --> 00:29:14,100 that's why I called it Raising Heretics 809 00:29:11,880 --> 00:29:15,419 and I have a book by the same name it's 810 00:29:14,100 --> 00:29:17,159 we're really teaching kids to ask 811 00:29:15,419 --> 00:29:19,440 difficult questions which a lot of 812 00:29:17,159 --> 00:29:21,779 people don't like I was I was in a 813 00:29:19,440 --> 00:29:25,140 teacher conference and I talked about 814 00:29:21,779 --> 00:29:26,580 how you know we we should be challenging 815 00:29:25,140 --> 00:29:27,960 the validity of the assessments and 816 00:29:26,580 --> 00:29:29,520 one of the teachers was like oh you 817 00:29:27,960 --> 00:29:31,020 better not tell the kids that they might 818 00:29:29,520 --> 00:29:34,220 start challenging the validity of 819 00:29:31,020 --> 00:29:37,380 assessments I'm like yes 820 00:29:34,220 -->
00:29:40,440 so not everyone's ready for it 821 00:29:37,380 --> 00:29:42,179 um but as open data enthusiasts I charge 822 00:29:40,440 --> 00:29:44,640 you with a few things that will help us 823 00:29:42,179 --> 00:29:47,340 turn your data into school projects 824 00:29:44,640 --> 00:29:50,880 number one please annotate your data for 825 00:29:47,340 --> 00:29:52,880 non-experts if you have a CSV file with 826 00:29:50,880 --> 00:29:55,919 fields labeled things like 827 00:29:52,880 --> 00:29:58,020 f6642g it may be open but it's not 828 00:29:55,919 --> 00:29:59,760 really accessible so please provide a 829 00:29:58,020 --> 00:30:02,399 data dictionary if you can that 830 00:29:59,760 --> 00:30:04,740 explains every field in clear and 831 00:30:02,399 --> 00:30:06,360 non-specialist language together with as 832 00:30:04,740 --> 00:30:08,520 much information as you can about how 833 00:30:06,360 --> 00:30:09,059 the data was collected and 834 00:30:08,520 --> 00:30:10,799 um 835 00:30:09,059 --> 00:30:12,360 and any processing that's been done to 836 00:30:10,799 --> 00:30:15,000 it 837 00:30:12,360 --> 00:30:18,120 use uh non-proprietary formats if you 838 00:30:15,000 --> 00:30:19,679 possibly can CSV is ideal we can't make 839 00:30:18,120 --> 00:30:21,740 any assumptions about available software 840 00:30:19,679 --> 00:30:24,659 hardware or even internet in schools 841 00:30:21,740 --> 00:30:27,779 sometimes we are reduced to sneaker net 842 00:30:24,659 --> 00:30:30,360 which brings me to if your data set is 843 00:30:27,779 --> 00:30:33,000 massive if you can provide meaningful 844 00:30:30,360 --> 00:30:34,919 subsets that's super helpful so for 845 00:30:33,000 --> 00:30:38,279 example I broke the Google Mobility data 846 00:30:34,919 --> 00:30:41,580 down for Australia into states which is 847 00:30:38,279 --> 00:30:43,020 a you know simple super simple Python 848 00:30:41,580 --> 00:30:44,580 script for me but it's out of the reach 849
00:30:43,020 --> 00:30:46,740 of a lot of teachers and a lot of 850 00:30:44,580 --> 00:30:48,120 students so it just makes it a little 851 00:30:46,740 --> 00:30:50,220 easier for them to dive in without 852 00:30:48,120 --> 00:30:52,320 having to do all the prep work and the 853 00:30:50,220 --> 00:30:53,580 thing about the prep work is I had time 854 00:30:52,320 --> 00:30:55,559 to do it when I was teaching because I 855 00:30:53,580 --> 00:30:58,380 was half time also I already had the 856 00:30:55,559 --> 00:30:59,580 skills most teachers full-time teachers 857 00:30:58,380 --> 00:31:01,440 don't have that time they don't have 858 00:30:59,580 --> 00:31:03,419 time to find the data make sense of it 859 00:31:01,440 --> 00:31:04,980 build a project that's one of the 860 00:31:03,419 --> 00:31:06,360 reasons I created ADSEI to try to short 861 00:31:04,980 --> 00:31:08,880 circuit that for them so they can just 862 00:31:06,360 --> 00:31:10,740 grab the data that's pre-annotated grab 863 00:31:08,880 --> 00:31:14,399 lesson plans and things like that 864 00:31:10,740 --> 00:31:16,380 uh number four be open and explicit 865 00:31:14,399 --> 00:31:18,059 um about the issues and limitations of 866 00:31:16,380 --> 00:31:19,740 your data that's super helpful and also 867 00:31:18,059 --> 00:31:21,720 this is going up on my blog as soon as 868 00:31:19,740 --> 00:31:23,100 I'm done here so um you don't have to 869 00:31:21,720 --> 00:31:25,320 write it down if you want to look for 870 00:31:23,100 --> 00:31:27,480 it it'll be up on the ADSEI website which 871 00:31:25,320 --> 00:31:29,760 I will give you the link to in a minute 872 00:31:27,480 --> 00:31:31,980 it's up there in fact 873 00:31:29,760 --> 00:31:34,020 um yeah talk about patches where there's 874 00:31:31,980 --> 00:31:36,720 data missing or where sensors failed or 875 00:31:34,020 --> 00:31:38,340 the internet uh went down or biases in 876 00:31:36,720 --> 00:31:41,000 the sample anything like that is
super 877 00:31:38,340 --> 00:31:43,919 helpful to know and please please please 878 00:31:41,000 --> 00:31:46,559 do not clean your data 879 00:31:43,919 --> 00:31:48,120 I want kids to be working with the messy 880 00:31:46,559 --> 00:31:49,320 stuff to understand that people don't 881 00:31:48,120 --> 00:31:50,880 always follow the rules when they're 882 00:31:49,320 --> 00:31:52,620 voting for the Senate you know 883 00:31:50,880 --> 00:31:55,140 all that stuff 884 00:31:52,620 --> 00:31:57,360 um real messy complicated data sets are 885 00:31:55,140 --> 00:32:01,140 the best possible preparation for the 886 00:31:57,360 --> 00:32:03,480 real messy complicated real world 887 00:32:01,140 --> 00:32:05,700 um we need to really stop teaching kids 888 00:32:03,480 --> 00:32:07,279 that questions have simple answers 889 00:32:05,700 --> 00:32:10,620 this is my thing 890 00:32:07,279 --> 00:32:12,059 instead we want to equip them to handle 891 00:32:10,620 --> 00:32:15,179 real life 892 00:32:12,059 --> 00:32:17,220 so uh number six provide contact 893 00:32:15,179 --> 00:32:18,779 information if you can if you're 894 00:32:17,220 --> 00:32:20,880 uncomfortable with doing that or your 895 00:32:18,779 --> 00:32:23,640 data set is so exciting that you feel 896 00:32:20,880 --> 00:32:26,039 you're going to be inundated please feel 897 00:32:23,640 --> 00:32:28,380 free to use me as an intermediary 898 00:32:26,039 --> 00:32:29,820 um contact me first 899 00:32:28,380 --> 00:32:31,559 um but I'm happy for you to put my 900 00:32:29,820 --> 00:32:33,659 contact details this is part of what I 901 00:32:31,559 --> 00:32:36,120 do right it's part of what ADSEI does 902 00:32:33,659 --> 00:32:37,919 we'll put up ADSEI as a contact and um 903 00:32:36,120 --> 00:32:39,299 then you at least will only have to 904 00:32:37,919 --> 00:32:41,880 answer the questions that I can't answer 905 00:32:39,299 --> 00:32:44,220 and only answer them once so that's 906
00:32:41,880 --> 00:32:46,080 always an option when I was working with 907 00:32:44,220 --> 00:32:48,960 the AEC data I spent hours on the phone 908 00:32:46,080 --> 00:32:51,860 to the AEC trying to find someone who 909 00:32:48,960 --> 00:32:53,940 could confirm for me the mapping of the 910 00:32:51,860 --> 00:32:56,159 one-dimensional string to the 911 00:32:53,940 --> 00:32:57,779 two-dimensional ballot paper and I could 912 00:32:56,159 --> 00:33:00,080 not find anyone there who knew 913 00:32:57,779 --> 00:33:02,520 presumably there is somebody 914 00:33:00,080 --> 00:33:03,960 but I couldn't find them and it took me 915 00:33:02,520 --> 00:33:06,200 hours and teachers don't have that kind 916 00:33:03,960 --> 00:33:06,200 of time 917 00:33:06,659 --> 00:33:08,779 um 918 00:33:09,240 --> 00:33:14,220 yeah 919 00:33:11,220 --> 00:33:16,919 a key part of ADSEI's mission thank you 920 00:33:14,220 --> 00:33:18,480 is to relieve teachers of the burden of 921 00:33:16,919 --> 00:33:20,039 finding those interesting data sets and 922 00:33:18,480 --> 00:33:22,440 figuring them out as well as helping 923 00:33:20,039 --> 00:33:24,000 teach them how to do this stuff long 924 00:33:22,440 --> 00:33:24,899 term the goal is to put ADSEI out of 925 00:33:24,000 --> 00:33:26,039 business because it'll be the way 926 00:33:24,899 --> 00:33:27,419 teachers teach and the way 927 00:33:26,039 --> 00:33:28,860 teachers are trained to teach and it'll 928 00:33:27,419 --> 00:33:31,740 be the way the curriculum is written 929 00:33:28,860 --> 00:33:34,200 it's going to take a minute or two so in 930 00:33:31,740 --> 00:33:36,419 the meantime ADSEI is a charity because 931 00:33:34,200 --> 00:33:39,899 funding must never be a barrier to 932 00:33:36,419 --> 00:33:42,080 access the road to social equality is 933 00:33:39,899 --> 00:33:45,000 educational equality 934 00:33:42,080 --> 00:33:46,860 which means that we do charge but those 935 00:33:45,000 --> 00:33:48,179 who
can't afford to pay don't and it 936 00:33:46,860 --> 00:33:50,940 also means that we're always strapped for 937 00:33:48,179 --> 00:33:52,860 cash so uh 938 00:33:50,940 --> 00:33:56,539 please feel free 939 00:33:52,860 --> 00:33:56,539 to sling us some money if you can 940 00:33:56,820 --> 00:34:00,179 there we go 941 00:33:58,620 --> 00:34:02,220 um 942 00:34:00,179 --> 00:34:04,440 the more authentic and meaningful data 943 00:34:02,220 --> 00:34:05,880 that we have access to the more kids we 944 00:34:04,440 --> 00:34:08,159 can empower to change the world using 945 00:34:05,880 --> 00:34:09,659 data science so please put them up on 946 00:34:08,159 --> 00:34:11,520 all the data repositories you can find 947 00:34:09,659 --> 00:34:12,720 send them straight to me or send me a 948 00:34:11,520 --> 00:34:14,940 link to let me know that they're there 949 00:34:12,720 --> 00:34:18,599 it's super helpful and gives kids access 950 00:34:14,940 --> 00:34:19,560 to much more interesting projects 951 00:34:18,599 --> 00:34:23,220 um 952 00:34:19,560 --> 00:34:26,639 I do have a book where you can read more 953 00:34:23,220 --> 00:34:28,139 of this stuff it's free as a podcast 954 00:34:26,639 --> 00:34:30,480 um which is really just an audio book 955 00:34:28,139 --> 00:34:32,820 but with lower sound quality 956 00:34:30,480 --> 00:34:34,379 um and it's also you can buy it all over 957 00:34:32,820 --> 00:34:35,300 the place 958 00:34:34,379 --> 00:34:38,580 um 959 00:34:35,300 --> 00:34:41,060 I also have a podcast called Make Me Data 960 00:34:38,580 --> 00:34:43,320 Literate if you do funky stuff with data 961 00:34:41,060 --> 00:34:45,300 sing out and I'll have you on because it's 962 00:34:43,320 --> 00:34:47,700 it's really fun 963 00:34:45,300 --> 00:34:49,260 um and it's a it's just interesting 964 00:34:47,700 --> 00:34:50,639 chats with people who do cool stuff with 965 00:34:49,260 --> 00:34:52,500 data and there's a few of you in the 966 00:34:50,639 -->
00:34:53,940 audience here I'm like oh I have a catch 967 00:34:52,500 --> 00:34:56,119 you one there you go 968 00:34:53,940 --> 00:34:56,119 um 969 00:34:56,220 --> 00:35:01,140 um one more thing 970 00:34:58,980 --> 00:35:03,740 talk to me 971 00:35:01,140 --> 00:35:03,740 questions 972 00:35:05,839 --> 00:35:10,640 Fiona's gonna come around with a 973 00:35:07,380 --> 00:35:10,640 microphone so we can all hear 974 00:35:11,400 --> 00:35:17,579 on your website do you have 975 00:35:14,940 --> 00:35:21,240 some links to data that you've 976 00:35:17,579 --> 00:35:23,099 previously used that you would like or I 977 00:35:21,240 --> 00:35:25,320 think we can get a kick start with yes 978 00:35:23,099 --> 00:35:27,359 uh the website is severely in need of 979 00:35:25,320 --> 00:35:30,000 updating like all good websites but yes 980 00:35:27,359 --> 00:35:31,980 there are there's projects and existing 981 00:35:30,000 --> 00:35:33,660 data sets on there there will be more 982 00:35:31,980 --> 00:35:36,740 I'm actually 983 00:35:33,660 --> 00:35:36,740 since I try to myself 984 00:35:36,839 --> 00:35:41,760 this sounds really exciting for kids I 985 00:35:38,940 --> 00:35:44,940 was wondering if you have any advice for 986 00:35:41,760 --> 00:35:47,400 the adults who got left behind who want 987 00:35:44,940 --> 00:35:48,380 to learn this stuff 988 00:35:47,400 --> 00:35:51,000 um 989 00:35:48,380 --> 00:35:52,500 there's a whole bunch of ways to learn 990 00:35:51,000 --> 00:35:55,560 data science online 991 00:35:52,500 --> 00:35:57,180 um but but again I would say find 992 00:35:55,560 --> 00:35:59,339 yourself a problem 993 00:35:57,180 --> 00:36:01,140 that is meaningful to you and try 994 00:35:59,339 --> 00:36:02,099 to figure that out and build your skills 995 00:36:01,140 --> 00:36:03,180 that way 996 00:36:02,099 --> 00:36:04,680 um because you're more likely to stick 997 00:36:03,180 --> 00:36:05,760 to it for a start because it's interesting 998
00:36:04,680 --> 00:36:06,839 to you 999 00:36:05,760 --> 00:36:09,000 um I'm working with a company called 1000 00:36:06,839 --> 00:36:12,060 Grok Academy who build 1001 00:36:09,000 --> 00:36:13,680 um a whole bunch of uh online coding 1002 00:36:12,060 --> 00:36:16,260 courses and stuff they are building some 1003 00:36:13,680 --> 00:36:19,380 data science courses as well which I am 1004 00:36:16,260 --> 00:36:20,940 involved with and so there's some useful 1005 00:36:19,380 --> 00:36:22,619 sort of data literacy and data science 1006 00:36:20,940 --> 00:36:24,200 skills coming online they're not there 1007 00:36:22,619 --> 00:36:26,520 yet but they will be 1008 00:36:24,200 --> 00:36:27,900 they're free now 1009 00:36:26,520 --> 00:36:29,400 um because they've got a whole bunch of 1010 00:36:27,900 --> 00:36:30,300 funding so 1011 00:36:29,400 --> 00:36:32,579 um 1012 00:36:30,300 --> 00:36:34,500 it's not a bad place to start but really 1013 00:36:32,579 --> 00:36:36,839 find something you're interested in and 1014 00:36:34,500 --> 00:36:38,160 sort of dig into it that way also there 1015 00:36:36,839 --> 00:36:40,079 is a Facebook group called Teachers 1016 00:36:38,160 --> 00:36:41,339 Using Data Science which you can jump 1017 00:36:40,079 --> 00:36:43,200 into even if you're not a teacher and 1018 00:36:41,339 --> 00:36:45,980 ask questions in there and teachers love 1019 00:36:43,200 --> 00:36:45,980 to tell people stuff 1020 00:36:46,020 --> 00:36:48,320 interesting 1021 00:36:49,800 --> 00:36:54,020 are there any more questions for Linda 1022 00:36:57,000 --> 00:37:00,119 um one other thing I hear from teachers 1023 00:36:58,560 --> 00:37:01,940 all the time is but how do I fit this 1024 00:37:00,119 --> 00:37:04,740 into the curriculum 1025 00:37:01,940 --> 00:37:06,900 and the curriculum is 1026 00:37:04,740 --> 00:37:09,300 the slow-moving 1027 00:37:06,900 --> 00:37:10,859 um thing and 1028 00:37:09,300 --> 00:37:13,260 do you have any interesting
stories or 1029 00:37:10,859 --> 00:37:16,320 anecdotes about things you've been able 1030 00:37:13,260 --> 00:37:17,700 to get teachers to do to 1031 00:37:16,320 --> 00:37:19,260 make it 1032 00:37:17,700 --> 00:37:21,660 the simplest thing is that all the 1033 00:37:19,260 --> 00:37:23,760 projects are curriculum linked so 1034 00:37:21,660 --> 00:37:25,859 actually any one of those real projects 1035 00:37:23,760 --> 00:37:27,680 you can hit curriculum points from like 1036 00:37:25,859 --> 00:37:31,079 50 different subjects 1037 00:37:27,680 --> 00:37:32,940 and so I did a bunch of projects for the 1038 00:37:31,079 --> 00:37:35,700 education department in Victoria a few 1039 00:37:32,940 --> 00:37:37,440 years ago and we literally had five or 1040 00:37:35,700 --> 00:37:39,240 six different rubrics for you know 1041 00:37:37,440 --> 00:37:41,400 English and maths and science and 1042 00:37:39,240 --> 00:37:42,900 geography and because they you know 1043 00:37:41,400 --> 00:37:43,859 since you're using a real data set 1044 00:37:42,900 --> 00:37:47,400 you're covering a lot of different 1045 00:37:43,859 --> 00:37:49,020 points so it's not hard to build it into 1046 00:37:47,400 --> 00:37:50,940 the curriculum that they need to teach 1047 00:37:49,020 --> 00:37:53,339 anyway 1048 00:37:50,940 --> 00:37:54,660 the the real challenge is persuading 1049 00:37:53,339 --> 00:37:56,220 teachers that they have the skills they 1050 00:37:54,660 --> 00:37:58,920 need to do this which is why most of 1051 00:37:56,220 --> 00:38:00,180 what I do is in spreadsheets when I'm 1052 00:37:58,920 --> 00:38:01,380 training teachers because if I walk into 1053 00:38:00,180 --> 00:38:02,400 a room of teachers and say I'm going to 1054 00:38:01,380 --> 00:38:05,119 teach you all to code in Python 1055 00:38:02,400 --> 00:38:05,119 there'll be a whisking noise 1056 00:38:05,579 --> 00:38:09,060 um it's there's there are teachers out 1057 00:38:07,500 --> 00:38:12,119 there doing
amazing stuff in Python but 1058 00:38:09,060 --> 00:38:13,680 they're very very outnumbered and so if 1059 00:38:12,119 --> 00:38:15,540 I'm trying to engage science teachers 1060 00:38:13,680 --> 00:38:17,040 and maths teachers and history teachers 1061 00:38:15,540 --> 00:38:18,599 and geography teachers actually 1062 00:38:17,040 --> 00:38:20,160 geography teachers are all over this 1063 00:38:18,599 --> 00:38:22,079 stuff they're amazing 1064 00:38:20,160 --> 00:38:23,280 um but I need to show them they've 1065 00:38:22,079 --> 00:38:24,780 already got the skills and you can do 1066 00:38:23,280 --> 00:38:27,180 really simple stuff in spreadsheets 1067 00:38:24,780 --> 00:38:28,380 which is again where the subsets of data 1068 00:38:27,180 --> 00:38:30,000 come in handy 1069 00:38:28,380 --> 00:38:31,859 because we can do stuff in spreadsheets 1070 00:38:30,000 --> 00:38:34,800 and not have to code ultimately everyone 1071 00:38:31,859 --> 00:38:36,780 will code but one step at a time 1072 00:38:34,800 --> 00:38:38,160 so in addition to the whisking noise the 1073 00:38:36,780 --> 00:38:39,660 teachers make when they're scurrying out 1074 00:38:38,160 --> 00:38:42,720 of the room what other type of 1075 00:38:39,660 --> 00:38:44,040 resistance or sort of refusal have you 1076 00:38:42,720 --> 00:38:45,859 encountered throughout this entire 1077 00:38:44,040 --> 00:38:48,599 journey that you've been on 1078 00:38:45,859 --> 00:38:50,339 uh the biggest thing is if I label my 1079 00:38:48,599 --> 00:38:52,560 workshops as data science workshops I 1080 00:38:50,339 --> 00:38:56,220 get no signups if I label them 1081 00:38:52,560 --> 00:38:59,820 as STEM workshops they sell out 1082 00:38:56,220 --> 00:39:02,700 same content it is the fear factor but 1083 00:38:59,820 --> 00:39:05,220 also you know I I've sometimes worked with 1084 00:39:02,700 --> 00:39:07,200 schools where a head of department head 1085 00:39:05,220 --> 00:39:09,359 of science has said
everyone must work 1086 00:39:07,200 --> 00:39:11,700 with Linda and there's crickets and 1087 00:39:09,359 --> 00:39:13,680 then he says please contact Linda and 1088 00:39:11,700 --> 00:39:16,800 set up a time crickets and then one 1089 00:39:13,680 --> 00:39:18,720 person does it and the first time I did 1090 00:39:16,800 --> 00:39:20,820 that the person who came to me came 1091 00:39:18,720 --> 00:39:21,780 looking like she was gonna die like she 1092 00:39:20,820 --> 00:39:24,300 was like 1093 00:39:21,780 --> 00:39:26,280 you know lamb to the slaughter you know 1094 00:39:24,300 --> 00:39:29,940 she was pretty sure she was walking 1095 00:39:26,280 --> 00:39:31,079 into a bloodbath and I sat sat her down 1096 00:39:29,940 --> 00:39:33,000 and went through what she was already 1097 00:39:31,079 --> 00:39:34,980 doing she's a chem teacher an excellent 1098 00:39:33,000 --> 00:39:36,540 chem teacher and I was like so when 1099 00:39:34,980 --> 00:39:37,680 you're doing errors we could do this and 1100 00:39:36,540 --> 00:39:40,440 she said 1101 00:39:37,680 --> 00:39:42,359 I do that and you know just the the 1102 00:39:40,440 --> 00:39:44,339 weight fell off her shoulders and she 1103 00:39:42,359 --> 00:39:46,320 kind of sat up straight again 1104 00:39:44,339 --> 00:39:47,640 um it really is the fear factor is just 1105 00:39:46,320 --> 00:39:50,400 the big thing 1106 00:39:47,640 --> 00:39:52,619 um so it's it's getting past that and 1107 00:39:50,400 --> 00:39:56,400 that's you know as much a sociology and 1108 00:39:52,619 --> 00:40:00,599 marketing exercise as it is anything else 1109 00:39:56,400 --> 00:40:03,119 um and it kind of helps to have 1110 00:40:00,599 --> 00:40:05,400 um a mole in the school you know someone 1111 00:40:03,119 --> 00:40:07,140 who's already you know a double agent 1112 00:40:05,400 --> 00:40:09,680 spreading the word kind of thing that 1113 00:40:07,140 --> 00:40:09,680 works really well 1114 00:40:11,160 --> 00:40:13,880
there's a 1115 00:40:14,520 --> 00:40:17,520 sort of dual curriculum there's 1116 00:40:15,839 --> 00:40:19,140 there's a kind of technology is 1117 00:40:17,520 --> 00:40:20,760 something that we use as a tool and it's 1118 00:40:19,140 --> 00:40:24,420 everywhere and then there's the 1119 00:40:20,760 --> 00:40:26,599 technology curriculum where do you 1120 00:40:24,420 --> 00:40:30,119 do you find that 1121 00:40:26,599 --> 00:40:31,680 complicates it or do you have a way of 1122 00:40:30,119 --> 00:40:34,140 dealing with both of them I think that's 1123 00:40:31,680 --> 00:40:36,359 a really weird setup and I always have 1124 00:40:34,140 --> 00:40:38,160 but I don't I don't particularly care 1125 00:40:36,359 --> 00:40:40,380 about the faculties or the divisions 1126 00:40:38,160 --> 00:40:42,599 because this stuff should be in all of 1127 00:40:40,380 --> 00:40:45,060 the subjects you can I've seen a really 1128 00:40:42,599 --> 00:40:46,800 cool project that analyzes 1129 00:40:45,060 --> 00:40:48,960 um different characters appearing in 1130 00:40:46,800 --> 00:40:51,660 Harry Potter books I wouldn't do Harry 1131 00:40:48,960 --> 00:40:53,940 Potter now but you know you could like 1132 00:40:51,660 --> 00:40:56,220 just it's just it's like a kind of 1133 00:40:53,940 --> 00:40:57,960 visual display of where the characters 1134 00:40:56,220 --> 00:40:59,400 are most likely to appear in the books 1135 00:40:57,960 --> 00:41:00,660 and which characters appear on the same 1136 00:40:59,400 --> 00:41:02,040 page and things like that like there's 1137 00:41:00,660 --> 00:41:03,980 all kinds of stuff you can do there's a 1138 00:41:02,040 --> 00:41:07,560 wonderful history project using the 1139 00:41:03,980 --> 00:41:11,160 Titanic data set which is a readily 1140 00:41:07,560 --> 00:41:15,960 available text file which has the the 1141 00:41:11,160 --> 00:41:18,320 name the age the sex the um 1142 00:41:15,960 --> 00:41:20,820 class and in some cases the
occupation 1143 00:41:18,320 --> 00:41:22,320 of the passengers and crew on the 1144 00:41:20,820 --> 00:41:23,460 Titanic and whether they lived or died so 1145 00:41:22,320 --> 00:41:26,339 you can analyze it and say well did 1146 00:41:23,460 --> 00:41:29,579 women and children really go first 1147 00:41:26,339 --> 00:41:31,200 um things like that that's history 1148 00:41:29,579 --> 00:41:33,380 and it doesn't require a lot of maths 1149 00:41:31,200 --> 00:41:33,380 either 1150 00:41:34,380 --> 00:41:39,060 okay I think we're at time so would you 1151 00:41:37,140 --> 00:41:40,800 all join with me please in thanking 1152 00:41:39,060 --> 00:41:42,720 Linda for a very thought-provoking 1153 00:41:40,800 --> 00:41:45,619 session 1154 00:41:42,720 --> 00:41:45,619 thank you
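
[Editor's note] The Titanic question from the final answer ("did women and children really go first") might be explored along the lines of the sketch below. This is only an illustration, not the project described in the talk. The column names, the under-16 cut-off for "child", and the six sample rows are all assumptions made up for the example; they are not real passenger records, and a real copy of the data set may use different headers and will have missing ages that need handling.

```python
import csv
import io
from collections import defaultdict

# Hypothetical toy rows for illustration only; NOT real passenger records.
# The column names are assumptions; a real file's headers may differ.
SAMPLE_CSV = """name,age,sex,class,survived
Passenger A,29,female,1,1
Passenger B,35,male,1,0
Passenger C,8,male,3,1
Passenger D,40,male,3,0
Passenger E,31,female,3,0
Passenger F,6,female,2,1
"""


def group_of(row):
    """Label a row 'child' (under 16, an arbitrary cut-off), 'woman' or 'man'."""
    if float(row["age"]) < 16:
        return "child"
    return "woman" if row["sex"] == "female" else "man"


def survival_rates(csv_text):
    """Return the fraction of survivors in each group as a dict."""
    counts = defaultdict(lambda: [0, 0])  # group -> [survivors, total]
    for row in csv.DictReader(io.StringIO(csv_text)):
        tally = counts[group_of(row)]
        tally[0] += int(row["survived"])
        tally[1] += 1
    return {group: s / n for group, (s, n) in counts.items()}


if __name__ == "__main__":
    for group, rate in sorted(survival_rates(SAMPLE_CSV).items()):
        print(f"{group}: {rate:.0%} survived")
```

The same tally can be done in a spreadsheet with a filter and a count per group, which fits the point made earlier in the session that teachers do not need to code to run this kind of project.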