1 00:00:06,320 --> 00:00:11,499 [Music] 2 00:00:16,320 --> 00:00:20,800 welcome back everyone um thank you 3 00:00:19,439 --> 00:00:23,359 for 4 00:00:20,800 --> 00:00:24,160 bearing with us until the end of today 5 00:00:23,359 --> 00:00:25,840 um 6 00:00:24,160 --> 00:00:27,800 next up we have 7 00:00:25,840 --> 00:00:30,480 daniel giving us a talk about 8 00:00:27,800 --> 00:00:33,680 musicbrainz.org and wikidata.org 9 00:00:30,480 --> 00:00:35,920 um daniel is a frequent lca attendee and 10 00:00:33,680 --> 00:00:38,000 i remember him coming to our first ever 11 00:00:35,920 --> 00:00:40,079 go glam which was fantastic so it's good 12 00:00:38,000 --> 00:00:42,320 to see familiar faces giving talks i've 13 00:00:40,079 --> 00:00:43,120 been going to as many as lcas as you 14 00:00:42,320 --> 00:00:44,399 have 15 00:00:43,120 --> 00:00:47,039 oh geez 16 00:00:44,399 --> 00:00:49,120 it's been a long time then um he's also 17 00:00:47,039 --> 00:00:52,719 a music enthusiast and i will hand 18 00:00:49,120 --> 00:00:54,960 straight over to daniel thanks 19 00:00:52,719 --> 00:00:56,800 okay 20 00:00:54,960 --> 00:00:58,480 hey everyone um 21 00:00:56,800 --> 00:01:00,640 i'm gonna 22 00:00:58,480 --> 00:01:03,920 give an introduction to two projects 23 00:01:00,640 --> 00:01:07,200 that i've been editing in 24 00:01:03,920 --> 00:01:08,799 um for many many years 25 00:01:07,200 --> 00:01:09,840 um 26 00:01:08,799 --> 00:01:11,040 and 27 00:01:09,840 --> 00:01:13,680 hopefully 28 00:01:11,040 --> 00:01:16,000 you learned something 29 00:01:13,680 --> 00:01:16,000 um 30 00:01:19,200 --> 00:01:24,560 so about me um 31 00:01:21,520 --> 00:01:29,520 i'm daniel sobi i'm from adelaide 32 00:01:24,560 --> 00:01:32,400 i attend the conf um sort of in my 33 00:01:29,520 --> 00:01:35,240 spare time sitting in front of the tv 34 00:01:32,400 --> 00:01:37,200 i occasionally edit a website called 35 00:01:35,240 --> 00:01:38,640 musicbrainz.org 36 00:01:37,200 --> 00:01:42,560 
um 37 00:01:38,640 --> 00:01:45,119 i have 31 000 edits and i'm not one of 38 00:01:42,560 --> 00:01:47,119 the software developers or 39 00:01:45,119 --> 00:01:49,920 there's things called auto editors that 40 00:01:47,119 --> 00:01:51,280 can auto approve things i'm just a 41 00:01:49,920 --> 00:01:53,520 normal 42 00:01:51,280 --> 00:01:55,840 person that's just been stuck around for 43 00:01:53,520 --> 00:01:55,840 ages 44 00:01:56,240 --> 00:01:59,280 um 45 00:01:57,200 --> 00:02:00,960 so the two projects 46 00:01:59,280 --> 00:02:02,640 that i'm going to talk about is a 47 00:02:00,960 --> 00:02:04,719 musicbrainz.org 48 00:02:02,640 --> 00:02:07,360 it's a website 49 00:02:04,719 --> 00:02:10,239 it's got music data it's got people 50 00:02:07,360 --> 00:02:12,239 things about artists labels 51 00:02:10,239 --> 00:02:14,400 recordings it's 52 00:02:12,239 --> 00:02:16,480 not necessarily all music 53 00:02:14,400 --> 00:02:18,480 there are things like audio books and 54 00:02:16,480 --> 00:02:20,640 other things that are sort of music 55 00:02:18,480 --> 00:02:22,480 related 56 00:02:20,640 --> 00:02:24,800 we don't really care if it's a 57 00:02:22,480 --> 00:02:26,400 commercial cd or 58 00:02:24,800 --> 00:02:28,160 someone's soundcloud that they've 59 00:02:26,400 --> 00:02:30,560 uploaded something it just 60 00:02:28,160 --> 00:02:34,239 has to exist 61 00:02:30,560 --> 00:02:35,879 and uh the other project um 62 00:02:34,239 --> 00:02:37,599 i'm going to talk about is uh 63 00:02:35,879 --> 00:02:40,319 wikidata.org 64 00:02:37,599 --> 00:02:40,319 it's sort of a 65 00:02:40,800 --> 00:02:45,920 parallel database used by wikipedia 66 00:02:43,920 --> 00:02:47,440 to um 67 00:02:45,920 --> 00:02:49,760 to link 68 00:02:47,440 --> 00:02:51,760 all the sites together and to store 69 00:02:49,760 --> 00:02:53,680 factual information 70 00:02:51,760 --> 00:02:55,040 so there's going to be things like the 71 00:02:53,680 --> 00:02:57,519 height of 72 
00:02:55,040 --> 00:02:59,280 everest and the height of all the large 73 00:02:57,519 --> 00:03:01,360 mountains 74 00:02:59,280 --> 00:03:02,840 presidents 75 00:03:01,360 --> 00:03:05,280 all sorts of things 76 00:03:02,840 --> 00:03:07,200 films you name it there's probably an 77 00:03:05,280 --> 00:03:09,280 entry for it 78 00:03:07,200 --> 00:03:10,319 and 79 00:03:09,280 --> 00:03:12,159 it's 80 00:03:10,319 --> 00:03:16,159 another database 81 00:03:12,159 --> 00:03:16,159 that anyone can edit and they sort of 82 00:03:16,239 --> 00:03:20,720 have two different approaches to how 83 00:03:17,840 --> 00:03:22,720 they're structured so they're sort of 84 00:03:20,720 --> 00:03:23,680 related but sort of do things differently 85 00:03:22,720 --> 00:03:26,239 and 86 00:03:23,680 --> 00:03:30,200 hopefully if i get to it i'll 87 00:03:26,239 --> 00:03:30,200 explain some of that on the way 88 00:03:32,080 --> 00:03:34,319 so 89 00:03:34,879 --> 00:03:40,400 musicbrainz it's a 90 00:03:37,280 --> 00:03:42,720 database of audio recordings 91 00:03:40,400 --> 00:03:45,680 um each entity has a 92 00:03:42,720 --> 00:03:46,400 long string a uuid 93 00:03:45,680 --> 00:03:49,840 that's 94 00:03:46,400 --> 00:03:51,840 really important because 95 00:03:49,840 --> 00:03:54,000 there's an awful lot of name collisions 96 00:03:51,840 --> 00:03:56,000 and that sort of thing so there are 97 00:03:54,000 --> 00:03:58,560 two bands called pendulum 98 00:03:56,000 --> 00:04:00,640 they both have aria awards 99 00:03:58,560 --> 00:04:01,840 you have to be able to say which one it 100 00:04:00,640 --> 00:04:03,840 is 101 00:04:01,840 --> 00:04:06,000 when dealing with 102 00:04:03,840 --> 00:04:08,239 such a big database 103 00:04:06,000 --> 00:04:09,360 it's all user pretty much all user 104 00:04:08,239 --> 00:04:13,040 edited 105 00:04:09,360 --> 00:04:15,519 there's pretty much no bots that edit 106 00:04:13,040 --> 00:04:16,320 um it's 107 00:04:15,519 --> 
00:04:17,680 um 108 00:04:16,320 --> 00:04:19,440 hosted 109 00:04:17,680 --> 00:04:20,959 and run by a 110 00:04:19,440 --> 00:04:23,919 non-profit organization called 111 00:04:20,959 --> 00:04:26,960 metabrainz.org 112 00:04:23,919 --> 00:04:26,960 they do get funding 113 00:04:27,199 --> 00:04:31,440 the data is all public 114 00:04:30,479 --> 00:04:34,160 um 115 00:04:31,440 --> 00:04:36,960 mostly creative commons zero license 116 00:04:34,160 --> 00:04:39,199 uh they get funding from uh things like 117 00:04:36,960 --> 00:04:40,080 google and spotify and 118 00:04:39,199 --> 00:04:42,720 other 119 00:04:40,080 --> 00:04:43,140 big music companies 120 00:04:42,720 --> 00:04:44,479 um 121 00:04:43,140 --> 00:04:46,960 [Music] 122 00:04:44,479 --> 00:04:49,360 so we provide the raw data and 123 00:04:46,960 --> 00:04:51,680 they'll use the data to help fix up 124 00:04:49,360 --> 00:04:54,000 their system and that's sort of 125 00:04:51,680 --> 00:04:57,280 um been enough to help 126 00:04:54,000 --> 00:05:00,000 fund the organization and keep a few 127 00:04:57,280 --> 00:05:04,000 key employees keeping the lights on 128 00:05:00,000 --> 00:05:04,000 fixing bugs adding new features 129 00:05:04,800 --> 00:05:08,479 um 130 00:05:06,800 --> 00:05:11,919 there's a public api 131 00:05:08,479 --> 00:05:13,680 anyone can um use it 132 00:05:11,919 --> 00:05:15,120 uh the thing that we need 133 00:05:13,680 --> 00:05:17,199 from you if you're gonna use the public 134 00:05:15,120 --> 00:05:19,840 api is 135 00:05:17,199 --> 00:05:24,080 in your http headers you need to include 136 00:05:19,840 --> 00:05:25,759 a string to say who you are so we can 137 00:05:24,080 --> 00:05:27,440 let you know if you're doing something 138 00:05:25,759 --> 00:05:30,479 really bad 139 00:05:27,440 --> 00:05:32,639 that has occasionally been needed and 140 00:05:30,479 --> 00:05:33,680 if you have a qnap 141 00:05:32,639 --> 00:05:36,000 nas 142 00:05:33,680 --> 00:05:36,880 they 
would be 143 00:05:36,000 --> 00:05:39,840 um 144 00:05:36,880 --> 00:05:41,840 hitting the api every nas that 145 00:05:39,840 --> 00:05:42,800 was in existence was hitting the api so 146 00:05:41,840 --> 00:05:44,960 much 147 00:05:42,800 --> 00:05:47,759 that eventually they decided to block 148 00:05:44,960 --> 00:05:50,479 them and if you needed to 149 00:05:47,759 --> 00:05:53,759 use the api you needed to say can you 150 00:05:50,479 --> 00:05:56,160 please unblock me until they updated the 151 00:05:53,759 --> 00:05:59,520 software to not hit that every 152 00:05:56,160 --> 00:05:59,520 single every five minutes 153 00:05:59,840 --> 00:06:03,919 um 154 00:06:01,039 --> 00:06:06,720 each entry has an edit history 155 00:06:03,919 --> 00:06:07,919 so you can easily say who was the idiot 156 00:06:06,720 --> 00:06:09,840 that 157 00:06:07,919 --> 00:06:12,800 made this change and 158 00:06:09,840 --> 00:06:15,199 okay hopefully it's not you that 159 00:06:12,800 --> 00:06:17,360 stuffed things up 160 00:06:15,199 --> 00:06:17,360 so 161 00:06:18,000 --> 00:06:21,520 the basic data structure is we've got 162 00:06:20,080 --> 00:06:23,440 artists 163 00:06:21,520 --> 00:06:25,199 we've got release groups 164 00:06:23,440 --> 00:06:27,039 which i'll get to in a minute but that's 165 00:06:25,199 --> 00:06:30,080 sort of the 166 00:06:27,039 --> 00:06:32,080 overall concept of an album so 167 00:06:30,080 --> 00:06:35,120 there is a release which is a specific 168 00:06:32,080 --> 00:06:38,639 one so you might have a cd with 169 00:06:35,120 --> 00:06:39,600 13 tracks and a special edition with 18 170 00:06:38,639 --> 00:06:41,680 tracks 171 00:06:39,600 --> 00:06:43,520 and another edition with 172 00:06:41,680 --> 00:06:44,560 two cds 173 00:06:43,520 --> 00:06:45,360 so 174 00:06:44,560 --> 00:06:46,639 to 175 00:06:45,360 --> 00:06:48,880 fit 176 00:06:46,639 --> 00:06:51,199 all the concept of these different 177 00:06:48,880 --> 00:06:53,440 
editions there's the release group so 178 00:06:51,199 --> 00:06:54,639 anything that fits in the 179 00:06:53,440 --> 00:06:56,319 overall 180 00:06:54,639 --> 00:06:58,560 properties 181 00:06:56,319 --> 00:07:01,199 is stored on the release group 182 00:06:58,560 --> 00:07:03,840 anything that's specific to a 183 00:07:01,199 --> 00:07:03,840 a 184 00:07:04,000 --> 00:07:10,240 particular cd goes on the release 185 00:07:07,840 --> 00:07:11,599 there is medium so 186 00:07:10,240 --> 00:07:12,880 there's 187 00:07:11,599 --> 00:07:15,039 we can have 188 00:07:12,880 --> 00:07:17,680 data about things like 189 00:07:15,039 --> 00:07:20,240 records and 190 00:07:17,680 --> 00:07:22,240 wax cylinders and all sorts of other 191 00:07:20,240 --> 00:07:23,599 medium types 192 00:07:22,240 --> 00:07:25,039 uh 193 00:07:23,599 --> 00:07:27,199 you can add 194 00:07:25,039 --> 00:07:29,360 of course we've got recordings 195 00:07:27,199 --> 00:07:31,039 we'll get to that bit later 196 00:07:29,360 --> 00:07:32,720 there's works 197 00:07:31,039 --> 00:07:35,120 which is sort of the writing credits and 198 00:07:32,720 --> 00:07:38,080 that sort of thing as opposed to 199 00:07:35,120 --> 00:07:38,080 who sung or not 200 00:07:38,240 --> 00:07:43,840 um there's labels 201 00:07:40,639 --> 00:07:45,919 um there's series 202 00:07:43,840 --> 00:07:48,319 uh relationships between the two and 203 00:07:45,919 --> 00:07:51,360 then there's a few other things like 204 00:07:48,319 --> 00:07:53,759 um we've got a definitive list of 205 00:07:51,360 --> 00:07:55,520 all possible instruments 206 00:07:53,759 --> 00:07:58,000 and things like that that are maintained 207 00:07:55,520 --> 00:08:00,080 by the 208 00:07:58,000 --> 00:08:01,840 the editors and developers 209 00:08:00,080 --> 00:08:03,120 so 210 00:08:01,840 --> 00:08:05,919 here's 211 00:08:03,120 --> 00:08:08,240 an example of what you'd see for daft 212 00:08:05,919 --> 00:08:11,240 punk 213 00:08:08,240 --> 00:08:11,240 
um 214 00:08:12,639 --> 00:08:16,960 one thing that you might want to 215 00:08:15,120 --> 00:08:19,599 take notice of 216 00:08:16,960 --> 00:08:20,720 from the glam perspective is 217 00:08:19,599 --> 00:08:23,120 uh we 218 00:08:20,720 --> 00:08:26,879 try to link to other databases and have 219 00:08:23,120 --> 00:08:30,560 identifiers as much as we can so 220 00:08:26,879 --> 00:08:31,440 ipi and isni are 221 00:08:30,560 --> 00:08:33,200 um 222 00:08:31,440 --> 00:08:36,000 there for 223 00:08:33,200 --> 00:08:40,640 rights societies and that sort of thing 224 00:08:36,000 --> 00:08:41,519 to track a particular person or a group 225 00:08:40,640 --> 00:08:43,680 um 226 00:08:41,519 --> 00:08:45,519 there's attributes to say 227 00:08:43,680 --> 00:08:46,560 who are the members of the group they'll 228 00:08:45,519 --> 00:08:48,959 be another 229 00:08:46,560 --> 00:08:50,959 artist entry so it 230 00:08:48,959 --> 00:08:53,200 so you can have 231 00:08:50,959 --> 00:08:55,839 further detail 232 00:08:53,200 --> 00:08:55,839 um 233 00:08:57,680 --> 00:09:02,880 we link to official home pages um 234 00:09:00,720 --> 00:09:04,240 youtube 235 00:09:02,880 --> 00:09:06,959 pages 236 00:09:04,240 --> 00:09:09,680 twitter accounts that sort of thing 237 00:09:06,959 --> 00:09:12,480 so if you've got a soundcloud 238 00:09:09,680 --> 00:09:14,000 we want that sort of information so that 239 00:09:12,480 --> 00:09:16,800 makes it easier for 240 00:09:14,000 --> 00:09:20,800 if someone finds an artist they can 241 00:09:16,800 --> 00:09:20,800 find more information about them 242 00:09:21,440 --> 00:09:24,160 so 243 00:09:22,880 --> 00:09:26,240 yeah lots of 244 00:09:24,160 --> 00:09:27,440 lots of urls lots of 245 00:09:26,240 --> 00:09:29,440 extra data 246 00:09:27,440 --> 00:09:31,839 so 247 00:09:29,440 --> 00:09:35,200 uh release groups 248 00:09:31,839 --> 00:09:37,200 sort of the overall thing so 249 00:09:35,200 --> 00:09:37,920 one thing that i'd 
like 250 00:09:37,200 --> 00:09:40,959 to 251 00:09:37,920 --> 00:09:43,680 to include is if you 252 00:09:40,959 --> 00:09:46,080 look down a bit further that 253 00:09:43,680 --> 00:09:48,560 um you can have um 254 00:09:46,080 --> 00:09:48,560 links to 255 00:09:48,839 --> 00:09:53,600 singles links to other releases and 256 00:09:51,680 --> 00:09:56,560 there's a whole bunch of relationships 257 00:09:53,600 --> 00:09:58,399 of this relates to this other thing 258 00:09:56,560 --> 00:10:00,160 so everything's built 259 00:09:58,399 --> 00:10:03,600 with relationships 260 00:10:00,160 --> 00:10:05,279 that you as an editor can add 261 00:10:03,600 --> 00:10:08,160 expanding 262 00:10:05,279 --> 00:10:09,839 um the knowledge so 263 00:10:08,160 --> 00:10:12,640 so if you look at 264 00:10:09,839 --> 00:10:14,480 associated singles from 265 00:10:12,640 --> 00:10:19,399 random access memories they've got get 266 00:10:14,480 --> 00:10:19,399 lucky instant crush etc 267 00:10:20,560 --> 00:10:26,160 so getting back down to the 268 00:10:23,440 --> 00:10:29,680 further level of a release 269 00:10:26,160 --> 00:10:30,959 you can have things like barcode 270 00:10:29,680 --> 00:10:33,640 there's 271 00:10:30,959 --> 00:10:35,600 cover art which isn't hosted by 272 00:10:33,640 --> 00:10:38,399 musicbrainz.org 273 00:10:35,600 --> 00:10:39,760 that's hosted by archive.org on our 274 00:10:38,399 --> 00:10:40,880 behalf 275 00:10:39,760 --> 00:10:44,160 so if you 276 00:10:40,880 --> 00:10:44,160 upload cover art 277 00:10:44,640 --> 00:10:49,760 archive.org will take it 278 00:10:47,680 --> 00:10:52,160 they'll host it for us and 279 00:10:49,760 --> 00:10:54,560 if there's a copyright issue 280 00:10:52,160 --> 00:10:57,040 they can deal with it as another 281 00:10:54,560 --> 00:11:00,040 organization 282 00:10:57,040 --> 00:11:00,040 um 283 00:11:01,200 --> 00:11:05,360 you've got a list of tracks 284 00:11:03,360 --> 00:11:07,040 there 285 00:11:05,360 --> 
00:11:09,600 there'll be catalog numbers and that 286 00:11:07,040 --> 00:11:09,600 sort of thing 287 00:11:10,800 --> 00:11:14,800 um 288 00:11:12,079 --> 00:11:17,120 so mediums 289 00:11:14,800 --> 00:11:17,120 um 290 00:11:17,200 --> 00:11:21,040 there'll be a type so it'll be cd or dvd 291 00:11:20,240 --> 00:11:22,959 or 292 00:11:21,040 --> 00:11:24,240 just digital download 293 00:11:22,959 --> 00:11:26,480 um 294 00:11:24,240 --> 00:11:28,399 one of the edge cases or two 295 00:11:26,480 --> 00:11:31,200 edge cases that i'd like to 296 00:11:28,399 --> 00:11:31,200 point out is 297 00:11:31,440 --> 00:11:35,760 a thing called a pre-gap track 298 00:11:34,560 --> 00:11:39,760 um 299 00:11:35,760 --> 00:11:43,360 some cds when people were experimenting 300 00:11:39,760 --> 00:11:45,920 um a cd doesn't have a 301 00:11:43,360 --> 00:11:47,200 proper table of contents it's not 302 00:11:45,920 --> 00:11:49,680 a data track 303 00:11:47,200 --> 00:11:50,480 it's just pcm audio 304 00:11:49,680 --> 00:11:52,240 and 305 00:11:50,480 --> 00:11:55,120 it's got a 306 00:11:52,240 --> 00:11:58,320 when you put in a cd it reads 307 00:11:55,120 --> 00:11:59,839 the table of contents which is start at 308 00:11:58,320 --> 00:12:02,160 one minute start at three minutes 309 00:11:59,839 --> 00:12:04,639 start at five minutes for the next track 310 00:12:02,160 --> 00:12:07,680 with a two second gap 311 00:12:04,639 --> 00:12:09,200 um there are a handful of cds that have 312 00:12:07,680 --> 00:12:11,279 hidden tracks 313 00:12:09,200 --> 00:12:14,320 so you put the cd in 314 00:12:11,279 --> 00:12:15,200 you press rewind and there's hidden data 315 00:12:14,320 --> 00:12:16,240 there 316 00:12:15,200 --> 00:12:18,720 so 317 00:12:16,240 --> 00:12:20,800 if you've got some of the early hilltop 318 00:12:18,720 --> 00:12:23,519 hoods 319 00:12:20,800 --> 00:12:25,600 they have hidden tracks 320 00:12:23,519 --> 00:12:27,519 it's just them 
chatting but 321 00:12:25,600 --> 00:12:28,399 some cool things like that 322 00:12:27,519 --> 00:12:30,320 um 323 00:12:28,399 --> 00:12:32,560 the other example 324 00:12:30,320 --> 00:12:34,320 um 325 00:12:32,560 --> 00:12:37,040 is from 326 00:12:34,320 --> 00:12:38,720 nine inch nails broken 327 00:12:37,040 --> 00:12:41,279 what they did is 328 00:12:38,720 --> 00:12:44,240 the maximum tracks that you can have on 329 00:12:41,279 --> 00:12:46,000 a cd is 99 330 00:12:44,240 --> 00:12:48,160 so they had 331 00:12:46,000 --> 00:12:49,839 the first six were normal 332 00:12:48,160 --> 00:12:53,600 then 333 00:12:49,839 --> 00:12:56,160 a whole bunch of tracks that had 334 00:12:53,600 --> 00:12:57,519 a fraction of a second of audio and then 335 00:12:56,160 --> 00:13:00,720 the last two 336 00:12:57,519 --> 00:13:02,959 are the tracks 98 and 99 so 337 00:13:00,720 --> 00:13:05,519 if you've got a cd player 338 00:13:02,959 --> 00:13:07,519 it would skip to the last two and you 339 00:13:05,519 --> 00:13:10,800 can't go back really easily because 340 00:13:07,519 --> 00:13:10,800 there's a whole bunch of silence 341 00:13:12,320 --> 00:13:15,920 so 342 00:13:13,760 --> 00:13:17,440 recordings is where you add an awful lot 343 00:13:15,920 --> 00:13:21,279 of data 344 00:13:17,440 --> 00:13:22,240 it's sort of up to the users how 345 00:13:21,279 --> 00:13:24,240 um 346 00:13:22,240 --> 00:13:26,560 good the data is but 347 00:13:24,240 --> 00:13:29,120 uh the schema allows for things like 348 00:13:26,560 --> 00:13:32,839 who played bass guitar who played 349 00:13:29,120 --> 00:13:35,760 keyboard who who 350 00:13:32,839 --> 00:13:38,320 sung so 351 00:13:35,760 --> 00:13:39,279 with some of these really popular tracks 352 00:13:38,320 --> 00:13:42,000 you get 353 00:13:39,279 --> 00:13:44,079 extra data that you can use to tag your 354 00:13:42,000 --> 00:13:45,920 music 355 00:13:44,079 --> 00:13:48,639 and 356 00:13:45,920 --> 00:13:50,800 
the other thing is usually if 357 00:13:48,639 --> 00:13:52,800 someone's gone to the time 358 00:13:50,800 --> 00:13:54,880 they'll have works 359 00:13:52,800 --> 00:13:56,959 so works is 360 00:13:54,880 --> 00:13:57,920 who wrote the thing 361 00:13:56,959 --> 00:14:00,399 um 362 00:13:57,920 --> 00:14:02,800 it's got um 363 00:14:00,399 --> 00:14:04,800 ids from rights societies 364 00:14:02,800 --> 00:14:07,279 allowing someone else to 365 00:14:04,800 --> 00:14:08,320 double check your work 366 00:14:07,279 --> 00:14:11,120 and 367 00:14:08,320 --> 00:14:11,120 they're all linked to 368 00:14:11,199 --> 00:14:15,440 to who actually wrote it and 369 00:14:14,320 --> 00:14:16,560 yeah 370 00:14:15,440 --> 00:14:18,240 um 371 00:14:16,560 --> 00:14:19,760 the other thing that 372 00:14:18,240 --> 00:14:21,920 uh 373 00:14:19,760 --> 00:14:23,279 you want to take note of is uh classical 374 00:14:21,920 --> 00:14:24,399 music 375 00:14:23,279 --> 00:14:26,800 uh 376 00:14:24,399 --> 00:14:28,399 in musicbrainz it's sort of 377 00:14:26,800 --> 00:14:31,040 more important 378 00:14:28,399 --> 00:14:32,639 to have works than it is to have 379 00:14:31,040 --> 00:14:35,120 artists on tracks 380 00:14:32,639 --> 00:14:37,279 because with classical music 381 00:14:35,120 --> 00:14:38,240 they tend to deal with 382 00:14:37,279 --> 00:14:41,360 um 383 00:14:38,240 --> 00:14:45,440 barriers which have movements and sub 384 00:14:41,360 --> 00:14:47,199 sub subsections so having that sort of 385 00:14:45,440 --> 00:14:49,440 structure 386 00:14:47,199 --> 00:14:50,959 as a series of works that link to other 387 00:14:49,440 --> 00:14:51,920 sub works 388 00:14:50,959 --> 00:14:54,000 allows 389 00:14:51,920 --> 00:14:55,519 people that are interested in classical 390 00:14:54,000 --> 00:14:59,839 music to say 391 00:14:55,519 --> 00:14:59,839 i want to listen to this bit 392 00:15:00,240 --> 00:15:04,320 um 393 00:15:02,480 --> 00:15:05,760 so 394 
00:15:04,320 --> 00:15:07,839 if you've got time 395 00:15:05,760 --> 00:15:10,320 feel free to look at 396 00:15:07,839 --> 00:15:13,279 there's also labels 397 00:15:10,320 --> 00:15:16,240 which is sort of who owns the 398 00:15:13,279 --> 00:15:17,680 copyright and who is the publisher 399 00:15:16,240 --> 00:15:19,360 and uh 400 00:15:17,680 --> 00:15:20,800 there's a 401 00:15:19,360 --> 00:15:22,480 list of 402 00:15:20,800 --> 00:15:23,920 list of uh 403 00:15:22,480 --> 00:15:25,279 series is sort of something that they've 404 00:15:23,920 --> 00:15:27,760 added 405 00:15:25,279 --> 00:15:30,560 sort of five years ago for things like 406 00:15:27,760 --> 00:15:33,040 compilation albums which 407 00:15:30,560 --> 00:15:34,639 it's going to be the ministry of sound 408 00:15:33,040 --> 00:15:36,639 year number 409 00:15:34,639 --> 00:15:38,560 that sort of thing 410 00:15:36,639 --> 00:15:40,399 so everything is a 411 00:15:38,560 --> 00:15:42,399 sort of fixed structure 412 00:15:40,399 --> 00:15:42,870 um creating 413 00:15:42,399 --> 00:15:45,040 um 414 00:15:42,870 --> 00:15:46,560 [Music] 415 00:15:45,040 --> 00:15:48,320 creating new relationships that's sort 416 00:15:46,560 --> 00:15:49,680 of hard coded 417 00:15:48,320 --> 00:15:51,600 so 418 00:15:49,680 --> 00:15:54,399 there's a lot of 419 00:15:51,600 --> 00:15:57,199 asking in forums discussing tickets on 420 00:15:54,399 --> 00:15:57,199 what gets added 421 00:15:58,720 --> 00:16:03,360 and some things like musical instruments 422 00:16:01,440 --> 00:16:05,839 there's a person that 423 00:16:03,360 --> 00:16:08,480 has a list of possible musical instruments 424 00:16:05,839 --> 00:16:11,759 and you've got to ask them to add it to 425 00:16:08,480 --> 00:16:12,800 the system in the background back end 426 00:16:11,759 --> 00:16:15,040 um 427 00:16:12,800 --> 00:16:17,120 there's a query api which 428 00:16:15,040 --> 00:16:18,800 you send it a string and it'll return 429 
00:16:17,120 --> 00:16:20,000 you with the album or the barcode or 430 00:16:18,800 --> 00:16:23,040 something like that 431 00:16:20,000 --> 00:16:26,240 and then you probably want to 432 00:16:23,040 --> 00:16:28,079 do a lookup thing to say give me 433 00:16:26,240 --> 00:16:29,120 info about this id 434 00:16:28,079 --> 00:16:31,120 this 435 00:16:29,120 --> 00:16:33,360 album 436 00:16:31,120 --> 00:16:34,480 release group etc 437 00:16:33,360 --> 00:16:36,880 so the other 438 00:16:34,480 --> 00:16:39,360 sort of related thing 439 00:16:36,880 --> 00:16:41,600 if i've got time is um 440 00:16:39,360 --> 00:16:44,320 let's talk about wikidata 441 00:16:41,600 --> 00:16:46,639 so that's sort of built in 442 00:16:44,320 --> 00:16:49,600 a different sort of way it's 443 00:16:46,639 --> 00:16:52,079 everything is item property value 444 00:16:49,600 --> 00:16:54,480 so everything's built from that sort of 445 00:16:52,079 --> 00:16:55,360 basic structure 446 00:16:54,480 --> 00:16:56,720 so 447 00:16:55,360 --> 00:17:00,560 if you go to 448 00:16:56,720 --> 00:17:00,560 daft punk's wikidata entry 449 00:17:00,800 --> 00:17:07,120 it's an instance of an electronic 450 00:17:03,600 --> 00:17:09,120 uh electronic duo which is a 451 00:17:07,120 --> 00:17:12,559 instance of a musical group 452 00:17:09,120 --> 00:17:12,559 they've got start and end years 453 00:17:13,600 --> 00:17:18,439 it's good for things like um 454 00:17:18,640 --> 00:17:23,439 awards received 455 00:17:20,319 --> 00:17:25,120 you can easily add more awards so if they 456 00:17:23,439 --> 00:17:28,160 win a grammy 457 00:17:25,120 --> 00:17:28,160 you can record that properly 458 00:17:28,400 --> 00:17:32,799 and it's another 459 00:17:30,960 --> 00:17:35,200 good source of 460 00:17:32,799 --> 00:17:36,400 external identifiers 461 00:17:35,200 --> 00:17:38,400 so you can 462 00:17:36,400 --> 00:17:40,400 from musicbrainz you can link to 463 
00:17:42,640 wikidata and from wikidata you can link 464 00:17:40,400 --> 00:17:46,240 back to musicbrainz so 465 00:17:42,640 --> 00:17:49,120 looking at one you can look at the other 466 00:17:46,240 --> 00:17:50,480 um properties are added so relationships 467 00:17:49,120 --> 00:17:54,799 are added 468 00:17:50,480 --> 00:17:54,799 through a voting process process 469 00:17:55,200 --> 00:17:58,640 you sort of 470 00:17:56,960 --> 00:18:01,280 go through that process and say i want 471 00:17:58,640 --> 00:18:04,320 to have 472 00:18:01,280 --> 00:18:05,760 for pro something so going from a talk 473 00:18:04,320 --> 00:18:08,400 earlier 474 00:18:05,760 --> 00:18:09,520 if i could 475 00:18:08,400 --> 00:18:10,960 add my 476 00:18:09,520 --> 00:18:12,720 web id 477 00:18:10,960 --> 00:18:14,640 from talk earlier 478 00:18:12,720 --> 00:18:18,160 that was a 479 00:18:14,640 --> 00:18:18,160 something that was proposed 480 00:18:18,240 --> 00:18:22,960 and 481 00:18:19,360 --> 00:18:24,880 give it a what format of your urls uh 482 00:18:22,960 --> 00:18:27,679 and 483 00:18:24,880 --> 00:18:29,039 pattern matching and that's how wikidata 484 00:18:27,679 --> 00:18:31,440 extends there 485 00:18:29,039 --> 00:18:32,640 possible things that you can 486 00:18:31,440 --> 00:18:33,679 uh 487 00:18:32,640 --> 00:18:35,440 use 488 00:18:33,679 --> 00:18:36,880 so querying 489 00:18:35,440 --> 00:18:42,120 um 490 00:18:36,880 --> 00:18:42,120 is done through a thing called a graphql 491 00:18:42,320 --> 00:18:48,640 it's done through a thing called a spa 492 00:18:45,840 --> 00:18:50,640 through a sparkle i mean 493 00:18:48,640 --> 00:18:52,320 it sort of looks like this 494 00:18:50,640 --> 00:18:53,679 you 495 00:18:52,320 --> 00:18:54,840 you have a 496 00:18:53,679 --> 00:18:58,960 list of select 497 00:18:54,840 --> 00:19:01,120 statements so you define what you want 498 00:18:58,960 --> 00:19:02,960 you'll have 499 00:19:01,120 --> 00:19:05,120 a list of where sort 
of 500 00:19:02,960 --> 00:19:08,720 clauses to say 501 00:19:05,120 --> 00:19:10,240 this property on an entry 502 00:19:08,720 --> 00:19:12,799 look for things with this property on 503 00:19:10,240 --> 00:19:14,320 the entry 504 00:19:12,799 --> 00:19:17,200 where this 505 00:19:14,320 --> 00:19:18,080 exists or does not exist such and such 506 00:19:17,200 --> 00:19:20,799 limit 507 00:19:18,080 --> 00:19:21,840 so this is sort of the 508 00:19:20,799 --> 00:19:23,280 that's 509 00:19:21,840 --> 00:19:25,679 thing um 510 00:19:23,280 --> 00:19:28,160 so 511 00:19:25,679 --> 00:19:31,039 changing gears to 512 00:19:28,160 --> 00:19:34,000 if you wanted to look at 513 00:19:31,039 --> 00:19:35,520 uh things with the acmi id 514 00:19:34,000 --> 00:19:37,280 this is the sort of query that you'd 515 00:19:35,520 --> 00:19:39,120 have to say 516 00:19:37,280 --> 00:19:42,720 give me the 517 00:19:39,120 --> 00:19:42,720 acmi id which is that there 518 00:19:43,039 --> 00:19:47,360 and 519 00:19:44,960 --> 00:19:48,720 optionally return genre if they've got 520 00:19:47,360 --> 00:19:49,840 that on the 521 00:19:48,720 --> 00:19:53,679 film 522 00:19:49,840 --> 00:19:55,919 optionally include the country the director 523 00:19:53,679 --> 00:19:57,120 and based on so 524 00:19:55,919 --> 00:19:59,200 there's all sorts of things that you can 525 00:19:57,120 --> 00:20:00,640 do with 526 00:19:59,200 --> 00:20:02,559 with that so 527 00:20:00,640 --> 00:20:04,159 i think my time is 528 00:20:02,559 --> 00:20:06,880 nearly up 529 00:20:04,159 --> 00:20:09,120 so 530 00:20:06,880 --> 00:20:10,720 yes thank you so much and we've actually 531 00:20:09,120 --> 00:20:12,080 got quite a few questions for you which 532 00:20:10,720 --> 00:20:14,480 is fantastic 533 00:20:12,080 --> 00:20:17,120 um first question is are there any 534 00:20:14,480 --> 00:20:21,240 metadata standards for this stuff or is 535 00:20:17,120 --> 00:20:21,240 musicbrainz just the de facto 536 00:20:22,559 
--> 00:20:26,880 it's we've got an api and you use the 537 00:20:24,640 --> 00:20:28,000 api it's yeah 538 00:20:26,880 --> 00:20:28,960 it's all 539 00:20:28,000 --> 00:20:31,159 um 540 00:20:28,960 --> 00:20:34,240 it's available in 541 00:20:31,159 --> 00:20:36,000 json xml rdf 542 00:20:34,240 --> 00:20:38,720 it's sort of 543 00:20:36,000 --> 00:20:41,280 it used to be 544 00:20:38,720 --> 00:20:43,840 they've de-emphasized rdf because 545 00:20:41,280 --> 00:20:44,960 no one was really using it it was an api 546 00:20:43,840 --> 00:20:47,280 that no one 547 00:20:44,960 --> 00:20:49,039 really used so everyone just uses json 548 00:20:47,280 --> 00:20:50,080 yeah pretty much 549 00:20:49,039 --> 00:20:52,880 okay 550 00:20:50,080 --> 00:20:56,720 um how does musicbrainz deal with 551 00:20:52,880 --> 00:20:56,720 erroneous or conflicting data 552 00:20:56,880 --> 00:21:03,120 um 553 00:20:58,720 --> 00:21:05,840 it's usually someone will edit it out so 554 00:21:03,120 --> 00:21:06,880 i think the data quality is it's pretty 555 00:21:05,840 --> 00:21:11,520 much 556 00:21:06,880 --> 00:21:13,679 90 95 percent accurate 557 00:21:11,520 --> 00:21:16,000 looking at a random cd you'll 558 00:21:13,679 --> 00:21:16,880 find the occasional typo here or there 559 00:21:16,000 --> 00:21:18,000 but 560 00:21:16,880 --> 00:21:20,080 um sort of 561 00:21:18,000 --> 00:21:22,720 because the way that the editing system 562 00:21:20,080 --> 00:21:26,159 is it enforces constraints so that 563 00:21:22,720 --> 00:21:28,080 people are less likely to make idiot 564 00:21:26,159 --> 00:21:30,159 moves it sort of 565 00:21:28,080 --> 00:21:31,760 feeds you down the path to the most 566 00:21:30,159 --> 00:21:34,480 correct 567 00:21:31,760 --> 00:21:39,200 data and someone can then 568 00:21:34,480 --> 00:21:39,200 um come back later and fix your mistakes 569 00:21:40,000 --> 00:21:45,840 um in the streaming world how does 570 00:21:42,480 --> 00:21:45,840 musicbrainz work 571 
00:21:46,080 --> 00:21:50,559 it 572 00:21:48,240 --> 00:21:54,000 doesn't really make a difference it's an 573 00:21:50,559 --> 00:21:54,799 album it's got a list of tracks 574 00:21:54,000 --> 00:21:55,760 it 575 00:21:54,799 --> 00:21:57,280 yeah 576 00:21:55,760 --> 00:21:59,520 it 577 00:21:57,280 --> 00:22:01,280 there's a digital medium so 578 00:21:59,520 --> 00:22:04,000 you might get 579 00:22:01,280 --> 00:22:05,039 that it's digital and that's it 580 00:22:04,000 --> 00:22:08,000 okay 581 00:22:05,039 --> 00:22:09,760 um do record labels or artists send data 582 00:22:08,000 --> 00:22:11,840 to musicbrainz in a way that can be 583 00:22:09,760 --> 00:22:14,640 ingested easily or are they just 584 00:22:11,840 --> 00:22:17,280 generally unhelpful 585 00:22:14,640 --> 00:22:19,679 uh they're generally unhelpful there's 586 00:22:17,280 --> 00:22:22,240 there's been a few proposals of back-end 587 00:22:19,679 --> 00:22:22,240 systems but 588 00:22:22,799 --> 00:22:27,600 they've sort of been 589 00:22:24,960 --> 00:22:28,559 against automation a fair bit it's sort 590 00:22:27,600 --> 00:22:30,799 of 591 00:22:28,559 --> 00:22:34,960 we want quality instead of 592 00:22:30,799 --> 00:22:34,960 random junk that the label sent you so 593 00:22:35,440 --> 00:22:41,120 yeah um and one last question is it only 594 00:22:38,799 --> 00:22:43,360 via api or are there forms that people 595 00:22:41,120 --> 00:22:45,280 can fill in to 596 00:22:43,360 --> 00:22:48,880 add data 597 00:22:45,280 --> 00:22:50,799 so you just go to the website 598 00:22:48,880 --> 00:22:53,280 if you want to 599 00:22:50,799 --> 00:22:53,280 find a 600 00:22:53,600 --> 00:22:58,960 find daft punk 601 00:22:56,640 --> 00:23:02,240 punk 602 00:22:58,960 --> 00:23:02,240 so say if i wanted to 603 00:23:03,760 --> 00:23:07,440 edit this and 604 00:23:05,360 --> 00:23:09,600 mess with the title or something 605 00:23:07,440 --> 00:23:10,640 forget it 606 00:23:09,600 --> 00:23:13,120 and 
607 00:23:10,640 --> 00:23:14,559 that's the title ah cool 608 00:23:13,120 --> 00:23:17,280 change the 609 00:23:14,559 --> 00:23:17,280 article 610 00:23:17,360 --> 00:23:21,280 something else 611 00:23:19,440 --> 00:23:23,120 click next next 612 00:23:21,280 --> 00:23:25,760 and 613 00:23:23,120 --> 00:23:28,240 click accept but you need to put in that 614 00:23:25,760 --> 00:23:29,360 note to say what you're doing so 615 00:23:28,240 --> 00:23:31,120 yeah 616 00:23:29,360 --> 00:23:33,360 sort of now that seems quite simple you 617 00:23:31,120 --> 00:23:34,799 follow the bouncing ball and hopefully 618 00:23:33,360 --> 00:23:36,559 get there 619 00:23:34,799 --> 00:23:39,120 we should um we should all contribute to 620 00:23:36,559 --> 00:23:41,360 this um it's relatively easy to do so 621 00:23:39,120 --> 00:23:43,840 um that's fantastic well thank you so 622 00:23:41,360 --> 00:23:46,960 much daniel that's um 623 00:23:43,840 --> 00:23:48,559 it was great to see those two 624 00:23:46,960 --> 00:23:50,400 sites for us to contribute 625 00:23:48,559 --> 00:23:52,080 to because i always know that i'm happy 626 00:23:50,400 --> 00:23:55,440 to put in 627 00:23:52,080 --> 00:23:58,080 more data where possible to help out so 628 00:23:55,440 --> 00:24:00,400 thank you so much 629 00:23:58,080 --> 00:24:02,880 go to musicbrainz.org 630 00:24:00,400 --> 00:24:05,279 download a program called picard 631 00:24:02,880 --> 00:24:07,679 that's the tagger that they work on and 632 00:24:05,279 --> 00:24:10,320 there's a few other taggers so 633 00:24:07,679 --> 00:24:12,880 some of the cd rippers will 634 00:24:10,320 --> 00:24:16,320 you put the cd in it'll 635 00:24:12,880 --> 00:24:16,320 automatically retrieve the data 636 00:24:16,480 --> 00:24:18,880 wonderful 637 00:24:17,520 --> 00:24:22,520 thank you 638 00:24:18,880 --> 00:24:22,520 okay thank you
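
[editor's note] The talk describes three things that are easier to follow with something concrete: identifying yourself in the http headers when you use the musicbrainz api, searching with a string and then looking an entity up by its id, and querying wikidata with a select / where / optional / limit query. The sketch below is an illustration, not the speaker's code. It only builds the requests and does not send them. The identification string, the helper names (`mb_search`, `mb_lookup`, `wd_query`) and the all-zero id are made up for the example; the endpoint roots and the properties P434 (MusicBrainz artist ID) and P136 (genre) are the publicly documented ones as far as I know.

```python
from urllib.parse import urlencode
from urllib.request import Request

MB_ROOT = "https://musicbrainz.org/ws/2"
WDQS = "https://query.wikidata.org/sparql"

# Hypothetical identification string: replace with your own
# application name, version and contact details.
USER_AGENT = "example-tagger/0.1 (someone@example.org)"


def mb_search(entity, query, limit=5):
    """Build a search request: send a string, get candidate entities back."""
    params = urlencode({"query": query, "fmt": "json", "limit": limit})
    return Request(f"{MB_ROOT}/{entity}?{params}",
                   headers={"User-Agent": USER_AGENT})


def mb_lookup(entity, mbid, inc=()):
    """Build a lookup request for one entity by its id (a uuid)."""
    params = {"fmt": "json"}
    if inc:
        # Joined with spaces, which urlencode turns into '+' separators.
        params["inc"] = " ".join(inc)
    return Request(f"{MB_ROOT}/{entity}/{mbid}?{urlencode(params)}",
                   headers={"User-Agent": USER_AGENT})


# Items that carry a MusicBrainz artist id, optionally with a genre.
SPARQL = """
SELECT ?item ?mbid ?genre WHERE {
  ?item wdt:P434 ?mbid .
  OPTIONAL { ?item wdt:P136 ?genre . }
}
LIMIT 10
"""


def wd_query(sparql):
    """Build a request against the wikidata query service."""
    params = urlencode({"query": sparql, "format": "json"})
    return Request(f"{WDQS}?{params}", headers={"User-Agent": USER_AGENT})


if __name__ == "__main__":
    placeholder_id = "00000000-0000-0000-0000-000000000000"  # not a real entity
    for req in (mb_search("artist", "daft punk"),
                mb_lookup("release-group", placeholder_id, inc=("releases",)),
                wd_query(SPARQL)):
        print(req.full_url)
```

To actually fetch data you would pass one of these requests to `urllib.request.urlopen` and parse the json body; if you do, keep the request rate low, since the talk mentions clients being blocked for hitting the api too often.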