Hello, welcome back. Next up on the data science and analytics track here at PyCon Australia 2021, coming from London, is Eyal Kazin, an ex-cosmologist turned data scientist with over ten years' experience in machine learning, statistical inference, and data insight visualizations.

Eyal's claim to fame is having lived on four different continents within the span of a decade, including three tennis Grand Slam cities: New York, where he obtained his PhD in astrophysics; Melbourne, where he did a postdoc stint at Swinburne; and London, where he now resides.

Eyal's topic today is one that has a direct relationship to real life: how do we optimize for multiple objectives, especially when they are in conflict? For example, how can one best overcome the classic trade-off between quality and cost? In this talk, Eyal will introduce us to multi-objective optimization using Pareto fronts for the data-driven decision makers in the audience, which I honestly hope is all of
you. We'll hear about the advantages and shortcomings of the technique and be able to assess its applicability for our own projects. Welcome back to Melbourne! Everybody: Eyal Kazin, and "Improved Decision Making with Pareto Fronts."

Hi, good morning. Hi, good morning. I just want to mention that there is a technical difficulty and I cannot hear anything, so Javier, if you have anything to say, please put it in the chat for the moment and we'll just have to communicate that way; I hope that's fine. I'll give you one second to give me that feedback, otherwise I'll just start. Okay, you can hear me, that's great. Okay, great. So hi everyone, hello from London. My name is Eyal, I'm a senior data scientist at Babylon, and I'm very happy to be here at this conference to talk about optimizing with Pareto fronts. Javier, is there anything else I should mention, or should I just start?

Okay, all right, great, so I'll start. We're now living in an era of data-driven
decision-making, and an interesting challenge that always arises is: how do we make decisions when dealing with multiple parameters? Normally there are trade-off decisions to be made, which makes this a very challenging procedure. What sort of trade-offs do I mean? Well, you can think of the classic trade-off between price and quality, which we experience not only in our work lives but also in our everyday lives. Sometimes the relationship between quality and price is not that clear, and so we have to make a trade-off decision. This also manifests in just about any industry you can think of; you can probably imagine different considerations you have to make in your own industry. For example, in the case of home food delivery services: on the one hand they want to optimize the experience for the people ordering food at home, but they also have to optimize for the kitchens as well as the delivery personnel. You cannot focus on one and neglect the others. I
got exposed to this in the field of drug discovery. I worked in a biotech lab with protein engineers, and what I learned there is that, in order for a protein to be considered a viable drug for our health, it has to rigorously pass many tests. Here I'm just listing a few characterizations that a protein has to satisfy, and it takes only one failure for a drug to fail overall. Before the days of COVID, these sorts of processes could take up to a decade, and they are very expensive, costing millions, even billions, of dollars. So if something like toxicity doesn't pass, you want to learn about it as soon as possible, and hence the need to optimize for multiple parameters simultaneously. Normally, when people think about optimizing for two or more parameters or objectives, what they do is not multi-objective optimization: they combine everything into one heuristic, which in the literature is actually called single-objective optimization. So one thing you'll take away from this talk is the
limitations of this common practice, how they are resolved by a technique called Pareto fronts, and why it yields better solutions. Then I'll talk about applicability, and for those who are interested I'll provide free online material where you can gain hands-on experience to really master this topic. This is for you if you use data to make decisions and you're interested in improving your optimization skills. There's no real maths background needed, and your Python knowledge can be fairly basic.

So, just a bit about myself. Thank you for the introduction, Javier. I currently work as a data scientist in health tech. As Javier said, I spent the end of my extensive academic career at Swinburne University in Melbourne, Australia, so I'm very glad to be here at a conference in Australia. Most relevant for this talk are the two years I spent at LabGenius, a biotech company, where I learned the trade of multi-objective optimization in the context of drug discovery. So I'll show a use case from
there. This is the agenda I prepared for today: we've already covered the motivation; next we'll talk about the basics of the concept of Pareto fronts and why they're useful; then I'll give a real-world example with an emphasis on applicability, on whether this is relevant for the projects you're working on. For this last part, if there isn't time, you'll have online material in which you'll be able to use something called genetic algorithms, for those interested in applying this in practice.

So with that, I can start the talk. Optimization means different things to different people. For practitioners, I like this definition by Professor Deb: a procedure of comparing feasible solutions until no better one can be found, either practically or just in terms of resources. You know a topic is interesting if the creator of xkcd has a cartoon about it, so I like this one here, which is
useful to understand the concepts of multi-objective optimization. What you see here is that the creator, Randall, has subjectively plotted how tasty he finds various fruits, along with how easy they are to eat. So imagine you're an office mate of his and you want to treat him to the fruit he finds tastiest; of course, you're going to hand him a peach. But let's say he's leaving the office and you want to give him the fruit that's easiest to eat on the go; then you'll give him seedless grapes. But then the question arises, and this I actually want you to answer in the chat: which is the optimal fruit if you want to optimize both for taste and for ease? Just take a few seconds to think about it and pop into the chat which one fruit you think would be optimal, which one you should give him here, both for ease and taste. I'm not seeing anybody write in the chat yet, so I'll give you a few more seconds; I was told there might be a
delay in some countries. So remember, you want to optimize for both easy and tasty. Okay. Chris, you say bananas; I'm not sure about that. I see one for peaches, okay. While you're thinking about it, I'll just give you the answer: it was actually a trick question. There's no one fruit that's optimal; rather, we have to consider a set of optimal solutions, called an optimal front or Pareto front. The purpose of this talk is to have you understand why these are all considered equally optimal and how we can make use of that when doing multi-objective optimization.

So what does optimizing look like? Imagine we have a parameter space with two objectives; very creatively, I call them objective one and objective two. Imagine, as an analogy, that this is a dark room: you know nothing about these objectives. They could be quality and price, or they could be
characteristics of a protein, or, in the home delivery service example, the experience of the people who ordered food versus that of the people preparing the food in the kitchen: any two parameters you can imagine; you don't know what to expect. So what does optimization look like? What's common practice, what I refer to as single-objective optimization, is that a lot of analysts just do some sort of combination, a multiplication for example. What they're doing is tunneling: with a flashlight, they go through this one tunnel of the space, optimizing in that single direction, and then they conclude that this solution is the most optimal. Of course there are different recipes: you can add the objectives, or take the mean, when you're maximizing both; or, if you're maximizing one and minimizing the other, you can take the difference, or divide. Okay. When I was working in the
biology lab, I learned that some processes use what I call linear, one-dimensional optimization: they choose one objective, maximize in that direction, and then turn to the second one, and then they say, oh, this is optimal. But that was a subjective choice. Why did they take objective one first? They could just as easily have taken objective two, maximized in that direction and then the other, and said, well, this is optimal. Right: three different methods, three different choices. So imagine again that this is using a flashlight in a dark room. What do you really want to do? You want to turn on the light in the room and absorb the full solution space. So here I'm highlighting the three solutions we chose as "optimal" in this highly curated distribution; but if we had the lights turned on, then most likely, just looking at it subjectively, we'd say this one is
more likely to be optimal than the rest of them. That was very subjective, but there is a whole set of others which, with some mathematical rigor, we can say are all considered equally optimal: they're called Pareto front solutions. And you can see that, of the three in red, this one is not in the optimal set. Why is that? Well, that's the purpose of the next slides we'll get to.

But before that, we have to talk about subjectivity. I've mentioned the word subjectivity a few times; why is it important? Well, I like to think of a quote which says, in other words, that if you truly believe you understand something without exploring it, you're just fooling yourself. That's my takeaway from the statement. What does that have to do with what we're talking about? The fact that in any analysis, as objective as we want to be, we're going to make a subjective decision at some stage. The question is when we're going to make it. So
when I talk about single-objective optimization, normally we're creating a heuristic before we actually do the search. The purpose of this talk is to convey that with the Pareto front method you actually want to hold off on that decision, stay open to whatever comes, and make the decision only after you start exploring the data. That's what we'll see in the next few slides; that's the importance of subjectivity. And when you think about it, what do I mean by analysts making subjective decisions in general? Well, any time you have a distribution and you want to quote it as a single number, do you choose the mean, the mode, or the median? If you have a perfectly symmetric bell curve, they're all the same and it doesn't matter, but most real-life data is skewed, so the choice matters; maybe the data is bimodal, right? So you have to learn the data and then make a subjective decision, and good analysts make sure it's a
sound decision that can be justified. The same goes when you decide the bins of a histogram, and when you decide whether to present data on a linear scale or a log scale, right? These are subjective decisions that we make, and it all depends on context. The same applies here: when you decide what's optimal, first look at your parameter space and then make that decision.

So with that in mind, we're ready to define what a Pareto front is. Beforehand, in order to define it, when we look at a scatter plot we have to differentiate between two types of solutions. These are two dimensions of objective space, and we have to classify every solution as either what we call dominated or non-dominated, where non-dominated is also called Pareto optimal. What do we mean by that? Let's learn by example. Take this solution K over here: it's classified as dominated because there is at least one other
solution that has better performance in both objective one and objective two, assuming we want to maximize both; in this case, for example, we have N and E. The reason N is also called dominated is that E dominates N in both objective one and objective two. The reason E is considered non-dominated is that nothing over here dominates it in both objective one and objective two. You can say, well, F dominates it in objective one, in this direction, but it doesn't dominate it in objective two, in the vertical direction. The same thing, but the opposite, holds for D: D performs better in objective two but not in objective one. That's why E is considered non-dominated, and hence what we call Pareto optimal, and the same goes for all of these solutions, A through H. So, again, the definition of a Pareto front is a set of non-dominated solutions.

Then you can ask the question: well, which one is optimal? That's where the
subjectivity comes in. We haven't made that subjective decision yet, and they're all still considered equally optimal. Once we have this view of the space, then you, or the domain expert, create a subjective ranking: you can either focus on these over here, or maybe you're more interested in the ones over here; it really depends on context, as we'll see. Another thing worth highlighting, and a nice property of this approach, is that we do not care about the units or the scale. Values here could be anywhere from milligrams to kilograms; it doesn't matter, because mathematically you'll get exactly the same split between dominated and non-dominated.

So that's what Pareto fronts are. Before we look at applicability, we'll just summarize. A Pareto front is a set of trade-off, non-dominated solutions, and these should all be considered equally
optimal. Throughout this talk I'll show two-dimensional spaces, but this is also relevant for any n-dimensional space; it extrapolates to many dimensions. And remember, in the context of subjectivity, we don't impose any prior constraints before looking at the distributions. If there are real limitations, of course you put them in, but if there's no good reason for constraints, you don't use them. The reason Pareto optimization, as you'll see, is better than single-objective optimization is that you're really getting a broad, bird's-eye view of the solution space, as opposed to putting on horse blinders.

Okay, so with that, in order to practice this concept I created an app, and here's the link, so feel free to follow the link while I quickly demonstrate it. Cool. I call it Pareto because, like a guessing game, the objective is to look at a distribution and try to guess
which solutions are actually Pareto optimal. So, Javier, I'll ask you to give me a number in the chat; this will be the seed of a distribution that I'm not aware of and cannot see. 705? Okay, so I'll put in 705; thank you, very thoughtful of you. Okay, so this is a new distribution I have not seen before. And I want to maximize; let me make sure I'm doing max-max, because that's easiest to understand. You can do any combination, but we'll do max-max in this case. What I'm asked is: how many of the 50 solutions are Pareto optimal? My reading, and this is the intuition you'll gain with time, which is the purpose of this sort of app, is that it's one, two, three. Hopefully I'm right; I don't always get it right, but I think it's three, so I answer that here, and in this case I'm correct. Okay, Javier, give me another number quickly if you have one. I just want to see if I can... Five. Okay, so that's a five. Okay, five: a new
OK, that one's a bit easy — it's two. But I don't always get it right; that's the point I wanted to make. So feel free to play around with the app, just to get a feel for the topic — you'll see the link again, and maybe I'll put it in the chat as well.

OK, we're nearly at the point where I can talk about the real world — let me share my screen. But beforehand I have to talk about two different spaces. We've talked about objective space; now there's something called decision space. So far we've been focusing on Pareto fronts in objective space and how to optimize over them, but there's this other set of parameters, called the decision space. What do I mean by that?
The objective space is what we're interested in — price versus quality, say; that's what we care about at the end of the pipeline, but we don't have direct control over it. Then there are the parameters — think of them as knobs. Those are the things we actually turn; they're what we can make decisions about. I'll give a few toy examples and a few real-life examples. A classic toy example is the knapsack problem: you have a lot of boxes, each with a value and a weight, and you have two objectives. Packing them into a knapsack, you want to maximize the total value of the packages while minimizing the total weight. My axes here aren't perfect, but just imagine making these combinations — that's your decision space — and then you have your objective space.
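As a minimal sketch of that setup (the box values and weights here are made-up numbers), every subset of boxes is one point in decision space, and summing values and weights maps it into objective space:

```python
from itertools import combinations

# Hypothetical boxes as (value, weight) pairs -- made-up numbers.
boxes = [(10, 4), (7, 2), (5, 5), (12, 9), (3, 1)]

def to_objective_space(subset):
    """Map a decision (which boxes to pack) to objective space:
    (total value to maximize, total weight to minimize)."""
    return (sum(v for v, _ in subset), sum(w for _, w in subset))

# Decision space: all 2**5 = 32 possible subsets of the boxes.
decision_space = [s for r in range(len(boxes) + 1)
                  for s in combinations(boxes, r)]
objective_space = [to_objective_space(s) for s in decision_space]
print(len(decision_space), "decisions mapped into objective space")
```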
Again, it's value versus weight. What's also important to emphasize is that you need a mapping between the two spaces. Here the mapping is trivial: you just weigh the boxes, or add up their monetary value — that's the mapping from decision space to objective space. It's a very useful toy example, and I'll show you later how you can learn it in more detail, but I want to mention that I use it as a perfect analogy for my work in therapeutic discovery. When I was working with protein engineers, they had full control over the DNA sequences — things they can easily order from a lab and construct in their own setting — and I was helping them decide which DNA sequence best optimizes what we're interested in: protein function. That might be how the protein binds to another protein, its potency (meaning how effective it is), or something deleterious, like how toxic it is. So that's what they
decide, and that's what we're interested in. But the real expense is actually in the mapping — having a sequence on your computer screen is one thing, but to actually measure these properties you need a full-fledged laboratory, and that's very expensive. That's what LabGenius, the biotech company, has: the biological setting. I was working in the machine-learning group, and we were building this mapping from sequence to protein function using machine learning, so we could optimize for drug discovery; we were working in the context of Crohn's disease. Here I'll show you a case study. It isn't the Crohn's disease case study — I'm using public data rather than their proprietary data. The data I'm using are small proteins called antibodies, about 200 residues in length, with about 20 locations at which
we're introducing mutations. So that's 20 to the power of 20 — a very large space, intractable: about 10 to the power of 26, a one followed by 26 zeros. The sequences are mostly what's called wild type, and that's what this frequency distribution shows. Here are the 20 locations side by side, and you can see that 90 percent of the time the first one is T, and 10 percent of the time it's these other letters you can't see because they're squashed — that's what I mean by 90 percent wild type. The goal was to discover the best candidate for binding and potency, and again, we used machine learning to create the mapping between protein sequence and binding. I don't expect you to fully understand the setup, and that's fine, but here I want to show you a nice way I visualized exploring the decision space. What you're seeing is that before I
showed frequencies, but here I'm showing the frequency change. The way to read this: remember I mentioned T in the first position — here the model is actually rejecting it, preferring other letters over T, as opposed to these locations here, where it actually likes the wild type (I don't remember which letter it is — probably Y — but it likes the wild type, and definitely here it likes the wild type). That's the decision space. But remember, I talked about the objective space — that's what we're actually interested in: say, binding and potency. So I'll rerun the video from the start. You can see the solution space started over here, and I'm doing explore–exploit, looking at the space; the objective is to get over here, and I'm looking for solutions — these sorts of combinations — that reach this region. So this is one way, and you can see this Pareto front here. If you've started to
get the drift of how to identify Pareto fronts: they're all marked here with an X, so this is the Pareto front of this distribution. Great — so you have this, you ran many iterations. How do you decide which proteins to suggest the engineers actually order? Well, that's really subjective; it depends on the domain. The point is that whatever decisions we made, we made them only after the search — not before. That's exactly what I've been talking about. In our case, the protein engineer told me: we're not interested in this region, it's too low on this objective, and these are too low on that objective; definitely take these, and anything in this vicinity. So what we ended up doing was testing over here, for example. That's one way to go about it. Again, the purpose of that slide is to emphasize that you have to make a subjective decision at some stage, but you want to hold it off until the end. This
is another case study. If you're familiar with machine learning, you'll know these parameters called hyperparameters — again, things we can control. For example, there's a decision-tree technique with parameters like the learning rate and the number of trees. Those are what we control — hence our decision space — but they're not what we're interested in. What we're actually interested in is precision, recall, accuracy and related metrics, and the mapping is the whole machine-learning process of training, cross-validation, testing and so on, going from those parameters to what we actually care about. This is common practice for machine-learning practitioners; most of them do single-objective optimization, but I was happy to find an article online in which a team from the SAS Institute used multi-objective optimization — meaning Pareto fronts — to do hyperparameter tuning.
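Here's a toy sketch of that idea (not the SAS implementation): score a small hyperparameter grid with a synthetic stand-in for the real train/cross-validate/test mapping, then keep the non-dominated combinations under min–min:

```python
from itertools import product

def evaluate(learning_rate, n_trees):
    """Synthetic stand-in for the expensive mapping (training,
    cross-validation, testing). Returns two metrics to minimize:
    (misclassification rate, false-positive rate)."""
    misclassification = 0.15 / (1 + learning_rate * n_trees / 100)
    false_positive = 0.01 + 0.05 * learning_rate
    return misclassification, false_positive

# Decision space: the hyperparameter grid we actually control.
grid = list(product([0.01, 0.1, 0.3, 1.0], [50, 100, 200, 400]))
scores = {hp: evaluate(*hp) for hp in grid}

# Pareto front under min-min: keep combinations that no other
# combination beats on both metrics at once.
front = [hp for hp, s in scores.items()
         if not any(t[0] <= s[0] and t[1] <= s[1] and t != s
                    for t in scores.values())]
print("Pareto-optimal hyperparameters:", sorted(front))
```

With a real model you would swap `evaluate` for actual training and validation; the front then hands the final, subjective pick back to you, exactly as in the workflow described above.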
They have a whole range of case studies; I'm highlighting just one of them, for an organization called DonorsChoose, which provides materials for teachers. The thing is, they get many proposals, and only about 20 percent of them are actually worth their while, so they wanted a machine-learning algorithm that would simultaneously minimize misclassification and minimize the false-positive rate. The details don't matter so much; the point is they had two objectives to minimize, and that's what I'm showing here. Using the vanilla parameters, without any tuning, they had a misclassification rate of about 15 percent and a false-positive rate of about 3.5 percent. Then they tried single-objective optimization — minimizing a single metric — and you can see it massively reduced
misclassification, but it actually did worse on the false-positive rate. So then they did multi-objective optimization. Here you can see the distribution of all the solutions they looked at, with the Pareto front highlighted in green. Now we're at the usual juncture: we have all these solutions, how do we make a decision? Before deciding, they noticed this odd bump over here, so they applied a cut-off on misclassification and re-ran the calculation. Focusing on this box, they have another nice visual: the green dots are the same as before, and all the other dots are the same except for these triangles. The triangles come from what they called a constrained search — whereas before they were open to all parameter values, now they only accept solutions with a misclassification rate below 0.15, where the bump is. Hence they got these triangles, and then they're
again at the decision point: they have to pick one of these, and this is where subjectivity comes in — "well, I think this one is the best, let's go with that." So that's the idea, and the end result, compared with that initial data point, is that they reduced the false-positive rate by a relative 8 percent and improved misclassification by an absolute 5 percent. So if you're a machine-learning practitioner, these calculations take extra resources, but you might get an edge by doing them.

Another question for the audience. I talked about how to identify Pareto fronts and challenged myself with the app; now let's see what you see over here. Which fruits are Pareto optimal in the opposite sense? Suppose we want to be nasty colleagues and give someone the fruit that's both most difficult and most untasty. If people can answer in the chat what they think it is
before I give the answer. I'll give you about 10 seconds: which fruits are both difficult and untasty, according to Randall Munroe? Five more seconds... All right, these are the ones I picked out — these are all considered Pareto optimal. Yes, I appreciate the chat is a bit delayed. If you got it, great; if not, you still have the app to play around with.

So, are Pareto fronts relevant to the projects you're working on? That's what we're addressing here. You want to look into this technique if you're dealing with conflicting objectives. Next, from my conversations with people, I've realized that not everybody has full control over the decision space — you want full control over that. And when it comes to the objective space, you need an inexpensive mapping, because sometimes you
can make the decisions, but mapping them to the actual objective space might be very time-consuming or very expensive — that's another consideration. A further consideration is how big your objective or decision space is. For example, in the drug-discovery case, the DNA sequence space is astronomical, so you need a stochastic algorithm to navigate the space; one recommended technique is genetic algorithms. And that is actually the last topic — but how much more time do we have for this talk? OK, I've been asked to wrap up, that's fine. Don't worry about the rest of the material — you'll have access to it all online. At past conferences I've given a hands-on tutorial covering this material.
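As a rough pure-Python illustration of the genetic-algorithm idea just recommended (a toy sketch, not the speaker's implementation — libraries like DEAP do this properly), here's a tiny generational loop that evolves knapsack bit-strings toward the max-value/min-weight Pareto front:

```python
import random

random.seed(0)
# Toy knapsack items as made-up (value, weight) pairs.
ITEMS = [(random.randint(1, 20), random.randint(1, 20)) for _ in range(15)]

def objectives(bits):
    """Map a decision (one bit per item: pack it or not) to
    (total value, total weight): maximize value, minimize weight."""
    value = sum(v for b, (v, _) in zip(bits, ITEMS) if b)
    weight = sum(w for b, (_, w) in zip(bits, ITEMS) if b)
    return value, weight

def dominates(a, b):
    """True if objective pair a beats b: no worse on both, not equal."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b

def pareto_front(population):
    scored = [(bits, objectives(bits)) for bits in population]
    return [bits for bits, s in scored
            if not any(dominates(t, s) for _, t in scored)]

def evolve(generations=30, pop_size=40, mutation_rate=0.1):
    population = [[random.randint(0, 1) for _ in ITEMS]
                  for _ in range(pop_size)]
    for _ in range(generations):
        elite = pareto_front(population)      # keep non-dominated parents
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = random.sample(population, 2)
            cut = random.randrange(1, len(ITEMS))     # one-point crossover
            child = [1 - b if random.random() < mutation_rate else b
                     for b in p1[:cut] + p2[cut:]]    # bit-flip mutation
            children.append(child)
        population = elite + children
    return pareto_front(population)

front = evolve()
print(len(front), "non-dominated knapsack fillings after evolution")
```

Serious implementations (such as NSGA-II, available in DEAP) add non-dominated sorting and crowding-distance selection to keep the front well spread; this sketch only preserves the non-dominated elites each generation.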
You can get hands-on experience with the knapsack problem in Jupyter notebooks — you have all of that over here, and I'll just very quickly show you what it looks like. I see 14 of you are already looking at it, which is great. Quickly going through it: it's basically videos of me talking about exactly this topic, plus Jupyter notebooks in which you can practice it and learn about genetic algorithms and related techniques to use in your own projects. Sorry, I'm just going to share my screen so I can see.

OK, so just to summarize: multi-objective optimization is useful when you have conflicting objectives, you have full control over your decision space, and you have an inexpensive mapping to the objective space. If you have an intractable search space, you might consider
genetic algorithms. Again, you have all the material online, you have the tutorial, and you can play the Pareto whack-a-mole game. If anybody's interested in more material, I highly recommend anything by Zitzler — it's a great read — and there's also a Python evolutionary-computation module called DEAP, which I highly recommend as well for prototyping; I find it very useful. So, thank you very much.

Thank you, Ayal — this has been great. Unfortunately we don't have time for more questions, but all these follow-on materials are going to tide us over, so thank you again, Ayal. Next up on today's data science and analytics track at PyCon AU 2021 is the closing session with Lars Jenkin: Systems of the World — Cataloguing the World's Data for Great Good. Please stay on; he'll be on in another 10 minutes, at 4:45 Australian Eastern time.