{"id":52,"date":"2012-02-06T16:50:39","date_gmt":"2012-02-06T21:50:39","guid":{"rendered":"https:\/\/michaelnielsen.org\/ddi\/?p=52"},"modified":"2012-07-09T15:11:54","modified_gmt":"2012-07-09T19:11:54","slug":"addendum-on-search-and-support-vector-machines","status":"publish","type":"post","link":"https:\/\/michaelnielsen.org\/ddi\/addendum-on-search-and-support-vector-machines\/","title":{"rendered":"Addendum on search and support vector machines"},"content":{"rendered":"<p>In the <a href=\"https:\/\/michaelnielsen.org\/ddi\/how-to-combine-multiple-notions-of-relevance-in-search\/\">last   post<\/a> I described how to use support vector machines (SVMs) to combine multiple notions of search relevance.  After posting I realized I could greatly simplify my discussion of an important subject in the final section of that post.  What I can simplify is this: once you&#8217;ve found the SVM parameters, how should you use them to rank a long list of documents for a particular query, in practice?<\/p>\n<p>Here&#8217;s the setup. We suppose we&#8217;ve already taken a bunch of training data, as in the last post, and used that training data to find a SVM with soft margin.  That SVM can be used to rank any query and document.  In particular, the SVM is specified by a vector <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvec+w&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\vec w' title='\\vec w' class='latex' \/>, and we rank a document <img src='https:\/\/s0.wp.com\/latex.php?latex=d&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d' title='d' class='latex' \/> above a document <img src='https:\/\/s0.wp.com\/latex.php?latex=d%27&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d&#039;' title='d&#039;' class='latex' \/> for query <img src='https:\/\/s0.wp.com\/latex.php?latex=q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='q' title='q' class='latex' \/> if:<\/p>\n<img src='https:\/\/s0.wp.com\/latex.php?latex=+++%5B%2A%5D+%5C%2C%5C%2C%5C%2C%5C%2C+%5Cvec+w+%5Ccdot+%5Cvec+%5Cphi%28q%2Cd%2Cd%27%29+%5Cgeq+0%2C+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='   [*] \\,\\,\\,\\, \\vec w \\cdot \\vec \\phi(q,d,d&#039;) \\geq 0, ' title='   [*] \\,\\,\\,\\, \\vec w \\cdot \\vec \\phi(q,d,d&#039;) \\geq 0, ' class='latex' \/>\n<p>where <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvec+%5Cphi%28q%2Cd%2Cd%27%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\vec \\phi(q,d,d&#039;)' title='\\vec \\phi(q,d,d&#039;)' class='latex' \/> is the feature difference vector.  (In the last post a parameter <img src='https:\/\/s0.wp.com\/latex.php?latex=b&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='b' title='b' class='latex' \/> also appeared in this condition, but as I noted in one of the exercises in that post, for the particular problem of search <img src='https:\/\/s0.wp.com\/latex.php?latex=b%3D0&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='b=0' title='b=0' class='latex' \/>, which is why it doesn&#8217;t appear here.)<\/p>\n<p>In the final section of the last post I did lots of perambulations with the condition [*] to figure out how to construct a linear order ranking documents, given a particular query <img src='https:\/\/s0.wp.com\/latex.php?latex=q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='q' title='q' class='latex' \/>. We can simplify life quite a bit by recalling that <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvec+%5Cphi%28q%2Cd%2Cd%27%29+%3D+%5Cvec+%5Cpsi%28q%2Cd%29-%5Cvec+%5Cpsi%28q%2Cd%27%29%2C&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\vec \\phi(q,d,d&#039;) = \\vec \\psi(q,d)-\\vec \\psi(q,d&#039;),' title='\\vec \\phi(q,d,d&#039;) = \\vec \\psi(q,d)-\\vec \\psi(q,d&#039;),' class='latex' \/> where <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvec+%5Cpsi%28q%2Cd%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\vec \\psi(q,d)' title='\\vec \\psi(q,d)' class='latex' \/> is the feature vector for query <img src='https:\/\/s0.wp.com\/latex.php?latex=q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='q' title='q' class='latex' \/> and document <img src='https:\/\/s0.wp.com\/latex.php?latex=d&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d' title='d' class='latex' \/>.  The condition above can then be rewritten as:<\/p>\n<img src='https:\/\/s0.wp.com\/latex.php?latex=+++%5B%2A%2A%5D+%5C%2C%5C%2C%5C%2C%5C%2C+%5Cvec+w+%5Ccdot+%5Cvec+%5Cpsi%28q%2Cd%29+%5Cgeq+%5Cvec+w+%5Ccdot+%5Cvec+%5Cpsi%28q%2Cd%27%29+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='   [**] \\,\\,\\,\\, \\vec w \\cdot \\vec \\psi(q,d) \\geq \\vec w \\cdot \\vec \\psi(q,d&#039;) ' title='   [**] \\,\\,\\,\\, \\vec w \\cdot \\vec \\psi(q,d) \\geq \\vec w \\cdot \\vec \\psi(q,d&#039;) ' class='latex' \/>\n<p>This condition suggests a much easier way of ranking documents: simply treat <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvec+w+%5Ccdot+%5Cvec+%5Cpsi%28q%2Cd%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\vec w \\cdot \\vec \\psi(q,d)' title='\\vec w \\cdot \\vec \\psi(q,d)' class='latex' \/> as a score of how good document <img src='https:\/\/s0.wp.com\/latex.php?latex=d&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d' title='d' class='latex' \/> is when the query is <img src='https:\/\/s0.wp.com\/latex.php?latex=q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='q' title='q' class='latex' \/>, so that we rank <img src='https:\/\/s0.wp.com\/latex.php?latex=d&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d' title='d' class='latex' \/> above <img src='https:\/\/s0.wp.com\/latex.php?latex=d%27&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d&#039;' title='d&#039;' class='latex' \/> whenever the score for <img src='https:\/\/s0.wp.com\/latex.php?latex=d&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d' title='d' class='latex' \/> is better than <img src='https:\/\/s0.wp.com\/latex.php?latex=d%27&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d&#039;' title='d&#039;' class='latex' \/>.  We then just construct a linear ordering of the documents, based on the score.  Or, if we prefer, we can find just the top 10 (or 20, or whatever) documents by iterating linearly over the documents, and keeping a running tally of the best 10 (or 20) found to date.<\/p>\n<p>This is quite a bit simpler than the discussion at the end of my last post.  It only works, though, because our classifer (the SVM) is a linear function of the feature difference vectors.  It&#8217;s not difficult to construct other types of classifer for which it really wouldn&#8217;t be obvious how to construct this kind of score.  For those classifiers you could still fall back on the analysis I did in the last post, so it wasn&#8217;t all wasted effort.<\/p>\n<p>Incidentally, one thing I like about the equation [**] is that it makes it a lot clearer why <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvec+w&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\vec w' title='\\vec w' class='latex' \/> is called the weight vector.  It means that the score for page <img src='https:\/\/s0.wp.com\/latex.php?latex=d&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d' title='d' class='latex' \/> on query <img src='https:\/\/s0.wp.com\/latex.php?latex=q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='q' title='q' class='latex' \/> is just a linear combination of the different feature values, with weights given by the weight vector.  That makes the notation <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvec+w&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\vec w' title='\\vec w' class='latex' \/> and nomenclature &#8220;weight vector&#8221; much clearer, as well as making it a lot clearer what the vector <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvec+w&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\vec w' title='\\vec w' class='latex' \/> means!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the last post I described how to use support vector machines (SVMs) to combine multiple notions of search relevance. After posting I realized I could greatly simplify my discussion of an important subject in the final section of that post. What I can simplify is this: once you&#8217;ve found the SVM parameters, how should&hellip; <a class=\"more-link\" href=\"https:\/\/michaelnielsen.org\/ddi\/addendum-on-search-and-support-vector-machines\/\">Continue reading <span class=\"screen-reader-text\">Addendum on search and support vector machines<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-52","post","type-post","status-publish","format-standard","hentry","category-uncategorized","entry"],"_links":{"self":[{"href":"https:\/\/michaelnielsen.org\/ddi\/wp-json\/wp\/v2\/posts\/52","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/michaelnielsen.org\/ddi\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/michaelnielsen.org\/ddi\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/michaelnielsen.org\/ddi\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/michaelnielsen.org\/ddi\/wp-json\/wp\/v2\/comments?post=52"}],"version-history":[{"count":0,"href":"https:\/\/michaelnielsen.org\/ddi\/wp-json\/wp\/v2\/posts\/52\/revisions"}],"wp:attachment":[{"href":"https:\/\/michaelnielsen.org\/ddi\/wp-json\/wp\/v2\/media?parent=52"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/michaelnielsen.org\/ddi\/wp-json\/wp\/v2\/categories?post=52"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/michaelnielsen.org\/ddi\/wp-json\/wp\/v2\/tags?post=52"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}