Multi-Armed Bandit

Recently, while studying Recommender Systems, I realized I needed to study the Multi-Armed Bandit problem. I’ve summarized it based on A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit.

Table of Contents
1. Concept
2. Differences between MAB and Existing Statistical Models

1. Concept
The term Multi-Armed Bandit (hereinafter MAB) comes from gambling: how can someone obtain the maximum profit from N slot machines with different payout distributions within a given time? Given only limited time to pull the N slot machines, there should first be a phase of exploring which machine earns more money (this is called Exploration), followed by a phase of maximizing profit by repeatedly pulling the machines one judges to be best (this is called Exploitation). ...
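
To make the Exploration/Exploitation trade-off concrete, here is a minimal epsilon-greedy sketch. Epsilon-greedy is just one simple bandit strategy (not necessarily the focus of the survey above), and the payout probabilities below are made up purely for illustration:

```python
import random

# Hypothetical payout probabilities for 3 slot machines (unknown to the player).
true_probs = [0.2, 0.5, 0.7]

n_arms = len(true_probs)
counts = [0] * n_arms      # how many times each arm has been pulled
values = [0.0] * n_arms    # running average reward per arm
epsilon = 0.1              # fraction of pulls spent on Exploration

def pull(arm):
    """Simulate one pull: reward 1 with the arm's payout probability, else 0."""
    return 1.0 if random.random() < true_probs[arm] else 0.0

total_reward = 0.0
for t in range(10_000):
    if random.random() < epsilon:
        arm = random.randrange(n_arms)                      # Exploration: try a random machine
    else:
        arm = max(range(n_arms), key=lambda a: values[a])   # Exploitation: pull the best-looking machine
    reward = pull(arm)
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]     # incremental average update
    total_reward += reward

print(values)        # estimates should approach true_probs
print(total_reward)
```

With a small epsilon, most pulls go to the machine currently believed to be best, while the occasional random pull keeps refining the estimates of the other machines.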

February 5, 2019 · 5 min · AngryPark

Attention in NLP

This post summarizes what attention is, focusing on several important papers, and how it is used in NLP.

Table of Contents
Problems with Existing Encoder-Decoder Architecture
Basic Idea
Attention Score Functions
What Do We Attend To?
Multi-headed Attention
Transformer

Problems with Existing Encoder-Decoder Architecture
The most important part of the Encoder-Decoder architecture is how the input sequence is vectorized. In NLP, input sequences vary widely in length, so problems arise when they are compressed into fixed-length vectors. For example, the sentences “Hello” and “The weather is nice today but the fine dust is severe, so make sure to wear a mask when you go out!” contain very different amounts of information, yet the encoder-decoder structure must convert both into vectors of the same length. As the word suggests, attention was first proposed to reduce this information loss and solve the problem more intuitively by modeling which parts of the sequence should be paid particular attention to. ...
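
As a rough illustration of the basic idea, here is a minimal NumPy sketch of dot-product attention, one common score function. The shapes and random values are purely illustrative and are not taken from any of the papers discussed in the post:

```python
import numpy as np

def dot_product_attention(query, keys, values):
    """Basic dot-product attention: weight each encoder state by its
    similarity to the decoder query, then return the weighted sum."""
    scores = keys @ query                      # (seq_len,) similarity of the query to each key
    weights = np.exp(scores - scores.max())    # softmax over the source positions
    weights /= weights.sum()
    context = weights @ values                 # (hidden,) context vector, size independent of seq_len
    return context, weights

# Illustrative shapes: a 7-word source sentence with 4-dimensional hidden states.
seq_len, hidden = 7, 4
encoder_states = np.random.randn(seq_len, hidden)
decoder_query = np.random.randn(hidden)

context, weights = dot_product_attention(decoder_query, encoder_states, encoder_states)
print(weights)   # attention distribution over the 7 source positions
print(context)   # fixed-size summary built from the most relevant positions
```

Instead of forcing the whole sentence into one fixed vector up front, the decoder can recompute a fresh context vector at every step, drawing mostly on the source positions it currently attends to.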

January 26, 2019 · 4 min · AngryPark