Sampling-Bias-Corrected Neural Modeling for Large Corpus Item Recommendations

I first came across this paper last month, when Google Brain released a library called TensorFlow Recommenders. Google runs YouTube, an enormous recommender system, so when it announced it was open-sourcing recommender-related code I took a close look. The TensorFlow Blog covers the release in more detail, so I recommend reading it.

The goals of TFRS (TensorFlow Recommenders) are as follows:

- Build recommendation candidates quickly and flexibly
- A structure that can freely use item, user, and context information
- A multi-task structure that trains several objectives at the same time
- Serve trained models efficiently with TF Serving

To be honest, the code itself did not contain a wide variety of things, but what impressed me most was the Two Tower Model that the code introduces as its basic model. The user and the item are encoded by two completely independent towers, and click / no-click is predicted only by a dot product at the final stage. The more I thought about it, the better the structure seemed. Because the user tower and the item tower cannot interact during training, it was unclear whether the model would reach outstanding accuracy. On the other hand, the structure places no constraints on the input features, so any available feature can be added freely. At inference time you can keep one embedding per user and one per item and compute similarity with nothing but a dot product, so it also looked highly compatible with ANN (Approximate Nearest Neighbors) libraries. ...
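The two-tower idea described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the TFRS implementation: the `tower` function, the random features and weights, and the brute-force `top_k` search (standing in for an ANN library) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def tower(features, weights):
    """Hypothetical one-layer tower: linear projection, then L2 normalisation."""
    emb = features @ weights
    return emb / np.linalg.norm(emb, axis=-1, keepdims=True)

# Assumed toy sizes; the two towers may take inputs of different widths.
n_users, n_items = 4, 1000
user_dim, item_dim, emb_dim = 8, 12, 16

user_features = rng.normal(size=(n_users, user_dim))
item_features = rng.normal(size=(n_items, item_dim))
w_user = rng.normal(size=(user_dim, emb_dim))  # untrained stand-in weights
w_item = rng.normal(size=(item_dim, emb_dim))

# Each tower sees only its own inputs; item embeddings can be computed offline.
item_emb = tower(item_features, w_item)
user_emb = tower(user_features, w_user)

# The only user-item interaction is the final dot product.
scores = user_emb @ item_emb.T  # shape: (n_users, n_items)

def top_k(score_matrix, k):
    """Exact top-k per row; an ANN index would approximate this step at scale."""
    idx = np.argpartition(-score_matrix, k - 1, axis=1)[:, :k]
    order = np.argsort(-np.take_along_axis(score_matrix, idx, axis=1), axis=1)
    return np.take_along_axis(idx, order, axis=1)

recommendations = top_k(scores, k=5)
```

Because the item embeddings do not depend on the user, they can be stored in an index ahead of time, and serving reduces to one user-tower forward pass plus a nearest-neighbor lookup.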

October 31, 2020 · 5 min · AngryPark