Data Science/Statistics

The binomial distribution is best understood as a Bernoulli distribution with multiple trials. For a random variable X, if an event has only two possible outcomes, we say X follows a Bernoulli distribution. For a probability p we write X ~ Bern(p), which is the same as X ~ B(1, p). To look at the Bernoulli distribution a little more closely: E(X) = 1*p + 0*(1-p) = p, Variance = p(1-p), and STDEV = sqrt(p(1-p)). By convention, we denote the probability of the more likely of the two outcomes by p and that of the other by 1-p, or q. Also, when we want to apply the Bernoulli distribution to a situation, for each event, 1 and 0..
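As a rough illustration of the Bernoulli formulas in this excerpt, the sketch below computes the mean, variance, and standard deviation for a given p, and checks that the binomial pmf with n = 1 reduces to the Bernoulli pmf. The function names and the value p = 0.6 are assumptions chosen for the example, not something taken from the post.

```python
import math

def bernoulli_moments(p):
    """Return (mean, variance, standard deviation) for X ~ Bern(p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    mean = 1 * p + 0 * (1 - p)      # E(X) = 1*p + 0*(1-p) = p
    variance = p * (1 - p)          # Var(X) = p(1-p)
    return mean, variance, math.sqrt(variance)

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

p = 0.6  # illustrative value, not from the post
mean, variance, stdev = bernoulli_moments(p)
print(mean, variance, stdev)

# With n = 1 the binomial pmf gives P(X=1) = p and P(X=0) = 1-p,
# which is exactly the Bernoulli distribution.
print(binomial_pmf(1, 1, p), binomial_pmf(0, 1, p))
```

Floating-point rounding means the printed variance may differ from 0.24 in the last digits.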
Combinations, more familiar to us as nCr, are the number of ways to pick a given number of elements from a set. The important point is that, unlike a permutation, order does not matter. For example, if we pick 3 students to represent our class at school, the order in which they are picked makes no difference, right? If the 3 students are [Kim Junsu, Choi Nayoung, Park Minji], that is the same combination whether we picked [Kim Junsu, Park Minji, Choi Nayoung] or [Park Minji, Choi Nayoung, Kim Junsu]. Right? The combinations formula nCr is as follows. nCr = n! / ((n-r)! * r!) (no repetition allowed) In our example, say the class has 10 students in total. Then n = 10, r = 3, and 10C3 = ..
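One way to check the nCr formula is to compare it against a brute-force enumeration. This is a minimal sketch: `n_choose_r` and the placeholder student labels are hypothetical names introduced here, and `math.comb` is Python's built-in equivalent (Python 3.8+).

```python
import math
from itertools import combinations

def n_choose_r(n, r):
    """nCr = n! / ((n-r)! * r!), combinations without repetition."""
    return math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

# Hypothetical class of 10 students, labeled only for illustration.
students = [f"student_{i}" for i in range(1, 11)]

# Enumerate every unordered group of 3 representatives.
groups = list(combinations(students, 3))

print(n_choose_r(10, 3))   # 120
print(len(groups))         # 120
print(math.comb(10, 3))    # 120

# Order does not matter: the same three people form one combination.
same = {"Kim Junsu", "Choi Nayoung", "Park Minji"} == {"Park Minji", "Choi Nayoung", "Kim Junsu"}
print(same)                # True
```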
In combinatorics, a variation does not mean variance. A variation is the total number of ways to choose and arrange a given number of elements from a given set. An exact Korean translation is hard to find, so I will simply call it a variation. The formula for the number of variations v is as follows. v = n^p, where n = total number of elements and p = the number of positions. For example, suppose we have to guess a two-digit numeric passcode in which each digit can be 0 through 9. There are 10 possible values, 0 through 9, so n = 10, and there are two values to guess, so p = 2. That is, v = 10^2 = 100, so there are 100 possible variations for this event.  This ..
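The passcode example can be verified by listing every possible code. The sketch below assumes that each position may reuse any digit, which is the case v = n^p counts (variations with repetition); `count_variations` is a hypothetical helper name.

```python
from itertools import product

def count_variations(n, p):
    """Number of variations with repetition: v = n^p."""
    return n ** p

digits = range(10)  # 0 through 9, so n = 10

# Every ordered pair of digits is one candidate two-digit passcode (p = 2).
codes = list(product(digits, repeat=2))

print(count_variations(10, 2))  # 100
print(len(codes))               # 100
print(codes[0], codes[-1])      # (0, 0) (9, 9)
```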
Permutations, one of the key elements of combinatorics, focus on finding the ways in which elements can be arranged. For example, suppose that all we know about a World Cup is that countries A, B, and C took 1st through 3rd place, and let us work out every possible ranking.
1st  2nd  3rd
A    B    C
A    C    B
B    A    C
B    C    A
C    A    B
C    B    A
There are 6 possible cases in total, and that count was 3 * 2 * 1 = 3!. When arranging (or picking) r elements out of n elements, the number of possible ways, nPr, is as follows. nPr = n! / (n-r)! In our example n = 3 and r = 3, so 3P3 = 3! / (3-3)! = 3! / 1 = 3! = 6. (0 factorial is 1.)  As you probably all know here..
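The ranking example can be reproduced in a few lines. This is only a sketch: `n_permute_r` is a hypothetical helper that mirrors the nPr formula, and `itertools.permutations` is used to list the orderings themselves.

```python
import math
from itertools import permutations

def n_permute_r(n, r):
    """nPr = n! / (n-r)!, ordered arrangements of r elements out of n."""
    return math.factorial(n) // math.factorial(n - r)

# All possible 1st/2nd/3rd place orderings of countries A, B, C.
rankings = ["".join(order) for order in permutations("ABC")]

print(rankings)           # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
print(len(rankings))      # 6
print(n_permute_r(3, 3))  # 6, since (3-3)! = 0! = 1
```

Passing a second argument, as in `permutations("ABC", 2)`, lists arrangements of r = 2 elements, whose count matches `n_permute_r(3, 2)`.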
Chan Lee
List of posts in the 'Data Science/Statistics' category