Data Science

The activities covered up to the previous post, namely the processing of data and big data and business intelligence, were all oriented toward analyzing what happened in the past. This post introduces the traditional methods among the forward-looking analyses that all of this ultimately builds toward. Traditional methods are defined as follows: "A set of methods that are derived mainly from statistics and are adopted for business." In other words, the term covers methods that are inferred from statistics, that is, statistical values computed from sample data, and that are applied in business. For predicting future performance, these traditional methods have high ..
What is Business Intelligence (BI)? Business Intelligence, or BI, refers to all the tools involved in analyzing, understanding, explaining, and reporting on past data. An intuitive way to think about it is analyzing whether a business's revenue increased, and why it increased. Through BI we want to gain ideas and insight that help with future decisions. A real-world example is price optimization, such as raising hotel prices during peak season (when demand rises): room prices are optimized based on past guest data in order to maximize revenue. To understand BI better, here are a few concepts. Observation: not able to be handled mathematically..
Big data stands in contrast to traditional data and refers to extremely large data sets. Big data can be structured, semi-structured, or unstructured. It is usually stored distributed across multiple computers. You can think of it as data far larger in scale than anything we encounter day to day. Pre-processing is very important for big data. Some types of big data pre-processing are as follows. Types of Pre-processing: 1. Class Labeling (number, text, digital image, di..
Raw data, also called raw facts or primary data, is original data that has not been processed. Raw data cannot be analyzed directly; it has to be processed before it can be used for analysis. Before that processing, however, it must go through pre-processing. This post first lists the types of pre-processing from the perspective of traditional data. Here, traditional data refers to structured data in a volume that a single computer can handle. The opposing concept is big data. The types of pre-processing for traditional data are as follows. Pre-processing for Traditional Data: 1. Class Labeling: ..
Chan Lee
Posts in the 'Data Science' category (Page 8)