전체 κΈ€

Python, C++, Data Science 곡뢀 λΈ”λ‘œκ·Έ μž…λ‹ˆλ‹€.
The main goal of data science is learning about the world from data using computational methods.There are 3 key parts of data science.  - ExplorationIdentifying patterns in dataUses visulizations - InferenceQuantifying whether those patterns are reliableUses randomization - PredictionMaking informed guesses about unobserved dataUses machine learning  There are two important concepts about the re..
이번 ν¬μŠ€νŠΈμ—μ„œλŠ” μ„ ν˜• νšŒκ·€ λͺ¨λΈμ„ μ μš©ν•˜κΈ° μœ„ν•œ λͺ‡κ°€μ§€μ˜ 핡심 가정듀을 μ•Œμ•„λ³΄κ² μŠ΅λ‹ˆλ‹€.이 가정듀이 사싀이 μ•„λ‹ˆλΌλ©΄, μ΅œμ†Œμ œκ³±λ²•μ„ μ μš©ν•˜μ—¬ λͺ¨λΈμ„ λ””μžμΈ ν–ˆμ„ λ•Œ λ¬΄μ˜λ―Έν•˜κ³  λΆ€μ •ν™•ν•œ 결과값이 λ„μΆœλ  κ²ƒμ΄λ―€λ‘œ 이 점듀을 μœ μ˜ν•˜λŠ”κ²Œ μ’‹κ² μŠ΅λ‹ˆλ‹€. 1. Linearity (μ„ ν˜•μ„±)이름뢀터가 μ„ ν˜• νšŒκ·€μž–μ•„μš”? κ° 독립 λ³€μˆ˜λŠ” κ³ μœ ν•œ κ³„μˆ˜κ°€ 곱해지고, 이λ₯Ό λ‹€ ν•©ν•΄μ„œ μ’…μ†λ³€μˆ˜λ₯Ό λ„μΆœν•©λ‹ˆλ‹€. μ„ ν˜•μ„±μ„ νŒλ‹¨ν•˜λŠ” μ‰¬μš΄ 방법은 λ¬΄μ—‡μΌκΉŒμš”? λ…립 λ³€μˆ˜ 쀑 ν•˜λ‚˜(x1)λ₯Ό λ½‘μ•„μ„œ 쒅속 λ³€μˆ˜(y)에 λŒ€ν•΄μ„œ scatter plot을 κ·Έλ €λ³΄μ„Έμš”. κ·ΈλŸΌ  μ–ΌμΆ” λ°©ν–₯성이 보일텐데, 이게 μΌμ°¨ν•¨μˆ˜λ©΄ μ„ ν˜•μ„±μ΄ μžˆλŠ” 것이고, κ³‘선이 보이면 μ„ ν˜•μ„±μ΄ λΆ€μ‘±ν•œ 데이터겠죠?그리고 그런 κ²½μš°μ—λŠ” μ„ ν˜• νšŒκ·€κ°€ μ•„λ‹Œ λ‹€λ₯Έ 방법을 ν†΅ν•΄μ„œ 예츑 λͺ¨λΈμ„ λ””μž..
μ§€λ‚œ ν¬μŠ€νŠΈμ—μ„œλŠ” R-squared, κ²°μ • κ³„μˆ˜μ— λŒ€ν•΄μ„œ μ•Œμ•„λ³΄μ•˜μŠ΅λ‹ˆλ‹€.μ΄λŠ” 우리의 νšŒκ·€ λͺ¨λΈμ΄ μ‹€μ œ λ°μ΄ν„°μ˜ 뢄산을 μ–Όλ§ˆλ‚˜ 잘 μ„€λͺ…ν•˜λŠ”μ§€λ₯Ό λ‚˜νƒ€λ‚Έ κ°’μœΌλ‘œ, SST/SSR μ΄μ˜€μŠ΅λ‹ˆλ‹€. Linear Regression (μ„ ν˜• νšŒκ·€) - 6 | R-Squared (κ²°μ • κ³„μˆ˜)μ§€λ‚œ ν¬μŠ€νŠΈμ—μ„œλŠ” OLS, μ΅œμ†Œ μ œκ³±λ²•μ— λŒ€ν•΄μ„œ κ°„λ‹¨ν•˜κ²Œ μ•Œμ•„λ³΄μ•˜μŠ΅λ‹ˆλ‹€. Linear Regression (μ„ ν˜• νšŒκ·€) - 5 | Ordinary Least Squares (μ΅œμ†Œμ œκ³±λ²•)μ €λ²ˆ ν¬μŠ€νŠΈμ—μ„œλŠ” SST, SSR, SSE의 관계에 λŒ€ν•΄μ„œ μ•Œμ•„λ³΄μ•˜μŠ΅code-studies.tistory.com μ΄λ²ˆ ν¬μŠ€νŠΈμ—μ„œλŠ” μˆ˜μ •λœ κ²°μ •κ³„μˆ˜, Adjusted R Squared에 λŒ€ν•΄μ„œ μ•Œμ•„λ³΄κ² μŠ΅λ‹ˆλ‹€.μš°λ¦¬κ°€ 자주 보던 이 Regerssion summa..
μ§€λ‚œ ν¬μŠ€νŠΈμ—μ„œλŠ” OLS, μ΅œμ†Œ μ œκ³±λ²•μ— λŒ€ν•΄μ„œ κ°„λ‹¨ν•˜κ²Œ μ•Œμ•„λ³΄μ•˜μŠ΅λ‹ˆλ‹€. Linear Regression (μ„ ν˜• νšŒκ·€) - 5 | Ordinary Least Squares (μ΅œμ†Œμ œκ³±λ²•)μ €λ²ˆ ν¬μŠ€νŠΈμ—μ„œλŠ” SST, SSR, SSE의 관계에 λŒ€ν•΄μ„œ μ•Œμ•„λ³΄μ•˜μŠ΅λ‹ˆλ‹€. Linear Regression (μ„ ν˜• νšŒκ·€) - 4 | SST = SSR + SSEμ €λ²ˆ ν¬μŠ€νŠΈμ—μ„œλŠ” μ„ ν˜• νšŒκ·€ λͺ¨λΈμ„ λ””μžμΈ ν•˜λŠ” κ³Όμ •μ—μ„œ StatsModels 라이브러리λ₯Ό ν™œcode-studies.tistory.com 이번 ν¬μŠ€νŠΈμ—μ„œλŠ” κ²°μ •κ³„μˆ˜, R Squared에 λŒ€ν•΄μ„œ μ•Œμ•„λ³΄κ² μŠ΅λ‹ˆλ‹€.4번 ν¬μŠ€νŠΈμ—μ„œ SST (Total Variability) = SSR (Explained Variability) + SSE (Unexplained Variabil..
Chan Lee
Chan Code & DS πŸ§‘‍πŸ’»πŸ“Š