2
Why would a statistical model overfit when given a very large data set?
My current project may require me to build a model to predict the behavior of a certain group of people. The training data set contains only 6 variables (id is for identification purposes only): id, age, income, gender, job category, monthly spend, where monthly spend is the response variable. But the training data set contains about 3 million rows, …
8
modeling
large-data
overfitting