MathSciDoc: An Archive for Mathematicians

Machine Learning mathscidoc:2206.41013

2022.5
In a Data-Generating Experiment (DGE), the data, $X$, are often obtained either from a Black-Box with inputs $\theta$ and $Y$, or from a quantile function or a learning machine, $f(Y, \theta)$; $\theta$ is an unknown element of a metric space $(\Theta, \rho)$ and $Y$ is random. If $X$ has an intractable or unknown c.d.f., $F_\theta$, non-identifiability of $\theta$ cannot be confirmed and, when present, limits among other things the predictive accuracy of the learned model, $f(Y, \hat{\theta})$, where $\hat{\theta}$ is an estimate of $\theta$. In Machine Learning, non-identifiability of $\theta$ is ubiquitous and its extent is a criterion for selecting a learning machine. Empirical indices, EDI and PPVI, are introduced using P-values of Kolmogorov-Smirnov tests: i) to confirm almost surely, using generated data, the discrimination of $\theta$ from $\theta^*$, namely that the Kolmogorov distance, $d_K(F_\theta, F_{\theta^*})$, is positive; ii) to confirm identifiability of $\theta \in \Theta$ by repeating i) for $\theta^*$ in a sieve of $\Theta$, since neighboring parameter values are in practice indistinguishable; and iii) most important, to compare EDI-graphs of DGEs, preferring more discrimination and less non-identifiability among parameters, and to select one DGE to use. In applications, EDI-graphs confirm non-identifiability in mixture models and in models parametrised with sums of parameters. EDI and PPVI explain why Tukey's g-and-h model (DGE1) has better g-discrimination than the g-and-k model (DGE2), with $h_0 = k_0$, unless the sample size is extremely large. EDI-graphs indicate that Normal learning machines have better parameter discrimination than Sigmoid learning machines and that their parameters are non-identifiable.
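As a rough illustration of step i), the sketch below generates data from Tukey's g-and-h quantile function at two values of $g$ and records how often a two-sample Kolmogorov-Smirnov test rejects equality of the two distributions. It is not the paper's definition of EDI or PPVI, which the abstract does not spell out. The function names (`g_and_h`, `ks_statistic`, `ks_pvalue`, `rejection_rate`), the asymptotic P-value series, the level `alpha`, and the sample sizes are all assumptions chosen for illustration.

```python
import math
import random


def g_and_h(z, g, h=0.0, a=0.0, b=1.0):
    """Tukey's g-and-h transform of a standard normal draw z (g = 0 is the limit case)."""
    core = z if g == 0 else (math.exp(g * z) - 1.0) / g
    return a + b * core * math.exp(h * z * z / 2.0)


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: sup |F_x - F_y| over the pooled sample."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Advance both empirical c.d.f.s past every observation <= v.
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic P-value from the Kolmogorov series (an approximation, not exact)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:  # the series converges slowly here and the P-value is essentially 1
        return 1.0
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def rejection_rate(theta, theta_star, n=500, reps=50, alpha=0.05, seed=0):
    """Hypothetical discrimination summary: share of KS tests with P-value below alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [g_and_h(rng.gauss(0.0, 1.0), theta) for _ in range(n)]
        x_star = [g_and_h(rng.gauss(0.0, 1.0), theta_star) for _ in range(n)]
        if ks_pvalue(ks_statistic(x, x_star), n, n) < alpha:
            rejections += 1
    return rejections / reps
```

Evaluating `rejection_rate(theta, theta_star)` over a grid of `theta_star` values would mimic, loosely, the repetition over a sieve of $\Theta$ described in step ii): values of `theta_star` at which the rate stays near `alpha` are ones the generated data cannot separate from `theta` at that sample size.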
@inproceedings{yannis2022selection,