Consider two random vectors $\wt{\bx} \in \R^p$ and $\wt{\by} \in \R^q$ of the form $\wt{\bx}=A \mathbf z + \bC_1^{1/2}\mathbf x$ and $\wt{\by}=B \mathbf z + \bC_2^{1/2}\mathbf y$, where $\mathbf x \in \R^p$, $\mathbf y \in \R^q$ and $\mathbf z\in \R^r$ are independent random vectors with i.i.d. entries of zero mean and unit variance, $\bC_1$ and $\bC_2$ are $p \times p$ and $q\times q$ deterministic population covariance matrices, and $A$ and $B$ are $p \times r$ and $q\times r$ deterministic factor loading matrices. Given $n$ independent observations of $(\wt{\mathbf x},\wt{\mathbf y})$, we study the sample canonical correlations between $\wt{\bx}$ and $\wt{\by}$. We consider the high-dimensional setting with finite-rank correlations, that is, ${p}/{n}\to c_1$ and ${q}/{n}\to c_2$ as $n\to \infty$ for some constants $c_1\in (0,1)$ and $c_2\in (0,1-c_1)$, and $r$ is a fixed integer. Let $t_1\ge t_2 \ge \cdots\ge t_r\ge 0$ be the squares of the nontrivial population canonical correlation coefficients between $\wt {\bx}$ and $\wt{\by}$, and let $\wt\lambda_1 \ge \wt\lambda_2\ge \cdots \ge \wt\lambda_{p\wedge q}\ge 0$ be the squares of the sample canonical correlation coefficients. If the entries of $\mathbf x$, $\mathbf y$ and $\mathbf z$ are i.i.d. Gaussian, then the following dichotomy has been shown in \cite{CCA} for a fixed threshold $t_c \in(0, 1)$: for any $1\le i \le r$, if $t_i < t_c$, then $\wt\lambda_i$ converges to the right edge $\lambda_+$ of the limiting eigenvalue spectrum of the sample canonical correlation matrix, and moreover, $n^{2/3}(\wt\lambda_i-\lambda_+)$ converges weakly to the Tracy--Widom law; if $t_i>t_c$, then $\wt\lambda_i$ converges to a deterministic limit $\theta_i \in (\lambda_+, 1)$ that is determined by $c_1$, $c_2$ and $t_i$. In this paper, we prove that these results hold universally under sharp fourth moment conditions on the entries of $\mathbf x$, $\mathbf y$ and $\mathbf z$.
Moreover, we prove the results in full generality, in the sense that they also hold for near-degenerate $t_i$'s and for $t_i$'s that are close to the threshold $t_c$. Finally, we also provide almost sharp convergence rates for the sample canonical correlation coefficients under a general $a$-th moment assumption.
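
For concreteness, the quantities above can be written out as follows. This is only a sketch under the added assumption that $\bC_1$ and $\bC_2$ are nonsingular; the symbols $\Sigma_{xx}$, $\Sigma_{yy}$, $\Sigma_{xy}$, $S_{xx}$, $S_{yy}$, $S_{xy}$ and the sample labels $(\wt\bx_k,\wt\by_k)$, $1\le k\le n$, are notation introduced here for illustration. Since $\mathbf x$, $\mathbf y$ and $\mathbf z$ are independent with zero-mean, unit-variance entries, the population covariances are
\[
\Sigma_{xx} = AA^\top + \bC_1, \qquad \Sigma_{yy} = BB^\top + \bC_2, \qquad \Sigma_{xy} = \Sigma_{yx}^\top = AB^\top ,
\]
and $t_1\ge \cdots \ge t_r$ are the $r$ largest eigenvalues of $\Sigma_{xx}^{-1}\Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}$, a matrix of rank at most $r$. Assuming no centering is applied (the model has zero mean), the sample analogues are
\[
S_{xx} = \frac1n\sum_{k=1}^n \wt\bx_k\wt\bx_k^\top, \qquad S_{yy} = \frac1n\sum_{k=1}^n \wt\by_k\wt\by_k^\top, \qquad S_{xy} = S_{yx}^\top = \frac1n\sum_{k=1}^n \wt\bx_k\wt\by_k^\top ,
\]
and $\wt\lambda_1\ge \cdots\ge \wt\lambda_{p\wedge q}$ are the $p\wedge q$ largest eigenvalues of $S_{xx}^{-1}S_{xy}S_{yy}^{-1}S_{yx}$.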