Expanding on the above
My interests and research have concerned, rather fundamentally, the mathematics and physics of complex systems, so I have a certain bias toward particular approaches to randomness here. For example, from an elementary text I’ve used:
“Before the discovery of this phenomenon, all studies of random processes and of chaos were usually conducted within the frame of classical theory of probability, which requires one to define a set of random events or a set of random process realizations or a set of other statistical ensembles. After that, probability itself is assigned and studied as a measure on this set, which satisfies Kolmogorov’s axioms [2]. The discovery of deterministic chaos radically changed this situation.
Chaos was found in dynamical systems, which do not contain elements of randomness at all, i.e. they do not have any statistical ensembles. On the contrary, the dynamic of such systems is completely predictable, the trajectory, assigned precisely by its initial conditions, reproduces itself precisely, but nevertheless its behavior is chaotic…the phenomenon of deterministic chaos requires a deeper understanding of randomness, not based on the notion of a statistical ensemble.”
Bolotin, Y., Tur, A., & Yanovsky, V. (2009). Chaos: Concepts, Control and Constructive Use (Understanding Complex Systems). Springer.
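To make the quoted point concrete, here is a minimal sketch (my own example, not from Bolotin et al.) using the logistic map x_{n+1} = r·x_n·(1 − x_n) at r = 4: the dynamics are completely deterministic, with no statistical ensemble anywhere, yet two trajectories whose initial conditions differ by one part in 10^10 become completely decorrelated within a few dozen iterations.

```python
# Minimal sketch of deterministic chaos (not from Bolotin et al.):
# the logistic map x_{n+1} = r * x_n * (1 - x_n) at r = 4 is fully
# deterministic -- no element of randomness anywhere -- yet trajectories
# from nearly identical initial conditions diverge exponentially.

def logistic_trajectory(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)   # perturb by one part in 10^10

for n in (0, 10, 25, 50):
    print(f"n={n:2d}  |a - b| = {abs(a[n] - b[n]):.3e}")
# The gap grows from 1e-10 to order 1: the trajectory assigned by x0
# reproduces itself exactly on re-running, yet behaves "randomly".
```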
I also have Bayesian leanings when it comes to probability theory and statistics. The common notion of randomness in terms of sequences and distributions, the one taught in undergraduate (and to a certain extent graduate) probability courses, is heavily influenced by the work of Ronald Fisher, who almost single-handedly banished the Bayesian perspective from respectable circles for about half a century or more (others involved include von Mises, Kolmogorov himself, Popper, Neyman & Pearson, and the statistical community’s response to Fisher in particular). Historically, nearly everybody distinguished between chance (and probability) and randomness until almost the 18th century. Bayes was one of the first to identify the two and drop this distinction, and Bayesians continue to regard randomness (chances) in terms of subjective (a priori) probabilities. For Bayesians, probability is fundamentally a matter of (rational) degrees of belief; see, e.g., the following (a short sketch of belief updating follows these references):
Press, S. J. (2003). Subjective and Objective Bayesian Statistics: Principles, Models, and Applications (2nd Ed.) (Wiley Series in Probability and Statistics). Wiley.
(“probability reflects a degree of randomness”)
Jaynes, E. T. (2003). Probability Theory: The Logic of Science. Cambridge University Press.
(“In the vast majority of real applications there are no ‘random variables’ (What defines ‘randomness’?) and no ‘true distribution’...”)
Howson, C., & Urbach, P. (2006). Scientific Reasoning: The Bayesian Approach (3rd Ed.). Open Court.
(“'random variable' does not have to refer to a random procedure: there, it was just a way of describing the various possibilities determined by the parameters of some application. Indeed, not only do random variables have nothing necessarily to do with randomness, but they are not variables either”)
D'Agostini, G. (2003). Bayesian Reasoning in Data Analysis: A Critical Introduction. World Scientific.
(“In the subjective approach random variables (or, better, uncertain numbers) assume a more general meaning than that they have in the frequentistic approach: a random number is just any number in respect of which one is in a condition of uncertainty.”)
Hartigan, J. A. (1983). Bayes Theory (Springer Texts in Statistics). Springer.
(I include this text because it is rather short, assumes the student is familiar with measure-theoretic (graduate-level) probability, and is a very nice, concise treatment of Bayesian statistics and probability for those with a background sufficient for a rigorous approach)
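As a minimal illustration of probability as a (rational) degree of belief, here is a sketch of Bayesian updating on a discrete grid; the uniform prior, the grid size, and the hypothetical coin-flip data are all assumptions chosen for the example, not anything from the texts above.

```python
# Minimal sketch: Bayesian updating of a degree of belief about a coin's
# bias theta = P(heads), on a discrete grid rather than via conjugate
# algebra. The uniform prior and the data are assumptions for illustration.

n_grid = 1001
thetas = [i / (n_grid - 1) for i in range(n_grid)]
prior = [1.0 / n_grid] * n_grid        # uniform prior: no initial preference

heads, tails = 7, 3                    # hypothetical observed data

# Likelihood of the data for each candidate theta (Bernoulli trials).
likelihood = [t**heads * (1 - t)**tails for t in thetas]

# Bayes' theorem: posterior is proportional to prior times likelihood.
unnorm = [p * l for p, l in zip(prior, likelihood)]
z = sum(unnorm)
posterior = [u / z for u in unnorm]

# Posterior mean: the updated degree of belief about theta.
post_mean = sum(t * p for t, p in zip(thetas, posterior))
print(f"Posterior mean of theta: {post_mean:.3f}")   # ~0.667 = (7+1)/(10+2)
```

Note that nothing here requires a “random” mechanism: the update simply quantifies (and revises) uncertainty about a fixed but unknown quantity.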
In frequentist approaches, probability itself is divided into “probability theory” and the likelihood Fisher defined, which, despite an initially highly negative reaction (in no small part due to the ad hoc way “likelihood” was distinguished from “probability”), is now almost universal in statistics and found everywhere in applied probability. However, a central component of frequentist probability is the “random” experiment, alongside “random sampling.” The two are related concepts, mostly via their idealized nature. “Experiments” are conceived of in terms of a random sample from a set of infinitely many identical experiments. Random sampling more generally is the idea that the method of obtaining samples from a population did not itself introduce biases (i.e., the sample could turn out biased, but the sampling procedure was not). In fact, insofar as “random” can be readily characterized in frequentist probability theory at all, it is in terms of (the absence of) bias. Random variables are, as they always are, merely functions, and no more random than they are variables (likewise with random vectors). In probability theory more generally, randomness might be (and has been) said to be yet again a measure of uncertainty, characterized by entropy (e.g., “In probability theory, entropy is a measure of the disorder and randomness present in a distribution”; Liu, L., & Yager, R. R. (2008). Classic works of the Dempster-Shafer theory of belief functions: An introduction. In R. R. Yager & L. Liu (Eds.), Classic Works of the Dempster-Shafer Theory of Belief Functions (pp. 1-34). Springer.)
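Since the entropy characterization just quoted is easy to make concrete, here is a small sketch (the example distributions are my own assumptions): Shannon entropy assigns maximal “randomness” to the uniform distribution and zero to a certainty.

```python
import math

# Sketch: Shannon entropy H(p) = -sum p_i log2(p_i) as a "measure of the
# disorder and randomness present in a distribution". The example
# distributions below are assumptions chosen for illustration.

def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # maximal uncertainty over 4 outcomes
skewed  = [0.70, 0.15, 0.10, 0.05]   # same outcomes, less "randomness"
certain = [1.00, 0.00, 0.00, 0.00]   # no uncertainty at all

for name, p in [("uniform", uniform), ("skewed", skewed), ("certain", certain)]:
    print(f"{name:8s} H = {entropy(p):.3f} bits")
# uniform gives log2(4) = 2 bits, the maximum; certain gives 0 bits.
# Entropy quantifies uncertainty in the distribution itself, with no
# appeal to any "random" mechanism producing the outcomes.
```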
The problem with the notion of randomness conceived of simply as “something with equally likely outcomes” (apart from the fact that in statistics and probability “likelihood” generally has a technical meaning distinct from that of probability) is that:
1) Usually, it is impossible to actually define all outcomes
2) Most of the time, outcomes are “equally likely” only in the trivial sense that each individual outcome has probability 0, and this holds for infinitely many non-uniform (continuous) distributions just as it does for the uniform one (see the sketch after this list).
3) Random measures are set-theoretic functions defined relative to the sigma-algebras of a probability triple; they correspond not to any intuitive notion of randomness, “random sampling,” “random variables,” etc., but to Polish (e.g., locally compact) spaces endowed with a mapping M on (the Borel sigma-algebra of) a set E.
4) The problems with “random” in “random sampling”
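To spell out point 2 (a standard fact, written here in my own notation): for any random variable with a continuous density, uniform or not, every individual outcome has probability zero.

```latex
% Point 2 made explicit: for any random variable X with a density f
% (uniform or not), each individual outcome x has probability zero:
\[
  P(X = x) \;=\; \int_{x}^{x} f(t)\,dt \;=\; 0
  \qquad \text{for every } x .
\]
% This holds, e.g., for the (non-uniform) standard normal density
% f(t) = e^{-t^2/2}/\sqrt{2\pi}, so "equally likely outcomes" cannot
% distinguish the uniform distribution from any other continuous one.
```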
Concerning 4, consider one of my favorite teaching examples: selecting numbers “at random” from an interval on the real line (say, the interval [0,1]). Define a probability triple on this set and consider the Dirichlet function (the characteristic function of the rational numbers), i.e., the function equal to 1 whenever its argument is a rational number and 0 otherwise. There are, of course, infinitely many rational numbers in this interval, and indeed between any two rational numbers there are infinitely many other rational numbers. So, what can we say about this function’s probability distribution? That is, what is the probability that a “randomly” sampled number from this interval will be rational, given that there are infinitely many rational numbers in the interval? The answer is 0. The probability that any member of the entire set of infinitely many rational numbers will appear in a “random sample” from this (or any other) interval is 0, because the sampled number will be irrational “almost surely” (“a.s.”, the probabilist’s version of “almost everywhere” or “a.e.”).
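The standard argument behind this answer, sketched in my own notation with Lebesgue measure λ on [0,1] as the underlying probability: the rationals are countable, so they can be covered by intervals of arbitrarily small total length.

```latex
% The rationals in [0,1] are countable, so enumerate them as q_1, q_2, ...
% and cover each q_n by an interval of length \varepsilon / 2^n.
% Countable subadditivity of Lebesgue measure \lambda then gives
\[
  \lambda\bigl(\mathbb{Q}\cap[0,1]\bigr)
    \;\le\; \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n}}
    \;=\; \varepsilon
  \qquad \text{for every } \varepsilon > 0 ,
\]
% hence \lambda(\mathbb{Q}\cap[0,1]) = 0: the sampled number is irrational
% almost surely, even though the rationals are dense in the interval.
```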