A Computational Approach to Constructing Biology | Tomsk State University Journal of Philosophy, Sociology and Political Science. 2021. № 61. DOI: 10.17223/1998863X/61/5

A Computational Approach to Constructing Biology

According to some critics, if biology is a kind of reverse engineering of nature, it is rather poorly equipped for the task. Thus, the problem lies rather with its ontology. Numerous hypotheses and conjectures contained in papers on the methodological issues of biology hold that living systems should be viewed as complex networks of signal-transmission channels, both neural and non-neural, which are characterized by modularity, possess feedback circuits, and are prone to emergent properties and increasing complexity. If this is so, we are on the threshold of a new stage in the development of computer models, when not only are computers used to imitate life, but life itself is viewed as a complex network of interacting natural computers.

The issue with ontology

In 2002, Yuri Lazebnik, then an associate professor at Cold Spring Harbor Laboratory, published a paper [1] that included a personal story. When moving from Russia to the USA, his wife brought with her a broken radio set made in the USSR. Struggling with a paradox according to which the more empirical facts we gather in biology, the less we seem to know, Lazebnik turned to this device as a clarifying metaphor for what we lack when approaching the empirical field in the discipline. According to him, if a biologist tried to find out the working principles of the radio in order to fix it, s/he would go a long way classifying its small components by their color and shape, turning them off one at a time to see how this affects the sound, and so on. Eventually, the researcher would come up with an assumed functional scheme made of labels and arrows representing a qualitative conjecture as to which parts are important for bringing up the sound. Whether or not we call this outcome new knowledge, the design, even if it happens to be generally right, will never allow for predicting any facts, because it lacks a formal descriptive language and quantitative measures similar to those used by radio engineers. Thus, it does not tell us how the system is really tuned to be functional. Lazebnik concludes that the probability of a biologist fixing the radio is approximately that of a monkey typing out a Robert Burns poem.

According to him, biology lacks a formal language that, akin to the language of radio circuits, would contain terms for typical abstract elements and their quantitative properties. Such a language, if it existed, would allow for building explanatory models by putting typical elements into varying combinations and projecting their functional connections by indicating the numerical values of their properties. Radio provides an impressive metaphor for what biology is in need of, but it should be noted that engineers are not alone in possessing such a language: physics as a science started when Newton established that the world of mechanics should contain only physical bodies that are characterized solely by mass and move uniformly in a straight line unless affected by some force. Set against our everyday experience, this picture is counterintuitive, but, together with the appropriate mathematics, it proved to be an effective explanatory tool.

Here I would like to propose an important distinction. What Lazebnik means by a "formal language" is not a language of propositions about the world, i.e., of asserting some states of affairs. His analogy with radio circuits reveals that what is meant is a language for listing the relevant types of objects and their relations. Hereafter I will refer to it as a domain ontology. Besides that, a Newtonian-style science needs another formal language, built over the former, to generate descriptive propositions. This propositional language will most probably be mathematical - at least, it has been so far, securing the overall progress in epistemic efficiency. Therefore, if we adopt Lazebnik's argument (and I do), then the issue with biology is not a shortage of mathematics therein, as we have seen numerous attempts to quantify or formalize our knowledge of life: quantitative and formal tools have been applied, such as statistics in experimental design, pattern seeking in bioinformatics, and models in evolution, ecology, and epidemiology [2].
The fact is that they have not led to a theoretical integration of biology so far. Rather, on this view, biology lacks a unified formal language that would name and describe some ultimate elements, combinations of which make for various live-matter designs. Thus, the issue is more likely with its domain ontology.

There are popular views according to which a domain ontology is inferred by a theory proper. Specifically, it is construed as a system of existential presuppositions (E) implied by the theory's propositions (T). For instance, if your theory asserts that living organisms evolve under heredity and variability, it presupposes that there are living organisms. On this basis, it is often inferred that if T is true and it implies a certain E, then the latter is true as well. But remember that when Sadi Carnot proposed his famous cycle in 1824, he was a proponent of the caloric theory, i.e., he believed that heat was carried by a self-repellent fluid. If the true description of the Carnot cycle is T and the belief in caloric is E, and T ⊢ E holds, then we must still believe in caloric, which is not the case. Therefore, the relation of T and E is not that of inference, but rather that of interpretation. That is, in its formal expression, for T to be true, its constituent terms must be interpreted on this or that ontological model, which, in Lazebnik's example, is the set of abstract radio elements (a capacitor, a resistor, etc.). But we could also try to interpret the circuit of his radio set on, say, the characters of a Shakespeare play. I suspect that, with an appropriate quantitative tuning of this ontological model, we could still keep the circuit as a valid theory. Therefore, domain ontologies are inferentially independent of theories proper. The illusion that T ⊢ E holds for natural sciences stems from our natural-language reasoning, according to which, if "The present king of France is bald" is true, then there is such a king. But, in scientific contexts, T's are usually formal expressions that are true, among other conditions, when interpreted on a relevant model. And the fact is that there may be more than one relevant model for T to keep its truth value. This is important for the further discussion.

Appropriate maths for life science

As has already been said, attempts to incorporate mathematics into biology have been numerous, and some of them are locally successful, but they have not changed the overall picture in principle, for theoretical biology still lacks universal principles and a unified formal language by which all the true propositions could be deduced. Among the most interesting attempts was Alan Turing's article [4] on morphogenesis, where the founder of the most viable theory of digital computation proposed some analog tools to explain the emergence of biological complexity from initial homogeneity. The formal tools were mainly constrained to linear differential equations with constant coefficients. The theory posits two morphogens: one called the "activator" and the other the "inhibitor". The activator produces itself at a rate proportional to its abundance; it also produces the inhibitor, by which it is in turn inhibited. While both diffuse all over the space, the inhibitor does so faster. According to Turing's calculations, at the first stage the initial homogeneity is broken by small chance fluctuations, and at the following stages these random fluctuations are amplified. In the end, the activator gathers in multiple patches with empty areas between them. Thus, through a series of bifurcations within the initially homogeneous solution, structure is born out of equability.
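To make the mechanism concrete, here is a minimal numerical sketch of an activator-inhibitor system. It uses a non-linear reaction scheme of the Gierer-Meinhardt type rather than Turing's original linear equations, and all parameter values are illustrative assumptions chosen merely to place the system in the patterning regime.

```python
import numpy as np

# Minimal 1-D activator-inhibitor sketch (a Gierer-Meinhardt-type scheme,
# not Turing's original linear system). All parameter values are
# illustrative assumptions chosen to lie in the patterning regime.
n, steps, dt, dx = 200, 20000, 0.01, 1.0
Da, Dh = 0.5, 10.0                  # the inhibitor diffuses much faster
rho, mu_a, mu_h = 1.0, 1.0, 2.0     # production and decay rates

rng = np.random.default_rng(0)
a = 2.0 + 0.01 * rng.standard_normal(n)   # homogeneous steady state (a = h = 2)
h = 2.0 + 0.01 * rng.standard_normal(n)   # plus small random perturbations

def laplacian(u):
    # discrete Laplacian on a ring (periodic boundaries)
    return (np.roll(u, 1) - 2 * u + np.roll(u, -1)) / dx**2

for _ in range(steps):
    # activator autocatalysis (rho*a^2/h) is damped by the inhibitor,
    # which the activator itself produces (rho*a^2)
    da = rho * a * a / (h + 1e-9) - mu_a * a + Da * laplacian(a)
    dh = rho * a * a - mu_h * h + Dh * laplacian(h)
    a += dt * da
    h += dt * dh

peaks = np.where((a > np.roll(a, 1)) & (a > np.roll(a, -1)) & (a > a.mean()))[0]
print(len(peaks), "activator patches, at cells", peaks)
```

With the inhibitor diffusing twenty times faster than the activator, the run ends with the activator concentrated in a handful of roughly evenly spaced patches - the "structure out of equability" described above.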
Bascompte [5] mentions the mathematical theory of deterministic chaos, which was largely inspired by biology and uses non-linear equations, whereby inexact representations (truncated to a certain number of decimal places) of the real-number data derived from observations of complex non-equilibrium systems cause unexpected consequences over time.
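A standard illustration of this sensitivity (my example, not Bascompte's) is the logistic map in its chaotic regime: an initial condition truncated after the sixth decimal place yields a trajectory that soon has nothing in common with the untruncated one.

```python
# Logistic map x -> r*x*(1 - x) in its chaotic regime (r = 4.0).
# Truncating the initial condition after six decimal places stands in
# for the inexact representation of observational data.
r = 4.0
x_exact = 0.123456789
x_trunc = 0.123456          # the same value, truncated

for step in range(60):
    x_exact = r * x_exact * (1 - x_exact)
    x_trunc = r * x_trunc * (1 - x_trunc)
    if step in (0, 9, 19, 39, 59):
        print(step + 1, abs(x_exact - x_trunc))
# The gap grows from ~1e-7 to order 1: the trajectories decorrelate.
```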
But, according to Hofmeyr [6], systems like this are no more than simulations of their subject matter, while mature sciences, like physics, reach the level of modeling. He explains the difference between the two with a simple scheme. There are objects in the natural world (N) with causal relations (C) between them, and there are propositions within a formal theory (F) with inferential relations (I). A theorist constructs an encoding dictionary g that maps observables in N to input variables in F, and also a decoding dictionary d for the reverse mapping of every f ∈ F to a certain n ∈ N. Then F is a model of N iff (1) F = g(N) and (2) I = g(C). If only (1) is the case, but not (2), F is a simulation of N. The latter case does not provide for prediction, as N = d(F) is not secured.
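A toy sketch of this criterion (my own illustration of Hofmeyr's scheme, with an invented natural system): encode a state, run the formal inference, decode, and compare with what the causal process actually did. The diagram commutes for the model but not for the simulation.

```python
# Toy illustration of Hofmeyr's model/simulation distinction (my own
# invented example). The "natural" system N halves a population each tick.
def causal_step(n):                 # C: the causal relation within N
    return n // 2

def encode(n):                      # g: observables in N -> variables in F
    return float(n)

def decode(f):                      # d: variables in F -> observables in N
    return int(f)

def model_inference(f):             # I mirrors C: the halving rule itself
    return f / 2.0

def simulation_inference(f):        # reproduces fitted observations only,
    table = {1000.0: 500.0, 500.0: 250.0}   # encoding no causal rule
    return table.get(f, f)

for n0 in (1000, 800):
    truth = causal_step(n0)
    print(n0,
          decode(model_inference(encode(n0))) == truth,       # always True
          decode(simulation_inference(encode(n0))) == truth)  # fails at 800
```

The lookup table matches every observation it was fitted to, yet predicts nothing off that data, which is exactly why, on Hofmeyr's account, a simulation does not secure N = d(F).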
As an example of "modeling" breakthroughs, Hofmeyr cites the work of Nicolas Rashevsky, whose experience and competence in mathematical physics led him to a series of fruitful anticipations in life science. Thus, a Boolean version of his "two-factor theory" for excitable elements led, in the hands of Walter Pitts and Warren McCulloch, to the development of neural networks, and was also a forerunner of Hodgkin and Huxley's model of the propagation of action potentials in neurons [6. P. 2]. But soon afterwards Rashevsky realized that all he had done was a simulation of state transitions in different modes of living matter. He then shifted "from the components of biological systems to the relations between them" [6. P. 2], and that gave birth to so-called "relational biology". The mathematical tools used were topology, set theory and propositional logic. This enterprise was continued by his PhD student, Robert Rosen, who relied on category theory in his inquiry into metabolism-repair systems. In Hofmeyr's estimation, "If the age of analysis was characterised by the unspoken motto 'divide and conquer', then perhaps the age of synthesis aims to 'integrate and rule'. Today relational biology uses a rich array of mathematical tools: category theory, graph theory, network theory, automata theory, formal systems, to name the most important" [6. P. 2].

Biological information pathways and networks

Historically, analytically constructed mathematics gave birth to nomothetic, law-discovering science, within which biology has not proved much of a success compared to physics. In the remaining sections, I will try to add some evidence to the conjecture that the algorithmic, or, in a sense, constructive, view may well foster computational ontologies and calculi from which biology, cognitive science and social science will probably benefit.

In [7], Bhalla and Iyengar claim that living systems should be viewed as complex networks of signal-transmitting pathways. Biologically, these are provided with regulation by protein-protein interactions, protein phosphorylation, regulation of enzymatic activity, production of second messengers, and cell-surface signal transduction systems. Signaling pathways interact with one another, and the final biological response is shaped by the interaction between pathways. These interactions result in networks that are quite complex and "may have properties that are nonintuitive" [7. P. 381]. The networking accounts for such emergent properties as extended signal duration, activation of feedback loops, definition of threshold stimulation for biological effects, and multiple signal outputs. This analysis provides evidence that, when coupled appropriately, simple biochemical reactions can store and process information, and that the whole mechanism of reactions within signaling pathways forms another biological basis for memory and learning. Therefore, as Bhalla and Iyengar conclude, a computational approach fits well the task of comprehending both the complexity of multiple signaling interactions and the fine quantitative details.

Csete and Doyle [8] dig into such an obvious property of living systems as their modularity. They conceive of modules as subsystems of a larger system that use interfaces (protocols) to connect to other modules, may be altered relatively independently, allow for simplified descriptions aimed at more abstract modeling, maintain their local identity when isolated or rearranged, and, lastly, borrow additional identity from the rest of the system. Protocols are prescribed interfaces between modules that account both for facilitating the system's upscaling and for the appearance of emergent properties of the whole. The introduction of these concepts makes it evident that abstractions such as gene regulation <...>, covalent modification, membrane potentials, metabolic and signal transduction pathways, action potentials, and even transcription-translation, the cell cycle, and DNA replication could all be reasonably described as protocols <...>, with their attendant modular implementations in various activators and repressors, kinases and phosphatases, ion channels, receptors, heterotrimeric guanine nucleotide binding proteins (G proteins), and so on [8. P. 1666]. The proposed ontology is abstract enough to allow for computer modeling of biological systems and processes. At the same time, one must bear in mind that the more realistic the models of biological networks we need, the more entangled and sophisticated they must be, so as to include multiple feedback signals, non-linear component dynamics, uncertain parameters, stochastic noise, parasitic dynamics and other fuzzy factors and inputs. We run a risk here, given the above-mentioned deterministic chaos: the inevitably inexact input values may bring about too great errors in the theory's predictions. Csete and Doyle claim that mathematical tools are currently progressing so as to eventually cope with these issues.

As is shown in [9], proteins are themselves highly versatile information-processing units. In unicellular organisms, protein-based circuits replace the whole of the nervous system as a behavior-controlling network. In the cells of plants and animals, numerous networked proteins transmit information from the plasma membrane to the genome. The lasting impact of the environment on the concentration and activity of a multitude of proteins in a cell is a real mechanism of memory that saves important data about the environment. Interacting proteins are architecturally and functionally similar to neural networks: they are evolutionarily trained to identify and respond appropriately to recurring patterns of external stimuli. Their connectivity depends on diffusion-limited encounters between molecules, which provides for unique features not found in artificial neural networks. So, while current technologies increasingly make use of so-called bio-inspired computing devices, such as neural networks, evolutionary algorithms and multi-agent systems, biology itself is in need of what I would call a computational ontology and computational formal tools.

The free energy principle as a probable tool of integration

An important instance of applying Bayesian statistics to biological, cognitive and social ontologies, bound up with the free energy principle, is provided by inquiries into predictive processing models [10-19]. This intellectual movement, commonly called a "paradigm", has been spreading both extensively, winning new adepts, and intensively, covering new subject matters and domains. The main principles of the doctrine were posited in reference to the cognitive realm, but their seeming explanatory strength pushed the founders to expand the scope of the theory, which now covers non-cognitive subject matters in psychology, as well as those of the life and social sciences. The founders and proponents of the approach consider their theoretical system "a computationally tractable guide" to discovery in biological, cognitive, and social sciences [15. P. 1]. As Schrödinger once noted, living systems "are unique among natural systems because they appear to resist the second law of thermodynamics by persisting as bounded, self-organizing systems over time" [15. P. 1]. A trans-disciplinary field known as evolutionary systems theory (EST) has been an attempt to deal with this issue. The free energy principle is a mechanistic version of EST that applies to living systems in general.

The free energy doctrine rests on a set of load-bearing concepts and principles. In general, the theory construes organisms as embodying "expectations that they need to ensure are brought about through adaptive action" [10. P. 196]. Variational free energy is defined as a measure of the difference between an anticipated state of the environment and the actual input. Mathematically, it is an upper bound on the so-called "surprisal", which reflects how strange or unexpected the current state of the world, or of the organism's own insides, is in its perception. Informational entropy, the long-term average of surprisal, is a measure of uncertainty; since free energy places an upper limit on surprisal, minimizing it keeps entropy bounded as well.
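In the standard notation of this literature (my summary of the usual textbook decomposition, not a quotation from [10] or [15]): let o be the sensory input, s the hidden external states, p(o, s) the organism's generative model, and q(s) the recognition density encoded by its internal states. Then variational free energy decomposes as:

```latex
F \;=\; \mathbb{E}_{q(s)}\bigl[\ln q(s) - \ln p(o,s)\bigr]
  \;=\; \underbrace{-\ln p(o)}_{\text{surprisal}}
  \;+\; \underbrace{D_{\mathrm{KL}}\bigl[\,q(s)\,\big\|\,p(s \mid o)\,\bigr]}_{\geq\,0}
  \;\geq\; -\ln p(o)
```

Because the Kullback-Leibler term is non-negative, F upper-bounds the surprisal −ln p(o); minimizing F with respect to q (perception) tightens the bound, while acting to change o itself (active inference, below) lowers the surprisal directly.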
The free energy principle (FEP), unlike statistical mechanics, pertains to systems at non-equilibrium steady state (NESS). It is consistent with thermodynamics, since the latter can be regarded as a special case of the FEP when certain conditions are met. A Markov blanket in a statistical network is the smallest set of nodes that renders an enclosed node conditionally independent of all the others: the behavior of the enclosed node can be predicted by knowing only the states of its blanket, and, conversely, the enclosed node is useless for predicting the behavior of the nodes outside. A Markov blanket divides all the states relevant to the organism into external, sensory, active, and internal ones.
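In a directed graphical model, this smallest set is the node's parents, its children, and the children's other parents. A minimal sketch (a generic textbook construction over an invented toy graph, not a formula from [10] or [15]):

```python
# Markov blanket of a node in a directed acyclic graph: its parents,
# its children, and the children's other parents ("co-parents").
# The graph is an invented toy example, unrolled in time so that action
# influences the *next* external state and the graph stays acyclic.
parents = {
    "sensory": ["external"],
    "internal": ["sensory"],
    "active": ["internal"],
    "next_external": ["active", "external"],
}

def markov_blanket(node):
    pa = set(parents.get(node, []))
    children = {n for n, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]} - {node}
    return pa | children | co_parents

print(markov_blanket("internal"))   # {'sensory', 'active'}
```

Conditioned on its blanket {sensory, active}, the internal node is independent of both the current and the next external state, which is exactly the partition the FEP exploits.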
Active inference is one of the two ways of free energy minimization; it consists in adaptive action that reduces uncertainty, or surprise, about the causes of the input data. A generative model is a probabilistic mapping from external causes to the organism's observed input data. The statistical properties of Markov blankets account for processes that optimize Bayesian model evidence, eventually making the internal states a relevant model of the external ones. As Hesp et al. put it, we can describe the universe of biological systems as Markov blankets and their internal states, which are themselves composed of Markov blankets and their internal states [10. P. 201]. This multi-level structure of Markov blankets forms a highly dimensional phase space - a formalization-ready meta-theoretical ontology that, owing to the authors' imagination, is labeled "variational neuroethology". Markov blankets are capable of upscaling their structure; one example is the hypothesis that some organelles of eukaryotic cells, namely mitochondria and chloroplasts, used to be prokaryotic cells themselves.

This is what concerns ontology. As for the theory, it states that, for an organism to resist dissipation and persist as an adaptive system "that is part of, coupled with, and yet statistically independent from, the larger system in which it is embedded, it must embody a probabilistic model of the statistical interdependencies and regularities of its environment" [15. P. 2]. Biological systems, like any non-linear, non-equilibrium systems, possess a random dynamical attractor, i.e., a set of frequently revisited, high-probability states. A state space may be imagined as a free energy landscape whose lowest points are inhabited by living systems. The FEP asserts that all biological systems are constantly engaged in minimizing their variational free energy. For an organism, to survive is to avoid surprise, which corresponds to thermodynamic potential energy. According to EST, the drive to minimize surprise results from natural selection, the latter itself being a free-energy-minimizing process. Internal states of the organism encode a probability distribution over the external states: "Free energy is a functional (i.e., the function of a function) that describes the probability distribution encoded by the internal states of the Markov blanket" [15. P. 4]. Attempts to dispute the FEP as a biological meta-theory are not numerous to date; the only instance I have met is [20].

How biology might be done

Newtonian mechanics can describe, explain and even predict the trajectory of a flying stone, but it is in no way a science of minerals. When we expect biology to reach the high theoretical standards of physics, we still think of it as a "life science" that, ideally, should be able to deduce all the details of organic, genetic, physiological and evolutionary processes from a set of highly theoretical principles and formulas. But deductively consistent theories never explain all the facts of any single empirical domain; rather, they cope with facts of a certain kind observed across different domains. Mechanics knows nothing of the chemical composition of the flying stone, nor of its geological history, but it can predict events that may happen to both a stone and a bullet. Likewise, we can hardly promote "life science" from the state of what Kant called "history", as opposed to "pure science", to the desired state of deductive consistency. But we can do so for the science of closed systems in non-equilibrium states. Such systems, besides living things, include societies, astronomical objects and even scientific theories themselves. Such a discipline, resting on a transparent ontology and equipped with a well-built formal language, may explain and predict certain kinds of facts in these domains, including that of biology. But biology as such will probably remain a kind of "history" in Kant's terms, where various formalized theories do their good jobs side by side with empirical classifications and qualitative narratives.

In order to "fix the radio" of life, a biologist should start at the first of David Marr's levels of computation: that is, determine the overall goal of the system's functioning. It may be a certain combination of adaptation [14], homeostasis [13, 21] and energy efficiency [22]; at that point, we obtain a set of control variables. The biologist should then develop a complex of algorithms best fit for achieving the goals determined at the first level. Within the algorithms, the so-called computational primitives [23] must be identified, i.e., the elementary nodes whose combinations form the principal circuits of living body parts, of whole organisms, and of their symbiotic and social combinations. And, lastly, biological observables must be mapped onto those primitives in order for the algorithmic steps of the latter to "encode" (not just "mimic") the causalities of the former (a toy sketch of the three steps follows below). Taking into consideration the multiple realization principle, we are in a position to expect the arrival of a set of competing (and obviously computing) theories generated this way. And it is from this very output that life science is going to benefit.
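To close, a purely illustrative skeleton of the three-step recipe above; every name and number in it is my own placeholder assumption, not an established methodology.

```python
# Hypothetical skeleton of the three-step recipe sketched above.

# Level 1: the goal, expressed as control variables to be kept in range.
goals = {"temperature": (36.0, 38.0), "glucose": (4.0, 6.0)}  # invented

# Level 2: an algorithm over computational primitives; the only primitive
# assumed here is a negative-feedback comparator node.
def feedback_primitive(reading, low, high):
    if reading < low:
        return +1     # drive the variable up
    if reading > high:
        return -1     # drive it down
    return 0          # within range: no correction

# Level 3: map biological observables onto the primitive, so that its
# steps encode, rather than merely mimic, the observed corrections.
observables = {"temperature": 35.2, "glucose": 5.1}           # invented
for name, (low, high) in goals.items():
    print(name, feedback_primitive(observables[name], low, high))
# -> temperature +1 (e.g., shivering), glucose 0
```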

Keywords

ontology, computation, biology, mathematics, theory

Authors

Full name | Organization | Details | E-mail
Mikhailov Igor Feliksovich | Institute of Philosophy, Russian Academy of Sciences | Candidate of Philosophical Sciences, Senior Research Fellow | ifmikhailov@gmail.com; http://eng.iph.ras.ru/igor_mikhailov.htm

References

Lazebnik, Y. (2002) Can a biologist fix a radio? Or, what I learned while studying apoptosis. Cancer Cell. 2(3). pp. 179-182. DOI: 10.1016/S1535-6108(02)00133-2
May, R.M. (2004) Uses and Abuses of Mathematics in Biology. Science. 303(5659). pp. 790-793. DOI: 10.1126/science.1094442
Mikhailov, I.F. (2020) Social Ontology: Time to Compute. Vestnik Tomskogo gosudarstvennogo universiteta. Filosofiya, sotsiologiya, politologiya - Tomsk State University Journal of Philosophy, Sociology and Political Science. 55. pp. 36-46. (In Russian). DOI: 10.17223/1998863X/55/5
Turing, A.M. (2004) The Chemical Basis of Morphogenesis. In: Copeland, B.J. (ed.) The Essential Turing: Seminal Writings in Computing, Logic, Philosophy, Artificial Intelligence, and Artificial Life: Plus The Secrets of Enigma. Vol. 15. Oxford: Oxford University Press. pp. 519-561.
Bascompte, J. (2007) Biology and mathematics. Arbor. 182(725). pp. 347-351. DOI: 10.1002/9781119663416
Hofmeyr, J.-H.S. (2017) Mathematics and biology. South African Journal of Science. 113(3/4). DOI: 10.17159/sajs.2017/a0203
Bhalla, U. S. & Iyengar, R. (1999) Emergent Properties of Networks of Biological Signaling Pathways. Science. 283(5400). pp. 381-387. DOI: 10.1126/science.283.5400.381
Csete, M.E. & Doyle, J.C. (2002) Reverse Engineering of Biological Complexity. Science. 295(5560). pp. 1664-1669. DOI: 10.1126/science.1069981
Bray, D. (1995) Protein molecules as computational elements in living cells. Nature. 376(6538). pp. 307-312. DOI: 10.1038/376307a0
Hesp, C., Ramstead, M., Constant, A., Badcock, P., Kirchhoff, M. & Friston, K. (2019) A Multiscale View of the Emergent Complexity of Life: A Free-Energy Proposal. In: Springer Proceedings in Complexity. pp. 195-227. DOI: 10.1007/978-3-030-00075-2_7
Es, T. van (2020) Living models or life modelled? On the use of models in the free energy principle. Adaptive Behavior. DOI: 10.1177/1059712320918678
Kuchling, F., Friston, K., Georgiev, G. & Levin, M. (2019) Morphogenesis as Bayesian inference: A variational approach to pattern formation and control in complex biological systems. Physics of Life Reviews. 33. pp. 88-108. DOI: 10.1016/j.plrev.2019.06.001
Hulme, O.J., Morville, T. & Gutkin, B. (2019) Neurocomputational theories of homeostatic control. Physics of Life Reviews. 31. pp. 214-232. DOI: 10.1016/j.plrev.2019.07.005
Ramstead, M.J.D., Constant, A., Badcock, P.B. & Friston, K.J. (2019) Variational ecology and the physics of sentient systems. Physics of Life Reviews. 31. pp. 188-205. DOI: 10.1016/j.plrev.2018.12.002
Ramstead, M.J.D., Badcock, P.B. & Friston, K.J. (2018) Answering Schrödinger's question: A free-energy formulation. Physics of Life Reviews. 24. pp. 1-16. DOI: 10.1016/j.plrev.2017.09.001
Kirmayer, L.J. (2018) Ontologies of life: From thermodynamics to teleonomics: Comment on "Answering Schrödinger's question: A free-energy formulation" by Maxwell James Désormeau Ramstead et al. Physics of Life Reviews. 24. pp. 29-31. DOI: 10.1016/j.plrev.2017.11.022
Friston, K. (2013) Life as we know it. Journal of The Royal Society Interface. 10(86). DOI: 10.1098/rsif.2013.0475
Friston, K., Levin, M., Sengupta, B. & Pezzulo, G. (2015) Knowing one's place: a free-energy approach to pattern regulation. Journal of The Royal Society Interface. 12(105). DOI: 10.1098/rsif.2014.1383
Friston, K. J., Daunizeau, J., Kilner, J. & Kiebel, S.J. (2010) Action and behavior: A free-energy formulation. Biological Cybernetics. 102(3). pp. 227-260. DOI: 10.1007/s00422-010-0364-z
Martyushev, L.M. (2018) Living systems do not minimize free energy. Physics of Life Reviews. 24. pp. 40-41. DOI: 10.1016/j.plrev.2017.11.010
Auletta, G. (2013) Information and Metabolism in Bacterial Chemotaxis. Entropy. 15(1). pp. 311-326. DOI: 10.3390/e15010311
Kempes, C. P., Wolpert, D., Cohen, Z. & Perez-Mercader, J. (2017) The thermodynamic efficiency of computations made in cells across the range of life. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. 375(2109). DOI: 10.1098/rsta.2016.0343
Marcus, G., Marblestone, A. & Dean, T. (2014) The atoms of neural computation. Science. 346(6209). pp. 551-552. DOI: 10.1126/science.1261661