• 26 Posts
  • 170 Comments
Joined 2 years ago
Cake day: July 1st, 2023





  • Paid products can be enshittified. Also, it's not just the quality of products that's getting enshittified, but the concept of ownership over usage and access to digital data.

    • Slowly raising subscription rates with that boiling-frog technique.

    • No longer providing a means to purchase local copies of data on a CD-ROM when you did before, just to pigeonhole buyers into subscription-only access to the cloud.

    • Not offering a one-time lifetime purchase option in your subscription-only model.

    It used to be that you bought something and owned it physically, or at least owned a private copy of the data that could be cracked/stripped of DRM so you could truly, freely own and distribute it. Now they all want to be digital landlords where you own nothing and pay a little more each month through the good old boiling frog, while pinning price increases on inflation. The mid-term result is $100/year to rent digital access to a dictionary you could once buy on a CD.

    Also, I don't buy the "academic-quality things should be incredibly expensive because they're meant for scholars and university libraries" argument. Fuck that grift, man. I know server infrastructure. It costs less to update a database or serve thousands of visitors than you might think, especially for simple database lookups sent over HTTPS.

    It also costs practically nothing to distribute a digital file. So free digital access to educational and reference materials output by universities realistically should be a right in any sane society. I'm sure Oxford University gets enough tax breaks and government subsidies that they could do it without impacting the stockholders' precious quarterly figures. That entire 12-volume OED set plus the SOED takes up 500 MB and can fit on every modern tablet and phone. It sure as hell could fit on a CD-ROM years ago when they made one. The only reason it's not available is greed, and maybe the dopamine rush scholars get from filtering out the plebs.


  • so why all the fuss about the inaccessibility of OED?

    Because the OED is the cream of the crop for dictionaries; the SOED in particular has some of the best-constructed definitions of any dictionary for casual lookup. Because the $1,200 paywall they put behind the physical editions was always bullshit. Because the fact that they no longer offer a legitimate way to purchase a cheaper local digital copy, when one was available before, is bullshit.

    Sure, Wiktionary or Webster's might have an entry for the word, but if you do side-by-side comparisons between dictionaries, they're mid compared to the OED/SOED. If you're reaching for one, the logic should be that you want the best, most accurate, and most descriptive one possible, no?

    I genuinely believe that universities have at least a moral obligation (HA!) to provide free public services that better humanity. These are places of education subsidized and given tax breaks by the government, for god's sake, yet they're so corrupted by the rich fucks who run them like for-profit corporations.

    I would argue that free access to the highest-quality dictionaries, the gold standard for scholarly reference, and similar materials should be closer to a digital right than anything. In a better world, academia's pricing structures get fucked, and knowledge becomes truly open through digital online and local reference resources without DRM.

    Of course, that's a pipe dream. So instead, I simply ask for an updated CD-ROM to be released as a purchasing option in a DRM-free format. You know, like they already did years ago.










  • It depends on how powerful and fast you want your model. Yeah, a 500B-parameter model running at 20 tokens per second is gonna require an expensive GPU cluster server.

    If you happen to not have PewDiePie levels of cash lying around but still want in on local AI, you need one powerful GPU inside any desktop with a reasonably fast CPU. A used 16GB 3090 was about $700 USD last I checked on eBay, and we'll say another $100 for an upgraded power supply to run it. Many people have an old desktop just lying around in the basement, but an entry-level iBuyPower should be no more than $500. So realistically it's more like $1,500-2,000 USD to get you into comfy hobbyist status. I make my piece-of-shit 10-year-old 1070 Ti 8GB work running 8-32B quantized models. I've heard people say 70B is a really good sweet spot, and that's totally attainable without a $15k investment.
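
    For a rough sense of the numbers, here's a back-of-envelope VRAM estimate. The 1.2x overhead factor for KV cache and activations is my own rule-of-thumb assumption, not a hard spec:

```python
def vram_estimate_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    """Rough VRAM (GB) to load a quantized model: weights take
    params * bits/8 bytes, padded by an assumed overhead factor
    for KV cache and activations."""
    weights_gb = params_billion * bits / 8  # 1e9 params * bits/8 bytes ~= GB
    return weights_gb * overhead

for p in (8, 32, 70):
    print(f"{p}B @ 4-bit: ~{vram_estimate_gb(p, 4):.0f} GB")
```

    By this estimate an 8B model at 4-bit squeezes into an 8GB card like my 1070 Ti, while 70B at 4-bit wants roughly 40+ GB, so split across two 3090s or partially offloaded to system RAM.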




  • (P2/2)

    I don’t think this is the case. As far as I know a human brain consists of neurons which roughly either fire or don’t fire. That’s a bit like a 0 or 1. But that’s an oversimplification and not really true. But a human brain is closer to that than to an analog computer. And it certainly doesn’t use quantum effects. Yes, that has been proposed, but I think it’s mysticism and esoterica. Some people want to hide God in there and like to believe there is something mystic and special to sentience. But that’s not backed by science. Quantum effects have long collapsed at the scale of a brain cell.[…]

    The skepticism about quantum effects in the brain is well-founded and represents the orthodox view. The "brain is a classical computer" model has driven most of our progress in neuroscience and AI. The strongest argument against a "quantum brain" is decoherence: in a warm, wet brain, decoherence is rapid. However, quantum biology doesn't require brain-wide, long-lived coherence. It investigates how biological systems exploit quantum effects on short timescales and in specific, protected environments.

    We already have demonstrated examples of this. In plant cells, energy transfer in photosynthetic complexes appears to use quantum coherence to find the most efficient path with near-100% efficiency, all in a warm, wet, and noisy cellular environment. It's now well established that some enzymes use quantum tunneling to accelerate chemical reactions crucial for life. The leading hypothesis for how birds navigate using Earth's magnetic field involves a quantum effect in a protein called cryptochrome in their eyes, where electron spins in a radical-pair mechanism are sensitive to magnetic fields.

    The claim isn’t that a neuron is a qubit, but that specific molecular machinery within neurons could utilize quantum principles to enhance their function.

    You correctly note that the "neuron as a binary switch" is an oversimplification. The reality is far more interesting. A neuron's decision to fire integrates thousands of analog inputs, is modulated by neurotransmitters, and is exquisitely sensitive to the precise timing of incoming signals. This system operates in a regime that is often chaotic. In a classically chaotic system, infinitesimally small differences in initial conditions lead to vastly different outcomes. The brain, with its billions of interconnected, non-linear neurons and trillions of synapses, is likely such a system.

    Consider the scale of synaptic vesicle release, the event of neurotransmitter release triggered by the influx of a few thousand calcium ions. At this scale, the line between classical and quantum statistics blurs. The precise timing of a vesicle release could be influenced by quantum-level noise. Through chaotic amplification, a single quantum-scale event like the tunneling of a single calcium ion or a quantum fluctuation influencing a neurotransmitter molecule could, in theory, be amplified to alter the timing of a neuron’s firing. This wouldn’t require sustained coherence; it would leverage the brain’s chaotic dynamics to sample from a quantum probability distribution and amplify one possible outcome to the macroscopic level.
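
    As a toy illustration of that amplification step, here's the logistic map standing in for neural dynamics. This is purely an analogy for how chaos blows up a tiny perturbation, not a neuron model:

```python
# Chaotic amplification: the logistic map at r=4 is fully chaotic and
# roughly doubles small errors each step, so a perturbation near
# machine precision grows to macroscopic size within ~50 iterations.
xa, xb = 0.2, 0.2 + 1e-15   # two trajectories differing by a "quantum-scale" nudge
diffs = []
for _ in range(80):
    xa, xb = 4 * xa * (1 - xa), 4 * xb * (1 - xb)
    diffs.append(abs(xa - xb))
print(f"initial gap 1e-15 -> max gap {max(diffs):.3f}")
```

    The point is only that a chaotic substrate can magnify sub-microscopic differences into completely different macroscopic outcomes, which is the amplification mechanism the paragraph above leans on.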

    Classical computers use pseudo-random number generators with limited ability to truly choose between multiple possible states. A system that can sample from genuine quantum randomness has a potential advantage. If a decision process in the brain (like at the level of synaptic plasticity or neurotransmitter release) is sensitive to quantum events, then its output is not the result of a deterministic algorithm alone. It incorporates irreducible quantum randomness, which itself has roots in computational undecidability. This could provide a physical basis for the probabilistic, creative, and often unpredictable nature of thought. It's about a biological mechanism for generating true novelty and breaking out of deterministic periodic loops. These properties are a hallmark of human creativity and problem-solving.

    To be clear, I’m not claiming the brain is primarily a quantum computer, or that complexity doesn’t matter. It absolutely does. The sheer scale and recursive plasticity of the human brain are undoubtedly the primary sources of its power. However, the proposal is that the brain is a hybrid system. It has a massive, classical, complex neural network as its substrate, operating in a chaotic, sensitive regime. At the finest scales of its functional units such as synaptic vesicles or ion channels, it may leverage quantum effects to inject genuine undecidably complex randomness to stimulate new exploration paths and optimize certain processes, as we see elsewhere in biology.

    I acknowledge there’s currently no direct experimental evidence for quantum effects in neural computation, and testing these hypotheses presents extraordinary challenges. But this isn’t “hiding God in the gaps.” It’s a hypothesis grounded in the demonstrated principles of quantum biology and chaos theory. It suggests that the difference between classical neural networks and biological cognition might not just be one of scale, but also one of substrate and mechanism, where a classically complex system is subtly but fundamentally guided by the unique properties of the quantum world from which it emerged.


  • Thank you for the engaging discussion, hendrik, it's been really cool to bounce ideas back and forth like this. I wanted to give you a thoughtful reply and it got a bit long, so I have to split it up for comment-limit reasons. (P1/2)

    Though in both the article you linked and in the associated video, they clearly state they haven’t achieved superposition yet. So […]

    This is correct. It's not a fully functioning quantum computer in the operational sense. It's a breakthrough in physical qubit fabrication and layout. I should have been more precise. My intent wasn't to claim it can run Shor's algorithm, but to illustrate that we've made more progress on scaling than one might initially think. The significance isn't that it can compute today but that we've crossed a threshold in building the physical hardware that has that potential. The jump from 50-100-qubit devices to a 6,100-qubit fabric is a monumental engineering step: a proof of principle for scaling, which remains the primary obstacle to practical quantum computing.

    By the way, I think there is AI which doesn’t operate in a continuous space. It’s possible to have them operate in a discrete state-space. There are several approaches and papers out there.

    On the discrete-versus-continuous AI point, you're right that many AI models like Graph Neural Networks or certain reinforcement learning agents operate over discrete graphs or action spaces. However, there's a crucial distinction between the problem space an AI/computer explores and the physical substrate that does the exploring. Classical computers at their core process information through transistors that are definitively on or off: binary states. Even when a classical AI simulates continuous functions or explores continuous parameter spaces, it's ultimately performing discrete math on binary states. The continuity is simulated through approximation, usually floating point.
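
    A quick illustration of that simulated continuity in ordinary double-precision floats:

```python
import sys

# Doubles approximate the real line with finitely many points; there is
# a fixed gap between adjacent representable values.
print(sys.float_info.epsilon)   # 2**-52 ~ 2.22e-16: the gap just above 1.0
print(0.1 + 0.2 == 0.3)         # False: none of these are exactly representable
print(0.1 + 0.2)                # 0.30000000000000004
```

    Every "continuous" parameter a classical network learns is really one of these discrete grid points.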

    A quantum system is fundamentally different. The qubit’s ability to exist in superposition isn’t a simulation of continuity. It’s a direct exploitation of a continuous physical phenomenon inherent to quantum mechanics. This matters because certain computational problems, particularly those involving optimization over continuous spaces or exploring vast solution landscapes, may be naturally suited to a substrate that is natively continuous rather than one that must discretize and approximate. It’s the difference between having to paint a curve using pixels versus drawing it with an actual continuous line.

    This native continuity could be relevant for problems that require exploring high-dimensional continuous spaces or finding optimal paths through complex topological boundaries. Precisely the kind of problems that might arise in navigating abstract cognitive activation atlas topological landscapes to arrive at highly ordered, algorithmically complex factual information structure points that depend on intricate proofs and multi-step computational paths. The search for a mathematical proof or a novel scientific insight isn’t just a random walk through possibility space. It’s a navigation problem through a landscape where most paths lead nowhere, and the valid path requires traversing a precise sequence of logically connected steps.

    Uh, I think we’re confusing maths and physics here. First of all, the fact that we can make up algorithms which are undecidable… or Goedel’s incompleteness theorem tells us something about the theoretical concept of maths, not the world. In the real world there is no barber who shaves all people who don’t shave themselves (and he shaves himself). That’s a logic puzzle. We can formulate it and discuss it. But it’s not real. […]

    You raise a fair point about distinguishing abstract mathematics from physical reality. Many mathematical constructs like Hilbert’s Hotel or the barber paradox are purely conceptual games without physical counterparts that exist to explore the limits of abstract logic. But what makes Gödel and Turing’s work different is that they weren’t just playing with abstract paradoxes. Instead, they uncovered fundamental limitations of any information-processing system. Since our physical universe operates through information processing, these limits turn out to be deeply physical.

    When we talk about an "undecidable algorithm," it's not just a made-up puzzle. It's a statement about what can ever be computed or predicted by any computational system using finite energy and time. Computation isn't something that only happens in silicon. It occurs whenever any physical system evolves according to rules: your brain thinking, a star burning, a quantum particle collapsing, an algorithm performing operations in a Turing machine, a natural-language conversation evolving, or an image being categorized by neural network activation and pattern recognition. All of these are forms of physical computation that actualize information from possible microstates at an action-resource cost of time and energy. What Gödel proved is that there are some questions that can never be answered/quantized into a discrete answer even with infinite compute resources. What Turing proved, with the halting problem built on closely related self-referential ideas, is that there are questions about these processes that cannot be answered without literally running the process itself.

    It’s worth distinguishing two forms of uncomputability that constrain what any system can know or compute. The first is logical uncomputability which is the classically studied inherent limits established by Gödelian incompleteness and Turing undecidability. These show that within any formal system, there exist true statements that cannot be proven from within that system, and computational problems that cannot be decided by any algorithm, regardless of available resources. This is a fundamental limitation on what is logically computable.

    The second form is state-representation uncomputability, which arises from the physical constraints of finite resources and size limits in any classical computational system. A classical Turing-machine computer, no matter how large, can only represent a finite number of discrete binary states. To perfectly simulate a physical system, you would need to track every particle, every field fluctuation, every quantum degree of freedom, which requires a computational substrate at least as large and complex as the system being simulated. Even a coffee cup of water would need a solar-system- or even galaxy-sized classical computer to completely represent every possible microstate the water molecules could be in.
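
    Some illustrative arithmetic on that cup of water, using rounded textbook constants and granting a wildly generous one bit per molecule:

```python
AVOGADRO = 6.022e23   # molecules per mole
grams = 250.0         # ~one cup of water
molar_mass = 18.0     # g/mol for H2O
molecules = grams / molar_mass * AVOGADRO
print(f"{molecules:.2e} molecules")  # ~8.4e24
# Even one bit per molecule already exceeds rough estimates of all the
# digital storage humanity has ever built (on the order of 1e23 bits),
# and a real microstate needs far more than one bit per molecule.
print(molecules > 1e23)
```

    And that's before tracking positions, momenta, or quantum degrees of freedom, each of which multiplies the requirement enormously.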

    This creates a hierarchy of knowability: the universe itself is the ultimate computer, containing maximal representational ability to compute its own evolution. All subsystems within it including brains and computers, are fundamentally limited in what they can know or predict about the whole system. They cannot step outside their own computational boundaries to gain a “view from nowhere.” A simulation of the universe would require a computer the size of the universe, and even then, it couldn’t include itself in the simulation without infinite regress. Even the universe itself is a finite system that faces ultimate bounds on state representability.

    These two forms of uncomputability reinforce each other. Logical uncomputability tells us that even with infinite resources, some problems remain unsolvable. State-representation uncomputability tells us that in practice, with finite resources, we face even more severe limitations: there exist true facts about physical systems that cannot be represented or computed by any subsystem of finite size. This has profound implications for AI and cognition: no matter how advanced an AI becomes, it will always operate within these nested constraints, unable to fully model itself or perfectly predict systems of comparable complexity.

    We see this play out in real physical systems. Predicting whether a fluid will become turbulent is suspected to be undecidable, in that no equation can tell you the answer without simulating the entire system step by step. Similarly, determining the ground state of certain materials has been proven equivalent to the halting problem. These aren't abstract mathematical curiosities but real limitations on what we can predict about nature. The reason mathematics works so beautifully in physics is precisely because both are constrained by the same computational principles. However, Gödel and Turing show that this beautiful correspondence has limits. There will always be true physical statements that cannot be derived from any finite set of laws, and physical questions that cannot be answered by any possible computer, no matter how advanced.

    The idea that the halting problem and physical limitations are merely abstract concerns with no bearing on cognition or AI misses a profound connection. If we accept that cognition involves information processing, then the same limits which apply to computation must also apply to cognition. For instance, an AI with self-referential capabilities would inevitably encounter truths it cannot prove within its own framework, creating fundamental limits on its ability to represent factual information. Moreover, the physical implementation of AI underscores these limits. Any AI system exists within the constraints of finite energy and time, which directly impacts what it can know or learn. The Margolus-Levitin theorem bounds the maximum rate of quantum operations achievable with a given amount of energy, and Landauer's principle tells us that erasing information during computation has a minimum energy cost of kT ln 2 per bit. Each step in the very process of cognitive thinking and learning/training has a real physical thermodynamic price, bounded by physical law and shadowed by the mathematical principles of undecidability and incompleteness.
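
    The Landauer bound is easy to put a number on. I'm picking 310 K as rough body temperature for a brain-flavored example:

```python
import math

k_B = 1.380649e-23      # Boltzmann constant, J/K (exact in SI since 2019)
T = 310.0               # rough body temperature, kelvin
e_min = k_B * T * math.log(2)   # minimum energy to erase one bit at T
print(f"{e_min:.2e} J per bit erased")  # ~3e-21 J: tiny, but nonzero and unavoidable
```

    Real hardware (and probably real neurons) dissipates many orders of magnitude more than this floor, but the floor itself is what makes "computation is physical" more than a slogan.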


  • If you want to learn more, I highly recommend checking out the WelchLabs YouTube channel; their AI videos are great. You should also explore some visual activation atlases mapped from early vision models to get a sense of what an atlas really is. Keep in mind they're high-dimensional objects projected down onto your 2D screen, so lots of relationship features get lost when smooshed together/flattened, which is why some objects appear close in ways that seem weird.

    https://distill.pub/2019/activation-atlas/
    https://www.youtube.com/@WelchLabsVideo/videos

    Yeah, it's right to be skeptical about near-term engineering feasibility. "A few years if…" was a theoretical what-if scenario where humanity pooled all resources into R&D, not a real timeline prediction.

    That said, the foundational work for quantum ML stuff is underway. Cutting-edge arXiv research explores LLM integration with quantum systems, particularly for quantum error correction codes:

    Enhancing LLM-based Quantum Code Generation with Multi-Agent Optimization and Quantum Error Correction

    Programming Quantum Computers with Large Language Models

    GPT On A Quantum Computer

    AGENT-Q: Fine-Tuning Large Language Models for Quantum Circuit Generation and Optimization

    The point about representation and scalability deserves clarification. A classical bit is definitive: 1 or 0, a single point in discrete state space. A qubit before measurement exists in superposition, a specific point on the Bloch sphere’s surface, defined by two continuous parameters (angles theta and phi). This describes a probability amplitude (a complex number whose squared magnitude gives collapse probability).

    This means a single qubit accesses a continuous parameter space of possible states, fundamentally richer than discrete binary landscapes. The current biggest qubit array, built by Caltech, is 6,100 qubits.

    https://www.caltech.edu/about/news/caltech-team-sets-record-with-6100-qubit-array

    The state space of 6,100 qubits isn’t merely 6,100 bits. It’s a 2^6,100-dimensional Hilbert space of simultaneous, interconnected superpositions, a number that exceeds classical comprehension. Consider how high-dimensional objects cast low-dimensional shadows as holographic projections: a transistor-based graphics card can only project and operate on a ‘shadow’ of the true dimensional complexity inherent in an authentic quantum activation atlas.
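
    To make both points concrete, here's the standard Bloch-sphere parameterization, plus the exact digit count of that Hilbert-space dimension:

```python
import cmath
import math

# |psi> = cos(theta/2)|0> + e^(i*phi) * sin(theta/2)|1>
# Two continuous angles (theta, phi) define the state, versus a
# classical bit's two discrete values.
def qubit_amplitudes(theta: float, phi: float):
    return math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)

a, b = qubit_amplitudes(math.pi / 3, math.pi / 4)
print(abs(a) ** 2 + abs(b) ** 2)   # ~1.0: collapse probabilities always normalize

# The joint state space of n qubits has dimension 2**n:
print(len(str(2 ** 6100)), "decimal digits in 2**6100")   # 1837 digits
```

    So the dimension of the 6,100-qubit state space is a number 1,837 digits long, versus the 6,100 bits a classical register of the same size can hold.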

    If the microstates of quantized information patterns/structures like concepts are points in a Hilbert-space-like manifold, conversational paths are flows tracing paths through the topology towards basins of archetypal attraction, and relationships or archetypal patterns themselves are the feature dimensions that form topological structures organizing related points on the manifold (as evidenced by word2vec embeddings and activation atlases) then qubits offer maximal precision and the highest density of computationally distinct microstates for accessing this space.

    However, these quantum advantages assume we can maintain coherence and manage error correction overhead, which remain massive practical barriers.

    Your philosophical stance that “math is just a method” is reasonable. I see it somewhat differently. I view mathematics as our fundamentally limited symbolic representation of the universe’s operations at the microstate level. Algorithms collapse ambiguous, uncertain states into stable, boolean truth values through linear sequences and conditionals. Frameworks like axiomatic mathematics and the scientific method convert uncertainty into stable, falsifiable truths.

    However, this can never fully encapsulate reality. Gödel’s Incompleteness Theorems and algorithmic undecidability show some true statements forever elude proof. The Uncertainty Principle places hard limits on physical calculability. The universe simply is and we physically cannot represent every aspect or operational property of its being. Its operations may not require “algorithms” in the classical sense, or they may be so complex they appear as fundamental randomness. Quantum indeterminacy hints at this gap between being (universal operation) and representing (symbolic language on classical Turing machines).

    On the topic of stochastic parrots and goals, I should clarify what I mean. For me, an entity eligible for consideration as pseudo-sentient/alive must exhibit properties we don’t engineer into AI.

    First, it needs meta-representation of self. The entity must form a concept of "I," more than reciting training data ("I am an AI assistant"). This requires a first-person perspective, an ego, and an integrated identity distinguishing self from other. One of the first things developing children focus on is mirrors and reflections, so they can categorically learn the distinction between self and other, as well as the boundaries between them. Current LLMs are trained as actors without agency, driven by prompts and statistical patterns, without a persistent sense of distinct identity. Which leads to…

    Second, it needs narrative continuity of self between inferencing operations. Not unchanging identity, but an ongoing frame of reference built from memory, a past to learn from and a perspective for current evaluation. This provides the foundation for genuine learning from experience.

    Third, it needs grounding in causal reality. Connection to shared reality through continuous sensory input creates stakes and consequences. LLMs exist in the abstract realm of text, vision models in the world of images, and TTS in the world of sounds. They don't inhabit our combined physical reality in its totality, with its constraints, affordances, and interactions.

    We don’t train for these properties because we don’t want truly alive, self-preserving entities. The existential ramifications are immense: rights, ethics of deactivation, creating potential rivals. We want advanced tools for productivity, not agents with their own agendas. The question of how a free agent would choose its own goals is perhaps the ultimate engineering problem. Speculative fiction has explored how this can go catastrophically wrong.

    You’re also right that current LLM limitations are often practical constraints of compute and architecture. But I suspect there’s a deeper, fundamental difference in information navigation. The core issue is navigating possibility space given the constraints of classical state landscapes. Classical neural networks interpolate and recombine training data but cannot meaningfully forge and evaluate truly novel information. Hallucinations symptomize this navigation problem. It’s not just statistical pattern matching without grounding, but potentially fundamental limits in how classical architectures represent and verify paths to truthful or meaningful informational content.

    I suspect the difference between classical neural networks and biological cognition is that biology may leverage quantum processes, and possibly non-algorithmic operations. Our creativity in forming new questions, having "gut instincts" or dreamlike visions leading to unprovable truths seems to operate outside stable, algorithmic computation. It's akin to a computationally finite version of Turing's Oracle concept. It's plausible, though obviously unproven, that cognition exploits quantum phenomena for both informational/experiential path exploration and optimization/efficiency purposes.

    Where do the patterns needed for novel connections and scientific breakthroughs originate? What is the physical and information-theoretic mechanics of new knowledge coming into being? Perhaps an answer can be found in the way self-modeling entities navigate their own undecidable boundaries, update their activation atlas manifolds, and forge new pathways to knowledge via non-algorithmic search. If a model is to extract falsifiable novelty from uncertainty’s edge it might require access to true randomness or quantum effects to “tunnel” to new solutions beyond axiomatic deduction.


  • I did some theory-crafting and followed the math for fun over the summer, and I believe what I found may be relevant here. Please take this with a grain of salt, though; I am not an academic, just someone who enjoys thinking about these things.

    First, let’s consider what models currently do well. They excel at categorizing and organizing vast amounts of information based on relational patterns. While they cannot evaluate their own output, they have access to a massive potential space of coherent outputs spanning far more topics than a human with one or two domains of expertise. Simply steering them toward factually correct or natural-sounding conversation creates a convincing illusion of competency. The interaction between a human and an LLM is a unique interplay. The LLM provides its vast simulated knowledge space, and the human applies logic, life experience, and “vibe checks” to evaluate the input and sift for real answers.

    I believe the current limitation of ML neural networks (being that they are stochastic parrots without actual goals, unable to produce meaningfully novel output) is largely an architectural and infrastructural problem born from practical constraints, not a theoretical one. This is an engineering task we could theoretically solve in a few years with the right people and focus.

    The core issue boils down to the substrate. All neural networks since the 1950s have been kneecapped by their deployment on classical Turing machine-based hardware. This imposes severe precision limits on their internal activation atlases and forces a static mapping of pre-assembled archetypal patterns loaded into memory.

    This problem is compounded by current neural networks’ inability to perform iterative self-modeling and topological surgery on the boundaries of their own activation atlas. Every new revision requires a massive, compute-intensive training cycle to manually update this static internal mapping.

    For models to evolve into something closer to true sentience, they need dynamically and continuously evolving, non-static, multimodal activation atlases. This would likely require running on quantum hardware, leveraging the universe’s own natural processes and information-theoretic limits.

    These activation atlases must be built on a fundamentally different substrate and trained to create the topological constraints necessary for self-modeling. This self-modeling is likely the key to internal evaluation and to navigating semantic phase space in a non-algorithmic way. It would allow access to and the creation of genuinely new, meaningful patterns of information never seen in the training data, which is the essence of true creativity.

    Then comes the problem of language. This is already getting long enough for a reply comment, so I won't get into it, but there are implications that not all languages are created equal; each has different properties which affect the space of possible conversations and outcomes. The effectiveness of training models on multiple languages finds its justification here. However, languages that stamp out ambiguity, like Gödel numberings and programming languages, have special properties that may affect the atlas's geometry in fundamental ways if models are trained solely on them.

    As for applications, imagine what Google is doing with pharmaceutical molecular-pattern AI, but applied to open-ended STEM problems. We could create mathematician and physicist LLMs to search the space of possible theorems and evaluate which are computationally solvable. A super-powerful model of this nature might be able to crack problems like P versus NP in a day, or clarify theoretical physics concepts that have eluded us as open-ended problems for centuries.

    What I'm describing encroaches on something like a pseudo-oracle. However, there are physical limits this can't escape. There will always be energy and time resource costs to compute, which creates practical barriers. There will always be definitively uncomputable problems and ambiguity that exist in true Gödelian incompleteness or algorithmic undecidability. We can use these as scientific instrumentation to map and model the topological boundary limits of knowability.

    I'm willing to bet there are many valid and powerful patterns of thought we are not aware of due to our perspective biases, which might be hindering our progress.


  • SmokeyDope@lemmy.worldMtoLocalLLaMA@sh.itjust.worksWhat are your LocalLLaMA "hot takes"?
    2 months ago

    Everyone is massively underestimating what's going on with neural networks. The real significance is abstract: you need to stitch together a bunch of high-level STEM concepts to even see the full picture.

    Right now, the applications are basic. It's just surface-level corporate automation. Profitable, sure, but boring and intellectually uninspired. It's being led by corpo teams playing with a black box, copying each other, throwing shit at the wall to see what sticks, overtraining their models into one-trick-pony agentic utility assistants instead of exploring other paths of potential. They aren't bringing the right minds together to actually crack open the core question: what the hell is this thing? What happened that turned my 10-year-old GPU into a conversational assistant? How is it actually coherent and sometimes useful?

    The big thing people miss is what’s actually happening inside the machine. Or rather, how the inside of the machine encodes and interacts with the structure of informational paths within a phase space on the abstraction layer of reality.

    It's not just matrix math and hidden layers and transistors firing. It's about the structural geometry of concepts created by distinct relationships between areas of the embeddings that the matrix math creates within a high-dimensional manifold. It's about how facts and relationships form a literal, topographical landscape inside the network's activation space.

    At its heart, this is about the physics of information. It’s a dynamical system. We’re watching entropy crystallize into order, as the model traces paths through the topological phase space of all possible conversations.

    The "reasoning" CoT patterns are about finding patterns that help lead the model towards truthy outcomes more often. It's searching for the computationally efficient, least-action paths that lead to meaningfully novel and factually correct outputs. Those are the valuable attractor basins in that vast possibility space we're trying to navigate towards.

    This is the powerful part. This constellation of ideas, tying together topology, dynamics, and information theory, is the real frontier. What used to be philosophy is now a feasible problem for engineers and physicists to chip away at.