The surprising parameter-efficiency of vision models

This is a short post meant to highlight something I do not yet understand, and which is therefore a potential issue with my models. Why do vision (and audio) models work so well despite being so small? State-of-the-art models like Stable Diffusion and Midjourney work exceptionally well, generating near-photorealistic... [Read More]
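As a rough, back-of-the-envelope illustration of the scale gap in question (the parameter counts below are approximate public figures I am supplying for illustration, not numbers from the post):

```python
# Rough comparison of parameter counts to illustrate the scale gap between
# a strong image-generation model and a large language model.
# Figures are approximate public numbers, used here only for illustration.
param_counts = {
    "Stable Diffusion v1.5 (UNet)": 0.86e9,  # ~860M parameters
    "GPT-3": 175e9,                          # ~175B parameters
}

ratio = param_counts["GPT-3"] / param_counts["Stable Diffusion v1.5 (UNet)"]
print(f"GPT-3 is roughly {ratio:.0f}x larger than the SD v1.5 UNet")
# -> roughly 200x larger, yet the small model produces near-photorealistic images
```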

Deep learning models are secretly (almost) linear

I have been thinking increasingly about NN representations and slowly coming to the conclusion that they are secretly (almost) completely linear inside. This means that, theoretically, if we can understand their directions, we can exert very powerful control over the internal representations, as well as compose and... [Read More]
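A minimal sketch of what "control via directions" could look like under this linearity assumption: estimate a concept direction as the difference of mean activations over two sets of examples, then add a scaled copy of it to a new activation. The arrays and the `steer` helper below are hypothetical stand-ins for real model activations, not code from the post.

```python
import numpy as np

# Minimal sketch of linear steering: if a concept corresponds to a direction
# in activation space, adding that direction should shift the representation.
# All arrays here are synthetic stand-ins for real model activations.
rng = np.random.default_rng(0)
d = 512  # hidden dimension (assumed)

acts_with_concept = rng.normal(size=(100, d)) + rng.normal(size=d)
acts_without_concept = rng.normal(size=(100, d))

# Estimate the concept direction as a difference of means, then normalise.
direction = acts_with_concept.mean(axis=0) - acts_without_concept.mean(axis=0)
direction /= np.linalg.norm(direction)

def steer(activation: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Add a scaled concept direction to a single activation vector."""
    return activation + alpha * direction

new_activation = rng.normal(size=d)
steered = steer(new_activation, direction, alpha=3.0)

# The projection onto the concept direction moves by exactly alpha.
print(float(steered @ direction - new_activation @ direction))  # ~3.0
```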

The computational anatomy of human values

Epistemic Status: Much of this draws from my studies in neuroscience and ML. Many of the ideas in this post are heavily inspired by the work of Steven Byrnes and the authors of Shard Theory. However, it speculates quite a long way ahead of the scientific frontier and is... [Read More]

Thoughts on the future of Predictive Coding

I have recently had several meetings and discussions with various people involved in predictive coding (PC) about whether it is an interesting and viable direction of study. Naturally, I have a bunch of thoughts on this which are not always fully expressed in published papers and which I thought... [Read More]

CEM as inference

Author’s note: I originally wrote this draft in mid-2020 as the maths for a paper I never got around to writing. I think it may be somewhat valuable, but I am primarily archiving it for historical reasons. [Read More]