Gradient Hacking is extremely difficult

Epistemic Status: This started out as a comment on this post but expanded enough to become its own post. My view has been formed by spending a reasonable amount of time trying and failing to construct toy gradient hackers by hand, but this could just reflect me being insufficiently creative... [Read More]

Creating worlds where iterative alignment succeeds

A major theorized difficulty of the alignment problem is its zero-shot nature. The idea is that any AGI system we build will rapidly be able to outcompete its creators (us) in accumulating power, and hence if it is not aligned right from the beginning then we won’t be able to... [Read More]

An ML interpretation of Shard Theory

Shard theory has always seemed slightly esoteric and confusing to me: what are ‘shards’, and why might we expect them to form in RL agents? When I first read shard theory, there were two main sources of confusion for me. The first is why an agent optimising a reward function should... [Read More]

Preventing Goodhart with homeostatic reward functions

Current decision theory and almost all AI alignment work assume that we will build AGIs with some fixed utility function that they will optimize forever. This naturally runs the risk of extreme Goodharting: if we do not get exactly the ‘correct’ utility function, then the slight differences between our... [Read More]
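
As a concrete (and entirely toy) illustration of the contrast this post gestures at, here is a minimal sketch of my own, not the post's actual proposal: a slightly mis-specified fixed utility rewards pushing a variable without limit, while a slightly mis-specified homeostatic reward only rewards staying near a setpoint, so the damage from getting the function wrong is bounded. The functions `true_utility`, `proxy_fixed_utility`, and `proxy_homeostatic_reward` below are hypothetical stand-ins.

```python
# Toy 1-D comparison: a slightly wrong *fixed* utility is amplified without bound
# by optimization, while a slightly wrong *homeostatic* reward stays near its setpoint.

def true_utility(x):
    # What we actually care about: the resource is good, but only up to a point.
    return -(x - 10.0) ** 2

def proxy_fixed_utility(x):
    # Mis-specified fixed utility: "more of the resource is always better".
    return x

def proxy_homeostatic_reward(x, setpoint=12.0):
    # Mis-specified homeostatic reward: wrong setpoint (12 instead of 10), but
    # deviation in either direction is penalized, so it never demands "infinitely more".
    return -(x - setpoint) ** 2

def optimize(objective, x0=0.0, lr=0.1, steps=1000):
    # Gradient ascent via central finite differences on the 1-D objective.
    x, eps = x0, 1e-4
    for _ in range(steps):
        grad = (objective(x + eps) - objective(x - eps)) / (2 * eps)
        x += lr * grad
    return x

x_fixed = optimize(proxy_fixed_utility)       # keeps climbing forever (x = 100 after 1000 steps)
x_homeo = optimize(proxy_homeostatic_reward)  # settles near the setpoint of 12

print(f"fixed proxy:       x = {x_fixed:6.1f}, true utility = {true_utility(x_fixed):9.1f}")
print(f"homeostatic proxy: x = {x_homeo:6.1f}, true utility = {true_utility(x_homeo):9.1f}")
```

Running it, the fixed-utility optimizer drives x far past anything we actually wanted, while the homeostatic optimizer stops near its (wrong) setpoint and loses only a little true utility.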