There has recently been a lot of discussion on LessWrong about whether alignment is a uniquely hard problem because of an intrinsic lack of empirical evidence. Once we have an AGI, it seems unlikely that we could safely experiment on it for the long time (potentially decades) it might take to crack alignment....
[Read More]
Empathy as a natural consequence of learnt reward models
Empathy, the ability to feel another’s pain or to ‘put yourself in their shoes’, is often considered to be a fundamental human cognitive ability, and one that undergirds our social abilities and moral intuitions. As so much of humanity’s success at becoming dominant as a species comes down to our...
[Read More]
The ultimate limits to alignment determine the shape of the long-term future
The alignment problem is not new. We have been grappling with the fundamental core of alignment – making an agent optimize for the beliefs and values of another – for the entirety of human history. Any time anybody tries to get multiple people to work together in a coherent way...
[Read More]
How to evolve a brain
Epistemic status: This is mostly pure speculation, although grounded in many years of studying neuroscience and AI. Almost certainly, much of this picture will be wrong in the details, although hopefully roughly correct ‘in spirit’.
[Read More]
The Scale of the Brain vs Machine Learning
Epistemic status: pretty uncertain. There is a lot of fairly unreliable data in the literature and I make some pretty crude assumptions. Nevertheless, I would be surprised if my conclusions are more than 1-2 OOMs off.
[Read More]