Does our proprioceptive system try to recognize our own actions?

Proprioception is our sense of the motion and posture of our own body. This sixth sense uses signals from receptors in the joints, tendons, muscles, and skin that measure forces and degrees of extension. These receptors enable us to sense, for example, the posture of our body as we wake from sleep. They also provide feedback signals that help us precisely control our limbs, for example during handwriting.

Feedback is thought to be essential to motor control, enabling the controller in our brains to rapidly adapt to the unexpected. The unexpected may include changes in the environment (like something pushing our hand that we didn’t see coming), changes in our bodies (such as muscle fatigue or injury), and shortcomings of the motor program (such as a lack of precision or a badly planned limb trajectory). Feedback can come from vision and even audition, but proprioception provides an essential additional feedback path that informs us directly about the motion and posture of our limbs, and any forces on them.

How does feedback control work in the human motor system? I want to write a ‘k’, but there are forces on my limbs resulting from the friction of chalk on this particular blackboard. Also, my muscles are recovering from tennis practice this morning, and I haven’t used chalk on a blackboard in years.

If the goal is to write a ‘k’, I have some flexibility. I am committed, not to a precise trajectory, but to a more abstractly defined objective: to write a legible ‘k’. This suggests that feedback processing should evaluate to what extent I am succeeding at the action, not at tracing out a particular trajectory. Does what I’m actually doing look like writing a ‘k’?

In a new paper, Sandbrink et al. (pp2022) report on simulations of the human musculoskeletal system and neural network models that suggest that the tuning properties of neurons in somatosensory cortex (S1) can be explained by assuming that the objective of the proprioceptive system is to recognize the action being performed.

They used recorded traces of a person writing lower-case letters to simulate the responses of muscle spindles sensing the lengths and velocities of muscles in the human arm as would be present if the hand were moved passively along these trajectories. The physical simulation uses a 3D model of the human arm with two parameters for the direction of the upper arm and two more for the direction of the lower arm. These four parameters are inferred by inverse kinematics from the hand trajectories tracing each letter in a variety of vertical and horizontal planes. A 3D muscle model then enables the authors to compute the expected spindle responses that reflect the lengths and velocities of 25 relevant upper arm muscles.
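To make the inverse-kinematics step concrete, here is a minimal sketch of a two-segment arm whose pose is given by four angles (azimuth and elevation for each segment), solved with damped least squares. The segment lengths, the angle convention, and the solver are all assumptions chosen for illustration; the review does not describe how the authors implemented this step.

```python
import numpy as np

# Hypothetical segment lengths in metres (not taken from the paper).
UPPER_LEN, LOWER_LEN = 0.30, 0.25

def unit_vector(azimuth, elevation):
    """Direction of a segment given two angles in radians."""
    return np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])

def hand_position(angles):
    """Forward kinematics: shoulder at the origin, four angles in, 3D hand position out."""
    az_u, el_u, az_l, el_l = angles
    return UPPER_LEN * unit_vector(az_u, el_u) + LOWER_LEN * unit_vector(az_l, el_l)

def numerical_jacobian(angles, eps=1e-6):
    """Central-difference Jacobian of hand position with respect to the four angles."""
    jac = np.zeros((3, 4))
    for i in range(4):
        step = np.zeros(4)
        step[i] = eps
        jac[:, i] = (hand_position(angles + step) - hand_position(angles - step)) / (2 * eps)
    return jac

def solve_ik(target, start, n_iter=200, damping=1e-3, step_size=0.5):
    """Damped least-squares inverse kinematics (an assumed solver, for illustration)."""
    angles = np.asarray(start, dtype=float).copy()
    for _ in range(n_iter):
        error = target - hand_position(angles)
        jac = numerical_jacobian(angles)
        update = np.linalg.solve(jac.T @ jac + damping * np.eye(4), jac.T @ error)
        angles += step_size * update
    return angles

# Usage: recover a pose that reaches a known, reachable hand position.
true_angles = np.array([0.3, -0.2, 1.4, 0.5])
target = hand_position(true_angles)
solution = solve_ik(target, start=[0.0, 0.0, 1.2, 0.3])
residual = np.linalg.norm(hand_position(solution) - target)
```

Four angles for a three-dimensional hand position leave one degree of freedom unconstrained, so the recovered angles need not match `true_angles` even when the hand reaches the target. Any real pipeline needs an additional constraint or prior to pick a posture.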

The authors then trained neural network models of proprioceptive processing that took the simulated muscle spindle signals as input. The neural net architectures included one that first integrates information across the muscle spindles and then across time (“spatial-temporal”), one that integrates across muscle spindles and time simultaneously (“spatiotemporal”), and a recurrent long short-term memory (LSTM) model.
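The difference between the two feedforward variants can be sketched with a single linear filter. In the toy example below, the input is random noise standing in for spindle signals, and the kernel sizes are arbitrary assumptions; this is not the authors' architecture. It shows that, without nonlinearities, integrating across muscles first and time second is a special case of a joint filter whose kernel is an outer product.

```python
import numpy as np

rng = np.random.default_rng(0)
n_muscles, n_time = 25, 100          # 25 muscles as in the paper; 100 time steps is arbitrary
spindles = rng.standard_normal((n_muscles, n_time))  # random stand-in for spindle signals

spatial_w = rng.standard_normal(n_muscles)  # weights across muscles
temporal_w = rng.standard_normal(5)         # weights across 5 time steps

def spatial_then_temporal(x, w_space, w_time):
    """Pool across muscles at each time point, then filter the pooled signal in time."""
    pooled = w_space @ x                                  # shape (n_time,)
    return np.correlate(pooled, w_time, mode="valid")

def spatiotemporal(x, kernel):
    """Apply one kernel spanning all muscles and a window of time steps."""
    k_time = kernel.shape[1]
    n_out = x.shape[1] - k_time + 1
    return np.array([np.sum(kernel * x[:, t:t + k_time]) for t in range(n_out)])

factorized = spatial_then_temporal(spindles, spatial_w, temporal_w)
joint = spatiotemporal(spindles, np.outer(spatial_w, temporal_w))
```

Here `factorized` and `joint` agree numerically. A general joint kernel is not restricted to outer products, and nonlinearities between the spatial and temporal stages break the equivalence, so the two trained architectures can learn different representations.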

Each architecture was trained separately on each of two objectives: to decode the trajectory (i.e. the position of the hand tracing a letter as a function of time) or to recognize the action (i.e. the letter being traced). The two objectives correspond to two hypotheses about the function of proprioceptive processing: to inform the feedback controller about either the current position of the hand or the letter being drawn.
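The two objectives can be written as two loss functions on the output of the same network body. The sketch below assumes mean squared error for trajectory decoding and softmax cross-entropy for action recognition; these are the conventional choices for regression and classification, not necessarily the exact losses used in the paper.

```python
import numpy as np

def trajectory_loss(predicted_xy, true_xy):
    """Mean squared error between decoded and true hand positions over time."""
    return float(np.mean((predicted_xy - true_xy) ** 2))

def action_loss(logits, true_letter):
    """Softmax cross-entropy for one trial; true_letter is an integer class index."""
    shifted = logits - np.max(logits)                    # subtract max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[true_letter])

# Usage with made-up numbers: a 10-step planar trajectory and 4 hypothetical letter classes.
example_trajectory_loss = trajectory_loss(np.zeros((10, 2)), np.ones((10, 2)))
example_action_loss = action_loss(np.zeros(4), true_letter=2)
```

The trajectory loss penalizes every deviation from one particular path, whereas the action loss is indifferent to any variation that leaves the letter identity recoverable.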

The models trained to recognize the action developed tuning more consistent with what is known about the tuning of neurons in primary somatosensory cortex in primates. In particular, direction tuning with roughly equal numbers of units preferring each direction emerged in middle layers of the neural network models trained to recognize the action, similar to what has been observed in primate neural recordings. Direction tuning is already present in the muscle-spindle signals, but the spindle signals do not uniformly represent the directions.
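Direction tuning of a single unit is often summarized by fitting a cosine tuning curve and reading off the preferred direction. The sketch below does this by linear least squares and measures how evenly a population's preferred directions cover the circle, using the mean resultant length. Both choices are assumptions for illustration, and the synthetic unit is invented; the paper's own tuning analysis may differ.

```python
import numpy as np

def fit_direction_tuning(directions, responses):
    """Fit r = b0 + b1*cos(theta) + b2*sin(theta); return preferred direction and depth."""
    design = np.column_stack([
        np.ones_like(directions), np.cos(directions), np.sin(directions),
    ])
    coef, *_ = np.linalg.lstsq(design, responses, rcond=None)
    _, b_cos, b_sin = coef
    preferred = float(np.arctan2(b_sin, b_cos))   # radians, in (-pi, pi]
    depth = float(np.hypot(b_cos, b_sin))         # modulation amplitude
    return preferred, depth

def resultant_length(preferred_directions):
    """0 when directions are spread evenly around the circle, 1 when all coincide."""
    angles = np.asarray(preferred_directions, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * angles))))

# Usage: a synthetic unit that prefers movement at 1.0 rad.
directions = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
responses = 2.0 + 1.5 * np.cos(directions - 1.0)
preferred, depth = fit_direction_tuning(directions, responses)
```

A population with roughly equal numbers of units preferring each direction would give a resultant length near zero, which is one way to quantify the uniformity described above.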

The task-optimization approach to neural network modeling is inspired by work in vision, where neural networks trained on the task of image classification explained responses to novel images in populations of neurons in the inferior temporal cortex. This result suggested a tentative answer to the why question: Why do inferior temporal neurons exhibit the response profiles and representational geometry they exhibit? Because their function (or one of their functions) is to recognize the objects in the images. Here, similarly, the authors address a why question with task-optimized neural network models: Why do somatosensory cortical neurons exhibit the types of tuning that have been reported in the literature?

The function of proprioception, of course, is not for the brain to recognize which letter it is trying to write. It already knows that. The function is to sense how the current trajectory – the actual, not the intended one – differs from, say, a legible “k” (if that was the intention), and to map from that difference to a modification vector that will improve the outcome.

Why is action decoding relevant for performing the action? A key reason may be that the goal is not to produce a fixed trajectory, but to produce a legible ‘k’. A legible ‘k’ is not a single trajectory, but a class of trajectories containing an infinity of viable solutions. If someone nudged my arm while I was writing, adaptive feedback control should not attempt to return me to the originally intended trajectory, but should steer toward a new trajectory that traces the most legible ‘k’ that is still in the cards, which may be a different style of ‘k’ than I originally intended.

The paper contributes a useful data set for training models and a qualitative comparison of models to real neurons in terms of tuning properties. It would be good, in follow-up studies, to directly test to what extent each of the models can quantitatively predict either single-neuron responses or population representational geometries, as has been done in vision, and to perform statistical comparisons between models.
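One way to compare population representational geometries is to compute a representational dissimilarity matrix (RDM) for each system and correlate the two. The sketch below uses correlation distance for the RDM and a Pearson correlation between upper triangles for the comparison; both are assumptions, since other distances and rank-based comparisons are also common. All data here are random placeholders, not model or neural measurements.

```python
import numpy as np

def rdm(patterns):
    """Correlation-distance RDM; patterns has shape (n_conditions, n_units)."""
    return 1.0 - np.corrcoef(patterns)

def compare_rdms(rdm_a, rdm_b):
    """Pearson correlation between the upper triangles of two RDMs."""
    upper = np.triu_indices_from(rdm_a, k=1)
    return float(np.corrcoef(rdm_a[upper], rdm_b[upper])[0, 1])

# Usage with placeholder data: 20 conditions, 50 model units, 30 recorded neurons.
rng = np.random.default_rng(0)
model_patterns = rng.standard_normal((20, 50))
neural_patterns = model_patterns @ rng.standard_normal((50, 30)) + rng.standard_normal((20, 30))

model_rdm = rdm(model_patterns)
neural_rdm = rdm(neural_patterns)
similarity = compare_rdms(model_rdm, neural_rdm)
```

Statistical comparison between models would additionally require an estimate of the variability of `similarity` across conditions and subjects, for example by bootstrap resampling.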

Importantly, this paper develops the idea of combining simulations of body and brain, that is, of the musculoskeletal system and of the processing of control-related signals in the nervous system. This provides a very exciting direction for future research.



Strengths

  • The paper introduces a highly original research program that marries simulation of the musculoskeletal system and neural network modelling to predict neural representations in the proprioceptive pathway.
  • The authors performed an architecture search and trained multiple instances of different neural network architectures with each of two objectives.
  • The paper includes comprehensive analyses of the proprioceptive representations from the simulated muscle-spindle signals through the layers of the models. These analyses characterize unit tuning, linear decodability, and representational similarity.
  • The results suggest an explanation for the direction tuning with a roughly uniform distribution of the units’ direction preferences that has been reported previously for neurons in the primate primary somatosensory (S1) cortex.
  • If the simulated muscle-spindle data set, models, and analysis code were shared along with the published paper, this work could form the basis for quantitative model evaluation and further model development.

Weaknesses

  • The models are qualitatively evaluated by comparison of model unit tuning to what is known about the tuning of neurons in somatosensory cortex. Follow-up studies should quantitatively evaluate the models by inferential analyses of their ability to predict measured responses.
  • The two training objectives differ in multiple respects, making it difficult to assess what the necessary requirements are for the emergence of representations similar to primate S1. Decoding the hand position may be too simple, but what about decoding velocity, or trajectory descriptors such as curvature? There may be a middle ground between trajectory decoding and action recognition that also leads to the emergence of tuning properties as found in primate S1.