FACTR: Force-Attending Curriculum Training
for Contact-Rich Policy Learning

Jason Jingzhou Liu*         Yulong Li*        
Kenneth Shaw         Tony Tao         Ruslan Salakhutdinov         Deepak Pathak
Carnegie Mellon University
*Equal contribution.

Abstract

Many contact-rich tasks humans perform, such as box pickup or rolling dough, rely on force feedback for reliable execution. However, this force information, which is readily available in most robot arms, is not commonly used in teleoperation and policy learning. Consequently, robot behavior is often limited to quasi-static kinematic tasks that do not require intricate force-feedback. In this paper, we first present a low-cost, intuitive, bilateral teleoperation setup that relays external forces of the follower arm back to the teacher arm, facilitating data collection for complex, contact-rich tasks. We then introduce FACTR, a policy learning method that employs a curriculum which corrupts the visual input with decreasing intensity throughout training. The curriculum prevents our transformer-based policy from over-fitting to the visual input and guides the policy to properly attend to the force modality. We demonstrate that by fully utilizing the force information, our method significantly improves generalization to unseen objects by 43% compared to baseline approaches without a curriculum.

Results Highlights

Force Feedback

Customizable Redundancy Resolution

Gravity Compensation

Bimanual Box Lifting

Non-Prehensile Pivoting

Rolling Dough




FACTR Autonomous Policies


FACTR allows our policy to better integrate force information without overfitting to visual information, resulting in better generalization to objects with unseen visual appearances and geometries. FACTR applies a blurring operator of scale \(\sigma_n\) in either pixel or latent space, initialized at a large value then gradually decreased through training.
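As a rough illustration of the curriculum described above, the following sketch decays a blur scale over training and applies a pixel-space Gaussian blur at that scale. The linear schedule, the default `sigma_max` and `sigma_min` values, and the function names are assumptions made for this example; they are not taken from the FACTR implementation, which may use a different operator or schedule.

```python
import numpy as np

def sigma_schedule(step, total_steps, sigma_max=8.0, sigma_min=0.0):
    """Blur scale at a training step (assumed linear decay from sigma_max to sigma_min)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return sigma_max + frac * (sigma_min - sigma_max)

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel; a single-tap identity kernel when sigma is ~0."""
    if sigma < 1e-3:
        return np.array([1.0])
    radius = int(np.ceil(3.0 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur_image(img, sigma):
    """Separable Gaussian blur of an (H, W) or (H, W, C) image, using edge padding."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    out = np.asarray(img, dtype=np.float64)
    for axis in (0, 1):  # blur rows, then columns
        pad = [(0, 0)] * out.ndim
        pad[axis] = (r, r)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="valid"), axis, padded
        )
    return out
```

In a training loop, one would call `blur_image(frame, sigma_schedule(step, total_steps))` on each camera frame before it reaches the vision encoder, so early batches are heavily blurred and late batches are nearly clean.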


Task 1: Bimanual Box Lift


Train Scenario
Unseen Objects

Recovery Behavior
Vision-Only ACT (Baseline)

Policies trained without robot force information fail to recover when an object unseen during training is dropped. This likely occurs because the policy overfits to the training objects and is therefore unaware when a novel object is dropped.

FACTR Policy (Ours)

Our FACTR policies demonstrate robust recovery behavior when objects unseen during training are dropped. When the arms lose contact with the object, the robot's external joint torque readings revert to pre-lift values. Since our policy effectively attends to the robot's forces, it can recover to a pre-lift state.


Baseline Failure Modes
Vision-Only ACT

Fails to lift up the box.

Vision+Force ACT

Drops the box while lifting.



Cross Attention Analysis

We visualize the average cross-attention from the action tokens to the force and vision tokens in the first decoder layer during policy rollout.
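The per-modality average described above can be sketched as follows. This is a minimal example under stated assumptions: the attention tensor layout `(heads, action_tokens, memory_tokens)`, the token index lists, and the choice to sum attention mass over each modality's tokens are all hypothetical and may differ from how the figures on this page were produced.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax, used here only to build example attention weights."""
    z = np.exp(x - x.max(axis=axis, keepdims=True))
    return z / z.sum(axis=axis, keepdims=True)

def modality_attention(attn, force_idx, vision_idx):
    """Average cross-attention mass that action tokens place on each modality.

    attn: array of shape (heads, action_tokens, memory_tokens) whose last axis
          sums to 1 (assumed layout).
    force_idx, vision_idx: lists of memory-token indices for each modality
          (hypothetical; depends on how the encoder orders its tokens).
    Returns (force_mass, vision_mass) as Python floats.
    """
    attn = np.asarray(attn, dtype=np.float64)
    mean_attn = attn.mean(axis=(0, 1))  # average over heads and action tokens
    return float(mean_attn[force_idx].sum()), float(mean_attn[vision_idx].sum())
```

Calling `modality_attention` once per control step on the first decoder layer's weights and plotting the two values over time would give curves comparable in spirit to the ones shown here.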

Vision+Force ACT (Baseline)

Without the curriculum, the policy does not pay enough attention to force, and fails either to lift or to balance the novel boxes.

FACTR Policy (Ours)

FACTR learns to attend to force more. We observe that the attention to force outweighs that of vision as the arms begin lifting the box, signaling a mode switch.



FACTR Policy Continuous Rollout Demo




Task 2: Non-Prehensile Pivot


Train Objects
Unseen Objects

Baseline Failure Modes
Vision-Only ACT

Gets stuck and does not attempt pivoting.

Vision+Force ACT

Violates joint velocity limits during motion. Drops the object during pivoting.



FACTR Policy Continuous Rollout Demo




Task 3: Rolling Dough


Train Objects
Unseen Objects

Baseline Failure Modes
Vision-Only ACT

Policies without force input fail to roll completely.

Vision+Force ACT

Policies without FACTR either fail to roll continuously or crush out-of-distribution dough.





Task 4: Delicate Fruit Pick-and-Place


Train Object
Unseen Objects

Baseline Failure Modes
Vision-Only ACT
Vision+Force ACT

Both vision-only policies and vision+force policies without FACTR get stuck after the gripper closes on these unseen fruits. Since these fruits are visually outside the training distribution, vision-only policies fail to generate appropriate actions to proceed. Vision+force policies without FACTR likely do not learn to use the force input effectively, so they overfit to the visual input and fail in the same way as vision-only policies. In contrast, FACTR policies properly attend to force input. As a result, even when the vision input is out of distribution, the force input can remain within distribution, allowing FACTR policies to predict the correct next actions in the pick-and-place trajectory.



Low-Cost Teleoperation with Force Feedback

Feature 1: Force Feedback

The follower arm relays its contact forces in the form of external joint torques back to the leader arm.


Force feedback allows the user to feel the geometric constraints of the environment through the leader arm.
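A minimal sketch of the torque relay described above, assuming the follower exposes an estimate of its external joint torques and the leader accepts joint torque commands. The scaling `gain`, the saturation limit `tau_max`, the sign convention, and the function name are illustrative assumptions, not values or interfaces from the FACTR teleoperation code.

```python
import numpy as np

def leader_feedback_torque(tau_ext_follower, gain=0.2, tau_max=1.0):
    """Map the follower's external joint torques to a leader torque command.

    tau_ext_follower: per-joint external torque estimate from the follower (N*m).
    gain: assumed scale factor, since the leader's motors are much weaker.
    tau_max: assumed per-joint saturation that keeps the feedback safe for the user.
    The sign convention (same direction as the measured torque) is also an assumption.
    """
    tau = gain * np.asarray(tau_ext_follower, dtype=np.float64)
    return np.clip(tau, -tau_max, tau_max)
```

In practice this term would be added to the leader's other torque commands at every control cycle, so that contact on the follower side is felt as resistance on the leader side.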



Feature 2: Customizable Redundancy Resolution

Because a 7-DOF manipulator is kinematically redundant, an unregulated joint space lets the arm drift into undesirable configurations under gravity during teleoperation. We use a null-space projection control law that resolves this redundancy toward any user-defined rest posture. By construction, this control law imposes no additional end-effector wrench, regardless of the arm's configuration.
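One common way to realize such a control law is sketched below: a joint-space PD term pulls the arm toward the rest posture, and a projector removes the part of that torque that would produce an end-effector wrench. The static projector \(I - J^\top (J^\top)^{+}\), the gains `kp` and `kd`, and the function name are assumptions for illustration; the controller used on the leader arm may differ (for example, a dynamically consistent projector).

```python
import numpy as np

def null_space_posture_torque(J, q, dq, q_rest, kp=2.0, kd=0.1):
    """Posture-regulating joint torque projected into the Jacobian null space.

    J: (6, n) end-effector Jacobian at the current configuration.
    q, dq: (n,) joint positions and velocities.
    q_rest: (n,) user-defined rest posture.
    kp, kd: assumed PD gains for the posture task.
    """
    q, dq, q_rest = (np.asarray(a, dtype=np.float64) for a in (q, dq, q_rest))
    tau_posture = -kp * (q - q_rest) - kd * dq
    # Projector onto the null space of J: removes any torque component of the
    # form J^T F, i.e. anything equivalent to a wrench F at the end-effector.
    N = np.eye(J.shape[1]) - J.T @ np.linalg.pinv(J.T)
    return N @ tau_posture
```

With this projector, the equivalent static end-effector wrench `np.linalg.pinv(J.T) @ tau` of the returned torque is zero up to numerical precision, which matches the property stated above.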


In the video, when disturbances are applied to the elbow of the leader arm, the null-space controller returns the elbow to its default posture without affecting the end-effector pose.



Confined-Space Manipulation Example
Poorly-Chosen Rest Posture Can Cause Collisions

This video shows a case where a poorly chosen rest posture causes collisions, which highlights why the user needs the flexibility to define any rest posture for the leader arm.


Well-Chosen Rest Posture Avoids Collisions

Our leader arm allows the user to define a custom rest posture, which helps the follower arm reach targets in confined spaces during teleoperation.



Feature 3: Gravity Compensation

We implement active gravity compensation for the leader arms, allowing them to remain suspended motionless in midair at any joint configuration.

This enables the user to pause or stop teleoperation at any time and freely release the leader arms, which is especially beneficial for bimanual teleoperation.
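To make the idea of active gravity compensation concrete, here is a small worked example for a hypothetical planar two-link arm, not the actual leader arm: the controller commands the gravity torque vector \(g(q)\), obtained by differentiating the potential energy, so the arm holds still at any configuration. All masses, lengths, and names are assumptions chosen for illustration.

```python
import math

def gravity_torques(q1, q2, m1=1.0, m2=1.0, l1=0.3, lc1=0.15, lc2=0.15, g=9.81):
    """Gravity-compensation torques for a planar 2-link arm (hypothetical model).

    q1: angle of link 1 measured from the horizontal (rad).
    q2: angle of link 2 relative to link 1 (rad).
    m1, m2: link masses (kg); l1: length of link 1 (m);
    lc1, lc2: distances from each joint to its link's center of mass (m).
    Returns (tau1, tau2), the joint torques that exactly cancel gravity.
    """
    # Partial derivatives of U = m1*g*lc1*sin(q1) + m2*g*(l1*sin(q1) + lc2*sin(q1+q2))
    tau2 = m2 * lc2 * g * math.cos(q1 + q2)
    tau1 = (m1 * lc1 + m2 * l1) * g * math.cos(q1) + tau2
    return tau1, tau2
```

With the arm stretched out horizontally the required torques are largest, and with the arm pointing straight up they vanish, which is why the compensation must be recomputed from the current joint angles at every control cycle.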


BibTeX

                    
@article{liu2025factr,
  title={FACTR: Force-Attending Curriculum Training for Contact-Rich Policy Learning}, 
  author={Jason Jingzhou Liu and Yulong Li and Kenneth Shaw and Tony Tao and Ruslan Salakhutdinov and Deepak Pathak},
  journal={arXiv preprint arXiv:2502.17432},
  year={2025}, 
}
                    

Acknowledgements

We thank Arthur Allshire, Andrew Wang, Mohan Kumar Srirama, and Ritvik Singh for discussions about the paper. We also thank Tiffany Tse, Ray Liu, Sri Anumakonda, and Sheqi Zhang for their help with teleoperation. This work is supported in part by ONR MURI N00014-22-1-2773, ONR MURI N00014-24-1-2748 and AFOSR FA9550-23-1-0747.