In Search of Goodness: Large Scale Benchmarking of Goodness Functions for the Forward-Forward Algorithm

Indian Institute of Technology Gandhinagar
Academic Research 2025

Benchmarking 21 distinct goodness functions across four standard image datasets to evaluate classification accuracy, energy consumption, and carbon footprint for the Forward-Forward algorithm.

Abstract

The Forward-Forward (FF) algorithm offers a biologically plausible alternative to backpropagation, enabling neural networks to learn through local updates. However, FF's efficacy depends heavily on the definition of "goodness", a scalar measure of a layer's neural activity. While current implementations predominantly use a simple sum-of-squares metric, it remains unclear whether this default choice is optimal. To address this, we benchmarked 21 distinct goodness functions across four standard image datasets (MNIST, FashionMNIST, CIFAR-10, STL-10), evaluating classification accuracy, energy consumption, and carbon footprint. We found that several alternative goodness functions, drawn from domains ranging from game theory to information theory, outperform the standard baseline. Specifically, game_theoretic_local achieved the highest multi-pass accuracy on MNIST (98.17%), softmax_energy_margin_local on FashionMNIST (86.32%), sparse_l1_local on CIFAR-10 (43.82%), and triplet_margin_local on STL-10 (37.72%). Furthermore, we observed substantial variability in computational efficiency, highlighting a critical trade-off between predictive performance and environmental cost. These findings demonstrate that the goodness function is a pivotal hyperparameter in FF design.

Methodology

In this study, we systematically benchmark 21 distinct goodness functions for the Forward-Forward (FF) algorithm. FF replaces backpropagation's global backward pass with two forward passes: each layer is trained by a local objective that maximizes "goodness" on positive data and minimizes it on negative data.

The Forward-Forward Algorithm

For each layer, the objective is to have high "goodness" for positive data (real samples) and low "goodness" for negative data (corrupted or generated samples). The local loss function is defined as:

L = log(1 + exp(-(G(y_pos) - θ))) + log(1 + exp(G(y_neg) - θ))

where G(·) is the goodness function, y_pos and y_neg are the layer's activations for positive and negative samples, and θ is the goodness threshold.
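This loss is straightforward to implement via the softplus identity log(1 + exp(x)) = softplus(x). Below is a minimal PyTorch sketch; the threshold value of 2.0 is an assumption for illustration, not a setting taken from the paper.

```python
import torch
import torch.nn.functional as F

def ff_local_loss(y_pos, y_neg, goodness_fn, theta=2.0):
    """Local Forward-Forward loss for one layer.

    y_pos, y_neg: activations for positive/negative batches, shape
    (batch, features). goodness_fn maps activations to one scalar per
    sample, shape (batch,). theta is the goodness threshold (assumed 2.0).
    """
    g_pos = goodness_fn(y_pos)  # pushed above theta during training
    g_neg = goodness_fn(y_neg)  # pushed below theta during training
    # softplus(x) = log(1 + exp(x)), matching the loss defined above;
    # per-sample losses are averaged over the batch.
    return (F.softplus(-(g_pos - theta)) + F.softplus(g_neg - theta)).mean()

def sum_of_squares(y):
    # The baseline goodness: sum of squared activations per sample.
    return y.pow(2).sum(dim=1)
```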

Goodness Functions Evaluated

We categorize the 21 goodness functions into five groups (illustrative sketches of a few follow the list):

  • Baseline: Sum of Squares.
  • Distance and Energy-Based: L2 Normalized Energy, Huber Norm, Triplet Margin, Tempered Energy, Outlier Trimmed Energy, Softmax Energy Margin.
  • Biologically Inspired: Hebbian, Oja's Rule, BCM Theory.
  • Information Theoretic: InfoNCE, Predictive Coding, NT-Xent.
  • Statistical and Other Approaches: Decorrelation, Game Theoretic, Fractal Dimension, Whitened Energy, PCA Energy, Gaussian Energy, Sparse L1, Attention Weighted.
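To make these categories concrete, here is a minimal sketch of how a few of the listed functions can be written. The definitions are plausible readings of the names, not the paper's exact formulations, and the delta and tau parameters are assumptions.

```python
import torch

# Each function maps activations y of shape (batch, features) to a
# per-sample goodness score of shape (batch,). Definitions are
# illustrative assumptions, not the paper's exact formulations.

def sparse_l1(y):
    # Sparsity-oriented goodness: the L1 norm of the activity vector.
    return y.abs().sum(dim=1)

def huber_norm(y, delta=1.0):
    # Huber penalty per unit: quadratic near zero, linear in the tails,
    # which damps the influence of extreme activations (delta assumed).
    a = y.abs()
    quad = 0.5 * a.pow(2)
    lin = delta * (a - 0.5 * delta)
    return torch.where(a <= delta, quad, lin).sum(dim=1)

def tempered_energy(y, tau=0.5):
    # A tempered variant of the sum-of-squares energy: a power tau < 1
    # compresses large energies (tau assumed).
    return y.pow(2).sum(dim=1).pow(tau)
```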

Figure: Overview of the Forward-Forward algorithm and the diverse goodness functions evaluated in this study.

Results

We evaluated the performance of the 21 goodness functions across four datasets: MNIST, FashionMNIST, CIFAR-10, and STL-10. We report Classification Accuracy (linear classifier on frozen embeddings), Multi-pass Accuracy (native FF inference), and Environmental Impact (Emissions and Energy).
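Multi-pass accuracy refers to FF's native inference scheme: the network runs one forward pass per candidate label and predicts the label whose pass accumulates the most goodness across layers. A minimal sketch, assuming Hinton's convention of overwriting the first pixels of the flattened input with a one-hot label (the paper may embed labels differently); embed_label and multipass_predict are illustrative names.

```python
import torch

def embed_label(x, label, num_classes=10):
    # Overwrite the first num_classes inputs with a one-hot label
    # (an assumed labeling scheme, following Hinton's MNIST setup).
    x = x.clone()
    x[:, :num_classes] = 0.0
    x[:, label] = 1.0
    return x

@torch.no_grad()
def multipass_predict(layers, goodness_fn, x, num_classes=10):
    """One forward pass per candidate label; highest total goodness wins."""
    scores = []
    for label in range(num_classes):
        h = embed_label(x, label, num_classes)
        total = torch.zeros(x.shape[0], device=x.device)
        for layer in layers:
            h = layer(h)
            total += goodness_fn(h)  # accumulate goodness across layers
        scores.append(total)
    return torch.stack(scores, dim=1).argmax(dim=1)
```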

1. MNIST

On MNIST, game_theoretic_local achieved the highest Multi-pass Accuracy of 98.17%. The baseline sum_of_squares was the most energy-efficient.
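The emissions figures are of the kind produced by software carbon trackers; below is a minimal sketch using CodeCarbon (an assumed tool choice, not confirmed by the paper), with train_forward_forward as a hypothetical stand-in for one training run.

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="ff_goodness_benchmark")
tracker.start()
train_forward_forward()        # hypothetical training entry point
emissions_kg = tracker.stop()  # CO2-equivalent emissions in kilograms
print(f"emissions: {emissions_kg * 1000:.2f} g CO2eq")
```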

Figure: MNIST results (bar chart).
| Goodness Function | Class. Acc. | Multi-pass Acc. | Emissions (g CO2) |
|---|---|---|---|
| attention_weighted_local | 0.9737 | 0.9803 | 13.14 |
| bcm_local | 0.0986 | 0.0979 | 12.56 |
| decorrelation_local | 0.9738 | 0.9795 | 12.84 |
| fractal_dimension_local | 0.9676 | 0.9803 | 12.62 |
| game_theoretic_local | 0.9715 | 0.9817 | 12.78 |
| gaussian_energy_local | 0.9690 | 0.9805 | 13.11 |
| hebbian_local | 0.9690 | 0.9805 | 13.44 |
| huber_norm_local | 0.9696 | 0.9815 | 12.93 |
| info_nce_local | 0.9564 | 0.9799 | 13.47 |
| l2_normalized_energy_local | 0.9690 | 0.9805 | 13.04 |
| nt_xent_local | 0.9564 | 0.9799 | 12.92 |
| oja_local | 0.9056 | 0.9740 | 13.55 |
| outlier_trimmed_energy_local | 0.3645 | 0.4055 | 13.20 |
| pca_energy_local | 0.9690 | 0.9805 | 13.52 |
| predictive_coding_local | 0.9788 | 0.9803 | 13.19 |
| softmax_energy_margin_local | 0.9568 | 0.9791 | 13.60 |
| sparse_l1_local | 0.9719 | 0.9811 | 13.64 |
| sum_of_squares (baseline) | 0.9690 | 0.9805 | 12.32 |
| tempered_energy_local | 0.9690 | 0.9805 | 13.27 |
| triplet_margin_local | 0.9750 | 0.9806 | 13.28 |
| whitened_energy_local | 0.9690 | 0.9805 | 13.06 |

2. FashionMNIST

For FashionMNIST, softmax_energy_margin_local achieved the highest Multi-pass Accuracy of 86.32%. whitened_energy_local was the most efficient.

Figure: FashionMNIST results (bar chart).
| Goodness Function | Class. Acc. | Multi-pass Acc. | Emissions (g CO2) |
|---|---|---|---|
| attention_weighted_local | 0.8338 | 0.8594 | 12.52 |
| bcm_local | 0.1023 | 0.1000 | 12.91 |
| decorrelation_local | 0.8243 | 0.8556 | 12.76 |
| fractal_dimension_local | 0.8449 | 0.8540 | 12.56 |
| game_theoretic_local | 0.8323 | 0.8586 | 12.92 |
| gaussian_energy_local | 0.8246 | 0.8487 | 12.71 |
| hebbian_local | 0.8246 | 0.8487 | 13.33 |
| huber_norm_local | 0.8341 | 0.8573 | 12.49 |
| info_nce_local | 0.7860 | 0.8471 | 12.69 |
| l2_normalized_energy_local | 0.8246 | 0.8487 | 12.90 |
| nt_xent_local | 0.7860 | 0.8471 | 12.55 |
| oja_local | 0.1337 | 0.8059 | 13.31 |
| outlier_trimmed_energy_local | 0.3155 | 0.1678 | 12.85 |
| pca_energy_local | 0.8246 | 0.8487 | 13.39 |
| predictive_coding_local | 0.8539 | 0.8625 | 12.91 |
| softmax_energy_margin_local | 0.8284 | 0.8632 | 12.97 |
| sparse_l1_local | 0.8209 | 0.8536 | 13.17 |
| sum_of_squares (baseline) | 0.8246 | 0.8487 | 13.12 |
| tempered_energy_local | 0.8246 | 0.8487 | 13.27 |
| triplet_margin_local | 0.7887 | 0.8585 | 13.03 |
| whitened_energy_local | 0.8246 | 0.8487 | 11.91 |

3. CIFAR-10

On CIFAR-10, sparse_l1_local achieved the highest Multi-pass Accuracy of 43.82% while also ranking among the most efficient (second-lowest emissions, behind gaussian_energy_local), highlighting the importance of sparsity.

Figure: CIFAR-10 results (bar chart).
| Goodness Function | Class. Acc. | Multi-pass Acc. | Emissions (g CO2) |
|---|---|---|---|
| attention_weighted_local | 0.2173 | 0.3980 | 14.66 |
| bcm_local | 0.2608 | 0.3521 | 14.51 |
| decorrelation_local | 0.2309 | 0.4146 | 14.44 |
| fractal_dimension_local | 0.2617 | 0.4235 | 14.50 |
| game_theoretic_local | 0.2347 | 0.4305 | 14.31 |
| gaussian_energy_local | 0.2857 | 0.3959 | 13.91 |
| hebbian_local | 0.2857 | 0.3959 | 14.63 |
| huber_norm_local | 0.2363 | 0.3976 | 14.52 |
| info_nce_local | 0.2560 | 0.3867 | 15.22 |
| l2_normalized_energy_local | 0.2857 | 0.3959 | 14.57 |
| nt_xent_local | 0.2560 | 0.3867 | 14.29 |
| oja_local | 0.1753 | 0.1618 | 14.50 |
| outlier_trimmed_energy_local | 0.3747 | 0.1000 | 14.41 |
| pca_energy_local | 0.2857 | 0.3959 | 14.17 |
| predictive_coding_local | 0.4452 | 0.4342 | 14.29 |
| softmax_energy_margin_local | 0.2523 | 0.3869 | 14.16 |
| sparse_l1_local | 0.2733 | 0.4382 | 14.02 |
| sum_of_squares (baseline) | 0.2857 | 0.3959 | 14.99 |
| tempered_energy_local | 0.2857 | 0.3959 | 14.71 |
| triplet_margin_local | 0.2516 | 0.4101 | 15.62 |
| whitened_energy_local | 0.2857 | 0.3959 | 15.11 |

4. STL-10

For STL-10, triplet_margin_local achieved the highest Multi-pass Accuracy of 37.72%, suggesting that explicit margin-based separation is especially valuable in data-scarce settings.

Figure: STL-10 results (bar chart).
| Goodness Function | Class. Acc. | Multi-pass Acc. | Emissions (g CO2) |
|---|---|---|---|
| attention_weighted_local | 0.3647 | 0.3647 | 6.45 |
| bcm_local | 0.2132 | 0.1447 | 6.18 |
| decorrelation_local | 0.3649 | 0.3689 | 6.46 |
| fractal_dimension_local | 0.3554 | 0.3614 | 6.33 |
| game_theoretic_local | 0.3770 | 0.3561 | 6.72 |
| gaussian_energy_local | 0.3655 | 0.3664 | 6.75 |
| hebbian_local | 0.3655 | 0.3664 | 6.56 |
| huber_norm_local | 0.3699 | 0.3479 | 6.68 |
| info_nce_local | 0.3494 | 0.3354 | 6.61 |
| l2_normalized_energy_local | 0.3655 | 0.3664 | 6.44 |
| nt_xent_local | 0.3494 | 0.3354 | 6.33 |
| oja_local | 0.3632 | 0.1096 | 6.41 |
| outlier_trimmed_energy_local | 0.2764 | 0.1004 | 6.77 |
| pca_energy_local | 0.3655 | 0.3664 | 6.56 |
| predictive_coding_local | 0.4152 | 0.3607 | 6.73 |
| softmax_energy_margin_local | 0.3475 | 0.3231 | 6.66 |
| sparse_l1_local | 0.3581 | 0.3657 | 6.44 |
| sum_of_squares (baseline) | 0.3655 | 0.3664 | 10.16 |
| tempered_energy_local | 0.3655 | 0.3664 | 6.22 |
| triplet_margin_local | 0.3769 | 0.3772 | 6.46 |
| whitened_energy_local | 0.3655 | 0.3664 | 6.55 |

Figure: classification loss plots, accuracy per layer, and multi-pass accuracy across goodness functions.

BibTeX

@misc{shah2025searchgoodness,
      title={In Search of Goodness: Large Scale Benchmarking of Goodness Functions for the Forward-Forward Algorithm}, 
      author={Arya Shah and Vaibhav Tripathi},
      year={2025},
      eprint={placeholder},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={placeholder}, 
}