TensorFlow on M1 Mac: "Incorrect checksum for freed object"

I recently started using an Apple Silicon Mac. I installed TensorFlow 2.6.2 through Anaconda, which was the latest version I could find.

When I run my training code, it appears to start initializing and then hits a memory error. After that it hangs until I stop it manually.

The printed output looks like:

(machine_learning) eric@mac-mini cr_battle_predictor % python3 main.py
2021-12-25 22:11:24.286059: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
  0%|                                                                                                                                                                                                    | 0/350 [00:00<?, ?epoch/s]2021-12-25 22:11:24.368779: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
python3(48188,0x304135000) malloc: Incorrect checksum for freed object 0x7fe41cac4e80: probably modified after being freed.
Corrupt value: 0x7fe42c07e480
python3(48188,0x304135000) malloc: *** set a breakpoint in malloc_error_break to debug
zsh: abort      python3 main.py
(machine_learning) eric@mac-mini cr_battle_predictor % /Users/eric/.conda/envs/machine_learning/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Occasionally (in roughly 1 of 7 attempts), the error does not appear and, as far as I can tell, training proceeds normally.

I know that a similar question has been asked here. However, the only solution offered was to make sure I was using the correct interpreter and the most recent version of TensorFlow. I am using Python 3.9.7 and TensorFlow 2.6.2, and I confirmed that my program runs with these versions.

What causes this problem? I am willing to share any needed information.

CodePudding user response:

Installing TensorFlow on an M1 Mac is a real pain. I faced the same issue and was unable to fix it in place, so my suggestion is to reinstall TensorFlow from scratch. First, I'm going to assume that you are on macOS 12 (Monterey). If you aren't, refer to https://github.com/apple/tensorflow_macos/issues/153, which seems to have worked for some people.

If that doesn't work, upgrade to Monterey and follow the steps outlined at https://developer.apple.com/metal/tensorflow-plugin/. They are:

Download and install Miniforge to get a Conda environment (available from https://github.com/conda-forge/miniforge#miniforge3; pick "arm64 (Apple Silicon)", since you are on an M1):

chmod +x ~/Downloads/Miniforge3-MacOSX-arm64.sh
sh ~/Downloads/Miniforge3-MacOSX-arm64.sh
source ~/miniforge3/bin/activate

Install the TensorFlow dependencies:

conda install -c apple tensorflow-deps

Then install base TensorFlow:

python -m pip install tensorflow-macos

Finally, install the TensorFlow Metal plugin:

python -m pip install tensorflow-metal

TensorFlow should now work (and use the M1 GPU too!).
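
To check the result, a small script like the one below can help. It is only a sketch: the function name describe_environment is made up for illustration, and the exact GPU device names you see will depend on your setup. One thing worth noting is that the log in the question mentions SSE4.1 and SSE4.2, which are x86 instruction sets, so the Anaconda install was probably an x86_64 build running under Rosetta rather than a native arm64 one. The "machine" field makes that easy to spot.

```python
import platform
import sys


def describe_environment():
    """Collect interpreter and TensorFlow details as a plain dict."""
    info = {
        "python": platform.python_version(),
        "executable": sys.executable,
        # "arm64" for a native Apple Silicon interpreter,
        # "x86_64" when the interpreter runs under Rosetta
        "machine": platform.machine(),
        "tensorflow": None,
        "gpus": [],
    }
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment
        return info
    info["tensorflow"] = tf.__version__
    # With tensorflow-metal installed, the M1 GPU should be listed here
    info["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it with the same interpreter you use for main.py. If "machine" is not arm64, or "gpus" is empty, the environment is most likely not the Miniforge one created above.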