See this Spark issue:
https://issues.apache.org/jira/browse/SPARK-4049
The community has not resolved it yet. The problem is that when a cached RDD is reused many times, the Fraction Cached value grows beyond 100%, although it should never exceed 100% under normal conditions. As a result, the consumed memory is never released and tasks get slower and slower. Has anyone else encountered this problem, and is there a way to solve or avoid it?