Note : Use Accumulator only in action to get correct values. Do not use Accumulator in Transformation ; Use it only for debugging purpose in Transformation
c15 > a15
- scala> val input = sc.parallelize(List(1, 2, 3, 4, 5,
- | 6, 7, 8, 9, 10,
- | 11, 12, 13, 14, 15
- | ))
- input: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:27
- scala> println("No of partitions -> " + input.partitions.size)
- No of partitions -> 8
- scala> val myAccum = sc.accumulator(0, "My Accumulator")
- myAccum: org.apache.spark.Accumulator[Int] = 0
- scala> // Used inside an action
- scala> input.foreach{ x =>
- | //Thread.sleep(50000)
- | myAccum += 1
- | }
- scala> println("myAccum -> " + myAccum.value)
- myAccum -> 15