04 May 2016

Accumulator : Example

Note : Use Accumulator only in action to get correct values. Do not use Accumulator in Transformation ; Use it only for debugging purpose in Transformation
  1. scala> val input = sc.parallelize(List(1, 2, 3, 4, 5,
  2. | 6, 7, 8, 9, 10,
  3. | 11, 12, 13, 14, 15
  4. | ))
  5. input: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:27
  6.  
  7. scala> println("No of partitions -> " + input.partitions.size)
  8. No of partitions -> 8
  9.  
  10. scala> val myAccum = sc.accumulator(0, "My Accumulator")
  11. myAccum: org.apache.spark.Accumulator[Int] = 0
  12.  
  13. scala> // Used inside an action
  14.  
  15. scala> input.foreach{ x =>
  16. | //Thread.sleep(50000)
  17. | myAccum += 1
  18. | }
  19.  
  20. scala> println("myAccum -> " + myAccum.value)
  21. myAccum -> 15
c15 > a15