04 May 2016

Label 4...



Spark Basics (Labels)

  • a11|--------- Concepts
  • a12| Action & Transformation
  • a12| Combiner
  • a12| Driver Program
  • a12| Executor
  • a12| Job
  • a12| Lazy Evaluation
  • a12| Lineage Graph
  • a12| Modes of Operation
  • a12| RDD
  • a12| RDD Persistence
  • a12| RDD Types
  • a12| Serialization
  • a12| Stage
  • a12| Task
  • a21|---------- Setup
  • a22| Eclipse
  • a22| log4j
  • a22| SBT
  • a25|--------- Ports
  • a26| Spark Web App
  • a31|--------- Commands
  • a32| spark-shell
  • a32| spark-submit
  • a35|--------- Initialization
  • a36| cache()
  • a36| persist()
  • a36| SparkConf
  • a36| SparkContext
  • a36| stop()
  • a36| unpersist()
  • a41|--------- Input Generation
  • a42| parallelize()
  • a42| textFile()
  • a45|--------- Collection Types
  • a46| List
  • a46| Seq
  • a51|--------- Action (RDD)
  • a52| aggregate()
  • a52| collect()
  • a52| count()
  • a52| countByValue()
  • a52| first()
  • a52| fold()
  • a52| foreach()
  • a52| reduce()
  • a52| saveAsTextFile()
  • a52| take()
  • a52| takeOrdered()
  • a52| takeSample()
  • a52| top()
  • a53|--------- Action (PairRDDFunctions)
  • a54| collectAsMap()
  • a54| countByKey()
  • a54| lookup()
  • a55|--------- Map Operations
  • a56| flatMap()
  • a56| map()
  • a56| mapPartitions()
  • a56| mapPartitionsWithIndex()
  • a61|--------- Set Operations
  • a62| cartesian()
  • a62| distinct()
  • a62| intersection()
  • a62| subtract()
  • a62| union()
  • a65|--------- Other Operations
  • a66| filter()
  • a66| groupBy()
  • a66| toDebugString
  • a71|--------- PairRDDFunctions (Single RDD)
  • a72| combineByKey()
  • a72| foldByKey()
  • a72| groupByKey()
  • a72| mapValues()
  • a72| reduceByKey()
  • a73|--------- PairRDDFunctions (Two RDD)
  • a74| cogroup()
  • a74| join()
  • a74| leftOuterJoin()
  • a74| rightOuterJoin()
  • a90|--------- Sorting
  • a91| sortByKey()
  • a91| takeOrdered()
  • a91| top()
  • d11|--------- Partition
  • d12| General
  • d12| Hash-Partition
  • d12| Partitioner set Operations
  • d12| Partitioner unset Operations
  • d12| Range-Partition
  • d12| Shuffling
  • d13| coalesce()
  • d13| partitionBy()
  • d13| repartition()
  • e15|-------- Shared Variables
  • e17| Accumulators
  • e17| Broadcast Variable
  • part
  • x11|--------- Exceptions
  • x12| NotSerializableException
  • z15|--------- Best Practice
  • z16| Shuffle
  • z25| Others
  • z81|--------- Others
  • z83| Disk
  • z83| Doubts
  • z83| Iterable
  • z83| JVM
  • z83| Pair RDD Generation