Hadoop – Apache – PySpark Developer
Work with large data sets and libraries of the Hadoop ecosystem,
such as Spark, HDFS, YARN, Hive, Impala, and Oozie (7+ years)
Functional and/or object-oriented programming in Python (Java and Scala preferred)
Multi-threaded applications: concurrency, parallelism, locking strategies, and merging datasets
Solid understanding of SQL and of relational and NoSQL databases
Solid understanding (3+ years) in creating and consuming RESTful services
Solid understanding (5+ years) of memory management, garbage collection, and performance tuning
5+ years of experience with, and working knowledge of, distributed/cluster computing concepts
Solid understanding of and 5+ years of experience in Linux environments; strong knowledge of shell scripting and file systems
Knowledge of CI and build tools such as Git, Maven, SBT, Jenkins, and Artifactory/Nexus
Self-managed and results-oriented with a sense of ownership is required
Excellent analytical, debugging, and problem-solving skills are required
Job Features
Job Category: Developer