We are looking for a Lead Hadoop Developer who will focus on Big Data development within an existing development team supporting Business Analytics applications. The role requires experience working with Hadoop ecosystems and related technologies, implementing, troubleshooting and optimizing distributed solutions based on modern big data architecture.
- Design, develop and deliver data sets from operational systems and files, and ingest them into ODSs (operational data stores), Data Marts and files.
- Troubleshoot and develop on Hadoop technologies including HDFS, Kafka, Hive, Pig, Flume, HBase, Spark and Impala, and carry out Hadoop ETL development using tools such as ODI for Big Data and APIs to extract data from source systems.
- Translate, load and present disparate data sets in multiple formats and from multiple sources, including JSON, Avro, text files, Kafka queues and log data. Data will range in type from structured through semi-structured to unstructured.
- Build solutions involving large data sets using SQL methodologies and data integration tools such as ODI, on any database.
- Perform all technical aspects of software development (architect, write, test, support) and automation. Deployment considerations will include scheduling and automation of routine processing activities.
- Implement quality logical and physical ETL designs that have been optimized to meet the operational performance requirements of our multiple solutions and products. This will include implementing sound architecture and design and following agreed development standards.
- Perform unit, component and integration testing of software components including the design, implementation, evaluation and execution of unit and assembly test scripts.
- Conduct code reviews and tests of automated build scripts.
- Debug software components; identify and fix code defects and verify their remediation (in your own work and the work of others).
- Work with Business Analysts, end users and architects to define requirements, agree on the end process, build code efficiently and collaborate with the rest of the team on effective solutions.
Knowledge, Experience & Qualifications
- 3 or more years of experience with various tools and frameworks that enable capabilities within the big data ecosystem (Hadoop, Kafka, NiFi, Hive, YARN, HBase, NoSQL, Cassandra and MongoDB).
- 4 or more verifiable years of software development experience in a professional environment.
- Experience with Big Data query tools such as Pig, Spark SQL and Phoenix.
- Experience with data ingestion tools such as Flume and Sqoop.
- Experience with AD and Kerberos, including troubleshooting service and user account issues on a Kerberized cluster.
- Experience with Spark Streaming, Apache NiFi and Kafka for real-time data processing.
- Experience with DevOps tools such as Jenkins.
- Experience with source-controlled environments such as TFS, Git or SVN.
- Hands-on experience with application design, software development and automated testing.
- Solid experience with Java/J2EE, XML, XPath, Web Services and REST services.
Bachelor’s Degree in Computer Science, Computer Science Engineering, or a related field required; higher degree preferred.