Strong understanding of and experience with distributed computing frameworks, particularly Apache Hadoop (YARN, MapReduce, HDFS) and associated technologies: one or more of Hive, Sqoop, Avro, Flume, Oozie, ZooKeeper, Impala, etc.
Hands-on experience with Apache Spark and its components (Streaming, SQL, MLlib) is a strong advantage.
Working knowledge of cloud computing platforms (AWS/Azure/GCP)
Experience working in a Linux environment and with command-line tools, including shell/Python scripting for automating common tasks
Experience with GCP services (BigQuery, Bigtable, Pub/Sub, Dataflow, App Engine), AWS, or Azure is preferred.
What You'll Do:
The role involves big data pre-processing and reporting workflows, including collecting, parsing, managing, analyzing, and visualizing large data sets to turn information into business insights
Develop the software and systems needed for end-to-end execution on large projects
Work across all phases of SDLC, and use Software Engineering principles to build scalable solutions
Build the knowledge base required to deliver increasingly complex technology projects
Evaluate, develop, maintain, and test big data solutions for advanced analytics projects
Perks:
5-day work week
Option to work from home
Flexible work hours
About Propellor.ai (ThinkBumblebee Analytics)
Propellor.ai helps businesses do better business. In a world of data surplus and optimisation deficit, we simplify the lives of business owners with AI data tools that turn data into effective insights for every member of the company.
The changing face of business requires a dynamic toolkit which integrates data sources (Bond), visualises the data (Bolt), understands the user base (Box), enhances campaign conversions (Beam) and identifies customer churn (Bynd).