Data Analytics Engineer

Short Description

Gartner is hiring a Data Analytics Engineer to work with the team on managing AWS resources (EMR, ECS clusters, etc.) and to continuously improve the deployment process of our applications.

Job Description

What You Will Do
  • Participate in architecture design and implementation of high-performance, scalable and optimized data solutions.
  • Design, build and automate the deployment of our data pipelines and applications to our data environment to support data scientists and researchers with their reporting and data requirements.
  • Build workflows using data from a wide variety of sources, including on-premises databases and external data sources with REST APIs and harvesting tools.
  • Work with internal infrastructure teams on monitoring, security, and configuration of the database environment and applications.
  • Collaborate with internal business units and data science teams on business requirements, data access, processing/transformation, and reporting needs, and leverage existing and new tools to provide solutions. Effectively support and partner with businesses on implementation, technical issues, and training on the data lake ecosystem.
  • Work with the team on managing AWS resources (EMR, ECS clusters, etc.) and continuously improve the deployment process of our applications.
  • Work with administrative resources and support provisioning, monitoring, configuration and maintenance of AWS tools.
  • Promote the integration of new cloud technologies and continuously evaluate new tools that will improve the organization's capabilities while lowering the total cost of operation.
  • Support automation efforts across the data analytics team utilizing Infrastructure as Code (IaC) using Terraform, Configuration Management, and Continuous Integration (CI) / Continuous Delivery (CD) tools such as Jenkins.
  • Work with the team to implement data governance and access control, and to identify and reduce security risks.

What You Will Need:
  • Bachelor's or Master's Degree in Computer Science, Information Systems, Engineering or related technical fields.
  • 4-8 years' experience in software development, including significant experience in Big Data and Cloud Services.
  • Expertise in big data, the AWS platform, Linux operating systems, and DevOps (preferred).
  • Passion for understanding and working with large amounts of data (structured and unstructured), building data pipelines for ETL workloads from internal and external sources, and leveraging tools that turn raw data into useful information and insights, utilizing Data Science, Analytics, Business Intelligence (BI) and visualization tools to support business needs.
  • Experience with big data tools: Hadoop, Spark, Hive, Presto, EMR, Kinesis, Athena, etc.
  • Experience with relational SQL and NoSQL databases, including RDS, MS SQL, DynamoDB, etc. and with data pipeline and workflow management tools: Oozie, Airflow, etc.
  • Experience with the Linux/OSX command line and Git is a plus.
  • Experience with object-oriented/object function scripting languages: Python, Java, Scala, Shell scripting, etc.
  • Experience with stream-processing systems such as Spark Streaming and Kinesis is a plus.
  • Knowledge of and some experience with AWS services such as EMR, S3, ECS, Lambda, etc. and the AWS CLI; a self-learner with the ability to experiment with and adopt new tools to build more efficient processes.
  • Working knowledge of and some experience with continuous integration/delivery tools like Jenkins and infrastructure as code using Terraform are preferred.
  • Ability to take vague requirements and transform them into deliverables.
  • Good combination of technical and interpersonal skills with strong written and verbal communication; detail-oriented with the ability to work independently.
  • Takes initiative on improvements and testing results.
  • Consultant mindset: identify, communicate, and act on issues and initiatives.
  • Ability to handle multiple tasks and projects simultaneously in an organized and timely manner.
  • Detail-oriented, with the ability to plan, prioritize, and meet deadlines in a fast-paced environment.
  • Ability to work independently, as well as part of a team.
  • Experience working with fast-paced operations/dev teams and DevOps.

Who You Are
  • Experience with NLP tools such as NLTK, OpenNLP, Stanford CoreNLP, and similar open source solutions is a plus.
  • Experience with NLP tagging methods and techniques such as CCG and Penn Treebank is a plus.
  • Experience with NLP applications such as tokenization, parsing, lemmatization, POS tagging techniques, Named Entity Recognition (NER) or Stanford NER (SNER) is a plus.
  • Experience developing and applying machine learning using tools such as Python scikit-learn, R or similar languages is a plus.
  • Ability to apply combinations of classifiers such as Naïve Bayes, Decision Tree, k-NN, Neural Networks, and SVM is a plus.

Mid-Senior-level Information Technology | Technology | Information Full-time Other | Engineering | Information Technology Data Analyst
Gartner, Inc. is a global research and advisory firm providing insights, advice, and tools for leaders in IT, Finance, HR, Customer Service and Support, Legal and Compliance, Marketing, Sales, and Supply Chain functions across the world.