Senior Data Engineer - bilingual English

Company: HAYS
Province: Madrid
City: Madrid, Madrid
Description: We are looking for a Senior Data Engineer to work on an international company's growing data business. The successful candidate will leverage his/her experience and skills to onboard new data sources and service new destinations that enhance our products and services and increase our sales revenue. This position requires a creative, logical, and self-driven person with high standards for quality and attention to detail, capable of taking responsibility and accountability for actions and time management.

Builds large-scale batch data pipelines
Leverages best practices in continuous integration and delivery
Drives optimization, testing and tooling to improve data quality
Translates business requirements into technical requirements and executes on them
Enhances the ETL codebase for added efficiency and capacity.
Develops, recommends and implements process and procedure changes to systematically improve data integrity.
Analyzes data sets, builds visualizations and develops dashboards to inform internal and external clients.
Designs, creates and manages processes to prepare both static and streaming data for use in machine learning algorithms

We work as a distributed engineering team using agile methodologies such as Kanban/Scrum.

Skills & Requirements

Must Have Skills:

Has 3+ years of professional experience in ETL and/or other “Big Data” processes including modelling and data architectures

Understands the Big Data stack, including:
MapReduce
Hadoop
Spark
Hive

Is familiar with relational and hierarchical database development
Has strong written and verbal presentation skills and is capable of communicating the benefits of technology for business problems
Possesses knowledge of data warehousing, data preparation, analytics, reporting and dashboarding for data at terabyte scale using platforms such as Kylin
Has extensive experience in the AWS services ecosystem:
S3
EC2
RDS
EMR
Redshift
QuickSight
Is proficient with Linux OS, preferably RHEL/CentOS/AWS Linux or Ubuntu.

Demonstrates ability to write advanced SQL queries for MySQL and/or PostgreSQL.
Is capable of scripting in various languages, such as Bash, Python, R, Perl or Ruby.
Is familiar with source control platforms (GitHub).

Demonstrates a strong understanding and appreciation of the features, benefits and limitations of machine learning algorithms:
K-Nearest Neighbor
Hierarchical Multi-Label Classifiers
Dimensionality Reduction
High Correlation Filter
Principal Component Analysis

Nice to Have Skills:

– Domain expertise in mobile, consumer packaged goods or demographic data sets
– NoSQL experience: HBase, MongoDB.
– Apache Storm, Kafka, Cassandra.
Technologies: ETL, MapReduce, Hadoop, Spark, Hive
Contract Type: Permanent
Salary: Not specified
Experience: 3 years
Functions: Big Data

