ABOUT THE JOB We are looking for self-motivated, creative thinkers who are flexible and enjoy working in teams. Our data engineering / ETL team develops the daily ETL processes that import large amounts of behavioral data from consumer panels into our data lake. The data is used by our software to provide insights into consumer behavior.
The ETL processes need to evolve to deal with the growing size and complexity of the data and to meet stricter requirements for data quality and turnaround time. Our data engineering team is pragmatic and keen to apply the best tools for the job. We have broad experience with distributed systems such as Hadoop and Hive, as well as in-memory distributed computation platforms like Spark.
We develop locally on Linux, manage our source code in Git, and run our workflows in the cloud on AWS. The team has an open culture, works in an agile style, and cooperates closely with software developers and colleagues from other disciplines, such as data scientists and client-facing solution managers. You will have the opportunity to develop yourself in areas like big data, cloud computing, data lake architecture and data orchestration.
YOUR PROFILE Bachelor's or master's degree in Computer Science, or proven experience in the field; Knowledge of and experience with data engineering best practices is crucial: working quickly and independently, communicating well, and devising working solutions that also generalize to future workloads; The ability to handle (big) data workloads is considered a given, including daily data quality issues and big data scaling issues (such as performance problems and memory constraints); Advanced knowledge of Python, SQL and Apache Spark; Knowledge of and experience with Hadoop/HDFS, Hive, PySpark and SparkSQL.
Experience with developing in PyCharm is a plus; Knowledge of and experience with AWS (especially S3 and EMR); Experience with working on Linux (Ubuntu, Bash) is a plus; Knowledge of and experience with Git for source code version control; Experience with GitHub or GitLab is a plus.
Knowledge of and experience with FTP ingestion and data lakes / data warehousing; Knowledge of and experience with column-oriented data storage formats (ORC, Parquet) or Presto/Athena is a plus; Knowledge of and experience with Docker (Compose); Knowledge of and experience with Kubernetes or Airflow is a plus; Team player with a flexible, proactive and pragmatic attitude; Preferably residing in the Netherlands.
Otherwise, willing to relocate to Rotterdam.

WHAT WE OFFER Competitive salary and benefits; Personal and professional development opportunities; Flexibility in working hours and location; Exciting development projects and clients; An open, respectful and multicultural atmosphere; Time for socialising and fun. In the office: a football table and a ping-pong table, Friday afternoon drinks (every Friday), daily fruit snacks, and reimbursement of travelling expenses. Working from home: weekly team stand-ups and frequent online project meet-ups, regular online team coffee breaks and events, and support with home-office equipment. 25 days of paid leave.

At Nielsen, division Pointlogic, we develop creative analytical solutions to support companies in the optimization of their marketing and communications budgets. Analytical and econometric approaches form the backbone of these solutions, answering questions like "What media mix works best to communicate my brand message?" and "What budget should I invest in media and marketing?". Our software tools and other decision-support solutions combine market research, data, modelling results and technical business intelligence.