Building a fully automated Spark ETL job on Kubernetes
GitLab CI / Kubernetes / Spark ETL / Apache Livy
Power BI / Spark Thrift Server / Hive JDBC connector over more than 500 million rows per table, composite model, DirectQuery
Linux administration (Ubuntu)
MLlib (Spark ML, GraphFrames, logistic regression, binary classification, etc.)
Docker containers
Delta Lake
Hadoop
Apache Livy