Photo: Hands-on Docker training, 40 participants
Duration (and price) depend on your specific requirements; trainings range from half a day to three days.
In recent years, Docker has become the de facto worldwide standard for packaging applications deployed to production, including many data-science use cases. A basic knowledge of Docker is thus a valuable part of any data scientist's toolbelt.
In this hands-on workshop, we will look into Docker fundamentals. After the training, you will have an understanding of:
This is a beginner workshop, suitable for anyone interested in taking their first steps with Docker. It is not suitable for data scientists who already use Docker and are looking for advanced usage.
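As a small taste of the fundamentals covered, a minimal Dockerfile for packaging a data-science script might look like the sketch below (the file names `requirements.txt` and `train.py` are illustrative, not from the course materials):

```dockerfile
# Start from a slim official Python base image
FROM python:3.11-slim
WORKDIR /app
# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code and define the default command
COPY train.py .
CMD ["python", "train.py"]
```

Building and running such an image (`docker build -t myapp .` and `docker run myapp`) is exactly the kind of workflow we practice in the workshop.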
A hands-on introduction to the Hadoop stack: HDFS, Hive, HBase, Sqoop, Flume, Kafka, Spark. Every participant will have their own local Hadoop cluster to experiment with. After the training, participants will be able to find their way around the main Hadoop technologies.
In practice, though, the training is often tailored to the exact needs of the team, for example to cover NoSQL databases, different ingestion tools, etc.
"Understandable technical explanation, covered every topic in BigData and many real-life Hadoop use-cases." — Inmarsat engineer
Have Java developers who are converting to Scala and Spark on BigData projects? There are three basic Scala principles they need to know to write better code: functional programming, case classes, and monads. In this training, we will cover all of these with practical examples.
The training will challenge the mindset of an imperative Java engineer. And hopefully, an "aha!" moment will occur during the training when the three principles fall into place.
NB: Basic prior knowledge of Scala would be great. I recommend Martin Odersky's course.
How to merge elegantly? How to collaborate effectively within a team? What is a pull-request? How to contribute to an open-source repo? How to quickly remove committed IDE files?
And some Git internals: What is a commit, the index, the staging area? What's inside the .git directory? How to find an unreachable commit? What is Git garbage collection?
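Two of the questions above can be sketched in a few commands: quickly untracking committed IDE files, and peeking at a commit object. The snippet below uses a throwaway repository; the `.idea` directory and file names are illustrative:

```shell
set -e
# Work in a throwaway repository
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"

# Simulate accidentally committing IDE files
mkdir -p .idea
echo 'workspace' > .idea/workspace.xml
git add .idea
git commit -qm "oops: committed IDE files"

# Remove them from tracking (they stay on disk), then ignore them
git rm -r -q --cached .idea
echo '.idea/' >> .gitignore
git add .gitignore
git commit -qm "untrack IDE files"

# Internals: a commit is just an object under .git/objects; inspect it
git cat-file -p HEAD
```

The `git cat-file -p HEAD` output shows the raw commit: a `tree` pointer, a `parent`, an author, and a message, which is a good starting point for understanding what lives inside .git.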
This training is ideal for everyone who has been using Git for some time but never took the time to understand it.
NB: The Git Book is an excellent resource for learning Git without a training. However, it takes 2-4 days to comprehend it fully. This training takes 3 hours and goes deep into Git, coupled with interactive discussions and Q&As.
For many data-processing tasks, the standard built-in Unix command-line utilities often offer the simplest and fastest solutions. There is no need for Hadoop, Pandas, or SQL if you have cat, etc. at your disposal on the command line.
In this training, we will cover these essential tools with hands-on exercises.
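A typical exercise from the training looks like the sketch below: counting the most frequent request paths in a tiny, made-up "access log" with nothing but standard utilities:

```shell
# Create a small made-up access log (method and path per line)
printf '%s\n' \
  'GET /home' \
  'GET /about' \
  'GET /home' \
  'POST /login' \
  'GET /home' > access.log

# Extract the 2nd column, group identical lines, count them,
# and list the most frequent paths first
cut -d' ' -f2 access.log | sort | uniq -c | sort -rn | head -3
```

One short pipeline replaces what would otherwise be a groupby-and-sort in Pandas or SQL, and it streams, so it scales to logs far larger than memory.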
I based this training on the course I gave to Unix students at my university back in 2007, and on my experience since then.