/images/avatar_ro.png

Back 2 Code

Spark History Server available in docker-spark

Spark comes with a history server, it provides a great UI with many information regarding Spark jobs execution (event timeline, detail of stages, etc.). Details can be found in the Spark monitoring page. I’ve modified the docker-spark to be able to run it with the docker-compose upcommand. With this implementation, its UI will be running at http://${YOUR_DOCKER_HOST}:18080. To use the Spark’s history server you have to tell your Spark driver:

Find Spark

Find Spark is an handy tool to use each time you want to switch between spark versions in Jupyter Notebooks without the need to change the SPARK_HOME environment variable.

WSJF

WSJF stands for Weighted Shortest Job First. It’s a technique used in scaled Agile framework (SAFe) to prioritise jobs—epics, features and capabilities—according to their value relative to the cost to perform it. Basically it’s a way of ranking a list of features in order to maximise the outcome—the value produced—with a constrained capacity to produce it. The job with the highest WSJF (value over the cost) is selected first for implementation.

Effective Monitoring and Alerting

A short note about this book I used in my work. First of all two good points. The first is that it deals with monitoring, alerting and reporting in general, that is to say independently of the tools used. This is both a strong point and a weak point since it could be useful to identify families of tools adapted to each use. This step back is not so common and allows to introduce higher level concepts, for example the organization of the monitoring in stacks which is absolutely crucial but also notions and general definitions applicable in all circumstances - or almost.

Error Budgets

SRE has found that roughly 70% of outages are due to changes in a live system. Problem Knowing this, there is no need to look any further the reasons why SRE teams–or production team or whatever the team that will be called by angry customers–are so reluctant to change. If it’s not enough, just remind that their objectives are certainly based on the reliability of the services they maintain.