
- There are no hard problems. There is just a lack of information about how the system works
- Remember that the bug is happening for a logical reason
- Be unreasonably confident in your ability to fix the bug
- Every error is an opportunity to learn
- Be aware of imposter syndrome
- Get enough sleep and take breaks
- Try to tackle hard problems in the morning with a fresh mind and without interruptions (before you check email, chat, ticket system, monitoring, …)
Finding the root cause of the problem
- What’s the error message? Are there any log files?
- Read the error description. Every word of it. Twice.
- Is there a typo somewhere (command line/configuration/code)?
- Try to make the issue reproducible
  - Can you reproduce it from the command line?
    - It’s easier for other people to reproduce the issue
    - It’s easier to test the fix
- Isolate the problem
  - Remove some parts of the system and try to reproduce the bug
  - Vary one thing at a time while keeping all other things constant
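
The "vary one thing at a time" advice can be sketched in a few lines of Python. This is a minimal illustration, not a real tool: the function `vary_one_at_a_time`, the `reproduces_bug` check, and the settings `cache`, `workers`, and `locale` are all hypothetical names invented for the example. The idea is to start from a baseline configuration that shows the bug, change exactly one setting per run, and record whether the bug is still there.

```python
from typing import Any, Callable, Dict, List, Tuple


def vary_one_at_a_time(
    baseline: Dict[str, Any],
    alternatives: Dict[str, List[Any]],
    reproduces_bug: Callable[[Dict[str, Any]], bool],
) -> List[Tuple[str, Any, bool]]:
    """Change a single setting per run and record whether the bug still appears."""
    results = []
    for key, values in alternatives.items():
        for value in values:
            # Fresh copy: every other setting stays at its baseline value.
            config = dict(baseline)
            config[key] = value
            results.append((key, value, reproduces_bug(config)))
    return results


# Hypothetical system under test: it fails only when caching is enabled.
def reproduces_bug(config: Dict[str, Any]) -> bool:
    return config["cache"] is True


# Baseline configuration in which the bug is known to occur (assumed values).
baseline = {"cache": True, "workers": 4, "locale": "en_US"}
alternatives = {"cache": [False], "workers": [1], "locale": ["de_DE"]}

for key, value, still_fails in vary_one_at_a_time(baseline, alternatives, reproduces_bug):
    status = "still present" if still_fails else "gone"
    print(f"{key}={value!r}: bug {status}")
```

In this made-up setup only the run with `cache=False` makes the bug disappear, which points at caching as the place to look next. In practice `reproduces_bug` would run your actual reproduction steps instead of inspecting a dictionary.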
Issue still not fixed? Checklist
After some time of debugging
If you copy/paste from Stack Overflow (we all do, at least sometimes)
- Don’t copy/paste from Stack Overflow without understanding the actual problem
- Don’t copy/paste from Stack Overflow without understanding the proposed solution
- If you don’t have time for it right now => make a note about it (even after solving it)
- If you don’t know what the command/tool is doing => read the man page (https://explainshell.com)
- Don’t copy/paste commands/code. Type them yourself!
After solving the issue
- Well done! I’m glad you didn’t give up! It’s time to celebrate your success!
- What have you learned during the journey?
- What were the wrong assumptions?
- Can you prevent the problem from happening again in the future (write tests/docs, add monitoring)?
- How can you solve a similar problem even faster in the future?
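
One way to keep a fixed bug from coming back is a regression test that replays the exact input that triggered it. The sketch below is only an illustration under assumed names: `average` stands in for whatever function you fixed, and the "bug" (a `ZeroDivisionError` on empty input) is invented for the example. The tests use plain `assert` statements, so they can be collected by a test runner such as pytest or simply called directly.

```python
from typing import Optional, Sequence


def average(values: Sequence[float]) -> Optional[float]:
    """Hypothetical function that used to raise ZeroDivisionError on empty input."""
    if not values:  # the fix: handle the empty case explicitly
        return None
    return sum(values) / len(values)


def test_average_of_empty_input_returns_none():
    # The exact input that triggered the original (hypothetical) bug.
    assert average([]) is None


def test_average_of_regular_input_is_unchanged():
    # Guard against the fix breaking the normal case.
    assert average([1.0, 2.0, 3.0]) == 2.0
```

Naming the test after the failure it guards against also documents the wrong assumption for the next person who reads the code.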