I want to tell you about Marcus. That is not his real name, but the story is real.
Marcus was the IT person at a 120-person company in Gothenburg. He had been there six years. He set up the on-prem Active Directory back when the company was 30 people. He migrated them to a hybrid setup with Entra ID when they moved to Microsoft 365. He configured the conditional access policies. He knew which security groups controlled access to what, in both AD and the cloud, because he had created every single one of them. He handled onboarding, offboarding, license assignments, security alerts, the printer on the third floor, and the CEO who kept forgetting his password.
Marcus was not just the IT department. Marcus was the IT infrastructure.
Then Marcus got a better offer and gave his notice. Four weeks, as is standard here. Not a lot of time to transfer six years of knowledge.
I talked to the company about a month after he left. They were struggling.
The tribal knowledge problem
Nobody had the admin credentials for the Entra ID tenant. Well, they found them eventually, in a password manager Marcus had set up but never properly handed over. That took four days. Finding the domain admin password for the on-prem Active Directory took another two.
Nobody knew why there were 47 security groups in AD (plus another 20-something in Entra ID that were cloud-only), what half of them did, or which ones were actually in use. Marcus had a naming convention, but it existed only in his head. Some groups synced from AD to the cloud via Entra Connect. Some did not. Nobody knew which were which.
Nobody understood which conditional access policies were active, which were in report-only mode, and which were old ones he had been meaning to clean up. There were 23 of them.
The new IT person (a consultant, hired quickly) spent the first month just mapping out what existed. Not improving anything. Not fixing anything. Just understanding the current state.
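That first mapping pass is mostly mechanical, and part of it can be scripted. Here is a minimal sketch, assuming the groups and conditional access policies have already been exported as JSON from Microsoft Graph. As far as I know, Graph marks synced groups with `onPremisesSyncEnabled` and reports policy state as `enabled`, `disabled`, or `enabledForReportingButNotEnforced` (report-only). The group and policy names below are made up for illustration, not taken from the company in this story.

```python
from collections import Counter

# Hypothetical sample records, shaped like Microsoft Graph exports.
groups = [
    {"displayName": "SG-Finance-RW", "onPremisesSyncEnabled": True},
    {"displayName": "Project-Phoenix", "onPremisesSyncEnabled": None},
    {"displayName": "SG-Printers-3F", "onPremisesSyncEnabled": True},
]

policies = [
    {"displayName": "Require MFA for admins", "state": "enabled"},
    {"displayName": "Block legacy auth (test)",
     "state": "enabledForReportingButNotEnforced"},
    {"displayName": "Old VPN exception", "state": "disabled"},
]


def split_groups_by_origin(groups):
    """Separate groups synced from on-prem AD from cloud-only groups."""
    synced, cloud_only = [], []
    for group in groups:
        # Assumption: a missing or null flag means the group is cloud-only.
        if group.get("onPremisesSyncEnabled"):
            synced.append(group["displayName"])
        else:
            cloud_only.append(group["displayName"])
    return sorted(synced), sorted(cloud_only)


def count_policies_by_state(policies):
    """Count conditional access policies per state."""
    return Counter(policy["state"] for policy in policies)


synced, cloud_only = split_groups_by_origin(groups)
policy_states = count_policies_by_state(policies)

print("Synced from AD:", synced)
print("Cloud-only:", cloud_only)
for state, count in policy_states.items():
    print(f"{state}: {count}")
```

A script like this does not tell you what a group is for. It only answers the "which are which" questions, so the human effort can go into the ones that need judgment.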
Documentation is not the answer (not by itself)
The obvious response is "Marcus should have documented everything." Sure. But documentation gets written once and then immediately starts going stale.
Marcus probably did document some things early on. A wiki page here, a shared document there. But environments change constantly. A new app gets connected. A policy gets tweaked. A group gets repurposed for something it was not originally meant for. Unless someone updates the documentation every single time something changes (and nobody does that consistently, let us be honest), the docs drift from reality.
I have seen companies with thorough Confluence spaces full of identity management documentation that was last touched 18 months ago. That is almost worse than having nothing, because it gives a false sense of security.
Systems over brains
The real fix is not better documentation. It is reducing the amount of knowledge that needs to live in someone's head in the first place.
When your access management runs through a system with defined rules and workflows, the system is the documentation. The rules are visible. The logic is inspectable. A new person can look at the system and understand why someone has the access they have, because the system enforces it. Not because Marcus set it up by hand in 2022 and remembers why.
This is the difference between "Marcus knows how onboarding works" and "the onboarding workflow is defined in a tool that anyone with admin access can read and modify."
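To make "the system is the documentation" concrete, here is a small sketch of access rules written as data rather than remembered. It is not any particular product's format. The rule structure, the attribute names (`department`, `office`), and the group names are all hypothetical. The point is only that each grant carries its reason with it.

```python
# Hypothetical rule set: each rule says who it applies to, what it grants, and why.
ACCESS_RULES = [
    {
        "when": {},  # an empty condition matches everyone
        "grant": ["SG-All-Staff"],
        "reason": "Baseline access for every employee",
    },
    {
        "when": {"department": "Finance"},
        "grant": ["SG-Finance-RW"],
        "reason": "Finance staff need the finance share",
    },
    {
        "when": {"office": "Gothenburg"},
        "grant": ["SG-Printers-3F"],
        "reason": "Gothenburg office uses the third-floor printer",
    },
]


def rule_matches(rule, person):
    """A rule applies when every attribute in its condition matches the person."""
    return all(person.get(key) == value for key, value in rule["when"].items())


def explain_access(person):
    """Return {group: reason} for every rule that applies to this person."""
    granted = {}
    for rule in ACCESS_RULES:
        if rule_matches(rule, person):
            for group in rule["grant"]:
                granted[group] = rule["reason"]
    return granted


new_hire = {"name": "Example Person", "department": "Finance", "office": "Gothenburg"}
access = explain_access(new_hire)
for group, reason in access.items():
    print(f"{group}: {reason}")
```

A new admin reading this does not need to ask anyone why a finance hire ended up in a given group. The answer sits next to the rule, and changing the rule changes the answer.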
It is also the difference between "we need to hire someone who knows our setup" and "we need to hire someone who understands identity management." The first is nearly impossible to find. The second is a normal job requirement.
It is not just about quitting
Marcus-dependency is a risk even if Marcus never leaves. What happens when he takes four weeks off in July (as one does in Sweden) and someone needs emergency access revoked? What happens when he is home sick and a new hire starts on Monday? What happens when he is deep in a migration project and an auditor wants an access report by end of week?
Single points of failure in IT are treated as unacceptable when it comes to servers and networks. Redundancy, failover, backups. Standard practice. But when it comes to knowledge and process, companies tolerate single points of failure all the time.
I think companies tolerate it because the risk is invisible until it hits. A server going down is immediate and loud. Marcus leaving is slow and quiet. The problems show up gradually, in things that do not get done, access that does not get reviewed, processes nobody knows how to run anymore.
For the Marcuses
One more thing. I feel for the Marcuses of the world. I have been one.
Running IT solo at a growing company is a lot. You know you should document things. You know you should build proper processes. But there is always something more urgent. The printer. The password reset. The new hire who starts Monday and needs everything set up.
If this is you, the best thing you can do for yourself, not just for the company, is to move things out of your head and into systems. Not because you are planning to leave. But because carrying an entire company's IT knowledge in your brain is not a reasonable thing to ask of one person. You deserve your four weeks in July without the phone buzzing.
You deserve to be replaceable. That is not an insult. It is relief.