Election Data Models Lesson for Cybersecurity
If you are like me, you were pretty convinced that Secretary Clinton was poised to be the President-elect. That confidence was based on reviewing numerous big data analytics models from fivethirtyeight.com, the New York Times, Princeton, etc. Even the most pessimistic of them gave Mrs. Clinton roughly a 65% chance of winning on November 8.
Source: Jon Oltsik
So, what happened? Every database jockey recognizes the old maxim of garbage in/garbage out. In other words, killer algorithms and all the processing power in the world are rather useless if your model is built on the back of crappy data. Obviously, all the brainiacs building these models made a critical mistake in not gathering data from disenfranchised white voters in rural areas. The result? A stunning election outcome and lots of egg on Ivy League elitist faces.
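To make the garbage in/garbage out point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the group names, the population shares, and the support rates are made-up numbers, not actual polling data, and `weighted_support` is a hypothetical helper. It simply shows that the same arithmetic gives a very different answer when one group is undersampled.

```python
def weighted_support(shares, support):
    """Overall support, given each group's share of the sample and its support rate."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("group shares must sum to 1")
    return sum(shares[group] * support[group] for group in shares)


# Hypothetical support rates for a candidate in each group (illustrative only).
support = {"urban": 0.60, "rural": 0.30}

# Hypothetical real makeup of the electorate vs. what the pollster sampled.
population_shares = {"urban": 0.60, "rural": 0.40}
sample_shares = {"urban": 0.90, "rural": 0.10}  # rural voters undersampled

true_support = weighted_support(population_shares, support)
polled_support = weighted_support(sample_shares, support)

print(f"true support:   {true_support:.2f}")    # 0.48
print(f"polled support: {polled_support:.2f}")  # 0.57
```

With these invented numbers, the model calls a comfortable win while the candidate actually loses. The math is identical in both cases; only the input data changed.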
Now I know what you are thinking: What does this have to do with cybersecurity? Well, you can’t get through a cybersecurity meeting in Santa Clara without some fat cat VC or startup crowing about security analytics based upon artificial intelligence, machine learning, neural networks, or some other big data analytics model.
Yup, my head is spinning with buzzwords like supervised and unsupervised machine learning, entropy and information gain, decision trees, etc. Mind you, very few people understand this stuff, but everyone is talking about it. And if you think the machine learning rhetoric is insane today, wait until the cacophony at the RSA Conference buzzathon in February.
Here’s where the election results and security analytics intersect: The accuracy of these models depends upon people, not technology. In the election, the people building the data models did not understand the electorate, which made the models flawed by design. I’m afraid that cybersecurity data models may suffer the same fate because there simply aren’t enough experienced cybersecurity professionals with the situational awareness and data expertise to make all of these models robust.
Here are a few data points I use to support my conclusion:
Per ESG research, 46% of organizations claim that they have a problematic shortage of cybersecurity skills today (note: I am an ESG employee).
Similarly, a recently published research report from ESG and the Information Systems Security Association (ISSA) indicates that 55% of cybersecurity professionals believe that the cybersecurity skills shortage is far worse than most people think.
The ESG/ISSA research report also indicates that 56% of cybersecurity professionals believe that the level of cybersecurity training they receive from their employers is inadequate for keeping up with the threat landscape.
The ESG/ISSA research also reveals that 33% of cybersecurity professionals categorize security analysis and investigations as the area where their organization has its biggest cybersecurity skills shortage.
In aggregate, this data points to an acute shortage of cybersecurity talent and an even more acute shortage of cybersecurity analytics talent.
New types of cybersecurity analytics depend upon two types of people – data scientists who can build the models and cybersecurity subject matter experts who can feed the models with the right assumptions, data, and situational awareness. Unfortunately, the ESG/ISSA data demonstrates that there just aren’t enough of these latter folks to go around.
Given this situation, I have two words for CISOs looking to invest in advanced cybersecurity analytics – Caveat Emptor. The technology itself may dazzle, but as the election results proved, we’ll all get bamboozled when the models are built with the wrong assumptions and flawed data.