
19 June 2013




Thanks for this explanation. I'm not a big data guy, and I don't know how these systems work, but I'd guess you could do a little more than just track known people's contacts. I'd imagine a keyword search could be a blunt but useful tool for identifying subjects for further analysis. Or is it too broad?

Medicine Man

I said damning, not damaging. And by damning, I mean it makes me doubt Snowden's sense of priorities and perspective. I'm sympathetic to the desire to inform the US populace about how and to what extent their government is spying on them. I draw the line at extending foreign governments the same courtesy, and I start to wonder what he's thinking.


The NSA isn’t the only one snooping:


Who was at Occupy Wall Street, Right to Life/abortion rallies, Free the Whales, you name it: somebody can find it. Hope your boss approves, as in “Your company wants a contract with the city? Look what your employees are doing,” or “So you think your employees are doing a good job? Look what they are doing in their free time.” Nothing like facial recognition software to find out which citizens were exercising their God-given rights and where. Not that anyone would intimidate a company or its employees, or just solicit a bribe via the threat to do so (not that any of that would ever happen in America).

Imagine what PM Erdogan could do with a few cameras and some software.

Of course, all those cameras, servers and software supposedly prevented a park from getting burned down and ‘prevented’ lots of crimes. Right.

Then there is the simpler matter of traffic cameras, no problems there, except for the problem with actually seeing your accuser – the computer, and the burden of proof on the defendant after being accused.

But not to worry, we’ll just shorten the time the light stays yellow, that’s sure to help, um, drive revenue:


You mean checking to see if any applicant was at Occupy Wall Street or any other public event/rally/political protest that doesn't fit the hiring manager's requested profile?

no one

Just to add a little to my last: I have worked on data mining fairly large databases in the healthcare insurance industry; large databases, though orders of magnitude smaller than what, apparently, we are talking about re: the NSA.

Our databases (in an Oracle environment) contained hundreds of millions of medical claims associated with millions of members. Some higher-up got sold on the idea that one could simply purchase data mining software (in this case SAS), slap it on top of the data and, voila!, obtain statistically meaningful patterns of medical service use as related to demographics, medical conditions, etc.

Even after having SAS send its own dataminer experts out to work with us, it was mission impossible. There was so much pre-mining data scrubbing and normalization that had to occur for the application to work. Even then our data would have to be segmented into much smaller subsets and far more targeted analysis. The tool wasn't going to just find answers from raw data. You need to start with a hypothesis and then feed in preselected data based on that hypothesis.

Processing power aside (because it is practically fatal to the concept in and of itself), I can't even begin to imagine how you would normalize immense data sets from cell phones, internet, credit cards, etc. so these different sources could be joined together at the level of a unique individual.

OTOH, you could start from a known individual's phone number and see what other numbers have been called. Ditto an IP address. Ditto credit card info. Again, you need to know who your target is.
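To make that concrete, here is a minimal sketch of starting from one known number and walking outward through call records. Everything in it is an assumption for illustration: the `contact_chain` function name, the `(caller, callee)` record format, the `max_hops` limit, and the made-up numbers. It says nothing about how any real system works.

```python
from collections import deque

def contact_chain(call_records, seed, max_hops=2):
    """Return {number: hops} for numbers reachable from `seed` within `max_hops` calls.

    `call_records` is a hypothetical list of (caller, callee) pairs.
    """
    # Build an undirected contact graph: a call links both parties.
    contacts = {}
    for caller, callee in call_records:
        contacts.setdefault(caller, set()).add(callee)
        contacts.setdefault(callee, set()).add(caller)

    # Breadth-first search outward from the known number.
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in contacts.get(current, ()):
            if other not in hops:
                hops[other] = hops[current] + 1
                queue.append(other)
    return hops

# Made-up records: one chain of four numbers plus an unrelated pair.
records = [
    ("555-0100", "555-0101"),
    ("555-0101", "555-0102"),
    ("555-0102", "555-0103"),
    ("555-0198", "555-0199"),
]
network = contact_chain(records, "555-0100", max_hops=2)
```

The search only ever touches records connected to the seed; the unrelated pair is never visited, which is the point about needing a known target first.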

That's why IMO this is a red herring re: identifying terrorists. You'd have to already know who is a terrorist or terrorist sympathizer before you began looking through all the data to flesh out a network. So we are back to good old-fashioned HUMINT and police work. You wouldn't need everyone's phone records as a starting point.

As an aside, the downplaying of Snowden as grandiose because a little fish like him wouldn't have access to all that data is ridiculous. As a lower level manager in insurance I had access to *all* of the company's data and all of the medical records of every member; as did the non-managerial analysts that reported to me. It is always the little fish that have the access. That's what they are there for. To run the queries. The higher ups are the talkers and deal makers. They don't know the first thing about extracting data.


Here are 2 pieces on PRISM etc. that I have published in the Huffington Post.




" There was so much ... data scrubbing and normalization that had to occur for the application to work"

Amen to that.

It is something many folks don't get when they want to apply technology to data. Normalisation is a precondition for efficient processing, be it data mining or, for instance, publishing.

To normalise a large amount of data efficiently (preferably automatically) and correctly is something of a black art. It always requires careful attention and, often, time.

XML, Regex and semantic capturing can do a lot for you, but when people get really creative ... eventually you're happy for everything at least nominally predictable.
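As a small illustration of the regex part, here is a sketch that reduces phone numbers written in different styles to one canonical form, so that records from different sources could be matched. The function name `normalize_us_phone` and the rule set (10-digit US numbers, optional leading country code 1) are assumptions made for the example; real data would need many more cases.

```python
import re

def normalize_us_phone(raw):
    """Return a 10-digit string for a US phone number, or None if it doesn't fit.

    Hypothetical helper: assumes US numbers with an optional leading '1'.
    """
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]          # strip the country code
    return digits if len(digits) == 10 else None

samples = ["(202) 555-0143", "+1 202.555.0143", "202-555-0143 ext. 7", "555-0143"]
cleaned = [normalize_us_phone(s) for s in samples]
```

The first two samples collapse to the same value. The last two come back as `None`, because an extension or a missing area code breaks the simple rule, and such rows would need a human decision or a further rule.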

robt willmann

Mr. Sale inquires about whether there is a problem with the NSA data collection, wondering whether the real menaces to our privacy are marketers "who plot every visit to a web site, every purchase in a store or on line, to make a pattern out of our habits to relieve them of our money. That is the real and enduring threat."

The problem is that the NSA and perhaps other government agencies (including their private contractors) are doing what the corporate marketers are doing plus much more, and a government has a claimed monopoly on force and violence to get you to do what it wants and to prevent you from quickly changing it, whereas the private company has no right to use force against you on its own accord.

Here are two things worth watching (totalling only about 18 minutes), in which former NSA employees who had much higher positions than Edward Snowden discuss the problem. The first is a short documentary on William Binney (over 30 years at NSA), whose duties included developing collection programs.


The second video is an interview by the USA Today newspaper of Mr. Binney, J. Kirk Wiebe (over 30 years at NSA), Thomas Drake, and attorney Jesselyn Radack.


An affidavit by Mr. Binney provided in a lawsuit that is still active in California in pdf form is here, and includes his cv/bio as an exhibit--


Mr. Binney also complained of massive waste of money and fraud at the NSA, but to no avail.

No one is asking what other information has been collected and intercepted by the NSA over the years. Are the states' driver license records there? License plate records? Credit card purchases? Credit histories as compiled by the three main credit reporting agencies? Bank records? FinCen records?

The FBI made the request to the FISA court for the order to Verizon to release all call information on every subscriber every day to the NSA. But the NSA is part of the Department of Defense. Perhaps the Posse Comitatus Act -- to prevent military involvement in law enforcement -- has gone the way of all flesh.

This brief discussion by the Guardian newspaper on metadata shows how it was used to link David Petraeus and Paula Broadwell--



A useful debate on this whole business between Maciej Ceglowski and David Simon.

They started far apart but, in an all too rare example of open-mindedness and intellectual integrity, they closed the gap. As Ceglowski put it on Twitter, “I’m afraid our argument degenerated into a violent agreement at the end.”

Ceglowski's argument first:

"The security state operates as a ratchet. Once you click in a new level of surveillance or intrusiveness, it becomes the new baseline. What was unthinkable yesterday becomes permissible in exceptional cases today, and routine tomorrow. The people who run the American security apparatus are in the overwhelming majority diligent people with a deep concern for civil liberties. But their job is to find creative ways to collect information. And they work within an institution that, because of its secrecy, is fundamentally inimical to democracy and to a free society."


David Simon's response:

"Reform of the systemic is the only practical hope we have of rationalizing the necessary and continual conflict that will accompany the introduction of every single new technological capability, and a system that is capable of measuring the potentials and risks and then writing, keeping and enforcing the rulebook is the fundamental here. And yet the scare-tactics that accompany this NSA leak are enough to turn potential allies into cynics and take eyes off the legitimate and essential prize."


And, finally, their conversation on the comments thread following Simon's post.


Skip the essays if time’s short but do read this brief conversation. It’s almost enough to give you hope.

