Category Archives: Science

Exa! Exa! Read all about it!

Megabytes, gigabytes, exabytes.  This article places total global data capacity at 276 exabytes in 2007 (so we know it's already a stale statistic).

Of course, it goes on to provide you an illustration of how much data this encompasses (e.g. "If I poured pancake syrup on top of Donald Trump's humongous ego, it would take four gazillion years for it to drip down to the plate."), but I prefer this example:

My dictionary gave me an error…it doesn't recognize the word 'exabyte'…which is also why predictive coding is the new, new thing.

Predictive Coding: Mutually Assured Destruction?

Ralph Losey: "So…Predictive Coding…for or against?"

Perry Segal: "I'll tell you what.  I'll attend your session, then give you my answer, ok?"

Heck, even if I had a ready answer, after sitting in on a session with these heavy-hitters, I might change my mind, anyway.

What is predictive coding?  I'll give you the short answer directly out of the accompanying documentation:  "Technology that informs the coding of uncoded documents based on their similarity to already-coded documents.  Predictive coding permits us to leverage review decisions across many documents, not just one."

Well, that certainly clears it up.  If this were The Hitchhiker's Guide to the Galaxy, I'd liken it to the Infinite Improbability Drive – probably.  Or, to put it in terms my mind can get around, predictive coding scores each uncoded document on how likely it is to be relevant, based on the documents reviewers have already coded, then keeps the probable and discards the improbable.  The important part for our purposes is that this is the latest approach to efficiently locating relevant documents – with or without human intervention.
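For the technically inclined, here is a minimal sketch of the "similarity to already-coded documents" idea. It is an illustration only, not how any vendor's product works: the word-count comparison, the function names and the two sample documents are all my own assumptions.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a document into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_code(uncoded_text, coded_docs):
    """Suggest the code of the most similar already-coded document.

    coded_docs is a list of (text, label) pairs from human review.
    Returns (suggested_label, similarity_score).
    """
    vec = vectorize(uncoded_text)
    best_label, best_score = None, -1.0
    for text, label in coded_docs:
        score = cosine(vec, vectorize(text))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical, made-up review decisions (not from any real matter).
coded = [
    ("quarterly repo transaction balance sheet exposure", "relevant"),
    ("lunch order for the team on friday", "not relevant"),
]

label, score = predict_code("balance sheet exposure from repo transaction", coded)
print(label, round(score, 2))
```

Real tools use far richer models and keep a human in the loop to check the suggestions, but the principle is the same: one review decision gets leveraged across many similar documents.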

The presenters provided two examples of what we face: 1) a theoretical example of one billion emails, 25% with attachments, that would take 54 years to complete under their scenario, and 2) an actual look into the Lehman Brothers bankruptcy, which started at 350 billion pages, culled down to 40 million pages for review by 70 contract attorneys.
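To get a feel for the Lehman numbers, here is a back-of-the-envelope calculation. The page and attorney counts come from the session; the review rate and the hours per year are purely hypothetical figures I picked for illustration, not anything the presenters stated.

```python
def pages_per_reviewer(total_pages, reviewers):
    """Average number of pages each reviewer has to cover."""
    return total_pages / reviewers


def years_to_review(pages, pages_per_hour, hours_per_year=2000):
    """Years one reviewer needs for `pages` at a constant review rate."""
    return pages / (pages_per_hour * hours_per_year)


# 40 million pages and 70 contract attorneys are from the session.
share = pages_per_reviewer(40_000_000, 70)

# ASSUMPTION: 50 pages per hour, 2,000 working hours per year.
years = years_to_review(share, pages_per_hour=50)

print(f"{share:,.0f} pages per attorney, about {years:.1f} years each")
```

Even after culling, that is more than half a million pages per attorney, which is why linear page-by-page review stops being realistic at this scale.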

A science fiction example was appropriate after all, since the requirements are astronomical.  I was in technology a long time before I became an attorney and the reality is simple.  Predictive coding – in the right hands – has the potential to be a very efficient element of document review.

What do I mean by "right hands"?  Two things, for the most part: qualified and ethical.  The "qualified" part is self-explanatory.  Ethical?  If a party plays the usual games – or only pretends to be implementing this – the entire process breaks down; hence my reference to Mutually Assured Destruction.

My answer to Mr. Losey at the conclusion of the session?  "I don't think the answer is between 'for' or 'against'.  I doubt we're going to have a choice."

Or, for my sci-fi answer…"DON'T PANIC"…

An Inauspicious Anniversary

Sorry, folks…it's year-end for corporations and I've been buried this week getting out the last-minute reports and preparing for LegalTech NY.  However, I did want to share something of interest with you for a late Friday afternoon.  Twenty-five years have passed since the accidental creation of the computer virus.

I know what you're thinking.  What does this have to do with e-discovery?  A lot, actually.  This isn't just about annoyances or stolen accounts and passwords, although that in itself is bad enough.  If you take it to the extreme, the issues are much more serious.

Have you heard of Stuxnet?  I've personally seen relatively benign viruses like "I Love You" bring corporations to a screeching halt, let alone one that specifically targets Iran's centrifuges.  Many corporations underestimate the danger – especially when it comes to the security of their data, since many viruses exist to open tunnels in security systems.

It's a dangerous way to lose safe harbor protection.

Invasion of the Body Snatchers

So, let's see. Evil alien pods replace sleeping humans with clones, devoid of emotion…blah…blah…blah.  Big Deal.  That's just a typical morning in the Segal household, prior to coffee.

So who are the "pods" to be afraid of in 2010?  Federal police agencies, apparently, who are retaining images of your body scans.  Add this to our list of unusual sources of potential electronic evidence.

Pretty soon, the new regulation will be, "Just show up to the airport naked…".

They Know, Whether You Tell Them or Not

Bear with me this week, folks.  My home network is completely down and it looks like it won't be up again until next Tuesday.  In the meantime, I'm a nomad in search of wireless networks…

If you want to take a look into the way information – and evidence – will be compiled in the future, this New York Times article spells it out in chilling detail.  Using a combination of data mining and modeling, individuals – who are a lot smarter than I am, by the way – have proven their ability to profile us by accumulating information from several sources.

I'd read about the Netflix issue before and considered posting about it, but the one that really scares me is the report from two Carnegie Mellon researchers who claimed they "could accurately predict the full, nine-digit Social Security numbers for 8.5 percent of the people born in the United States between 1989 and 2003 — nearly five million individuals."

Keep this in mind as you reveal more and more about yourselves online…

We Must Protect this House! (HR4061)

The Cybersecurity Enhancement Act of 2009 (HR 4061), introduced by the federal House Science and Technology Committee, passed the House on February 4th.  Its stated purpose: "To advance cybersecurity research, development, and technical standards, and for other purposes."

What does it mean?  Essentially that the government will spend $396 million over the next four years to encourage cybersecurity best practices and standards.

Senate Bill S.773, the Cybersecurity Act of 2009, is Senator John Rockefeller's version, but isn't as far along.